Web Scraper

A command-line web scraper that extracts content from websites and saves it to organized text files.

Features

  • Scrape content from a single URL or an entire sitemap
  • Group scraped content into separate files based on URL structure
  • Output content to multiple text files, organized by website sections
  • Pre-built executable for running the tool without installing Python
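
The sitemap and grouping features above could work roughly as in the sketch below. This is only an illustration under stated assumptions, not the code in web_scraper.py: the function names `extract_sitemap_urls`, `group_urls_by_section`, and `output_filename` are hypothetical, and the grouping rule (first path segment of the URL, with the site root in a "root" group) is an assumed reading of "based on URL structure".

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(sitemap_xml):
    """Return the page URLs listed in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def group_urls_by_section(urls):
    """Group URLs by their first path segment (assumed grouping rule)."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # URLs without a path segment (the site root) go into a "root" group.
        section = segments[0] if segments else "root"
        groups[section].append(url)
    return dict(groups)


def output_filename(section):
    """Map a section name to a text file name (assumed naming scheme)."""
    return f"{section}.txt"
```

Under these assumptions, every URL beneath /blog/ would end up in blog.txt, every URL beneath /docs/ in docs.txt, and so on. The real tool may use a different rule or file naming.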

Installation

  1. Clone this repository: git clone https://github.com/yourusername/web-scraper.git

  2. Install the required dependencies: pip install -r requirements.txt

Usage

To scrape a single URL: python web_scraper.py https://example.com

To scrape an entire sitemap: python web_scraper.py https://example.com --sitemap
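
The two invocations above imply a command-line interface with one positional URL and an optional --sitemap flag. A minimal sketch of such an interface using argparse follows. The helper names `build_parser` and `describe_mode` are hypothetical, the help strings are made up for illustration, and the actual script may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser matching the usage shown above (assumed interface)."""
    parser = argparse.ArgumentParser(
        prog="web_scraper.py",
        description="Scrape a single URL or every URL in a sitemap.",
    )
    parser.add_argument("url", help="URL of the page or site to scrape")
    parser.add_argument(
        "--sitemap",
        action="store_true",
        help="scrape every page listed in the site's sitemap",
    )
    return parser


def describe_mode(args):
    """Return a short description of what the parsed arguments request."""
    mode = "sitemap" if args.sitemap else "single URL"
    return f"{mode}: {args.url}"
```

For example, `build_parser().parse_args(["https://example.com", "--sitemap"])` yields an object with `url` set to the given address and `sitemap` set to `True`. Without the flag, `sitemap` defaults to `False`.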

Project Structure

  • web_scraper.py: Main script containing the web scraper logic
  • requirements.txt: List of Python dependencies

Executable

A pre-built executable is available in the dist folder. You can download and run it directly without needing to install Python or any dependencies.

License

This project is open source and available under the MIT License.