IMDb_Scraper

A fun project built with Scrapy. The included spiders extract movies, TV series, and TV movies from IMDb, filtered by release year and title type. More features are planned.

Run

Create and activate virtual env

Python3

>> python3 -m venv venv
>> . ./venv/bin/activate

Anaconda

>> conda create --name venv
>> conda activate venv

Dependencies

Scrapy

Extracted information

IMDb Scraper extracts the following attributes from the IMDb website. Also have a look at the example JSON and CSV files extracted by IMDb Scraper.

Movie Name
Movie ID
Movie URL
Poster
Year
Genre
RunTime
Certificate
Rating
MetaScore
Plot
Votes
Gross
Director
Director ID
Director URL
Cast
Cast ID
Cast URL
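
For orientation, the attribute list above can be pictured as one record per title. The sketch below is only an illustration: the class name `MovieRecord`, the snake_case field names, and the chosen types are assumptions, not the names the spider actually emits.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class MovieRecord:
    """Hypothetical container mirroring the attribute list above.

    Field names and types are illustrative assumptions; check the example
    JSON/CSV files for the keys the spider really produces.
    """
    movie_name: str
    movie_id: str
    movie_url: str
    poster: Optional[str] = None
    year: Optional[int] = None
    genre: List[str] = field(default_factory=list)
    runtime: Optional[str] = None
    certificate: Optional[str] = None
    rating: Optional[float] = None
    metascore: Optional[int] = None
    plot: Optional[str] = None
    votes: Optional[int] = None
    gross: Optional[str] = None
    # A title can have several directors and cast members, so lists are assumed.
    director: List[str] = field(default_factory=list)
    director_id: List[str] = field(default_factory=list)
    director_url: List[str] = field(default_factory=list)
    cast: List[str] = field(default_factory=list)
    cast_id: List[str] = field(default_factory=list)
    cast_url: List[str] = field(default_factory=list)


# Placeholder values only; this is not real IMDb data.
record = MovieRecord(
    movie_name="Example Title",
    movie_id="tt0000000",
    movie_url="https://www.imdb.com/title/tt0000000/",
)
record_dict = asdict(record)  # plain dict, ready for JSON or CSV export
```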

Install dependencies

Use the package manager pip to install the following:

>> pip install -r requirements.txt
>> pip install scrapy

Anaconda

>> conda install scrapy -y

TitleType is the main parameter for selecting the kind of title to scrape; the release year is used alongside it to filter the results. Supported values:

feature
tv_series
tv_movie
tv_episode
tv_special
tv_miniseries
documentary
video_game
short
video
tv_short
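
A typo in the title type is easy to miss, so it can help to check the value before launching a crawl. The sketch below validates a title type against the list above and assembles the `scrapy crawl imdb_year` command shown under Usage. The helper name `build_crawl_command` and the constant `TITLE_TYPES` are hypothetical and not part of this project.

```python
from typing import List, Optional

# Title types copied from the list above.
TITLE_TYPES = {
    "feature", "tv_series", "tv_movie", "tv_episode", "tv_special",
    "tv_miniseries", "documentary", "video_game", "short", "video",
    "tv_short",
}


def build_crawl_command(title_type: str, year: int,
                        output: Optional[str] = None) -> List[str]:
    """Return the argument list for a crawl (hypothetical helper)."""
    if title_type not in TITLE_TYPES:
        raise ValueError(f"unknown title_type: {title_type!r}")
    cmd = [
        "scrapy", "crawl", "imdb_year",
        "-a", f"title_type={title_type}",
        "-a", f"year={year}",
    ]
    if output:
        # -o tells Scrapy to write the scraped items to this file.
        cmd += ["-o", output]
    return cmd
```

The returned list can be passed to `subprocess.run(...)` from the project directory, or joined with spaces to get the command line to type by hand.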

Usage

>> scrapy crawl imdb_year -a title_type=feature -a year=2019

Save the output as a file

>> scrapy crawl imdb_year -a title_type=feature -a year=2019 -o output.csv

>> scrapy crawl imdb_year -a title_type=feature -a year=2019 -o output.json
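
Once a crawl has written `output.csv` or `output.json`, the file can be read back with the standard library. This is a minimal sketch: the helper name `load_output` is hypothetical, and it makes no assumption about column names, so each row is returned as a plain dict keyed by whatever fields the spider wrote.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List


def load_output(path: str) -> List[Dict[str, object]]:
    """Load a Scrapy feed export (.json or .csv) as a list of dicts."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".json":
        # Scrapy's JSON feed export is a single array of item objects.
        with p.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".csv":
        # The CSV feed export starts with a header row; values stay strings.
        with p.open(newline="", encoding="utf-8") as fh:
            return [dict(row) for row in csv.DictReader(fh)]
    raise ValueError(f"unsupported output format: {suffix!r}")
```

For example, `rows = load_output("output.json")` followed by `len(rows)` gives the number of scraped titles.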
