
fonsecagabriella/Web_Scraping


Web Scraping

⚠️ WEB SCRAPING ETIQUETTE ⚠️

Always play nice with the websites you're scraping; check out their rules and get the green light if needed. Steer clear of swiping personal stuff and be copyright-conscious. Oh, and stay in the know about the legal side of scraping – we don't want any surprise legal drama, right?

ALWAYS CHECK the robots.txt file of the website you are scraping. It tells you which pages you may and may not crawl.
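As a rough illustration of that check, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `example.org` URLs, and the helper names `build_parser` and `is_allowed` are all made up for this example; they are not taken from this project or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /news/
"""


def build_parser(robots_txt):
    """Build a parser from robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, url, user_agent="*"):
    """Return True if the rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)
print(is_allowed(parser, "https://example.org/news/latest"))
print(is_allowed(parser, "https://example.org/admin/login"))
```

Against a live site, `RobotFileParser.set_url(...)` followed by `read()` would fetch the real file instead of the inlined text.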

This project includes:

Simple web scraping of a page from the Vegan Society News. I first retrieved the news cards on the page and saved them to a CSV file. Next, I scraped the images from the news items and saved them to my machine.
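The image step could look roughly like the sketch below. It is not the project's actual code: it uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without installs, and the sample HTML, the `example.org` URLs, and the names `ImageCollector`, `image_urls`, `local_name`, and `save_image` are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path, PurePosixPath
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(html, base_url):
    """Return absolute URLs for all images found in the HTML."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return [urljoin(base_url, src) for src in collector.sources]


def local_name(url):
    """Use the last path segment of the URL as the local file name."""
    return PurePosixPath(urlparse(url).path).name


def save_image(url, folder):
    """Download one image into folder (needs network access; not called here)."""
    target = Path(folder) / local_name(url)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response:
        target.write_bytes(response.read())
    return target


# Hypothetical markup standing in for a news card with two images.
SAMPLE_HTML = (
    '<div class="card"><img src="/images/a.jpg">'
    '<img src="https://cdn.example.org/b.png"></div>'
)
urls = image_urls(SAMPLE_HTML, "https://example.org/news")
print(urls)
print([local_name(u) for u in urls])
```

Resolving each `src` with `urljoin` matters because pages often mix relative and absolute image paths.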

Popular libraries for web scraping

  1. Scrapy
  2. BeautifulSoup - used in this small project
  3. Selenium

Web scraping steps

  1. Crawl
  2. Parse and transform
  3. Store
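The three steps above can be sketched as three small functions. This is an illustration under stated assumptions, not the code used in this project: it relies only on the standard library (`html.parser` and `csv`) instead of BeautifulSoup, and the card markup (`<a class="card">`), the column names, and the function names `crawl`, `parse`, and `store` are invented for the example.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


def crawl(url):
    """Step 1: download the page as text (needs network; not called here)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class CardParser(HTMLParser):
    """Step 2: pull the title and link out of each <a class="card"> element."""

    def __init__(self):
        super().__init__()
        self.cards = []
        self._current = None  # the card being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "card" in (attrs.get("class") or "").split():
            self._current = {"title": "", "link": attrs.get("href") or ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Transform: collapse runs of whitespace in the title.
            self._current["title"] = " ".join(self._current["title"].split())
            self.cards.append(self._current)
            self._current = None


def parse(html):
    parser = CardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


def store(cards, out):
    """Step 3: write the cards as CSV rows to any writable text file object."""
    writer = csv.DictWriter(out, fieldnames=["title", "link"])
    writer.writeheader()
    writer.writerows(cards)


# Hypothetical markup used in place of a real crawl() call.
SAMPLE_HTML = """
<a class="card" href="/news/one">  First   story </a>
<a class="other" href="/skip">Not a card</a>
<a class="card featured" href="/news/two">Second story</a>
"""

cards = parse(SAMPLE_HTML)
buffer = io.StringIO()
store(cards, buffer)
print(buffer.getvalue())
```

To write to disk instead of a buffer, open the file with `open(path, "w", newline="")`, as the `csv` module documentation recommends, and pass it to `store`.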

About

Simple Web Scraping Project
