VectorLogoZone/wikipedia-infobox-logos

Wikipedia InfoBox Logos

Extracts SVG logos from Wikipedia InfoBoxes.

I already extract SVG logos from Wikipedia if they have "logo" in the file name, but there are valid SVG logos that do not. This is a way to get more of them.

Getting the Data

The Wikipedia data is licensed CC-BY-SA.

They provide regular data dumps, which can be found on dumps.wikimedia.org. There is a latest page, and there is a dated page for each dump (example for 20240920).

Example URLs:

Parsing the Data

Parsing is non-trivial.
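
To give a feel for what is involved, here is a minimal sketch, not the code this repository uses. It assumes the dump is a MediaWiki XML export with `<page>`, `<title>` and `<text>` elements, and that the logo is given in an infobox parameter named `logo`. The names `iter_pages`, `find_infobox_logo`, `LOGO_PARAM` and `SAMPLE` are hypothetical, and the sample XML is made up for illustration (the namespace version in a real dump may differ).

```python
import io
import re
import xml.etree.ElementTree as ET

# Assumption: the logo lives in a parameter named "logo", written either as
# "| logo = Name.svg" or "| logo = [[File:Name.svg|200px]]".
LOGO_PARAM = re.compile(
    r"^[ \t]*\|[ \t]*logo[ \t]*=[ \t]*(?:\[\[)?[ \t]*(?:(?:File|Image):)?[ \t]*"
    r"([^|\]\[\n{}]+?\.svg)\b",
    re.IGNORECASE | re.MULTILINE,
)


def iter_pages(xml_stream):
    """Yield (title, wikitext) for each <page>, reading the stream incrementally."""
    title, text = None, None
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace, if any
        if tag == "title":
            title = elem.text
        elif tag == "text":
            text = elem.text or ""
        elif tag == "page":
            yield title, text or ""
            title, text = None, None
            elem.clear()  # release this page's subtree once it has been handled


def find_infobox_logo(wikitext):
    """Return the SVG file name from a `logo` parameter, or None if absent."""
    match = LOGO_PARAM.search(wikitext)
    return match.group(1).strip() if match else None


# Made-up sample data standing in for a (decompressed) dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
  <page>
    <title>Acme</title>
    <revision><text>{{Infobox company
| name = Acme
| logo = [[File:Acme wordmark.svg|200px]]
}}</text></revision>
  </page>
  <page>
    <title>No Logo Inc</title>
    <revision><text>{{Infobox company
| name = No Logo Inc
| logo_size = 100px
}}</text></revision>
  </page>
</mediawiki>"""

results = [
    (title, find_infobox_logo(text))
    for title, text in iter_pages(io.BytesIO(SAMPLE))
]
for title, logo in results:
    print(title, logo)
```

The regular expression is the weak point, and part of why parsing is non-trivial: it does not check that the parameter sits inside an infobox, it ignores other parameter names that infobox templates may use, and it will miss values wrapped in nested templates. A real dump would also have to be decompressed before being passed to `iter_pages`.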

Running

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt

Credits

Bash, Git, GitHub, GZip, Python, Wikipedia