This project parses, cleans, and analyzes the Ford GoBike dataset from Kaggle. The current scope includes analysis and visualization of rider use patterns to understand peak use times and ride duration.
To run the project locally, create a virtual environment with Python 3.11 and install the dependencies described below.
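As a minimal sketch, the virtual environment can be created from Python itself with the standard-library `venv` module. The helper name `create_env` and the directory name `.venv` are assumptions made for illustration and are not part of this project.

```python
import venv
from pathlib import Path


def create_env(env_dir=".venv", with_pip=True):
    """Create a virtual environment at env_dir and return its path.

    env_dir is a hypothetical location; any directory name works.
    """
    env_path = Path(env_dir)
    # EnvBuilder is the standard-library API behind `python -m venv`.
    venv.EnvBuilder(with_pip=with_pip).create(env_path)
    return env_path
```

Calling `create_env()` with a Python 3.11 interpreter is equivalent to running `python -m venv .venv`. Activate the environment afterwards with `source .venv/bin/activate` on macOS or Linux, or `.venv\Scripts\activate` on Windows.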
Install dependencies from the terminal using the following command: pip install -r requirements.txt
OR
Manually install the following:
- Python 3.11
- Jupyter Notebook 6.5.4
The project uses the following third-party Python libraries:
- NumPy 1.26.4
- Pandas 2.2.1
- Seaborn 0.13.2
- Matplotlib 3.8.3
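After installation, the versions listed above can be checked against what is actually installed. The sketch below uses the standard-library `importlib.metadata`; the helper name `find_mismatches` is hypothetical, and the distribution names are assumed to be the usual PyPI names for these libraries.

```python
from importlib import metadata

# Versions taken from the list above; keys are the PyPI distribution names.
EXPECTED = {
    "numpy": "1.26.4",
    "pandas": "2.2.1",
    "seaborn": "0.13.2",
    "matplotlib": "3.8.3",
}


def find_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches
```

An empty dictionary from `find_mismatches()` means every listed library is installed at the listed version. Other versions may still work; this check only reports differences.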
To run this project, download the following files and unzip the two .zip archives:
- 201902-fordgobike-tripdata.csv.zip (Raw dataset)
- expl_plot_df.csv.zip (Clean dataset)
- ebike_use_eda.ipynb (EDA notebook)
- ebike_use_summary.ipynb (Summary notebook)
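The two archives in the list above can also be unpacked from Python rather than by hand. This is a minimal sketch using the standard-library `zipfile` module; the helper name `extract_archives` is hypothetical, and it assumes the archives were downloaded into the current directory.

```python
import zipfile
from pathlib import Path

# Archive names as listed above.
ARCHIVES = [
    "201902-fordgobike-tripdata.csv.zip",
    "expl_plot_df.csv.zip",
]


def extract_archives(archives=ARCHIVES, src_dir=".", dest_dir="."):
    """Extract each archive found in src_dir into dest_dir.

    Returns the names of all extracted members; missing archives are skipped.
    """
    extracted = []
    for name in archives:
        archive_path = Path(src_dir) / name
        if not archive_path.exists():
            continue  # archive not downloaded yet
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest_dir)
            extracted.extend(zf.namelist())
    return extracted
```

Calling `extract_archives()` from the folder that holds the downloads should leave the CSV files next to the notebooks.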
Open the EDA notebook in Jupyter Notebook and run the code to parse, clean, and analyze the raw data. Open the Summary notebook in Jupyter Notebook and run the code to create explanatory visualizations of the data analyzed in the EDA notebook.
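To give a sense of the kind of question the notebooks address, the sketch below counts rides per starting hour using only the standard library. It is not the notebooks' code, which relies on Pandas. The function names are hypothetical, and the column name `start_time` and its `YYYY-MM-DD HH:MM:SS` timestamp format are assumptions about the raw CSV.

```python
import csv
from collections import Counter
from datetime import datetime


def rides_per_hour(csv_path, time_column="start_time"):
    """Count rides by the hour of day (0-23) in which they started.

    time_column is an assumed column name; adjust it to match the CSV.
    """
    counts = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Keep only "YYYY-MM-DD HH:MM:SS", dropping any fractional seconds.
            stamp = datetime.strptime(row[time_column][:19], "%Y-%m-%d %H:%M:%S")
            counts[stamp.hour] += 1
    return counts


def peak_hour(counts):
    """Return the hour with the most rides, or None if there are no rides."""
    return max(counts, key=counts.get) if counts else None
```

For example, `peak_hour(rides_per_hour("201902-fordgobike-tripdata.csv"))` would return the busiest starting hour, provided the unzipped CSV has that file name and the assumed column.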
Contributions and suggestions are welcome. To contribute, fork this project and open a pull request on GitHub.