nextTech - Find Your Next Start Up City Here

An interactive way for users to observe trends in major tech hub cities

✨ Visit the Website Here

✔️ Prerequisites

Assuming you have the basics set up, pip install the following packages into your local or virtual environment:

pip install flask pymongo pandas python-dotenv dnspython scikit-learn requests

NOTE: Our .env file is not included because it is tied to our own MongoDB database.

The versions we used for these prerequisites are:

dnspython==2.0.0
Flask==1.1.2
pandas==1.1.5
pymongo==3.11.2
python-dotenv==0.15.0
scikit-learn==0.23.2
sklearn==0.0
requests==2.24.0

🖥️ Usage

After completing the steps above, run the app with:

python app.py

🚧 Project Outline

Our group set out to develop a machine learning model that can predict whether a zip code is a tech hub or not.

Data Sources

Census report API (Age, education, ethnic group, median salary)
Zillow API (Real estate data)

Gathering data

Our objective was to find usable data from the data sources listed above and make it readable in JSON format for our JavaScript visualization libraries. Our approach started with identifying the level of detail for location (city, neighborhood, zip code, etc.) that is consistent across our data sources. We then used web APIs to pull data for NYC regions to feed into an unsupervised learning model.
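To illustrate the "readable in JSON" step: the Census API returns a JSON array of arrays whose first row holds the column names. Below is a minimal sketch, assuming that response shape, of reshaping it into a list of records. The function name `rows_to_records` and the sample values are hypothetical and not taken from the project code.

```python
import json


def rows_to_records(rows):
    """Convert a header-first list of rows into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


# Hypothetical sample shaped like a Census API response (values are made up).
sample = [
    ["NAME", "B01003_001E", "zip code tabulation area"],
    ["ZCTA5 10001", "100", "10001"],
    ["ZCTA5 10002", "200", "10002"],
]

records = rows_to_records(sample)

# Pretty-print the records as JSON for the front-end to consume.
print(json.dumps(records, indent=2))
```

Keeping the zip code as a string in each record avoids losing leading zeros when the JSON is read elsewhere.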

Data Wrangling

We used Pandas for ETL: we cleaned the data, selected the specific features we wanted, and merged the Census and Zillow dataframes using zip code as the key.
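The merge step might look roughly like the sketch below. The column names (`zip_code`, `median_salary`, `home_value`) and all values are illustrative assumptions, not the project's actual schema, and the inner join is one possible choice of join type.

```python
import pandas as pd

# Hypothetical sample frames; column names and values are illustrative only.
census = pd.DataFrame({
    "zip_code": ["10001", "10002", "10003"],
    "median_salary": [90000, 60000, 110000],
})
zillow = pd.DataFrame({
    "zip_code": ["10001", "10003", "10004"],
    "home_value": [1200000, 1500000, 900000],
})

# Zip codes are kept as strings so leading zeros (e.g. "02139") survive.
# An inner join keeps only zip codes present in both sources.
merged = census.merge(zillow, on="zip_code", how="inner")
print(merged)
```

An inner join drops zip codes that appear in only one source. Passing `how="outer"` with `indicator=True` instead is a quick way to see which rows failed to match.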

Machine Learning

Unsupervised k-means machine learning:

Created five clusters, with the count chosen using the elbow method, to define the parameters of a tech hub. This served as our training set.
Analyzed each cluster to decide which one best indicates tech hub viability.
Created a new column to identify each zip code as a tech hub or not.
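The clustering steps above can be sketched with scikit-learn as follows. This is a minimal sketch on synthetic data. The feature scaling, the range of k values tried, and the choice of cluster 0 as the tech hub cluster are assumptions for illustration, not the project's actual settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the merged features (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))

# Scale features so no single column dominates the distance metric.
scaled = StandardScaler().fit_transform(features)

# Elbow method: record inertia for a range of k and look for the bend.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scaled)
    inertias[k] = model.inertia_

# Fit the chosen k (five in the text) and label each row.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(scaled)
labels = kmeans.labels_

# Hypothetical choice: treat cluster 0 as the "tech hub" cluster.
TECH_HUB_CLUSTER = 0
is_tech_hub = (labels == TECH_HUB_CLUSTER).astype(int)
```

Plotting `inertias` against k shows the elbow. In practice the tech hub cluster would be picked by inspecting each cluster's feature averages rather than hard-coding an index.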

Supervised logistic regression machine learning:

Split data into training and testing sets.
Trained a logistic regression model on the labels produced by our unsupervised machine learning model.
Used this model to predict which locations across the US are tech hubs.
Exported the trained logistic regression model with pickle so it can be run from our Flask application.
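A minimal sketch of the supervised steps above, again on synthetic data. The labeling rule, the split ratio, and `max_iter` are assumptions for illustration, and the model is pickled to bytes here rather than to the project's actual file.

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic features and labels standing in for the clustered data.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # made-up labeling rule

# Hold out a quarter of the rows for testing, preserving class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Serialize the model to bytes; pickle.dump(clf, file) would write to disk.
blob = pickle.dumps(clf)
restored = pickle.loads(blob)

# The restored model should give the same predictions as the original.
same_predictions = bool((restored.predict(X_test) == clf.predict(X_test)).all())
```

A pickled scikit-learn model should be loaded with the same scikit-learn version that created it, which is one reason the pinned versions in the prerequisites matter.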

Data Loading

From here, all the data was loaded into AWS by creating an S3 bucket. Storing the data remotely lets anybody run our model without downloading all the data locally.

We then access the data through a provided API, which we use in our Flask app.

📖 Authors

👤 Deep Patel

Website: www.mrdeeppatel.com
Github: @Frozte
LinkedIn: @Deep Patel

👤 Joshua Coronel

Github: @joshuajonme
LinkedIn: @Joshua Coronel

👤 Keana Mabilog

Github: @keana-m
LinkedIn: @Keana Mabilog

👤 Stephano Castro

Github: @castrostephano
LinkedIn: @Stephano Castro

👌 Show your support

Give a ⭐️ if this project helped you!

📝 License

This README was generated with readme-md-generator
