Sparkify - Churn Prediction for music streaming app with PySpark

This repository is part of the final project submitted to Udacity for the Data Science Nanodegree. The objective is to predict user churn for a simulated music streaming app, using historical data from user interactions.

A blog post with a detailed analysis is available at https://medium.com/@ttozatto.ds/churn-prediction-for-music-streaming-app-sparkify-d6e26d1ac80f
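
As a rough illustration of the objective above, the sketch below turns an event log into one row per user with a churn label. It is not the code from this repository: it uses plain Python instead of PySpark so it runs without a Spark installation, and the field names `userId` and `page`, the `NextSong` page, and the use of a `Cancellation Confirmation` page as the churn marker are all assumptions about the data.

```python
from collections import defaultdict

# Hypothetical event records; the field names `userId` and `page`
# are assumptions, not taken from this repository.
events = [
    {"userId": "1", "page": "NextSong"},
    {"userId": "1", "page": "Thumbs Up"},
    {"userId": "1", "page": "Cancellation Confirmation"},
    {"userId": "2", "page": "NextSong"},
    {"userId": "2", "page": "NextSong"},
]

# Assumed churn marker: a user who reaches this page counts as churned.
CHURN_PAGE = "Cancellation Confirmation"


def summarize_users(events):
    """Aggregate an event log into one row per user: song count and churn label."""
    summary = defaultdict(lambda: {"songs_played": 0, "churn": 0})
    for event in events:
        row = summary[event["userId"]]
        if event["page"] == "NextSong":
            row["songs_played"] += 1
        if event["page"] == CHURN_PAGE:
            row["churn"] = 1
    return dict(summary)


user_rows = summarize_users(events)
print(user_rows)
```

In PySpark the same idea would typically be a group-by on the user column followed by aggregations, with the resulting per-user rows used as input to a classifier.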

Dependencies

  • pyspark
  • matplotlib

Files

Summary of Results

Test Scores

[Image: results_medium]

Parameters for best models

[Image: bestModel]

Feature importance

[Image: feature_importance]

Acknowledgements

Special thanks to:

  • Udacity, which proposed this project as part of the Data Science Nanodegree.
  • The Spark team and community, who provide a powerful open-source tool to everyone.