Playground Series - Season 4, Episode 3: EDA/Modelling for Multi-Class Prediction of Steel Plate Defects
This repository contains a Jupyter Notebook covering the exploratory data analysis (EDA) and modeling process for multi-class prediction of steel plate defects, written for the Playground Series - Season 4, Episode 3.
The notebook is structured into six main parts:
- Data loading and first exploration
- Target analysis
- EDA and data preparation
- Modeling
- Explainability
- Preparation of the submission
The data is loaded and basic exploration is performed to understand the dataset's structure and features.
An analysis of the target variables is conducted to understand their distribution and characteristics.
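As a rough illustration of what such a target analysis can look like, the sketch below computes the share of positive rows per target and how many labels are active per row. It assumes the targets are stored as one binary column per defect type, which this README does not state; the column names and the tiny data frame are hypothetical stand-ins, not the competition data.

```python
import pandas as pd

# Hypothetical target column names; the README does not list the real ones.
TARGETS = ["defect_a", "defect_b", "defect_c"]

# Tiny synthetic stand-in for the training data (one binary column per defect).
train = pd.DataFrame(
    {
        "defect_a": [1, 0, 0, 0, 1, 0],
        "defect_b": [0, 1, 0, 0, 0, 0],
        "defect_c": [0, 0, 1, 0, 1, 0],
    }
)

# Share of positive rows per target, highest first.
positive_rate = train[TARGETS].mean().sort_values(ascending=False)

# Number of active labels per row: 0 = no defect flagged, 2+ = several at once.
labels_per_row = train[TARGETS].sum(axis=1).value_counts().sort_index()

print(positive_rate)
print(labels_per_row)
```

Rows with zero or several active labels are worth checking early, since they decide whether the problem can be treated as strictly one class per plate.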
Exploratory data analysis (EDA) techniques are applied to understand the relationships between features and prepare the data for modeling.
Modeling is performed using XGBoost with a focus on optimizing hyperparameters and evaluating model performance.
The model's explainability is explored using SHAP values to understand feature importance and model predictions.
The final model predictions are prepared for submission, including ensembling strategies to improve performance.
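One common ensembling strategy is a weighted average of the predicted probabilities from several models. The sketch below assumes that approach; the README does not say which strategy the notebook uses, and the ids, class names, predictions, and weights are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical class names and ids; the real ones come from the competition files.
classes = ["class_a", "class_b", "class_c"]
ids = [1001, 1002]

# Predicted probabilities from two hypothetical models, shape (n_rows, n_classes).
pred_model_1 = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]])
pred_model_2 = np.array([[0.5, 0.3, 0.2], [0.2, 0.5, 0.3]])

# Weighted average of the probabilities; the weights are illustrative and sum to 1.
weights = np.array([0.6, 0.4])
blend = np.average(np.stack([pred_model_1, pred_model_2]), axis=0, weights=weights)

# Submission table: an id column followed by one probability column per class.
submission = pd.DataFrame(blend, columns=classes)
submission.insert(0, "id", ids)

# Without a path, to_csv returns the CSV text; pass a file name to write to disk.
csv_text = submission.to_csv(index=False)
print(csv_text)
```

In practice the weights would be chosen on validation data, for example in proportion to each model's validation score.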
For the full details and code implementation, please refer to the notebook in this repository.