The project uses KMeans clustering on the Global Superstore dataset to categorize customers based on their buying habits, aiming to help retailers make better business decisions by tailoring their marketing strategies and improving their inventory management.

orestasdulinskas/customer_segmentation

Customer Segmentation and Insights in Retail: A Data-Driven Approach

Orestas Dulinskas

February 2024

Background

In today's competitive retail landscape, understanding customer behavior and preferences is crucial for business success. This project leverages the Global Superstore dataset, a rich collection of sales data from a hypothetical multinational retail corporation, to uncover insights that can drive strategic decision-making and improve business performance.

Objective

The primary objective of this project is to segment customers based on their purchasing behavior and preferences, using clustering techniques. By identifying distinct customer segments, retailers can tailor marketing strategies, optimize inventory management, and enhance the overall shopping experience to drive customer loyalty and satisfaction.

Data

The dataset contains information about sales transactions from a hypothetical multinational retail corporation. It includes details about products, customers, orders, and sales across different regions and product categories. Key columns include:

  • Row ID: A unique identifier for each row in the dataset.
  • Order ID: A unique identifier for each sales order.
  • Order Date: The date when the order was placed.
  • Ship Date: The date when the order was shipped.
  • Ship Mode: The shipping method used for the order.
  • Customer ID: A unique identifier for each customer.
  • Customer Name: The name of the customer.
  • Segment: The market segment to which the customer belongs.
  • Postal Code: The postal code of the customer's location.
  • City: The city of the customer's location.
  • State: The state or province of the customer's location.
  • Country: The country of the customer's location.
  • Region: The region of the customer's location.
  • Market: The market in which the sale occurred.
  • Product ID: A unique identifier for each product.
  • Category: The category to which the product belongs.
  • Sub-Category: The sub-category to which the product belongs.
  • Product Name: The name of the product.
  • Sales: The sales amount for the product line on this row.
  • Quantity: The number of units of the product sold.
  • Discount: The discount applied to the product line.
  • Profit: The profit generated from the product line.
  • Shipping Cost: The cost of shipping the order.
  • Order Priority: The priority of the order.

This dataset provides a rich and detailed view of sales transactions, allowing for analysis of customer behavior, product performance, and regional sales trends. It is well-suited for customer segmentation and insights projects in the retail industry.
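
As an illustration of how these columns can feed a segmentation, the sketch below derives recency, frequency, and purchase amount per customer from a few transaction rows. It is a minimal example under stated assumptions, not the project's actual code: the sample rows, the `rfm_table` helper, and the snapshot date are hypothetical, and each row is assumed to be one product line of an order, so orders are counted by distinct Order ID.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows that reuse the column names listed above.
rows = [
    {"Customer ID": "C-1", "Order ID": "O-1", "Order Date": date(2014, 1, 5), "Sales": 120.0},
    {"Customer ID": "C-1", "Order ID": "O-1", "Order Date": date(2014, 1, 5), "Sales": 30.0},
    {"Customer ID": "C-1", "Order ID": "O-2", "Order Date": date(2014, 12, 20), "Sales": 50.0},
    {"Customer ID": "C-2", "Order ID": "O-3", "Order Date": date(2014, 3, 1), "Sales": 15.0},
]


def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, amount)}.

    recency_days: days between the customer's latest order and `snapshot`
    frequency:    number of distinct orders
    amount:       sum of Sales over all of the customer's rows
    """
    last_order = {}
    order_ids = defaultdict(set)
    amount = defaultdict(float)
    for row in rows:
        cid = row["Customer ID"]
        order_date = row["Order Date"]
        if cid not in last_order or order_date > last_order[cid]:
            last_order[cid] = order_date
        order_ids[cid].add(row["Order ID"])
        amount[cid] += row["Sales"]
    return {
        cid: ((snapshot - last_order[cid]).days, len(order_ids[cid]), amount[cid])
        for cid in last_order
    }


# The snapshot date is an assumption; a common choice is the day after the last order in the data.
rfm = rfm_table(rows, snapshot=date(2014, 12, 31))
for customer_id, values in rfm.items():
    print(customer_id, values)
```

With this definition, a small recency value means the customer bought recently, and a large value means the customer has not ordered for a long time.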

Conclusion

The project applied KMeans clustering to the Global Superstore dataset to segment customers by the recency, frequency, and amount of their purchases. The resulting segments can inform marketing and inventory decisions in retail. Four customer segments were identified:

  • High-Value, Frequent Customers: This segment consists of customers with very low recency, but very high frequency and amount of purchases. These customers are likely to be loyal and valuable, making frequent and high-value purchases.

  • High-Value, Recent Customers: This segment includes customers with high recency, frequency, and amount of purchases. These customers are likely to be loyal and valuable, making recent, frequent, and high-value purchases.

  • Low-Value, Inactive Customers: This segment comprises customers with very high recency, but very low frequency and amount of purchases. These customers are likely to be inactive or lapsed, making few and low-value purchases.

  • Moderate-Value, Recent Customers: This segment consists of customers with low recency, and slightly higher frequency and amount of purchases than the Low-Value, Inactive segment. These customers are somewhat engaged with the retail corporation but are not as active or valuable as those in the two high-value segments.
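
The segments above come from clustering customers on recency, frequency, and purchase amount. The sketch below shows the general idea with a small, dependency-free k-means (Lloyd's algorithm) on standardized features. It is not the project's implementation, which presumably relies on a library KMeans class; the `standardize` and `kmeans` helpers, the sample values, and the seed are all hypothetical. Standardizing first is an assumption made here so that the purchase amount, which has the largest numeric range, does not dominate the distance.

```python
import math
import random
from statistics import mean, pstdev


def standardize(points):
    """Scale each feature to zero mean and unit variance (constant features are left at 0)."""
    columns = list(zip(*points))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds)) for p in points]


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(mean(col) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid in place
        if new_centroids == centroids:  # assignments are stable, so the algorithm has converged
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (recency_days, frequency, amount) values for eight customers.
rfm_rows = [
    (5, 40, 9000.0), (8, 35, 8500.0),
    (60, 30, 7000.0), (75, 28, 6500.0),
    (400, 2, 150.0), (380, 3, 200.0),
    (20, 8, 900.0), (25, 10, 1100.0),
]

scaled = standardize(rfm_rows)
labels, centroids = kmeans(scaled, k=4)  # k=4 mirrors the four segments described above
for row, label in zip(rfm_rows, labels):
    print(label, row)
```

The cluster numbers carry no meaning on their own. Names such as "High-Value, Frequent" are assigned afterwards by inspecting the average recency, frequency, and amount in each cluster. Because k-means depends on its starting centroids, a different seed can produce a different grouping.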

Overall, the project shows how customer segmentation can inform strategic decisions in the retail industry. By understanding the distinct needs and behaviors of each segment, retailers can tailor their marketing strategies, optimize inventory management, and improve the shopping experience to build customer loyalty and satisfaction.
