
ANURUDRA-JENA/Unsupervised-Customer-Segmentation


Customer Segmentation using Unsupervised Learning

[Figure: Customer segmentation diagram]

Primary Objective:

  • To perform a comprehensive analysis of transactional data from a UK-based non-store online retail business to optimize marketing and sales strategies.

Secondary Objectives:

  • Clean and preprocess the dataset to ensure its reliability and consistency.
  • Conduct exploratory data analysis to uncover patterns and correlations within the data.
  • Analyze customer behavior, product dynamics, temporal trends, and country-specific patterns.
  • Segment customers based on RFM analysis and K-Means clustering to identify high-value customers.
  • Provide actionable insights to the business for strategic decision-making and growth.
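The cleaning and preprocessing objective could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: only InvoiceDate is named in this README, so the other column names (CustomerID, Quantity, UnitPrice, InvoiceNo) and the derived TotalPrice, Month, Hour, and DayOfWeek columns are assumptions, and the rows are made-up toy values. Outlier handling is left out of the sketch.

```python
import pandas as pd

def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the raw transactions (assumed schema)."""
    # Drop rows without a customer and exact duplicate rows.
    out = df.dropna(subset=["CustomerID"]).drop_duplicates().copy()
    # Keep only positive quantities and prices (drops returns and bad rows).
    out = out[(out["Quantity"] > 0) & (out["UnitPrice"] > 0)].copy()
    out["InvoiceDate"] = pd.to_datetime(out["InvoiceDate"])
    out["TotalPrice"] = out["Quantity"] * out["UnitPrice"]
    # Time features derived from InvoiceDate.
    out["Month"] = out["InvoiceDate"].dt.month
    out["Hour"] = out["InvoiceDate"].dt.hour
    out["DayOfWeek"] = out["InvoiceDate"].dt.day_name()
    return out

# Toy rows for illustration only (not taken from the real dataset).
raw = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "A3", "A4"],
    "Quantity": [2, 2, -1, 3, 1],
    "UnitPrice": [2.5, 2.5, 4.0, 1.0, 9.9],
    "InvoiceDate": ["2011-11-05 14:30", "2011-11-05 14:30",
                    "2011-11-06 09:00", "2011-12-01 16:15",
                    "2011-12-02 10:00"],
    "CustomerID": [1001, 1001, 1001, 1002, None],
})
clean = clean_transactions(raw)
print(clean[["CustomerID", "TotalPrice", "Month", "Hour", "DayOfWeek"]])
```

In the toy frame, the duplicate row, the negative-quantity row, and the row without a CustomerID are dropped before the features are derived.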

Summarizing the workflow and the results from the tests conducted:

  • Workflow: The project involves data preprocessing, feature selection, exploratory data analysis (EDA), RFM score generation, and clustering. Key steps include handling missing values and outliers, and creating new features from InvoiceDate.
  • Tests Conducted: Two hypothesis tests were performed:
    1. Frequent vs. Non-Frequent Customers: Tested if frequent customers spend more than non-frequent customers.
    2. Recent vs. Older Customers: Tested if recent customers tend to spend more than older customers.
  • Results Generated:
    1. Frequent Customers: Reject the null hypothesis; frequent customers spend more.
    2. Recent Customers: Reject the null hypothesis; recent customers tend to spend more.
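This README does not say which statistical test was used. As one possible way to run such a comparison, the sketch below applies a one-sided Welch t-test with SciPy, where the null hypothesis is that the first group does not spend more on average than the second. The helper name `spends_more`, the 0.05 significance level, and the synthetic spending values are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def spends_more(group_a, group_b, alpha=0.05):
    """One-sided Welch t-test. H0: mean(a) <= mean(b); H1: mean(a) > mean(b)."""
    t_stat, p_value = stats.ttest_ind(
        group_a, group_b, equal_var=False, alternative="greater"
    )
    return float(t_stat), float(p_value), bool(p_value < alpha)

# Synthetic per-customer spending, for illustration only.
rng = np.random.default_rng(0)
frequent = rng.normal(loc=500, scale=50, size=200)
non_frequent = rng.normal(loc=300, scale=50, size=200)

t_stat, p_value, reject = spends_more(frequent, non_frequent)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}, reject H0: {reject}")
```

Customer spending is often heavily skewed, so a rank-based test such as `scipy.stats.mannwhitneyu` may be a reasonable alternative to the t-test.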

Summarizing the insights generated:

  • Top-Selling Products: The highest-selling products include items like the WHITE HANGING HEART T-LIGHT HOLDER and JUMBO BAG RED RETROSPOT.
  • Customer Behavior: Frequent customers tend to spend significantly more than non-frequent customers. Recent customers also tend to have higher monetary values.
  • Sales Trends: Sales peak in November and December, with the highest sales occurring in the afternoon.
  • Country Analysis: The majority of sales come from the United Kingdom, followed by Germany and France.
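Insights of this kind typically come from simple group-by aggregations. The sketch below shows how they might be computed with pandas. The column names other than InvoiceDate are assumed, and the rows are placeholder values rather than figures from the real dataset.

```python
import pandas as pd

# Placeholder rows for illustration only.
sales = pd.DataFrame({
    "Description": ["PRODUCT A", "PRODUCT B", "PRODUCT A", "PRODUCT C"],
    "Quantity": [10, 4, 6, 1],
    "TotalPrice": [25.0, 8.0, 15.0, 3.0],
    "InvoiceDate": pd.to_datetime([
        "2011-11-10 13:00", "2011-11-12 15:00",
        "2011-12-01 13:30", "2011-06-20 09:00",
    ]),
    "Country": ["United Kingdom", "Germany", "United Kingdom", "France"],
})

# Units sold per product, best sellers first.
top_products = (
    sales.groupby("Description")["Quantity"].sum().sort_values(ascending=False)
)
# Revenue per calendar month and per hour of day.
monthly_sales = sales.groupby(sales["InvoiceDate"].dt.month)["TotalPrice"].sum()
hourly_sales = sales.groupby(sales["InvoiceDate"].dt.hour)["TotalPrice"].sum()
# Revenue per country, largest first.
country_sales = (
    sales.groupby("Country")["TotalPrice"].sum().sort_values(ascending=False)
)

print(top_products.head(), monthly_sales, hourly_sales, country_sales, sep="\n\n")
```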

Cluster 0: At-Risk or Lapsed Customers

  • Characteristics: Low recency (no recent purchases), low frequency, and low monetary value.
  • Action: Prioritize reactivation campaigns and investigate reasons for inactivity. Consider seasonal offers or targeted campaigns to re-engage these customers.

Cluster 1: Champions or Loyal Customers

  • Characteristics: High recency (recent purchases), high frequency, and high monetary value.
  • Action: Maintain high engagement levels, explore up-selling and cross-selling opportunities, and consider feedback or brand ambassador programs.

Cluster 2: Potential Loyalists or Promising Customers

  • Characteristics: Moderate levels of recency, frequency, and monetary value.
  • Action: Implement tailored marketing strategies, loyalty programs, and incentives to increase purchase frequency and value.
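Segments like the three above can be produced by building an RFM table and running K-Means on it. The sketch below is one way to do that with pandas and scikit-learn, not the notebook's actual code. The column names (CustomerID, InvoiceNo, TotalPrice), the log transform, the scaling step, the snapshot date, and the toy customer profiles are all assumptions. Recency is measured here in days since the last purchase, so a smaller number means a more recent customer.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def rfm_table(df: pd.DataFrame, snapshot_date: pd.Timestamp) -> pd.DataFrame:
    """Aggregate transactions into one Recency/Frequency/Monetary row per customer."""
    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda d: (snapshot_date - d.max()).days),
        Frequency=("InvoiceNo", "nunique"),
        Monetary=("TotalPrice", "sum"),
    )

def cluster_rfm(rfm: pd.DataFrame, n_clusters: int = 3, seed: int = 42) -> pd.DataFrame:
    """Attach a K-Means cluster label to each customer."""
    # log1p reduces skew; scaling lets each feature contribute equally.
    features = StandardScaler().fit_transform(np.log1p(rfm))
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return rfm.assign(Cluster=model.fit_predict(features))

# Toy profiles: (customer id, number of orders, days since last order, order value).
snapshot = pd.Timestamp("2011-12-10")
profiles = [(1, 1, 300, 20.0), (2, 1, 280, 25.0), (3, 5, 60, 80.0),
            (4, 6, 50, 90.0), (5, 30, 2, 400.0), (6, 28, 3, 450.0)]
rows = [
    {"CustomerID": cid, "InvoiceNo": f"{cid}-{k}",
     "InvoiceDate": snapshot - pd.Timedelta(days=days_ago + 7 * k),
     "TotalPrice": value}
    for cid, n_orders, days_ago, value in profiles
    for k in range(n_orders)
]
rfm = rfm_table(pd.DataFrame(rows), snapshot)
result = cluster_rfm(rfm)

# Average profile per cluster, used to decide which segment each label represents.
print(result.groupby("Cluster")[["Recency", "Frequency", "Monetary"]].mean())
```

K-Means assigns label numbers arbitrarily, so which cluster ends up as 0, 1, or 2 can change between runs. The per-cluster averages are what tie a label to a segment name.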

Overall, the analysis provides valuable insights into customer behavior and helps identify opportunities for targeted marketing efforts to improve customer retention and loyalty.

In Short...

This project aims to analyze transactional data from a UK-based online retail business to gain valuable insights into customer behavior, product performance, and temporal trends. By cleaning the data, conducting exploratory analysis, and segmenting customers, the project will provide actionable recommendations to optimize marketing and sales strategies. The ultimate goal is to help the business understand customer needs, tailor product offerings, and enhance customer satisfaction and loyalty.

To reproduce the results, open the notebook in Google Colab, add the dataset to the Colab "Sample file", and select Run all from the Runtime dropdown. The source code and the dataset can be downloaded from: https://drive.google.com/file/d/1GUCdNCi8SWxcRV6DNzQRKp3taLizRJFS/view?usp=sharing
