Online Retail K-Means Clustering

Gendhis Bestari Tanjung


Summary

Data: https://www.kaggle.com/code/hellbuoy/online-retail-k-means-hierarchical-clustering/data

We aim to segment customers based on RFM (Recency, Frequency, Monetary) so that the company can target its customers efficiently.

The steps are broadly divided into:

Step 1: Reading and Understanding the Data

Step 2: Data Cleansing

Step 3: Data Preparation

Step 4: Model Building

Step 5: Final Analysis

Description

Step 1: Reading and Understanding the Data

 

Step 2: Data Cleansing

Step 3: Data Preparation

We analyze the customers based on the following three factors:

R (Recency): Number of days since last purchase

F (Frequency): Number of transactions

M (Monetary): Total amount of transactions (revenue contributed)
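The three factors above can be derived from a transaction table with a few group-by operations. The following is a minimal sketch, not the author's code: it assumes the column names CustomerID, InvoiceNo, InvoiceDate, Quantity and UnitPrice, uses a tiny made-up sample instead of the Kaggle file, measures recency against the latest invoice date in the data, and counts distinct invoices as the number of transactions.

```python
import pandas as pd

# Tiny made-up sample; the column names are an assumption about the dataset.
transactions = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 2, 3],
    "InvoiceNo": ["A1", "A2", "B1", "B2", "B3", "C1"],
    "InvoiceDate": pd.to_datetime([
        "2011-12-01", "2011-12-09", "2011-10-01",
        "2011-10-15", "2011-11-01", "2011-06-30",
    ]),
    "Quantity": [2, 1, 10, 5, 1, 3],
    "UnitPrice": [5.0, 20.0, 1.5, 2.0, 10.0, 4.0],
})


def compute_rfm(df):
    """Return one row per customer with Recency, Frequency and Monetary."""
    # Revenue per line item.
    df = df.assign(Amount=df["Quantity"] * df["UnitPrice"])
    # Reference date for recency: the most recent invoice in the data (assumption).
    snapshot = df["InvoiceDate"].max()
    last_purchase = df.groupby("CustomerID")["InvoiceDate"].max()
    return pd.DataFrame({
        "Recency": (snapshot - last_purchase).dt.days,
        "Frequency": df.groupby("CustomerID")["InvoiceNo"].nunique(),
        "Monetary": df.groupby("CustomerID")["Amount"].sum(),
    })


rfm = compute_rfm(transactions)
print(rfm)
```

The result is indexed by CustomerID, so it can be passed to the outlier treatment and clustering steps as a plain numeric table.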

We treat outliers before clustering, because extreme values can skew the dataset and pull the cluster means toward them.
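The write-up does not say which outlier rule was applied. As one possible approach, the sketch below drops rows that fall outside an interquartile-range (IQR) fence on any RFM column. The function name remove_outliers_iqr, the quantiles, the factor of 1.5 and the sample values are all assumptions chosen for illustration.

```python
import pandas as pd


def remove_outliers_iqr(df, columns, lower_q=0.25, upper_q=0.75, factor=1.5):
    """Keep rows that lie inside the IQR fence for every listed column."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q_low = df[col].quantile(lower_q)
        q_high = df[col].quantile(upper_q)
        iqr = q_high - q_low
        # A row survives only if it is inside the fence for this column too.
        mask &= df[col].between(q_low - factor * iqr, q_high + factor * iqr)
    return df[mask]


# Made-up RFM values; the last customer has an extreme Monetary value.
rfm_demo = pd.DataFrame({
    "Recency": [10, 12, 15, 11, 14, 13],
    "Frequency": [3, 4, 5, 4, 3, 4],
    "Monetary": [100.0, 120.0, 110.0, 105.0, 115.0, 5000.0],
})

cleaned = remove_outliers_iqr(rfm_demo, ["Recency", "Frequency", "Monetary"])
print(cleaned)
```

Widening the quantiles (for example 0.05 and 0.95) makes the filter more lenient, which may be preferable when high spenders are exactly the customers of interest.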

 

 

Step 4: Model Building

K-MEANS CLUSTERING

K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms.

 

The algorithm works as follows:

 

First we initialize k points, called means, randomly.

We assign each item to its closest mean, then update that mean's coordinates to the average of the items assigned to it so far.

We repeat the process for a given number of iterations and at the end, we have our clusters.
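The steps above can be sketched in a few lines of NumPy. This is an illustrative version of the common batch form of the algorithm, which recomputes every mean after all items have been assigned, rather than the code used in the project. The function name kmeans, its parameters and the one-feature sample data are assumptions.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Batch k-means: returns (labels, means) for an (n, d) array of points."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize the k means as randomly chosen distinct data points.
    means = points[rng.choice(len(points), size=k, replace=False)].astype(float)

    def assign(current_means):
        # Distance from every point to every mean, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - current_means[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        # Step 2a: assign each item to its closest mean.
        labels = assign(means)
        # Step 2b: move each mean to the average of its assigned items.
        new_means = means.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's mean where it is
                new_means[j] = members.mean(axis=0)
        # Step 3: stop early once the means no longer move.
        if np.allclose(new_means, means):
            break
        means = new_means

    return assign(means), means


# Made-up single-feature data with two obvious groups.
points = np.array([[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]])
labels, means = kmeans(points, k=2)
print(labels, means.ravel())
```

Because the starting means are random, which group receives label 0 or 1 can differ between seeds; only the grouping itself is meaningful.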

Elbow Curve to get the right number of Clusters
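An elbow curve plots the within-cluster sum of squared distances (SSE) against the number of clusters k; the "elbow" is the point after which adding clusters yields only small improvements. The sketch below uses scikit-learn's KMeans and its inertia_ attribute on synthetic data that stands in for the scaled RFM features. The helper name elbow_inertias, the range of k and the data are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Synthetic stand-in for scaled RFM features: three well-separated groups.
features = np.vstack([
    rng.normal(loc=center, scale=0.3, size=(50, 3))
    for center in ([0, 0, 0], [5, 5, 5], [10, 0, 5])
])


def elbow_inertias(X, k_values, seed=42):
    """Fit KMeans for each k and collect the within-cluster SSE (inertia)."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        inertias.append(model.inertia_)
    return inertias


k_values = list(range(1, 7))
inertias = elbow_inertias(features, k_values)
for k, sse in zip(k_values, inertias):
    print(f"k={k}: SSE={sse:.1f}")
```

Plotting inertias against k_values (for example with matplotlib) gives the curve; on this synthetic data the drop flattens after k=3 by construction, while on real RFM data the elbow is usually less sharp and needs judgment.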

 

 

 

 

 

 

 

Step 5: Final Analysis

K-Means Clustering with 3 Cluster Ids

  • Customers with Cluster Id 1 have a high transaction amount compared to other customers.
  • Customers with Cluster Id 1 are also the most frequent buyers.
  • Customers with Cluster Id 2 are not recent buyers and are therefore the least important from a business point of view.
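Conclusions like these are typically read off a per-cluster summary of the RFM values. The sketch below shows one way to build such a summary with pandas. The numbers are made up purely to mirror the pattern described above and are not the project's results; the column name Cluster_Id is also an assumption.

```python
import pandas as pd

# Hypothetical clustered RFM table; values are illustrative, not real results.
rfm_clustered = pd.DataFrame({
    "Recency": [20, 30, 25, 5, 10, 250, 300, 350],
    "Frequency": [10, 12, 8, 60, 80, 2, 3, 1],
    "Monetary": [200.0, 250.0, 150.0, 2000.0, 3000.0, 40.0, 60.0, 20.0],
    "Cluster_Id": [0, 0, 0, 1, 1, 2, 2, 2],
})

# Average R, F and M per cluster, plus the number of customers in each.
profile = rfm_clustered.groupby("Cluster_Id")[["Recency", "Frequency", "Monetary"]].mean()
profile["Customers"] = rfm_clustered.groupby("Cluster_Id").size()
print(profile)

# Clusters that stand out on each factor.
print("Highest spend:", profile["Monetary"].idxmax())
print("Most frequent:", profile["Frequency"].idxmax())
print("Least recent:", profile["Recency"].idxmax())
```

Comparing medians or box plots per cluster instead of means can be more robust when a few customers dominate spending.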

Related Course Information
  Category: Artificial Intelligence
  Course: Dasar - Dasar Python (Python Basics)