Gendhis Bestari Tanjung
DATA : https://www.kaggle.com/code/hellbuoy/online-retail-k-means-hierarchical-clustering/data
We aim to segment customers based on RFM (Recency, Frequency, Monetary) so that the company can target them efficiently.
The steps are broadly divided into:
Step 1: Reading and Understanding the Data
Step 2: Data Cleansing
Step 3: Data Preparation
Step 4: Model Building
Step 5: Final Analysis
Step 1: Reading and Understanding the Data
Step 2: Data Cleansing
Step 3: Data Preparation
We are going to analyze the customers based on the 3 factors below:
R (Recency): Number of days since last purchase
F (Frequency): Number of transactions
M (Monetary): Total amount of transactions (revenue contributed)
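The three factors above can be sketched with pandas as follows. This is a minimal illustration on a tiny made-up sample, not the notebook's original cell: the column names (CustomerID, InvoiceNo, InvoiceDate, Quantity, UnitPrice) are assumed to match the Online Retail file, the snapshot date is assumed to be the latest invoice date in the data, and counting distinct invoices as "transactions" is one possible reading of Frequency.

```python
import pandas as pd

# Tiny made-up sample; column names are assumptions about the Online Retail file.
df = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 2, 3],
    "InvoiceNo": ["A1", "A2", "B1", "B1", "B2", "C1"],
    "InvoiceDate": pd.to_datetime([
        "2011-12-01", "2011-12-08", "2011-11-01",
        "2011-11-01", "2011-12-09", "2011-10-10",
    ]),
    "Quantity": [2, 1, 5, 3, 1, 10],
    "UnitPrice": [3.0, 10.0, 1.0, 2.0, 4.0, 0.5],
})

# Revenue contributed by each invoice line.
df["Amount"] = df["Quantity"] * df["UnitPrice"]

# Assumed reference date: the most recent purchase in the dataset.
snapshot = df["InvoiceDate"].max()

rfm = df.groupby("CustomerID").agg(
    LastPurchase=("InvoiceDate", "max"),   # date of the latest purchase
    Frequency=("InvoiceNo", "nunique"),    # number of distinct invoices
    Monetary=("Amount", "sum"),            # total revenue contributed
)
# Recency: whole days between the snapshot date and the last purchase.
rfm["Recency"] = (snapshot - rfm["LastPurchase"]).dt.days
rfm = rfm[["Recency", "Frequency", "Monetary"]]
print(rfm)
```

Each row of rfm then describes one customer, which is the shape the clustering step expects.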
We will treat outliers, as they can skew our dataset:
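One common way to treat outliers is an interquartile-range (IQR) filter. The sketch below is an assumption about how this step could look, not necessarily the rule used in the original notebook: the helper remove_outliers_iqr, the 25th/75th percentile cut-offs, and the 1.5 multiplier are all illustrative choices, and the sample values are made up.

```python
import pandas as pd

def remove_outliers_iqr(frame, columns, k=1.5):
    """Keep rows whose values lie within [Q1 - k*IQR, Q3 + k*IQR] for every column.

    Hypothetical helper for illustration; percentiles and k are assumptions.
    """
    mask = pd.Series(True, index=frame.index)
    for col in columns:
        q1 = frame[col].quantile(0.25)
        q3 = frame[col].quantile(0.75)
        iqr = q3 - q1
        # A row survives only if it is inside the bounds for every column.
        mask &= frame[col].between(q1 - k * iqr, q3 + k * iqr)
    return frame[mask]

# Made-up RFM table with one extreme spender.
rfm = pd.DataFrame({
    "Recency": [1, 5, 10, 20, 30, 40],
    "Frequency": [2, 3, 4, 5, 6, 7],
    "Monetary": [10.0, 12.0, 15.0, 18.0, 20.0, 5000.0],
})
rfm_clean = remove_outliers_iqr(rfm, ["Recency", "Frequency", "Monetary"])
print(rfm_clean)
```

Because k-means relies on distances to cluster means, a single extreme customer like the one above would otherwise pull a mean far away from the bulk of the data.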
Step 4: Model Building
K-MEANS CLUSTERING
K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms.
The algorithm works as follows:
First we initialize k points, called means, randomly.
We assign each item to its closest mean and update that mean’s coordinates, which are the averages of the items assigned to it so far.
We repeat the process for a given number of iterations and at the end, we have our clusters.
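The steps above can be sketched in NumPy. Note that this is the batch variant (often called Lloyd's algorithm), where all items are assigned first and the means are then recomputed together, rather than updating a mean after every single item. The function name kmeans, its parameters, and the sample points are illustrative assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=10, seed=0):
    """Batch k-means sketch: returns (labels, means)."""
    rng = np.random.default_rng(seed)
    # Initialise the k means as k distinct data points chosen at random.
    means = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every point to every mean, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        # Assign each item to its closest mean.
        labels = dists.argmin(axis=1)
        # Move each mean to the average of the items assigned to it.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous mean
                means[j] = members.mean(axis=0)
    return labels, means

# Two well-separated groups of made-up 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, means = kmeans(X, k=2)
print(labels)
print(means)
```

Which group receives label 0 depends on the random initialisation, so only the grouping itself is meaningful, not the label values.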
Elbow Curve to get the right number of Clusters
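A minimal sketch of the elbow method, assuming scikit-learn's KMeans: we fit the model for several values of k and record the inertia (the within-cluster sum of squared distances). The synthetic blobs, the range of k, and the random seeds are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: three blobs of 30 points each around assumed centres.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centres])

ks = list(range(1, 7))
inertias = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # inertia_ is the sum of squared distances of samples to their closest centre.
    inertias.append(model.inertia_)

for k, inertia in zip(ks, inertias):
    print(k, round(inertia, 1))
```

Plotting inertias against ks gives the elbow curve. The "elbow" is the value of k after which adding another cluster reduces the inertia only slightly; for this synthetic data that happens at k = 3.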
Step 5: Final Analysis
K-Means Clustering with 3 Cluster Ids
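As a hedged sketch of this final step, we can standardise the RFM columns, fit k-means with 3 clusters, and attach the resulting cluster id to each customer. The RFM values below are made up, and the column name Cluster_Id, the use of StandardScaler, and the KMeans settings are assumptions rather than the notebook's confirmed choices.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up RFM table with three visibly different kinds of customers.
rfm = pd.DataFrame({
    "Recency":   [1, 2, 3, 30, 35, 40, 200, 210, 220],
    "Frequency": [50, 55, 60, 10, 12, 11, 1, 2, 1],
    "Monetary":  [5000.0, 5200.0, 5500.0, 800.0, 900.0, 850.0, 50.0, 60.0, 40.0],
})

# Standardise so that no single factor dominates the distance computation.
scaled = StandardScaler().fit_transform(rfm[["Recency", "Frequency", "Monetary"]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
rfm["Cluster_Id"] = kmeans.fit_predict(scaled)

# Average R, F and M per cluster, to interpret what each segment looks like.
summary = rfm.groupby("Cluster_Id")[["Recency", "Frequency", "Monetary"]].mean()
print(summary)
```

Comparing the per-cluster averages shows which segment holds recent, frequent, high-spending customers and which holds customers who have not purchased for a long time.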