BISA AI - AI For Everyone

BEST EMAIL SPAM DETECTION MACHINE LEARNING MODEL

Bethelsando Gemilang Wahyudi

SUMMARY

Email spam, also known as junk email, consists of unsolicited, unwanted, or irrelevant messages sent by email. Spammers typically send these messages in bulk, hoping to scam recipients out of money or trick them into giving away personal information. Spam emails may contain links to malicious websites or attachments that can harm your computer, so it is important to handle them carefully. Most email providers use spam filters to protect users from this kind of email. Machine learning can also be used to classify an email as spam or not spam.

DESCRIPTION:

1. Download the dataset from Kaggle: https://www.kaggle.com/datasets/balaka18/email-spam-classification-dataset-csv

2. I built this model in Google Colaboratory (Colab), so I uploaded the dataset to Google Drive

3. Mount Google Drive in Colab
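The original screenshot for this step is not available, so the following is only a minimal sketch of how mounting usually looks. It assumes the conventional mount point `/content/drive`; the helper name `mount_drive` is hypothetical. Outside Colab the `google.colab` package cannot be imported, so the helper simply returns `False`.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True on success."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return False
    drive.mount(mount_point)  # prompts for authorization in the notebook
    return True


mounted = mount_drive()
print("Drive mounted:", mounted)
```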

4. Import the libraries needed to process the dataset and load the data into a DataFrame
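A sketch of the loading step with pandas. The column names `Email No.` and `Prediction` are assumptions about the layout of the Kaggle CSV, and the two sample rows below are invented purely so the snippet runs without the real file. In Colab the same helper would be called with the path of the uploaded file on the mounted drive.

```python
import io

import pandas as pd


def load_emails(csv_source):
    """Read the e-mail CSV (a path or file-like object) into a DataFrame."""
    return pd.read_csv(csv_source)


# Tiny stand-in for the real file: an id column, word-count columns,
# and a 0/1 label column (assumed layout, invented values).
sample = io.StringIO(
    "Email No.,the,to,Prediction\n"
    "Email 1,0,2,0\n"
    "Email 2,8,13,1\n"
)
df = load_emails(sample)
print(df.shape)
```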

5. Preprocess the DataFrame

Once we know the dataset is clean, we continue to the next step.
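The post does not show which checks were run, so this is one plausible way to confirm a DataFrame is clean: count missing values and duplicate rows. The helper name `cleanliness_report` and the demo data are hypothetical.

```python
import pandas as pd


def cleanliness_report(df):
    """Summarize basic data-quality counts for a DataFrame."""
    return {
        "rows": len(df),
        "missing_values": int(df.isnull().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Invented demo data: the third row repeats the first one.
demo = pd.DataFrame({"the": [0, 8, 0], "to": [2, 13, 2], "Prediction": [0, 1, 0]})
report = cleanliness_report(demo)
print(report)
```

A report with zero missing values and zero duplicates would support moving on without further cleaning.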

6. Get summary statistics for the DataFrame and list its columns
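In pandas this step is commonly done with `DataFrame.describe()` and the `columns` attribute. The sketch below uses invented demo values rather than the real dataset.

```python
import pandas as pd

# Invented demo data standing in for the e-mail word counts.
demo = pd.DataFrame({"the": [0, 8, 4], "to": [2, 13, 3], "Prediction": [0, 1, 0]})

stats = demo.describe()       # count, mean, std, min, quartiles, max per column
columns = list(demo.columns)  # column names in file order

print(stats)
print(columns)
```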

7. Define the X data (the features) from the DataFrame

8. Define the y data (the target, or result) from the DataFrame
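Steps 7 and 8 can be sketched together. This assumes the label lives in a column named `Prediction` and that an identifier column named `Email No.` should be excluded from the features; both names are assumptions, and `split_features_target` is a hypothetical helper.

```python
import pandas as pd


def split_features_target(df, target="Prediction", id_column="Email No."):
    """Return (X, y): feature columns and the label column."""
    y = df[target]
    # errors="ignore" keeps this working if the id column is absent.
    X = df.drop(columns=[target, id_column], errors="ignore")
    return X, y


# Invented demo rows.
demo = pd.DataFrame(
    {"Email No.": ["Email 1", "Email 2"], "the": [0, 8], "to": [2, 13], "Prediction": [0, 1]}
)
X, y = split_features_target(demo)
print(X.columns.tolist(), y.tolist())
```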

9. Run classification using eleven algorithms
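A sketch of how the eleven classifiers named in the output could be instantiated. The hyperparameters the author used are not shown, so defaults are used here, with one assumption: `probability=True` on `SVC` and `NuSVC`, because log loss needs class probabilities. `XGBClassifier` comes from the separate `xgboost` package and is skipped if that package is not installed.

```python
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, NuSVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers():
    """Return the classifiers in the order they appear in the results."""
    models = [
        KNeighborsClassifier(),
        SVC(probability=True),    # probabilities are needed for log loss
        NuSVC(probability=True),
        DecisionTreeClassifier(),
        RandomForestClassifier(),
        AdaBoostClassifier(),
        GradientBoostingClassifier(),
        GaussianNB(),
        LinearDiscriminantAnalysis(),
        QuadraticDiscriminantAnalysis(),
    ]
    try:
        from xgboost import XGBClassifier  # optional third-party package
        models.insert(5, XGBClassifier())  # place it after RandomForest
    except ImportError:
        pass
    return models


classifiers = build_classifiers()
print([model.__class__.__name__ for model in classifiers])
```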

10. Train each algorithm on the data

This produces the following output:
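The training loop itself is not shown in the post. The sketch below is one way to produce output in the same format: fit each model, then report accuracy and log loss on a held-out split. The train/test split, the synthetic data from `make_classification`, and the reduced model list are all assumptions made to keep the example small and runnable; they are not the author's setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def evaluate(models, X_train, X_test, y_train, y_test):
    """Fit every model and collect accuracy (in percent) and log loss."""
    results = []
    for model in models:
        model.fit(X_train, y_train)
        name = model.__class__.__name__
        accuracy = accuracy_score(y_test, model.predict(X_test))
        loss = log_loss(y_test, model.predict_proba(X_test))
        print("=" * 30)
        print(name)
        print("****Results****")
        print("Accuracy: {:.4%}".format(accuracy))
        print("Log Loss: {}".format(loss))
        results.append({"Classifier": name, "Accuracy": accuracy * 100, "Log Loss": loss})
    return results


# Synthetic stand-in data; the real run would use X and y from the e-mail dataset.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
models = [
    DecisionTreeClassifier(random_state=0),
    RandomForestClassifier(random_state=0),
    GaussianNB(),
]
results = evaluate(models, X_train, X_test, y_train, y_test)
```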

==============================

KNeighborsClassifier

****Results****

Accuracy: 87.0070%

Log Loss: 1.379305610485888

==============================

SVC

****Results****

Accuracy: 71.0750%

Log Loss: 0.4836649541561642

==============================

NuSVC

****Results****

Accuracy: 82.6759%

Log Loss: 0.3320301535134764

==============================

DecisionTreeClassifier

****Results****

Accuracy: 93.1168%

Log Loss: 2.3773790403302795

==============================

RandomForestClassifier

****Results****

Accuracy: 97.8345%

Log Loss: 0.16699441191493325

==============================

XGBClassifier

****Results****

Accuracy: 96.5197%

Log Loss: 0.12179309395210326

==============================

AdaBoostClassifier

****Results****

Accuracy: 96.2877%

Log Loss: 0.5251066199866752

==============================

GradientBoostingClassifier

****Results****

Accuracy: 96.7517%

Log Loss: 0.12360515066970783

==============================

GaussianNB

****Results****

Accuracy: 95.2823%

Log Loss: 1.6259912051490315

==============================

LinearDiscriminantAnalysis

****Results****

Accuracy: 72.3125%

Log Loss: 8.361442220781141

/usr/local/lib/python3.8/dist-packages/sklearn/discriminant_analysis.py:878: UserWarning: Variables are collinear

warnings.warn("Variables are collinear")

==============================

QuadraticDiscriminantAnalysis

****Results****

Accuracy: 75.2514%

Log Loss: 8.547879695569543

==============================

11. Compare the accuracy of each algorithm to find the best machine learning model

12. Identify the best algorithm

13. From this chart we see that RandomForestClassifier is the best model for this dataset by accuracy (97.8345%), while XGBClassifier has the lowest log loss (0.1218)
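The comparison can be reproduced from the numbers printed above. This sketch copies the reported accuracy and log loss values into a DataFrame and ranks them; the variable names are hypothetical, and the plotting call is left commented out because it needs matplotlib.

```python
import pandas as pd

# Values copied from the output listed above.
reported = pd.DataFrame(
    [
        ("KNeighborsClassifier", 87.0070, 1.379305610485888),
        ("SVC", 71.0750, 0.4836649541561642),
        ("NuSVC", 82.6759, 0.3320301535134764),
        ("DecisionTreeClassifier", 93.1168, 2.3773790403302795),
        ("RandomForestClassifier", 97.8345, 0.16699441191493325),
        ("XGBClassifier", 96.5197, 0.12179309395210326),
        ("AdaBoostClassifier", 96.2877, 0.5251066199866752),
        ("GradientBoostingClassifier", 96.7517, 0.12360515066970783),
        ("GaussianNB", 95.2823, 1.6259912051490315),
        ("LinearDiscriminantAnalysis", 72.3125, 8.361442220781141),
        ("QuadraticDiscriminantAnalysis", 75.2514, 8.547879695569543),
    ],
    columns=["Classifier", "Accuracy", "Log Loss"],
)

ranked = reported.sort_values("Accuracy", ascending=False).reset_index(drop=True)
best_by_accuracy = ranked.loc[0, "Classifier"]
best_by_log_loss = reported.loc[reported["Log Loss"].idxmin(), "Classifier"]

print(ranked)
print("Highest accuracy:", best_by_accuracy)
print("Lowest log loss:", best_by_log_loss)
# ranked.plot.barh(x="Classifier", y="Accuracy")  # optional chart, needs matplotlib
```

Ranking by accuracy and by log loss can give different winners, so it helps to state which metric "best" refers to.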

Related Course Information

Category: Artificial Intelligence
Course: Infrastuktur Kecerdasan Artifisial (SIB AI-INFRA)
