Twitter Sentiment Analysis

Raniah Nur Hanami

Social Media



Summary

Description

Twitter is one of the most popular social media applications. It holds a wide range of information, including data that can be used to develop AI technology. One of the most popular AI tasks is sentiment analysis of tweets. In this portfolio, the author explains the development of machine learning and deep learning models for sentiment analysis of English tweets.

  1. Pre-processing
    1. Label mapping: The sentiment labels originally fall into five classes: extremely negative, negative, neutral, positive, and extremely positive. The author decided to merge them into three main classes: positive, neutral, and negative.
    2. Balancing data: The positive, neutral, and negative classes are imbalanced. If only the topmost rows are taken, the positive class tends to have fewer samples than the other two classes. The data is therefore balanced by retrieving 2000 samples for each class.
    3. Normalization: The text is normalized by converting it to lower case, removing excess whitespace, digits, stopwords, mentions, and links, and lemmatizing the tokens.
      • Normalization
      • Removing Stopwords
      • Lemmatization
      • Removing mentions & links in tweets
      • Pre-process
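The three pre-processing steps above can be sketched in plain Python as follows. This is a minimal illustration, not the author's notebook code: the original label strings in `LABEL_MAP`, the tiny `STOPWORDS` set, the helper names, and the random (rather than topmost) sampling in `balance` are all assumptions. Lemmatization is left out because it needs an external resource; a WordNet-based lemmatizer such as NLTK's `WordNetLemmatizer` could be applied to the tokens as a final step.

```python
import random
import re

# Assumed spelling of the five original labels; adjust to the dataset's values.
LABEL_MAP = {
    "Extremely Negative": "negative",
    "Negative": "negative",
    "Neutral": "neutral",
    "Positive": "positive",
    "Extremely Positive": "positive",
}

# Tiny illustrative stopword set; a real run would use a full English list.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "it", "this"}


def map_label(label):
    """Collapse a five-class label into negative / neutral / positive."""
    return LABEL_MAP[label]


def normalize(text):
    """Lower-case, strip links, mentions and digits, drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # links (before digits)
    text = re.sub(r"@\w+", " ", text)                    # mentions
    text = re.sub(r"\d+", " ", text)                     # digits
    # split() also collapses any excess whitespace left by the removals
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


def balance(rows, per_class=2000, seed=0):
    """Keep at most `per_class` (text, label) rows for each label."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))
    balanced = []
    for items in by_label.values():
        rng.shuffle(items)  # assumption: sample randomly instead of taking the top rows
        balanced.extend(items[:per_class])
    return balanced
```

A typical order would be to map the labels first, balance the rows to 2000 per class, and then call `normalize` on each remaining tweet.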
  2. Model
    • SVM
      The machine learning model tested was an SVM. The first step was extracting text features as vectors using BERT feature extraction. The resulting model is biased: it predicts the negative class far more often than the other two classes.
      1. BERT Feature Extraction
      2. BERT encoding
      3. Training & Testing
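One possible shape of the BERT-plus-SVM pipeline is sketched below. It assumes the Hugging Face `transformers` library with the `bert-base-uncased` checkpoint, the `[CLS]` vector as the sentence feature, and scikit-learn's `SVC` with an RBF kernel; none of these choices are stated in the original, and the function names are hypothetical. The last helper simply reports how often each class is predicted, which is one way to observe the bias toward the negative class.

```python
from collections import Counter


def bert_cls_features(texts, model_name="bert-base-uncased", batch_size=32):
    """Return one [CLS] vector per text from a frozen BERT encoder."""
    # Imported lazily so the lighter helpers below work without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    features = []
    with torch.no_grad():
        for start in range(0, len(texts), batch_size):
            batch = tokenizer(
                texts[start:start + batch_size],
                padding=True,
                truncation=True,
                max_length=64,  # assumed cap; tweets are short
                return_tensors="pt",
            )
            hidden = model(**batch).last_hidden_state  # (batch, tokens, hidden)
            features.extend(hidden[:, 0, :].tolist())  # [CLS] token at position 0
    return features


def train_svm(train_features, train_labels):
    """Fit an SVM classifier on the extracted feature vectors."""
    from sklearn.svm import SVC

    clf = SVC(kernel="rbf")  # kernel choice is an assumption
    clf.fit(train_features, train_labels)
    return clf


def prediction_share(predictions):
    """Fraction of predictions per class, to spot a bias toward one class."""
    total = len(predictions)
    if total == 0:
        return {}
    return {label: n / total for label, n in Counter(predictions).items()}
```

With a trained classifier, `prediction_share(clf.predict(test_features).tolist())` returning a share well above one third for `"negative"` on a balanced test set would match the behaviour described above.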

    • CNN
      The deep learning model developed by the author uses a convolutional layer. The first step is converting the text into vectors, with the vocabulary capped at a maximum of 10000 words. The results obtained with this model are fairly good, with validation accuracy reaching 71%.
      1. Transform sentence to vector
      2. Model
      3. Training & Testing
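The CNN steps above could look roughly like the sketch below. Only the 10000-word vocabulary cap and the use of a convolutional layer come from the text; the padding length, embedding size, filter count, kernel width, and all function names are assumptions, and Keras (via TensorFlow) is assumed as the framework. The vectorization is written in plain Python to make the cap explicit.

```python
from collections import Counter

PAD_ID = 0  # reserved for padding
OOV_ID = 1  # reserved for words outside the vocabulary


def build_vocab(texts, max_words=10000):
    """Map the most frequent words to integer ids, capped at `max_words`."""
    counts = Counter(tok for text in texts for tok in text.split())
    kept = [word for word, _ in counts.most_common(max_words - 2)]
    return {word: idx + 2 for idx, word in enumerate(kept)}


def texts_to_padded(texts, vocab, max_len=40):
    """Turn each text into a fixed-length list of word ids."""
    rows = []
    for text in texts:
        ids = [vocab.get(tok, OOV_ID) for tok in text.split()][:max_len]
        rows.append(ids + [PAD_ID] * (max_len - len(ids)))
    return rows


def build_cnn(vocab_size=10000, num_classes=3):
    """Small 1D CNN text classifier; layer sizes are illustrative."""
    # Imported lazily so the vectorization helpers work without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Embedding(input_dim=vocab_size, output_dim=64),
        keras.layers.Conv1D(filters=128, kernel_size=5, activation="relu"),
        keras.layers.GlobalMaxPooling1D(),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

Training would then call `model.fit` on the padded id matrix with integer labels (for example 0, 1, 2 for negative, neutral, positive) and a held-out validation split to track validation accuracy.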

Related Course Information
  Category: Artificial Intelligence
  Course: Blockchain Kecerdasan Artifisial (SIB AI-BLOCKCHAIN)