Eka Wulan Yunita
Sentiment analysis is a field of Natural Language Processing (NLP) that builds systems to recognize and extract opinions expressed in text. Text is now widely available on the internet in the form of forums, blogs, social media, and review sites. With sentiment analysis, this previously unstructured information can be transformed into more structured data that describes public opinion about products, brands, services, politics, or other topics. Companies, governments, and other organizations then use these data for market analysis, product reviews, product feedback, and community services. Here, we conduct a sentiment analysis of reviews of a language learning application, Duolingo. The dataset is collected by web scraping. The steps in this process are: building the dataset, labelling, preprocessing, feature extraction, and model building. The results show that the reviews are predominantly positive, and the model reaches an accuracy of 0.93 (93%).
WHAT IS SENTIMENT ANALYSIS?
Sentiment analysis is a field of Natural Language Processing (NLP) that builds systems to recognize and extract opinions expressed in text. Text is now widely available on the internet in the form of forums, blogs, social media, and review sites. With sentiment analysis, this previously unstructured information can be transformed into more structured data that describes public opinion about products, brands, services, politics, or other topics. Companies, governments, and other organizations then use these data for market analysis, product reviews, product feedback, and community services. To produce a useful result, sentiment analysis must do more than detect that a text contains an opinion. This process, also known as opinion mining, needs to recognize the following three aspects:
HOW TO DO SENTIMENT ANALYSIS OF DUOLINGO APP REVIEWS WITH MACHINE LEARNING?
STEPS:
A. Building Dataset
1. Installing the Library
The data we collect are reviews of the Duolingo application on Google Play, so we first need to install a Google Play scraper library to retrieve the dataset.
2. Building the Dataset
Using the scraper library, we can request the review data by the application's package name on Google Play (com.duolingo).
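A minimal sketch of this step might look like the following. It assumes the library is the `google-play-scraper` package, which the text does not name exactly, and that the app is addressed by its Google Play package ID `com.duolingo`. The function names, the helper `select_fields`, and the sample size of 1000 are illustrative choices, not the author's.

```python
def fetch_duolingo_reviews(count=1000):
    """Download recent reviews from Google Play (needs network access).

    Assumes `pip install google-play-scraper`. The import is done lazily
    so that `select_fields` below works without the package installed.
    """
    from google_play_scraper import Sort, reviews

    result, _continuation_token = reviews(
        "com.duolingo",   # assumed Google Play package ID
        lang="en",
        country="us",
        sort=Sort.NEWEST,
        count=count,      # illustrative sample size
    )
    return result


def select_fields(records):
    """Keep only the review text and the star rating of each raw record."""
    return [{"content": r["content"], "score": r["score"]} for r in records]
```

In that package, `reviews` returns a list of dictionaries together with a continuation token for paging. `select_fields` keeps the two fields the later steps rely on: the review text and its numeric rating.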
3. Converting to DataFrame
4. Size of the Data
B. Labelling
Assign a negative, neutral, or positive label to each comment, based on its rating, which is originally a number.
Then keep only the variables needed to build the model.
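The mapping from rating to label is not spelled out here, so the sketch below assumes a common convention: 1 to 2 stars are negative, 3 stars are neutral, and 4 to 5 stars are positive. The field names `content` and `score` are also assumptions.

```python
def label_from_rating(score):
    """Map a 1-5 star rating to a sentiment label (assumed thresholds)."""
    if score <= 2:
        return "negative"
    if score == 3:
        return "neutral"
    return "positive"


# Hypothetical sample records standing in for the scraped reviews.
sample_reviews = [
    {"content": "Love the daily streaks", "score": 5},
    {"content": "It is okay", "score": 3},
    {"content": "Too many ads", "score": 1},
]

# Keep only the two variables the model needs: the text and its label.
labelled = [
    {"content": r["content"], "label": label_from_rating(r["score"])}
    for r in sample_reviews
]
```

Other thresholds are equally defensible. For example, the neutral class could be dropped to turn the task into a binary one.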
C. Preprocessing
1. Lowercasing and Removing Punctuation
Convert all text to lowercase and remove punctuation.
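As a rough illustration using only the standard library, this cleaning step could be written as below. The function name `clean_text` is hypothetical, and only ASCII punctuation is removed.

```python
import string


def clean_text(text):
    """Lowercase the text and strip ASCII punctuation characters."""
    lowered = text.lower()
    # str.maketrans with a third argument maps those characters to None.
    return lowered.translate(str.maketrans("", "", string.punctuation))


print(clean_text("I LOVE Duolingo!!! It's fun, really."))
# prints: i love duolingo its fun really
```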
2. Tokenizing
Split each sentence into individual words (tokens).
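For text that is already lowercased and free of punctuation, a whitespace split is one simple way to tokenize. This is a sketch, not necessarily the tokenizer used in the project.

```python
def tokenize(text):
    """Split a cleaned sentence into tokens on any run of whitespace."""
    return text.split()


print(tokenize("i love duolingo its fun really"))
# prints: ['i', 'love', 'duolingo', 'its', 'fun', 'really']
```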
3. Stop Word Removal
Stop word removal is a form of filtering. The difference is that stop word removal only decides which words to remove or keep, whereas general filtering can also select things other than words.
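A small sketch of stop word removal is shown below. The stop word set is a short hypothetical list for illustration. A real project would typically take a fuller list from an NLP library.

```python
# Hypothetical, deliberately tiny stop word list.
STOP_WORDS = {"i", "it", "its", "the", "a", "an", "is", "and", "to", "of"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not in the stop word set, in order."""
    return [t for t in tokens if t not in stop_words]


print(remove_stop_words(["i", "love", "duolingo", "its", "fun", "really"]))
# prints: ['love', 'duolingo', 'fun', 'really']
```

Because `stop_words` is a parameter, words can be added to or removed from the list without changing the function.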
4. Stemming
Stemming reduces the number of distinct index terms in the data by returning words that carry a prefix or suffix to their base form.
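To make the idea concrete, here is a deliberately naive suffix-stripping stemmer. It is an illustration only, not the Porter algorithm and probably not the stemmer used in the project. It also ignores prefixes.

```python
# Suffixes are checked in this order; the list is illustrative only.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common English suffix, keeping a stem of 3+ letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print([naive_stem(w) for w in ["learning", "learned", "learns", "fun"]])
# prints: ['learn', 'learn', 'learn', 'fun']
```

Three surface forms collapse into the single term `learn`, which is how stemming shrinks the vocabulary. An established stemmer, such as NLTK's `PorterStemmer`, handles far more cases.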
D. Feature Extraction
Feature extraction converts text data into numerical vector representations, which gives a supervised learning model additional information and makes learning easier.
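The extraction method is not named here, so the sketch below uses a plain bag-of-words count as one possible choice. TF-IDF weighting would be a common alternative. The function names and the sample documents are hypothetical.

```python
def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({token for doc in documents for token in doc})
    return {token: i for i, token in enumerate(vocab)}


def vectorize(doc, vocabulary):
    """Count how often each vocabulary token occurs in one document."""
    vector = [0] * len(vocabulary)
    for token in doc:
        if token in vocabulary:  # tokens outside the vocabulary are skipped
            vector[vocabulary[token]] += 1
    return vector


docs = [["love", "duolingo", "love"], ["hate", "ads"]]
vocab = build_vocabulary(docs)
vectors = [vectorize(d, vocab) for d in docs]
print(vectors)
# prints: [[0, 1, 0, 2], [1, 0, 1, 0]]
```

Each review becomes a fixed-length vector whose columns follow the sorted vocabulary (`ads`, `duolingo`, `hate`, `love`), which is the form a supervised model expects.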
E. Building the Machine Learning Model
SPLITTING DATA (Data splitting, or data separation, divides the data into two or more subsets, typically a training set and a test set.)
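A minimal standard-library sketch of such a split follows. The 80/20 ratio, the fixed seed, and the name `split_data` are assumptions for illustration. Libraries such as scikit-learn provide an equivalent utility.

```python
import random


def split_data(items, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and cut it into train and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


train_set, test_set = split_data(range(10))
print(len(train_set), len(test_set))
# prints: 8 2
```

Shuffling before cutting keeps the test subset from simply being the first or last scraped reviews, and the seed makes the split repeatable.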
RESULT:
F. Visualization of Data
RESULT: