FISH WEIGHT PREDICTION WITH LINEAR REGRESSION

ATI ZAIDIAH


Summary

The purpose of this project is to predict the weight of a fish from its length, height and width measurements.

Description

Data understanding: The dataset was taken from kaggle.com and covers 7 common fish species sold at a fish market. With this dataset, fish weight can be predicted using linear regression. The data consist of 159 records and 7 columns. The dependent variable is the weight of the fish, while the independent variables are length1, length2, length3, height and width.

Data Structure:

Species, weight, length1, length2, length3, height, width
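
As a rough illustration of the structure above, the seven columns could be read into Python with the standard csv module. This is only a sketch: the header spelling (Species, Weight, Length1, ...) is an assumption about the CSV file, the helper name load_fish_rows is hypothetical, and the two sample rows are made up for illustration, not taken from the Kaggle data.

```python
import csv
import io

# Assumed header names for the six numeric columns (not confirmed by the source).
NUMERIC_COLUMNS = ["Weight", "Length1", "Length2", "Length3", "Height", "Width"]


def load_fish_rows(file_obj):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    rows = []
    for record in csv.DictReader(file_obj):
        row = {"Species": record["Species"]}
        for column in NUMERIC_COLUMNS:
            row[column] = float(record[column])
        rows.append(row)
    return rows


# Made-up sample rows, used only to show the expected layout.
SAMPLE_CSV = """Species,Weight,Length1,Length2,Length3,Height,Width
FishA,100.0,10.0,11.0,12.0,3.0,2.0
FishB,200.0,20.0,22.0,24.0,6.0,4.0
"""

rows = load_fish_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["Species"], rows[0]["Weight"])
```

With a downloaded copy of the dataset, the same function could be given a file opened with open(path, newline=""), where path is wherever the CSV was saved.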

The dataset used is as shown in Figure 1 below:

Figure 1. Fish species dataset (source kaggle.com)

 

DATA VISUALIZATION

Visualization was carried out to compare the weight of the fish with length1, length2, length3, height and width.

The data were first visualized in MS Excel, comparing the following pairs:

  1. Weight with Length1
  2. Weight with Length2
  3. Weight with Length3
  4. Weight with height
  5. Weight with width

The scatter plots in Figure 2 illustrate the five comparisons above.

 

 

 

Figure 2. Graphics visualization

 

From Figure 2, fish weight appears to have a strong positive relationship with length1, length2, length3, height and width: as each of these measurements increases, the weight of the fish increases as well.

To check whether the visualization is consistent, the data were normalized by dividing each value of a variable by that variable's maximum value. One of the normalized plots is presented here.

Figure 3. Data Visualization after normalization

From Figure 3, the plots before and after normalization look almost the same, so the visualization results can be considered consistent.
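
The normalization described above (dividing each value by the maximum of its variable) can be sketched in a few lines of Python. The function name normalize_by_max and the sample weights are hypothetical; the sketch also assumes the maximum is positive, which holds for physical measurements such as weight and length.

```python
def normalize_by_max(values):
    """Scale a list of numbers so that its largest value becomes 1.0."""
    max_value = max(values)
    if max_value <= 0:
        raise ValueError("maximum must be positive for max-normalization")
    return [value / max_value for value in values]


# Made-up weights in grams, for illustration only.
weights = [150.0, 300.0, 600.0]
normalized_weights = normalize_by_max(weights)
print(normalized_weights)
```

Because every value of a variable is divided by the same positive constant, only the axis scale changes and the shape of the scatter plot stays the same, which is consistent with Figure 3 resembling the earlier plots.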

After visualizing the data with scatter plots, the next step is to perform the calculations using the data analysis tools in Excel.

The following are the results of the linear regression analysis, namely the correlation between the dependent variable and each independent variable.

Figure 4. Correlation between weight and length1

From the figure above, the correlation between weight and length1 is high and positive, with a correlation value above 0.9.

Figure 5. Correlation between weight and length2

 

From the figure above, the correlation between weight and length2 is high and positive, with a correlation value above 0.9.

Figure 6. Correlation between weight and length3

From the figure above, the correlation between weight and length3 is high and positive, with a correlation value above 0.9.

Figure 7. Correlation between weight and height

 

From the figure above, the correlation between weight and height is positive and fairly high, with a correlation value below 0.9 (0.72, the lowest of the five variables).

Figure 8. Correlation between weight and width

From the figure above, the correlation between weight and width is positive and fairly high, with a correlation value approaching 0.9.
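
The correlation values in Figures 4 to 8 were produced in Excel. Assuming they are Pearson correlation coefficients (the source does not name the coefficient), the calculation can be sketched as below. The function name pearson_correlation and the sample measurements are hypothetical and are not values from the dataset.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up measurements, for illustration only.
lengths_cm = [20.0, 25.0, 30.0, 35.0]
weights_g = [160.0, 290.0, 470.0, 580.0]

r = pearson_correlation(lengths_cm, weights_g)
print(round(r, 3))
```

A value close to 1 indicates a strong positive linear relationship. The coefficient does not change when either variable is divided by a positive constant, so max-normalized data give the same correlation as the raw data.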

Conclusion

The analysis shows that all independent variables are positively correlated with weight. The highest correlation is between weight and length2 (0.92), and the lowest is between weight and height (0.72).
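
Since the goal of the project is to predict weight with linear regression, a single-predictor least-squares fit is sketched below as one possible way to turn a strongly correlated variable into a prediction. This is not the author's Excel workflow: the function names are hypothetical and the sample values are made up, not taken from the dataset.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(slope, intercept, length):
    """Predict a weight from a single length measurement using the fitted line."""
    return slope * length + intercept


# Made-up lengths (cm) and weights (g), for illustration only.
sample_lengths = [20.0, 25.0, 30.0, 35.0]
sample_weights = [150.0, 300.0, 450.0, 600.0]

slope, intercept = fit_simple_linear_regression(sample_lengths, sample_weights)
print(slope, intercept)
print(predict_weight(slope, intercept, 28.0))
```

A fit like this uses one independent variable at a time. Using length1, length2, length3, height and width together would require multiple linear regression, which this sketch does not cover.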

 

 

Related Course Information
  Category: Data Science / Big Data
  Course: Persiapan Ujian Sertifikasi Internasional DSBIZ - AIBIZ