Exploratory Data Analysis of House Prices

Aliyyul Wafa

Summary

A house is a place of residence and is very valuable for human survival. However, many people do not consider several important aspects before buying a house. They learn market prices only from friends or particular websites, and as a result are sometimes dissatisfied with their choice. This analysis therefore examines several aspects that affect house prices.

Description

The data were obtained from Kaggle under the title "Real Estate Price Prediction". The dataset contains information about house prices in King County, Washington, United States, including the location, size, and amenities of each house. The dependent variable is the house price, which is a continuous variable. The dataset can be downloaded from the following link: https://www.kaggle.com/harlfoxem/housesalesprediction. After downloading, the dataset is read into Google Colab using the following command:
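A minimal sketch of this loading step, assuming the CSV inside the Kaggle download is named `kc_house_data.csv` and has already been uploaded to the Colab session. The helper names `load_house_data` and `quick_look` are hypothetical.

```python
import pandas as pd

# ASSUMPTION: file name of the CSV in the Kaggle download; adjust if yours differs.
CSV_PATH = "kc_house_data.csv"


def load_house_data(path: str = CSV_PATH) -> pd.DataFrame:
    """Read the house sales CSV into a DataFrame."""
    return pd.read_csv(path)


def quick_look(df: pd.DataFrame) -> None:
    """Print the shape, the first rows, and the missing-value count per column."""
    print(df.shape)
    print(df.head())
    print(df.isna().sum())
```

In a Colab cell this would be used as `df = load_house_data()` followed by `quick_look(df)`.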

The next step is data preprocessing:

In the preprocessing step, the "id" and "date" columns are removed because they are not considered important for the analysis. Next, missing values are filled with the median of their column. Finally, the categories in the "view" column are mapped to simpler categories.
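The three preprocessing steps described above can be sketched as follows. The two-way mapping of "view" (0 becomes "no_view", anything higher becomes "has_view") is an assumption, since the exact simplified categories are not stated. The sample rows are made-up values that only mimic the dataset's columns, and the function name `preprocess` is hypothetical.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop identifier columns, fill gaps with medians, simplify 'view'."""
    # Drop "id" and "date"; errors="ignore" keeps this safe if they are absent.
    out = df.drop(columns=["id", "date"], errors="ignore")

    # Fill missing numeric values with the median of their own column.
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # ASSUMPTION: collapse the numeric "view" rating into two categories.
    out["view"] = np.where(out["view"] > 0, "has_view", "no_view")
    return out


# Made-up sample rows, for illustration only.
sample = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "date": ["20141013T000000", "20141209T000000",
             "20150225T000000", "20141209T000000"],
    "price": [221900.0, 538000.0, np.nan, 604000.0],
    "sqft_living": [1180, 2570, 770, 1960],
    "view": [0, 0, 2, 4],
})

clean = preprocess(sample)
print(clean)
```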
The next step is normalization, for which the scikit-learn library is used. The data are normalized as follows:
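A minimal sketch of normalization with scikit-learn. The choice of `MinMaxScaler` (rescaling each numeric column to the range 0 to 1) is an assumption, because the text does not say which scaler was used. `StandardScaler` would be a drop-in alternative. The sample values are made up.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Rescale every numeric column to [0, 1]; leave other columns unchanged."""
    numeric_cols = df.select_dtypes(include="number").columns
    scaled_values = MinMaxScaler().fit_transform(df[numeric_cols])
    scaled = pd.DataFrame(scaled_values, columns=numeric_cols, index=df.index)

    # Re-attach non-numeric columns and restore the original column order.
    others = df.drop(columns=numeric_cols)
    return pd.concat([scaled, others], axis=1)[df.columns]


# Made-up sample rows, for illustration only.
houses = pd.DataFrame({
    "price": [221900.0, 538000.0, 180000.0, 604000.0],
    "sqft_living": [1180, 2570, 770, 1960],
    "view": ["no_view", "no_view", "has_view", "has_view"],
})

normalized = normalize(houses)
print(normalized)
```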

Visualization of the normalized data using seaborn:

Data visualization using matplotlib:
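A comparable sketch using matplotlib alone, drawing one histogram per column side by side. The plot type, the helper name `plot_histograms`, and the sample values are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values already scaled to [0, 1], for illustration only.
normalized = pd.DataFrame({
    "price": [0.10, 0.84, 0.00, 1.00, 0.78, 0.35],
    "sqft_living": [0.23, 1.00, 0.00, 0.66, 0.51, 0.40],
})


def plot_histograms(df: pd.DataFrame, columns: list, bins: int = 5):
    """Draw one histogram per column in a single row and return the Axes."""
    # squeeze=False always yields a 2D array, even for a single column.
    fig, axes = plt.subplots(1, len(columns), figsize=(4 * len(columns), 3),
                             squeeze=False)
    row = axes[0]
    for ax, column in zip(row, columns):
        ax.hist(df[column], bins=bins, edgecolor="black")
        ax.set_title(column)
        ax.set_xlabel("normalized value")
        ax.set_ylabel("count")
    fig.tight_layout()
    return row


hist_axes = plot_histograms(normalized, ["price", "sqft_living"])
```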

In addition to normalization, a correlation analysis between the variables is carried out using the following command:
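A minimal sketch of the correlation analysis with pandas, assuming Pearson correlation (the pandas default) over the numeric columns only. The sample rows are made up and only mimic the dataset's column names.

```python
import pandas as pd

# Made-up sample rows, for illustration only.
houses = pd.DataFrame({
    "price": [221900.0, 538000.0, 180000.0, 604000.0, 510000.0],
    "sqft_living": [1180, 2570, 770, 1960, 1680],
    "bedrooms": [3, 3, 2, 4, 3],
    "view": ["no_view", "no_view", "has_view", "has_view", "no_view"],
})

# Keep numeric columns only; the categorical "view" column is excluded.
corr = houses.select_dtypes(include="number").corr()

# Rank the other variables by their correlation with the target, "price".
price_corr = corr["price"].drop("price").sort_values(ascending=False)

print(corr.round(2))
print(price_corr.round(2))
```

Sorting the "price" column of the matrix gives a quick ranking of which variables move most closely with the house price.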

Visualization using seaborn:
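A correlation matrix is commonly shown as an annotated heatmap. The following sketch assumes that plot type and uses made-up sample rows. The color map and the fixed range of -1 to 1 are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample rows, for illustration only.
houses = pd.DataFrame({
    "price": [221900.0, 538000.0, 180000.0, 604000.0, 510000.0],
    "sqft_living": [1180, 2570, 770, 1960, 1680],
    "bedrooms": [3, 3, 2, 4, 3],
})
corr = houses.corr()

fig, heat_ax = plt.subplots(figsize=(5, 4))
# annot=True writes each coefficient into its cell; vmin/vmax pin the color scale.
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, ax=heat_ax)
heat_ax.set_title("Correlation between variables")
fig.tight_layout()
```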

 

Visualization using matplotlib:
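The same correlation matrix can be drawn with matplotlib alone using `imshow`. This is a sketch under the same assumptions (made-up sample rows, illustrative color map), and the helper name `plot_corr_matrix` is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows, for illustration only.
houses = pd.DataFrame({
    "price": [221900.0, 538000.0, 180000.0, 604000.0, 510000.0],
    "sqft_living": [1180, 2570, 770, 1960, 1680],
    "bedrooms": [3, 3, 2, 4, 3],
})
corr = houses.corr()


def plot_corr_matrix(corr: pd.DataFrame):
    """Render a correlation matrix as a colored grid and return its Axes."""
    fig, ax = plt.subplots(figsize=(5, 4))
    image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    fig.colorbar(image, ax=ax, label="correlation")

    # Label both axes with the variable names.
    positions = range(len(corr.columns))
    ax.set_xticks(positions)
    ax.set_xticklabels(corr.columns, rotation=45, ha="right")
    ax.set_yticks(positions)
    ax.set_yticklabels(corr.index)
    ax.set_title("Correlation between variables")
    fig.tight_layout()
    return ax


matrix_ax = plot_corr_matrix(corr)
```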

Related Course Information
  Category: Data Science / Big Data
  Course: Riset Kecerdasan Artifisial (SIB AI-RESEARCH)