Regression: Predicting House Prices

Wahyu Cakra Ningrat

Summary

One goal of this work is to demonstrate, step by step, how to analyze and visualize a dataset in order to predict future home prices. It also explains most of the concepts used, so that it is clear why a given model or metric is chosen. Using features such as sqft_living, bathrooms, bedrooms, views and more, we build a deep learning model that can predict future home prices.

Description

  • Connect Google Drive (Gdrive), import the Python libraries, and read the dataset.

  • Identify the numeric features, whose values change from sample to sample. Knowing which features are numeric makes it easier to select the appropriate plot for visualization.

  • The Pearson correlation coefficient tests the strength and direction of the linear relationship between two continuous variables. The coefficient ranges from -1 to +1. The greater its absolute value, the stronger the relationship between the variables: an absolute value of 1 indicates a perfect linear relationship, while a value close to 0 indicates that there is no linear relationship. The sign indicates the direction. If the two variables tend to rise or fall together, the coefficient is positive and the line representing the correlation slopes upward. If one variable tends to increase when the other decreases, the coefficient is negative and the line slopes downward.
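
The definition above can be sketched directly in NumPy. This is a minimal illustration, not the notebook's own code: the function name pearson_r and the toy numbers are assumptions made for the example.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center each variable on its mean.
    dx = x - x.mean()
    dy = y - y.mean()
    # Co-variation divided by the product of the two spreads.
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical toy data, not taken from the housing dataset.
sqft = [1000, 1500, 2000, 2500, 3000]
price_up = [200, 290, 410, 500, 620]    # tends to rise with sqft
price_down = [620, 500, 410, 290, 200]  # tends to fall as sqft rises

r_up = pearson_r(sqft, price_up)
r_down = pearson_r(sqft, price_down)
print(f"rising together: r = {r_up:.3f}")
print(f"moving opposite: r = {r_down:.3f}")
```

The first pair gives a coefficient close to +1 and the second close to -1, matching the sign rule described above. NumPy's np.corrcoef returns the same value as part of a correlation matrix.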

  • Price Correlation

  • Feature Price

  • A box plot is a method for graphically depicting groups of numeric data through their quartiles. Box plots may also have lines extending from the boxes (whiskers) that indicate variability beyond the upper and lower quartiles, hence the name box-and-whisker plot. Outliers are plotted as individual points. The spacing between the different parts of the box indicates the degree of dispersion (spread).
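
The numbers behind a box plot can be computed by hand. The sketch below assumes the common convention that whiskers extend to the most extreme points within 1.5 times the interquartile range of the box; the source does not state which convention the notebook's plots use. The function name and the sample values are hypothetical.

```python
import numpy as np

def box_plot_summary(values, whisker=1.5):
    """Return the quartiles, whisker ends, and outliers of a numeric sample."""
    v = np.sort(np.asarray(values, dtype=float))
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1  # height of the box
    # Assumed convention: whiskers reach the most extreme points that lie
    # within `whisker` * IQR of the box; anything beyond is an outlier.
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = v[(v >= low_fence) & (v <= high_fence)]
    outliers = v[(v < low_fence) | (v > high_fence)]
    return {
        "q1": float(q1), "median": float(median), "q3": float(q3),
        "iqr": float(iqr),
        "whisker_low": float(inside.min()),
        "whisker_high": float(inside.max()),
        "outliers": outliers.tolist(),
    }

# Hypothetical prices (in thousands); the last one is far from the rest.
prices = [210, 250, 300, 320, 350, 400, 450, 520, 2500]
summary = box_plot_summary(prices)
print(summary)
```

Here the single very expensive value falls outside the upper fence, so it would be drawn as an individual point rather than as part of the whisker.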

  • Removing some features (the Id, zip code, and Date features) leaves less data to handle, which speeds up the notebook and makes it easier to work with.
  • Looking at the box plot, there is no big difference between 2014 and 2015.
  • The number of houses sold per month tends to be the same every month.
  • The line plot shows that around April there was an increase in house prices.
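
Comparing years and months while also dropping the Date feature suggests that year and month columns are derived from the date first. The pandas sketch below works under that assumption; the column names, date format, and values are hypothetical, not taken from the dataset.

```python
import pandas as pd

# Hypothetical sales records; column names and values are assumptions.
sales = pd.DataFrame({
    "date": ["2014-05-02", "2014-05-20", "2014-11-11", "2015-04-03", "2015-04-28"],
    "price": [300_000, 340_000, 310_000, 380_000, 400_000],
})

# Derive year and month from the date, then drop the raw date column.
parsed = pd.to_datetime(sales["date"])
sales["year"] = parsed.dt.year
sales["month"] = parsed.dt.month
sales = sales.drop(columns=["date"])

# Sales count and mean price per month: the inputs for a count plot
# and a line plot of price by month.
per_month = sales.groupby("month")["price"].agg(["count", "mean"])
print(per_month)
```

Grouping by "year" instead of "month" gives the data for the year-by-year box plot comparison.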

  • Train the model and predict the required solution. There are 60+ predictive modeling algorithms to choose from, so we must understand the type of problem and the solution required in order to narrow them down to a select few models that we can evaluate. The problem here is a regression problem, with mean squared error as the measure of fit. Because we train the model on a dataset in which the prices are known, this is a category of machine learning called supervised learning.

  • To prevent data leakage from the test set, we fit the scaler only on the training set and then use it to transform both the training and test sets.
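
A minimal sketch of leakage-free scaling follows. It assumes min-max scaling, since the source does not say which scaler is used, and the small feature matrices are hypothetical.

```python
import numpy as np

# Hypothetical feature matrices: rows are houses, columns are features.
X_train = np.array([[1000.0, 2.0], [2000.0, 3.0], [3000.0, 4.0]])
X_test = np.array([[4000.0, 3.0]])

# "Fit": learn the per-feature minimum and range from the training set only.
train_min = X_train.min(axis=0)
train_range = X_train.max(axis=0) - train_min

def scale(X):
    """Apply the training-set statistics to any feature matrix."""
    return (X - train_min) / train_range

X_train_scaled = scale(X_train)
X_test_scaled = scale(X_test)  # the test set never influences the statistics
print(X_test_scaled)
```

Test values may land outside the 0 to 1 range, as the first feature does here. That is expected, because the test set played no part in choosing the minimum and range. Library scalers such as scikit-learn's MinMaxScaler follow the same fit-then-transform pattern.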

  • Set the number of neurons (units) according to the number of features. Example: X_train.shape is (15117, 19), so there are 19 features. The model is trained by gradient descent with the Adam optimizer and the mean squared error loss function.
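
To show what Adam with a mean squared error loss does at each step, here is a NumPy sketch. It is not the notebook's network: a single linear unit with 19 weights stands in for the model, the data is randomly generated, and the learning rate and other hyperparameters are assumed values chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples with 19 features, mirroring the second
# dimension of the X_train.shape example.
n_samples, n_features = 200, 19
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w + 0.01 * rng.normal(size=n_samples)

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))

# A single linear unit stands in for the network: one weight per feature.
w = np.zeros(n_features)
m = np.zeros(n_features)  # Adam first-moment (mean of gradients) estimate
v = np.zeros(n_features)  # Adam second-moment (mean of squared gradients)
lr, beta1, beta2, eps = 0.05, 0.9, 0.999, 1e-8  # assumed hyperparameters

initial_loss = mse(y, X @ w)
for t in range(1, 501):
    grad = 2.0 / n_samples * X.T @ (X @ w - y)  # gradient of MSE w.r.t. w
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
    v_hat = v / (1 - beta2 ** t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)
final_loss = mse(y, X @ w)

print(f"MSE before training: {initial_loss:.4f}")
print(f"MSE after training:  {final_loss:.6f}")
```

In a deep learning framework the same two choices are usually passed as the optimizer and loss when the model is compiled, and the framework performs these updates for every layer.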

  • Training the model

  • Run the model on the test set to get a list of predictions, then compare the predictions with the true values. Use several different metrics for the comparison.
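
The source does not name the metrics, so the sketch below assumes four that are commonly used for regression: mean absolute error, mean squared error, root mean squared error, and R². The prices are hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compare true values with predictions using common regression metrics."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mae = np.mean(np.abs(errors))  # average size of the error, in dollars
    mse = np.mean(errors ** 2)     # penalizes large errors more heavily
    rmse = np.sqrt(mse)            # back in dollars
    # R^2: share of the variance in y_true explained by the predictions.
    r2 = 1.0 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"mae": float(mae), "mse": float(mse),
            "rmse": float(rmse), "r2": float(r2)}

# Hypothetical true prices and model predictions.
y_true = [200_000, 300_000, 400_000, 500_000]
y_pred = [210_000, 290_000, 420_000, 480_000]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Comparing MAE or RMSE with the typical house price in the data gives a sense of whether the error is large or small in practical terms.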

  • Plot the model's predictions against a perfect fit to see how accurate the model is.
  • The red line represents the perfect fit.
  • There are outliers, which are expensive homes. This model is not able to predict the prices of luxury homes.
  • On the other hand, the model is good at predicting house prices between $0 and $2 million, where the predictions clearly match the perfect fit.
  • Retrain the model using only homes priced under $3 million.
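
Restricting the data to homes below a price cap before retraining can be sketched with a boolean mask. The arrays below are hypothetical; only the $3 million threshold comes from the text.

```python
import numpy as np

# Hypothetical prices and a one-column feature matrix aligned row by row.
prices = np.array([250_000, 900_000, 3_500_000, 1_800_000, 7_000_000])
features = np.array([[1200], [2600], [6100], [3900], [9800]])

price_cap = 3_000_000      # threshold mentioned in the text
keep = prices < price_cap  # True for homes under the cap

# Apply the same mask to targets and features so rows stay aligned.
prices_capped = prices[keep]
features_capped = features[keep]
print(f"kept {keep.sum()} of {len(prices)} homes")
```

The filtered arrays would then go through the same split, scaling, and training steps as before.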

  • The model is then used to predict the price of a new house: we select the first house from the dataset and drop its price. single_house then holds all the features needed to predict the price. After that, the variable must be reshaped and the features scaled.
  • The original price is $221,900, and the model's prediction is around $280,000.
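
The reshape and scale steps for a single house can be sketched as follows. The two-feature data and the min-max scaling are assumptions made for the illustration; a real run would reuse the scaler fitted during training and the full set of features. The two prices are the ones quoted above, with the prediction taken as the approximate $280,000.

```python
import numpy as np

# Hypothetical training features (3 houses, 2 features) and the scaling
# statistics learned from them.
X_train = np.array([[1000.0, 2.0], [2000.0, 3.0], [3000.0, 4.0]])
train_min = X_train.min(axis=0)
train_range = X_train.max(axis=0) - train_min

# One house as a flat feature vector, with the price already dropped.
single_house = np.array([1180.0, 3.0])

# Models expect a 2-D batch, so reshape to (1, n_features) ...
single_house_2d = single_house.reshape(1, -1)
# ... and scale with the training statistics, exactly as for the test set.
single_house_scaled = (single_house_2d - train_min) / train_range
print(single_house_scaled.shape)

# Size of the miss for the prices quoted in the text.
actual_price, predicted_price = 221_900, 280_000
abs_error = abs(predicted_price - actual_price)
pct_error = 100 * abs_error / actual_price
print(f"off by ${abs_error:,} ({pct_error:.1f}% of the actual price)")
```

The scaled array has one row and one column per feature, which is the shape a model's predict call typically expects for a single sample.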

 

Related Course Information
  Category: Artificial Intelligence
  Course: Machine Learning For Beginner