Helmi Sulaeman
In this project, I classify mobile phones into price ranges.
Context
To find relationships between the features of a mobile phone (RAM, internal memory, etc.) and its selling price.
Exploratory Data Set
Link = kaggle kernels output ibraheemseyam/mobile-price-classification-99 -p /path/to/dest
Train And Test Data
Output
Comparing content between train and test data
Values in training data but not testing data: n_cores []
Values in training data but not testing data: blue []
Values in training data but not testing data: dual_sim []
Values in training data but not testing data: four_g []
Values in training data but not testing data: three_g []
Values in training data but not testing data: touch_screen []
Values in training data but not testing data: wifi []
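The code behind the comparison above is not included here. A minimal sketch of one way to produce a report in this format is below. The function name `values_missing_from_test` and the toy tables are hypothetical; the function only assumes that `train[col]` and `test[col]` are iterable, so it should work with pandas DataFrames as well as the plain dicts used here.

```python
def values_missing_from_test(train, test, columns):
    """Return, per column, the values seen in train but never in test."""
    report = {}
    for col in columns:
        # Set difference: values present in training but absent from testing.
        missing = sorted(set(train[col]) - set(test[col]))
        report[col] = missing
        print(f"Values in training data but not testing data: {col} {missing}")
    return report


# Tiny made-up tables standing in for df_train and df_test (illustrative only).
toy_train = {"n_cores": [1, 2, 4, 8], "wifi": [0, 1, 1, 0]}
toy_test = {"n_cores": [1, 2, 4], "wifi": [1, 0, 0, 1]}

report = values_missing_from_test(toy_train, toy_test, ["n_cores", "wifi"])
```

An empty list for every column, as in the output above, means each categorical value in the training data also appears in the test data.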
graph(df_train, 'price_range', 2)
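# the function names below match PyCaret's classification module (assumed import)
from pycaret.classification import setup, compare_models, create_model, tune_model, plot_model

# creating features
combine = [df_train, df_test]
for dataset in combine:
    dataset['total_pixels'] = dataset['px_height'] * dataset['px_width']
    dataset['screen_area'] = dataset['sc_h'] * dataset['sc_w']
    # short-range connections like Bluetooth or Wi-Fi are interchangeable;
    # having neither is a big disadvantage
    dataset['connectivity'] = 0
    # .loc avoids pandas chained assignment, which may silently fail to write
    dataset.loc[dataset['blue'] == 1, 'connectivity'] = 1
    dataset.loc[dataset['wifi'] == 1, 'connectivity'] = 1

# set up the experiment, rank models by AUC with 10-fold CV, then tune the best one
clf = setup(df_train, target='price_range')
best = compare_models(sort='AUC', n_select=1, fold=10)
best_classifier = create_model(best)
best_classifier_tuned = tune_model(best_classifier)
plot_model(best_classifier_tuned, plot='feature')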
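To make the derived features concrete, here is a small sketch that applies the same three rules to a single phone using plain Python. The helper `derive_features` and the sample values are hypothetical and only illustrate the arithmetic; the column names come from the code above.

```python
def derive_features(phone):
    """Return the three derived features for one phone given as a dict."""
    # Bluetooth and Wi-Fi are treated as interchangeable: either one counts.
    has_short_range = phone["blue"] == 1 or phone["wifi"] == 1
    return {
        "total_pixels": phone["px_height"] * phone["px_width"],
        "screen_area": phone["sc_h"] * phone["sc_w"],
        "connectivity": 1 if has_short_range else 0,
    }


# Made-up phone: no Bluetooth, but Wi-Fi is available.
sample_phone = {
    "px_height": 20, "px_width": 756,
    "sc_h": 9, "sc_w": 7,
    "blue": 0, "wifi": 1,
}

derived = derive_features(sample_phone)
print(derived)
```

Here `connectivity` is a logical OR of the two flags, so it is 0 only for phones that have neither Bluetooth nor Wi-Fi.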
# trying the predictor on held-out data
predict_model(best_classifier_tuned)
# visualizing the predictions
pred_df = predict_model(model_mobile_price, data=df_test)
pred_df['price_range'] = pred_df['Label']
pred_df = pred_df.drop('Label', axis=1)
pred_df['dataset'] = 'predictions'
df_train['dataset'] = 'training'
merged_df = pd.concat([df_train, pred_df]).reset_index(drop=True)
graph(merged_df, 'dataset')
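Alongside the plot, the training and predicted label distributions can be compared numerically. The sketch below is one possible check, not part of the original analysis: `class_shares` is a hypothetical helper and the label lists are made-up stand-ins for `df_train['price_range']` and `pred_df['price_range']`.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of samples in each class, keyed by class label."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Made-up labels (illustrative only), using the four price ranges 0 to 3.
toy_training_labels = [0, 1, 2, 3, 0, 1, 2, 3]
toy_predicted_labels = [0, 0, 1, 2, 3, 3, 3, 3]

training_shares = class_shares(toy_training_labels)
predicted_shares = class_shares(toy_predicted_labels)

# Print the two distributions side by side, one class per line.
for cls in sorted(set(training_shares) | set(predicted_shares)):
    print(cls, training_shares.get(cls, 0.0), predicted_shares.get(cls, 0.0))
```

A large gap between the two shares for a class would suggest that the model over- or under-predicts that price range, or that the test set differs from the training set.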
Conclusion