Image Classification

Alya Shafwa Wafiya

Summary

This report covers the steps taken to build a cat-and-dog image classification model using the "Cat and Dog" dataset. The model can be used to recognize and distinguish between images of cats and dogs.

Description

Project Overview

  • Project Objective: The aim of this project is to develop a machine learning model that can classify cat and dog images with high accuracy. The model will be used to automatically distinguish between cat and dog images. 
  • Business Understanding: Business understanding means identifying the business problem or opportunity the project addresses. Here, the goal is to automate the recognition and classification of cat and dog images. This benefits pet-related platforms and applications by making it more efficient to organize, search, and discover pet images.
  • Problem Statement: The problem to be solved is the difficulty in manually differentiating between cat and dog images. With a classification model that can perform this task automatically, users can save time and effort in organizing and searching for pet images.
  • Proposed Solution: The proposed solution is a neural network model, built with Keras on a pre-trained MobileNetV2 base, that is trained on a dataset of cat and dog images and then predicts whether a new image depicts a cat or a dog.
  • Methods and Tools Used: This project applies supervised image classification with transfer learning. The tools used are the Python programming language and the TensorFlow library with its Keras API.
  • Dataset: The dataset used is the "Cat and Dog" dataset by SCHUBERTSLYSCHUBERT, which contains cat and dog images collected from open sources. 
  • Evaluation: The model will be evaluated using metrics such as accuracy, precision, recall, and F1-score to measure its performance in classifying cat and dog images.
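
The four metrics named above can be computed directly from true and predicted labels. The following is a minimal sketch, not the project's actual evaluation code: the function name `binary_classification_metrics`, the sample labels, and the choice of "dog" (label 1) as the positive class are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions/labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight images: 0 = cat, 1 = dog.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can hide errors on one class when the dataset is imbalanced, which is why precision, recall, and F1 are reported alongside it.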

By understanding this project overview, we can have a general understanding of the project's objectives, business context, problem to be solved, proposed solution, and the methods and tools used in this project.

Data Understanding

  • Data Source: The dataset used in this project is sourced from "Cat and Dog" by SCHUBERTSLYSCHUBERT. This dataset contains cat and dog images collected from open sources.
  • Data Size: The dataset contains both cat and dog images. This report does not state the exact count, so the total and the per-class counts should be confirmed during data exploration.
  • Data Format: Each sample is stored as an image file, such as JPEG or PNG. The images vary in size and resolution.
  • Target Variable: The target variable in this dataset is a label indicating whether the image depicts a cat or a dog. This label is typically represented by a number, for example, 0 for cat and 1 for dog.
  • Data Exploration: In the Data Understanding stage, we can perform data exploration to understand the characteristics of the dataset. Some steps of data exploration that can be done include:
  1. Viewing sample cat and dog images from the dataset. 
  2. Checking the class distribution (number of cat images vs number of dog images) to ensure dataset balance. 
  3. Examining the size and resolution of the images to determine preprocessing steps required. 
  4. Performing simple statistical analysis such as counting the total number of images in the dataset. 
  • Data Preprocessing: In the Data Understanding stage, preprocessing steps may be required to prepare the data before the modeling process. Some examples of preprocessing that may be needed are:
  1. Resizing the images to have a uniform size. 
  2. Normalizing pixel intensities to improve model convergence. 
  3. Splitting the dataset into training and testing subsets.

 By understanding the characteristics and condition of the data in the dataset, we can determine the next steps in the data preparation, modeling, and evaluation processes of the machine learning model.
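
The class-distribution check and the train/test split described above could look roughly like the sketch below. It uses only the standard library and assumes a hypothetical folder layout with one sub-folder per class (for example `cats/` and `dogs/`). The helper names `list_images_by_class` and `split_per_class`, the 20% test fraction, and the seed are illustrative choices, not taken from the project.

```python
import random
from pathlib import Path
from typing import Dict, List, Tuple

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images_by_class(root: Path) -> Dict[str, List[Path]]:
    """Map each class sub-folder name to the sorted image paths inside it."""
    return {
        class_dir.name: sorted(
            p
            for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()
    }


def split_per_class(
    images_by_class: Dict[str, List[Path]],
    test_fraction: float = 0.2,
    seed: int = 42,
) -> Tuple[List[Tuple[Path, str]], List[Tuple[Path, str]]]:
    """Shuffle each class separately and hold out a fraction for testing."""
    rng = random.Random(seed)
    train: List[Tuple[Path, str]] = []
    test: List[Tuple[Path, str]] = []
    for label, paths in images_by_class.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test += [(p, label) for p in shuffled[:n_test]]
        train += [(p, label) for p in shuffled[n_test:]]
    return train, test


# Usage, with a hypothetical dataset location:
# images = list_images_by_class(Path("data/training_set"))
# print({label: len(paths) for label, paths in images.items()})  # class balance
# train_items, test_items = split_per_class(images)
```

Splitting each class separately keeps the cat-to-dog ratio roughly the same in both subsets.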

 

Data Preparation

 

Here is an explanation of the code:

• The first step is to import the necessary libraries:

  1. tensorflow for creating and training the model.
  2. ImageDataGenerator from tensorflow.keras.preprocessing.image for data augmentation and preparing data generators for training and validation.
  3. MobileNetV2 from tensorflow.keras.applications as the base model to be used.
  4. Input from tensorflow.keras.layers to create the input layer of the model.

• Next, you define the training directory with the variable TRAINING_DIR, which indicates the location of the training dataset.

• You use ImageDataGenerator to perform data augmentation on the training images. The augmentations applied include rescaling, rotation, horizontal and vertical shifts, shearing, zooming, and horizontal flipping. Additionally, 20% of the training data is held out for validation using the validation_split parameter.

• After that, you use flow_from_directory to create the training and validation generators. You specify class_mode='categorical' to generate one-hot encoded categorical labels. The images are also resized to 150x150 pixels using the target_size parameter.

• Then, you define an ImageDataGenerator for the validation data, which only performs rescaling.

• Once all the generator settings are in place, you define a callback class that subclasses tf.keras.callbacks.Callback. It stops the training once both the training and validation accuracies exceed a specified threshold (0.92 in this example).

By understanding this code, you have set up the necessary libraries, prepared the data generators with augmentation, and defined a callback for stopping the training based on accuracy.
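
Since the original code is not reproduced here, the following is a hedged reconstruction of the steps described above, not the author's exact code. The path in `TRAINING_DIR`, the specific augmentation values, the batch size, and the names `make_generators` and `StopAtAccuracy` are assumptions. The 150x150 target size, the 20% validation split, `class_mode='categorical'`, and the 0.92 threshold come from the description.

```python
import tensorflow as tf

TRAINING_DIR = "data/training_set"  # hypothetical path, one sub-folder per class
TARGET_SIZE = (150, 150)
ACCURACY_THRESHOLD = 0.92


def make_generators(training_dir=TRAINING_DIR, batch_size=32):
    """Build an augmented training generator and a rescale-only validation one."""
    ImageDataGenerator = tf.keras.preprocessing.image.ImageDataGenerator

    # Augmentation ranges below are assumed values, not taken from the report.
    train_datagen = ImageDataGenerator(
        rescale=1.0 / 255,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        validation_split=0.2,
    )
    # Validation images are only rescaled; the same split fraction is set so
    # that subset="validation" selects the held-out 20%.
    val_datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

    train_generator = train_datagen.flow_from_directory(
        training_dir,
        target_size=TARGET_SIZE,
        batch_size=batch_size,
        class_mode="categorical",
        subset="training",
    )
    validation_generator = val_datagen.flow_from_directory(
        training_dir,
        target_size=TARGET_SIZE,
        batch_size=batch_size,
        class_mode="categorical",
        subset="validation",
    )
    return train_generator, validation_generator


class StopAtAccuracy(tf.keras.callbacks.Callback):
    """Stop training once training and validation accuracy both pass a threshold."""

    def __init__(self, threshold=ACCURACY_THRESHOLD):
        super().__init__()
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        acc = logs.get("accuracy", 0.0)
        val_acc = logs.get("val_accuracy", 0.0)
        if acc > self.threshold and val_acc > self.threshold:
            self.model.stop_training = True
```

Requiring both accuracies to pass the threshold avoids stopping on an epoch where the model fits the training set well but does not yet generalize to the validation images.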

Modelling

Here is an explanation of the code:

• A pre-trained MobileNetV2 model is loaded using the MobileNetV2 function from tensorflow.keras.applications. The weights parameter is set to "imagenet" to load the weights pre-trained on the ImageNet dataset. include_top is set to False to exclude the fully connected layers of the model, and input_tensor is defined to specify the input shape as (150, 150, 3).

• The layers of the pre-trained model are set as non-trainable to freeze their weights and prevent them from being updated during training.

• The output of the pre-trained model is stored in the last_output variable.

• Additional layers are added to the model. A Flatten layer is used to flatten the output, followed by a Dropout layer to reduce overfitting. Then, two fully connected (Dense) layers with ReLU activation functions are added, and the final output layer with softmax activation is added with 2 units representing the two classes (cat and dog).

• The model is created using tf.keras.models.Model by specifying the input and output layers.

• The learning rate (int_lr) and the number of epochs (num_epochs) are defined.

• The model is compiled with the Adam optimizer, categorical cross-entropy loss function, and accuracy metric.

• The model is trained using the fit function. The training data is provided by the train_generator, and validation data is provided by the validation_generator. The training is stopped if the accuracy of both training and validation exceeds 0.92, as defined in the callbacks parameter.

• The trained model is saved in the SavedModel format using tf.saved_model.save.

• The SavedModel is converted to a TensorFlow Lite model (tflite_model) using tf.lite.TFLiteConverter.

• The TensorFlow Lite model is saved as a .tflite file using the write_bytes method of pathlib.Path.

This code loads the pre-trained MobileNetV2 model, adds additional layers, compiles and trains the model, and finally saves it as a TensorFlow Lite model for deployment on resource-constrained devices.
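
The model construction and export steps can be sketched as follows. This is an approximation under stated assumptions rather than the original code: the dropout rate (0.5), the Dense layer widths (128 and 64), the learning rate (1e-4), and the function names `build_model` and `export_tflite` are not given in the description and are chosen for illustration. The `weights` argument is exposed so the architecture can be built with `weights=None`, without downloading the ImageNet weights.

```python
import pathlib

import tensorflow as tf


def build_model(weights="imagenet", input_shape=(150, 150, 3), learning_rate=1e-4):
    """Frozen MobileNetV2 base with a small classification head for two classes."""
    inputs = tf.keras.layers.Input(shape=input_shape)
    base = tf.keras.applications.MobileNetV2(
        weights=weights, include_top=False, input_tensor=inputs
    )
    # Freeze the pre-trained layers so only the new head is trained.
    for layer in base.layers:
        layer.trainable = False

    last_output = base.output
    x = tf.keras.layers.Flatten()(last_output)
    x = tf.keras.layers.Dropout(0.5)(x)  # assumed rate
    x = tf.keras.layers.Dense(128, activation="relu")(x)  # assumed width
    x = tf.keras.layers.Dense(64, activation="relu")(x)  # assumed width
    outputs = tf.keras.layers.Dense(2, activation="softmax")(x)

    model = tf.keras.models.Model(inputs=inputs, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def export_tflite(model, saved_model_dir, tflite_path):
    """Save as a SavedModel, convert it to TensorFlow Lite, and write the file."""
    tf.saved_model.save(model, saved_model_dir)
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()
    pathlib.Path(tflite_path).write_bytes(tflite_model)


# Usage, assuming train_generator, validation_generator, num_epochs and a
# stopping callback are defined as described in the text:
# model = build_model()
# history = model.fit(
#     train_generator,
#     validation_data=validation_generator,
#     epochs=num_epochs,
#     callbacks=[stop_callback],
# )
# export_tflite(model, "saved_model/", "model.tflite")  # hypothetical paths
```

Depending on the installed TensorFlow and Keras versions, `model.export()` may be a more direct route to a SavedModel; the sketch keeps `tf.saved_model.save` because that is the call the description names.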

 

Model Evaluation

The purpose of this visualization is to aid in understanding how the model progresses during training. We can observe whether the model tends to overfit or underfit, as well as examine the trends of accuracy and loss for each epoch. This visualization can also assist in parameter selection and decision-making related to the model.

The code above is used to create plots showing the changes in model accuracy and loss during training.

  1. Firstly, we use plt.plot() to create line plots for training accuracy (History.history['accuracy']) and validation accuracy (History.history['val_accuracy']). 
  2. Then, plt.title() is used to provide a title for the plot as "Model Accuracy", plt.ylabel() labels the y-axis as "Accuracy", and plt.xlabel() labels the x-axis as "Epoch". 
  3. Next, plt.legend() is used to display a legend ("train" and "test") in the upper left corner of the plot. Here the "test" label refers to the validation curve.
  4. Finally, plt.show() is used to display the accuracy plot.
  5. Similarly, we use plt.plot() again to create line plots for training loss (History.history['loss']) and validation loss (History.history['val_loss']). plt.title(), plt.ylabel(), plt.xlabel(), and plt.legend() are used in the same way as before. 
  6. Finally, plt.show() is used to display the loss plot.

By using this code, we can visualize the changes in model accuracy and loss during training with the presented plots. These plots help us analyze and understand the model's performance visually.
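
A minimal sketch of the plotting steps follows. The helper name `plot_metric` and its `show` flag are assumptions added for reuse and testing. The titles, axis labels, and legend entries follow the description. The `history` argument is expected to be the dictionary stored in `History.history` after `model.fit`.

```python
import matplotlib.pyplot as plt


def plot_metric(history, metric, title, ylabel, show=True):
    """Plot a training metric and its validation counterpart per epoch."""
    plt.figure()
    plt.plot(history[metric])           # training curve
    plt.plot(history["val_" + metric])  # validation curve
    plt.title(title)
    plt.ylabel(ylabel)
    plt.xlabel("Epoch")
    # "test" labels the validation curve, matching the legend in the text.
    plt.legend(["train", "test"], loc="upper left")
    ax = plt.gca()
    if show:
        plt.show()
    return ax


# Usage, assuming History is the object returned by model.fit():
# plot_metric(History.history, "accuracy", "Model Accuracy", "Accuracy")
# plot_metric(History.history, "loss", "Model Loss", "Loss")
```

A widening gap between the two curves, with training accuracy rising while validation accuracy stalls, is the usual visual sign of overfitting.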

Related Course Information
  Category: Artificial Intelligence
  Course: Teknologi Kecerdasan Artifisial