Muhammad Abdurrahman
The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems". It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphological variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. The data set became a typical test case for many statistical classification techniques in machine learning, such as support vector machines.
This walkthrough uses a decision tree, one of the standard machine learning models, to classify the Iris flower data obtained from kaggle.com. Each record in the data belongs to one of three species: Iris setosa, Iris versicolor, or Iris virginica. The species is determined from four attributes: sepal length, sepal width, petal length, and petal width. We will process the Iris data so that every sample is assigned to Iris setosa, Iris versicolor, or Iris virginica, applying the decision tree classification model throughout.
1. First, connect to Google Drive
2. Retrieve data
The following shows the data. There are 5 columns: the sepal length, sepal width, petal length, and petal width attributes, together with the species.
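The loading step can be sketched as below. This is a minimal illustration, not the original notebook code: it assumes scikit-learn's bundled copy of the Iris data (`load_iris`) in place of the Kaggle CSV file on Google Drive, whose path is not given here, and the column name `species` is chosen for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for the Kaggle CSV.
iris = load_iris(as_frame=True)

# iris.frame holds the four measurement columns plus an integer 'target' column.
df = iris.frame.copy()

# Replace the integer target with a readable species name
# ('species' is a hypothetical column name).
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))
df = df.drop(columns="target")

print(df.head())   # first five rows
print(df.shape)    # (rows, columns)
```

If the data is read from a CSV file instead, `pd.read_csv` with the file's own path would take the place of `load_iris`, and the column names would follow that file.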
3. View the data as a scatter plot with plt
The plot shows Iris setosa in green, Iris versicolor in orange, and Iris virginica in blue.
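A scatter plot with this colour scheme might look like the sketch below. The choice of axes (sepal length against sepal width) is an assumption, since the text does not say which attributes were plotted, and the data again comes from scikit-learn's bundled copy rather than the Kaggle file.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Colours follow the description in the text.
colors = {"setosa": "green", "versicolor": "orange", "virginica": "blue"}

fig, ax = plt.subplots()
for label, name in enumerate(iris.target_names):
    mask = y == label  # rows belonging to this species
    # Assumption: plot sepal length (column 0) against sepal width (column 1).
    ax.scatter(X[mask, 0], X[mask, 1], color=colors[str(name)], label=str(name))

ax.set_xlabel(iris.feature_names[0])
ax.set_ylabel(iris.feature_names[1])
ax.legend()
plt.show()
```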
4. For classification, the data is split into two parts: the attributes (sepal length, sepal width, petal length, petal width) are stored in variable X, and the target species is stored in variable Y
5. Train and test a decision tree classifier on the two parts of the data.
6. The following shows the decision tree workflow, from the imports and the prediction results through to the accuracy score.
Evaluating the decision tree model on the Iris data gives an accuracy of 90%.
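Steps 4 to 6 can be sketched as follows. This is an illustration under stated assumptions, not the original notebook: the 80/20 split, the `random_state` values, and the use of scikit-learn's bundled data are all choices made here, so the accuracy it prints may differ from the 90% reported above.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X = iris.data    # sepal length, sepal width, petal length, petal width
Y = iris.target  # species encoded as 0, 1, 2

# Assumption: hold out 20% of the samples for testing, with a fixed seed.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

# Fit the decision tree on the training part only.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, Y_train)

# Predict the held-out samples and compare against the true species.
Y_pred = clf.predict(X_test)
acc = accuracy_score(Y_test, Y_pred)
print(f"Accuracy: {acc:.2%}")

# Text rendering of the learned tree (one line per split or leaf).
print(export_text(clf, feature_names=list(iris.feature_names)))
```

Because the accuracy is measured on a small test set (30 samples with this split), a different split or seed can change the result by several percentage points.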
References