Case Study: Prediction on Iris Dataset Using KNN Algorithm by IRJET Journal

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 10 Issue: 04 | Apr 2023

p-ISSN: 2395-0072

www.irjet.net

Case Study: Prediction on Iris Dataset Using KNN Algorithm

Shreyas Tayade1, Rakhi Gupta2, Deval Kherde3, Chaitanya Ubale4

1Student, Sipna College of Engineering and Technology, Maharashtra, India
2Assistant Professor, Sipna College of Engineering and Technology, Maharashtra, India
3Student, Sipna College of Engineering and Technology, Maharashtra, India
4Student, Sipna College of Engineering and Technology, Maharashtra, India

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - This case study applies the K-Nearest Neighbors (KNN) method to the well-known Iris dataset. The dataset contains 150 iris flower observations, 50 for each of three species: Setosa, Versicolor, and Virginica. The aim is to use four characteristics (sepal length, sepal width, petal length, and petal width) to classify iris flowers into their respective species. KNN is a popular and straightforward classification technique that predicts the class of an observation from the classes of its nearest neighbors. The dataset is first divided into training and testing sets, and the features are then normalised so that all of them are on the same scale. Next, a KNN model is trained with k=3, so that each prediction takes into account the observation's three nearest neighbors. Lastly, the accuracy score is used to assess how well the model performs on the test set.

Key Words: K-Nearest Neighbors, sepal length, sepal width, petal length, petal width

1. INTRODUCTION

The Iris dataset, which contains measurements of three iris flower species, is well known in the machine learning field. It is a classic example of a problem that can be solved with supervised learning and has been widely used as a benchmark for classification systems. K-Nearest Neighbors (KNN), a simple and popular classification technique, is well suited to it. In this case study, we use the Iris dataset and the KNN method to classify iris flowers according to four characteristics: sepal length, sepal width, petal length, and petal width. The main objective is to outline the fundamental steps for applying KNN to the Iris dataset, from loading the data through assessing the model's performance on unseen data. We load the dataset first, then divide it into training and testing sets, normalise the data, train the KNN model, and assess its performance.

For those who are new to machine learning, the Iris dataset is a good example of a classification problem that can be handled with KNN. The knowledge and methods gained from this case study can be applied to other classification problems in the future.

Fig-1: Dataset

2. ATTRIBUTE SELECTION

Selecting the most informative attributes is key to attaining good classification accuracy with KNN on the Iris dataset. The four attributes in this dataset are sepal length, sepal width, petal length, and petal width.
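As a hedged illustration of the workflow this case study outlines (load the data, split it, normalise the features, train KNN with k=3, and score the test set), a minimal sketch in Python might look like the following. It assumes scikit-learn, which the text does not name. The 70/30 stratified split, the fixed random seed, the use of StandardScaler as the normalisation step, and the function name run_knn_iris are likewise assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def run_knn_iris(k=3, test_size=0.3, seed=42):
    """Train KNN on Iris and return (model, scaler, test accuracy).

    test_size and seed are illustrative choices, not values from the paper.
    """
    # 150 rows, 4 features: sepal length/width, petal length/width.
    X, y = load_iris(return_X_y=True)

    # Hold out a stratified test set so each species keeps its proportion.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    # Fit the scaler on the training data only, then reuse it on the test
    # data, so no information from the test set leaks into training.
    scaler = StandardScaler().fit(X_train)

    # KNN stores the scaled training points and, at prediction time, takes
    # a majority vote among the k closest ones.
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(scaler.transform(X_train), y_train)

    predictions = model.predict(scaler.transform(X_test))
    return model, scaler, accuracy_score(y_test, predictions)


if __name__ == "__main__":
    _, _, acc = run_knn_iris()
    print(f"Test accuracy with k=3: {acc:.3f}")
```

The exact accuracy depends on the split, so a different seed or test size will give a somewhat different score. Scaling matters for KNN because the distance calculation would otherwise be dominated by whichever feature has the largest numeric range.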

Impact Factor value: 8.226

2 Description of Data

One way to select the best attributes is to use feature selection approaches that rank the attributes by their importance or relevance to the classification task. The ranking can be based on mutual information, on correlation, or on tree-based models.
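The three ranking approaches mentioned here can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: it assumes scikit-learn and NumPy, uses a random forest as one possible tree-based ranker, and codes the species as integers for the correlation score, which is only a rough proxy. The function name rank_iris_features and the seed are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif


def rank_iris_features(seed=0):
    """Return {method: [feature names, most relevant first]}."""
    data = load_iris()
    X, y, names = data.data, data.target, data.feature_names

    scores = {
        # Mutual information between each feature and the species label.
        "mutual_info": mutual_info_classif(X, y, random_state=seed),
        # Absolute Pearson correlation with the integer-coded label
        # (a crude proxy, since the species codes are not truly ordinal).
        "abs_correlation": np.array(
            [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
        ),
        # Impurity-based importances from a random forest (assumed choice
        # of tree-based method).
        "tree_importance": RandomForestClassifier(
            n_estimators=200, random_state=seed
        ).fit(X, y).feature_importances_,
    }

    # Sort feature indices by descending score for every method.
    return {
        method: [names[j] for j in np.argsort(-values)]
        for method, values in scores.items()
    }


if __name__ == "__main__":
    for method, ranking in rank_iris_features().items():
        print(f"{method}: {ranking}")
```

On the Iris data, rankings of this kind generally place the two petal measurements ahead of the two sepal measurements, although the order within each pair can vary by method.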

ISO 9001:2008 Certified Journal

Page 325