International Research Journal of Engineering and Technology (IRJET) | Volume: 04 Issue: 03 | March 2017 | www.irjet.net | e-ISSN: 2395-0056 | p-ISSN: 2395-0072
AUTOMATED ESSAY GRADING USING FEATURE SELECTION
Y. Harika1, I. Sri Latha2, V. Lohith Sai3, P. Sai Krishna4, M. Suneetha5
1,2,3,4 IV/IV B.Tech Students, 5 Professor, Dept. of Information Technology, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, A.P., India
Abstract - Automated essay grading is a research area that aims to maximize human-machine agreement in the automatic evaluation of textual summaries and essays. With growing numbers of candidates attempting exams such as the GRE, TOEFL, and IELTS, grading every paper becomes difficult, and it is hard for human graders to maintain a consistent mindset. Such interfaces are needed both for practicing and improving writing skills and as graders in competitive exams; a single person finds it very difficult to grade numerous essays every day within time bounds. This project aims to solve the problem by building a stable interface that can assist humans in grading essays. We extract features such as a bag of words; numerical features such as sentence and word counts and their average lengths; and measures of structure and organization, and use them to grade the essay with the maximum accuracy possible. To do this, we select the best possible feature set by comparing the accuracy of every candidate set. The system uses the sequential forward feature selection algorithm to find the best feature subset: it starts with the empty set and grows it into an efficient set. This algorithm was chosen because it performs well on small datasets and uses only simple operations, making it easy to implement.
Key Words: Bag of Words, Sequential Forward Feature Selection, Feature Set.
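A minimal sketch, in Python, of the sequential forward selection loop described in the abstract. The evaluation function, feature names, and scores below are illustrative placeholders invented for this sketch, not the paper's actual features or accuracy measure:

```python
def sequential_forward_selection(all_features, evaluate):
    """Greedy sequential forward selection: start from the empty set and,
    at each step, add the single feature that most improves the score
    returned by `evaluate`; stop when no remaining feature helps."""
    selected = []
    best_score = float("-inf")
    remaining = list(all_features)
    while remaining:
        # Score every candidate set formed by adding one more feature.
        score, feature = max((evaluate(selected + [f]), f) for f in remaining)
        if score <= best_score:
            break  # no single addition improves the current subset
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Toy demonstration: the "accuracy" values here are invented; the "noise"
# feature hurts the score, so the greedy loop never adds it.
useful = {"bag_of_words": 0.4, "sentence_count": 0.2, "avg_word_length": 0.1}

def toy_accuracy(subset):
    return sum(useful.get(f, -0.05) for f in subset)

features = ["bag_of_words", "sentence_count", "avg_word_length", "noise"]
selected, score = sequential_forward_selection(features, toy_accuracy)
# selected -> ['bag_of_words', 'sentence_count', 'avg_word_length']
```

In the real system, `evaluate` would presumably train the grader on the chosen feature subset and return its agreement with human scores; because the loop adds one feature per iteration, the cost stays manageable for the small feature sets the paper targets.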
1. INTRODUCTION
Automated essay grading has been a research area since the early 1960s. The task is to predict an essay score that resembles the grades given by human readers. It is difficult because we must capture both quantifiable and unquantifiable features, such as the writer's thoughts while putting words on paper. Extracting quantifiable features may seem easy, but analyzing the ambiguities of the natural language that humans use every day is a challenging task.
© 2017, IRJET | Impact Factor value: 5.181
The objective of the system is to classify a large set of textual entities into a small number of discrete categories corresponding to the possible scores, for example from 1 to 100. Using a training dataset, the model we build identifies patterns and tries to predict the score of the next unseen essay. This project explores a text-mining approach to essay scoring.
1.1. Motivation
Many testing programs around the world include at least one essay-writing task. Examples include the GMAT, the GRE, and the Pearson Test. Among the strengths of human graders are that they can (a) consistently score the essay every time, (b) connect it with their prior knowledge, and (c) make a judgment on the quality of the text. In 2012 alone, more than 655,000 test takers worldwide took the GRE revised General Test (ETS, 2013) [4], each responding to two essay prompts, producing a total of more than 1.3 million responses [1]. Involving humans in evaluating such assessments is therefore a laborious task, besides being expensive and time-consuming; moreover, with human graders the essay score may be biased. As students, we would be very grateful for such a system, and that is what motivated us to work on this.
1.2. Research Goal
The goal is to make the system understand one of the human languages and its complexities. We have also tried several algorithms and checked how accurately they work. Further, this has enabled us to learn about automated systems and to experiment with machine learning algorithms to generate a stable interface that serves our purpose.
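As background for the feature extraction mentioned in the abstract, here is a rough sketch of how the simple numerical features (word and sentence counts and their average lengths) could be computed per essay. The function name, regular expressions, and splitting rules are our own simplifications, not the paper's implementation:

```python
import re


def extract_numeric_features(essay: str) -> dict:
    """Compute simple surface features of an essay: word count, sentence
    count, average word length, and average sentence length (in words)."""
    words = re.findall(r"[A-Za-z']+", essay)
    # Naive sentence split on terminal punctuation; real essays would need
    # better handling of abbreviations, quotations, etc.
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    word_count = len(words)
    sentence_count = len(sentences)
    return {
        "word_count": word_count,
        "sentence_count": sentence_count,
        "avg_word_length": (sum(len(w) for w in words) / word_count
                            if word_count else 0.0),
        "avg_sentence_length": (word_count / sentence_count
                                if sentence_count else 0.0),
    }
```

Features like these would then sit alongside the bag-of-words representation as candidate inputs for the feature selection step.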
2. PROJECT FLOW
2.1. Data
We used a dataset provided by Kaggle.com as part of an online competition. Each essay has one or more human scores. Each essay set has a different grading rubric, from which we derived our own set of rubrics. Each essay is approximately 3-4 paragraphs long. Some additional essays were extracted and graded by human experts.
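Since each essay can carry more than one human score, a single training target has to be derived from them. Averaging the scores and rounding is one common convention; this is an assumption on our part, as the paper does not state its resolution rule:

```python
def resolve_target_score(human_scores):
    """Collapse one or more human scores into a single integer training
    target by averaging and rounding (an illustrative convention, not
    necessarily the rule used in the paper)."""
    if not human_scores:
        raise ValueError("essay has no human scores")
    return round(sum(human_scores) / len(human_scores))
```

Other choices are possible, such as keeping only the first rater's score or modeling raters separately; the right choice depends on the rubric of each essay set.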