Vikalp
Professor,
Professor,
Dept. of AI and Data
Vivekanand Education Society's Institute of Technology, Mumbai, Maharashtra, India
2018.parth.kalbag@ves.ac.in
2018.divya.kukreja@ves.ac.in
2018.ritu.kalyani@ves.ac.in, Dept. of Information Technology, Vivekanand Education Society's Institute of Technology, Mumbai, Maharashtra, India
Abstract: In any education system, examinations are conducted to assess the ability of students. Teachers usually write the questions by hand, which is a very time-consuming process. To reduce this time and effort, this paper proposes a system that automatically generates multiple-choice questions from user-supplied text. Three types of multiple-choice question are covered: Fill in the Blanks, True or False, and Match the Following.
Key Words: Automatic multiple-choice questions, Natural Language Processing, MCQs, Distractors, ConceptNet, WordNet, Sense2Vec.
In this modern world, where technology is always evolving and has extended its reach to various sectors of the education industry, many schools and colleges have adopted e-learning platforms through which they assess students in exams such as GATE, CAT, and college examinations. Generating multiple-choice questions for a whole syllabus has therefore become a cumbersome task. MCQ-style questions are a basic instructional tool that may be used for several purposes: besides serving as an assessment tool, they can also affect student learning. In certain examinations, the only way to ensure proper assessment of students is to generate good-quality MCQs that let them explore more than usual and that are well suited to current e-learning methods.
The paper focuses on generating various types of MCQs: Fill in the Blanks, True or False, and Match the Following. Multiple-choice questions are a form of evaluation in which respondents are asked to choose the most appropriate answer from a list of choices. An MCQ is formed from two entities: a question and a set of possible options that includes the correct answer. The question is formed by first determining which type of question the sentence supports. For example, a sentence stating the number of days in a week contains a numerical value, so the question generated can be a numerical one, “The number of days in a week is ____”, with the correct answer replaced by a blank. As another example, consider “Still, numbers for server use of Windows (that are comparable to competitors) show one third market share, similar to that for end user use.” For this sentence, the type of MCQ the system should generate is either a True or False or a Fill in the Blanks question.
The major challenge in generating any MCQ is the need for good distractors: a distractor should look like a plausible answer to the question, even to a student with good knowledge of the domain. At the same time, it should not be an alternative correct answer or a synonym of the answer. Besides, a well-written MCQ should contain sufficient information to answer the question. For example, for “The color of blood is ____.” the options Red, Maroon, Dark Red, and Crimson Red are a poor choice, because all of them have very similar meanings. A better set of options for the same question is Red, Blue, Green, and White: they share the same context but are clearly distinguishable from each other, which makes the question effective.
This section discusses research works that proposed ideas for automatic MCQ generation. Cloze questions, in which the question contains one or more blanks and multiple choices are listed to pick an answer from, are discussed in [1]. It was a goal-oriented system restricted to a specific domain, namely Cricket World Cup data. The Cloze system is divided into three modules: sentence selection, keyword selection, and distractor selection. Given an English article on the Cricket World Cup, the system generates Cloze questions.
Real-time multiple-choice question generation for language testing and English grammar is discussed in [2]. The application used NLP techniques and basic semi-supervised machine learning with algorithms such as the Naive Bayes classifier and the K-Nearest Neighbors algorithm. This real-time system generates only one type of question, Fill in the Blanks, on English grammar and vocabulary from online news articles: it takes an HTML file as input and turns it into a quiz session.
The next work examined generates MCQs using string similarity measures [3]. It focuses mainly on keyword selection, based on the semantic labels and named entities in the text, and on distractor generation, based on string similarity measures. Sentences and keywords are selected according to the semantic labels and named entities that exist in each sentence, and distractors are generated by measuring the similarity between sentences. That proposal used three character-based algorithms and five term-based algorithms to measure the similarity between two sentences.
Similarly, another system generates MCQs automatically from text extracted from the web [4]. The text is obtained mainly by web scraping and is summarized using a fireflies-based preference learning technique. The main parts are sentences and distractors: sentences are transformed into stems, and distractors are generated using similarity metrics based on hyponyms and hypernyms. The system is also used to generate analogy questions that test the verbal ability of students.
The system takes input text from the user and generates MCQs. The MCQs generated are a mix of Fill in the Blanks, True or False, and Match the Following questions.
The system first analyzes the text by summarizing it with the state-of-the-art BERT language model. Next, all the keywords in the summarized content are extracted using POS tagging. Sentence mapping is then performed by extracting the sentences for each keyword; these are the sentences from which MCQs are formed. Since the system is not limited to one type of MCQ, classification is an important step: every sentence extracted in the sentence-mapping phase is classified according to the type of MCQ that should be formed from it. Corresponding distractors are then created based on the type assigned to the sentence. Finally, various techniques are applied so that the user gets syntactically correct MCQs. The steps carried out by the system are shown in Fig. 1.
For text summarization, the system uses a predefined BERT (Bidirectional Encoder Representations from Transformers) extractive summarizer model. It is a state-of-the-art language model to which a minimum length and a maximum length are provided in order to generate the summarized text.
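The interface of an extractive summarizer can be illustrated with a minimal sketch. It does not use BERT: a simple word-frequency score stands in for the model, and the function names, the default values, and the reading of the minimum and maximum length as character bounds on candidate sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, min_length=20, max_length=300, ratio=0.5):
    """Keep the highest-scoring sentences, preserving their original order.

    Assumption: a sentence is scored by the average frequency of its words
    in the whole text. The system described in the paper uses BERT instead.
    """
    sentences = split_sentences(text)
    # Length bounds (in characters) filter out fragments and run-ons.
    candidates = [s for s in sentences if min_length <= len(s) <= max_length]
    if not candidates:
        return ""
    freq = Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, math.ceil(len(candidates) * ratio))
    top = set(sorted(candidates, key=score, reverse=True)[:keep])
    return " ".join(s for s in candidates if s in top)
```

Because the summary is extractive, every sentence in it appears verbatim in the input, which is what allows the later sentence-mapping step to reuse these sentences as question stems.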
After the relevant information has been obtained as summarized text, the next step is to find all the important keywords that can serve as answers to MCQs. POS tagging is used to extract all the relevant nouns, proper nouns, adjectives, and verbs as keywords. The keywords are then sorted in descending order of preference, best first.
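The keyword filter can be sketched as follows. The sketch assumes the text has already been tagged with Penn Treebank tags by some POS tagger, and it ranks keywords by frequency; the ranking criterion is an assumption, since the paper only states that the keywords are sorted by preference. The name `select_keywords` is hypothetical.

```python
from collections import Counter

# Penn Treebank tag prefixes: NN* (nouns, proper nouns), JJ* (adjectives),
# VB* (verbs).
KEYWORD_TAG_PREFIXES = ("NN", "JJ", "VB")


def select_keywords(tagged_tokens, top_n=10):
    """Return candidate answer keywords, most frequent first.

    tagged_tokens: iterable of (word, tag) pairs from any POS tagger.
    Assumption: frequency is the ranking criterion.
    """
    counts = Counter(
        word.lower()
        for word, tag in tagged_tokens
        if tag.startswith(KEYWORD_TAG_PREFIXES) and word.isalnum()
    )
    return [word for word, _ in counts.most_common(top_n)]
```

In practice a stop-word filter would usually be applied as well, so that auxiliary verbs such as "is" are not offered as answers.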
Keyword-sentence mapping is done by tokenizing the text into sentences. For each keyword, all the sentences that contain it are extracted. These are the sentences from which the corresponding multiple-choice questions will be generated.
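A minimal sketch of this mapping is given below. It assumes a simple punctuation-based sentence splitter and whole-word, case-insensitive matching; the function name is hypothetical and the actual tokenizer used by the system is not specified in the paper.

```python
import re


def map_keywords_to_sentences(text, keywords):
    """Map each keyword to the sentences that contain it as a whole word."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    mapping = {}
    for keyword in keywords:
        pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        matches = [s for s in sentences if pattern.search(s)]
        if matches:  # drop keywords that no sentence mentions
            mapping[keyword] = matches
    return mapping
```

Each value in the returned dictionary is a list of candidate question sentences for that keyword.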
International Research Journal of Engineering and Technology (IRJET)
Volume: 09 Issue: 04 | Apr 2022 www.irjet.net
If a sentence contains any numeric value, it is classified into the numeric MCQ category, since this type of MCQ is more accurate for such sentences. Otherwise, a random number between 0 and 1 is generated: if the number lies between 0 and 0.75, a Fill in the Blanks MCQ is generated, and if it lies between 0.76 and 1, a True or False MCQ is generated.
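The classification rule can be sketched in a few lines. The category names and the function name are hypothetical, and a draw that falls between 0.75 and 0.76, which the description leaves unassigned, is treated here as True or False.

```python
import random
import re


def classify_sentence(sentence, rng=random):
    """Return 'numeric', 'fill_in_the_blank' or 'true_false' for a sentence."""
    if re.search(r"\d", sentence):
        return "numeric"
    # A draw of at most 0.75 selects Fill in the Blanks; anything above it
    # selects True or False (roughly a 3:1 split).
    return "fill_in_the_blank" if rng.random() <= 0.75 else "true_false"
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is convenient when testing.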
A good distractor is one that is similar in meaning to the answer but is not the answer itself. The quality and difficulty of an MCQ depend on how close the distractors are to the key: the closer the distractors are to the key, the more difficult the question is. This step is independent for each type of MCQ: based on the category of MCQ to be formed from the sentence, the corresponding type of distractors is generated. Before distractors are generated, the “word sense” of the answer is first determined, and the distractors are generated based on that sense.
This section discusses the implementation of the proposed system. First, we show the outcome of the coep package developed to standardise the question generation process. Next, we show the integration of the proposed system with a website through APIs, along with the performance analysis and navigation.
© 2022, IRJET
ISSN: 2395 0056
ISSN: 2395 0072
For the Fill in the Blanks type of MCQ, the distractors are generated in two ways:
1) WordNet: WordNet works well for all dictionary keywords. It is used to determine the similarity between words. The WordNet approach measures the distance between words and synsets in WordNet's graph structure, for example by counting the number of edges between synsets. The closer the words are, the more similar the synsets. It works in two stages: i) generating the synsets; ii) generating distractors for all the synsets.
2) ConceptNet: ConceptNet works well with nouns and proper nouns, so for keywords of these types ConceptNet is used instead of WordNet. It is a knowledge graph that connects words and phrases of natural language with labelled edges. Its knowledge is collected from many sources, including expert-created resources, crowdsourcing, and games with a purpose.
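The edge-counting idea can be illustrated on a toy graph. The taxonomy below is a hypothetical stand-in for WordNet, not real WordNet data, and taking the siblings of the answer (words that share its direct parent) as distractors is one common strategy assumed here for illustration.

```python
from collections import deque

# Toy "is-a" taxonomy (child -> parent); a hypothetical stand-in for WordNet.
HYPERNYM = {
    "red": "color", "blue": "color", "green": "color",
    "crimson": "red", "maroon": "red",
    "color": "property", "size": "property",
}


def neighbours(node):
    """Nodes one edge away: the parent plus all direct children."""
    linked = {child for child, parent in HYPERNYM.items() if parent == node}
    if node in HYPERNYM:
        linked.add(HYPERNYM[node])
    return linked


def edge_distance(a, b):
    """Shortest number of edges between two nodes, or None if unconnected."""
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours(node) - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


def sibling_distractors(answer):
    """Co-hyponyms of the answer: words sharing its direct parent."""
    parent = HYPERNYM.get(answer)
    return sorted(w for w, p in HYPERNYM.items() if p == parent and w != answer)
```

In this toy graph the siblings of "red" are "blue" and "green", each two edges away, whereas "crimson" and "maroon" are children of "red" and would be too close in meaning to serve as distractors.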
Impact Factor value: 7.529
To generate appropriate distractors for a numeric answer, an integer between -5 and +5 is added to the answer. A random number is selected from this range and added to the answer, and three distractors are generated in this way. The options are then shuffled using the Fisher-Yates shuffle algorithm so that the answer does not always appear as a particular choice but at a random position.
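A minimal sketch of this step follows. The function names are hypothetical, and the sketch additionally assumes that the offset is never zero and that the three distractors are distinct, which the description implies but does not state.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a shuffled copy using the Fisher-Yates algorithm."""
    shuffled = list(items)
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # pick from positions 0..i, inclusive
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


def numeric_options(answer, n_distractors=3, max_offset=5, rng=random):
    """Return shuffled options: the answer plus distinct nearby integers."""
    offsets = [d for d in range(-max_offset, max_offset + 1) if d != 0]
    chosen = rng.sample(offsets, n_distractors)  # distinct, never zero
    return fisher_yates_shuffle([answer] + [answer + d for d in chosen], rng)
```

For an answer of 2022, a call such as `numeric_options(2022)` returns four distinct integers within 5 of the answer, with the correct value at a random position.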
ISO 9001:2008 Certified Journal
Sense2Vec is applied to generate the closest-sense options for the filtered keywords; these form column 2 of the Match the Following pattern, so that each filtered keyword is paired with its most similar word. For example, the closest-sense word for “forest” is “woods”, and for “paws” it is “back feet”. The correct option is framed first, followed by the other distractors. To begin with, the choices a, b, c, and d are generated, and a loop finds the correct value for each filtered keyword and assigns it a random choice, ensuring that the other keywords are assigned to other choices; this process is repeated for the remaining filtered keywords. For example, if “woods” gets choice d at random, then “back feet” is assigned to another choice, but not d, since it is already taken. After the correct option has been generated, the other distractors are generated. Finally, the Fisher-Yates shuffle algorithm is used to shuffle the options so that the correct answer appears at a random position rather than a fixed one.
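The assignment of letters and the construction of the options can be sketched as follows. The sketch assumes the (keyword, related word) pairs have already been produced, for instance by Sense2Vec, and that distractors are simply other orderings of the same letters; the function name and the option string format are hypothetical.

```python
import random
from itertools import permutations


def match_the_following(pairs, n_distractors=3, rng=random):
    """Build a Match the Following MCQ from (keyword, related word) pairs.

    Returns (key, answer, options): key maps each keyword to the letter of
    its correct match, answer is the correct option string, and options is
    the shuffled list holding the answer and the distractors.
    """
    letters = [chr(ord("a") + i) for i in range(len(pairs))]
    # Give every keyword's match a random letter that no other match uses.
    assigned = rng.sample(letters, len(letters))
    key = {kw: letter for (kw, _), letter in zip(pairs, assigned)}

    def render(order):
        return " ".join(f"{i}-{letter}" for i, letter in enumerate(order, 1))

    answer = render(assigned)
    # Distractors are other orderings of the same letters.
    wrong = [render(p) for p in permutations(letters) if render(p) != answer]
    options = rng.sample(wrong, min(n_distractors, len(wrong))) + [answer]
    rng.shuffle(options)  # CPython implements this as a Fisher-Yates shuffle
    return key, answer, options
```

With the pairs `[("forest", "woods"), ("paws", "back feet")]`, only two orderings exist, so the function returns the correct option and a single distractor; with four pairs it returns four options.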
The algorithms are implemented in Python, using Flask as the web framework, and tested on a number of text files taken from a variety of websites, including blogs, Wikipedia articles, and topic-specific articles. The experimental results show how effective the system is at extracting MCQs from text. Following are some of the resultant MCQs obtained for an input text file.
Output:
[
  {
    "question": "The most recent version for server computers is Windows Server _____ version 21H2",
    "answer": "2022",
    "id": 1,
    "choice3": "2022",
    "choice4": "2021",
    "choice1": "2024",
    "choice2": "2023"
  },
  {
    "answer": false,
    "question": "Microsoft Windows, commonly referred to as Windows, is a group of multiple proprietary graphics operating system families, all of which are not developed and sold by Microsoft",
    "id": 2
  },
  {
    "question": "A specialized version of Windows also runs on the Xbox One and Xbox Series XS video ______ consoles",
    "answer": "Game",
    "id": 3,
    "choice3": "Positivity",
    "choice4": "Critical Mass",
    "choice1": "Game",
    "choice2": "Acting"
  },
  {
    "question": "1) server - Just The Number 2) number - Whole Game 3) game - Desktop Computer 4) computer - Same Server",
    "answer": "1 - d 2 - a 3 - b 4 - c",
    "id": 4,
    "choice3": "1 - b 2 - c 3 - d 4 - a",
    "choice4": "1 - d 2 - a 3 - b 4 - c",
    "choice1": "1 - a 2 - d 3 - b 4 - c",
    "choice2": "1 - d 2 - a 3 - c 4 - b"
  },
  {
    "answer": true,
    "question": "Still numbers for server use of Windows that are comparable to competitors show one third market share similar to that for end user use",
    "id": 5
  }
]
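Records of this shape can be sanity-checked before they are served to a quiz front end. The check below is a sketch based only on the fields visible in the output above; the function name and the rule that multiple-choice items carry exactly four distinct options are assumptions.

```python
import json


def check_record(record):
    """Return True if a generated MCQ record is internally consistent."""
    choices = [v for k, v in record.items() if k.startswith("choice")]
    if isinstance(record.get("answer"), bool):
        return not choices  # True or False items carry no options
    # Otherwise the answer must be exactly one of four distinct options.
    return len(set(choices)) == 4 and choices.count(record.get("answer")) == 1


sample = json.loads("""
[
  {"question": "The most recent version for server computers is Windows Server _____ version 21H2",
   "answer": "2022", "id": 1,
   "choice3": "2022", "choice4": "2021", "choice1": "2024", "choice2": "2023"},
  {"answer": true, "id": 5,
   "question": "Still numbers for server use of Windows that are comparable to competitors show one third market share similar to that for end user use"}
]
""")
valid = [check_record(r) for r in sample]
```

A check like this catches, for example, a numeric item whose distractor happens to equal the answer.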
The system is evaluated manually: evaluators check it on various text files to verify the syntactic and semantic correctness of the questions as well as the quality of the distractors. It is observed that the number of MCQs formed from a text varies and is directly proportional to the amount of summarized text. The accuracy of the MCQ generator is found to be good (the evaluators rated nearly all questions as high quality), since distractors are chosen by applying various techniques. Hence, the framework effectively generates the different types of MCQs automatically.
The system was tested on text inputs from different domains and works well: it generates good-quality MCQs that are syntactically and semantically correct, along with good-quality distractors. The problem of manually creating MCQs is thereby addressed, and the system is helpful for teachers who need to generate MCQ-type questions.
Following is the future scope of the system:
1) Building the same system for other languages such as Hindi, Urdu, and South Indian languages.
2) Adding other types of multiple-choice questions, such as questions whose answers contain images and questions with multiple correct answers.
[1] Annamaneni Narendra, Manish Agarwal, and Rakshit Shah (2013). “Automatic Cloze-Questions Generation”. In Proceedings of Recent Advances in Natural Language Processing, pp. 511-515.
[2] Ayako Hoshino and Hiroshi Nakagawa (2005). “A real-time multiple-choice question generation for language testing: a preliminary study”. In Proceedings of the 2nd Workshop on Building Educational Applications Using NLP, pp. 17-20.
[3] Ibrahim Eldesoky Fattoh (2014). “Automatic Multiple Choice Question Generation System for Semantic Attributes Using String Similarity Measures”. In Computer Engineering and Intelligent Systems, www.iiste.org, ISSN 2222-1719 (Paper), ISSN 2222-2863 (Online), Vol. 5, No. 8, pp. 66-73.
[4] Santhanavijayan, A., Balasundaram, S.R., Hari Narayanan, S., Vinod Kumar, S., and Vignesh Prasad, V. (2017). “Automatic generation of multiple choice questions for e-assessment”. In Int. J. Signal and Imaging Systems Engineering, Vol. 10, Nos. 1/2, pp. 54-62.
[5] Ming Liu, Rafael Calvo, A., and Vasile Rus (2012). “G-Asks: An Intelligent Automatic Question Generation System for Academic Writing Support”. In Dialogue and Discourse 3(2), pp. 101-124.
[6] Shivank Pandey and Rajeswari, K.C. (2013). “Automatic Question Generation Using Software Agents for Technical Institutions”. In International Journal of Advanced Computer Research (ISSN (print): 2249-7277, ISSN (online): 2277-7970), Volume-3 Number-4 Issue-13, pp. 307-311.
[7] Manish Agarwal, Rakshit Shah, and Prashanth Mannem (2011). “Automatic Question Generation using Discourse Cues”. In Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 1-9.
[8] Arjun Singh Bhatia, Manas Kirti, and Sujan Kumar Saha (2013). “Automatic Generation of Multiple Choice Questions Using Wikipedia”. In P. Maji et al. (Eds.): PReMI 2013, LNCS 8251, pp. 733-738.