
Vikalp-Automatic multiple choice questions generator


1 Assistant Professor, Dept. of AI and Data, Vivekanand Education Society’s Institute of Technology, Mumbai, Maharashtra, India

2-4 Student, Dept. of Information Technology, Vivekanand Education Society’s Institute of Technology, Mumbai, Maharashtra, India

2018.parth.kalbag@ves.ac.in, 2018.divya.kukreja@ves.ac.in, 2018.ritu.kalyani@ves.ac.in

Abstract - In any education system, examinations are conducted to judge the caliber of students. To conduct an examination, teachers must write the questions manually, which is a very time-consuming process. To reduce this time and effort, this paper proposes a system that automatically generates multiple choice questions from user-supplied text. The question types covered are Fill in the Blanks, True or False, and Match the Following.

Key Words: Automatic Multiple Choice Questions, Natural Language Processing, MCQs, Distractors, ConceptNet, WordNet, Sense2Vec.

1. INTRODUCTION

In the modern world, technology is constantly evolving and has extended its reach into many sectors of education. Schools and colleges have adopted e-learning platforms through which they assess students in exams such as GATE, CAT, and college examinations, and generating multiple-choice questions for an entire syllabus has become a cumbersome task. MCQ-style questions are a basic instructional tool that can be used for several purposes: besides serving as an assessment tool, they can also affect student learning. In certain examinations, proper assessment of students can only be ensured by generating good-quality MCQs that let students explore more than usual and that are well suited to current e-learning practices.

This paper focuses on generating several types of MCQs: Fill in the Blanks, True or False, and Match the Following. Multiple choice questions are a form of evaluation in which respondents are asked to choose the most appropriate answer from a list of choices. An MCQ consists of two parts: a question and a set of possible options that includes the correct answer. The question is formed by first determining which type of question the sentence supports. For example, the sentence “The number of days in a week are” contains a numerical value, so the generated question can be a numerical one, with the correct answer replaced by a blank (“_____”). As another example, consider “Still, numbers for server use of Windows (that are comparable to competitors) show one third market share, similar to that for end user use.” For this sentence, the system should generate either a True or False or a Fill in the Blanks MCQ.

The major challenge in generating any MCQ is the need for good distractors: a distractor should appear to be a plausible answer to the question, even to a student with good knowledge of the domain. At the same time, it must not be an alternative correct answer or a synonym of the answer. In addition, a well-written MCQ should contain sufficient information to answer the question. For example, for “The color of the blood is _____.” the options Red, Maroon, Dark Red, and Crimson Red are poor, because they all have very similar meanings. A better set of options for the same question is Red, Blue, Green, and White: they share a context (all are colors) but are clearly distinguishable from one another.

2. LITERATURE SURVEY

This section discusses prior research on automatic MCQ generation. Cloze questions, in which a sentence contains one or more blanks and a list of choices from which to pick the answer, are discussed in [1]. That system is domain-specific: it was built on Cricket World Cup data. It is divided into three modules: sentence selection, keyword selection, and distractor selection. Given an English article on the Cricket World Cup, the system generates cloze questions.

Real-time multiple-choice question generation for language testing and English grammar is discussed in [2]. For that application, NLP techniques and basic semi-supervised machine learning algorithms such as the Naive Bayes classifier and K-Nearest Neighbors were used. The real-time system generates only one type of question, Fill in the Blanks, on English grammar and vocabulary from online news articles: it takes an HTML file as input and turns it into a quiz session.

International Research Journal of Engineering and Technology (IRJET) e ISSN: 2395 0056 Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p ISSN: 2395 0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page3207
Amit Singh1, Parth Kalbag2, Divya Kukreja3, Ritu Kalyani4
1 Assistant Professor (amit.singh@ves.ac.in); 2-4 Student

MCQ generation using string similarity measures is proposed in [3]. That work focuses on keyword selection and distractor generation: sentences and keywords are selected based on the semantic labels and named entities present in the sentence, and distractors are generated using similarity measures between sentences. Three character-based algorithms and five term-based algorithms were used to measure the similarity between two sentences.

Similarly, [4] proposes a system that generates MCQs automatically from text extracted from the web. The text is obtained by web scraping and summarized using the fireflies preference learning technique. The main components are sentence selection and distractor generation: the selected sentences are transformed into stems, and distractors are generated using similarity measures based on hyponyms and hypernyms. The system is also used to generate analogy questions that test the verbal ability of students.

3. PROPOSED METHODOLOGY

The system takes input text from the user and generates MCQs. The generated MCQs are a mix of Fill in the Blanks, True or False, and Match the Following.

The system first summarizes the text using BERT, a state-of-the-art language model. All keywords are then extracted from the summarized content using POS tagging. Next, sentence mapping is performed by extracting the sentences that contain each keyword; these are the sentences from which MCQs will be formed. Because the system is not restricted to a single type of MCQ, classification is an important step: each sentence extracted in the sentence mapping phase is classified according to the type of MCQ that should be formed from it. Corresponding distractors are then generated based on the assigned type. Finally, several techniques are applied so that the user receives syntactically correct MCQs. The steps carried out by the system are shown in Fig. 1.

Fig -1: Proposed System

A. Summarization

For text summarization, the system uses a predefined BERT (Bidirectional Encoder Representations from Transformers) extractive summarizer model. BERT is a state-of-the-art language model; a minimum length and a maximum length are provided to the summarizer to generate the summarized text.

B. Keyword Extraction

Once the summarized text has been generated, the next step is to find all the important keywords that can serve as the answer to an MCQ. POS tagging is used to extract all relevant nouns, proper nouns, adjectives, and verbs as keywords. The keywords are then sorted in descending order of preference, so that the best candidates come first.

C. Keyword Sentence Mapping

Keyword sentence mapping is done by tokenizing the text into sentences. For each keyword, all sentences containing it are extracted and tokenized. These are the sentences from which the corresponding multiple choice questions will be generated.
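The mapping step can be illustrated with a minimal sketch. It assumes a naive regular-expression sentence splitter and single-word keywords; the function names `split_sentences` and `map_keywords_to_sentences` and the sample text are hypothetical and are not taken from the system described here, which may use a different tokenizer.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def map_keywords_to_sentences(keywords, text):
    """Return {keyword: [sentences that contain it as a whole word]}."""
    mapping = defaultdict(list)
    for sentence in split_sentences(text):
        # Lower-cased word tokens of this sentence.
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence.lower()))
        for keyword in keywords:
            if keyword.lower() in tokens:
                mapping[keyword].append(sentence)
    return dict(mapping)


# Illustrative text only; not the input file used in the paper.
text = (
    "Windows is developed by Microsoft. "
    "The server edition is called Windows Server. "
    "Linux is a common competitor."
)
mapping = map_keywords_to_sentences(["Windows", "server", "Linux"], text)
```

Matching on whole-word tokens rather than substrings avoids mapping a keyword such as "server" to a sentence that only contains "observer"; multi-word keywords would need a different matching rule.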

D. Classification

If a sentence contains a numeric value, it is classified into the numeric MCQ category, since such MCQs tend to be more accurate. Otherwise, a random number between 0 and 1 is generated: if the number is between 0 and 0.75, a Fill in the Blanks MCQ is generated, and if it is between 0.76 and 1, a True or False MCQ is generated.
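The classification rule can be sketched as follows. This is an illustration under stated assumptions, not the system's actual code: a "numeric value" is taken to mean any digit in the sentence, and a single cut-off at 0.75 is used so that draws between 0.75 and 0.76 are also covered. The constant and function names are hypothetical.

```python
import random
import re

NUMERIC = "numeric_fill_in_the_blanks"
FILL = "fill_in_the_blanks"
TRUE_FALSE = "true_or_false"


def classify_sentence(sentence, rng=random):
    """Pick an MCQ type for one sentence, following the rule described above."""
    # Assumption: any digit counts as a numeric value (spelled-out numbers do not).
    if re.search(r"\d", sentence):
        return NUMERIC
    draw = rng.random()  # uniform float in [0.0, 1.0)
    # Assumption: one cut-off at 0.75, so no draw falls between the two ranges.
    return FILL if draw <= 0.75 else TRUE_FALSE


print(classify_sentence("Windows Server 2022 is the most recent version."))
```

With this rule, roughly three out of four non-numeric sentences become Fill in the Blanks questions and the rest become True or False questions. Passing a seeded `random.Random` instance as `rng` makes the choice reproducible.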

E. Generating Distractors

A good distractor is one that is similar in meaning to the answer but is not the answer itself. The quality and difficulty of an MCQ depend on how close the distractors are to the key: the closer the distractors are to the key, the more difficult the question is. This step is independent for each type of MCQ. Based on the category of MCQ that is to be formed from the sentence, the corresponding types of distractors are generated. Before distractors are generated, the ‘word sense’ of the answer is first determined, and the distractors are generated based on that sense.

4. ALGORITHMS

This section discusses the implementation of our proposed system. First, we show the outcome of the coep package developed to standardise the question generation process. Next, we show the integration of the proposed system with a website through APIs, along with the Performance Analysis and Navigation parts.

A. Fill in the Blanks


For Fill in the Blanks MCQs, the distractors are generated in two ways:

1) WordNet: WordNet works well for all dictionary keywords. It is used to determine the similarity between words. The WordNet-based algorithm measures the distance between words and synsets in WordNet’s graph structure, for example by counting the number of edges between synsets. The closer two synsets are in the graph, the more similar the words are. It works in two stages: i) generating the synsets; ii) generating distractors for all the synsets.

2) ConceptNet: ConceptNet works well with nouns and proper nouns. Hence, for noun and proper-noun keywords, ConceptNet is used instead of WordNet. It is a knowledge graph that connects words and phrases of natural language with labelled edges. Its knowledge is collected from many sources, including expert-created resources, crowdsourcing, and games with a purpose.
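The idea of picking distractors that lie close to the answer in a lexical graph can be sketched with a toy example. The table below is a hand-written stand-in for WordNet, not WordNet data, and the "words that share the answer's parent concept" rule is one common way to use such a graph; it is an assumption here, not necessarily the rule used by the system.

```python
# Toy hypernym table standing in for WordNet: word -> parent concept.
# These entries are illustrative only and are not taken from WordNet.
HYPERNYM = {
    "red": "color",
    "blue": "color",
    "green": "color",
    "white": "color",
    "dog": "animal",
    "cat": "animal",
}


def sibling_distractors(answer, hypernym=HYPERNYM, limit=3):
    """Return up to `limit` words that share the answer's parent concept."""
    parent = hypernym.get(answer)
    if parent is None:
        return []  # unknown word: this source yields no distractors
    siblings = [w for w, p in hypernym.items() if p == parent and w != answer]
    return sorted(siblings)[:limit]


print(sibling_distractors("red"))
```

Siblings are two edges away from the answer (up to the parent, down to the sibling), so they share a context with the answer without being synonyms of it, which matches the Red, Blue, Green, White example from the introduction.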


Fig -2: Fill in the Blanks

B. Numeric Fill in the Blanks

To generate appropriate distractors for a numeric answer, a random integer between -5 and +5 is selected and added to the answer; this is repeated to produce three distractors. The answer and the distractors are then shuffled using the Fisher-Yates shuffle algorithm, so that the correct answer does not always appear at a fixed position but at a random one.
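A minimal sketch of this step is shown below. It assumes that the offset 0 is excluded and that the three offsets are distinct, so that no distractor equals the answer or another distractor; the text does not state how such collisions are handled. The function names are hypothetical.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a shuffled copy of `items` using the Fisher-Yates algorithm."""
    shuffled = list(items)
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


def numeric_choices(answer, count=3, spread=5, rng=random):
    """Build shuffled options: the answer plus `count` distinct nearby integers."""
    # Assumption: offset 0 is excluded so a distractor never equals the answer.
    offsets = [d for d in range(-spread, spread + 1) if d != 0]
    distractors = [answer + d for d in rng.sample(offsets, count)]
    return fisher_yates_shuffle([answer] + distractors, rng)


choices = numeric_choices(2022, rng=random.Random(0))
print(choices)
```

Walking the list from the last index down and swapping each element with a randomly chosen element at or before it gives every ordering the same probability, which is why the correct answer is equally likely to land in any of the four positions.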

Fig -3: Numeric Fill in the Blanks

C. True or False questions

True or False questions are generated using this algorithm (Fig. 4). The only options for such a question are True and False, so no algorithm for generating distractors is required.

Fig -4: True or False questions

D. Match the Following

‘Sense2Vec’ is applied to generate the closest-sense word for each filtered keyword; these words form column 2 of the Match the Following pattern. For example, the closest-sense word for “forest” is “woods”, and for “paws” it is “back feet”. The correct option is framed first, followed by the distractors. The available choice labels are a, b, c, and d. A loop goes over the filtered keywords, and the correct value for each keyword is assigned a random label that has not already been used. For example, if “woods” is randomly assigned choice d, then “back feet” is assigned one of the other labels, but not d, as it is already taken. After the correct option has been generated, the other distractors are generated. Finally, the Fisher-Yates shuffle algorithm is used to shuffle the options, so that the correct answer does not appear at a fixed position but at a random one.
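The label-assignment loop can be sketched as below. The keyword and matching-word pairs are assumed to be already available (in the system they come from Sense2Vec, which is not called here), and the function name `build_match_question` is hypothetical. The sketch covers only the assignment of unused labels, not the generation of the wrong answer keys offered as distractors.

```python
import random

LABELS = ["a", "b", "c", "d"]


def build_match_question(pairs, rng=random):
    """pairs: list of (keyword, matching word), at most len(LABELS) items.

    Returns (left column, right column keyed by label, answer key).
    """
    free_labels = list(LABELS[: len(pairs)])
    right_column = {}
    answer_key = {}
    for number, (keyword, match) in enumerate(pairs, start=1):
        # Draw a label that no earlier keyword has taken.
        label = free_labels.pop(rng.randrange(len(free_labels)))
        right_column[label] = match
        answer_key[number] = label
    left_column = {n: kw for n, (kw, _) in enumerate(pairs, start=1)}
    return left_column, dict(sorted(right_column.items())), answer_key


pairs = [("forest", "woods"), ("paws", "back feet")]
left, right, key = build_match_question(pairs, rng=random.Random(0))
print(left, right, key)
```

Removing each label from `free_labels` as soon as it is drawn guarantees that two keywords never share a label, which is the "not d, as it is already taken" condition in the example above.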


5. RESULTS

The algorithms are implemented in Python, using Flask as the framework, and were tested on a number of text files taken from a variety of websites, including blogs, Wikipedia articles, and topic-specific articles. The experimental results show how effective the system is at extracting MCQs from text. The following are some of the MCQs obtained for the input text file shown in Fig. 6.

Fig -6: Input file

Fig -5: Match the Following

Output:

[
  {
    "question": "The most recent version for server computers is Windows Server _____ version 21H2",
    "answer": "2022",
    "id": 1,
    "choice3": "2022",
    "choice4": "2021",
    "choice1": "2024",
    "choice2": "2023"
  },
  {
    "answer": false,
    "question": "Microsoft Windows, commonly referred to as Windows, is a group of multiple proprietary graphics operating system families, all of which are not developed and sold by Microsoft",
    "id": 2
  },
  {
    "question": "A specialized version of Windows also runs on the Xbox One and Xbox Series XS video ______ consoles",
    "answer": "Game",
    "id": 3,
    "choice3": "Positivity",
    "choice4": "Critical Mass",
    "choice1": "Game",
    "choice2": "Acting"
  },
  {
    "question": "1) server → Just The Number 2) number → Whole Game 3) game → Desktop Computer 4) computer → Same Server",
    "answer": "1 → d 2 → a 3 → b 4 → c",
    "id": 4,
    "choice3": "1 → b 2 → c 3 → d 4 → a",
    "choice4": "1 → d 2 → a 3 → b 4 → c",
    "choice1": "1 → a 2 → d 3 → b 4 → c",
    "choice2": "1 → d 2 → a 3 → c 4 → b"
  },
  {
    "answer": true,
    "question": "Still numbers for server use of Windows that are comparable to competitors show one third market share similar to that for end user use",
    "id": 5
  }
]

The system was evaluated manually: evaluators ran it on various text files and checked the syntactic and semantic correctness of the questions as well as the quality of the distractors. The number of MCQs formed from a given text was observed to vary; it is directly proportional to the size of the summarized text. The accuracy of the MCQ generator was found to be good (nearly all questions were of good quality), since the distractors are chosen by applying several techniques. Hence, the framework effectively generates the different types of MCQs automatically.
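The Fill in the Blanks records in the Output listing of this section share a small set of keys: "question", "answer", "id", and "choice1" to "choice4". As an illustration only, the sketch below shows one way such a record could be assembled from a sentence, an answer, and a list of options. The helper name `fill_blank_record` is hypothetical, and the full input sentence is inferred from the first output record rather than taken from the input file.

```python
import json
import re


def fill_blank_record(qid, sentence, answer, options):
    """Blank out the first whole-word occurrence of `answer` and pack the MCQ."""
    pattern = r"\b" + re.escape(answer) + r"\b"
    question, hits = re.subn(pattern, "_____", sentence, count=1)
    if hits == 0:
        raise ValueError("answer does not occur in sentence")
    if answer not in options:
        raise ValueError("options must include the answer")
    record = {"question": question, "answer": answer, "id": qid}
    for i, option in enumerate(options, start=1):
        record[f"choice{i}"] = option
    return record


record = fill_blank_record(
    1,
    "The most recent version for server computers is Windows Server 2022 version 21H2",
    "2022",
    ["2024", "2023", "2022", "2021"],
)
print(json.dumps(record, indent=2))
```

Checking that the answer occurs in the sentence and in the option list catches the two ways a generated record could be unanswerable before it reaches the user.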

6. CONCLUSION

The system was tested on text inputs from different domains and works well: it generates syntactically and semantically correct questions along with good-quality distractors. This addresses the problem of creating MCQs manually, and the system is helpful to teachers for generating MCQ-type questions.

The future scope of the system is as follows:

1) Building the same system for other languages such as Hindi, Urdu, and South Indian languages.

2) Adding further types of multiple-choice questions, such as questions whose answers contain images and questions with multiple correct answers.

REFERENCES

[1] Annamaneni Narendra, Manish Agarwal, and Rakshit Shah, 2013. “Automatic Cloze-Questions Generation”. In Proceedings of Recent Advances in Natural Language Processing, pp. 511-515.

[2] Ayako Hoshino, Hiroshi Nakagawa, 2005. “A real-time multiple-choice question generation for language testing: a preliminary study”. In Proceedings of the 2nd Workshop on Building Educational Applications Using NLP, pp. 17-20.

[3] Ibrahim Eldesoky Fattoh, 2014. “Automatic Multiple Choice Question Generation System for Semantic Attributes Using String Similarity Measures”. In Computer Engineering and Intelligent Systems, www.iiste.org, ISSN 2222-1719 (Paper), ISSN 2222-2863 (Online), Vol. 5, No. 8, pp. 66-73.

[4] Santhanavijayan, A., Balasundaram, S.R., Hari Narayanan, S., Vinod Kumar, S., and Vignesh Prasad, V., 2017. “Automatic generation of multiple choice questions for e-assessment”. In Int. J. Signal and Imaging Systems Engineering, Vol. 10, Nos. 1/2, pp. 54-62.

[5] Ming Liu, Rafael A. Calvo, and Vasile Rus, 2012. “G-Asks: An Intelligent Automatic Question Generation System for Academic Writing Support”. In Dialogue and Discourse 3(2), pp. 101-124.

[6] Shivank Pandey, Rajeswari, K.C., 2013. “Automatic Question Generation Using Software Agents for Technical Institutions”. In International Journal of Advanced Computer Research (ISSN (print): 2249-7277, ISSN (online): 2277-7970), Volume 3, Number 4, Issue 13, pp. 307-311.

[7] Manish Agarwal, Rakshit Shah, and Prashanth Mannem, 2011. “Automatic Question Generation using Discourse Cues”. In Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 1-9.

[8] Arjun Singh Bhatia, Manas Kirti, and Sujan Kumar Saha, 2013. “Automatic Generation of Multiple Choice Questions Using Wikipedia”. In P. Maji et al. (Eds.): PReMI 2013, LNCS 8251, pp. 733-738.
