Tourist Destination Recommendation System using Cosine Similarity


International Research Journal of Engineering and Technology (IRJET) e-ISSN:2395-0056

Volume: 09 Issue: 10 | Oct 2022 www.irjet.net p-ISSN:2395-0072


Shashank Jagtap¹, Sayali Borate²

¹Student, School of Science, Dr. Vishwanath Karad MIT World Peace University, Pune, Maharashtra, India
²Student, Dept. of Computer Engineering, PES Modern College of Engineering, Pune, Maharashtra, India

Abstract - Over the past decade, the internet has become our go-to tool for information and even for recommendations, including finding places for vacations. However, with the vast number of destination options and the information available about them, a lot of time is wasted before a relevant tourist destination is identified. The Destination Recommendation System leverages data analysis and machine learning in order to give cogent and fast recommendations. This paper describes an approach which offers generalized recommendations to every user using the cosine similarity algorithm. The dataset used exhibits a vast and distinct combination of tourist places. The cosine similarity algorithm predicts the most relevant tourist places using some important features of the dataset such as tourism category, minimum budget (per day) and the visa requirement.

Keywords: machine learning, cosine similarity, recommendation system, tourist destination

1. INTRODUCTION

The amount of information is increasing day by day, and traversing it to find what is relevant is difficult. To automate this process, we use machine learning algorithms. They are of 3 main types[2]:

1. Supervised: learning a function that maps an input to an output based on example input-output pairs.

2. Unsupervised[11]: algorithms are left to their own devices to discover and present the interesting structure in the data without a supervisor.

3. Reinforcement: algorithms are concerned with how software agents ought to take actions in an environment in order to maximize some notion of cumulative reward.

We use these machine learning algorithms to find patterns and similarities between different items of a dataset. A recommendation system is an application of machine learning that plays a vital role in providing relevant recommendations, be it a movie recommendation or a product recommendation. In our daily life we depend on recommendations provided by our friends and family or on general surveys. Similarly, recommendation systems are tools that use algorithms to provide logical and rational product recommendations that might interest users.

The Destination Recommendation System filters through enormous data and provides a highly relevant and cogent recommendation based on vital parameters of the tourist dataset. It uses the cosine similarity algorithm to provide fast and reliable recommendations.

Our goal is to minimize user effort and provide a fast, one-stop solution for finding their dream vacation place.

1.1 Dataset Attributes:

● Destination: This specifies the tourist destination. It consists of all the possible tourist destinations, and the output predicted is based on the input destination provided by the user.

● Category: This attribute describes the category of a particular tourist place. It specifies whether the place has a historical, cultural, architectural, commercial or any other significance.

● Minimum budget per day: It specifies the minimum daily budget. This variable is expressed in dollars ($).

● Best Months: This attribute gives the best time to visit a particular destination based on climatic conditions.

● State/Country: It specifies the location of the destination.

● Continent: It specifies the continent of the destination.

● Language: This variable describes the native language of a particular destination.

● Visa: This attribute states whether a visa is required for Indian passport holders. It is a boolean variable.
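As a rough illustration of the attributes listed above, one dataset record could be modelled as follows. This is only a sketch: the class name, the field names and every value in the example row are placeholders chosen for illustration and are not taken from the authors' dataset.

```python
from dataclasses import dataclass


@dataclass
class Destination:
    """One dataset row; fields mirror the attributes listed in section 1.1."""
    destination: str      # name of the tourist destination
    category: str         # e.g. "Historical", "Beaches"
    min_budget_usd: int   # minimum budget per day, in dollars
    best_months: str      # best time to visit
    state_country: str    # location of the destination
    continent: str        # continent of the destination
    language: str         # native language
    visa_required: bool   # visa requirement for Indian passports


# Placeholder values for illustration only; not from the paper's dataset.
example = Destination(
    destination="Example City",
    category="Historical",
    min_budget_usd=50,
    best_months="October-March",
    state_country="Example Country",
    continent="Asia",
    language="Example Language",
    visa_required=True,
)
print(example.category, example.min_budget_usd, example.visa_required)
```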

1.2 Phases of recommendation process:

There are three main phases[3] in the process of recommendation, as follows:

1. Information collection phase: In this phase, relevant information about the users is collected in order to generate a user profile. The system needs to know as much as possible from the user in order to provide reasonable recommendations.

© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page106


Recommendation systems rely on different types of input such as:

● Explicit Feedback: Here, the system asks the users to provide their ratings for items via the user interface. The quality and efficiency of the recommendation system rely on the user ratings.

● Implicit Feedback: The user’s preferences are automatically inferred by monitoring different actions such as navigation history, time spent on web pages, links followed, etc. Unlike explicit feedback, no user participation is required to gather implicit feedback.

● Hybrid Feedback: It is a combination of both explicit and implicit feedback. It works by using implicit data as a check on explicit ratings, or by allowing the user to give explicit feedback only when they choose to express explicit interest.

2. Learning Phase: This phase applies learning algorithms to the user’s data obtained from the feedback in the information collection phase. The learning algorithms are the methods that draw out the patterns needed for the application.

3. Recommendation Phase: The pattern obtained from the previous phase is analyzed in order to provide recommendations for given data. This recommendation can be made either directly, based on the dataset collected in the information collection phase, or through the system’s observed activities of the user.

2. RELATED WORK

Recommender systems take into account user preferences and then suggest personalized content.

Recommender systems can be designed using any of the following approaches[1]:

1. Content-based filtering
2. Collaborative filtering
3. Hybrid filtering

1. Content-based filtering[14]: Content-based recommenders are essentially a user-specific learning problem to quantify the user’s utility (likes and dislikes, rating, etc.) based on item features. Content-based filtering provides recommendations based on previously used feedback. It does not require other users' data during recommendations to one user.

2. Collaborative filtering[13]: Collaborative filtering filters information by using the interactions and data collected by the system from other users. This technique is based on a similarity index: items are suggested to a user on the basis of ratings or reactions by similar users. There are 2 subtypes:

a. User-user based: this measures the similarity between the target user and other users.

b. Item-item based: this measures the similarity between the items that the target user rates or interacts with and other items.

3. Hybrid filtering: Hybrid filtering is the combination of content-based filtering and collaborative filtering. It is used to eliminate the limitations of content-based and collaborative filtering and generate more accurate recommendations.

3. METHODOLOGY

The project aims to build a platform that recommends tourist destinations to users and provides a detailed description of each recommended destination. The information provided cuts down the time spent going through multiple websites in order to decide on a specific destination to visit.

The model allows us to predict the best tourist destination based on user input. It uses a content-based filtering approach built on cosine similarity to make these recommendations.

Following are the steps used for recommending destinations to the user:

1. Data collection
2. Data preparation
3. Combining relevant features
4. Apply filtering algorithm
5. Provide recommendations

1. Data collection: The first step to build a destination recommendation system is getting the appropriate data. In this project, we designed the dataset. The dataset used contains 9 features, namely Index, Destination, Category, Min Budget (Per Day in $), Best Months, State/Country, Continent, Language and Visa.

2. Data preparation: After data collection, preprocessing the data to handle corrupted or missing data is essential. This is done in the data preparation step.

3. Combining relevant features: In this step, only the features required to make recommendations are combined into a single attribute. In the destination recommendation system, 3 features (Category, Min Budget (Per Day in $) and visa requirement) are used to make predictions.

4. Apply a filtering algorithm: To recommend destinations, a filtering algorithm needs to be implemented in order to find the amount of similarity between the different destinations available in the dataset. In this project, a content-based filtering approach using cosine similarity is used to make recommendations.

5. Provide recommendations: Once cosine similarity has been applied for a particular user input, the data records in the dataset get different similarity values. Based on these values, after sorting, the top 5 destinations are recommended to the user.
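Steps 3 to 5 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the rows are hypothetical, the helper names (`combine_features`, `cosine_similarity`, `recommend`) are invented for this sketch, and encoding the budget and visa flag as text tokens is one possible choice that the paper does not specify. Plain Python is used to keep the sketch dependency-free.

```python
import math
from collections import Counter

# Hypothetical rows; names and values are placeholders, not the paper's dataset.
DATASET = [
    {"Destination": "Place A", "Category": "Beaches", "Min Budget": 80, "Visa": False},
    {"Destination": "Place B", "Category": "Beaches", "Min Budget": 80, "Visa": True},
    {"Destination": "Place C", "Category": "Historical", "Min Budget": 40, "Visa": True},
    {"Destination": "Place D", "Category": "Historical Beaches", "Min Budget": 40, "Visa": False},
]


def combine_features(row):
    """Step 3: join category, budget and visa requirement into one string."""
    visa = "visa_yes" if row["Visa"] else "visa_no"
    return f'{row["Category"].lower()} budget_{row["Min Budget"]} {visa}'


def cosine_similarity(a, b):
    """Step 4: cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in a)  # missing tokens count as 0
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(name, dataset, top_n=5):
    """Step 5: rank all other destinations by similarity to `name`."""
    vectors = {r["Destination"]: Counter(combine_features(r).split()) for r in dataset}
    query = vectors[name]
    scores = [(other, cosine_similarity(query, vec))
              for other, vec in vectors.items() if other != name]
    scores.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
    return scores[:top_n]


for place, score in recommend("Place A", DATASET):
    print(f"{place}: {score:.3f}")
```

With these placeholder rows, "Place B" shares two of three tokens with "Place A" and ranks first, "Place D" shares the beaches category and the visa flag and ranks second, and "Place C" shares nothing and scores 0.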

The proposed model uses cosine similarity in order to recommend tourist destinations based on user input.

Cosine similarity[5][12]: Cosine similarity is a metric that helps determine how similar two data objects are, irrespective of their size. We can measure the similarity between two records using cosine similarity. The comparison is done by taking the dot product of the two vectors and dividing it by the product of their lengths. The formula for cosine similarity is

$$\cos(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \qquad (1)$$

where,

● x · y = dot product of the vectors ‘x’ and ‘y’.

● ||x|| and ||y|| = lengths of the two vectors ‘x’ and ‘y’.

● ||x|| * ||y|| = product of the lengths of the two vectors.
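A small worked example may help show why the metric is independent of vector size. The sketch below implements equation (1) directly in plain Python; the function name and the sample vectors are chosen here for illustration only.

```python
import math


def cosine_similarity(x, y):
    """cos(x, y) = (x . y) / (||x|| * ||y||), as in equation (1)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]  # same direction as x, twice the length
z = [3.0, 0.0, 0.0]  # shares only the first component with x

print(round(cosine_similarity(x, y), 6))  # 1.0
print(round(cosine_similarity(x, z), 6))  # 0.267261
```

Doubling the length of a vector leaves the similarity at 1, because only the direction enters the formula.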

Fig-1: Cosine Similarity

As the above diagram shows, the angle between v1 and v2 is Θ. The greater the angle between the two vectors, the lesser the similarity. That is, if the angle between two vectors is large, they are very different from each other, and if the angle is small, the vectors are almost alike.

The cosine distance is calculated using the formula:

cosine distance = 1 - cosine similarity

Hence, when

● Θ = 0°: cos 0° = 1, cosine distance = 1 - 1 = 0

∴ The two vectors point in the same direction

● Θ = 90°: cos 90° = 0, cosine distance = 1 - 0 = 1

∴ The two vectors are very different

● Θ = 180°: cos 180° = -1, cosine distance = 1 - (-1) = 2

∴ The two vectors are opposite to each other
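The three cases above can be checked numerically. In this sketch the function name and the 2-D unit vectors are illustrative choices: one reference vector is compared against vectors at 0°, 90° and 180° to it.

```python
import math


def cosine_distance(x, y):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


reference = (1.0, 0.0)
cases = {
    "0 degrees": (1.0, 0.0),     # same direction
    "90 degrees": (0.0, 1.0),    # perpendicular
    "180 degrees": (-1.0, 0.0),  # opposite direction
}

for label, vector in cases.items():
    print(label, cosine_distance(reference, vector))
# 0 degrees 0.0
# 90 degrees 1.0
# 180 degrees 2.0
```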

4. RESULT

In this recommendation system, we have used cosine similarity and content-based filtering to recommend a destination to the user. The code is written in Python and uses the NumPy and pandas libraries.

The project uses cosine similarity in order to determine which destinations are to be recommended to the user. Cosine similarity measures the angle between two vectors and determines the amount of similarity.


In the Tourist Destination Recommendation System, the user is first asked to enter a destination of their liking. The cosine similarity algorithm is applied to the destination entered by the user, generating a cosine similarity score for every destination available in the dataset. Three parameters (minimum budget, visa requirement and category) are used to determine the similarity. Once all destinations have been assigned a score, they are sorted in descending order. The top 5 destinations most similar to the one entered by the user are then recommended. Additional information about the location, budget, visa requirements and best time to visit is also provided to the user.

In Fig-2, we have taken two categories: ‘Beaches’ on the x-axis and ‘Historical’ on the y-axis. When the user enters the destination as “Maldives”, all destinations available in the dataset are ranked. To measure the similarity, the angle between two destinations is denoted by theta (Θ). Because the feature values are non-negative, the similarity ranges between 0 and 1. Mumbai lies in both categories (Historical, Beaches). Hence, when the destinations are ranked, there is a high possibility of Mumbai being recommended right after the places in the beaches category, ahead of a place which is only historical.
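The two-category example can be reproduced with a small sketch. It assumes each destination is encoded as a binary vector (beaches, historical); that encoding, the function name and the third entry ("Historical-only place") are assumptions made here for illustration and do not come from the paper's dataset.

```python
import math


def cosine_similarity(x, y):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Axes follow the diagram: (beaches, historical). Encoding is an assumption.
places = {
    "Maldives": (1, 0),               # beaches only
    "Mumbai": (1, 1),                 # both categories
    "Historical-only place": (0, 1),  # hypothetical entry
}

query = places["Maldives"]
ranked = sorted(
    ((name, cosine_similarity(query, vec))
     for name, vec in places.items() if name != "Maldives"),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
# Mumbai: 0.707
# Historical-only place: 0.000
```

Mumbai sits at 45° to Maldives and scores about 0.707, while a purely historical place is at 90° and scores 0, which matches the ranking described above.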

The output when the user enters “San Francisco” as the preferred destination is shown in Fig-3.

5. CONCLUSION

The proposed system recommends similar destinations based on the user’s input, which is a place of their liking. Every destination in the dataset is ranked using cosine similarity and then sorted before the recommendations are provided. Currently, the recommendations are based on only three parameters, but the system can be improved by using more parameters. New destinations can be added to the dataset to provide more recommendations.

Fig-2: Cosine similarity depicting two different categories - Beaches and Historical

Fig-3: Top five recommendations provided when the user enters “San Francisco” as the liked destination

6. REFERENCES

[1] Ramni Harbir Singh, Sargam Maurya, Tanisha Tripathi, Tushar Narula, Gaurav Srivastav, Movie Recommendation System using Cosine Similarity and KNN, International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249-8958

[2] Batta Mahesh, Machine Learning Algorithms - A Review, International Journal of Science and Research (IJSR) ISSN: 2319-7064

[3] F.O. Isinkaye, Y.O. Folajimi, B.A. Ojokoh, Recommendation systems: Principles, methods and Evaluation, Egyptian Informatics Journal, Volume 16, Issue 3, November 2015, Pages 261-273

[4] Prof Sachin Walunj, Yashwant Bhaidkar, Pranav Bhagwat, Priyanka Bhalere, Radhika Gujar, Tourist Place Recommendation System, IJARIIE-ISSN(O)-2395-4396


[5] Hieu V. Nguyen and Li Bai, Cosine Similarity Metric Learning for Face Verification

[6] Mallari Vijay Kumar, P.N.V.S. Pavan Kumar, A Study on Different Phases and Various Recommendation System Techniques, International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-7, Issue-5C, February 2019

[7] Shubham Pawar, Pritesh Patne, Priya Ratanghayra, Simran Dadhich, Shree Jaswal, Movies Recommendation System using Cosine Similarity, International Journal of Innovative Science and Research Technology ISSN No: 2456-2165

[8] Sowmya D, Sayyed Johar, Ganavi M, Sankhya N Nayak, Analyzing Wine types and Quality using Machine Learning Techniques, International Journal of Engineering Applied Sciences and Technology, 2019, ISSN No. 2455-2143

[9] Pravinkumar Swamy, Sandeep Tiwari, Kunal Pawar Information Technology, Prof. Bharati Gondhalekar, Tourist Place Recommendation System, International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181

[10] Prof. P. A. Manjare, Miss P. V. Ninawe, Miss M. L. Dabhire, Miss R. S. Bonde, Miss D. S. Charhate, Miss M. S. Gawande, Recommendation System Based on Tourist Attraction, International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, p-ISSN: 2395-0072

[11] https://www.guru99.com/unsupervised-machinelearning.html

[12] https://www.geeksforgeeks.org/cosine-similarity/

[13] https://builtin.com/data-science/collaborativefiltering-recommender-system

[14] https://towardsdatascience.com/recommendationsystems-a-review-d4592b6caf4b
