A study on identifying similar users across multiple social media sites

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 04 Issue: 03 | Mar-2017

p-ISSN: 2395-0072

www.irjet.net

A STUDY ON IDENTIFYING SIMILAR USERS ACROSS MULTIPLE SOCIAL MEDIA SITES

Anju Viswam¹, Gopu Darsan²

¹ PG scholar, Department of Computer Science and Engineering, Sree Buddha College of Engineering, Alappuzha, India.
² Assistant Professor, Department of Computer Science and Engineering, Sree Buddha College of Engineering, Alappuzha, India.

Abstract - Social media sites have recently gained a great deal of attention. People rely on social media for different purposes, and not all social media sites provide the same services. For this reason, people tend to hold accounts on multiple social media sites. Identifying the accounts that belong to the same user across multiple sites is both challenging and interesting. Much research has been conducted on matching user accounts across different social media sites. In this paper we concentrate on the techniques that researchers have used to identify accounts belonging to the same user on multiple social media sites. The paper thus gives an overview of the techniques that address the problem of identifying user accounts across multiple online social media sites.

Key Words: cross-communities, k-anonymity, l-diversity, social networks, social-tagging systems, user identification.

1. INTRODUCTION

The number of internet users is increasing, and a large share of them are active members of a social network. People rely on different social media networks for news, information, and other people's opinions on various subjects. For example, people use Facebook to chat with friends and family and to share aspects of their personal lives, and Twitter to post about the things they are most passionate about. For this reason, people tend to have accounts on multiple social media sites.

Discovering the accounts that belong to the same user is of growing interest among researchers. Though challenging, it is useful in developing many applications. It makes it possible to aggregate information about a single user, and the merged information gives a detailed view of all available data about that user. This information helps to construct a complete social graph, which supports applications such as information retrieval, collaborative filtering, and sentiment analysis. It is also useful in modern marketing. Because modern marketing relies on targeted promotional messages, it is very useful to discover the accounts that belong to the same user. Once the target customer is identified, the marketer does not have to bother the customer with multiple messages that have the same content. Identifying the same user's accounts across multiple sites is also useful for automatic contact merging, a feature found on most mobile phones.

Identifying the same user's accounts across multiple online social media sites is therefore a challenging research area. Many studies were based on the profile attributes of users, on the content posted by users, and on analysis of network structures. Some of these studies are explained below.

2. LITERATURE SURVEY

As explained above, current studies depend on profile attributes, content, and network structures.

2.1 User identification based on profile

The username is a publicly available feature of a user's profile. Perito et al. [1] presented an analytical model based on binary classifiers that calculates the similarity of usernames in order to identify accounts belonging to the same user. Liu et al. [2] followed an unsupervised approach for linking users across multiple online sites. They computed the n-gram probabilities of usernames to distinguish rare usernames from common ones. R. Zafarani and H. Liu [3] matched users by extracting the usernames that appear in the URLs of web pages. R. Zafarani and H. Liu [4] further developed a supervised learning approach to study the behavior users exhibit when choosing usernames across communities. Because usernames are publicly available and can be faked by anyone, these studies have some limitations.

Acquisti et al. [5] used profile photos to match users. They applied a face recognition algorithm and conducted the experiment on Facebook, where profile photos are publicly available and can be easily extracted. However, this algorithm cannot be applied to a large network because many users can use the same profile photo.

Iofciu et al. [6] linked users across social tagging systems based on tags and user IDs. The tagging behavior of users was used to construct user profiles based on a symmetric variant of BM25, and users were then linked through their tags. Though it works across platforms, this technique cannot assure privacy.

Motoyama and G. Varghese [7] used a boosting-based classifier to calculate the similarity of usernames. They analyzed the profile attributes such as the

© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 354

