International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 11 Issue: 11 | Nov 2024
p-ISSN: 2395-0072
www.irjet.net
Using Large Language Models (LLMs) to Detect Bad Actors on Social Media Platforms

Ajay Krishnan Prabhakaran
Data Engineer, Meta Inc.
---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - The rapid proliferation of social media platforms has enabled the spread of harmful content by bad actors, including bots, trolls, and purveyors of misinformation. This article examines how Large Language Models (LLMs), such as GPT-4 and BERT, can be leveraged to detect and mitigate the impact of these bad actors. Through advanced natural language processing (NLP) capabilities, LLMs offer significant improvements over traditional moderation tools. This paper explores the advantages, challenges, and ethical considerations of using LLMs for social media moderation and presents case studies where these models have been implemented effectively.

Key Words: Large Language Models (LLMs), Social Media Moderation, Bad Actors, Fake News Detection, Toxicity Detection, AI Ethics, Automated Moderation, Natural Language Processing (NLP), Content Filtering, Misinformation

1. INTRODUCTION

Social media platforms have become a cornerstone of modern communication, allowing individuals to connect, share information, and engage in real-time interactions. However, as these platforms grow, so do the challenges associated with moderating harmful content. Bad actors, such as trolls, bots, and purveyors of misinformation, have emerged as significant threats to the safety and integrity of online spaces. These individuals or automated accounts engage in activities like harassment, spreading fake news, and manipulating public opinion, which can have serious consequences, including societal division and the erosion of trust in democratic processes. The sheer volume and complexity of content on social media make it increasingly difficult for traditional content moderation systems, which rely on manual review or simple keyword filters, to keep pace.

The limitations of traditional moderation highlight the need for more advanced solutions, leading many social media platforms to explore the potential of Large Language Models (LLMs), such as GPT-4 and BERT, to automate and enhance content moderation efforts. Unlike previous models, LLMs possess advanced Natural Language Processing (NLP) capabilities, allowing them to understand context, detect subtle nuances, and identify harmful content with greater precision. These models can process vast amounts of unstructured text data at scale, making them an ideal tool for handling the continuous flow of user-generated content across platforms. By analyzing posts, comments, and interactions, LLMs can detect a wide range of harmful behaviors, from toxic language and harassment to misinformation and bot activity.

2. BACKGROUND OF SOCIAL MEDIA AND THE RISE OF BAD ACTORS

2.1 The Growth and Influence of Social Media

Over the past two decades, social media platforms have evolved from simple networking tools to powerful global communication networks that have fundamentally altered the way individuals interact, share information, and form communities. Platforms such as Facebook, Twitter, Instagram, and TikTok boast billions of active users, with many individuals using these platforms to share their personal lives, discuss current events, and connect with others across cultural and geographical boundaries. Social media has become essential in both personal and professional spaces, serving as a primary source of news, entertainment, and even political discourse. The ability to access information and communicate with others in real time has democratized content creation and consumption, offering unprecedented opportunities for interaction and engagement.

However, the rapid growth of social media has also introduced significant challenges. The sheer volume of content shared on these platforms, combined with the diversity of users and the wide array of interactions, has created an environment where harmful behaviors can thrive. While social media can be a tool for empowerment, it has also become a breeding ground for bad actors who engage in activities that undermine the integrity and safety of online spaces.

2.2 Types of Bad Actors on Social Media
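Before cataloguing the bad-actor types, it may help to make the introduction's claim concrete: LLMs screening a continuous stream of posts for toxicity, harassment, misinformation, and bot activity. The paper does not publish an implementation, so the following is only a minimal sketch of such a screening loop; every name here is hypothetical, and `stub_scores` stands in for a call to a hosted model such as GPT-4 so that the structure is runnable.

```python
# Hypothetical sketch of an LLM-backed moderation loop (not the paper's
# implementation). A stub scorer stands in for a real model call.
from dataclasses import dataclass
from typing import Callable, Dict, List

LABELS = ["toxicity", "harassment", "misinformation", "bot_activity"]

@dataclass
class Verdict:
    text: str
    scores: Dict[str, float]  # label -> probability in [0, 1]
    flagged: bool             # True if any label crosses the threshold

def moderate(posts: List[str],
             score_fn: Callable[[str], Dict[str, float]],
             threshold: float = 0.8) -> List[Verdict]:
    """Screen a stream of posts; flag any post whose highest
    label score reaches the threshold, leaving it for review."""
    verdicts = []
    for text in posts:
        scores = score_fn(text)
        flagged = max(scores.values()) >= threshold
        verdicts.append(Verdict(text, scores, flagged))
    return verdicts

def stub_scores(text: str) -> Dict[str, float]:
    """Placeholder for an LLM call; crude keyword heuristics only."""
    lowered = text.lower()
    return {
        "toxicity": 0.95 if "idiot" in lowered else 0.05,
        "harassment": 0.90 if "you people" in lowered else 0.05,
        "misinformation": 0.10,
        "bot_activity": 0.10,
    }

results = moderate(["Lovely weather today!",
                    "Only an idiot believes this."], stub_scores)
print([v.flagged for v in results])  # -> [False, True]
```

In a real deployment the stub would be replaced by an actual model call, and flagged posts would typically be queued for human review rather than removed outright.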
Bad actors on social media encompass a wide range of individuals and automated accounts that engage in harmful behaviors, each posing different threats to the safety and reliability of online interactions. These actors can generally be divided into several categories:

Trolls: These are individuals who deliberately post provocative, inflammatory, or disruptive content with

© 2024, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 428