A System for Detection and Prevention of Data Leak


International Research Journal of Engineering and Technology (IRJET) e-ISSN:2395-0056

Volume: 09 Issue: 09 | Sep 2022 www.irjet.net p-ISSN:2395-0072


Abstract - Technology has grown rapidly in recent years, and most organizations now store their data in digital form. This growth makes data security essential, since a data leak can have a severe effect on an organization, and preventing data leak has become one of the biggest challenges that organizations face. Organizations have adopted several security measures, such as policies, firewalls and VPNs. However, as data theft methods improve, these measures are no longer sufficient on their own. In addition, employees have access to the sensitive information of the company and could leak it either through negligence or on purpose. In this article, we propose a system that aims to achieve the information security goals of the organization and to detect data leak in any state of the data. The proposed system mainly focuses on preventing data leak.

1. INTRODUCTION

Security has become an important factor in our lives and is required in all sectors of industry. An attacker has various methods to access the confidential information of an organization. Hence, preventing such attackers from accessing the information is the main aim of information security. We need to implement various strategies to secure the information.

A data leak occurs when unauthorized users gain access to confidential or sensitive data. It can be caused intentionally, by employees of the organization or by malicious attackers, or unintentionally, by employees. In any case, the data is transferred outside the organization. Data leak usually occurs through email. It can also occur through data storage devices such as laptops.

Data is one of the most precious assets. Therefore, the prevention of data leak is the most important task for any organization.


Even with security measures like firewalls, data leak still occurs. In any organization, the employees have access to sensitive data. Hence, there is a chance that the data leak occurs through employees rather than through malicious attackers.

1.1 Types of Data

Any organization must deal with three types of data to prevent data leak:

1. Data in motion

It refers to data that is moving from the network to the outside world through the internet.

2. Data at rest

It refers to data that is stored in file systems, databases and other storage methods.

3. Data at the endpoint

It refers to data present at the endpoints in the network.

Most organizations scan the emails received from outside the organization for malware. However, they do not check the emails sent outside the organization, thereby allowing sensitive information to be sent outside the organization.

Most common causes of Data Breach are:

1. Hacking
2. Malware
3. Unintended Disclosure
4. Virus
5. Worms
6. Insider leak
7. Data loss

M.Tech Student, Department of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra, India
Associate Professor, Department of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra, India

© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1019


1.2 Causes of Data Leak

1. Virus Attack

If a machine is infected by viruses, worms, spyware, adware, etc., this might result in corruption and loss of data. In order to prevent this, anti-virus software should be used.

2. Malicious Attack

Ill-intentioned and malicious attackers can hack into the system and steal, modify or delete valuable information. This will cause data leak.

Data leak prevention (DLP) is the practice of detecting and preventing data breaches and the destruction of sensitive data. It is a set of tools and processes used to ensure that unauthorized users do not access, delete or modify sensitive data.

DLP software classifies data into different categories and checks for violations of the policies defined by the organization.

Once these policies are violated, DLP issues alerts and takes other protective measures to prevent end users from accidentally or maliciously sharing confidential information.

1.3 Types of Data Leak Prevention System

DLP can be classified into two types:

➢ Network DLP
➢ Endpoint DLP

➢ Network DLP

Network DLP stands for Network Data Loss Prevention. It is a technique for protecting communications over the network, such as the web applications, emails and data transfer mechanisms of the organization.

It helps to prevent the loss of sensitive data on the network. Moreover, it allows the company to encrypt data and to block risky information flows in order to monitor and control the flow of data over the network according to regulatory compliance.

➢ Endpoint DLP

Endpoint DLP stands for Endpoint Data Leak Prevention. It protects sensitive data at the endpoints. It also helps the organization to track employee behavior.

It monitors and addresses daily risky actions like sending emails, uploading data to the cloud, etc. It provides a wider range of threat protection.

In this project, we will be implementing Endpoint DLP.

2. PROPOSED METHODOLOGY

2.1 Problem Statement

To develop a system for Detection and Prevention of Data Leak.

2.2 Problem Elaboration

Fig -1: Data Leak Prevention model

In this system, the data will be secured by using DLP. DLP is a method to prevent users from sending sensitive data outside the organization. The system is used to check for any data transfer activity that might lead to data leak.

The system will control and monitor endpoint activities. It will also monitor the data in the cloud to protect data at rest, in motion, and in use. It will also generate reports and identify weaknesses in the system to enhance security.

2.3. Proposed System

We propose a System for Detection and Prevention of Data Leak. The system will constantly monitor the activities of the employees and will restrict any malicious activity. If any suspicious activity from an employee is found, the system will inform the admin by generating an incident mail.
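The incident mail described above could be assembled along the following lines. This is a minimal sketch, not the implementation used in this work: the function name `build_incident_mail`, the sender address and the wording of the message are assumptions made for illustration. Only the standard `email.message.EmailMessage` class is used, and actually delivering the message (for example over SMTP) is left out.

```python
from email.message import EmailMessage


def build_incident_mail(admin_address: str, employee: str,
                        filename: str, reason: str) -> EmailMessage:
    """Build (but do not send) an incident mail for the admin.

    All field values here are illustrative assumptions; the paper only
    states that an incident mail is generated for the admin.
    """
    msg = EmailMessage()
    msg["To"] = admin_address
    msg["From"] = "dlp-system@example.com"  # hypothetical sender address
    msg["Subject"] = f"DLP incident: {reason}"
    msg.set_content(
        f"Employee: {employee}\n"
        f"File: {filename}\n"
        f"Reason: {reason}\n"
    )
    return msg
```

The admin address would come from the value that the admin provides in the Settings screen.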


2.3.1 Parameters to Detect and Prevent Data Leak:

The system uses a set of parameters to detect and prevent Data Leak. They are as follows:

➢ Time Restriction
➢ Extension Restriction
➢ Keywords Restriction

1) Time Restriction

It is a technique that restricts malicious activity from users by applying time constraints. Here, the admin of the system can define a particular time frame for sending mails. A mail can be sent only within the time frame defined by the admin. It will not be sent outside the time frame.

e.g.: The time restriction can be set from 8:00 am to 9:00 pm. Any transmission outside this time frame will trigger an incident mail to the admin.
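The time check can be sketched as follows. This is an illustrative example under stated assumptions: the name `within_time_frame` is hypothetical, the window boundaries are treated as inclusive, and the window is assumed not to cross midnight. The 8:00 am to 9:00 pm values are taken from the example above.

```python
from datetime import time


def within_time_frame(sent_at: time, start: time, end: time) -> bool:
    """Return True if sent_at lies inside the admin-defined window.

    Assumes start <= end, i.e. the window does not cross midnight,
    and treats both boundaries as inclusive.
    """
    return start <= sent_at <= end


# Window from the example in the text: 8:00 am to 9:00 pm.
START = time(8, 0)
END = time(21, 0)
```

For instance, `within_time_frame(time(12, 30), START, END)` holds, while a transmission at `time(22, 0)` falls outside the window and would be blocked.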

2) Extension Restriction

It is a technique that restricts files whose extensions cannot be read by the system. The following are the readable file formats for the system:

➢ Excel (.xls)
➢ Word (.doc)
➢ PDF (.pdf)
➢ Text File (.txt)

Transmission of files with extensions other than Excel (.xls), Word (.doc), PDF (.pdf) and Text File (.txt) poses a threat to the confidentiality of the information of the organization, because the system cannot inspect their contents.

Hence, the Extension Restriction feature allows the admin to block the transmission of files with all other extensions, like jpg, jpeg, png, mp3, mp4, etc., which cannot be read by the system.
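A minimal sketch of the extension check is given below. The function name `has_allowed_extension` and the case-insensitive comparison are assumptions for illustration; the set of allowed extensions is the one listed above.

```python
from pathlib import PurePath

# Readable formats listed in the text.
ALLOWED_EXTENSIONS = frozenset({".xls", ".doc", ".pdf", ".txt"})


def has_allowed_extension(filename: str,
                          allowed: frozenset = ALLOWED_EXTENSIONS) -> bool:
    """Return True if the file's final extension is in the allowed set.

    The comparison is case-insensitive (an assumption), and a file
    without any extension is treated as not allowed.
    """
    return PurePath(filename).suffix.lower() in allowed
```

With this check, `report.PDF` would pass, while `photo.jpg` or a file with no extension would be blocked.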

3) Keywords Restriction

It is a technique that restricts the transmission of files that include confidential data.

The admin can define a set of suspicious keywords. The system will check whether the file contains any of the keywords.

If the file contains any of the suspicious keywords, the system will block the transmission; otherwise, it will allow the transmission.
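The keyword check can be illustrated with the sketch below. It assumes that the text of the file has already been extracted (reading .xls, .doc or .pdf content is outside the scope of this example) and that keywords are matched as whole words, ignoring case. The function names and the sample keyword set are hypothetical.

```python
import re

# Hypothetical keyword set; in the system this is defined by the admin.
SUSPICIOUS_KEYWORDS = {"confidential", "salary", "password"}


def find_restricted_keywords(text: str, keywords) -> list:
    """Return the sorted list of keywords that occur in text.

    Matching is whole-word and case-insensitive (both are assumptions).
    """
    found = []
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return sorted(found)


def allow_transmission(text: str, keywords) -> bool:
    """Allow the transmission only if no restricted keyword is present."""
    return not find_restricted_keywords(text, keywords)
```

Returning the matched keywords, rather than only a yes/no answer, makes it possible to mention them in the incident mail and in the activity log.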

2.3.2 System Architecture

Figure 2. System Architecture

The system has two major modules:

➢ ADMIN MODULE
➢ EMPLOYEE MODULE

➢ ADMIN MODULE:

The admin can log in using the credentials provided by the system.

The functionalities of the Admin Module are:

1. Manage Employees:

The admin has the rights to add, update and delete employee details. After an employee is registered, login credentials are generated and sent to the employee via mail.

2. Manage Keywords & File Extensions:

The admin can define a set of keywords. The keywords can be added, modified and deleted. The admin can also block the non-readable file extensions.

3. View Activity Logs:

The activity logs will be displayed here.


4. Settings:

The admin can set a time frame for sending emails. The admin also provides the email id at which incident emails are to be received.

➢ EMPLOYEE MODULE:

The employee can log in using the credentials provided by the system.

The functionalities of the Employee Module are:

1. Send File:

The employee can send files to anyone inside or outside the organization.

2. View Received Files:

The employee can download the files received through email.

3. View Registered Employees:

The employee can view the profiles of other employees and get information like their email id.

4. My Profile:

The employees can view and edit their own profile.

2.3.3 Working

The system constantly monitors the activities of employees to check whether there is any malicious activity. The employee can send files to anyone inside or outside the organization.

Whenever an employee wants to send a file over mail, the system will check all the parameters and will either block the mail or allow it to be sent.

The system will identify data leak based on the parameters described above, i.e. Time Restriction, Extension Restriction and Keywords Restriction.

If a transmission is blocked, the system will also generate an incident mail to the admin. In every case, it creates an activity log entry.
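The overall flow described in this section can be sketched as follows. This is an illustrative outline rather than the actual implementation: the names `Policy`, `evaluate` and `process` are hypothetical, the order of the checks (time, then extension, then keywords) is an assumption, the file text is assumed to be already extracted, and the incident mail and activity log are represented by plain Python lists.

```python
from dataclasses import dataclass
from datetime import datetime, time
from pathlib import PurePath


@dataclass(frozen=True)
class Policy:
    """Admin-defined settings (hypothetical structure)."""
    start: time
    end: time
    allowed_extensions: frozenset
    keywords: frozenset


def evaluate(policy: Policy, sent_at: datetime, filename: str, text: str) -> str:
    """Apply the three restrictions in turn and return a decision string."""
    if not (policy.start <= sent_at.time() <= policy.end):
        return "blocked: time"
    if PurePath(filename).suffix.lower() not in policy.allowed_extensions:
        return "blocked: extension"
    lowered = text.lower()
    # Simple case-insensitive substring match on the extracted text.
    if any(keyword.lower() in lowered for keyword in policy.keywords):
        return "blocked: keyword"
    return "allowed"


def process(policy: Policy, sent_at: datetime, filename: str, text: str,
            activity_log: list, incidents: list) -> str:
    """Evaluate a transmission, log it, and record an incident if blocked."""
    decision = evaluate(policy, sent_at, filename, text)
    activity_log.append((sent_at.isoformat(), filename, decision))
    if decision != "allowed":
        # Stand-in for generating an incident mail to the admin.
        incidents.append(f"Incident: {filename} ({decision})")
    return decision
```

In this sketch every transmission, blocked or allowed, produces an activity log entry, while only blocked transmissions produce an incident.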

Figure 3. Working

3. RESULTS

The following are the various case scenarios in which the system is able to restrict the transmission of suspicious data:

CASE 1: Restricted due to Time:

In this case, the transmission outside the time frame mentioned in the system is blocked.

Figure 4: Settings

As we can see in the figure, the time frame set by us is from 9:00 am to 5:00 pm.

Now, when we try to send a file outside the time frame, the system blocks the transmission.

Figure 5: Transmission Blocked

An incident email is generated to the admin and the activity logs are recorded.

Figure 6: Incident Mail

Figure 7: Activity Logs

CASE 2: Restricted due to Extension:

In this case, the transmission of the file is carried out within the time frame mentioned in the system, but the extension of the file is not readable.

Figure 8: Manage Files and Extensions

As we can see in the figure, the approved file extensions for transmission are:

➢ Excel (.xls)
➢ Word (.doc)
➢ PDF (.pdf)
➢ Text File (.txt)

Now, when we try to send a file within the time frame but with a suspicious extension, the system blocks the transmission.

Figure 9: Transmission Blocked

An incident email is generated to the admin and the activity logs are recorded.

Figure 10: Incident Mail

Figure 11: Activity Logs

CASE 3: Restricted due to Keyword:

In this case, the transmission of the file is carried out within the time frame mentioned in the system and the extension of the file is readable, but there are restricted keywords present in the file.

Figure 12: Manage Files And Extensions

As we can see in the figure, the given set of keywords is restricted.

Now, when we try to send a Word file with suspicious keywords, the system blocks the transmission.

Figure 13: Transmission Blocked

An incident email is generated to the admin and the activity logs are recorded.

Figure 14: Incident Mail

Figure 15: Activity Logs

CASE 4: Successful Transmission:

In this case, the transmission of the file is carried out within the time frame mentioned in the system, the file has an approved extension, and it does not contain any restricted keywords.

Here, the system allows the transmission and the activity logs are recorded.

Figure 16: Transmission Successful

As we can see, the system shows a success message.

Figure 17: Activity Logs

The receiver can download and view the file.

Figure 18: Download Received Files


4. CONCLUSION

Data leak is a major issue for many organizations and can have a disastrous effect on them. Hence, preventing data leak is very important. In this paper, a system is proposed for the detection and prevention of data leak, which aims to achieve the security goals of an organization. The system restricts transmissions based on time, file extension and keywords, notifies the admin through incident mails and records activity logs. The proposed method is easy to implement and can be useful for many organizations.


BIOGRAPHIES

Aishwarya Jadhav is currently pursuing M. Tech from VJTI COE, Mumbai. She has done her B.E. (Computer Engineering) from Atharva College of Engineering.

Prof. Pramila M Chawan is working as an Associate Professor in the Computer Engineering Department of VJTI, Mumbai. She has done her B.E. (Computer Engineering) and M.E. (Computer Engineering) from VJTI College of Engineering, Mumbai University. She has 28 years of teaching experience and has guided 85+ M. Tech. projects and 130+ B. Tech. projects. She has published 143 papers in International Journals and 20 papers in National/International Conferences/Symposiums. She has worked as an Organizing Committee member for 25 International Conferences and 5 AICTE/MHRD sponsored Workshops/STTPs/FDPs. She has participated in 16 National/International Conferences. She has worked as Consulting Editor on JEECER, JETR, JETMS, Technology Today, JAM&AER Engg. Today and The Tech. World; as Editor of Journals of ADR; and as Reviewer for IJEF, Inderscience.

She has worked as NBA Coordinator of the Computer Engineering Department of VJTI for 5 years. She had written a proposal under TEQIP-I in June 2004 for ‘Creating Central Computing Facility at VJTI’. Rs. Eight Crore were sanctioned by the World Bank under TEQIP-I on this proposal. The Central Computing Facility was set up at VJTI through this fund, which has played a key role in improving the teaching learning process at VJTI.

She was awarded the Innovative & Dedicated Educationalist Award (Specialization: Computer Engineering & I.T.) by SIESRP in 2020.

AD Scientific Index Ranking (World Scientist and University Ranking 2022):

2nd Rank - Best Scientist, VJTI, Computer Science domain
1138th Rank - Best Scientist, Computer Science, India

