
Secure Dataset Verification Using Blockchain and Merkle Trees


International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 13 Issue: 02 | Feb 2026

p-ISSN: 2395-0072

www.irjet.net

D. John Subuddhi¹, S. Keerthi Srivalli², Yesaswini Swarna³, S. Karthik⁴, P. Chetan Satish⁵
¹²³⁴⁵Department of Computer Science and Engineering, Vishnu Institute of Technology, Bhimavaram, India

Abstract - Most models rely heavily on data, and that data may be large and continuously updated. Existing data management models lack important properties such as data trustworthiness, which leaves them vulnerable to attacks. Blockchain is a technology in growing demand for increasing the assurance of data integrity in distributed systems. Still, using cryptographic hash methods alone gives ineffective results. In this paper we therefore introduce a model that uses hash generation to create a Merkle root node; by comparing this Merkle root with a root value produced later, we can confirm whether a dataset has been modified. Existing models may build a Merkle tree, but no model has additional features such as trust score calculation and a tampering alert system. We therefore integrate a dynamic trust score, which is automatically reduced when tampering is detected; role-based access to data, to control unauthorized actions; differential dataset backup, which helps restore the original dataset; and a smart alert system, which sends alerts immediately when tampering is detected. By combining all these features in a single framework, the model introduces capabilities such as transparent tracking of data, automated trust evaluation and dataset versioning that were not introduced in previous models. The primary beneficiaries include ML practitioners, data researchers, and organizations.

Key Words: Blockchain-based Security, Dataset Integrity, SHA-256 Hashing, Binary Merkle Tree, Smart Contracts, Role-Based Access Control (RBAC).

1. INTRODUCTION

Distributed and data-driven systems [1] are growing fast. In these systems it is very important to make sure that the data is correct and trustworthy. Many modern applications in intelligence, cyber security [2] and other platforms rely heavily on shared datasets that are large and constantly updated [3]. This makes it more likely that someone could change the data without permission, tamper with it or poison it, which can hurt the system's reliability. So we need a way to continuously check that the data is correct and trustworthy [4].

The solutions we have now mostly depend on trusted authorities, which makes them vulnerable to failures and attacks. Also, most systems do not have a way to check data integrity in real time, evaluate trust and send out alerts automatically [5]. The lack of a framework that supports checking integrity, tracking where the data comes from and evaluating trust is a gap in research that this study tries to fill. This study tested the idea that such a framework can help make digital data systems more trustworthy [6].

Making sure the data is correct directly affects how reliable the final results, machine learning models and decision-making processes are [7]. By providing a way to prevent tampering, this study helps improve accountability and reduce security risks [8], which is important for applications that rely on data. Hash functions [9] have already been used to detect when data has been changed, but on their own they are not sufficient. Merkle Trees [10] are a way to efficiently check datasets. With Merkle Trees, records from a dataset are combined to get hash values, and these hash values are then combined to get a single Merkle root value [11]. The main problem this research identifies is that there is no way to guarantee that the data is correct and trustworthy in a distributed environment [12]. The main goal of this study is to design a framework that uses blockchain and combines Merkle Tree-based verification [13] with a trust scoring mechanism, real-time tamper detection [14] and role-based access control [15]. What makes this approach special is that it creates Merkle Trees and different versions of datasets. The proposed solution shows that using blockchain technology with Merkle Tree-based verification [16] can greatly increase the assurance of data integrity in distributed systems. We expect the system to be able to detect tampering in real time, update trust scores based on verification results, send out alerts when tampering occurs and control access based on roles. This is possible because of hashing, the efficiency of Merkle Trees, and the decentralized trust model that blockchain technology offers. We chose blockchain technology because it does not rely on authorities, and Merkle Trees because they make it efficient to verify large datasets without needing to access the whole dataset.

This makes the model very suitable for distributed environments where data is frequently updated and shared among parties that do not trust each other. The advantages of the proposed system are that it provides guarantees of integrity, non-repudiation and tamper resistance. It also reduces storage requirements by storing hash values on the blockchain, supports tampering verification, enables proper auditing and allows real-time integrity monitoring, making it feasible for real-world use.
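The Merkle root comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`merkle_root`, `is_unmodified`), the sample records, and the choice to duplicate the last hash when a level has an odd number of nodes are all assumptions made for illustration. Only SHA-256 and the binary tree shape come from the paper's key words.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(records: list[str]) -> str:
    """Compute a binary Merkle root over a list of text records.

    Assumption: when a level has an odd number of hashes, the last
    hash is duplicated so that every node has two children.
    """
    if not records:
        raise ValueError("dataset must contain at least one record")
    # Leaf level: one hash per record.
    level = [sha256_hex(r.encode("utf-8")) for r in records]
    # Combine pairs of hashes until a single root remains.
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


def is_unmodified(records: list[str], stored_root: str) -> bool:
    """Compare a freshly computed root with the previously stored root."""
    return merkle_root(records) == stored_root


# Hypothetical dataset; in the proposed framework the stored root
# would be kept on the blockchain rather than in a local variable.
original = ["id=1,label=cat", "id=2,label=dog", "id=3,label=bird"]
stored_root = merkle_root(original)

tampered = ["id=1,label=cat", "id=2,label=wolf", "id=3,label=bird"]

print(is_unmodified(original, stored_root))  # True
print(is_unmodified(tampered, stored_root))  # False
```

Changing any single record changes its leaf hash and therefore every hash on the path up to the root, so only the short root value needs to be stored to detect later modification.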

© 2026, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 838

