Cryptography: Verify file integrity with md5 hash checksum
A checksum is used to verify the integrity of files. It can detect errors introduced while data is transmitted between devices or over the internet. The checksum functions as a digital fingerprint of a file. Programs such as md5sum, sha1sum and sha256sum calculate and verify checksums: md5sum works with 128-bit MD5 hashes, sha1sum with 160-bit SHA-1 hashes, and sha256sum with 256-bit SHA-256 hashes. MD5 is good at catching accidental corruption, but it is no longer collision-resistant, so prefer sha256sum when deliberate tampering is a concern.
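To make the fingerprint idea concrete, here is a small sketch that hashes short strings instead of files. The sample strings and variable names are made up for illustration; it assumes the GNU md5sum and sha256sum tools that ship with Ubuntu.

```shell
# Hash the same input twice: the digest is identical every time.
first=$(printf 'hello' | md5sum | cut -d' ' -f1)
second=$(printf 'hello' | md5sum | cut -d' ' -f1)

# Change a single character: the digest is completely different.
changed=$(printf 'hellp' | md5sum | cut -d' ' -f1)

echo "hello (run 1): $first"
echo "hello (run 2): $second"
echo "hellp        : $changed"

# MD5 digests are 128 bits (32 hex characters),
# SHA-256 digests are 256 bits (64 hex characters).
sha=$(printf 'hello' | sha256sum | cut -d' ' -f1)
echo "MD5 length    : ${#first}"
echo "SHA-256 length: ${#sha}"
```

The same input always gives the same digest, and even a one-character change gives an unrelated one. That is what makes a checksum usable as a fingerprint.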
Verifying locally stored files
Suppose you have computed the checksums of your files and stored them in a separate file. Later, to check whether any of those files has been modified or corrupted, you recompute the checksums and compare them with the stored ones. If they match, the files are unchanged. If they do not match, the file in question has been altered or corrupted. You can also verify each file individually.
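As a rough sketch of checking a single file against a stored list, the snippet below picks one line out of the checksum file and hands it to md5sum -c on standard input. The file names (a.txt, b.txt, checksums.md5) are hypothetical, and it works in a temporary directory so no real files are touched.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two sample files and a checksum list covering both (hypothetical names).
echo "alpha" > a.txt
echo "beta" > b.txt
md5sum a.txt b.txt > checksums.md5

# Later: re-check only b.txt by feeding its line to md5sum -c on stdin ("-").
result=$(grep ' b.txt$' checksums.md5 | md5sum -c -)
echo "$result"
```

On an English locale this should report b.txt as OK and say nothing about a.txt, because only the selected line was checked.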
Verifying downloaded files
Sometimes data is lost or corrupted during a download. In that case, you can verify the integrity of the downloaded files with a checksum. For this, the checksums of the files must already be stored on the server. You download the file containing those checksums, compute the checksums of your downloaded files, and compare them with the stored values. If they match, the files arrived intact. If they do not match, the file in question has been altered or corrupted. You can also verify each file individually.
In this article, I will show how to compute the checksums of files, store them in a new file, and then verify the integrity of the files using md5sum. I am using Ubuntu Linux as my operating system.
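The download case can be simulated locally. In this sketch, a 'server' directory stands in for the remote machine and cp stands in for the download. The names (app.bin, MD5SUMS) are assumptions for illustration only.

```shell
# Throwaway directory with a pretend server and a local download folder.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir server local

# The server holds a file and a published checksum list for it.
echo "release contents" > server/app.bin
(cd server && md5sum app.bin > MD5SUMS)

# "Download" both the file and the checksum list.
cp server/app.bin server/MD5SUMS local/
cd local || exit 1

# A complete download passes the check.
if md5sum -c MD5SUMS >/dev/null 2>&1; then first_try="intact"; else first_try="corrupted"; fi
echo "complete download : $first_try"

# Simulate a truncated download and check again.
printf 'release' > app.bin
if md5sum -c MD5SUMS >/dev/null 2>&1; then second_try="intact"; else second_try="corrupted"; fi
echo "truncated download: $second_try"
```

The check relies only on the exit status of md5sum -c, which is zero when every listed file matches and non-zero otherwise.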
1. Create files and folders
I go to my home folder and create a new directory there named ‘crypto’. The crypto folder will be my working directory for this example.
Inside it, I create two directories named ‘folder1’ and ‘folder2’ and five files: three in the working directory itself and one inside each of folder1 and folder2.
mkdir -p ~/crypto
cd ~/crypto
mkdir folder1 folder2
touch file1.txt file2.txt file3.txt folder1/file4.txt folder2/file5.txt
2. Print all files
Let's check all the files and folders present inside our working directory.
Print all files
find . -type f -print
Print all directories
find . -type d -print
3. Compute md5 hash of all files and store it in a new file named ‘md5’
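A plain md5sum * would skip the files inside folder1 and folder2 and print an error for each directory, so we let find collect every regular file, excluding the output file itself:
find . -type f ! -name md5 -exec md5sum {} + > md5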
4. Print content of the new file ‘md5’
This file holds one line per file: the MD5 hash followed by the file name.
cat md5
5. Making changes
Let’s update one of the existing files. We write “some text” to file1.txt, which was empty until now.
echo "some text" > file1.txt
6. Verify md5 hash
Now, we check and verify the integrity of all the files in our working directory.
md5sum -c md5
md5sum: WARNING: 1 computed checksum did NOT match
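The warning tells us that one checksum did not match. In the per-file results that md5sum prints before the warning, file1.txt is marked FAILED because we edited it, while the other files are marked OK.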
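If you want to run this check from a script, the per-file output can be trimmed down. The sketch below assumes GNU md5sum (as on Ubuntu) and uses its --quiet and --status options. It rebuilds a small two-file setup in a temporary directory so it runs on its own.

```shell
# Throwaway directory with two empty files and their stored checksums.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1.txt file2.txt
md5sum file1.txt file2.txt > md5

# Modify one file after the checksums were stored.
echo "some text" > file1.txt

# --quiet hides the OK lines, so only failing files are listed.
# md5sum exits non-zero on a mismatch, hence the "|| true".
failed=$(md5sum -c --quiet md5 2>/dev/null) || true
echo "$failed"

# --status prints nothing at all; the exit code alone tells the result.
if md5sum -c --status md5; then
    summary="all files intact"
else
    summary="at least one file changed"
fi
echo "$summary"
```

Here only file1.txt should be listed as failed, and the summary line reports that at least one file changed.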
Hope this helps. Thanks.