Data integrity is essential to research, regardless of data classification, since unauthorized modifications to your data can disrupt a project, produce incorrect results, undermine trust in academic research, and affect your reputation. It is important that throughout the research workflow and lifecycle that the integrity of the data is verifiable.
👥 Audience
RESEARCHERSIT STAFF
Initial considerations
💽 Establish a resilient backup strategy and back up your data.
Should unauthorized modifications take place, it is important that you have a “clean” (unaffected) backup to revert back to.
🔢 Compute and check checksums on your integral data files.
Keep the generated checksums secure and separate from your data/file, since modified checksum outputs are no longer useful.
Hashing algorithms provide a unique output sequence for every unique input. This means that even if a single bit of a file has been altered, a radically different output sequence will be produced, even if the input file might initially appear unmodified.
Practicing manual version control or leveraging a version control system allows you to properly manage and retain the numerous revisions to a file that might take place over the course of a task or project.