Data Obfuscation
Amesh PARIT
Data Engineer | SQL Server 2019 | ADF | Power BI | ETL | Developer. Available to work in Germany, Budapest, the Netherlands & the United Kingdom
Sometimes it is important to test applications against real production data, to ensure that the results are correct and that recent revisions have not broken anything. However, all sensitive data should be disguised before it is loaded into another environment, whether test or development.
In a recent project I worked on obfuscating data, so I decided to write a short article to share what I learned. There are tools that generate convincing test data, but sometimes the variances and frequencies within real data cannot be easily simulated.
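One simple way to keep those frequencies intact is to shuffle a sensitive column across rows, so every original value still appears exactly as often as before but is no longer attached to the same record. The sketch below is only an illustration in Python: the `shuffle_column` helper, the `customers` sample rows and the column names are all hypothetical and not taken from the project mentioned above.

```python
import random


def shuffle_column(rows, column, seed=None):
    """Return copies of `rows` with the values of `column` permuted across rows.

    The set of values (and therefore their frequencies) is unchanged;
    only the row each value belongs to changes.
    """
    rng = random.Random(seed)            # a seed makes the shuffle repeatable
    values = [row[column] for row in rows]
    rng.shuffle(values)
    # Build new dicts so the original rows are left untouched.
    return [{**row, column: value} for row, value in zip(rows, values)]


# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "city": "Berlin", "salary": 52000},
    {"id": 2, "city": "Budapest", "salary": 41000},
    {"id": 3, "city": "Utrecht", "salary": 52000},
    {"id": 4, "city": "Leeds", "salary": 67000},
]

shuffled = shuffle_column(customers, "salary", seed=42)
```

Shuffling on its own is a weak disguise: some rows may keep their original value by chance, and rare values can still identify a person, so it is usually combined with other techniques.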
So, What is Data Obfuscation?
Data obfuscation is the process of replacing sensitive information with data that looks like real production information. More generally, to obfuscate something is to make it appear different from its actual form. To a security-aware developer, the term refers to any method used to hide the actual value of a data object. In software testing, data obfuscation is of paramount importance. Testing is valuable, but it can lead to user data being compromised if your test data management strategy is careless about data protection.
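To make "data that looks like real production information" concrete, here is a minimal sketch in Python of one possible approach: a keyed, repeatable substitution that swaps each ASCII letter for another letter and each digit for another digit, so the length, letter case and punctuation of the value survive. The function name `obfuscate` and the `SECRET_KEY` value are assumptions made for this example; this is not a standard library routine or a complete masking solution.

```python
import hashlib
import hmac
import string

# Hypothetical key for the example; in practice keep it out of source control.
SECRET_KEY = b"change-me"


def obfuscate(value: str, key: bytes = SECRET_KEY) -> str:
    """Mask `value` while keeping its length, letter case and punctuation.

    The same input and key always give the same output, so values that
    matched before obfuscation (e.g. join keys) still match afterwards.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]      # reuse digest bytes for long values
        if ch in string.digits:
            out.append(string.digits[b % 10])
        elif ch in string.ascii_uppercase:
            out.append(string.ascii_uppercase[b % 26])
        elif ch in string.ascii_lowercase:
            out.append(string.ascii_lowercase[b % 26])
        else:
            # Punctuation, spaces and non-ASCII characters are left as-is
            # in this sketch.
            out.append(ch)
    return "".join(out)


masked = obfuscate("Jane.Doe-1984@example.com")
```

Because the replacement is derived from a hash of the whole value, an individual character can occasionally map to itself, and anything that is not an ASCII letter or digit passes through unchanged. Treat this as a starting point and review it against your own data protection requirements.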
There are three primary data obfuscation techniques:
But before you adopt an obfuscation technique, you should consider the following points:
There are many techniques and procedures available. I think Protecting SQL Server Data by John Magnabosco is the best book on this subject. Good procedures and code can also be found through a Google search, but I suggest writing your own, so that it is easy to change and the repository can be shared with other projects.