Data Obfuscation

Sometimes it is important to test applications against real production data, both to confirm that the results are correct and to check that recent revisions have not broken anything. However, all sensitive data should be disguised before it is loaded into another environment, whether test or development.

In a recent project, I worked on obfuscating data, so I decided to write a short article to share what I learned. There are tools that generate convincing test data, but sometimes the variances and frequencies found in real data cannot be easily simulated.

So, What Is Data Obfuscation?

Data obfuscation is the process of replacing sensitive information with data that looks like real production information. More generally, it means making something appear different from its actual form. To a security-aware developer, the term covers any method used to hide the actual value of a data object. In software testing, data obfuscation is of paramount importance. Testing is awesome, but it can lead to user data being compromised if your test data management strategy is careless about data protection.

There are three primary data obfuscation techniques:

  • Masking out is a way to create different versions of the data with a similar structure. The data type does not change, only the value changes.
  • Data encryption uses cryptographic methods, usually symmetric (secret-key) or asymmetric (public-key) systems, to encode the data, making it completely unusable until decrypted.
  • Data tokenization replaces certain data with meaningless values. However, authorized users can connect the token to the original data.

There are also other data obfuscation techniques, such as non-deterministic randomization, shuffling, blurring, nulling, repeatable masking, substitution, and custom rules.
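
To make two of these techniques concrete, here is a minimal Python sketch of masking out and tokenization. It is an illustration under stated assumptions, not a production implementation: the names `mask_card_number` and `TokenVault` are hypothetical, the card number is a made-up example, and a real token vault would live in a secured store rather than in memory.

```python
import secrets


def mask_card_number(card_number, visible=4):
    """Masking out: keep length and separators, hide all but the last digits."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    seen = 0
    masked = []
    for ch in card_number:
        if ch.isdigit():
            seen += 1
            # Only the last `visible` digits stay readable.
            masked.append(ch if seen > total_digits - visible else "X")
        else:
            masked.append(ch)  # separators such as "-" are preserved
    return "".join(masked)


class TokenVault:
    """Tokenization: swap a value for a random token; only the vault maps back."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to one token.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)  # random, carries no meaning
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        # Only callers with access to the vault can recover the original.
        return self._value_by_token[token]


if __name__ == "__main__":
    print(mask_card_number("4111-1111-1111-1234"))
    vault = TokenVault()
    token = vault.tokenize("alice@example.com")
    print(token, "->", vault.detokenize(token))
```

The masked value keeps the same type and shape as the original, which is what lets existing validation code keep working, while the token is only reversible for someone who holds the vault.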

But before you adopt an obfuscation technique, you should consider the following points:

  • Detect and verify confidential or sensitive data. Only the columns containing sensitive data need to be obfuscated, but changing them can cause problems when running certain chunks of code. A date of birth (DOB), for example, is sensitive but also feeds calculations, so it needs to be changed in a way that does not affect testing.
  • Evaluate the impact of different obfuscation approaches on your data so that a common process can be used for all systems accessing the data.
  • Identify use cases to achieve quick wins.
  • Evaluate technologies to simplify and even automate obfuscation, e.g. a SQL Server Agent job, PowerShell script, or SSIS package that backs up the data, restores it to the test environment, and runs the obfuscation SQL.
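
As an illustration of the DOB point above, one possible approach is to shift each date by a small, repeatable offset so that age-based calculations stay roughly realistic. The sketch below is in Python and rests on assumptions: `shift_dob`, `SECRET_KEY`, the record ID format, and the 30-day window are all hypothetical choices, not part of any particular tool.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical key for illustration; keep a real one out of source control.
SECRET_KEY = b"replace-me"


def shift_dob(dob, record_id, max_days=30):
    """Shift a date of birth by 1..max_days days, forward or backward.

    The offset is derived from a keyed hash of the record ID, so the same
    record always gets the same shift (repeatable masking) and the result
    is never the real date.
    """
    digest = hmac.new(SECRET_KEY, record_id.encode("utf-8"), hashlib.sha256).digest()
    magnitude = int.from_bytes(digest[:4], "big") % max_days + 1  # 1..max_days
    sign = 1 if digest[4] & 1 else -1
    return dob + timedelta(days=sign * magnitude)


if __name__ == "__main__":
    original = date(1985, 6, 15)
    shifted = shift_dob(original, "customer-1001")
    print(original, "->", shifted)
```

Because the shift is bounded, a customer's age changes by at most about a month, so most age-dependent logic still behaves as it would on real data. Tests that depend on exact boundaries, such as an 18th birthday falling on a specific day, would still need dedicated test records.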

There are many techniques and procedures available. I think Protecting SQL Server Data by John Magnabosco is the best book on this subject. Good procedures and code can also be found with a Google search, but I suggest writing your own so that it is easy to change and can be shared as a repository with other projects.
