Coding Challenge #48 - Data Privacy Vault
John Crickett
Helping you become a better software engineer by building real-world applications.
This challenge is to build your own Data Privacy Vault. A Data Privacy Vault is a way to keep sensitive information safe. The vault stores the data and controls who can get to it, making sure it's managed, watched over, and used carefully.
Every business has sensitive data, whether it's about people's health (medical records), their payment details (credit card numbers), or other personal information (names and addresses).
At the same time, every business wants and needs to leverage that data: analysing trends, processing payments, confirming someone's identity or simply delivering the product to the customer. The data is incredibly valuable, so keeping it safe and secure is essential to the ongoing operation of a business as well as to compliance with industry and legal standards.
A data vault helps businesses achieve this by separating sensitive data from the rest of their systems and storing it in a secure service or ‘vault’. The vault then grants access to the data only when absolutely needed.
The Challenge - Building Your Own Data Privacy Vault
For this challenge we’re going to build a service that can function as a Data Privacy Vault. That’s the green bit in the high-level diagram below, which shows conceptually where it would fit in a production system.
Our service will provide an HTTP-based API that can be used to send sensitive data to be stored in the vault, receiving back a token that an authorised user or service can later use to retrieve the sensitive data.
To understand why the Data Privacy Vault is preferable to just encrypting the data, check out Wikipedia’s article on Tokenization.
Step Zero
In all the best programming languages we index arrays from zero. Coding Challenges continues that trend starting every challenge from step zero.
As usual, step zero is to set up your IDE / editor of choice and programming language of choice. We’re going to build a service with an HTTP-based API, so pick your favourite language to do that with or the one you most fancy learning.
Aside: Some programming languages index arrays from zero because an array is essentially a pointer to the beginning of a contiguous block of memory plus an offset from that point. In other words, if you’re familiar with C then array[index] is equivalent to *(array + index).
Step 1
In this step your goal is to create a simple tokenisation service that can create tokens and later return the values they stand for. For the moment, storing the data in memory is fine.
Once this is done you should have two endpoints:
Endpoint: /tokenize
Method: POST
Request payload:
{
"id": "req-12345",
"data": {
"field1": "value1",
"field2": "value2",
"fieldn": "valuen"
}
}
Success response: HTTP Code 201
Payload:
{
"id": "req-12345",
"data": {
"field1": "t6yh4f6",
"field2": "gh67ned",
"fieldn": "bnj7ytb"
}
}
Don’t forget to create the appropriate error response codes.
Endpoint: /detokenize
Method: POST
Request payload:
{
"id": "req-33445",
"data": {
"field1": "t6yh4f6",
"field2": "gh67ned",
"field3": "invalid token"
}
}
Response:
{
"id": "req-33445",
"data": {
"field1": {
"found": true,
"value": "value1"
},
"field2": {
"found": true,
"value": "value2"
},
"field3": {
"found": false,
"value": ""
}
}
}
Don’t forget to create the appropriate error response codes and to handle a token not being found.
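The /detokenize behaviour described above, including the per-field handling of unknown tokens, could be sketched in Python as follows. As before, this is only an illustration under stated assumptions: `handle_detokenize` and `_vault` are hypothetical names, the store is pre-filled with the example tokens so the snippet runs on its own, and the 200 and 400 status codes are choices made here because the challenge text does not specify them.

```python
import json

# Hypothetical in-memory store (token -> original value), pre-filled for this example.
_vault: dict[str, str] = {"t6yh4f6": "value1", "gh67ned": "value2"}


def handle_detokenize(body: bytes) -> tuple[int, dict]:
    """Return (HTTP status, response payload) for a POST /detokenize request body."""
    try:
        request = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(request, dict):
        return 400, {"error": "body must be a JSON object"}

    data = request.get("data")
    if not isinstance(request.get("id"), str) or not isinstance(data, dict):
        return 400, {"error": "expected a string 'id' and an object 'data'"}

    result = {}
    for field, token in data.items():
        # An unknown or non-string token is reported per field
        # instead of failing the whole request.
        value = _vault.get(token) if isinstance(token, str) else None
        found = value is not None
        result[field] = {"found": found, "value": value if found else ""}
    return 200, {"id": request["id"], "data": result}


if __name__ == "__main__":
    example = (
        b'{"id": "req-33445", "data": {"field1": "t6yh4f6", '
        b'"field2": "gh67ned", "field3": "invalid token"}}'
    )
    print(json.dumps(handle_detokenize(example)[1], indent=2))
```

Reporting a missing token inside the payload, rather than returning an error for the whole request, lets a caller resolve the fields that do exist in a single round trip.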
Continued...
You can find Step 2 and beyond on the Coding Challenges website as build your own Data Privacy Vault.
Or if you'd rather get the whole challenge delivered to your inbox every week, you can subscribe on the Coding Challenges Substack.
Co-Founder of Pybites / Python Coach / Software Developer
1 year ago: Very true and great work offering those challenges. Tutorials are polished and iterated over in advance, not showing the real side of software development. You'll need to go through the iterative process to really learn. Building complete + real-world projects provides the full exposure needed to grasp software development and become proficient. The importance of collaboration in this journey cannot be overstated. Engaging with a community of developers on these challenges can offer invaluable insights and diverse problem-solving techniques. Continuous learning is key in our fast-evolving field, and these challenges are a perfect way to keep our skills sharp. I encourage everyone to not only take on these challenges but also share their projects and progress. In the spirit of fostering a supportive learning environment, I appreciate the efforts of people like you who create and share these cool code challenges. It's through deliberate practice, collaboration and sharing that we can all grow together.
Great one.
Vice President / CTO at MentorTek
1 year ago: For extra credit, build your Lisp interpreter with Lisp. Oh, and from our ChatGPT friend, your Huffman encoder/decoder was generated along with: This code does the following: Builds a frequency table to count how often each character appears in the input string. Builds a Huffman tree based on the frequencies of characters. It creates a binary tree where each node represents a character (leaf nodes) and its frequency. The tree is built by repeatedly merging the two least frequent nodes. Generates Huffman codes for each character by traversing the Huffman tree. Characters found deeper in the tree have longer codes, and those with higher frequency (closer to the root) have shorter codes. Encodes the input string using the generated Huffman codes. This implementation is a basic demonstration of Huffman encoding. In practice, you would also need to consider how to store or transmit the Huffman tree along with the encoded data, as the receiver must use the exact same tree to decode the message accurately. This list might also be a good exercise for how to coach ChatGPT into giving you working code (once you debug it). He did nail the Huffman encoder, but it isn't always that easy.
Unpacking Software Architecture
1 year ago: Another use case for this is storing PII in a different region from the rest of the data. For example, the main database might be in the US but PII for European users must be stored in the EU for GDPR compliance.
Generate Leads and Sales Through Search Engine Optimization; specialized for Law Firms, Veterinarians, Local Business and Ecommerce Sites
1 year ago: Love the challenge! Building real-world applications is indeed the best way to hone coding skills.