Refined Data Is the New Oil, and You Can't Let It Spill!
Puneet Jindal
Top Voice | Enable 10x speed in AI dev with Labellerr (Top 10 automated data labeling tools 2024 by G2)
We have all been reading that data is the new oil, and that companies will continue to gain a competitive edge by "storing, processing and selling" it as they formally adopt data science practices in their organizations.
While this is true to a certain extent, it is important to rephrase this line with better context: <lawful> storing, <lawful> processing and <lawful> selling.
By now you must have understood what I am pointing to. I am talking about the various data compliance requirements, laws and regulations that governments and industry bodies have put in place to protect personal data. These differ in scope: some apply only to the citizens of a specific country, others to a particular domain, a particular state or a particular set of rights.
Why has understanding these legalities become important for engineers?
The reason, in simple words, is that these laws dictate how you handle your data, and so they directly impact your software engineering practices and software implementation!
There is a long list of compliance regimes that have come into existence in the last few years, driven by increased digital transformation, a rising number of data breaches and their implications. A few of the most prominent ones, which make a good base from which to start building your awareness, are the following:
1) GDPR -
The General Data Protection Regulation (GDPR) is an EU regulation that became enforceable on 25 May 2018. Its aim is to improve privacy and give customers and citizens greater control over their personal information and how it is used. It applies to companies anywhere in the world that store or process the personal data of people in the EU.
Personal data is information that can identify you - e.g. your name, address, date of birth, PPS number, passport number, gender, family members, nationality, email address, phone number, password, location and IP address.
2) CCPA -
The California Consumer Privacy Act (CCPA) is a state statute intended to enhance privacy rights and consumer protection for residents of California, United States. The bill was passed by the California State Legislature on June 28, 2018, to amend the California Civil Code. The CCPA applies to any for-profit business that collects consumers' personal data, does business in California, and satisfies at least one of the following thresholds:
- Has annual gross revenues in excess of $25 million;
- Buys, receives, or sells the personal information of 50,000 or more consumers or households; or
- Earns more than half of its annual revenue from selling consumers' personal information.
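For engineers, the thresholds above can be read as a simple applicability check. The following is a minimal sketch, assuming the figures exactly as quoted in this article (the statute has been amended over time, so check the current text before relying on them). The names BusinessProfile and ccpa_may_apply are hypothetical and made up for illustration, and the result is a first-pass flag, not legal advice.

```python
from dataclasses import dataclass

# Thresholds as quoted in this article; treat them as illustrative
# assumptions, since the statute has been amended over time.
REVENUE_THRESHOLD_USD = 25_000_000
CONSUMER_RECORDS_THRESHOLD = 50_000
SALE_REVENUE_SHARE_THRESHOLD = 0.5


@dataclass
class BusinessProfile:  # hypothetical structure, not from any library
    does_business_in_california: bool
    annual_gross_revenue_usd: float
    consumer_records_handled: int  # consumers or households bought, received or sold
    revenue_share_from_selling_data: float  # fraction between 0.0 and 1.0


def ccpa_may_apply(profile: BusinessProfile) -> bool:
    """Return True if the business meets at least one listed threshold."""
    if not profile.does_business_in_california:
        return False
    return (
        profile.annual_gross_revenue_usd > REVENUE_THRESHOLD_USD
        or profile.consumer_records_handled >= CONSUMER_RECORDS_THRESHOLD
        or profile.revenue_share_from_selling_data > SALE_REVENUE_SHARE_THRESHOLD
    )


# A small company by revenue can still qualify through the record count.
small_startup = BusinessProfile(True, 2_000_000, 60_000, 0.1)
print(ccpa_may_apply(small_startup))  # True
```

Keeping the thresholds as named constants makes it easy to update them in one place when the law changes.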
3) HIPAA
The Health Insurance Portability and Accountability Act (HIPAA) sets the standard for sensitive patient data protection. Companies that deal with protected health information (PHI) must have physical, network, and process security measures in place and follow them to ensure HIPAA compliance. The Act was enacted by the 104th United States Congress and signed by President Bill Clinton in 1996.
It was created primarily to modernize the flow of healthcare information, stipulate how Personally Identifiable Information maintained by the healthcare and healthcare insurance industries should be protected from fraud and theft, and address limitations on healthcare insurance coverage.
HIPAA applies to:
- Covered entities (anyone providing treatment, payment, and operations in healthcare)
- Business associates (anyone who has access to patient information and provides support in treatment, payment, or operations)
- Other entities, such as subcontractors and any other related business associates
4) PDPA
Singapore's Personal Data Protection Act (PDPA) provides for the establishment of a national Do Not Call (DNC) Registry. The DNC Registry allows individuals to register their Singapore telephone numbers to opt out of receiving marketing phone calls, mobile text messages such as SMS or MMS, and faxes from organisations.
The PDPA took effect in phases, starting with the provisions relating to the formation of the Personal Data Protection Commission (PDPC) on 2 January 2013. Provisions relating to the DNC Registry came into effect on 2 January 2014 and the main data protection rules on 2 July 2014.
The PDPA takes into account the following concepts:
- Consent – Organisations may collect, use or disclose personal data only with the individual's knowledge and consent (with some exceptions);
- Purpose – Organisations may collect, use or disclose personal data in an appropriate manner for the circumstances, and only if they have informed the individual of purposes for the collection, use or disclosure; and
- Reasonableness – Organisations may collect, use or disclose personal data only for purposes that would be considered appropriate to a reasonable person in the given circumstances.
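In code, the consent and purpose concepts above often show up as a gate in front of any use of personal data. The following is a minimal sketch, assuming a hypothetical ConsentRecord structure and purpose names invented for illustration. It does not model the exceptions mentioned above or the reasonableness test, which needs human and legal judgment.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ConsentRecord:  # hypothetical structure for illustration
    individual_id: str
    # Purposes the individual was informed of at collection time.
    notified_purposes: Set[str] = field(default_factory=set)
    # Purposes the individual actually agreed to.
    consented_purposes: Set[str] = field(default_factory=set)


def may_use_personal_data(record: ConsentRecord, purpose: str) -> bool:
    """Allow use only for a purpose that was both notified and consented to."""
    return (
        purpose in record.notified_purposes
        and purpose in record.consented_purposes
    )


record = ConsentRecord(
    individual_id="user-123",
    notified_purposes={"order_delivery", "marketing_sms"},
    consented_purposes={"order_delivery"},
)
print(may_use_personal_data(record, "order_delivery"))  # True
print(may_use_personal_data(record, "marketing_sms"))   # False: notified, but no consent
```

Checking the purpose at the point of use, rather than only at collection time, is one way to avoid quietly reusing data for something the individual never agreed to.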
5) Personal Data Protection Bill, 2019 (India)
The Personal Data Protection Bill, 2019 was introduced in Lok Sabha by the Minister of Electronics and Information Technology, Mr. Ravi Shankar Prasad, on December 11, 2019. The Bill seeks to provide for the protection of individuals' personal data and establishes a Data Protection Authority for that purpose.
The Bill governs the processing of personal data by:
(i) government,
(ii) companies incorporated in India, and
(iii) foreign companies dealing with personal data of individuals in India.
Personal data is data which pertains to characteristics, traits or attributes of identity, which can be used to identify an individual. The Bill categorises certain personal data as sensitive personal data. This includes financial data, biometric data, caste, religious or political beliefs, or any other category of data specified by the government, in consultation with the Authority and the concerned sectoral regulator.
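A practical first step for a data team is to tag the columns in a dataset by category, so that sensitive personal data can be given stricter handling. The following is a minimal sketch, assuming a hand-maintained mapping: the column names, the category labels and the split_columns helper are all hypothetical, and the sensitive set only mirrors the examples named above, not the Bill's full definition.

```python
# Categories treated as sensitive here mirror the examples named above;
# the labels themselves are invented for this sketch.
SENSITIVE_CATEGORIES = {
    "financial",
    "biometric",
    "caste",
    "religious_or_political_belief",
}

# Hypothetical, hand-maintained mapping from column name to category.
COLUMN_CATEGORIES = {
    "bank_account_number": "financial",
    "fingerprint_hash": "biometric",
    "religion": "religious_or_political_belief",
    "email": "contact",
    "full_name": "identity",
}


def split_columns(columns):
    """Split column names into (sensitive, personal, unclassified) lists."""
    sensitive, personal, unclassified = [], [], []
    for column in columns:
        category = COLUMN_CATEGORIES.get(column)
        if category is None:
            unclassified.append(column)  # needs a human to review
        elif category in SENSITIVE_CATEGORIES:
            sensitive.append(column)
        else:
            personal.append(column)
    return sensitive, personal, unclassified


sensitive, personal, unclassified = split_columns(
    ["full_name", "email", "bank_account_number", "signup_source"]
)
print(sensitive)     # ['bank_account_number']
print(personal)      # ['full_name', 'email']
print(unclassified)  # ['signup_source']
```

Returning unclassified columns separately, instead of silently treating them as safe, keeps unknown fields visible for review.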
Apart from HIPAA and a few others, most of the data security regulations that exist today are not very old, and they keep evolving every year as we move into a data-centric economy in which organizations are becoming data-driven and consumers are becoming data-aware.
So the important thing to understand here is this: while developing AI application software, it is important to give due consideration to the laws and regulations that might apply to the problem at hand, and to keep them in view across the whole lifetime of the problem statement, be it a product or a service.
In the next articles I will explain each of these regulations in detail, with some use cases showing how data science, software engineering and IT teams end up violating these laws unknowingly, which could prove very costly to organizations and individuals.
For any further details, suggestions or corrections to this article, you can reach me by email at [email protected] or comment below!
Some of the work that my team @tensormatics is doing to make AI accessible and affordable can be seen at www.labellerr.com
Disclaimer: This article contains views that are completely my own, based on my experience working on data systems. A few references, such as definitions explained in a simple way, have been taken from various sources. The main objective of this article is to shed light on existing data laws and regulations. It could serve as a starting point for further discussion and later lead to a deeper exploration based on the reader's use case!