The old rule of data warehousing also applies to AI - focusing on data quality and governance
Dr. Petri I. Salonen
AI Transformation, Business Modeling, Software Pricing/Packaging, and Advisory. Published author with a strong software business background. Providing interim management roles in the software/IT field
I have spent more than half of my career building data warehousing and business intelligence solutions; in fact, that is also the field in which I wrote my doctoral dissertation. I quickly learned the old rule "garbage in, garbage out" when implementing these solutions. Most of the time typically went into ETL (extract, transform, load) to ensure that the data used for reporting was clean.
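To make "garbage in, garbage out" concrete, here is a minimal sketch of the kind of cleansing check an ETL transformation step might run. The record layout, the function name `split_clean_and_rejected`, and the deliberately simple email pattern are all assumptions made for illustration, not taken from any specific product.

```python
import re

# Deliberately simple pattern for illustration; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_clean_and_rejected(records):
    """Separate records that pass basic quality checks from those that do not."""
    clean, rejected = [], []
    seen_ids = set()
    for record in records:
        problems = []
        customer_id = record.get("customer_id")
        if not customer_id:
            problems.append("missing customer_id")
        elif customer_id in seen_ids:
            problems.append("duplicate customer_id")
        email = (record.get("email") or "").strip()
        if not EMAIL_PATTERN.match(email):
            problems.append("invalid email")
        if problems:
            # Keep the rejected row and the reasons so a person can fix the source.
            rejected.append({"record": record, "problems": problems})
        else:
            seen_ids.add(customer_id)
            clean.append({**record, "email": email.lower()})
    return clean, rejected


# Hypothetical sample rows
rows = [
    {"customer_id": "C1", "email": "Anna@Example.com"},
    {"customer_id": "C2", "email": "not-an-email"},
    {"customer_id": "C1", "email": "anna@example.com"},
    {"customer_id": "", "email": "bob@example.com"},
]
clean, rejected = split_clean_and_rejected(rows)
print(len(clean), "clean,", len(rejected), "rejected")
```

Rejected rows are kept together with their reasons rather than silently dropped, so that the source system can be corrected instead of the problem being hidden downstream.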
What I am not hearing much about in discussions and in the media is the cleanliness of data for AI. With bad data, you can't expect a good outcome. ZDNet explores three ways to build strong data foundations for AI implementation, based on feedback from three business leaders:
- Put your people first. Even if the organization does not get bogged down in a long-term strategic plan that defines the technology, processes, people, and rules required to manage information assets, it must understand why information governance matters to each stakeholder in the long run. AI can't handle bad data; it is not made for that. A wrong email address is still a wrong one, however you look at it. The data and IT departments must have a close working relationship, as the data will live in the IT team's applications.
- Master your transactional data. Smart organizations focus on the foundational elements of data use long before they consider exploiting AI and machine learning. There are no shortcuts to implementing a data strategy. A great experience with AI and digital transformation requires a solid data strategy.
- Work with your industry peers. A great data strategy goes beyond internal working practices and spans organizational boundaries. Different vertical domains have industry associations and bodies that pursue standardization efforts. One is the Offshore Energy Digital Strategy Group (DSG), a specialist body formed in late 2022 to create a collaborative effort across UK public bodies. Bodies of this type focus on things such as data, standards, and principles.
The quality of AI is always, always dependent on the quality of the underlying data. If you prompt an LLM with bad input and without enough context, you can't expect a good result. Understanding the domain is important: if you don't understand the output of an LLM prompt, you can't possibly know whether it is a hallucination. Having domain experts or other humans in the loop is essential to verify AI suggestions. This approach is known as "Machine Suggested, Human Verified". The domain expert or user with the most knowledge about a situation or initiative should be able to overrule or reverse AI decisions.
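The "Machine Suggested, Human Verified" idea can be sketched as a small review gate. This is only an illustration under stated assumptions: the `Suggestion` structure, the status values, and the functions `approve`, `overrule`, and `release` are hypothetical names, and the AI output is a hard-coded string rather than a call to a real model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suggestion:
    """An AI-produced suggestion awaiting human review (hypothetical structure)."""
    prompt: str
    ai_output: str
    status: str = "pending"  # pending -> approved | overruled
    reviewer: Optional[str] = None
    final_output: Optional[str] = None


def approve(suggestion: Suggestion, reviewer: str) -> Suggestion:
    """The domain expert accepts the AI output as-is."""
    suggestion.status = "approved"
    suggestion.reviewer = reviewer
    suggestion.final_output = suggestion.ai_output
    return suggestion


def overrule(suggestion: Suggestion, reviewer: str, corrected_output: str) -> Suggestion:
    """The domain expert replaces the AI output with their own decision."""
    suggestion.status = "overruled"
    suggestion.reviewer = reviewer
    suggestion.final_output = corrected_output
    return suggestion


def release(suggestion: Suggestion) -> str:
    """Only human-verified suggestions may leave the review gate."""
    if suggestion.status == "pending":
        raise PermissionError("A human reviewer must verify this suggestion first.")
    return suggestion.final_output


# Hypothetical example: the expert knows the customer context better than the model.
s = Suggestion(prompt="Recommend a discount tier", ai_output="Tier A")
overrule(s, reviewer="domain.expert", corrected_output="Tier B")
print(release(s))
```

The design choice here is that nothing can be released while its status is still "pending", and that the reviewer's decision, not the model's, is what ends up in `final_output`.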
Data comes in many different forms, and now, in the era of Copilots, data security and governance requirements have been elevated to a new level. An organization has to take several critical steps to ensure data security, compliance, and effective management when deploying Microsoft 365 Copilot. Some of them are as follows:
- Data Identification and Classification: Identify and classify your data to understand its sensitivity and importance. This helps in applying appropriate governance policies.
- Data Labeling: Apply sensitivity labels to your classified data. Sensitivity labels help in enforcing data protection policies and preventing data loss.
- Access Management: Establish, review, and maintain robust permission policies. Ensure that users have the appropriate level of access based on their roles and responsibilities.
- Data Loss Prevention (DLP): Implement DLP policies to prevent unauthorized sharing of sensitive information. These policies can help detect and block risky activities.
- Zero Trust Principles: Apply Zero Trust principles to your Microsoft 365 environment. This includes verifying user identities, enforcing least privilege access, and assuming breaches to minimize risks.
- Compliance and Auditing: Regularly audit your data governance policies and compliance status. Use tools like Microsoft Purview to monitor and report on data activities.
- Training and Awareness: Educate users about data governance policies and best practices. Ensure they understand the importance of data security and compliance.
- Continuous Monitoring and Improvement: Continuously monitor your data governance framework and improve as needed. Stay updated with the latest security and compliance requirements.
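To show how the classification, labeling, access management, and DLP steps in the list fit together, here is a toy sketch in plain Python. It does not use Microsoft Purview or any Microsoft 365 API; the label names, detection patterns, roles, and functions (`classify`, `can_access`, `may_share_externally`) are all hypothetical and only illustrate the logic.

```python
import re

# Hypothetical detection rules, ordered from most to least sensitive.
RULES = [
    # Something that looks like a 16-digit payment card number
    ("Highly Confidential", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")),
    # Something that looks like an email address
    ("Confidential", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
]
DEFAULT_LABEL = "General"

# Hypothetical policies for access management and DLP.
LABEL_RANK = {"General": 0, "Confidential": 1, "Highly Confidential": 2}
ROLE_CLEARANCE = {
    "contractor": "General",
    "employee": "Confidential",
    "finance": "Highly Confidential",
}
EXTERNAL_SHARING_ALLOWED = {"General"}


def classify(text: str) -> str:
    """Identify and label: return the first (most sensitive) matching label."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return DEFAULT_LABEL


def can_access(role: str, label: str) -> bool:
    """Access management: unknown roles get the lowest clearance (least privilege)."""
    clearance = ROLE_CLEARANCE.get(role, DEFAULT_LABEL)
    return LABEL_RANK[label] <= LABEL_RANK[clearance]


def may_share_externally(text: str) -> bool:
    """DLP check: block external sharing of anything above the allowed labels."""
    return classify(text) in EXTERNAL_SHARING_ALLOWED


# Hypothetical documents
for doc in ["Lunch menu for Friday", "Contact anna@example.com", "Card 4111 1111 1111 1111"]:
    label = classify(doc)
    print(label, "| contractor access:", can_access("contractor", label),
          "| external sharing:", may_share_externally(doc))
```

In a real deployment these decisions would come from the organization's governance tooling rather than hand-written patterns; the sketch only shows why classification has to come first, since every later policy depends on the label.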
The steps above show that there is no shortcut: technology cannot be deployed well without an emphasis on data within the organization. Many organizations have so far been able to avoid investing in data governance, but in the era of AI that investment can no longer be postponed.
It would be interesting to hear whether you agree that data is the fuel for AI and that data cleanliness is a requirement.
Yours,
Dr. Petri I. Salonen
PS. If you would like to get my Business Models in the AI Era newsletter in your inbox on a weekly or bi-weekly basis, you can subscribe to it here on LinkedIn: https://www.dhirubhai.net/newsletters/business-models-in-the-ai-era-7165724425013673985/