Natural Language Execution: The New Wave of AI with Bas van der Raadt
DAMA Southern Africa
Data Literacy: From Executives to Data Citizens to Data Management Professionals; we all need to improve our DM KSCs
Executive Summary
This webinar outlines the key concepts and implications of Natural Language Execution in AI, including the use of Large Language Models in information modelling and application design. Bas van der Raadt covers Domain Driven Design, ontology creation and management, business rules, reference data and classification, state modelling, and more. He also emphasises the importance of precision and interpretability in model specifications and of data quality, while discussing the challenges associated with natural language and data modelling. The webinar provides a comprehensive overview of Natural Language Execution and its potential impact on the enterprise.
Natural Language Execution Presentation
Bas van der Raadt is enthusiastic about presenting on Natural Language Execution, sharing his knowledge and experience from working on a platform called HAPSAH. His academic background includes computer science and business information science, and he holds a PhD in Enterprise Architecture. Bas has successfully combined practitioner work and research, making it easy for him to put case studies into practice.
Introduction to the HAPSAH Platform and Natural Language Specifications
Bas is an independent adviser with expertise in architecture, IT strategy, and portfolio management. He has undertaken a project to create a platform that enables the expression of application requirements in natural language, with the potential to deliver working software directly from these specifications. The project involves ongoing research and collaboration with universities to streamline the process from specification tools to running applications. Bas delves into the underlying concepts and importance of natural language execution in software development, explaining why, what, and how this approach works.
Implications of Large Language Models in the AI Community
Large language models have revolutionised how we interact with systems by providing personalised experiences and access to vast knowledge and data. However, these models face challenges with basic common sense reasoning and lack transparency and reliability in decision-making, leading to concerns about false outputs and lower data quality. To address these concerns, experts recommend a hybrid approach that combines natural language interaction with deterministic execution and strict logical reasoning, which can improve reliability and transparency.
Natural Language Execution in AI
Natural Language Execution is a controlled natural language-based AI approach that enables the direct execution of deterministic decision-making systems from natural language input. It ensures transparency and predictability, with no code generation between specification and execution. The HAPSAH platform provides a modeller for building ontologies and writing business rules in natural language, visual and textual representations of the same model, a REST API for creating ontologies and business rules, and an OWL import adapter. The HAPSAH principle ensures that humans remain in control at every step.
Overview of Information Modelling and Application Design
HAPSAH is a platform that enables the easy import of CaseTalk models, making a model instantly executable without generating code or a data schema. The ontology of the model is available in the runtime environment, where it can be accessed through a graphical user interface and a REST API. The platform supports event-driven architecture, with commands and events managed through the platform. The concept of "meta-metadata" makes data descriptions and entity relationships fully accessible both in the modelling environment and through the API. The data modelling language is structured around nouns and verbs, with "entities" and "attributes" as the key components for connecting the various parts of an application.
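To make the noun/verb structure concrete, here is a minimal sketch in Python of how entities, attributes, and a connecting verb might be represented. The class names and the "receives" association are illustrative assumptions, not the actual HAPSAH API.

```python
# Hypothetical sketch of a noun/verb model; names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str          # noun naming the attribute, e.g. "first name"
    value_type: type   # the primitive type the attribute holds

@dataclass
class Entity:
    name: str                                        # noun, e.g. "Person"
    attributes: list[Attribute] = field(default_factory=list)

@dataclass
class Association:
    verb: str        # verbs connect nouns, e.g. "receives"
    subject: Entity
    object: Entity

person = Entity("Person", [Attribute("first name", str)])
greeting = Entity("Greeting", [Attribute("message", str)])
receives = Association("receives", person, greeting)
```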
Domain Driven Design and Data Modelling Concepts
In Domain Driven Design, "Descriptor" corresponds to the concept of "value objects": immutable composite attributes such as an address. Similarly, "Entity" corresponds to master data in data management, and "Attribution" creates unique combinations of entities and attributes, enabling us to specify, for example, the strength of a person rather than just stating a weight attribute. Verbs such as association, composition, aggregation, extension, and attribution are used to specify relationships.
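As an illustration of the "Descriptor" idea, here is a minimal Python sketch of a Domain Driven Design value object: an immutable composite attribute such as an address. The specific Address fields are assumed for the example.

```python
# A DDD "value object" as an immutable composite attribute.
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen=True makes instances immutable, as value objects require
class Address:
    street: str
    city: str
    postal_code: str

home = Address("1 Main St", "Cape Town", "8001")
# home.city = "Johannesburg"  # would raise FrozenInstanceError: value objects are replaced, not mutated

# Value objects are compared by their values, not by identity:
assert home == Address("1 Main St", "Cape Town", "8001")
```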
Creating and Implementing Ontologies in Software Development
Building an ontology involves creating triples of subject, predicate, and object to define relationships between concepts. Triples are the common building blocks of RDF and form the basis of OWL (Web Ontology Language). Statement instances represent actual data in the runtime environment, connecting metadata with the transactional data persisted in a database. Transactional data is time-stamped and records that a specific transaction has occurred. While the example provided is a simple ontology, ontologies can become much more complex in practice.
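A minimal sketch of building such triples using rdflib, a common Python RDF library; the example.org namespace and the Person/firstName names are assumptions for illustration, not the webinar's actual ontology.

```python
# Subject-predicate-object triples with rdflib (tested API shape for rdflib 6+).
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()

# Metadata triple: the concept "Person" is declared as an entity type.
g.add((EX.Person, RDF.type, EX.EntityType))

# Statement instances connecting metadata with transactional data.
g.add((EX.bas, RDF.type, EX.Person))
g.add((EX.bas, EX.firstName, Literal("Bas")))

print(g.serialize(format="turtle"))  # human-readable Turtle output
```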
Business Rules and Ontology in Application Development
Bas provides a brief overview of the ontology and business rules involved in application development. The ontology represents the structure of the application, whereas business rules consist of action rules, constraints, and query rules. Action rules specify the actions that need to happen under certain conditions in an application, which are composed of conditions, operations, and outputs. Conditions can include associations, attributions, and equations. Operations are basic functions comparable to Excel functions, and actions can involve searching or manipulating entities within the application.
Creating Rules and Assigning Values in Ontologies
Creating rules in an ontology involves creating new elements, associating existing ones, and assigning values to attributes. Entity types are named in the plural, but within a rule they are referenced in the singular, with a number that indicates the entity type. For instance, a rule may involve creating a greeting message when a new person with a first name is created. When the condition is met, the rule creates a new greeting and assigns it the message "hello" joined with the person's first name. The next step is to have the person receive the greeting. However, creating new elements and associating existing entities in a rule can involve finding and using new data. The syntax of rules in ontologies is continuously being improved to make them more readable and easier to understand.
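A plain-Python sketch of the greeting rule's logic follows; the actual HAPSAH rule syntax is natural language, so the dictionaries and function below are only an illustrative stand-in.

```python
# Sketch of the rule: when a new Person with a first name is created,
# create a Greeting and have the person receive it.
def on_person_created(person: dict, store: list) -> None:
    if "first name" in person:                           # condition
        greeting = {                                     # action: create a new Greeting
            "entity": "Greeting",
            "message": "hello " + person["first name"],  # operation: join strings
        }
        store.append(greeting)
        person["receives"] = greeting                    # action: associate entities

store: list = []
alice = {"entity": "Person", "first name": "Alice"}
on_person_created(alice, store)
print(alice["receives"]["message"])  # -> "hello Alice"
```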
Specifying Action Rules and Constraints
Conditions, operations, relationships, and constraints must be established to create an effective system. Triggers are used to fire rules and perform specific actions. Constraints play a crucial role in ensuring that undesired decisions are not made, and they take priority over action rules. Additionally, reference data is important, but it is not always adequately detailed in certain versions of the DMBoK. Overall, the challenge is to make these conditions readable for both humans and machines in order to create a successful and functional system.
State Modelling and Availability State
State modelling is a method to represent the life cycle of an entity by indicating its current position in that cycle. The entity transitions from one state to another until the end state is reached. The approach combines the entity, descriptor, and attribution concepts to offer predefined instances for modelling, allowing attributes to be connected to the availability state. For instance, in a library setting, books can be in an availability state such as available, reserved, or lent, and business rules can be defined to manipulate the state under certain conditions, ensuring consistent data quality.
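A minimal sketch of the library example in Python; the three states come from the talk, while the transition table and function names are assumptions for illustration.

```python
# Availability states for a library book, with rule-enforced transitions.
from enum import Enum

class Availability(Enum):
    AVAILABLE = "available"
    RESERVED = "reserved"
    LENT = "lent"

# Allowed transitions: a business rule keeps the data consistent by
# rejecting any state change not listed here.
TRANSITIONS = {
    Availability.AVAILABLE: {Availability.RESERVED, Availability.LENT},
    Availability.RESERVED: {Availability.AVAILABLE, Availability.LENT},
    Availability.LENT: {Availability.AVAILABLE},
}

def change_state(current: Availability, new: Availability) -> Availability:
    if new not in TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {new.value}")
    return new

state = Availability.AVAILABLE
state = change_state(state, Availability.RESERVED)   # ok
# change_state(state, Availability.RESERVED)         # would raise: already reserved
```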
Reference Data and Classification
Reference data is crucial for maintaining data integrity across different applications and domains. It sits alongside master data, which serves as a unique identifier for a specific data instance across domains, and classification data, which arranges items based on their properties or characteristics. Categorisation data, on the other hand, organises items based on the definition of a category, allowing an item's category to change. These concepts are fundamental to building ontologies and business rules.
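To make the distinction concrete, here is a small illustrative sketch: classification is derived from an item's properties, whereas categorisation is an assigned label that may change. The weight threshold and field names are assumptions.

```python
# Classification vs categorisation, illustrated on a single item.
def classify_by_weight(weight_kg: float) -> str:
    # Classification: computed from a property, so it cannot drift from the data.
    return "heavy" if weight_kg > 10 else "light"

book = {"title": "DMBoK", "weight_kg": 2.1, "category": "reference"}
book["classification"] = classify_by_weight(book["weight_kg"])  # -> "light"
book["category"] = "textbook"  # categorisation: reassignable by definition
print(book)
```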
Utilising Large Language Models to Build Ontology and Business Rules
Large language models (LLMs) can be a helpful sidekick to humans in building ontologies and business rules, with the API's structure helping to output the desired format. The deterministic system delivers high-quality data and constraints, safeguarding the model and preventing misleading outputs.
Managing Reference Data and Master Data
The challenge with reference data lies in its subjective nature: what constitutes reference data for one person may be master data for another. This complexity is difficult to resolve with technology alone, as it depends on individual perspectives and contexts. Reference data management typically occurs at the database or integration platform level, shielding business users from the technical complexities. An example of this complexity is how addresses are managed: while a postal service views addresses as master data, an individual simply uses them as reference data when sending mail. Work on these nuanced challenges is still at an early stage, with efforts focused on incorporating the concepts into ontologies and defining rules to express them more explicitly. Understanding the context and difficulties surrounding reference data remains pivotal to addressing this complex topic.
Ontology and Reference Data in Research
Ontologies and their defining rules provide a research platform for making the field more tangible. The primary use of reference data is internal, helping with categorisation and classification; however, when data comes from an external party, it becomes reference data, showcasing the complexity of the distinction. Ontologies were originally developed in philosophy; some researchers now use them to validate research papers against a structured map and to reduce hallucinations.
Use of Ontology in Large Language Models (LLMs)
Ontology is crucial for generating structured and logical output from language models. However, language models struggle to understand semantics and lack an explicit structure or common sense; they rely on vector representations that capture similarities or connections. Combining LLM output with a separate ontology for validation could be an effective way to improve output quality. A process is needed to check LLM output against an ontology and feed incorrect results back into the language model for correction. Researchers are exploring the use of ontology to validate language model output.
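A sketch of the validation loop described above, with placeholder helpers standing in for a real LLM call and a real ontology check; every name here is an assumption for illustration.

```python
# Check LLM output against an ontology; feed errors back for correction.
def call_llm(prompt: str) -> list[tuple[str, str, str]]:
    # Placeholder for a real LLM call; returns one fixed candidate triple.
    return [("Book", "has state", "lent")]

def violates_ontology(triple: tuple[str, str, str], ontology: set) -> bool:
    # Placeholder check: a triple is invalid if its predicate is unknown.
    return triple[1] not in ontology

def validated_output(prompt: str, ontology: set, max_rounds: int = 3):
    for _ in range(max_rounds):
        triples = call_llm(prompt)
        errors = [t for t in triples if violates_ontology(t, ontology)]
        if not errors:
            return triples  # all statements passed ontology validation
        # Feed the incorrect results back into the model for correction.
        prompt += f"\nThese statements were invalid, please correct: {errors}"
    raise RuntimeError("LLM output could not be reconciled with the ontology")

print(validated_output("Extract facts about books.", {"has state", "has title"}))
```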
Challenges with Natural Language and Data Modelling
A large language model can generate an ontology and backtrack through its output using the "Chain of Thought" concept, yet there is a lack of understanding among users about its workings and limitations. Asking an LLM to define data management may skip important steps outlined in the DMBoK, because the statistical relationships it has learned are determined by the number of papers or discussions treating the topic in a certain way. Data modelling provides a structured language for more effective communication, addressing the lack of precision in natural language. Further research into a hybrid approach combining natural language and data modelling seeks to provide concrete designs and implementations.
Ensuring Precision and Interpretability of Model Specifications
When designing a language towards its final vision, it is important to maintain precision and ensure alignment with the intended purpose to avoid errors. Research is being done to verify the correctness of the model while balancing freedom of expression against the machine's interpretation. For human interpretation of the model, every noun and verb must have a mandatory definition, while the name is optional and may have multiple synonyms. A well-defined concept reduces the chance of misinterpretation, although it does not guarantee complete interpretability. In applications, the natural language, including names and definitions, is always available for the user to access.
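A small sketch of how the mandatory-definition rule might be represented as data; the Concept structure is an assumption, not HAPSAH's actual metamodel.

```python
# Every noun and verb carries a mandatory definition; names are optional
# and may hold multiple synonyms.
from dataclasses import dataclass, field

@dataclass
class Concept:
    definition: str                                  # mandatory: the meaning
    names: list[str] = field(default_factory=list)   # optional, may hold synonyms

person = Concept(
    definition="A human being registered as a user of the system.",
    names=["Person", "Individual", "Natural person"],
)
print(person.names[0], "-", person.definition)
```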
Fact-Oriented Modelling and Data Quality
Using attributes in modelling and application design aims to ensure data quality by providing clear definitions. Ambiguity and a lack of understanding when filling out forms can negatively impact data quality. The philosophy is to express the meaning of things in modelling and application usage. The technique of fact-oriented modelling uses natural language from businesspeople to create specifications and constraints for the system. This allows for a quick switch from modelling to testing and validation, improving efficiency in the development lifecycle.
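As a rough illustration of fact-oriented modelling, the sketch below shows an elementary fact type verbalised in business language; the exact phrasing a tool such as CaseTalk produces may differ.

```python
# A fact type (template) and a fact instance, both readable by businesspeople.
fact_type = "Person <first name> receives Greeting <message>."
fact_instance = "Person 'Alice' receives Greeting 'hello Alice'."

# The fact type doubles as a verbalisation template that business users can
# read and validate directly, before any system is built.
def verbalise(first_name: str, message: str) -> str:
    return f"Person '{first_name}' receives Greeting '{message}'."

assert verbalise("Alice", "hello Alice") == fact_instance
```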
Integration of CaseTalk into HAPSAH and the Concept of Reference Data
The following are key points to consider when investigating the domain for creating new tools or systems:
· Integration of the CaseTalk model into HAPSAH for the smooth transfer of concepts and language methodology;
· Focus on capturing business knowledge in various languages without aiming for executability;
· Generating artefacts such as RDF output to document and maintain data in HAPSAH;
· The added value of documenting and maintaining data for modelling and running in HAPSAH;
· The concept of reference data as data maintained within a single application for different roles;
· No distinction between master data and reference data when maintained within a single application.
Data Management and Information Modelling
Data management at an advanced level involves external parties and the classification of data as master or reference. Organising data across multiple applications becomes a permission issue, so organisations opt to centralise data and manage it in a System of Reference to ensure consistency across all applications. The ownership of business processes and data plays a crucial role in managing reference data. The master/reference distinction does not matter during information modelling, but it becomes vital during deployment and execution. Attributes have constraints on their possible values, drawn from finite reference data or domains. One example raised concerned responsibility for managing a large furniture organisation's reference data. From an information modelling standpoint, the particular nature of the data, such as states, does not matter. State modelling is not solely reference data; it is linked to the life cycle of an entity.
State Modelling and Reference Data in Enterprise Environments
State modelling serves as a connection between ontology and rules in an enterprise environment. Reference data is only useful when multiple applications are involved, and it can lead to data quality problems if not managed correctly. Data quality issues can also arise when hidden, hardcoded choice lists for state indicators in various applications are not synchronised. Predefined instances of concepts such as attribution or descriptor are used instead of creating new elements in the ontology language. Despite initial resistance, state modelling and reference data visibility are critical in managing practical issues within enterprise architecture. The life cycle of entities in this context needs further discussion, as not everyone is convinced that entities have a life cycle.
State in Information Modelling and Rule Engine
In information modelling, the concept of "State" represents a condition or trigger in the rule language that is either true or not. A state change is always triggered by something external happening to an entity, or it can be derived from something else. From a pure information modelling perspective, the State might not be expected, but it is needed for the system to work. The discussion illustrates the difference between CaseTalk, the methodology behind the information modelling, and the purpose of natural language execution. Despite debates about its technical necessity, the distinction lies in the State being necessary for natural language-based information modelling and execution.
Thank you to Bas van der Raadt, PhD, for sharing this with our community!
We greatly appreciate it!
Please comment on this article if you want to receive the recording, and we will gladly share it with you.