Synthetic Data Generator Using Flame Enterprise AI Platform
Flame is an Enterprise AI platform that leverages its Knowledge Engineering and Computational Reasoning capabilities to solve enterprise problems. The current release supports robust Knowledge Engineering and Code Generation capabilities, and our engineering team is working to launch its proprietary architecture, Ntelg, which will enable AI engineers to create intelligent workflows to solve business problems. We intend for the new architecture to support a broad range of polynomial-time and non-polynomial-time problems using heuristics and Machine Learning capabilities. Flame exploits the learnings of Large Language Models (LLMs) while managing the hallucinations and inaccuracies associated with general LLMs by applying the reasoning capabilities of the platform.
We are testing the new capabilities on various enterprise use cases and getting encouraging results. As part of this initiative, we have built a Synthetic Data Generator capability within the Flame platform and are testing it on various combinations of data inputs. Synthetic data generation is an important use case of Generative AI, because AI/ML models can be trained efficiently on synthetic data. Synthetic data is also widely used in test data engineering, and it suits performance testing and other types of testing that require production-like data without compromising data privacy.
This article provides an overview of how easily we can train the Flame platform on the data model of the synthetic data by supplying just a few records and the structure of the data. Once Flame learns the structure of the data being requested, we create a Knowledge Base describing the business logic of the data we need to generate. This is the secret sauce of Flame: it uses the power of Generative AI to generate the data while keeping the output within the confines of the business logic. Our Client/Admin architecture ensures that we can control the business logic to serve enterprise use cases and allows Flame Engineers to initiate parallel processing. We can use Flame to generate data at scale in the background while working on other use cases.
The step-by-step process for generating synthetic data with the Flame platform is as follows:
Step 1 - Create a New Project
Flame allows us to create a project specific to a problem or a group of tasks. A project brings together the Knowledge Bases, toolsets and workflows it requires in a focused and efficient manner. In this example, we will create a project for the Synthetic Data Generator.
Step 2 - Create a Knowledge Base
As described above, in Flame terminology, a Project comprises Knowledge Bases, toolsets (Flame's model of toolsets) and other Flame Components as needed to fulfil a business process. The Knowledge Base is our secret sauce and differentiates Flame from a generic LLM architecture. Flame's reasoning capabilities allow us to exploit the best of both worlds: the learning power of Generative AI and the reasoning capability of the Flame platform. In this particular example, we create a Knowledge Base with the prompt below:
We have used a simple prompt for this test; however, Flame offers a wide range of capabilities for modeling business behavior. We are working on custom AI functions to model capabilities such as Heuristics, Predictor and Scheduler, which will allow Flame Engineers to build powerful AI workflows.
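The idea of keeping generated data within the confines of business logic can be illustrated outside Flame with a minimal Python sketch. It is not Flame's implementation or API: the `Order` record, the two rules, and the `propose` function (a stand-in for a generative model that sometimes returns inconsistent values) are all hypothetical. Each candidate record is checked against named rules, and only records with no violations are accepted.

```python
import random
from dataclasses import dataclass
from typing import Callable


# Hypothetical record type; the fields are placeholders, not Flame's data model.
@dataclass
class Order:
    order_id: int
    quantity: int
    unit_price: float
    total: float


# Each rule is a named predicate over a record (a stand-in for business logic).
RULES: dict[str, Callable[[Order], bool]] = {
    "quantity is positive": lambda o: o.quantity > 0,
    "total matches quantity * unit_price": lambda o: abs(o.total - o.quantity * o.unit_price) < 0.01,
}


def violations(order: Order) -> list[str]:
    """Return the names of all rules the record breaks."""
    return [name for name, rule in RULES.items() if not rule(order)]


def propose(rng: random.Random, order_id: int) -> Order:
    """Stand-in for a generative model: sometimes produces an inconsistent record."""
    quantity = rng.randint(0, 5)
    unit_price = round(rng.uniform(1.0, 100.0), 2)
    total = round(quantity * unit_price, 2)
    if rng.random() < 0.3:
        total = round(total + 10.0, 2)  # simulate an inaccurate total
    return Order(order_id, quantity, unit_price, total)


def generate_valid(n: int, seed: int = 0) -> list[Order]:
    """Keep proposing candidates until n records satisfy every rule."""
    rng = random.Random(seed)
    accepted: list[Order] = []
    while len(accepted) < n:
        candidate = propose(rng, len(accepted) + 1)
        if not violations(candidate):
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    for order in generate_valid(3):
        print(order)
```

Rejecting and regenerating is the simplest way to enforce such rules; a production system might instead repair a failing record or feed the violations back to the generator.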
Step 3 - Learning the Data Model
In this step, Flame learns the data model of the input data so that it can generate similar output, as defined by the Knowledge Base and the input given by the user. Notably, Flame can be trained on a very small subset of data and still produce impressive results. We used the following set of sample data for this training:
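As a rough, non-ML illustration of what "learning the data model" from a handful of records can mean, the Python sketch below infers a simple schema (field types, numeric ranges, and observed categories) from three records. This is an assumption-laden stand-in and not how Flame learns: the field names, the records in `SAMPLES`, and the `infer_type` and `infer_schema` helpers are hypothetical placeholders, not the sample data used in this test.

```python
from collections import defaultdict

# Hypothetical sample records (placeholders, not the data used in the article).
SAMPLES = [
    {"customer_id": "1001", "country": "US", "balance": "250.75"},
    {"customer_id": "1002", "country": "IN", "balance": "90.10"},
    {"customer_id": "1003", "country": "US", "balance": "1200.00"},
]


def infer_type(values):
    """Return the narrowest of 'int', 'float', 'str' that parses every value."""
    for name, caster in (("int", int), ("float", float)):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue
    return "str"


def infer_schema(records):
    """Summarize each field: its type, plus a range or the set of observed values."""
    columns = defaultdict(list)
    for record in records:
        for field, value in record.items():
            columns[field].append(value)

    schema = {}
    for field, values in columns.items():
        kind = infer_type(values)
        if kind == "str":
            schema[field] = {"type": kind, "values": sorted(set(values))}
        else:
            numbers = [float(v) for v in values]
            schema[field] = {"type": kind, "min": min(numbers), "max": max(numbers)}
    return schema


if __name__ == "__main__":
    for field, summary in infer_schema(SAMPLES).items():
        print(field, summary)
```

A schema like this only captures per-field structure; relationships between fields are the kind of business logic that the Knowledge Base is described as covering.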
Step 4 - Generating Synthetic Data
This is when the magic begins. Once the Flame platform is trained and the Knowledge Base is created successfully, we can keep generating synthetic data from the Flame Client front end at scale and in parallel. One sample output is given below:
Below is the complete set of 10 records in CSV format
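Outside Flame, the same pattern of generating batches in parallel and emitting CSV can be sketched in plain Python. The sketch is an illustration only: the column names, the `generate_batch` and `generate_csv` helpers, and the random values are hypothetical, and its output is not the data produced in this test.

```python
import csv
import io
import random
from concurrent.futures import ThreadPoolExecutor

FIELDS = ["record_id", "region", "amount"]  # hypothetical columns


def generate_batch(batch_index, batch_size):
    """Produce one batch of placeholder records with consecutive ids."""
    rng = random.Random(batch_index)  # one seed per batch keeps runs reproducible
    start = batch_index * batch_size
    return [
        {
            "record_id": start + i + 1,
            "region": rng.choice(["north", "south", "east", "west"]),
            "amount": round(rng.uniform(10.0, 500.0), 2),
        }
        for i in range(batch_size)
    ]


def generate_csv(total_batches, batch_size):
    """Generate batches concurrently and serialize all rows as CSV text."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Executor.map returns results in input order, even though batches run concurrently.
        batches = pool.map(generate_batch, range(total_batches), [batch_size] * total_batches)
        rows = [row for batch in batches for row in batch]

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Two batches of five records: a header line followed by 10 rows.
    print(generate_csv(total_batches=2, batch_size=5))
```

Threads are used here on the assumption that each batch would mostly wait on a remote generation service; for CPU-bound generation in Python, a process pool would be the more usual choice.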
We are running multiple tests, fine-tuning the training data and trying various logic while also working on performance, and we will continue to improve the platform for this specific use case and beyond. This article showcases how easy and simple it is to develop an enterprise use case on Flame using its built-in functions and flexible architecture. We are excited to continue developing new features and are on track to launch our powerful architecture, Ntelg, which will empower Flame AI Engineers to create powerful and intelligent applications in the service of enterprise use cases.