Start Your Generative AI Transformation Without a Development Team
Chris Pappalardo
Senior Director at Alvarez & Marsal | Software Engineer and FinTech Innovator | CPA, AWS Solutions Architect
Following the release of OpenAI’s Large Language Model (“LLM”) chatbot to the public in November of last year, the top search term in the “Finance” category on Google Trends has been “chatgpt”.
The same is true for the Business & Industrial, Computers & Electronics, and People & Society categories. In some categories, ChatGPT even takes both of the top two spots under different spellings (“chatgpt” and “chat gpt”).
It’s a good bet that you, the reader, have already been either directly affected by ChatGPT in your workplace or know someone who has. Just about everyone I know is at least aware of ChatGPT or the underlying technology.
This isn’t surprising given the novelty of this technology. What is surprising is the speed at which this technology is transforming the business world.
Even the Big 4 accounting firms, which typically move at a slow pace when adopting new technology, have recently announced plans to invest in and implement generative AI technology. KPMG just announced their plans last month:
Certain professions will be disproportionately affected, such as accounting and law, so it makes sense that some organizations are reacting faster than others. However, all organizations need to at least consider the impact of generative AI on their business, and some need to get started with incorporating this technology now.
So how does an organization outside of Big Tech get started with generative AI?
One of my favorite diagrams of late is this one inspired by Fred Brooks from his book on software engineering and project management from 1975:
The diagram depicts the impact on time-to-completion of adding people to a development team responsible for a large and complex software project. The point Fred is making is that the relationship between the number of resources and time is not necessarily linear, and the “sweet spot” of the efficiency curve tends to be towards the smaller end of the spectrum. A great example of this is WhatsApp, which built an app used by over 1 billion people with just 50 developers.
The “old way” of IT projects and software development suffers from this problem. The “old way” includes long planning cycles with large groups of diverse stakeholders, Project Management Office teams with big, frequent status meetings, and lengthy, costly development cycles that often result in delayed “big production” system launches.
The truth is you don’t need any of this. You can get started with just a single developer and open-source software.
The topic of generative AI and effective agile development is too broad for a single article, so let’s narrow things down to something simple.
The Economist published an article in April on LLMs entitled, “Large, creative AI models will transform lives and labour markets,” which made two insightful points. First, large generative models have probably reached their peak:
And second, future improvements will likely come from private as opposed to public data:
In other words, the game is no longer about who can build the best big model, it’s about who can leverage existing technology on private data faster to create an advantage in their marketplace.
For those of you who know me or follow my content, I work at a consulting firm specializing in the valuation of financial instruments, business interests, and other tangible and intangible assets. We have a lot of data related to valuation, mostly numeric and financial in nature, going back many years and across sectors and asset types.
That’s the good news. The bad news is that it’s all tucked away in Excel spreadsheets and other file types that are not well suited to searching and aggregation.
Since building intelligent systems starts with (lots and lots of) data, and the future of generative AI is in data that is not public (and not just textual data, but numeric data such as financial data[1]), one of the first jobs for any organization looking to make a generative AI transformation is to take stock of and begin to harvest their own internal data.
For my organization, that meant:
We accomplished this with a single developer using Python and open-source libraries.
The tool is called “eparse” (short for Excel Parser) and I released it as open-source on GitHub:
Anyone can download it and use it freely under the associated MIT license.
The README file explains how to use the tool, so I won’t rehash that in this article, except to point out the following features of eparse that make it a viable extraction tool for these purposes:
The interface is flexible, and data can also be streamed to the user, so tabular Excel data can be visually inspected during the extraction process. This is a GIF from a demo of the tool I gave last week:
There are other tools that do this, particularly in VBA. However, I wanted something lightweight that would work in a headless Linux environment and that had a simple and native database interface.
The point here isn’t to promote any particular tool, but to stress the fact that you need your data in a format that you can query.
The purpose of the exercise is to explore your data: look at the distinct column headings, find the fields shared between groups of data, and join subsets of your data together. Like making a tasty glaze, your goal should be to start with “big” data and reduce it down to a smaller, higher-quality dataset.
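To make that kind of exploration concrete, here is a minimal sketch using only Python's built-in sqlite3 module. It is not eparse's interface or schema. The `cells` table layout, the file names, and the sample values are all hypothetical, and stand in for whatever your own extraction step produces. It shows the three moves described above: listing distinct column headings, finding headings shared across files, and joining subsets on a shared field.

```python
import sqlite3

# Hypothetical cell-level records, as an Excel extraction step might emit them:
# (source file, sheet name, row number, column header, cell value as text).
CELLS = [
    ("fund_a.xlsx", "Valuation", 1, "Company", "Acme"),
    ("fund_a.xlsx", "Valuation", 1, "Fair Value", "120.5"),
    ("fund_a.xlsx", "Valuation", 2, "Company", "Globex"),
    ("fund_a.xlsx", "Valuation", 2, "Fair Value", "80.0"),
    ("fund_b.xlsx", "Inputs", 1, "Company", "Acme"),
    ("fund_b.xlsx", "Inputs", 1, "Discount Rate", "0.12"),
]


def load_cells(cells):
    """Load cell records into an in-memory SQLite table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cells "
        "(file TEXT, sheet TEXT, row_num INTEGER, header TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO cells VALUES (?, ?, ?, ?, ?)", cells)
    return conn


def distinct_headers(conn):
    """Every column heading seen anywhere in the extracted data."""
    query = "SELECT DISTINCT header FROM cells ORDER BY header"
    return [r[0] for r in conn.execute(query)]


def shared_headers(conn):
    """Headings that appear in more than one file: candidate join keys."""
    query = (
        "SELECT header FROM cells GROUP BY header "
        "HAVING COUNT(DISTINCT file) > 1 ORDER BY header"
    )
    return [r[0] for r in conn.execute(query)]


def join_on_company(conn):
    """Pair fair values with discount rates for companies that have both."""
    query = """
    WITH fv AS (
        SELECT k.value AS company, v.value AS fair_value
        FROM cells k JOIN cells v
          ON k.file = v.file AND k.sheet = v.sheet AND k.row_num = v.row_num
        WHERE k.header = 'Company' AND v.header = 'Fair Value'
    ),
    dr AS (
        SELECT k.value AS company, v.value AS discount_rate
        FROM cells k JOIN cells v
          ON k.file = v.file AND k.sheet = v.sheet AND k.row_num = v.row_num
        WHERE k.header = 'Company' AND v.header = 'Discount Rate'
    )
    SELECT fv.company,
           CAST(fv.fair_value AS REAL),
           CAST(dr.discount_rate AS REAL)
    FROM fv JOIN dr ON fv.company = dr.company
    ORDER BY fv.company
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = load_cells(CELLS)
    print(distinct_headers(conn))
    print(shared_headers(conn))
    print(join_on_company(conn))
```

Storing one row per cell keeps the schema stable even when every spreadsheet has different columns, which is why this sketch uses it. Once the shared fields are known, the joined subsets can be reshaped into a narrower, curated table.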
Again, from the April Economist article:
Once you have a high-quality, curated dataset that is private and specific to your organization, you are ready to move on to experimenting with generative AI and discovering how it can give you an edge.
Where to go from here?
Over the course of my career in both finance and accounting and now in software development, I have never seen the business world react so quickly to an emerging technology as I have seen with generative AI over the past few months. I believe that many organizations are at an inflection point that is playing out as you are reading these words.
It is clear to me from my research in and experimentation with generative AI that the technology in and of itself is no longer a competitive advantage (I will write another article soon talking about open-source LLMs, which appear to perform nearly as well as the proprietary models). The key to creating an advantage is to leverage your own data and processes with these tools.
As we’re told in the Zen of Python, “now is better than never.” My advice is to get started with exploring your data sooner rather than later. Things are moving too fast to spend time building a development team. You’ll be surprised by what you can accomplish with a single developer and some free software.
[1] To be clear, LLMs are language models and primarily learn relationships between words to generate text. However, the underlying technology is a neural network, which trains on numeric data and can learn the relationships between any kind of numeric data, whether it is vectorized word and embedding data or financial statement and valuation data.