The real benefits of AI agents, with Dave Brewster & Ravi Ramachandran - HS#23
Miko Pawlikowski
I help technical leaders achieve HockeyStick growth | Head SRE | Co-founder SREday.com, Conf42.com & 5 more
In the latest episode of HockeyStick Show, I had the pleasure of speaking with Ravi Ramachandran and Dave Brewster, co-founders of Eidolon AI, about the evolving world of AI agents. We tackle everything from the true definition of an AI agent, how AI agents are solving real-world challenges, what the future holds for AI, and whether my vacuum cleaner is secretly plotting to attack me in my sleep.
You won't want to miss this!
Podcast
If this podcast created value for you, subscribe at hockeystick.show and share with 1 friend. Producing quality content like this takes a lot of time and effort, and every subscriber helps me continue this mission.
Video
Audio
Summary
Welcome to episode 23 of the HockeyStick Show podcast, where we dive into groundbreaking advancements in technology, business, and performance. In this episode, host Miko Pawlikowski sits down with Ravi Ramachandran and Dave Brewster, two-thirds of the co-founder team at Eidolon AI, to discuss the intricacies of AI agents and their potential to transform industries.
What is an AI Agent?
Kicking off the discussion, Miko asks, "What the hell is an AI agent?" Dave Brewster elucidates that an AI agent is essentially the smallest atomic unit capable of providing an autonomous answer. These agents can range from complex systems that debug code to simple tools that search documents. At the heart of their definition lies the element of autonomy—a pivotal characteristic that distinguishes these agents.
The Trust Issue
Miko raises an important concern about trust, given that even advanced AI models like ChatGPT sometimes "make things up." Dave explains that trustworthy AI agents need built-in fault tolerance. He points to Claude as an example of an AI that internally checks its processes, ensuring reliability. For enterprises, trust also extends to data governance, ensuring that sensitive information remains secure.
Market Maturity and Observability
Ravi sheds light on market maturity, explaining that apprehensions often stem from the evolving nature of the technology. Miko elaborates on the limitations he faces as a software engineer, particularly the limited observability within LLMs (large language models). Dave acknowledges this challenge, emphasizing the need for multi-query, fault-tolerant systems that can better manage and debug AI outputs.
Building Practical AI Agents
When asked about their unique selling proposition at Eidolon AI, Ravi and Dave stress the importance of practical deployment. They describe their AI agent server, which focuses on easy integration and management using Kubernetes. By leveraging Kubernetes, the team ensures scalability, security, and seamless integration with existing infrastructures.
Real-World Applications
Miko presents a hypothetical yet practical use case: an AI agent that could analyze Slack history to gauge the need for a coffee break, estimate work hours, and notify his wife via text. Dave responds affirmatively but notes that while they don't have a Slack loader or text capability, integrating services like Twilio would be straightforward.
Production Use Cases
Dave shares real-world applications where AI agents have already made a significant impact, such as acting as API gateways to simplify calling multiple endpoints or using AI to detect fraudulent insurance claims. These use cases highlight the versatility and potential of AI agents in day-to-day operations and specific enterprise needs.
The Future of AI: Memory and Beyond
Dave and Ravi delve into the future, pointing to the compelling concept of adding memory to AI agents. Memory would allow agents to learn and adjust based on past interactions, dramatically enhancing their utility. This advancement could revolutionize everything from personal productivity tools to complex enterprise solutions.
Ethical Considerations
The conversation takes a philosophical turn as Miko and Dave discuss the ethical implications of AI and potential misuse. Dave stresses that while AI can enhance productivity, it's crucial to use these advancements responsibly to avoid dystopian outcomes.
Open Source and Community Engagement
Ravi emphasizes Eidolon's commitment to making AI accessible to everyone. This open-source philosophy extends to their working structure, where daily standups are open to public participation, fostering a transparent and collaborative environment.
Conclusion
In this enlightening episode, Ravi and Dave offer a nuanced perspective on the current state and future of AI agents. As they continue to push the boundaries of what's possible, their work at Eidolon AI stands as a testament to innovation, collaboration, and responsible technology development.
If you're intrigued by the possibilities AI agents present and how they can revolutionize both your personal and professional life, tune in to the full episode of the HockeyStick Show podcast. You won't want to miss the insights from these pioneers in the AI industry.
Transcript
[00:00:00] What the hell is an AI agent? The smallest atomic unit of something that can give you an autonomous answer. If you're feeling really brave, you can have the LLM automatically create the text message and send it for you. Humans do this when they dream. You resolve your memories of the day when you're dreaming.
But what you're really doing is understanding what's important and not important. Claude is a great example of something that checks what it's doing. Not just at the prompt level, but behind the scenes. A lot of people are afraid, oh, this is going to take over the world. It's not.
Miko Pawlikowski: So what I did today was to bring in two-thirds of the co-founder team at Eidolon AI. I've got Ravi Ramachandran and Dave Brewster. How are you doing today?
Very good.
Miko Pawlikowski: Awesome. Where are you hailing from?
Ravi Ramachandran: I'm based in California, in San Jose.
Dave Brewster: Yeah, and I'm in Ohio.
Miko Pawlikowski: That's pretty sweet.
Miko Pawlikowski: Let's jump straight into that. What the [00:01:00] hell is an AI agent?
Dave Brewster: Sure, absolutely. So the definition's a little bit all over the board, but all I can do is tell you how we define it. I think agents come in various-sized packages, right? So you see some of these larger agents trying to code for you or trying to debug your code for you, things like that.
Those are obviously very large, very complex agents. And you see something simple, a tool that will automatically allow you to search your docs on S3 or something like that. Both of those, to us, are agents. In fact, we define an agent as the smallest atomic unit of something that can give you an autonomous answer.
So an agent to us is a simple service. Some of those larger, complex agents that I mentioned are typically made up of many different services under the covers, many different calls, many different contexts, talking to an LLM. But all of those, to us, are still agents, right?
Autonomy, I think, is the key [00:02:00] word. Something that can give you an answer from an LLM is an autonomous agent for us.
Miko Pawlikowski: Okay, so let's address the elephant in the room immediately. Let's say I've got an agent and it can do things for me. But I've also spoken to ChatGPT, and I know that even though it does great things most of the time, it also sometimes just makes things up, and that's a real problem. What is it useful for today? At what point can I actually deploy it and have it be autonomous?
Dave Brewster: Yeah, I think if you look at some of the really nice agents that are built out there, they have fault tolerance built into them, what I would call fault tolerance, right? To the point where a lot of them will double-check answers.
A lot of them will double-check what's going on. For example, if you use Claude, right? Claude is a great example of something that checks what it's doing, not just at the prompt level, but behind the scenes. You can tell when you use it that everything with Claude is made up of multiple LLM calls, not just one.
And one of the things they're doing in the [00:03:00] background is firing off multiple requests and getting two different answers. Or if you ask for something on how to code, they're firing off another LLM request in order to pull that off. A lot of times people call them tools, right? They'd be tool calls.
Sometimes you can do it ahead of time and sequentially branch the call, then resolve on the backend. There are many different mechanisms to check whether something is working well. But the very simplest, the very easiest GPT-3.5 question we would ask in the past was one call, right?
It was one call without any checks. And I think that simplicity, at the agent framework level, is going to go bye-bye. If you're producing something that is useful, it can't be a single, simple agent call that gives you one answer. It's got to have checks and balances built into it. You have to have the ability to call off onto other agents or [00:04:00] tools in order to do work for you, et cetera.
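The "checks and balances" pattern Dave describes, firing several requests and reconciling the answers, can be sketched in a few lines. Everything here is a stand-in: `call_llm` is a placeholder for a real client call, with a deterministic fake body so the sketch runs offline.

```python
from collections import Counter

def call_llm(prompt: str, seed: int) -> str:
    """Stand-in for a real LLM call; any client (OpenAI, Anthropic, ...) fits here.
    The fake answers below simulate a model that is usually, but not always, right."""
    return "42" if seed % 3 != 0 else "41"

def checked_answer(prompt: str, n_calls: int = 5) -> str:
    """Replace a single unchecked call with a small ensemble plus a majority vote.

    This is the simplest form of fault tolerance: ask several times,
    keep the answer most of the calls agree on.
    """
    answers = [call_llm(prompt, seed=i) for i in range(n_calls)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

With the fake model above, three of five calls return "42", so the vote recovers the majority answer even though two calls disagree.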
Dave Brewster: There are lots of techniques out there to avoid hallucination. But for us, from an agent framework point of view, having those checks be agentic in themselves is really what we feel is the way to go.
Miko Pawlikowski: So would you say that the biggest limitation at the moment is this trust issue?
Dave Brewster: Yeah, it is. But trust comes at a lot of levels, right? Trust at an enterprise comes at many levels. Not just trusting the answers you get, but also trusting that your data is not going to get copied and released somewhere else. It's being able to understand the lineage of where all the requests went, right?
So governance built into the framework that you're using. If a user makes a request, knowing what agents it touched, what data they had access to, how it was touched, et cetera, is one of the things that an enterprise would consider trust.
Ravi Ramachandran: Part of it is the market maturity. [00:05:00] When things are in the early stages of evolution in a new market, there's just natural apprehension about making sure things are fully baked and can be brought to market. So part of it is trust issues and security: where's my data? Where are the skills in an enterprise to roll these things out? Where is the infrastructure at? And then, how mature are the vendors offering the technology in this space? So it's market maturity that then drives a lot of the trust factors that Dave just mentioned.
Miko Pawlikowski: From where I'm sitting as a software engineer, one of the things that comes to me the least naturally with this LLM technology is that the visibility, the observability of it, is a little limited. And by a little, I mean quite limited. Most of the time, I don't know why it went wrong. So I get the argument that you can do redundancy and maybe launch multiple requests and
Miko Pawlikowski: do some kind of vote consensus and pick the one direction [00:06:00] to go with.
Miko Pawlikowski: But is that a problem that can actually be solved, or is that just something that we're going to have to get used to?
Dave Brewster: I think observability is quite an interesting problem. Observability within a single LLM, that's hard. Knowing where the data came from that answered that question, and why it happened. I think GenAI researchers struggle with why answers are even given, right? The whole concept of what is really going on.
I know it's a pick-the-next-word game, but why did it pick that word, from a vector point of view? Some of that, I think, is not really well known. But observability on where my answer came from, with respect to the LLM level, is quite possible, right?
Again, today the way a lot of people think of LLM calls is as a single call, and those days are really going to be [00:07:00] numbered. So you can observe not only that the LLM made this response, but what data it had to make that response, what tools it had access to that it chose to call or not call, and if it did choose to call them, what it asked them.
Things like that are super important for a software developer to be able to understand. But within an LLM, I don't know of any good research that's available.
That's a really difficult problem inside. Maybe OpenAI, maybe Meta, Anthropic, maybe some of the other vendors can figure this out. But observability should come, or needs to come, at the tool level as well. And it's really not there either today.
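The kind of observability Dave is asking for, recording not just the response but what data and tools the model had access to, is easy to sketch as a trace log wrapped around each call. The class and field names below are hypothetical illustrations, not any framework's real API, and the model response is a stub.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TraceEvent:
    kind: str           # "llm_call", "tool_call", "llm_response"
    detail: dict
    ts: float = field(default_factory=time.time)

class TracedAgent:
    """Record, in call order, what each step saw and did."""

    def __init__(self):
        self.events: list[TraceEvent] = []

    def record(self, kind: str, **detail) -> None:
        self.events.append(TraceEvent(kind, detail))

    def ask(self, question: str, documents: list, tools: list) -> str:
        # Log the inputs the model had access to ...
        self.record("llm_call", question=question,
                    context_docs=list(documents), tools_offered=list(tools))
        # ... any tool the model chose to invoke, with its arguments ...
        self.record("tool_call", tool="search_docs", args={"query": question})
        # ... and the final answer (a stub here, in place of a real model).
        answer = "stub answer"
        self.record("llm_response", answer=answer)
        return answer
```

After a call, `agent.events` answers exactly the questions Dave lists: what data the model had, which tools it was offered, which it called, and with what arguments.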
Miko Pawlikowski: So we've got some of the problems here, and you guys are obviously heavily invested in the AI agent thing. So what angle do you attack it from at Eidolon? What's your unique selling point?
Ravi Ramachandran: I think we've come at it very [00:08:00] much from the perspective of people building agents. Again, what people consider agents varies, and really a lot of the focus has been on how you end up building these things, less so on how they actually get deployed. We've seen the big problem being that these agents never seem to make it into production. So we think about that problem as a key part of an agent framework, but also of an agent server: how is it going to get deployed? How do these services communicate? How do they coexist in an existing environment where there are many other services, many other applications running? So not just simply building, but also making sure it can be easily deployed.
And that's one of the key things we focus on. Why we call it an AI agent server is one of the key ideas: production workloads that can be deployed, and thinking about that from the ground up, right? That's one of the key things that we think differentiates us.
Dave Brewster: I think that is the biggest one. If you look at a lot of other frameworks, they focus on [00:09:00] just the framework part.
And if you want things like observability built in, or any kind of auditing built in, you can do that by just writing files or hooking into something else. But really, the way the industry has gone, from a software developer point of view, is to turn those into services, and to turn the thing that runs them into a server that monitors and gives access to all of that information for the end user that's running it. We've chosen an easier way to do that, by plugging into Kubernetes, right? So for large-scale deployments of our services, you install a K8s operator, and our agents run as custom resources
on the Kubernetes cluster. That makes us super easy to install, super easy to deploy the agents, changes to the agents, et cetera. And we can leverage everything that Kubernetes gives you, DNS and a lot of other things, to be able to find [00:10:00] the agent machines that are running in a clustered environment,
which allows you to have multiple agent machines running in one cluster. That gives you the ability to have different RBAC rules associated with each agent running in each one of those servers, or even at the whole agent machine level, to really clamp down the security.
Say that machine has access to S3 buckets that only the CEO of a company should see. You could lock down that whole agent virtual machine concept so that only that process, only that node, only that machine, however you want to deploy it, has access to that data. The whole concept of an AI agent server really brings those ideas together in a way that makes it simple to manage, monitor, and meter your agent deployments.
Miko Pawlikowski: Got it. So you build on top of Kubernetes and provide this glue that makes it easy to deploy [00:11:00] these things and put the right bumpers around them. And in terms of the actual agent technology, whatever is making the external API calls or effecting these changes, is that your open-source stuff doing that?
Dave Brewster: That's right, Miko. There are two parts: the agent server and the agent SDK. The SDK is the open-source project that's there.
And the Kubernetes operator is open source too; it's all there together. But that is what you code against. And it's interesting: for a lot of things in our SDK, we believe, as I said earlier, that an agent is an atomic unit, but it's also a collection of agents that builds something more complex.
Dave Brewster: So what you'll see when you download our SDK is a bunch of pre-built agents, and they allow you to simply assemble them together to make something more complex. I'll give you an example of that. We have an API agent built in that is an autonomous API agent.
You point it at config [00:12:00] time to an OpenAPI JSON file, a Swagger JSON file, and it will learn what APIs are available, what the endpoints are, and how you call them. There is also a RAG agent, which you can plug different loaders into. And a RAG agent, for people who don't know, is what you use
to process a large number of documents that won't fit in the context. Putting those two together with a GitHub loader, you can very easily, with just a few files of config, create something that lets you search your docs, monitor your GitHub issues, et cetera, for changes. So that when you're creating a pull request, for example, you can ask the API endpoint: what should I put in this PR?
What should the description be? And voila, you get an LLM-guided [00:13:00] response that matches your code, from the ability to search your code and your docs, and matches the changes related to the files in there, straight from the LLM, which you can copy-paste into your GitHub pull request.
Being built of smaller agents lets you assemble things. We really believe in this Lego building-block approach. And if we don't provide the agents that you want, the SDK is also code, so you can write your own agents, your own building blocks, to put things together.
Every part of the SDK is completely pluggable. Everything's behind an interface that makes it very easy to swap in a different implementation.
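The "Lego building block" idea, small agents behind a common interface, composed into bigger ones, might look roughly like this. These classes are an illustration of the pattern only, not the Eidolon SDK's actual types.

```python
from abc import ABC, abstractmethod

class Agent(ABC):
    """Smallest unit that can give an autonomous answer."""
    @abstractmethod
    def handle(self, request: str) -> str: ...

class DocSearchAgent(Agent):
    """A simple building block: naive substring search over named documents."""
    def __init__(self, docs: dict[str, str]):
        self.docs = docs

    def handle(self, request: str) -> str:
        hits = [name for name, text in self.docs.items()
                if request.lower() in text.lower()]
        return f"matched: {', '.join(hits) or 'nothing'}"

class RouterAgent(Agent):
    """A composite agent: delegates to whichever sub-agent the prefix names."""
    def __init__(self, routes: dict[str, Agent]):
        self.routes = routes

    def handle(self, request: str) -> str:
        prefix, _, rest = request.partition(":")
        return self.routes[prefix].handle(rest.strip())
```

Because every component sits behind the same `Agent` interface, a `RouterAgent` can wrap a `DocSearchAgent` today and a swapped-in implementation tomorrow without any caller changing.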
Miko Pawlikowski: So that makes sense to me, but I'm still wondering: between the LLM that produces some text and an action actually happening, is there a standardized interface that describes the actions it's supposed to spit out? Is that how it works?
Dave Brewster: You have an agent, and to us an agent's a service. So every agent has input that can be described by JSON schema; input, [00:14:00] obviously, is much more important than output here for use cases. If that agent wants to talk to an LLM, we do it through this concept we created called an agent processing unit. An agent processing unit takes care of all communication with the LLM, not just at the simplistic LLM level, but also everything related to memories, tool calls, any kind of IO transformation, storing temporary things on the local file system that you need,
et cetera. Such that when you're coding against the LLM, it really just feels like another API endpoint you're calling. And given that you can describe the input and output schema of what you give to the LLM and how you want it formatted back, it really allows us, in many different environments,
to [00:15:00] tailor that to the output you get. So for example, say you're using a model that doesn't support tool calling out of the box. The APU will wrap that call in something that will pull out and discern the JSON that's there and treat it as a tool call, right? We'll do all the heavy lifting to put in all of the prompting that's needed, everything needed to fake tool calls, if you will, which does include multiple LLM round trips: call the tools, get the responses, give them back to the LLM, et cetera, in a loop, until the response comes back.
I'll give you another example. Many models don't support images, so image to text, I'll use that as an example. The APU has an optional component you can register on it that will do any image-to-text translations automatically for you, and vice versa, and speech to text, audio to text, and the corollary of that as well.
So all you need to do is register a more complex [00:16:00] APU where you give it the base LLM, an image converter or image processor, and optionally an audio processor, and it will make all that work automatically for you. The APU also handles document-upload RAG, both in-context and search, just like Claude and ChatGPT do.
So the APU really is there to abstract away the complexities of the LLM, not just the simple LLM call level, but the whole thing: tool calling, image translation, et cetera. You can have multiple APUs in an agent. About the only use case for that is a chatbot interface where you want to switch between different APUs dynamically, but you can, if you're doing something really complex in an agent.
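The tool-call faking Dave describes, prompting a model without native tool support to emit JSON, then discerning and dispatching it, could be sketched like this. The parsing and dispatch logic is a simplified assumption for illustration, not the APU's real implementation; the prompt that instructs the model to answer in JSON is not shown.

```python
import json
import re

def fake_tool_call(model_output: str, tools: dict):
    """Emulate tool calling on a model that has none.

    The (unshown) prompt asks the model to reply either in plain text or as a
    JSON object like {"tool": ..., "args": {...}}. Here we discern which one
    came back: if it parses as a known tool request, dispatch it; otherwise
    treat the output as an ordinary answer.
    """
    match = re.search(r"\{.*\}", model_output, re.DOTALL)
    if match:
        try:
            payload = json.loads(match.group(0))
            if payload.get("tool") in tools:
                return tools[payload["tool"]](**payload.get("args", {}))
        except json.JSONDecodeError:
            pass  # braces, but not valid JSON; fall through to plain text
    return model_output  # ordinary answer, no tool requested

# A toy tool registry for demonstration.
tools = {"add": lambda a, b: a + b}
```

A real implementation would loop: feed the tool's result back to the model and repeat until a plain-text answer comes back, which is the "multiple LLM round trips" Dave mentions.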
Miko Pawlikowski: And again, it's just a separable component that can be plugged in like anything else. Got it. So let's walk through a use case that's currently brewing in my brain. If I [00:17:00] go and check out your stuff, will I be able to implement an agent that can look at my Slack history of the day and, A, order me the right kind of coffee depending on how bad the day is, maybe a double or a triple,
and B, estimate how long I'm going to take and how late I'm going to be from work, and text my wife? Or would I need anything custom for that now?
Dave Brewster: Yeah, we don't have a Slack loader, so you would need that. And we don't have the ability to text, so you would need that. But everything else in the middle? Probably not.
Miko Pawlikowski: I'll just plug in Twilio or whatever.
Dave Brewster: Yeah, exactly. We haven't built the Twilio integration yet, but it's not difficult. And like I said, everything is component-based.
Miko Pawlikowski: There you go.
Dave Brewster: You can have it happen autonomously, where you register "text your wife" as a tool. If you're feeling really brave, you can have the LLM automatically create the text message and send it for you. There you go. Without your approval. I'm not sure I would do that, but...
Ravi Ramachandran: I think Miko's on the right track, in the sense [00:18:00] that we're already seeing things that are going to be done very differently, right? From a personal point of view: how you live your life, how you do your day-to-day tasks, get your coffee, get whatever your favorite beverage or restaurant meal is.
Those things are happening today and will just accelerate. Of course, if you give the LLM the right kind of tools for the agent to access, a lot of different things are possible. So, for sure.
Miko Pawlikowski: Yeah, and obviously I'm hoping that Tim Ferriss is watching this and is proud of the four-hour work week here, saving all this time. But on a more serious note, I think it would actually be pretty cool if I could have something that reads my emails overnight and preps me in the morning with a summary of the important stuff and the stuff I need to tackle. I'd pay money for that right away.
But I wanted to go back and pick a little bit on your sales pitch, Ravi. You mentioned that if you looked at how things are being deployed today, [00:19:00] they basically aren't. They never go into production.
If I had a magic wand that would tell me what everybody is doing with AI agents in general, what would they be doing? And what percentage would you estimate actually makes it into production use?
Ravi Ramachandran: Oh, wow. Look, I think today we're definitely seeing a lot of POCs, and a lot of the effort has been around showing the art of the possible, right? So pilots, POCs, and they could be across different functions, in a particular domain. And I think there's been a lot of interest because there's a top-down mandate for people to use and test agents and actually show what's possible.
That's more from an enterprise perspective. From a personal perspective, a lot of this is personal productivity. You mentioned things like automating your calendar, reading your email, responding automatically to things that today are just tedious for people to handle on a day-to-day basis.
So those are, I think, two different worlds. The personal side, doing things like giving access to a web browser, to your [00:20:00] calendar, trust notwithstanding, is very much already in motion. I'd say it's probably well ahead of the enterprise use cases, where things are much more in the POC and pilot stages, moving towards production, I think, over the next 12 to 18 months.
But yeah, that's the state of play as we see it. So are there production deployments? Yes. But there aren't nearly as many as there should be, and that's a big change that we're going to effect. We're trying to think about it from the end state, not just from the art of the possible.
Miko Pawlikowski: Do you have any funny stories on that subject? I know there are a lot of stories about chatbots going wrong, starting to insult clients and telling them the wrong things. Are there any cautionary tales for why people should use your stuff instead of doing it badly?
Dave Brewster: Yeah, the biggest funny story I have, I think every developer that does GenAI runs into this once. Hopefully not more than [00:21:00] once. You start a process that is running, to do RAG or whatever. For me particularly, I was loading a schema and trying to do a hierarchical graph representation of the schema.
And you start it, you code it, you run it, you debug it, you run it, you debug it, you run it. And it's late at night, it's 1am, you run it, and it's not producing the results you want. You're frustrated, you go to bed. And you didn't stop it. It runs overnight, runs into the morning, and you wake up to 50 emails saying your OpenAI account has been funded with more money. And voila, there you go. That's my cautionary tale for developers getting into GenAI: if you create an agent system that can recursively call itself, it will, and it will rack up [00:22:00] some dollars real quick on you.
Ravi Ramachandran: And that won't be funny, but it'll...
Dave Brewster: It is not very funny. Yeah. But no, that's the biggest one that I've personally run into.
Dave Brewster: Yeah, we're boring. We do enterprise stuff.
Miko Pawlikowski: Yeah. My brain is just searching through the images and memes I've seen over the years, all the things about leaving the credit card on your cloud account and accidentally leaving an instance running when you went on holiday.
It's a real thing.
Dave Brewster: Multiply that by a thousand with LLMs.
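A minimal guard against the runaway-spend failure mode in Dave's story is to track cumulative cost and refuse further calls past a hard ceiling. This is a generic sketch, not a feature of any particular framework, and the per-1k-token price is an invented round number; real rates vary by model.

```python
class BudgetExceeded(RuntimeError):
    pass

class BudgetGuard:
    """Hard ceiling on spend for a long-running or recursive agent loop."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, prompt_tokens: int, completion_tokens: int,
               usd_per_1k: float = 0.01) -> None:
        """Call this after every LLM request; raises once the limit is crossed,
        which breaks any accidental infinite loop before the 50 top-up emails."""
        self.spent_usd += (prompt_tokens + completion_tokens) / 1000 * usd_per_1k
        if self.spent_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f} budget")
```

The agent loop simply calls `guard.charge(...)` after each request; the exception, not developer vigilance at 1am, is what stops the recursion.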
Ravi Ramachandran: It's very interesting, right? I think the cost aspect of it is something that people maybe don't talk about as much, which is what Dave's mentioning. In general, people talk about the productivity aspects of agents, and there's been a lot of discussion about how productive they make all of us.
But there's a cost aspect to it, which is making sure it's done in a responsible way, in a fiscally responsible way. And there's also, I think, the creative aspect of it. If you just think about how agents are literally changing everything we do, it gives people a lot more freedom to not focus on the [00:23:00] low-level tasks, and to get much more creative about problem solving. And that's not something we've talked a lot about, but I think it's very specifically a benefit of AI and agents: letting people's minds be open, letting them create and really express themselves, in software and in other aspects of life,
which I think is really pretty cool.
I don't know how you measure that, but hey, not everything needs to be measured.
Miko Pawlikowski: And your stuff specifically: do you have any cool use cases you can share of people using it at the moment? Actual production usage?
Dave Brewster: so you start with the simple, right? one of the things I find interesting that people are doing, is, using it as an API gateway, right? as developers of business apps, one of the things that is the most boring, and on some of the hardest, by the way, is you're writing a service that needs to call a bunch of different API endpoints and coding to a doc that likely wasn't updated any time [00:24:00] recently, and that doc is incorrect and it doesn't adequately represent what you need to do.
Dave Brewster: But when you hit the endpoint and you look at the docs on the endpoint, they're right, they're there. A developer, while they were editing the code, of course, editing, edited all of The open API stuff that's related right next to it, if you're using a reasonable framework and everything's up to date, everything's there.
Dave Brewster: super frustrating. So one of the cooler, smaller things that I've seen people do and put in production is using an LLM as an API gateway. So you have your code call into an agent. Again, it's just a service call, in our framework. And that service call is your own personally fixed end point, right?
Dave Brewster: Like it has the JSON schema that you want it to have. You're gonna call it from code. It's never gonna change. It's what you need it to be. but that then calls into, a remote service, and you let the LLM make that call. So you load the OpenAPI schema into the LLM, either using RAG or [00:25:00] directly in context, And then you let the, LLM give you back a tool call that would be the remote call.
Dave Brewster: You make that remote call. that's a really interesting use case,for lots of things. In particular, what this guy was using it for is to call into multiple states, have basically the same endpoint for regulatory stuff. And,they semantically have the same input, but syntactically, they're completely different.
Dave Brewster: and you need to file a document with each state. They're using it like that, right? So call into this one,etc. I have an LLM do it autonomously. That's like a simple, really cool use case. Something more complex, is in a claim system, right? you're an insurance company. Say you're a, an auto insurance company, right? You're, you write policies for auto insurance and a claim comes in. And one of the first things that an insurance company does is determine is this a fraudulent claim or not a fraudulent claim, and historically that's been done completely by humans.
Dave Brewster: I [00:26:00] think historically they did it all as a statistical exercise, but later on, as computers got more involved, it moved to rule triggers. Even today, the state of the art at most insurance companies is pretty much rule triggers. They're starting to use AI to detect that as well.
Dave Brewster: But LLMs are much better at doing that than standard machine learning models, for one reason: a lot of the time, the substance of the claim is written in the notes, like 90 percent is in the notes, or it's an image or a series of images. Using standard machine learning for that, while it's possible, pales in comparison to using an LLM, because of its natural language abilities.
Dave Brewster: So using an LLM as a bloom filter on claims for insurance companies is something that is extremely cool. And then, if we start thinking about the future, which is a bit of a segue into where we go from here and what the next big milestone in LLMs is,
Dave Brewster: we would start [00:27:00] talking about memory and adding in-context memory to what's going on.
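The LLM-as-API-gateway pattern Dave describes could be sketched roughly like this. This is a minimal illustration, not Eidolon's implementation: the schema, the prompt wording, and the injected `llm` callable (standing in for a real model client such as OpenAI's) are all assumptions.

```python
import json

def gateway_call(openapi_schema: dict, request: dict, llm) -> dict:
    """Ask an LLM to translate a fixed, app-defined request into a call
    against a remote API described by its OpenAPI schema.

    `llm` is any callable taking a prompt string and returning the model's
    text reply; in production it would wrap a real client, but it is
    injected here so the sketch stays self-contained.
    """
    prompt = (
        "You are an API gateway. Given this OpenAPI schema:\n"
        f"{json.dumps(openapi_schema)}\n"
        "and this request from our application:\n"
        f"{json.dumps(request)}\n"
        'Reply with JSON: {"method": ..., "path": ..., "body": ...}'
    )
    tool_call = json.loads(llm(prompt))
    # A real gateway would now validate tool_call against the schema and
    # execute it, e.g. with requests.request(method, base_url + path, ...).
    return tool_call

# A fake LLM standing in for the real model, so the flow is visible:
def fake_llm(prompt: str) -> str:
    return '{"method": "POST", "path": "/ca/filings", "body": {"doc": "claim.pdf"}}'

# Hypothetical schema for one state's regulatory filing endpoint:
schema = {"paths": {"/ca/filings": {"post": {"summary": "File a regulatory document"}}}}
call = gateway_call(schema, {"state": "CA", "document": "claim.pdf"}, fake_llm)
print(call["path"])  # → /ca/filings
```

The caller's side stays stable (its own fixed JSON schema), while the LLM absorbs the syntactic differences between the per-state endpoints.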
Miko Pawlikowski: I like the segue. So let's do that. Let's say we solved the memory problem. How does the future of humanity change?
Dave Brewster: Yeah, exactly. Wow, big, dramatic question. I think a problem we have with LLM toolkits today, I'll start there, is that they're bare metal. So if I'm using LangChain and I'm trying to code up my app,
Dave Brewster: it feels like everything I do is different; the way I approach the problem is different. I would call it a bare metal implementation. To give a good example, I started my career many years ago at a company called NetDynamics, which was an app server company, and FastAPI or whatever you use today, those roots came from that company or a couple of others.
Dave Brewster: When we were developing the first app server, people were making calls I would consider very similar to the calls they make to [00:28:00] OpenAI today, or any bare LLM call, in that they're very low level. They're very direct, close to the operating system, and they're not higher-order packages.
Dave Brewster: So if I'm using an app server today and I want to, I don't know, forward my call to something else, it's one API call. It's super easy. Or if I want to return a JSON payload in my response, it's super easy to do, and I'm still being pretty bare-metal-ish here.
Dave Brewster: But as we've developed the app server over the last 20 years or more, the packages available to do that have gotten larger and do more and more for you. Where we're at today with the SDKs that talk to an LLM is that they're still very bare metal,
Dave Brewster: and I would also include Eidolon somewhat in that, right? Our goal is to up-level the API so that you're working with larger Lego blocks instead of having to implement RAG yourself, or think about it at a low level and make those decision [00:29:00] trade-offs between multi-query, single-query, et cetera.
Dave Brewster: Memory is, to me, the breaking point between simple and complex, between being able to work bare metal and having to use a toolkit that takes care of a lot of things for you. Memory will be that breaking point, in my opinion. Why is that? When you introduce in-context learning into a series of agent calls, you change the way the agent works dynamically over time, right?
Dave Brewster: So let's start real simple. I have a chatbot, and that chatbot remembers me as a user personally, what I like to talk about, et cetera. That's pretty simple and can be coded, right? But you've got to make some pretty big decisions: what's important? What's not? What's the context of [00:30:00] what's there? How do I roll that context in over time? I could tell you the sky was blue today, and now it's tomorrow. Is that an important memory? Probably not. It might be private. If I tell you my name, that likely is an important memory.
Dave Brewster: If I tell you the sky was blue today, what is it tomorrow? And if I'm a weather person and really into weather, maybe the sky being blue means it's clear, and it is more important. So how do you work context into that? Now let's bring that back into my agentic use case, where it's not a simple chatbot remembering memories.
Dave Brewster: It's an agent over time. Let's bring that back to the claims example. I've written a series of agents that determine whether a claim is important to this company and should be litigated, and there's one particular trigger that I, as an insurance company, find fraudulent. I [00:31:00] can make one up, but I'm not going to. If you tell that agent over and over again that a claim like that was fraudulent, either through reinforcement learning or lots of other techniques, then that system, if it remembers that claim being fraudulent, can reach higher accuracy over time. How do you do that in a simple bare-bones system without writing a ton of code? It's pretty difficult. We don't have long-term memory yet in Eidolon, but it's in a pull request and it will be here very soon.
Dave Brewster: But those are the things we're starting to think about. How will memory affect what you're doing over time? And once you add memory, how do you resolve those memories over time, in order to allow reinforcement learning, or some other system, to bring the important memories up and push down the unimportant ones, right?
Dave Brewster: Humans do this when they dream. You [00:32:00] resolve your memories of the day while you're dreaming. What you're really doing is working out what's important and what's not, along with a lot of other emotional things that hopefully we don't have to worry about with LLMs. But you do build a level of importance for your psyche during that time.
Dave Brewster: So how do we do that? What does that mean in an agentic system, not only at the user level but at the agent level? What about the agent-user combination level? Should an agent remember some things globally but other things per user? This is all going to lead to a lot of code and a lot of logic
Dave Brewster: if you're not working in a system that handles it for you. As a coder, that's a lot of code and a lot of work to do as a one-off every time. That's a little bit of where we're going, what's coming, and how memories will affect agents.
Miko Pawlikowski: The memory, the way people think about it, like the stuff you describe with humans, where we remember the right things and can recall them and it all just works out. That's [00:33:00] science fiction today, right?
Dave Brewster: There's a lot of research; people are playing with it. The seminal paper, which came out quite some time ago, was a research paper out of, I believe, Stanford, where the authors basically created a version of The Sims, the old classic computer game, using GPT-3.
Dave Brewster: I don't even think it was 3.5; I think it was 3. And they added memory to that, and it led a lot of people, including myself, to really start thinking about it over a year ago, way back in the LLM days of over a year ago. Then there's this thing called MemGPT that came out quite a while ago, and now Mem0, which is a rev on it.
Dave Brewster: I don't think it's science fiction. I think it's still at the research level, still at the top of people's minds. But it's honestly not a difficult thing to do. What's difficult is understanding contextually what's important and what's not, but that's exactly what LLMs can do. So it's just a multi-cycle LLM loop.
Dave Brewster: You store the memories as [00:34:00] RAG documents in a vector store. And the important part is distinguishing the supporting data for the memory from the memory itself. That's the thing Mem0 finally got right: the prompt that generates the things that are useful in a future prompt. Sounds confusing, but that's what they got right. So I wouldn't say science fiction. I would say it's going to be a huge rage over the next three to six months. Huge. You're going to start hearing a lot about in-context learning memories for LLMs. You're going to start seeing it in Claude,
Dave Brewster: and you're going to start seeing it in ChatGPT, for sure. It may already be in Claude. There are some things I've seen recently that lead me to believe that, if you've used Claude, a project in Claude is the memory boundary for the user: start a new project, you get a new memory context.
Dave Brewster: I don't know if that's true. I could just be making that up. Sorry, Anthropic people, if I'm just making...
Miko Pawlikowski: I'll fact-check you a bit later.
Miko Pawlikowski: Okay. What else [00:35:00] would be a major breakthrough, other than memory? What's next on your list?
Dave Brewster: I think that's one of the hard parts for anyone using an LLM; it seems so science-fictiony. What can I imagine doing next? So, a problem with an LLM is that it's a model that has learned. It did learn, and it is not learning anymore. LLM calls are stateless. So adding memory, learning preferences over time, whether of a system, a human, or even a subject matter, and then storing those memories off, will lead to major breakthroughs. For example, everyone talks about RAG recall scores.
Dave Brewster: This is something I want to try, but what if you learned the series of concepts in the documents you're processing, so that you learn the semantic hierarchy of those documents, their ontology?
Dave Brewster: Then, when questions are asked, you can give that ontology to the LLM. I think you would get a much higher RAG recall score [00:36:00] that way. But that's a very tactical thing, right? Memory will tactically change how people approach problems, because you can create systems that learn in the environment they're deployed in, versus in a lab somewhere.
Dave Brewster: That will lead to smart assistants at home, I think, for the everyday person. That will lead to, I hate using the word robots, but systems that will learn your behaviors and what you do. We do the same things every day; our tendencies are very strong.
Dave Brewster: And it wouldn't take very long for something to learn that, whether people want it to or not. So I think, like you were saying: writing emails for you, sending text messages at the appropriate time, paying attention to the environment around the human host.
Dave Brewster: Wow, I'm going to start getting creepy here with "the human host." That is an easy thing for an LLM to do if it has memory associated with it. So I think you're going to see those tools a lot. But in the [00:37:00] enterprise, learning your vulnerabilities at a security level, at a CISO level, is not going to be very difficult.
Dave Brewster: It might be expensive, but it wouldn't be very difficult. Like I said, RAG is a very common use case, and the problem with it is that recall scores get bad as you move context around, but that could change. Learning the behavior of an employee, or multiple employees, over time from a security point of view is also something that can happen, for good or bad. How we use this is more important than what's getting created, right?
Dave Brewster: Memory adds a whole new level and a whole new dimension to what's going on. And past that, I personally have a hard time grokking what comes next. Like a lot of people: what can this thing do? I don't know. I'm a coder. I tell the computer to do something and it does it. But when it's acting autonomously, it's like: how do you assemble a [00:38:00] human brain so that it's useful? That's the way I think of it.
Miko Pawlikowski: Oh wow, a lot of scientists would be very upset about you comparing it to a human brain, and would argue that it's nothing alike.
Dave Brewster: I know they would, but...
Miko Pawlikowski: But before we...
Dave Brewster: We won't get the same results, but we're trying to mimic, right? So let's think about the right way to mimic.
Miko Pawlikowski: Dave, why do you hate robots?
Dave Brewster: I guess my desire to have independence without being independent. How about that? That's a great question. I don't like using that word, because I don't like using words that compare LLMs and agents to human behavior; it's not the same thing. And a lot of people are afraid: oh, this is going to take over the world.
Dave Brewster: This is going to do this, this is going to do that. That's not the way it is. But if it's about information gathering, information sneaking, and learning things it maybe shouldn't, they're very good at it. And that concerns me. I live in the Midwest now; I lived in the Bay Area for 30 [00:39:00] years.
Dave Brewster: The level of employee autonomy in the Bay Area versus here: two radically different things. If you work for a company here, not in the tech industry but, say, in a manufacturing warehouse, they know what you're doing all day long. They monitor your email. They monitor the packets you send.
Dave Brewster: They do everything, for good or bad, right? It's to protect themselves, but they do it. Unfortunately, I think people use that for bad reasons. LLMs will let them do it ubiquitously across everything the employee does: where their mouse clicks, how often their mouse moves. If somebody wanted to code that series of agents up, they easily could.
Dave Brewster: So then you move that into the realm of a robot, and you take that level of independence away from somebody and have something monitor them. I just don't want people to get this dystopian view of the world where robots will attack and kill us, because that's not the way it's going to be. But [00:40:00] information gathering and snooping on people, that very easily could be the way.
Dave Brewster: So that's why it causes me that internal struggle. Sorry to go this deep.
Miko Pawlikowski: We were going to talk about the Unitree robots.
Dave Brewster: Yeah, no, sorry.
Miko Pawlikowski: I had a proper think about it once I realized, okay, fine, so you can actually buy this now. It's a small humanoid robot that's supposed to be able to carry things around and, finally, bring you a beer.
Miko Pawlikowski: And it no longer costs a hundred grand. I think it costs like 16,000 now, which means it'll probably drop further in price in a few years.
Ravi Ramachandran: Yeah, I don't mind getting the beer myself. I want the dishes to be done. That would be really awesome.
Miko Pawlikowski: There you go.
Dave Brewster: I'll go with the beer. I'm good with the beer.
Miko Pawlikowski: The robots are sneaking into my house anyway.
Miko Pawlikowski: My vacuum cleaner now has a voice recognition feature that I didn't ask for and disabled. And I hope it's disabled. It's driving around vacuuming, and it's got a little lidar in it. [00:41:00] I hope that's all it's using it for.
Miko Pawlikowski: So I guess I got past the fact that there's an autonomous thing that will hopefully not murder me in my sleep.
Dave Brewster: There you go. Hey, I have an idea for an autonomous agent: somebody could write something that monitors your web presence and the packets going out of your house, to see who is snooping on you. How about that one?
Dave Brewster: Fight fire with fire.
Dave Brewster: If anyone wants to create that one, I'll help them work on it. How about that?
Miko Pawlikowski: The good guy, the good hacker.
Miko Pawlikowski: Where are you taking it? You said this is still pretty close to bare metal, and you're hoping to up the game. What's the end goal? What do you want to do with it?
Dave Brewster: There are the boring parts, which is adding more connectors to more systems, and blah, blah, blah. That's always going to happen. I mentioned memory: we're adding long-term memory next, and we'll add memory consolidation in the [00:42:00] future. That's going to take some time to get right.
Dave Brewster: We'll integrate memory into all the other existing agent types; that will take time to get right too. Of course, we're monitoring all the papers out there, and we will implement what's needed. One thing we won't do is what some of the other frameworks do: add an implementation of every little fad that exists. As a developer using those toolkits, it's so annoying when command completion brings something up and the fad of the day shows up.
Dave Brewster: So from an Eidolon SDK point of view, it will be adding more of those building blocks as things get larger. We have RAG right now; we'll add different kinds of RAG, RAG for very specific contexts. We have text-to-SQL; we'll continue to enhance that.
Dave Brewster: We have an API agent that, like I said, calls autonomously into any API. We'll probably add a series of agents on top of that for existing systems, which again just add context to the prompt and [00:43:00] things like that. One potential thing we're discussing internally:
Dave Brewster: right now it's Python only, but since the framework is modular and component-based, we could stand up an API for any surface area of components, if you will, and then let you code agents in Java, or agents in Go, or whatever you want, pretty easily.
Dave Brewster: It's just us creating a facade for all of the APIs we have available, the right surface area, like I said. So those are some of the things on the open source project. But Ravi, do you want to go into how we're transitioning the parent company, August Data, into an enterprise company?
Dave Brewster: Eidolon will stay a hundred percent open source, guaranteed. But maybe, Ravi, you want to talk about some of those.
Ravi Ramachandran: You covered the product piece of it. If we just take a step back: we're building a project and a company to [00:44:00] make sure that everyone has access to AI. Broadly speaking, the mission of the company is to make sure it's not just for the few, whether "the few" is defined by a programming language, a type of role in your company, or an individual, right?
Ravi Ramachandran: You should be able to access it broadly, and I think that's the overarching goal.
Ravi Ramachandran: The other piece of it is really building a very open company. Not just the open source licenses we've built the software under, making it available, but ensuring that we get the input of a wide variety of people, both those in the company today, which is a handful, and the community as well.
Ravi Ramachandran: Just one example of that: the way we run the company is very much open and public. We have daily standups,
Ravi Ramachandran: which is different from office hours. We actually have standups that anyone can join, any day. So Miko, you're more than welcome to sit in on our standups and see how we're building the product, what's working in the [00:45:00] company and what's not, and how we tackle issues.
Ravi Ramachandran: So it's very open and very embracing of the community at large. That's one of the reasons we got together to start the company: we shared a vision of openness and of doing something bigger than ourselves. I'd say that's the philosophy and where we're taking the company.
Ravi Ramachandran: Dave described the open source project. Clearly we're building the product for the enterprise: individual developers today, but also taking this to the enterprise, because that's been our background. One of the things you mentioned earlier, Miko, is what makes us different, and we touched a little on the technology aspects. A lot of this comes down to teams of people. It's really all about the people building these technologies, especially since they're new, and building for the enterprise is what we've done. I feel that's super critical to eventually being successful.
Ravi Ramachandran: So the technology is great, but what do you do with it? How do you think about problems? That comes from experience, and that's really important to us. Anyway, that's just a smattering of things. I don't know if it all made sense, but hopefully you [00:46:00] can put together the product vision with the vision of the company and see why we are different, not better.
Ravi Ramachandran: I always think about it as why we are different and what makes us unique, not necessarily what makes us better than other solutions and frameworks out there today.
Miko Pawlikowski: I think you forgot to mention how you're going to make humanity a multi-planetary species, and also how you're going to enhance the human condition. You might want to revise your pitch a little bit.
Ravi Ramachandran: If you have us back, we'll make sure we cover that next time.
Miko Pawlikowski: There you go. Dave, Ravi, it's been a pleasure to have you. Thank you for coming.
Dave Brewster: Thank you so much.
Ravi Ramachandran: Thanks so much. Thank you.