Docker Labs: GenAI No. 11
Docker, Inc
Docker helps developers bring their ideas to life by conquering the complexity of app development.
Docker Labs @ GitHub Universe
Whether or not you’ve been following along with our journey of using AI assistants that leverage Dockerized tools, attendees of GitHub Universe 2024 will be able to try out all of our GenAI experiments. In keeping with the theme of the Software World’s Fair, we’re showing off our vision for the future of the SDLC, with new VS Code extensions, an AI content catalog for Docker Hub, and new features for our GitHub Copilot agent.
Where:
Docker Booth
When:
Tuesday, October 29th 8:00am - 12:30pm
Wednesday, October 30th 8:00am - 12:00pm
A Summary of Labs Experiments
At Docker Labs, it feels like we were playing around with runnable Markdown just four months ago. That’s probably because it was just four months ago that we published our first LinkedIn article, which covered generating a runbook in Markdown. We’ve covered a lot of ground since then, so we wanted to highlight some of the discoveries we’ve made along the way.
LLMs don't need anything but Markdown
In our first post, we experimented with a single use case: generating a “runbook” for a Docker project. We used a simple call to OpenAI, with code actions to execute the terminal blocks inside the Markdown so that the runbook could actually be run.
Now, we have expanded well beyond “runbook” use cases, but we’re still using markdown as the language of choice for prompts and their outputs.
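To make that first experiment concrete, here is a minimal sketch, assuming the standard OpenAI Python client; the system prompt wording, the `gpt-4o` model choice, and the output filename are illustrative assumptions rather than our exact setup.

```python
# Minimal sketch: ask an LLM to emit a runnable Markdown runbook for a project.
# Prompt text, model name, and filenames are illustrative assumptions.
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A crude project summary: just the file listing of the current directory.
project_files = "\n".join(str(p) for p in Path(".").rglob("*") if p.is_file())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "Generate a Markdown runbook for this Docker project. "
                "Put every command in a ```sh fenced block so an editor "
                "code action can execute it directly."
            ),
        },
        {"role": "user", "content": f"Project file listing:\n{project_files}"},
    ],
)

Path("runbook.md").write_text(response.choices[0].message.content)
```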
Prompts are worth tracking
Prompts represent a lot of work. The final prompt that achieves a use case is something you want to save, because the prompts that work best synthesize an expert’s knowledge about a tool or task, and that knowledge has value. Furthermore, as a prompt changes, its version history is worth maintaining. We quickly settled into a pattern of sharing and versioning prompts on GitHub.
Prompts should be composable
A prompt in the SDLC should be expressible as a tool. We allow prompts to be composed together so that multiple agents can participate in a single conversation, as sketched below.
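As a rough illustration of what “prompts as tools” can mean, here is a hedged Python sketch in which one prompt is wrapped as an ordinary callable that a coordinating agent could list alongside its other tools; the function names and prompt wording are hypothetical.

```python
# Sketch: expose a prompt as a tool so other agents can call it.
# Names and prompt wording are hypothetical examples.
from openai import OpenAI

client = OpenAI()


def run_prompt(system_prompt: str, user_input: str, model: str = "gpt-4o") -> str:
    """Run a single stored prompt against a model and return the reply."""
    reply = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return reply.choices[0].message.content


# A prompt, wrapped as an ordinary function, becomes a tool that a
# coordinating agent can offer alongside cat, git, pylint, and friends.
def lint_advisor(source_snippet: str) -> str:
    return run_prompt(
        "You are a Python linting expert. Suggest concrete fixes.",
        source_snippet,
    )
```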
Prompts work best with a specific model
When authoring a prompt, you tend to gravitate toward a specific model to run it against. We found it important to use frontmatter metadata to “pin” a prompt to that model; when multiple agents run together, each one can invoke a different model.
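A minimal sketch of what pinning can look like, assuming the prompt file carries YAML frontmatter with a `model` key (the key name, file layout, and fallback model are assumptions for illustration):

```python
# Sketch: read a Markdown prompt with YAML frontmatter and dispatch it to the
# model it is pinned to. The `model` key is an assumed convention.
import yaml


def load_prompt(path: str) -> tuple[dict, str]:
    """Split a prompt file into its frontmatter metadata and Markdown body."""
    text = open(path, encoding="utf-8").read()
    _, meta, body = text.split("---", 2)
    return yaml.safe_load(meta), body.strip()


metadata, prompt_body = load_prompt("prompts/generate_runbook.md")
pinned_model = metadata.get("model", "gpt-4o")  # fall back if nothing is pinned
print(f"Running prompt against {pinned_model}")
```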
The best source of context for the SDLC is the project
Even though many applications go for a “chatbot” or human-in-the-loop approach, we tried to think toward a world where the developer is less directly involved. Instead of a developer asking “Can you lint my backend with pylint?”, a developer might ask “Can you improve my source code?”
We also learned that a software project is a distilled set of decisions about a problem. As the project changes, a good engineer can read between the lines to understand which decisions those changes represent.
The LLM can achieve complex use cases when provided with simple tools
Early experiments might look like:
“I need you to run cat on package.json to look at the dependencies of my npm project,”
but this is naive and unnecessary. Instead:
“Here’s cat, and an npm project. What are the dependencies?”
The LLM is able to sort things out on its own, so we don’t need to guide it so closely. Much like an expert can wield the right set of tools to accomplish amazing things, our assistants can become that expert when handed the right set of tools.
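Here is a hedged sketch of the second style, assuming OpenAI-style function calling: we hand the model a single `cat` tool and the question, and let it decide that reading package.json is the way to answer. The tool name, schema, and model choice are illustrative.

```python
# Sketch: hand the model one bare `cat` tool and a question, and let it decide
# that reading package.json is the way to answer. Names and schema are illustrative.
import json
import subprocess

from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "cat",
        "description": "Print the contents of a file in the project.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

messages = [{"role": "user",
             "content": "Here's cat, and an npm project. What are the dependencies?"}]

first = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
assistant_msg = first.choices[0].message

if assistant_msg.tool_calls:
    messages.append(assistant_msg)
    for call in assistant_msg.tool_calls:
        # Run the requested cat and hand the file contents back to the model.
        path = json.loads(call.function.arguments)["path"]
        output = subprocess.run(["cat", path], capture_output=True, text=True).stdout
        messages.append({"role": "tool", "tool_call_id": call.id, "content": output})
    final = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```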
It’s easy to overwhelm current models with source code
Despite trying to avoid keeping “state” in our prompts outside of the conversation itself, we found that certain problems need to be solved with a running thread of information. We used ephemeral Docker volumes inside our prompts to let our assistants store large amounts of information without overwhelming the token limit.
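A minimal sketch of the volume trick, driving the standard Docker CLI from Python; the volume name, images, and pylint task are placeholders chosen for illustration.

```python
# Sketch: park bulky intermediate output in a throwaway Docker volume so it
# never has to travel through the model's context window. Names are placeholders.
import os
import subprocess


def sh(*args: str) -> str:
    """Run a command and return its stdout."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


sh("docker", "volume", "create", "assistant-scratch")

# A tool container writes its full report into the volume instead of the chat.
sh(
    "docker", "run", "--rm",
    "-v", "assistant-scratch:/scratch",
    "-v", f"{os.getcwd()}:/project",
    "python:3.12-slim",
    "sh", "-c", "pip install -q pylint && pylint /project > /scratch/lint.txt || true",
)

# Later tool calls read back only the slice they actually need.
print(sh("docker", "run", "--rm", "-v", "assistant-scratch:/scratch",
         "alpine", "head", "-n", "20", "/scratch/lint.txt"))

sh("docker", "volume", "rm", "assistant-scratch")
```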
Agents can learn how to use a tool from existing documentation
Up until this point, we had been focused on defining the right JSON “handle” for our tools: a schema for the LLM to adhere to when sending arguments. However, we suspected that LLMs might be good at reading existing documentation, and when we gave the assistant the means to read a tool’s docs, it learned the arguments on its own.
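To make the contrast concrete, here is a hedged sketch of both approaches: a hand-written JSON “handle” on one side, and simply feeding the tool’s own --help output to the model and asking it to propose arguments on the other. The schema, prompt wording, and pylint task are illustrative.

```python
# Sketch: two ways to teach an assistant about pylint.
# The schema, prompt wording, and task are illustrative assumptions.
import subprocess

from openai import OpenAI

client = OpenAI()

# Approach 1: a hand-written JSON "handle" the LLM must adhere to.
pylint_handle = {
    "type": "function",
    "function": {
        "name": "pylint",
        "description": "Lint Python source files.",
        "parameters": {
            "type": "object",
            "properties": {
                "paths": {"type": "array", "items": {"type": "string"}},
                "disable": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["paths"],
        },
    },
}

# Approach 2: let the model read the tool's own documentation instead.
help_text = subprocess.run(["pylint", "--help"], capture_output=True, text=True).stdout
reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Given this tool's help text, propose the exact command line to run."},
        {"role": "user",
         "content": f"{help_text}\n\nTask: lint src/ but ignore missing docstrings."},
    ],
)
print(reply.choices[0].message.content)
```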
LLMs can build their own tools to automate themselves out of the loop
A lot of use cases involve context-heavy tools, like migrating source code or bulk-fixing lint issues. These types of use cases quickly overwhelm the context token limit of most models. To solve that, we tried to get the LLM to automate some of the process without having to send everything through the assistant. What we found was that the assistant was able to construct its own data pipelines to effectively automate itself out of the entire process, just like any other developer.
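One way to read that in code, as a hedged sketch: instead of streaming every file through the conversation, the assistant is asked once for a script that does the bulk work, and that script then runs outside the context window. The prompt, the migration task, and the output filename are illustrative assumptions.

```python
# Sketch: ask the model for a one-shot migration script, then run it locally so
# the bulk of the work never passes through the context window.
# The prompt, task, and filename are illustrative assumptions.
import subprocess
from pathlib import Path

from openai import OpenAI

client = OpenAI()

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": (
            "Write a standalone Python script that walks ./src, replaces every "
            "print(...) call with logging.info(...), and adds the needed import. "
            "Reply with only the code."
        ),
    }],
)

code = reply.choices[0].message.content
# Assumes the reply is plain code; real use would strip Markdown fences defensively.
Path("migrate.py").write_text(code)

# The generated pipeline now runs without the assistant in the loop.
subprocess.run(["python", "migrate.py"], check=True)
```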