Best Practices for Aligning AI to your Technical Strategy
[Image: AI-generated illustration of a human programmer and an AI entity collaborating on software development, created by DALL-E.]

Developers rely more and more on AI tools. These tools make them more efficient, keep them in flow, and help them write better code. But as we come to lean on them, how do we ensure AI-generated code is aligned with our technical strategy, norms, and culture?

The 5 Best Practices

We’ll start by listing out the best practices and then spend the rest of the article digging into the motivations and nuances behind them. So, diving right in:

  1. Don’t outsource code reviews or QA to AI tools.
  2. Encapsulate architecture and technical design decisions into your codebase.
  3. Make the use of AI a more public event.
  4. Train developers on AI.
  5. Use a mix of AI tools.

What Alignment Means

First let’s talk about what AI alignment means in the context of a software development organization. It’s easiest to start with human alignment in an org: we’re all pulling in the same direction. Each team member puts the overall goals of the organization on par with their own personal goals. Ideally we each find ways to express and achieve our personal desires while working together to build something bigger and better than we could on our own. Managers and leaders play a role in helping each team member find and re-find this alignment. If the alignment gets too out of whack, the developer or the company or both can decide to part ways. This alignment is a fine balance: rarely are organizations marching along a smooth path to a clear objective, so it’s not just a matter of everyone getting in line. The best software organizations achieve alignment to a direction and vision while encouraging individuals to exercise their unique strengths in realizing that direction and vision, even if that sometimes means getting out of line.

But I digress. Let’s get more tactical and focus on technical alignment within a software development organization. This typically covers a more manageable set of things:

  • System Design
    ◦ Monolith, Microservices, Serverless
    ◦ REST, Message-based, RPC
    ◦ Architectural process (e.g. RFCs, ADRs)
  • Components and Dependencies
    ◦ Preferred data stores and systems
    ◦ Preferred cloud providers
    ◦ Managing dependencies
  • Testing and Quality
    ◦ Testing strategy and requirements
    ◦ Test in prod / staging / development
    ◦ Test-driven development
    ◦ Unit and integration testing
    ◦ Code review
    ◦ Observability
    ◦ Product usage metrics
    ◦ Documentation
  • Scalability
    ◦ Performance and load testing strategy
    ◦ Auto-scaling approach
    ◦ Current / projected real world usage metrics
    ◦ Cost constraints
  • Security and Risk
    ◦ Vulnerability detection and remediation
    ◦ Code review
    ◦ User data management
    ◦ Access control
  • Tooling
    ◦ Source code management
    ◦ Automation
    ◦ Planning
  • Technical Assertions
    ◦ “This is the way we do things here.”
    ◦ Coding style
    ◦ Choose boring technology
    ◦ API or UI first
  • Leveling and Assessment
    ◦ Skills matrix
    ◦ Interviewing / hiring
    ◦ Leveling / promotion
    ◦ Peer review
    ◦ Performance management / firing

In most organizations, a combination of formal and informal processes plays out every day to reinforce (or not) a given set of technical choices: team meetings, code reviews, Slack conversations, comments on tickets, RFCs, planning processes, pair programming, and countless others. Nowadays there is another process happening with greater frequency during software development: the use of AI coding assistants and conversational UIs. This presents a great opportunity for us to re-evaluate our approach to ensuring technical alignment.

Technical Alignment and AI

As a starting point, we can continue to use the best practices and processes we’ve already got in place since many of them take the code as the point of collaboration and governance, regardless of how that code got written in the first place (inspiration striking on a mountain top or copy/pasting from Stack Overflow). In many ways generative AI is just another tool for writing code. But we need to adapt our approach to account for five key changes that are coming about thanks to AI:

  • The volume of code is about to increase dramatically – this will push the limits of our current approaches to technical alignment. Any process that relies on code reviews will become a bottleneck.
  • Motives – the profit motive of AI providers can create misalignment. Though potentially true of any cloud or devtool provider, the scale and scope of AI’s impact on your codebase makes it especially susceptible. Think about social media as an example where the profit motive damaged our values: not because of clearly labeled sponsored posts, but because of the more insidious algorithms that over-optimized for the wrong outcome, engagement.
  • Confidently wrong – because gen AI tends to sound confident whether it’s right or wrong, we lose a signal we often get from human authors: “can you give this a quick look” vs. “I could really use your help with this.”
  • Unclear authorship – when code is written by AI but copy/pasted by a human we get a false signal on authorship. I may have a certain set of expectations of Joan when she’s writing backend code that may not apply to the portions of code she’s authored with the help of AI.
  • GenAI features in your own applications – as we start using generative AI more and more in the applications we build, application behavior will become less predictable. These models have inherent non-determinism built in, and so deterministic testing strategies won’t work.

Diving in on the Best Practices

So let’s look at five best practices we’ve been using for years and how we should think about them in the age of AI coding assistants and conversational UIs, especially given these new challenges.

Don’t outsource code reviews or QA to AI tools

Because there will be much more code, much more AI-generated code, and much more confidently wrong AI-generated code, quality checks like code reviews, automated testing, and even manual testing will become dramatically more important. If we rely exclusively on AI tools to perform these quality checks, then we are just kicking the can of alignment down the proverbial road. At a minimum, the AI tools we use to help perform quality checks should be different from the ones we use to author code (see best practice #5).

Since the rise of agile, we’ve pushed more parts of the traditional development process into quality checks. System design, security/risk assessments, technical assertions, and cultural norms are often enforced within code reviews or automated tests. On the positive side, this reduces the number of control points for software validation and alignment. But on the negative side, if we cede control and understanding of these points to our tools (e.g. AI), they are no longer effective as human control points.

Create and reinforce a culture that assigns value, importance, and prestige to rigorously reviewing, understanding, and testing system changes, regardless of where they come from. Recognize the added challenge of increasing non-determinism in systems as generative AI is used not only for building software but also as a key component of these systems.
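
As an illustration of that last point, here is a minimal sketch, assuming a hypothetical application function `summarize()` that calls a generative model: because the wording of the output can vary between runs, the test asserts properties any acceptable answer must satisfy rather than comparing against an exact string.

```python
# A minimal sketch, assuming a hypothetical `summarize(text)` function in your
# application that calls a generative model and therefore can't be tested with
# exact-match assertions. We check properties instead: length, preserved facts,
# and absence of disallowed content.
from myapp.ai import summarize  # hypothetical module and function


def test_summary_properties():
    source = (
        "Order #1042 shipped on 2024-03-18 via ground freight "
        "and is expected to arrive within five business days."
    )
    summary = summarize(source)

    # Property 1: the summary is shorter than the source.
    assert len(summary) < len(source)

    # Property 2: key facts survive summarization.
    assert "1042" in summary

    # Property 3: the model didn't invent disallowed content.
    assert "refund" not in summary.lower()


def test_summary_is_stable_enough():
    # Repeat the same prompt and require the properties to hold on every run,
    # even though the exact wording may differ from run to run.
    source = "Order #1042 shipped on 2024-03-18 via ground freight."
    for _ in range(5):
        assert "1042" in summarize(source)
```

This kind of property-based check doesn’t eliminate non-determinism, but it turns “the output looks reasonable” into something a reviewer and a CI pipeline can actually verify.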

Encapsulate architecture and technical design decisions into your codebase

AIs today mainly use the current codebase along with developer prompts as their context when answering questions or suggesting code. This includes a lot of valuable content (READMEs and comments as well as the code itself), but misses many key inputs that are more ephemeral in nature (whiteboards, design docs, UI mockups, roadmaps, legacy dependencies, and team conversations).

With developers carefully prompting and closely guiding AI assistants, those ephemeral inputs can be taken into account. But as our use of AI scales, they increasingly get overlooked, resulting in code that strays from our technical norms: needlessly introducing new dependencies, making code less readable, introducing technical risks or security vulnerabilities, or impacting scalability.

Our first best practice above acts as a way to catch these issues after the fact, once code has been written but before it’s been merged and deployed. But we should also strive to improve the context upon which AI assistants operate. ChatGPT, Copilot, Gemini, and others are outdoing each other by increasing the size of the context window. But that’s only part of the battle; if we don’t fill that context window or if we fill it with the wrong content, like outdated Google and Notion docs, we’ll get suboptimal results (the old garbage in, garbage out issue). The knowledge contained within foundational models goes a long way to combat this. But it can’t make up for missing or inaccurate company and organization-specific context.

What to do? We actually need a mechanism for “publishing” the parts of these ephemeral architectural artifacts that matter. Code, config, and tests all get “published” to source control today and can serve as context for AI prompting. If there are non-code inputs we want everyone to use for context, we should publish those to the same source control system as well. Open source projects do this already with files like README.md, ARCHITECTURE.md, and contributor guides. Other good approaches include the use of RFCs (requests for comments) or ADRs (architectural decision records), though both of these are typically point-in-time documents rather than evergreen system descriptions.
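
As a rough sketch of what this “publishing” enables, the snippet below gathers a repository’s README, ARCHITECTURE.md, contributor guide, and ADRs into a single context block before prompting an assistant. The file locations and the `ask_assistant` callable are assumptions for illustration, not any particular tool’s API; the point is that whatever lives in source control can be fed to a model systematically, while whiteboards and Slack threads cannot.

```python
# A minimal sketch: collect the non-code artifacts that were "published" to
# source control and prepend them to a prompt. Paths are illustrative defaults.
from pathlib import Path

CONTEXT_FILES = ["README.md", "ARCHITECTURE.md", "CONTRIBUTING.md"]
ADR_DIR = "docs/adr"  # assumed location for architectural decision records


def build_context(repo_root: str = ".") -> str:
    root = Path(repo_root)
    parts = []
    for name in CONTEXT_FILES:
        path = root / name
        if path.exists():
            parts.append(f"# {name}\n{path.read_text()}")
    adr_dir = root / ADR_DIR
    if adr_dir.is_dir():
        # Include every ADR so both current and superseded decisions are visible.
        for adr in sorted(adr_dir.glob("*.md")):
            parts.append(f"# {adr.name}\n{adr.read_text()}")
    return "\n\n".join(parts)


def ask_with_context(question: str, ask_assistant) -> str:
    # `ask_assistant` is whatever client call your chosen AI tool provides.
    prompt = build_context() + "\n\nQuestion: " + question
    return ask_assistant(prompt)
```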

In general, with these types of non-code documents, we’ll face the inevitable challenge of keeping them up to date even as the system changes. The Code as Documentation philosophy attempts to address this challenge by eschewing extraneous specifications, architecture docs, and non-code artifacts. But in a world where more and more of the code is generated using AI, we need to make the ongoing investment in maintaining human-authored versions of our intent. We might even imagine a day when most code is written by AI agents; in this future vision, specifications and architecture/design docs become the primary artifacts that we humans author and maintain.

Make the use of AI a more public event

Today, AI coding assistants and conversational interfaces are mainly used in private by a single individual. While developers may occasionally share their sessions with others, it’s more common for them to engage with AI privately and then share their resulting work product: code, comments, or documentation.

Because of the signaling challenges of unclear authorship noted above, this approach is not ideal from an organizational perspective. When John submits a PR, it helps reviewers and colleagues to know how much thought and consideration John put into his work. Based on past interactions, colleagues can make some reasonable assumptions about where John is an expert, where he’s more of a novice, where his strengths lie, and where a more thorough review may be needed. Use of AI coding assistants can invalidate those assumptions.

Many great engineering organizations favor transparency in their work. For example, technical and product discussions in public Slack channels and shared docs are preferred over DMs and private conversations. This increased transparency leads to better results in two ways: 1) decisions are more likely to be made with full context, and 2) communicating in the open brings more sharpness to our thoughts.

If we think of AI as just another tool, then it shouldn’t matter whether developers engage with it publicly or privately… does it matter if teammates know you use VSCode vs. Emacs? No. But coding AIs are somewhere between just another tool and a full-fledged colleague. They certainly have the power to influence our thinking and resulting work product more than “just another tool” could. Given this, we should err on the side of collaborating with coding AIs more transparently.

Developers can face an emotional challenge when doing this: you can ask an AI any question without judgment. Even if you feel you’re expected to be the expert on a topic, you can safely ask an AI about the basics of that topic. Move all of those conversations into a public forum and a lot of that safety goes away. So a subtle and thoughtful approach is needed. Start by relying on another tried-and-true practice, pair programming, and extend it to include pair prompting. Incorporate AI into existing public channels through integrations like the ChatGPT app for Slack. And don’t force all AI conversations to be public, but work to make it safer and more normalized for your team to ask and prompt AIs in more open settings.

Train developers on AI

We often expect our developers to build expertise with their tools on their own. But this results in missed opportunities for creating more alignment across our team and with the AI agents they use. As noted, AI is more than just another tool. It’s a productivity booster, a subtle influencer, a workflow change, and a potentially large dependency all rolled into one.

As team members gain benefits from their use of AI tools, ask them to reflect on their experiences and think about what works well, what doesn’t, and what creates the potential for future challenges. Support brown bags and other forms of knowledge sharing among colleagues to improve everyone’s mindful use of these tools. Even just planting the seeds of thoughtful usage with your team members can have a positive influence on alignment.

Don’t shy away from documenting guidelines and best practices. If you don’t want employees sharing proprietary code or secrets with AI bots, or want to limit usage to certain vetted AI tools, let them know. If you want to require them to use corporate accounts instead of personal accounts, make it so. New technology can be exciting to use, but it’s important to put guardrails in place that align that usage with the needs of the organization.
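
One lightweight way to reinforce such guardrails is to make the policy machine-readable and check it automatically rather than only documenting it. The sketch below is purely illustrative; the tool names, fields, and domain are made-up examples, not recommendations of specific products.

```python
# A minimal sketch of a machine-readable AI usage policy plus a check that
# could run in CI or an internal CLI. Tool names and fields are examples only.
APPROVED_TOOLS = {
    "copilot-business": {"requires_corporate_account": True},
    "chatgpt-enterprise": {"requires_corporate_account": True},
}


def check_usage(tool: str, account_domain: str,
                company_domain: str = "example.com") -> list[str]:
    problems = []
    policy = APPROVED_TOOLS.get(tool)
    if policy is None:
        problems.append(f"{tool} is not on the vetted tool list")
    elif policy["requires_corporate_account"] and account_domain != company_domain:
        problems.append(f"{tool} must be used with a corporate {company_domain} account")
    return problems


if __name__ == "__main__":
    for issue in check_usage("chatgpt-enterprise", "gmail.com"):
        print("Policy issue:", issue)
```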

Finally, ensure you’ve got support channels in place for the use of these tools. Set up a Slack channel where people can turn for help with guidelines and best practices. Create clear responsibilities for managers and other leaders to help everyone on the team get the most out of AI.

Use a mix of AI tools

While it will increase costs, using a mix of AI tools will help mitigate some of the effects of misalignment. Different tools, especially if they rely on different underlying models, can have different behaviors with the biases and errors of one sometimes canceling out or highlighting the biases of another. Specifically, avoid using the same tool you use for authoring code when reviewing code. Your code reviews should ideally be a human-heavy process anyway, but using a different AI tool (and model) to summarize and explain aspects of the code to a reviewer can augment the human review process.
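
To make that separation concrete, here is a minimal sketch with placeholder model identifiers and a generic `complete(model, prompt)` callable standing in for whichever provider clients you actually use: the authoring path and the review path are wired to different models so the same blind spots are less likely to appear on both sides of the check.

```python
# A minimal sketch of routing authoring and review through different models.
# `complete(model, prompt)` stands in for your provider clients; the model
# names below are placeholders, not specific product recommendations.
AUTHORING_MODEL = "provider-a/code-model"    # used while writing code
REVIEW_MODEL = "provider-b/general-model"    # deliberately a different model


def draft_code(task_description: str, complete) -> str:
    return complete(AUTHORING_MODEL, f"Write code for: {task_description}")


def summarize_for_reviewer(diff: str, complete) -> str:
    # The output goes to a human reviewer; it augments their review rather than
    # replacing it.
    prompt = (
        "Summarize this diff for a human reviewer. Flag new dependencies, "
        "security-sensitive changes, and anything inconsistent with the "
        "surrounding code:\n" + diff
    )
    return complete(REVIEW_MODEL, prompt)
```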

Encourage developers on your team to also use multiple tools when designing, coding, or testing your system. If there ever is a true AI uprising, our only hope will be to pit AI vs. AI as we watch from a safe distance.

Extend What’s Working Today

In general, these recommendations are meant to extend engineering best practices that you’re likely already following. The addition of AI tools into the mix perhaps just creates more urgency and need for scale. Another way to approach the technical alignment challenge is to ask yourself, “What people, processes, and tools would I need to add or change if my engineering org were to double in size next week?” Though it may take more than a week, change is coming. And the typical indicators of organizational scale, like engineering headcount growth, may not clearly reflect the size and scope of this coming change.
