Navigating the Hype: Realities of GenAI in Software Development and beyond

Just read a post by Nils Müller-Sheffer on “inflated expectations of GenAI in SW-Development”. The author formulates five hypotheses for using GenAI in the Software Development Lifecycle [1].

I can relate to the basic proposition that GenAI will not (!) render the developer superfluous. As he puts it, “any GenAI output must be owned by a human.” I believe this is true for many use cases of GenAI, extending beyond Software Development.

GenAI is a tool that enhances productivity and, to a certain extent, creativity. In my opinion, a “human-in-the-loop” approach is necessary for most of the use cases I’ve encountered in recent months.

Thus, we need to consider the appropriate and responsible design of task distribution between humans and AI applications (“Angemessene und verantwortungsvolle Gestaltung der Aufgabenverteilung zwischen Mensch und KI-Anwendung”). This has been defined as an important criterion for building trustworthy AI applications, as highlighted by Fraunhofer IAIS in their excellent work on the trustworthiness of AI [2].

So, what are effective patterns for GenAI usage? What does automation with a “human-in-the-loop” look like? We need patterns for “attended automation”. That means building UI and workflow integration into our AI systems, which may require frontend development skills not typically found in AI teams.
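As a minimal sketch of what one such “attended automation” step could look like (all names here are mine, not from the post): the AI drafts, a human reviews and owns the result, and only approved output leaves the loop.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    task: str
    content: str


def attended_step(task: str,
                  generate: Callable[[str], str],
                  review: Callable[[Draft], Optional[str]]) -> str:
    """One attended-automation step: the AI drafts, a human owns the result.

    `generate` stands in for any GenAI call; `review` is the human gate and
    returns the (possibly edited) content, or None to reject the draft.
    """
    draft = Draft(task=task, content=generate(task))
    approved = review(draft)
    if approved is None:
        raise ValueError(f"Draft rejected for task: {task!r}")
    return approved  # only human-approved output leaves the loop


# Usage with a stubbed generator and a reviewer who edits the draft:
result = attended_step(
    "summarize release notes",
    generate=lambda t: f"[AI draft for: {t}]",
    review=lambda d: d.content + " (reviewed)",
)
```

In a real system, `review` would be backed by a UI surface (diff view, accept/edit/reject buttons) rather than a callback, which is exactly where the frontend skills come in.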

I agree with Nils Müller-Sheffer that a larger context window will not suffice to make developers obsolete (Hypothesis 3). However, in my opinion, larger context windows will significantly enhance our use of AI systems. Currently, we struggle with small context windows. Fine-tuning requires a lot of training data (not always available). RAG faces many challenges similar to those seen in semantic search. Larger context windows could simplify many tasks for the AI system.

Nils Müller-Sheffer points out that even with larger context windows, we need to direct the GenAI system to the right “context spots” for given tasks. Again, I believe this is also true for many use cases beyond software development. Although there are promising approaches to handling context and limited context windows automatically [3], user guidance in steering the context usage of a GenAI system seems essential. Instead of trying to remove the human from the system, we should aim to enhance their interaction with it.
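One way to read “user guidance in steering the context usage”: keep the automatic ranking, but let the user pin or exclude snippets before the context window is packed. A rough sketch (function and variable names are mine, and the word count is only a crude stand-in for real token counting):

```python
def build_context(candidates, pinned, excluded, budget):
    """Pack the context window: human-pinned snippets go first, then the
    remaining automatically ranked candidates, skipping exclusions, until
    the (rough, word-count based) budget is spent."""
    ordered = ([c for c in candidates if c in pinned]
               + [c for c in candidates if c not in pinned and c not in excluded])
    context, used = [], 0
    for snippet in ordered:
        cost = len(snippet.split())  # crude stand-in for a token count
        if used + cost <= budget:
            context.append(snippet)
            used += cost
    return context


# Usage: the ranker proposed three snippets; the user pins one, drops one.
ctx = build_context(
    candidates=["a b c", "d e", "f g h i"],
    pinned={"d e"},
    excluded={"f g h i"},
    budget=5,
)
```

The point of the sketch is the division of labour: retrieval and ranking stay automated, while the human steers which “context spots” the model actually sees.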

With Hypothesis 5, Nils Müller-Sheffer states that code generation is the least interesting use case in the SDLC. Similarly, one could argue that focusing on simple steps in business processes won’t generate significant value. However, aiming to completely automate the human task is not only an “inflated expectation” but also linear thinking. I believe the value of GenAI lies in a different interaction between human and system. We need to explore and enable this interaction in new ways to guide GenAI and assist us with tasks at hand. A simple “do-my-job” prompt won’t suffice.

So, do we have inflated expectations regarding AI? - Absolutely. Will we experience a trough of disillusionment? - Most likely.

Instead of asking whether we’ll see a hype cycle, a better question might be, “What are the patterns evolving for the plateau of productivity?”

A hypothesis-driven approach might be helpful!

[1]: Nils Müller-Sheffer, "Cutting the Crap: Inflated Expectations of GenAI in Software Development and Why They Are Unfounded." https://www.dhirubhai.net/feed/update/urn:li:activity:7175511914867740672?updateEntityUrn=urn%3Ali%3Afs_feedUpdate%3A%28V2%2Curn%3Ali%3Aactivity%3A7175511914867740672%29

[2]: Dr. Maximilian Poretschkin et al., "Leitfaden Zur Gestaltung Vertrauenswürdiger Künstlicher Intelligenz - KI Prüfkatalog." https://www.iais.fraunhofer.de/content/dam/iais/fb/Kuenstliche_intelligenz/ki-pruefkatalog/202107_KI-Pruefkatalog.pdf

[3]: Packer, Charles, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, and Joseph E. Gonzalez. "MemGPT: Towards LLMs as Operating Systems," October 12, 2023. https://arxiv.org/abs/2310.08560.

Nils Müller-Sheffer

Software and Platform Engineering Lead (DACH)

5 months ago

First of all, thanks for reading my post and sharing your view, Rupert! I believe this is so valuable and I hope other peers and thought leaders will follow suit! We need the real-life expertise of people who early-adopt the tools, have a good understanding of their strengths and weaknesses, and can see a little bit around the bend. Reading your thoughts (I am on board with everything you say), what resonates most with me is: "... aiming to completely automate the human task is [...] linear thinking. I believe the value of GenAI lies in a different interaction between human and system." This rings very true. I am not sure if this contradicts my first hypothesis, "GenAI is not a replacement of existing processes or tools, or a 'revolution' in the way software is built. It is an evolution." We need to discuss this further (and a bit of "wait-and-see", I guess), but that is why I chose hypotheses as the structure. They may be wrong. Lastly, I do hope we find other collaborators to formulate the positive set of hypotheses on "what will be the most useful and near-term achievable" model of human-AI collaboration. I am very optimistic in this regard, actually!

Michael Beckmann

Lecturer at Baden-Wuerttemberg Cooperative State University (DHBW)

5 months ago

Rupert, thanks for the clarification!
