The New Shiny
Images generated by a Stable Diffusion model



As many others here have done, I looked at the beginnings of ChatGPT early on. There was a blurb on a research paper, a video on Two Minute Papers, and other tech demos which sparked my interest.

Leading up to this we had seen “traditional” approaches. I gave a talk on TensorFlow some years ago and it seemed almost magical how quickly we could develop an object recognition application. In the course of two hours we successfully built a facial recognition tool for the class. Compared with older methods (at the time) it seemed groundbreaking. Around that same time, I gave a session to our LUG about GPU enablement with literally a header change to some C code and how it could dramatically speed up R stats code.

Fast forward a couple of years and other team members and I were in a hackathon using an online service that could do this in moments. In the space of a few hours we’d built a sentiment analysis app and demoed it to the participants. Others in the event built things such as a fashion analysis app, a Terminator HUD prototype and other cool stuff. The other teams included high schoolers and teams-of-one. The fact that we could build working prototypes in a few hours was eye opening. I even had the chance to participate in a hackathon on Microsoft’s campus in Seattle where their AI tools were made available as just another web service. I’m definitely a hack when it comes to writing code, but it was easy even for me.

And then the diffusion models hit.

This was incremental, of course. Many of us have taken the Stanford online data science and machine learning classes. We went from ways to aggregate and analyze “big” data (now it’s just data) with regression models (linear, cubic, etc.) to gradient descent to clustering approaches. At the tail end of these lectures were talks about newer tools such as GANs. I sort of understood how GANs worked based on the previous experimentation, but it was rapidly becoming too complex for a weekend deep dive and for this bear of very little brain. Up to this point I’d been working with manually curated training data. It was doable but tedious. A GAN took away the human (meaning me) and allowed the training to proceed without me manually tweaking anything. The idea seemed simple: instead of me marking something as good or bad, have one model try to trick another model. When the second model could no longer distinguish real from fake, the first was promoted. Both got better with each iteration.

Diffusion models seemed to build on this. Start with noise then iteratively make the space less lossy (in a loss function context). The “correctness” of the model would diffuse through the random space until it was indistinguishable from a real input. Maybe this understanding is flawed and at the very least overly simplistic, but it was just enough to dispel notions of it being magical despite the seeming incomprehensibility.
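To make that intuition concrete, here is a toy sketch in Python. It is not a real diffusion model: a real one trains a neural network to predict the noise to remove, whereas this sketch simply assumes the answer (the hypothetical `target` list) is already known. The function name `toy_denoise` and its parameters are made up for illustration. All it shows is the "start from noise, refine a little each step" loop.

```python
import random

def toy_denoise(target, steps=50, step_size=0.2, seed=0):
    """Start from pure noise and nudge each value toward `target`.

    Hypothetical toy: `target` stands in for what a trained model
    would predict. A real diffusion model has to learn that prediction.
    """
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in target]  # pure Gaussian noise
    losses = []
    for _ in range(steps):
        # Move each value a fraction of the way toward the target.
        sample = [s + step_size * (t - s) for s, t in zip(sample, target)]
        # Mean squared error between the current sample and the target.
        mse = sum((t - s) ** 2 for s, t in zip(sample, target)) / len(target)
        losses.append(mse)
    return sample, losses

sample, losses = toy_denoise([1.0, -2.0, 0.5])
print(losses[0], losses[-1])  # the loss shrinks as the steps proceed
```

Each pass shrinks the remaining error by a fixed fraction, so the noise gradually gives way to the target, which is about as far as the analogy should be pushed.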

I spent hours as a user playing with Stable Diffusion, Midjourney and other generators. This seemed groundbreaking and particularly relevant to me. I am aphantasic, meaning that I do not process or generate visual imagery as most others do. These tools allowed me to describe something and have the computer generate the image that I wish I could imagine. Wow.

Enter GPT and ChatGPT.

At introduction there was a tremendous amount of enthusiasm and lots of over-enthusiasm. But it was -- and is -- groundbreaking. It was no longer the demesne of researchers and coders and weekend ML warriors, but was available to everyone with a browser. It could write descriptions of images, summarize websites, explain the themes of short stories, translate from language to language. I imagine that for non-technical folks who have not been exposed to the previous scaffolding it would seem magical.

And there’s the pitfall. As IT folks we know how these developments are incremental. There was no tearing away of the curtain because we were already behind the curtain. We had directed goals in our use of these stepping stone tools. Whether it was leveraging CUDA to more quickly analyze a large data set, or using an anomaly detection tool to detect unusual network activity, or answering questions in a chatbot, we saw AI/ML as mere tools and not magic.

And it’s difficult to be the wet blanket when everyone wants to throw money and resources at the new shiny. We saw this with containers and Kubernetes and serverless. Without understanding the underlying technology we tend to be either overly enthusiastic or fearful or, worse, immediately dismissive.

Wouldn’t it be cool to use these new tools to make technology more approachable to a wider audience? Absolutely. If it allows a robotics researcher to short-cut learning how to open a file in a new language then it has saved the most precious commodity of time. If it enables a group of high school students to build a social media app by suggesting appropriate frameworks then it’s useful. If it allows artists to build an immersive installation to realize something in their imagination then it’s useful.

Example of auto-generated Ansible code using an AI assistant


Are the generated artifacts testable? Probably not with traditional methods. That is, if we are expecting a reproducible artifact from a fundamentally stochastic process, we’re not understanding the tool. Does this materially affect where the tool should be applied? Yes, very likely. If I need reproducible, verifiable and certifiable results then I probably need a different tool or a reset of expectations.
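One way to picture that reset of expectations is the small Python sketch below. The `generate` function is a hypothetical stand-in for a stochastic generator, not any real AI assistant's API, and the word list is invented. The assumption worth flagging is the seed: many hosted tools do not expose one, in which case only the second style of check (testing properties rather than exact output) applies.

```python
import random

WORDS = ["install", "configure", "restart", "verify"]  # invented vocabulary

def generate(rng, length=5):
    """Stand-in for a stochastic generator: samples words at random."""
    return [rng.choice(WORDS) for _ in range(length)]

def is_valid(output, length=5):
    """Property check: correct length and only known words."""
    return len(output) == length and all(w in WORDS for w in output)

# Same seed: identical output, so a traditional exact-match test works.
a = generate(random.Random(42))
b = generate(random.Random(42))
assert a == b

# Different seeds: outputs may differ, but each one still satisfies the property.
runs = [generate(random.Random(seed)) for seed in range(20)]
assert all(is_valid(r) for r in runs)
```

The exact-match assertion only works because the randomness is pinned down. The property check is weaker, but it is the kind of test that survives a stochastic process.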

In short, enthusiasm is great when cultivating mindshare. It’s great when it allows you to think of a goal rather than impediments. Fear, when manifested as caution, is great when evaluating what a new tool can and cannot do, especially when hard dollars are involved. But being dismissive without understanding the benefits and disadvantages is what brought down more than a few multinational concerns.

On a side note, I was watching a YouTube video about how some expert pilots had averted a disaster by being experts. On an evaluation flight there was a series of mishaps that seemed almost comical, including failed engines (plural), bad sensors and bad weather. One specific idea stuck: instead of listing all the failures, the flight crew looked at what was working. What did they have at their disposal to land the plane? My takeaway is to see the new shiny things for what they are: tools. See the possibilities, but temper them with appropriate caution.
