Game GANs


I was on a call with a group of CTOs working on advanced new AI models. I'm not one of them; my interest in neural networks died of boredom back in the 1980s. I was invited as a resident authority on gaming, crypto, and media technologies. During the introductions, I said I was just there to learn about the state of the industry. What followed, however, surprised me: a lot of discussion about how limited and stuck modern AI models are. Apparently, the magic feature of generative AI models is that they can train themselves given enough properly classified information to learn from. The internet provides a near-bottomless supply of photos surrounded by related text, and it was possible to train an AI on the scale of ChatGPT because the adversarial network had a bottomless supply of contextualized photos and text to draw on. In many other areas, we don't have such a vast, qualified supply of human-characterized training data to draw on.

Take the idea of creating AIs that can code, for example. It's wonderful that we have this vast sea of open-source code on GitHub to draw on, but there isn't a vast body of qualifying information about WHAT that code is supposed to do that would enable a GAN to self-train against it. It needs lots of working code and a lot of context for WHAT that code is supposed to do in order to "learn" from it automatically. In a sense, ChatGPT is a novel achievement in highly specialized AI functionality given that it was born and raised inside a solitary confinement cell where its exposure to learnable information was highly constrained. It's a sort of technological Helen Keller story.

I had an epiphany during the conversation: "Oh, that's why it's simultaneously able to produce these amazing works of art while never quite managing to get hands or eyes right. ChatGPT also appears to be able to interpret text but can't read it in images. It really doesn't understand depth or lighting!" Hands are famously hard for humans to learn to draw because they are such complex 3D objects with very complex lighting properties. What followed was some discussion about how to train an AI to perform photogrammetry in order to bring AI into our world of 3D depth and interactivity.

One famous CTO discussed creating a reward system to build a network of mobile phone users who would use their phones to scan 3D scenes and objects and tag them to help train a depth- and lighting-aware GAN. As soon as he said it, I realized that I knew how to solve the problem without any of that. Since I've written enough patents for one lifetime, I'm just going to put it out there.

Here is how to construct a self-training GAN that can extract 3D depth and lighting data from any arbitrary video. Use a modern game engine or photorealistic renderer to generate random scenes of 3D objects, materials, and lighting. Fly the camera around the scene randomly and record it all as a video. Automatically tag the video with the parameters the renderer used to generate the scene. Use the video to train the adversarial CNN models. Now you have an infinite supply of highly qualified 3D video suitable for unsupervised learning. Better still, the video is very compact to store and train with because it can be regenerated from its parameters on demand, vastly reducing the storage and network infrastructure required to train such a model.
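
To make that concrete, here is a minimal sketch of the generator loop in Python. Everything in it is illustrative: `SceneParams`, `render_scene()`, and the parameter ranges are my assumptions standing in for whatever engine (Unreal, Unity, Blender) actually builds and films the scene.

```python
import json
import random
from dataclasses import dataclass, asdict

@dataclass
class SceneParams:
    """Everything needed to regenerate one random scene deterministically."""
    seed: int              # drives all the randomness below
    object_count: int      # how many random meshes to scatter
    light_count: int       # how many random lights to place
    camera_keyframes: int  # length of the random camera fly-through
    fov_degrees: float     # random camera field of view

def random_scene_params(seed: int) -> SceneParams:
    rng = random.Random(seed)
    return SceneParams(
        seed=seed,
        object_count=rng.randint(5, 50),
        light_count=rng.randint(1, 8),
        camera_keyframes=rng.randint(60, 600),
        fov_degrees=rng.uniform(40.0, 90.0),
    )

def render_scene(params: SceneParams):
    """Hypothetical engine hook: build the scene from params, fly the
    camera, and return per-frame RGB plus ground-truth depth, normals,
    and light data. Wire this to Unreal, Unity, or Blender in practice."""
    raise NotImplementedError

def generate_training_sample(seed: int) -> dict:
    params = random_scene_params(seed)
    video, ground_truth = render_scene(params)
    return {"params": asdict(params), "video": video, "labels": ground_truth}

# The storage trick from the paragraph above: the dataset "is" just the
# parameter records; any sample can be re-rendered from them on demand.
dataset_index = [json.dumps(asdict(random_scene_params(s))) for s in range(100_000)]
```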

The resulting GAN should be able to do things like watch all the videos on the internet and extract the models, textures, lights, and materials from them. It should also be able to generate very realistic, entirely artificial videos.

I've done a ton of work on artificial vision systems over the years, and the problem with seeing things in drone or satellite video is never actually having enough sample video to test with. The way I've always solved it is by building a virtual model in a game engine like Unreal or Unity first and using that to generate training video for the vision system. This is classical CNN training stuff. Because you know the absolute location of every object and feature in the virtual scene, you can automate measuring how well any human-authored algorithm performs at identifying it. It's a natural step further to apply the idea to training a GAN.
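
To make the "automate measuring" step concrete, here is a small, self-contained sketch of scoring a detector against the engine's exact ground truth using intersection-over-union on 2D bounding boxes. The box format and threshold are my assumptions, not anything from the original workflow.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def score_detector(detections, ground_truth, threshold=0.5):
    """Fraction of ground-truth objects the detector found.

    ground_truth comes straight from the engine's scene graph, so the
    score is exact -- no human labeling involved anywhere.
    """
    hits = sum(
        1 for gt in ground_truth
        if any(iou(gt, det) >= threshold for det in detections)
    )
    return hits / len(ground_truth) if ground_truth else 1.0
```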

We humans have figured out how to generate very realistic artificial 3D worlds for games and movies. We don't have to use real tagged video to train a depth-aware GAN when we have very powerful tools for randomly generating an infinite body of realistic, perfectly tagged 3D video.

The same idea works for teaching a GAN to read text from the real world. Generate text in random fonts against random backgrounds, bake it into textures, and place those textures in a 3D scene. Fly the virtual camera around the scene to produce a perfectly tagged video, then train the GAN on that tagged video.
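
As a flat 2D starting point, here is a hedged sketch using the Pillow imaging library to render random strings that come pre-tagged with their own transcriptions; mapping the results onto textures inside the 3D scene and filming them is left to the engine.

```python
import random
import string
from PIL import Image, ImageDraw, ImageFont  # pip install pillow

def random_text_sample(width=640, height=480, seed=None):
    """Render random text on a random background; the label is free.

    A flat 2D sketch -- in the full pipeline this image would be mapped
    onto a surface in the 3D scene and filmed by the flying virtual
    camera, so perspective and lighting distortions come along too.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits + " "
    text = "".join(rng.choices(alphabet, k=rng.randint(5, 40)))
    bg = tuple(rng.randint(0, 255) for _ in range(3))
    fg = tuple(255 - c for c in bg)  # keep the text legible against bg
    img = Image.new("RGB", (width, height), bg)
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()  # swap in random .ttf files for variety
    x, y = rng.randint(0, width // 2), rng.randint(0, height // 2)
    draw.text((x, y), text, fill=fg, font=font)
    return img, text  # the image plus its perfect transcription tag
```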

Speech is a little more interesting to think about, because when we watch a movie the dialog is obviously contextual to the entire movie, not just the frame of video the speech occurs in. Before a GAN could be trained to recognize and author plausible movie dialog, it would need to understand enough about the overall context of the movie and its setting to have a real chance at producing something plausible. So I would theorize that, in addition to solving some other GAN training problems, you need to solve the 3D lighting and depth one in order for the GAN to contextualize movie dialog with the movie's video content. Fortunately, we already have great speech-to-text technology, and we have GANs that understand text, so you could watch all YouTube and Twitch videos, extract all the speech into text, and then tag the original video with that text to help the 3D-aware GAN relate the scene context to what is being said.
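
For the transcription leg, one way to do it today is OpenAI's open-source Whisper model. The sketch below assumes the `openai-whisper` package and simply pairs each speech segment with its timestamps so it can be lined up against the frames (and thus the reconstructed 3D scene) it occurred in.

```python
import whisper  # pip install openai-whisper (also needs ffmpeg installed)

def tag_video_with_speech(video_path: str) -> list[dict]:
    """Transcribe a video's audio track into timestamped text tags.

    Each segment pairs a stretch of speech with its start/end times, so
    a downstream 3D-aware model can relate the dialog to the scene
    content visible at those moments.
    """
    model = whisper.load_model("base")   # larger models transcribe better
    result = model.transcribe(video_path)
    return [
        {"start": seg["start"], "end": seg["end"], "text": seg["text"]}
        for seg in result["segments"]
    ]
```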

It follows that in order to create AIs that can truly relate to humans on our terms, we need a way to train them to understand our physical world. Doing that physically, however, would require a lot of expensive robots. A better approach might be to train them inside a game world with game physics and materials: give the AI a virtual robot body to interact with virtual 3D objects, lighting, and physics in order to train it to relate to our reality automatically.
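
Sketched against the standard Gymnasium interface, that setup might look like the loop below. `"VirtualRobotWorld-v0"` is a made-up environment id standing in for any engine-backed embodied environment (Unity ML-Agents, Isaac, and the like expose similar interfaces).

```python
import gymnasium as gym  # pip install gymnasium

# "VirtualRobotWorld-v0" is hypothetical: a game-engine environment that
# gives the agent a robot body, camera observations, and physics.
env = gym.make("VirtualRobotWorld-v0")
obs, info = env.reset(seed=42)
for _ in range(1_000):
    action = env.action_space.sample()   # replace with a learned policy
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```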

Anyway, here's a freebie for you AI guys trying to figure this out. I'm not an expert on adversarial AI yet, but... give me five minutes. I've got a thought on how to generate actual games this way as well, but that's another article.

