Microsoft + Netflix + OpenAI
Is it just me, or is Satya Nadella putting together a team?


I started writing this article based on a question I have been receiving a lot in recent months.

Why are 'generative art' platforms attracting so much investment?

Answering this simple question sent me back through my notes like Charlie Day in the infamous Pepe Silvia skit. I realised that to answer it properly I would need to make the value of these AI projects and research papers a little more tangible, and their benefits more measurable, and that would require a different format to my recent posts online.


Why are so-called generative art platforms attracting so much investment?

That’s a question I’ve been asked several times in recent months, and to be totally honest, I completely understand the confusion.

First of all, a better question would be 'What is generative AI?'

Gartner defines generative AI as AI that “learns from existing content artifacts to generate new, realistic artifacts that reflect the characteristics of the training data, but do not repeat it.” In simple terms, it can produce entirely new content, such as images, videos, text, audio and code, from very simple inputs.

Although I try to share one or two generative AI-related projects per week (particularly those that lend themselves to the design studio), I appreciate how overwhelming it can be to monitor and review every machine learning development with the time and attention it deserves, given the volume of information published every week.

Within my creative circles, most of the interest has understandably focused on image-generation tools. While text-to-image (T2I) platforms have indeed opened up new ways to explore design, my concern is that most of my peer group are concentrating on one area at the expense of the wider benefits AI can offer. To understand why this space is attracting investment, it is important to appreciate that the real value of these paradigm-shifting platforms lies in their underlying technology. Recent advances in neural networks (NNs), large language models (LLMs), natural language processing (NLP), diffusion models, generative adversarial networks (GANs), transformers and so on are the driving force behind the T2I revolution, but these technologies are capable of far more than image creation alone.

Appreciating the wider ecosystem helps justify the level of enterprise investment, and that extra knowledge also makes it easier for creatives to decide which AI tools and models to include in their own pipeline: primarily to enhance the creative output they enjoy, but also to automate, simplify and potentially replace the more mundane tasks. Members of my own team have already used AI-powered services to optimise how they generate pitch decks, tax reports, training syllabi, support responses, technical writing, blogs, software plugins and even patent templates, all of which results in more time for creativity.

Although every tech revolution has casualties, in my opinion these technologies will replace tasks, not people. If you're a novice in a given field (not just design), AI can accelerate you to the point of productivity. If you've already done your '10,000 hours' and possess significant domain expertise and experience, these tools can make you an order of magnitude more efficient. In my opinion, this is more of a 'rising tide lifts all boats' event and less of a job tsunami, and I believe that rising tide will reach the shoreline of every sector that values efficiency.

Tangible Examples

With two of the biggest players, Stability AI and Midjourney, dominating the headlines with the releases of Stable Diffusion 2.1 and V4 respectively, the generative AI space shows no signs of slowing. Although I'm a user of both, ignore OpenAI at your peril. With a portfolio of projects including #Dalle2, #GPT3, #WebGPT, #Whisper, the Playground, and #GPT4 seemingly just around the corner, OpenAI still presents the most compelling, enterprise-focussed offering despite the competition.

Don't take my word for it: Replit and Jasper.AI, both built directly on top of GPT-3, already boast valuations of around $1B. Although not universally loved, GitHub Copilot has fundamentally disrupted the way programmers write code, and Microsoft has already integrated OpenAI functionality into other solutions under its umbrella, such as Bing, Designer and Excel. The Azure OpenAI Service introduced last year, however, provides the clearest signal of Microsoft's long-term intent.

Short-Term Concerns

I also have to caveat the optimism of this article (and some of my recent posts) by firmly stating that I don't believe all generative AI applications are created equal. With a rising number of solutions providing tangible value to their users, the advances in AI, particularly in NLP, diffusion models and LLMs, are very real. That said, I expect the vast majority of projects that emerge in the coming months won't have much impact at all. For every Jasper, Adept or Copilot offering genuine utility, there will be another twenty (some good, some bad, some scams) destined for short-term hype and eventual failure. That's just the nature of any perceived gold rush: when society seeks a quick buck (be that .com, crypto, metaverse or tulips), vaporware inevitably follows. So I encourage us all to apply some critical thinking to the space and identify the 'picks and shovels' platforms that can provide you and your organisation with demonstrable utility. I also completely understand the deep-rooted public concerns and scepticism surrounding many of the generative AI platforms that have dominated the conversation. In addition to the genuine ethical debates that require continued discussion, many of us of a certain vintage still suffer from the PTSD inflicted by our first introduction to an 'AI assistant'... Clippy.

[Image: Clippy. I still don't trust him.]

Although many headlines might have you believe otherwise, the emergence of accessible AI models is neither messianic nor the herald of disaster. The polarisation we are witnessing is a social phenomenon that occurs during every major technological shift that affects all of us. Things are never as black and white as they are presented, with the truth, and the clearest minds, typically residing somewhere in the grey (or the blue below).

[Image]

So, to avoid perpetuating any confusion about the benefits generative AI can provide, let's make things more concrete.



Unlocking Hyper-Personalisation

Rather than discussing another abstract AI paper or project in isolation, I thought I'd try something a little different.

The remainder of this article brings together a collection of seemingly disparate AI projects into a cohesive strategy for a single company, to demonstrate the tangible value these tools offer. It would be easy to pick any company and replace a few tasks, but to see the far-reaching impact of generative AI, the chosen company should be able to feasibly implement the suggested solutions. It should therefore possess the appropriate culture, data, distribution model and appetite to adopt this technology into its pipeline. It must also be well known enough for readers of all technical levels to understand how AI could enhance the business and its staff in their roles. That way we can more accurately examine the impact of generative AI tools, while hopefully making the complexities of AI more digestible by discussing them in the context of a company (or a specific job role). My aim is to make it easier for anyone new to the space to see tangible use cases and, hopefully, to demystify some of the jargon.

The company selected for this exercise is Netflix, and I'll explain why below. Remember, this is just a fun, hypothetical scenario discussing how generative AI, large language models, natural language interfaces and ML in general could be used within an established organisation like Netflix to unlock what I'm calling hyper-personalisation for their subscribers.


A Perfect Match

In July, Netflix named Microsoft as the 'exclusive technology and sales partner' to help power their first ad-supported tier. While my interest back then was little more than a passing one, a lot can happen in a few months, and today that partnership looks far more interesting. I appreciate the announcement from Netflix refers to their sales partnership, but let's imagine it extends beyond that to include OpenAI (and the wider Microsoft ecosystem). Reviewing all the players involved, it feels like a relationship greater than the sum of its parts: one that could see Netflix leverage its platform, IP and established culture of using big data alongside #Microsoft's perfectly positioned technology partnership with #OpenAI (and NVIDIA, but more on that later). In this thought experiment, let's examine how the products and research developed by OpenAI could be deployed by Netflix to improve customer experience and business efficiency.

[Image: Mr Nadella did not tweet this.]


Set the Scene - Netflix, Champions of Data-Driven Decision-Making

Like OpenAI, the Netflix business model is built on a foundation of big data. Netflix monitors every aspect of a subscriber's viewing habits, such as when they watch, pause, rewind or fast-forward. The platform tracks everything from your location and your device to whether or not you abandon a show or film before completing it. Combine that with a built-in rating system, compound it across more than 200 million subscribers, and it translates into a huge amount of high-quality data.

With this information, Netflix knows what content to produce (and what to cancel), which results in more satisfied customers and better capital allocation. The most notable example of these unique data-driven insights in practice was House of Cards.

Netflix identified that the British version of House of Cards was attracting a large audience of subscribers, and those members also seemed to favour films starring Kevin Spacey. Identifying these signals in the noise led to Kevin Spacey being cast in the lead role of the modern reboot. The same data was also instrumental in how most of the characters were cast, how the script was developed and how the overall narrative progressed. The show was a massive hit.

In the age of TikTok, the idea of high-end content designed, built or curated by algorithms might seem standard, but back in 2011 this was revolutionary. Cable networks couldn't dream of having the insight Netflix was able to gather on its viewers, and the streaming giant would go on to leverage data to its advantage across the entire business.

That's just one of many examples highlighting the existing culture of innovation at Netflix; the video below covers it in more detail:

With Netflix's innovative culture and machine learning credentials established, generative AI feels like a natural progression. Let's look at a few areas where they might implement some of this technology.


Potential Applications

Algorithmic Home Screens

Most people are well aware that Netflix already employs sophisticated recommendation algorithms to keep you engaged on its platform. It's obvious when you log into someone else's account and the home screen that greets you is drastically different from your own. What is less well understood is that these algorithms extend much deeper than ‘watch next’ suggestions. If you've ever wondered why the artwork for your favourite shows changes so regularly, let me explain.

Artwork - Customisation Culture

The underlying science and development behind how Netflix creates artwork deserves an article of its own, such is the complexity. Here is an overview of how it works today, why it is a critical part of the company's strategy, and why it is challenging to maintain.

Explaining why artwork is important to Netflix is the easy bit: just imagine the service without it. In the absence of artwork, the platform becomes the soulless vacuum of nothingness below, and it's very unlikely that one programme would catch a subscriber's attention over another.

[Image]

Now that we know why artwork is mission-critical for engagement, let's discuss how Netflix makes art into a science.

Take a Netflix Original like Stranger Things, for example. Netflix will generate multiple potential thumbnails, each suggesting a subtly different story element (romance, horror, adventure, 80s nostalgia, sci-fi and so on). Armed with a number of variants, Netflix reviews your history and attempts to serve up the most appealing and enticing artwork based on your viewing habits. Do you watch a lot of romcoms, or horror? Well, that's the mirror Netflix will hold back up to you.
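As a rough illustration of that matching step, here is a minimal, entirely hypothetical sketch: it simply scores each artwork variant against a member's inferred genre affinities and serves the best match. Netflix's production system is reportedly far more sophisticated (contextual bandits and large-scale A/B testing), and every variant name, tag and affinity value below is invented.

```python
# Hypothetical sketch of theme-based artwork selection.
# All variant names, theme tags and affinity scores are made up.

from dataclasses import dataclass

@dataclass
class ArtworkVariant:
    image_id: str
    themes: set[str]  # story elements the artwork emphasises

# Candidate thumbnails for a single title (e.g. Stranger Things)
variants = [
    ArtworkVariant("st_group_shot", {"adventure", "80s-nostalgia"}),
    ArtworkVariant("st_demogorgon", {"horror", "sci-fi"}),
    ArtworkVariant("st_mike_eleven", {"romance", "coming-of-age"}),
]

# Affinities inferred from a member's viewing history (values are illustrative)
user_affinity = {"horror": 0.7, "sci-fi": 0.6, "romance": 0.1, "adventure": 0.3}

def score(variant: ArtworkVariant) -> float:
    """Sum the member's affinity for every theme the artwork signals."""
    return sum(user_affinity.get(theme, 0.0) for theme in variant.themes)

best = max(variants, key=score)
print(f"Serve artwork: {best.image_id}")  # -> st_demogorgon for this member
```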


Now try to imagine what the next generation of this recommendation engine could look like. For me, a logical step would be to supercharge some, if not all, of this process by leveraging AI-enhanced workflows and the wisdom of Microsoft's strategic partner OpenAI.

At present, Netflix semi-automates the creation of the artwork used for thumbnails, with a designer typically hand-selecting frames from the film or show in question and editing the images according to the marketing guidelines. Although I have great respect for the designers doing this, it isn't scalable: there is only so much bandwidth available to apply this method to a growing content slate currently weighing in at over 13,000 titles. It becomes more challenging still when you add the complexity of local exclusives and other territory-specific nuances, where artwork appropriate in one region might not work in another. So what would an AI-enhanced method look like?

At this point, most people are aware that text-to-image platforms can generate images from a prompt, but may not realise that the process is bi-directional. Why is that useful here? Image-to-text (let's file this under computer vision) would let designers use object detection and image classification to quickly find a specific frame of a particular character doing a specific thing, using something like OpenAI's CLIP.

[Image: image classification model]

Artists could then 'instantly' search through an entire film for an exact frame based on parameters such as the camera lens, lighting colour, actor(s) and/or on-screen activity. This would streamline the existing process, making it easier to find the scenes needed for artwork and ultimately giving the graphic designer(s) more time to produce higher-quality artwork.
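To make that concrete, here is a minimal sketch of text-to-frame retrieval using OpenAI's open-sourced CLIP model via the Hugging Face transformers library. The frame file names and the query are placeholders, and a real pipeline would pre-compute an embedding for every frame offline rather than scoring stills on the fly.

```python
# Sketch of text-to-frame search with OpenAI's CLIP (via Hugging Face transformers).
# Frame paths and the query are placeholders for illustration only.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

frame_paths = ["frame_0001.jpg", "frame_0002.jpg", "frame_0003.jpg"]  # extracted stills
images = [Image.open(p) for p in frame_paths]

query = "Eleven standing alone in a dark corridor, red lighting"

inputs = processor(text=[query], images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the similarity of each frame to the text query
scores = outputs.logits_per_image.squeeze(-1)
best_frame = frame_paths[scores.argmax().item()]
print(f"Closest matching frame: {best_frame}")
```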

Fan Art

But why stop there? I've long believed that some of the incredible fan art created by viewers should be allowed onto the service to further personalise the experience, and with #Dalle2 anyone can participate, regardless of their artistic ability. An almost infinite number of image variations could be generated to cater to the tastes and preferences of every Netflix member.

[Image: Amien Juugo - https://www.behance.net/gallery/43706275/Stranger-Things]

Where things get really interesting is if DALL·E 2 were used to generate an entirely new, hyper-individualised menu of thumbnails for each viewer. Based on data Netflix has already collected, such as viewing habits, combined with some optional user preferences (art, music, gaming, special occasions, hobbies), the interface could become entirely bespoke. Like Renaissance-era artwork? All of the thumbnails could temporarily take on that style. Like Akira and anime? All of the artwork could take on the style of Katsuhiro Otomo. Upload a few images, and why not have you or your family appear in the thumbnail of your favourite film for special occasions? Add a couple of additional prompts and the scope of this hyper-individualised UX grows exponentially. This illustrates how integrating two products from the OpenAI portfolio, DALL·E 2 and elements of CLIP, could benefit Netflix subscribers while also optimising some back-office workflows for the staff generating imagery.
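As a sketch of what that restyling step might look like, the snippet below calls the OpenAI Images API (DALL·E 2 in this case). The prompt template, style and function name are all my own illustrative choices, and questions of rights, brand safety and editorial review are deliberately out of scope here.

```python
# Hypothetical sketch: restyling a title's artwork per subscriber with the
# OpenAI Images API (DALL·E 2). Prompts and names are invented for illustration.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def personalised_thumbnail(title: str, style: str) -> str:
    """Generate one artwork variant for a title in the member's preferred style."""
    prompt = (
        f"Key art for the series '{title}', "
        f"rendered in the style of {style}, cinematic composition, no text"
    )
    response = client.images.generate(
        model="dall-e-2",
        prompt=prompt,
        size="1024x1024",
        n=1,
    )
    return response.data[0].url  # URL of the generated image

url = personalised_thumbnail("Stranger Things", "Katsuhiro Otomo's Akira manga")
print(url)
```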

Accessibility

One way to maximise the value of each piece of content the streamer hosts is to make it as accessible as possible to as many people as possible, with subtitles (and voiceovers) in multiple languages. Squid Game is a perfect example of a regional story becoming a global hit, and as the streaming giant enters new territories it stands to reason Netflix will look to repeat that success. This is where I think OpenAI's Whisper could come in handy.

Whisper is an incredibly powerful transcription tool, able to convert almost any spoken dialogue into text. This makes it far easier to generate English subtitles for content originating in a new territory, and just as easy to turn the existing, predominantly English library into subtitles for new languages. The primary benefit of having dialogue transcribed as text is subtitling, but a growing number of sophisticated text-to-speech (TTS) services are also entering the market. The best I have used so far is play.ht, which can create incredibly realistic output without a voice actor. The added benefit of this option is that if you don't like someone's voiceover in a show, you could swap them out with the click of a button, perhaps even replacing the voice of a star character with your own using voice cloning.
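For a flavour of how little code the transcription step needs, here is a minimal sketch using the open-source Whisper package. The audio file name is a placeholder and the SRT writer is a toy example of mine, not anything Netflix-specific.

```python
# Minimal sketch of subtitle generation with the open-source Whisper package
# (pip install -U openai-whisper). The audio file name is a placeholder.

import whisper

model = whisper.load_model("medium")  # larger checkpoints give better accuracy
result = model.transcribe("squid_game_s01e01.mp3", task="translate")  # translate -> English text

def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

# Whisper returns timestamped segments, which map directly onto subtitle cues
with open("episode.srt", "w", encoding="utf-8") as srt:
    for i, seg in enumerate(result["segments"], start=1):
        srt.write(f"{i}\n{to_srt_timestamp(seg['start'])} --> "
                  f"{to_srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n\n")
```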

Conversational Search

Netflix has an impossible task with its existing interface: the more choices you have, the harder it is to choose. That is the paradox of choice, and it creates decision fatigue for Netflix users. I wonder how many viewers abandon the platform while searching for something to watch? With an ever-expanding content library, it is increasingly difficult for a user to find appropriate content, especially the obscure gems. In addition to the existing ‘set menu’ UI, where a frustrating amount of time is spent looking for something to watch, could searching for content be far more conversational? I don't mean 'play X film or TV show'; could users identify appropriate content via natural language?

“I’m looking for something like Superbad but more modern, preferably under 90 minutes and with actor ‘[X]’.”

There’s no reason why Netflix couldn’t apply another layer of analysis to ascertain your mood, the amount of time you have available, and whether you are alone, with friends, with your partner, with a child, or both. What might that interface look like with an additional context filter? Take it a step further: what if users could ask what a cast member is wearing and where it could be purchased? I know that's a level of functionality my wife wants.
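A crude way to prototype that conversational layer is semantic search over the catalogue. The sketch below embeds a few invented synopses with the OpenAI embeddings API and ranks them against a natural-language request; the model name and catalogue entries are illustrative, and a real system would layer runtime, cast, mood and context filters on top of the semantic match.

```python
# Hypothetical sketch of conversational catalogue search using OpenAI embeddings
# plus cosine similarity. Catalogue entries and the request are invented.

import numpy as np
from openai import OpenAI

client = OpenAI()

catalogue = [
    "Superbad (2007): raunchy coming-of-age comedy about two high-school friends.",
    "Booksmart (2019): two academic overachievers cram four years of fun into one night.",
    "Blockers (2018): parents try to stop their daughters' prom-night pact.",
]

def embed(texts: list[str]) -> np.ndarray:
    """Return one embedding vector per input string."""
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

titles = embed(catalogue)
query = embed(["Something like Superbad but more modern, under 90 minutes"])[0]

# Cosine similarity between the request and every synopsis
scores = titles @ query / (np.linalg.norm(titles, axis=1) * np.linalg.norm(query))
print(catalogue[int(scores.argmax())])
```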

Less Scrolling, More Viewing

At present, Netflix subscribers watching on a TV can't skip chapters (as in the good old days of DVDs) and are forced to fast-forward or rewind through whole episodes to find the moment they fell asleep the previous evening, without going too far and seeing spoilers (yes, I know, first-world problems). With the technology available, this need not be the case.

Once again using OpenAI's Whisper, Netflix could deploy automatic speech recognition (ASR) to transcribe and timestamp every word of every show, meaning a viewer could easily get back to the last piece of dialogue they remember. A small amount of information would identify the film and the moment the viewer is looking for.

Go to the part when he said "I'm funny how? I mean, funny like I'm a clown? I amuse you?"

Alternatively, we could use visual search (computer vision). In much the same way that Whisper can transcribe every piece of audio, CLIP could be repurposed for video classification, capturing and archiving exactly who and what is present in every frame of a show at any given time. For example, if a viewer said:

"Go to the part when Rick first becomes a pickle"

a user would immediately be taken to the correct show [Rick & Morty], the correct episode [Pickle Rick] and the exact point where we first see Rick as a pickle. I went for a distinctive example; this works less well if a user simply says 'go to the explosion'.
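Here is a toy sketch of the dialogue version, reusing the timestamped segments Whisper already produces and a simple fuzzy match to find the closest line. The file name and quote are placeholders, and a production system would search a pre-built transcript index rather than transcribing on demand.

```python
# Sketch of "jump to the line I remember" using a Whisper transcript.
# Assumes the episode audio has already been transcribed (see the earlier example);
# the quote matching here is a deliberately simple fuzzy comparison.

from difflib import SequenceMatcher

import whisper

model = whisper.load_model("base")
transcript = model.transcribe("goodfellas_clip.mp3")  # placeholder file name

def jump_to(quote: str, segments: list[dict]) -> float:
    """Return the start time (in seconds) of the segment closest to the quote."""
    def similarity(seg: dict) -> float:
        return SequenceMatcher(None, quote.lower(), seg["text"].lower()).ratio()
    best = max(segments, key=similarity)
    return best["start"]

seconds = jump_to("Funny how? I mean, funny like I'm a clown?", transcript["segments"])
print(f"Seek to {seconds:.1f}s")
```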


The idea of audio or visual search isn't as complex as it seems. Adobe's Project Blink is currently available from Adobe Labs and already delivers all of this functionality. I suspect Teams and other video-conferencing tools will automatically tag recordings in a similar fashion moving forward, with some platforms doing this in real time.

Bespoke Synopsis

Even the synopsis for each piece of content could be tailored to each user and made more entertaining. With some fine-tuning, or even just careful prompting, OpenAI's language model GPT-3 could in theory 'sound' like any chosen persona, or pitch the programme according to an individual's taste. What would a Tarantino synopsis for The Wizard of Oz look like, for example?
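A minimal sketch of that idea using OpenAI's chat completions API is below. The model name, prompt wording and plot summary are illustrative; the point is simply that the same plot can be pitched in any voice.

```python
# Hypothetical sketch of a persona-flavoured synopsis via OpenAI's chat API.
# Model name, prompts and plot summary are illustrative placeholders.

from openai import OpenAI

client = OpenAI()

def bespoke_synopsis(title: str, plot: str, persona: str) -> str:
    """Rewrite a plot summary as a two-sentence synopsis in a chosen voice."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": f"You write two-sentence film synopses in the voice of {persona}."},
            {"role": "user",
             "content": f"Title: {title}\nPlot: {plot}"},
        ],
        temperature=0.9,
    )
    return response.choices[0].message.content

print(bespoke_synopsis(
    "The Wizard of Oz",
    "A Kansas farm girl is swept into a strange land and must reach a distant "
    "wizard to find her way home.",
    "Quentin Tarantino",
))
```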


What if all of the horror synopses were rewritten as slapstick romcoms?


User-Generated Content

Why stop at thumbnails? Why not tailor the entire UX? Could text-to-video be used to create individualised micro-trailers, possibly starring you and your family? In fact, on the subject of AI-powered trailers, let's rewind to the initial focus and the ad-supported tier. Could the ads themselves be generated using elements of the OpenAI portfolio to maintain this theme of hyper-personalisation, something advantageous for both the user and the advertiser? Just like the thumbnails discussed earlier, there is no reason Netflix and Microsoft couldn't hypothetically sprinkle a little DALL·E 2 and GPT-3 to make that happen.

All of the content below was generated by my friend CoffeeVectors using generative AI tools like #Dalle2 (in this case #midjourney) and free apps on a smartphone, nothing the typical subscriber wouldn't have access to. As a result, there is nothing to stop a user or content creator from altering faces or adjusting the theme to make the content (be that entertainment or advertisement) more engaging and impactful for an individual user.


Choose Your Own Adventure

With the choose-your-own-adventure storyline Bandersnatch, the streaming giant blurred the lines between traditional content and gaming with an interactive film. I applaud the ambition of the Black Mirror episode, given the complexity it must have taken to develop the script and the technical work to deploy it, but what would Bandersnatch 2.0 look like? Now that Netflix has the infrastructure in place, there is potential to make the entire storyline and visuals generative. Just like the trailer above created by CoffeeVectors, a huge amount can be communicated with only a few well-delivered frames, and as a parent I could see this working equally well, if not better, for children's television.


Bespoke UI

Unlike the artwork used for thumbnails, which is tailored to each user, the general UI is identical for everyone, but hypothetically there could be another opportunity for customisation here. Netflix had first-mover advantage in the streaming industry, but competitors have quickly replicated, or even copied, the UI best practices Netflix developed, and as a result they all look fairly similar. The incredible examples below created by @ikergnz demonstrate how generative AI could create completely unique interfaces that would be difficult to replicate. Although this is one of the more challenging concepts to implement, it would be interesting to see a UI where all of the key information is available but the styling is inspired by a show the subscriber is currently bingeing.


A New Microsoft Flywheel?

Reviewing all the solutions presented by Microsoft and OpenAI (with their IP, domain expertise, cloud support and legacy of innovation in artificial intelligence), there are some compelling ways Netflix could employ a small selection of generative AI and ML to enhance the experience for users on the platform and improve the efficiency of staff behind the scenes. Typically this is where the hypothetical conversation would stop, because discussing concepts is far easier than implementing them... but the recent announcement of a partnership between Microsoft and NVIDIA to build one of the world's most powerful AI supercomputers makes things a little more feasible. Many, if not all, of the concepts discussed above are now not only viable but, to some extent, even logical for Netflix to deploy.

In creating an article primarily focussed on Netflix, every thread I pulled led back to Microsoft, and that is where I would like to conclude. Satya Nadella has long been one of, if not the, most respected CEOs in big tech, so it is interesting to observe Microsoft quietly forging relationships and building business units, in the shape of #Azure, the Azure OpenAI Service, OpenAI, NVIDIA, Netflix, #Activision (if it goes through), #xbox, #copilot, the London Stock Exchange and even a GPT-3 powered LinkedIn, that offer such a compelling combined value. I feel like Mr Nadella is collecting all of the Infinity Stones. Moving forward, Microsoft is perfectly positioned to provide AI services to any number of organisations and sectors, with direct vectors into gaming, finance, entertainment, enterprise business and the cloud. All of these generate huge volumes of data to train better models, which can be used to enhance the services, which attract more users, which produce more data for better models; rinse and repeat (in theory). I'm not sure what Microsoft is cooking, but the ingredients point to them serving something special, just as the global audience is developing an appetite for generative AI and AI-enhanced services.


In summary, this is a small indicator as to why 'generative art', part of a far larger generative AI category, is attracting so much investment. In this example, I tried to illustrate how a small number of OpenAI's existing products (we haven't even discussed the eagerly anticipated GPT-4) could be used to enhance and advance an existing business model. For the purposes of this thought exercise I chose to stay within the Microsoft and OpenAI portfolio to keep the article as realistic as possible, but in truth many of my partners (ranging from automotive designers to surgeons) are already exploring latent space with the equally powerful Stability AI and Midjourney. Regardless of whether you are a designer, copywriter, accountant, lawyer, marketer, teacher or receptionist, there is a selection of products available that can enhance your skills and automate aspects of your existing process. My advice would be to start exploring for yourself with an open mind and build your own intuition about what these tools can do for you and whether they provide utility.

If you liked the format of picking a company to consolidate projects around, I'll pick another and create a follow-up article soon.

Let me know your thoughts.



-------------------

Yes, it would have been significantly easier to simply ask #ChatGPT for a detailed article answering the original question and to discuss the results, but I'd already made the memes!

In fact, I did ask while completing this article, and it not only produced a better, more concise argument, it completely rug-pulled what I thought were original ideas.

If you want to see the ChatGPT version please find it here: https://sharegpt.com/c/3YY3X1b

To expand beyond the scope of this article and get an appreciation of how much work goes into the Netflix process, check out the incredible Netflix Tech Blog:

https://netflixtechblog.com/selecting-the-best-artwork-for-videos-through-a-b-testing-f6155c4595f6
