Multimodal AI meets architecture & design

Cedric Teissier

Investor | Fintech

Published: February 4, 2023

Multimodal AI refers to systems that can process and understand multiple modes of data such as natural language, images, videos, and speech. Its applications range from conversational AI to computer vision and speech recognition.

The increased adoption of multimodal AI is likely to bring about significant changes in various industries and aspects of human life. Some of the expected changes include:

Improved accessibility: Multimodal AI systems can help people with disabilities or difficulties in using traditional input methods to interact with technology.
Enhanced customer experience: Companies can use multimodal AI to provide personalized and efficient customer service across multiple channels.
Augmented human capabilities: Multimodal AI can assist people in their tasks by automating repetitive processes, making recommendations, and providing real-time feedback.
Changed nature of work: The increasing use of multimodal AI may lead to the creation of new jobs, as well as the displacement of certain occupations.
Advancements in research and education: Multimodal AI can help researchers in various fields by analyzing vast amounts of data and providing new insights, and it can also be used to enhance the learning experience for students.

Tech companies are making strides in multimodal AI

Tech companies are now making strides in multimodal AI to improve search and content generation.

Today, an AI model trained on video data can be used for predicting video content, a model trained on text can be used for text predictions, and so on.

To go beyond this, multimodal AI research aims to be more holistic, using a single AI model to conceptualize information across multiple types of data like text, 2D images, and videos to make a prediction.

For example, in early 2021, OpenAI trained an AI model called DALL-E to generate images based on a text prompt.

Given a prompt for an avocado-shaped armchair, the model generates images of exactly that.

In April 2022, OpenAI announced DALL-E 2, which improves the original model’s output image resolution by 4x.

In May 2022, Google launched Imagen, a text-to-image project that reportedly outperforms OpenAI’s model in terms of the quality of images generated, as well as the alignment between the input (text) and output (AI-generated image).

In January 2022, Meta published a paper called “Omnivore: A Single Model for Many Visual Modalities.” The paper details an AI model that, when trained to recognize 2D images of pumpkins, can also recognize pumpkins in videos or 3D images without requiring additional training for the latter two media types.

Multimodal AI is growing beyond academic research labs to find practical applications. Google, for instance, is using multimodal AI to improve search. In the future, a user could take a photo of their hiking boots and ask a query like, “Can I use these to hike Mt. Fuji?” The search engine would recognize the image, mine information on the web about Mt. Fuji from text, image, and video data, and connect the dots to provide a relevant answer.

Multimodal AI research is poised to go beyond corporate research labs to power the next era of search and content generation, among other applications.

But where does the money flow?

Sequoia Capital released an interesting piece a few months ago on this brave new world, and seems to be going all in.

Their most recent mapping of the space suggests that no one is expected to be left untouched.

What can be expected in architecture & design?

Clearly, AI has the potential to augment and assist architects and designers in their work, but I believe it is unlikely to replace them in the near future. While AI can generate designs, it is still limited (and to some extent, is bound to remain limited) in its ability to understand and incorporate the nuances and complexities of human preferences, cultural context, and ethical considerations.

I mean, every ChatGPT response has a thesis, an antithesis, and a conclusion. That’s not artistic expression, that’s just resources being curated about the so-called ‘state of the art’.

The same goes for generative AI solutions in architecture & design. While feeding on words, prompts, or images, they simply ‘generate’; they don’t ‘create’. Yes, DALL-E 2, Midjourney or Models Lab (formerly Stable Diffusion API) are fun to 'generate' with, but the result is not an architecture or design creation per se. It's just a digital asset born from the ashes of other assets and training models.

Architects and designers bring unique creative visions and critical thinking skills to the design process, and their expertise and judgment are needed to evaluate the outputs generated by AI algorithms. While AI can serve as a tool for architects and designers, helping them to save time and improve their efficiency, the final decisions and creative direction of the design process will still be left to human architects and designers.

Truth be told, and that’s good news: architecture and design are not purely technical fields and require a deep understanding of human behavior, culture, and aesthetics. AI algorithms may never be able to fully grasp these aspects of design, and human architects and designers will likely continue to play a critical role in shaping our built environment.

Should architects & designers turn to blockchain to protect their IP?

Most certainly, blockchain technology has the potential to help protect IP in various industries. By creating a secure and decentralized digital ledger, blockchain allows for the creation of tamper-proof records of ownership, provenance, and transfer of IP rights.
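To make the idea of a tamper-proof record a little more concrete, here is a minimal sketch in Python of a hash-chained ledger of design ownership events. It is only an illustration under simplifying assumptions: it runs in a single process, with no decentralization, consensus, or digital signatures, all of which a real blockchain would add. The names `Record`, `append`, `verify`, and the owner identifiers are hypothetical and do not refer to any particular blockchain platform.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


@dataclass(frozen=True)
class Record:
    design_hash: str  # SHA-256 fingerprint of the design file's bytes
    owner: str        # hypothetical owner identifier
    event: str        # e.g. "created" or "transferred"
    prev_hash: str    # hash of the previous record, linking the chain


def record_hash(record: Record) -> str:
    """Hash a record deterministically (keys sorted before hashing)."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append(ledger: List[Record], design: bytes, owner: str, event: str) -> str:
    """Add a record linked to the current head; return the new head hash."""
    prev = record_hash(ledger[-1]) if ledger else GENESIS
    record = Record(hashlib.sha256(design).hexdigest(), owner, event, prev)
    ledger.append(record)
    return record_hash(record)


def verify(ledger: List[Record], head: str) -> bool:
    """Check every link in the chain and that it ends at the known head hash."""
    prev = GENESIS
    for record in ledger:
        if record.prev_hash != prev:
            return False
        prev = record_hash(record)
    return prev == head


if __name__ == "__main__":
    ledger: List[Record] = []
    head = append(ledger, b"floor plan v1", "studio-a", "created")
    head = append(ledger, b"floor plan v1", "client-b", "transferred")
    print(verify(ledger, head))

    # Rewriting history changes the first record's hash and breaks the chain.
    ledger[0] = replace(ledger[0], owner="someone-else")
    print(verify(ledger, head))
```

Because each record stores the hash of the one before it, changing any earlier entry breaks every later link. In an actual blockchain the head hash is also replicated across many independent parties, which is what makes quietly rewriting the history impractical.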

For architects and designers, using blockchain to manage IP rights is bound to be the way to go: it can provide a reliable and transparent way to keep track of creations and ensure that creators are properly credited and compensated for their work. It can also simplify the process of licensing and selling their designs, and help prevent unauthorized use or infringement of their IP. That's a big issue given the current turmoil surrounding Getty Images, Models Lab (formerly Stable Diffusion API), Adobe and others, with software providers feeding on the intelligence of others to train AI-powered models without protecting the original rights.

Sure, blockchain is a relatively new technology, and its applications for IP protection are still evolving. There will also be some challenges to overcome, such as ensuring the accuracy and completeness of the information recorded on the blockchain, as well as the issue of interoperability between different blockchain systems.

But that's the way forward.

That is, right here, right now, a massive use case for enterprise blockchain, architects, designers and AI engines to collaborate.
