Applied AI is the top up-and-coming tech trend, McKinsey says

McKinsey report: Two AI trends top 2022 outlook

McKinsey’s newly released Technology Trends Outlook 2022 named applied AI and industrializing machine learning as two of the 14 most significant technology trends unfolding today.

Specifically, applied AI, which McKinsey considers to be based on proven and mature technologies, scored highest of all 14 trends on quantitative measures of innovation, interest and investment. The report notes that applied AI has viable applications in several industries and is closer to mainstream adoption than the other trends.




OpenAI is slashing the price of its GPT-3 API service by up to two-thirds, according to an announcement on the company’s website. The new pricing plan, which is effective September 1, may have a large impact on companies that are building products on top of OpenAI’s flagship large language model (LLM).

The announcement comes as recent months have seen growing interest in LLMs and their applications in different fields. And service providers will have to adapt their business models to the shifts in the LLM market, which is rapidly growing and maturing.

The new pricing of the OpenAI API highlights some of these shifts that are taking place.

A bigger market with more players

The transformer architecture, introduced in 2017, paved the way for today’s large language models. Transformers are well suited to processing sequential data like text, and they are much more efficient at scale than their predecessors, recurrent neural networks (RNNs) and LSTMs. Researchers have consistently shown that transformers become more powerful and accurate as they are made larger and trained on larger datasets.
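To make the idea concrete, here is a minimal sketch of a transformer encoder stack built from PyTorch’s built-in modules. The layer sizes are toy values chosen for illustration, nowhere near those of a production LLM, and the snippet only shows the architectural shape, not a trained model.

```python
import torch
import torch.nn as nn

# Toy hyperparameters for illustration; real LLMs use far larger values.
vocab_size, d_model, n_heads, n_layers = 32000, 512, 8, 6

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

tokens = torch.randint(0, vocab_size, (1, 16))  # one sequence of 16 token ids
hidden = encoder(embed(tokens))                 # self-attention over the whole sequence at once
print(hidden.shape)                             # torch.Size([1, 16, 512])
```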

In 2020, researchers at OpenAI introduced GPT-3, which proved to be a watershed moment for LLMs. GPT-3 showed that LLMs are “few-shot learners”: they can perform new tasks simply by being shown a few examples on the fly, without undergoing extra training cycles. But instead of making GPT-3 available as an open-source model, OpenAI decided to release a commercial API as part of its effort to fund its research.
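As a rough illustration of what “few-shot” means in practice, here is a sketch of a sentiment-classification prompt sent to the GPT-3 Completions endpoint with the pre-1.0 openai Python client. The example reviews and the model name are placeholders; the point is that a handful of in-context examples define the task, and billing is simply per token of prompt plus completion.

```python
import os
import openai  # pre-1.0 client, as used with the original GPT-3 Completions API

openai.api_key = os.environ["OPENAI_API_KEY"]

# A few in-context examples define the task; no extra training run is needed.
prompt = (
    "Classify the sentiment of each review.\n"
    "Review: The battery dies within an hour. Sentiment: negative\n"
    "Review: Setup took thirty seconds and it just works. Sentiment: positive\n"
    "Review: Shipping was slow but the screen is gorgeous. Sentiment:"
)

response = openai.Completion.create(
    model="text-davinci-002",  # illustrative model name
    prompt=prompt,
    max_tokens=1,
    temperature=0,
)
print(response["choices"][0]["text"].strip())
```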

GPT-3 increased interest in LLM applications. A host of companies and startups started creating new applications with GPT-3 or integrating the LLM into their existing products.

The success of GPT-3 encouraged other companies to launch their own LLM research projects. Google, Meta, Nvidia and other large tech companies accelerated work on LLMs. Today, there are several LLMs that match or outpace GPT-3 in size or benchmark performance, including Meta’s OPT-175B, DeepMind’s Chinchilla, Google’s PaLM and Microsoft and Nvidia’s Megatron-Turing NLG (MT-NLG).

GPT-3 also triggered the launch of several open-source projects that aimed to make LLMs available to a wider audience. BigScience’s BLOOM and EleutherAI’s GPT-J are two examples of open-source LLMs that are available free of charge.

And OpenAI is no longer the only company providing LLM API services. Hugging Face, Cohere and Humanloop are some of the other players in the field. Hugging Face provides a large variety of transformer models, all of which are available as downloadable open-source models or through API calls. Hugging Face recently released a new LLM service powered by Microsoft Azure, which OpenAI also uses for its GPT-3 API.
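For example, a small open model can be downloaded and run locally with the Hugging Face transformers library in a few lines. GPT-Neo 125M is used here only because it fits on ordinary hardware; GPT-J and BLOOM expose the same pipeline interface but require far more memory.

```python
from transformers import pipeline

# Downloads the weights from the Hugging Face Hub on first use.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")

output = generator(
    "Large language models are",
    max_length=40,      # total length of prompt plus generated text, in tokens
    do_sample=True,
    temperature=0.8,
)
print(output[0]["generated_text"])
```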

The growing interest in LLMs and the diversity of solutions are two elements that are putting pressure on API service providers to reduce their profit margins to protect and expand their total addressable market.

Hardware advances

One of the reasons that OpenAI and other companies decided to provide API access to LLMs is the technical challenge of training and running the models, which many organizations can’t handle. While smaller machine learning models can run on a single GPU, LLMs require dozens or even hundreds of GPUs.

Aside from the huge hardware costs, managing LLMs requires experience with complicated distributed and parallel computing. Engineers must split the model into multiple parts and distribute it across several GPUs, which then run the computations in parallel and in sequence. This process is prone to failure and requires ad hoc solutions for different types of models.
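A minimal sketch of the idea in PyTorch, assuming two CUDA devices are available: a toy network is split by hand across the GPUs, and activations are copied between devices during the forward pass. Production LLM serving adds pipeline scheduling, tensor parallelism and failure recovery on top of this basic pattern.

```python
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    """Toy model-parallel network: each half of the layers lives on its own GPU."""

    def __init__(self):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Sequential(nn.Linear(4096, 1024), nn.ReLU()).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Activations must be moved across devices between the two shards.
        return self.part2(x.to("cuda:1"))

model = TwoGPUModel()
out = model(torch.randn(8, 1024))
print(out.shape, out.device)  # torch.Size([8, 1024]) cuda:1
```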

But with LLMs becoming commercially attractive, there is growing incentive to create specialized hardware for large neural networks.

OpenAI’s pricing page states the company has made progress in making the models run more efficiently. Previously, OpenAI and Microsoft had collaborated to create a supercomputer for large neural networks. The new announcement from OpenAI suggests that the research lab and Microsoft have managed to make further progress in developing better AI hardware and reducing the costs of running LLMs at scale.

Again, OpenAI faces competition here. One example is Cerebras, which has created a huge AI processor that can train and run LLMs with billions of parameters at a fraction of the cost and without the technical difficulties of GPU clusters.

Other big tech companies are also improving their AI hardware. Google introduced the fourth generation of its TPU chips last year and its TPU v4 pods this year. Amazon has also released special AI chips, and Facebook is developing its own AI hardware. It wouldn’t be surprising to see the other tech giants use their hardware powers to try to secure a share of the LLM market.

Fine-tuned LLMs remain off limits — for now?

The interesting detail in OpenAI’s new pricing model is that it will not apply to fine-tuned GPT-3 models. Fine-tuning is the process of retraining a pretrained model on a set of application-specific data. Fine-tuned models improve the performance and stability of neural networks on the target application. Fine-tuning also reduces inference costs by allowing developers to use shorter prompts or smaller fine-tuned models to match the performance of a larger base model on their specific application.

For example, if a bank was previously using Davinci (the largest GPT-3 model) for its customer service chatbot, it can fine-tune the smaller Curie or Babbage models on company-specific data. This way, it can achieve the same level of performance at a fraction of the cost.
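A hedged sketch of what that workflow looked like with the pre-1.0 openai Python client and the legacy fine-tunes endpoint: write prompt/completion pairs to a JSONL file, upload it, and start a fine-tune of a smaller base model. The file name and the example pairs below are invented for illustration.

```python
import json
import os

import openai  # pre-1.0 client; uses the legacy /v1/fine-tunes endpoint

openai.api_key = os.environ["OPENAI_API_KEY"]

# Hypothetical support transcripts; real fine-tuning data would contain many more pairs.
examples = [
    {"prompt": "Customer: How do I reset my online banking password?\nAgent:",
     "completion": " Go to Settings, choose Security, then Reset password.\n"},
    {"prompt": "Customer: What is the daily transfer limit?\nAgent:",
     "completion": " Standard accounts can transfer up to the posted daily limit.\n"},
]
with open("support_finetune.jsonl", "w") as f:
    for row in examples:
        f.write(json.dumps(row) + "\n")

# Upload the training file, then fine-tune a smaller base model such as Curie.
uploaded = openai.File.create(file=open("support_finetune.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=uploaded["id"], model="curie")
print(job["id"], job["status"])
```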

At current rates, fine-tuned models cost double their base-model counterparts. After the price change, the difference will rise to 4-6x. Some have speculated that fine-tuned models are where OpenAI is really making money from enterprise customers, which is why those prices won’t change.

Another reason might be that OpenAI still doesn’t have the infrastructure to reduce the costs of fine-tuned models (as opposed to base GPT-3, where all customers use the same model, fine-tuned models require one GPT-3 instance per customer). If so, we can expect the prices of fine-tuning to drop in the future.

It will be interesting to see what other directions the LLM market will take in the future.







Deep Dive: Why 3D reconstruction may be the next tech disruptor

Artificial intelligence (AI) systems must understand visual scenes in three dimensions to interpret the world around us. For that reason, images play an essential role in computer vision, significantly affecting quality and performance. Unlike the widely available 2D data, 3D data is rich in scale and geometry information, providing an opportunity for better machine-environment understanding.

Data-driven 3D modeling, or 3D reconstruction, is a growing computer vision domain increasingly in demand from industries including augmented reality (AR) and virtual reality (VR). Rapid advances in implicit neural representation are also opening up exciting new possibilities for virtual reality experiences.


3D reconstruction generates a 3D representation of an object or scene by combining a sparse set of images captured from arbitrary viewpoints. The method allows for accurate reconstruction of shapes with complex geometries, as well as higher-fidelity color reconstruction.

3D reconstruction in the era of digital reality

With the rise of digital experiences and emerging virtual concepts such as the metaverse, it’s critical to have tools that can create accurate 3D reconstructions from image data. Real-world applications of this technology allow users to virtually try on clothing while shopping in AR and VR, as well as to process medical image data. It can also be used for free-viewpoint video reconstruction, robotic mapping, reverse engineering and even reliving memorable moments from various perspectives. According to a SkyQuest survey, the global 3D reconstruction technology market will be worth $1.3 billion by 2027.

3D reconstruction is now a priority for tech and ecommerce giants, as it not only lays the groundwork for a future presence in virtual worlds but also provides immediate, tangible business benefits in advertising and social commerce.

Recently, Shopify reported that merchants who add 3D content to their stores see a 94% conversion lift, a far more significant impact than videos or photos, as 3D representations provide customers with details that images alone cannot.

To develop 3D reconstruction implementations, intelligent context-understanding systems must recognize an object’s geometry as well as its foreground and background to accurately comprehend the depth of scenes and objects depicted in 2D photos and videos. Advanced deep learning techniques and increased availability of large training datasets have led to a new generation of methods for capturing 3D geometry and object structure from one or more images without the need for complex camera calibration procedures.

How 3D reconstruction is aiding computer vision

Synthesizing 3D data from a single viewpoint is a fundamental human visual capability that computer vision algorithms struggle with. Furthermore, because 3D data is more expensive to acquire than 2D data, it has been challenging to obtain enough textured 3D data to effectively train machine learning models to predict correct textures. To address these requirements, 3D reconstruction solutions combine real and virtual objects in AR without requiring large amounts of data to learn from or being limited to a few perspectives.

3D reconstruction uses an end-to-end deep learning framework that takes a single RGB color image as input and converts the 2D image into a 3D mesh model in a more desirable camera coordinate format. Perceptual features are extracted from the 2D image and leveraged by a graph-based convolutional neural network, which produces a 3D mesh by progressively deforming an initial ellipsoid until it reaches a semantically correct, optimized geometry. The rough edges in the derived 3D model are then refined using a dense prediction transformer (DPT), which employs vision transformers to produce more fine-grained output.
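The progressive-deformation idea can be sketched with PyTorch3D, assuming it is installed: start from a sphere template mesh and optimize per-vertex offsets against a target shape. This is a generic mesh-fitting illustration, not the Pixel2Mesh implementation itself, which drives the deformation with image features through a graph CNN rather than a plain optimization loop; the target file path is a placeholder.

```python
import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.loss import chamfer_distance, mesh_laplacian_smoothing
from pytorch3d.ops import sample_points_from_meshes
from pytorch3d.utils import ico_sphere

device = "cuda" if torch.cuda.is_available() else "cpu"

src_mesh = ico_sphere(level=4, device=device)                  # template to be deformed
trg_mesh = load_objs_as_meshes(["target.obj"], device=device)  # placeholder target shape

offsets = torch.zeros(src_mesh.verts_packed().shape, device=device, requires_grad=True)
optimizer = torch.optim.Adam([offsets], lr=1e-2)

for step in range(500):
    optimizer.zero_grad()
    new_mesh = src_mesh.offset_verts(offsets)
    # Compare surface point samples from the deformed template and the target.
    loss_chamfer, _ = chamfer_distance(
        sample_points_from_meshes(new_mesh, 5000),
        sample_points_from_meshes(trg_mesh, 5000),
    )
    loss = loss_chamfer + 0.1 * mesh_laplacian_smoothing(new_mesh)  # keep the surface smooth
    loss.backward()
    optimizer.step()
```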

Image source: Pixel2Mesh

Current implementations of 3D reconstruction

Meta recently released Implicitron, a 3D reconstruction architecture that enables fast prototyping and 3D reconstruction of objects. Implicitron uses multiple shape architectures to generate implicit shape representations, and a renderer then analyzes the input image to convert the 2D input into a 3D model. To facilitate 3D experimentation, the framework includes a plug-in and configuration system that lets users define their own component implementations and adjust configurations between experiments.

The open-source project “3Dification” can recalibrate camera angles and frames after video capture: it takes a collection of videos as input and reconstructs the environment and scene in the footage using 3D reconstruction and 3D pose estimation methods. The processed output enables reliable re-identification across challenging shot transitions, where the camera viewpoints in certain scenes are not distinctive enough on their own.


Future opportunities

3D research is critical for teaching systems how to understand all perspectives of objects, even when they are obstructed, hidden or otherwise optically challenging.

Further development of sustainable and feasible approaches will increase access for larger scientific communities and audiences while improving interoperability. Combining 3D reconstruction with other deep learning-driven capabilities, such as tactile sensing and natural language understanding, can help AI systems understand three dimensions more intuitively, much as humans do.
