
AI in Tech: Applications, Challenges, and Opportunities | FutureX Seminar

With the introduction of ChatGPT, the IT industry has taken a significant leap toward a new paradigm, spawning a proliferation of downstream applications and a flourishing array of AI-based solutions. Chinese and American tech giants, along with research organizations worldwide, have joined the large-model arms race, marking the beginning of the AI 2.0 era. In early March, FutureX Capital hosted an investment seminar titled "Seizing Opportunities in the AI Wave," attended by several industry heavyweights and founders of portfolio companies. The discussion centered on two core questions: what caused the ChatGPT craze, and how does AI 2.0 differ from AI 1.0?

The past few months have been the most exciting period in the history of large models. Tech giants, startups, and research organizations announced progress almost daily, with notable developments including the release of the multimodal GPT-4, Microsoft 365 Copilot, and the beta launch of Google's Bard. The pace of these advances suggests that AI 2.0 may move far faster than previous technological waves.


Below are the key takeaways from the seminar:

Large Models: The Operating System of the AI 2.0 Era

In the PC and internet era, operating systems acted as universal layers, managing and connecting software and hardware resources, enabling diverse applications to run efficiently, and handling various forms of input and output. Today, large models (such as GPT-4, LaMDA, and ERNIE Bot) possess universal underlying capabilities such as understanding, reasoning, and generation. As they mature, large models provide a wide range of functions for new and existing applications, intelligently handling varied tasks and beginning to play a role similar to that of an operating system.

  • Universality: Large AI models, like GPT-4, have extensive universality. These models can be used for various NLP tasks such as text generation, summarization, translation, and question-answering. This makes them a universal "intelligence engine," similar to operating systems providing basic functions for various applications.
  • Scalability: Large AI models can be expanded by fine-tuning or adding datasets for specific tasks to adapt to particular application requirements. This is similar to operating systems allowing users to customize and extend their functionality to meet specific needs.
  • Platform Support: With the proliferation of large AI models, more and more platforms and tools are beginning to support them, providing developers with convenient APIs and interfaces (a minimal API sketch follows this list). This makes large AI models increasingly like operating systems, serving as infrastructure for many applications.
  • Ecosystem: Similar to operating system ecosystems, large AI models are also starting to form a vast ecosystem, including developers, researchers, enterprises, and end-users. This ecosystem drives innovation and technological progress, enabling large AI models to better meet various needs.
  • New Interaction Modes: Large AI models allow people to interact with computers through natural language, enabling a more intuitive and flexible way of interaction. This is similar to the role of operating systems in providing user interfaces, offering simpler and more efficient operation methods for users.
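
To make the "universal intelligence engine" analogy concrete, here is a minimal sketch, assuming the OpenAI Python SDK and access to a GPT-series chat model, in which a single chat-completion endpoint handles summarization, translation, and question-answering simply by changing the prompt. The model name, prompts, and sample text are illustrative assumptions, not a prescription.

```python
# A minimal sketch: one chat-completion endpoint serving several NLP tasks.
# Assumes the OpenAI Python SDK (`pip install openai`) and an API key in the
# OPENAI_API_KEY environment variable; the model name is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_task(instruction: str, text: str, model: str = "gpt-4") -> str:
    """Send one instruction plus input text to the same general-purpose model."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

article = "Large models are starting to play a role similar to operating systems."

# The same "intelligence engine" covers different tasks via prompts alone.
summary     = run_task("Summarize the text in one sentence.", article)
translation = run_task("Translate the text into French.", article)
answer      = run_task("According to the text, what are large models compared to?", article)
```

The point of the sketch is the interface, not the tasks: the same call shape serves every application, which is what makes the operating-system analogy apt.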

Data: The Flywheel Effect

As a new generation of operating system, the data flywheel effect of large AI models will be even more pronounced. Under the current model, user and preprocessed data accumulate within the model, driving performance improvements that further enhance user experience and grow the user base; a positive feedback loop forms between user data and model performance (a minimal sketch of this loop follows the list below). The data privacy and security issues this raises are also among the barriers to industry customers accepting large models.

  • Data accumulation: As more and more users use applications based on large AI models, these applications will collect vast amounts of user-generated data. This data typically includes input text, model responses, and user feedback on model outputs.
  • Model improvement: By analyzing and annotating the collected user-generated data, new training datasets can be generated. These datasets can be used to further fine-tune and optimize large AI models, thereby improving their performance and adapting to specific tasks.
  • Enhanced user experience: As model performance improves, users will have a better experience when using applications based on large AI models. This may lead to users being more willing to continue using these applications and recommend them to others.
  • Increased user numbers: As user experience improves, the number of users of the applications may increase. This will lead to more user-generated data being collected, further strengthening the data flywheel effect.
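
As a rough illustration of the flywheel, the sketch below logs user interactions and feedback, then keeps the positively rated examples as a candidate fine-tuning dataset. The schema, rating scale, and file names are assumptions for illustration, not any vendor's required format.

```python
# A rough sketch of the data flywheel: log interactions, then keep well-rated
# examples as candidate fine-tuning data. Schema and paths are illustrative.
import json
from dataclasses import dataclass, asdict

@dataclass
class Interaction:
    prompt: str    # user input
    response: str  # model output
    rating: int    # user feedback, e.g. 1 (bad) to 5 (good)

LOG_PATH = "interaction_log.jsonl"
FINETUNE_PATH = "finetune_candidates.jsonl"

def log_interaction(item: Interaction) -> None:
    """Step 1: data accumulation — append every interaction to a log."""
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(item), ensure_ascii=False) + "\n")

def build_finetune_set(min_rating: int = 4) -> int:
    """Step 2: model improvement — filter highly rated pairs into a
    prompt/completion dataset that a later fine-tuning job could consume."""
    kept = 0
    with open(LOG_PATH, encoding="utf-8") as src, \
         open(FINETUNE_PATH, "w", encoding="utf-8") as dst:
        for line in src:
            item = json.loads(line)
            if item["rating"] >= min_rating:
                dst.write(json.dumps(
                    {"prompt": item["prompt"], "completion": item["response"]},
                    ensure_ascii=False) + "\n")
                kept += 1
    return kept
```

Better models produce better-rated interactions, which in turn yield more usable training pairs: the loop the bullets above describe.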



Applications: Imagination Is All You Need

Large models, with their versatility, comprehension, and multimodal capabilities, open countless possibilities for applications. The rapid response of the last generation's mobile internet giants and start-ups has allowed large models to quickly penetrate scenarios such as search (e.g., New Bing, Bard), social (e.g., Snapchat), e-commerce (e.g., Shopify), creativity (e.g., Adobe), and office tools (e.g., Slack). Start-ups are also pushing the envelope in areas such as social, chat, live streaming, and intelligent customer service, dramatically improving experience and efficiency with AI-native applications.

Open-source models and closed-source model APIs enable start-ups and existing businesses (including many FutureX portfolio companies) to quickly integrate large models (mostly the GPT series) into their applications, achieving functional and efficiency gains. The main methods currently in use include the following (a sketch of the second approach follows the list):

  • Direct API calls: Fast and easy, directly integrating the model into existing applications, but may not meet the needs of some specific domains or tasks;
  • Preprocessing and post-processing: Loosely coupling the general model to better adapt to specific tasks and improve performance in specific scenarios. Preprocessing enables the model to better understand data; post-processing makes output content more understandable to users. Some additional work is required (but significantly reduced compared to traditional NLP workload), and not all problems can be solved;
  • Fine-tuning: Using a small amount of specialized data to teach the model domain-specific knowledge, improving accuracy in particular scenarios. This also requires some data preparation and additional work, may take multiple iterations, and its effectiveness can still be limited in some scenarios.
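
As a rough illustration of the preprocessing and post-processing approach, the sketch below wraps a generic model call: preprocessing injects domain context and frames the task, and post-processing normalizes the output into a structured record. The `call_model` stub, the glossary, and the ticket-classification task are hypothetical placeholders, not any specific product's API.

```python
# A rough sketch of preprocessing/post-processing around a general-purpose model.
# `call_model` is a hypothetical stub standing in for any LLM API call.
import json

DOMAIN_GLOSSARY = {"NDR": "network detection and response"}  # illustrative jargon map

def call_model(prompt: str) -> str:
    """Placeholder for a real API call (e.g., a GPT-series chat endpoint)."""
    raise NotImplementedError

def preprocess(ticket_text: str, max_chars: int = 4000) -> str:
    """Help the model understand the data: expand jargon, trim overlong input,
    and frame the task explicitly."""
    for abbr, meaning in DOMAIN_GLOSSARY.items():
        ticket_text = ticket_text.replace(abbr, f"{abbr} ({meaning})")
    ticket_text = ticket_text[:max_chars]
    return (
        "You are a customer-support assistant. "
        "Reply with JSON containing 'category' and 'suggested_reply'.\n\n"
        f"Ticket:\n{ticket_text}"
    )

def postprocess(raw_output: str) -> dict:
    """Make the output easier for users and downstream systems to consume."""
    try:
        return json.loads(raw_output)
    except json.JSONDecodeError:
        # Fall back to a safe wrapper instead of surfacing malformed output.
        return {"category": "unknown", "suggested_reply": raw_output.strip()}

def handle_ticket(ticket_text: str) -> dict:
    return postprocess(call_model(preprocess(ticket_text)))
```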

These methods allow businesses to adapt to the AI 2.0 era on top of existing general-purpose large models, but they are still not the optimal solution for vertical industry applications. In the future, there may be lightweight solutions optimized for specific industries and built on general-purpose large models.

Large models not only bring improvements in AI capabilities but also provide new modes of interaction. In the AI 2.0 era, natural language becomes the standard interaction interface, and the form and application boundaries of intelligent hardware will change. Natural language interaction allows a wider range of intelligent terminals to reposition themselves, shifting from toys and auxiliary devices toward devices with core functions of their own. Glasses, headphones, speakers, and smart home appliances are expected to see significant gains in functionality and importance, while application software will move from automating simple functions to intelligently completing complex tasks.



Infrastructure: Continuous Demand for Computing Resources

The application and ecosystem of large models are still in their early stages, while the demand for underlying infrastructure presents relatively certain opportunities. The huge demand for computing resources in the training of large models will drive the long-term development of cloud computing, data centers, and underlying computing and networking hardware markets.

Improvements in the efficiency of underlying hardware and reductions in price will greatly accelerate the training and adoption of large models. On one hand, this drives the industry to keep investing in more efficient and powerful GPU and HBM products; on the other hand, it drives broader innovation, including architectures optimized for deep-learning training, more innovative semiconductor approaches (such as compute-storage integration), and more efficient transmission hardware (such as silicon photonics). Export restrictions on advanced chips in the domestic (Chinese) market have also brought enormous opportunities for domestic GPU, high-performance storage, and high-performance transmission manufacturers.


Challenges Faced by Enterprise-level Applications

Compared with the internet industry's rapid response to and embrace of the AI 2.0 era, traditional industries (especially technology-intensive ones such as finance and high-end manufacturing) are closely monitoring AI developments, but many challenges remain for deploying large models in enterprise-level applications:

  • Data security issues: 1) The versatility and broad application scope of large models may lead users (industry employees) to unintentionally leak proprietary information, which is why some sensitive industries have recently banned employees from using ChatGPT; 2) under the current model, data flows through and settles within the model, which conflicts with the data security requirements of enterprise-level applications. Using technical means (such as privacy-preserving computation) to separate data content from computing power, so that the model can be used without its provider seeing the underlying data, will be an important prerequisite for large models to enter enterprise-level applications (a minimal redaction sketch follows this list);
  • Data integration issues: Traditional industries usually have deep but siloed data accumulation, with large amounts of core data kept only inside closed internal systems. To fully realize the value of large models, they must be connected to enterprises' internal data systems, and private deployment of large models may become an option in the future;
  • Data credibility: Enterprise-level applications have extremely high requirements for the rigor and accuracy of results, and at present ChatGPT, for example, cannot guarantee the accuracy of its output. If large models are applied to core enterprise scenarios, application outputs that incorporate them need to be supported by a rigorous evidence chain and reasoning process.
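
As one hedged illustration of the data-security point, the sketch below redacts obviously sensitive fields from a prompt before it ever leaves the enterprise boundary. Real privacy-preserving computation (secure enclaves, federated approaches, and similar techniques) goes much further; the regex patterns and placeholder names here are assumptions for illustration only.

```python
# A minimal, illustrative redaction pass applied before sending text to an
# external large model. This is not full privacy-preserving computation;
# the patterns and placeholders are illustrative assumptions.
import re

REDACTION_RULES = [
    (re.compile(r"\b\d{11}\b"), "[PHONE]"),               # e.g. 11-digit mobile numbers
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{15,19}\b"), "[ACCOUNT_NUMBER]"),    # card or account numbers
]

def redact(text: str) -> str:
    """Strip obviously sensitive tokens so they never reach the model provider."""
    for pattern, placeholder in REDACTION_RULES:
        text = pattern.sub(placeholder, text)
    return text

def safe_prompt(internal_document: str, question: str) -> str:
    """Build a prompt from internal material only after redaction."""
    return f"Context:\n{redact(internal_document)}\n\nQuestion: {question}"
```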


Our Take on Large Models

Universal large models are the core systems of AI 2.0 and are a battleground for major tech giants. There will be several dominant players, including not only international tech giants but also China's own large models. At the same time, a rich application layer will bring enormous opportunities to various industries by implementing large models, enabling more intelligent solutions. There will also be continuous opportunities in the infrastructure layer, including hardware, cloud computing, energy, and more, providing ongoing support and promotion for the development of AI 2.0.

FutureX Capital is actively investing across these layers.
