Meta vs OpenAI: Shaping the Future of Open Source AI

Meta leads the open source AI push. Nvidia and Intel set new MLPerf speed records. Microsoft releases a new FP8 mixed-precision framework.

Let’s dive in!

ML Engineering Highlights:

  • Forget ChatGPT, why Llama and open source AI win 2023: In 2023 Meta released Llama and Llama 2 as free, openly available models, spawning hundreds of derivatives and a broader open source ecosystem in AI. Llama and open source AI have had a tremendous impact on the generative AI landscape, but they have also sparked debate over safety and security concerns. The article argues that the open source community will be essential to generative AI's long-term impact and its potential to transform many aspects of work and life.

  • Nvidia, Intel claim new LLM training speed records in new MLPerf 3.1 benchmark: The MLCommons MLPerf Training 3.1 benchmark, which includes more than 200 performance results, has become a barometer for the industry's progress in AI training. Training AI models in 2023 is significantly faster than before, with some submitters reporting 50% to 3x performance improvements, particularly in large language model (LLM) training. Both Intel and Nvidia reported notable gains in LLM training speed.
  • OpenAI announces customizable 'GPTs' for businesses and consumers: OpenAI has launched customizable AI agents called "GPTs," which let users create tailored versions of ChatGPT for specific purposes without writing code. These tools are available to paying subscribers of ChatGPT Plus and ChatGPT Enterprise and can be used for a variety of personal and professional tasks. The introduction of GPTs is a major step toward personalizing and democratizing AI, opening up new possibilities for AI applications.

Research Highlights:

  • FP8-LM: Training FP8 Large Language Models: This paper by researchers at Microsoft Azure and Microsoft Research presents a new FP8 automatic mixed-precision framework for training large language models (LLMs). The framework offers three levels of FP8 utilization, gradually incorporating 8-bit gradients, optimizer states, and distributed learning. Experimental results show that the FP8 framework reduces memory usage by 42%, runs 64% faster than the Megatron-LM framework, and surpasses the speed of the Nvidia Transformer Engine by 17%, making it cost-effective for training large models. The FP8 framework also applies to other tasks such as instruction tuning and reinforcement learning. A minimal sketch of the scale-and-round idea behind FP8 training appears after this list.

  • Neural MMO 2.0: A Massively Multi-task Addition to Massively Multi-agent Learning: This paper by researchers at MIT, Carper AI, and Parametrix AI introduces Neural MMO 2.0, a massively multi-agent environment designed for reinforcement learning research. The updated version includes a flexible task system that lets users define their own objectives and reward signals. With procedurally generated maps and support for up to 128 agents, the platform challenges researchers to train agents that can adapt to unseen tasks, maps, and opponents. The authors are also running a competition at NeurIPS 2023 to encourage initial research on the platform.
  • Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models: This paper by researchers at Google DeepMind explores how transformers, particularly large language models, adapt to new tasks without explicit training. The study focuses on transformers trained on pairs of input-output sequences rather than natural language. The findings suggest that while transformers can effectively identify and learn new tasks represented in their pretraining data, they struggle with out-of-domain tasks and show limited generalization. A toy sketch of how such input-output pretraining mixtures can be constructed follows this list.
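
For readers curious what the FP8 idea looks like in practice, here is a minimal, self-contained numpy sketch of the per-tensor scale-and-round step that FP8 training recipes rely on. This is an illustration of the general technique, not the FP8-LM implementation: the 3-mantissa-bit rounding helper and the E4M3 clipping range merely stand in for a real FP8 cast.

```python
# Illustrative only -- not the FP8-LM code. Simulate the scale-then-cast idea
# behind FP8 mixed precision: scale a gradient tensor into the E4M3 dynamic
# range, round it to ~3 mantissa bits, and keep the scale so the value can be
# recovered for the optimizer step.
import numpy as np

E4M3_MAX = 448.0  # largest finite magnitude in the FP8 E4M3 format


def round_to_mantissa_bits(x: np.ndarray, bits: int = 3) -> np.ndarray:
    """Snap each value to the nearest float with roughly `bits` explicit mantissa bits."""
    m, e = np.frexp(x)                               # x = m * 2**e with 0.5 <= |m| < 1
    m = np.round(m * 2 ** (bits + 1)) / 2 ** (bits + 1)
    return np.ldexp(m, e)


def fp8_quantize(grad: np.ndarray):
    """Per-tensor scaling into the FP8 range (underflow and subnormals ignored)."""
    scale = E4M3_MAX / max(float(np.abs(grad).max()), 1e-12)
    q = round_to_mantissa_bits(np.clip(grad * scale, -E4M3_MAX, E4M3_MAX))
    return q, scale


def fp8_dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q / scale


grad = (np.random.randn(4, 1024) * 0.01).astype(np.float32)
q, s = fp8_quantize(grad)
print("max abs round-trip error:", np.abs(fp8_dequantize(q, s) - grad).max())
```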
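
For the pretraining-mixture study, here is a small sketch of how training data of that shape can be constructed: sequences of (x, f(x)) pairs sampled from a mixture of simple function classes, which a sequence model can later be probed on for in-context task identification. The two function classes and their mixing weights below are invented for illustration and are not taken from the paper.

```python
# Illustrative only: build sequences of (x, f(x)) pairs drawn from a mixture
# of simple function classes, mimicking the kind of non-language pretraining
# data the paper studies.
import numpy as np

rng = np.random.default_rng(0)


def sample_linear(n):
    w = rng.normal(size=2)
    xs = rng.uniform(-1, 1, size=n)
    return xs, w[0] * xs + w[1]


def sample_sinusoid(n):
    amp, freq = rng.uniform(0.5, 2.0, size=2)
    xs = rng.uniform(-1, 1, size=n)
    return xs, amp * np.sin(freq * np.pi * xs)


FUNCTION_CLASSES = [sample_linear, sample_sinusoid]
MIX_WEIGHTS = [0.5, 0.5]  # hypothetical pretraining mixture proportions


def make_sequence(n_pairs=16):
    """Return one training sequence: interleaved (x, f(x)) values from a sampled class."""
    cls = rng.choice(len(FUNCTION_CLASSES), p=MIX_WEIGHTS)
    xs, ys = FUNCTION_CLASSES[cls](n_pairs)
    seq = np.stack([xs, ys], axis=1).reshape(-1)  # x1, y1, x2, y2, ...
    return seq, cls


batch = [make_sequence() for _ in range(4)]
for seq, cls in batch:
    print(f"class={cls}  first pairs: {np.round(seq[:6], 3)}")
```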

Don’t Miss the Submission Deadline

  • CVPR 2024: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 Submission Deadline: Sat Nov 11 2023 02:59:59 GMT-0500
  • ICAPS 2024: The International Conference on Automated Planning and Scheduling 2024 Submission Deadline: Thu Dec 14 2023 06:59:59 GMT-0500
  • ICML 2024: International Conference on Machine Learning Submission Deadline: Thu Feb 01 2024 11:59:00 GMT-1200
  • CHIL 2024: Conference on Health, Inference, and Learning Submission Deadline: Mon Feb 05 2024 23:59:59 GMT-0500
  • ECCV 2024: European Conference on Computer Vision 2024 Submission Deadline: Fri Mar 08 2024 06:59:00 GMT-0500

Want to learn more from Lightning AI? “Subscribe” to make sure you don’t miss the latest flashes of inspiration, news, tutorials, educational courses, and other AI-driven resources from around the industry. Thanks for reading!
