Simply Phi-nominal

Microsoft recently released the new Phi-3.5 models: Phi-3.5-MoE-instruct, Phi-3.5-mini-instruct, and Phi-3.5-vision-instruct.

Phi-3.5-MoE-instruct is a 42-billion-parameter open-source mixture-of-experts model that demonstrates significant improvements in reasoning capabilities. Despite activating only a fraction of its parameters per inference, it outperforms models such as Llama 3.1 8B and Gemma 2 9B across various benchmarks.

There’s more.

The Phi-3.5-MoE-instruct model, with precisely 41.9 billion parameters, handles more advanced reasoning. The Phi-3.5-vision-instruct, with 4.15 billion parameters, is designed for vision tasks like image and video analysis.

Meanwhile, the Phi-3.5-mini-instruct, with 3.82 billion parameters, is built for basic and quick reasoning tasks.

Despite its competitive performance, Phi-3.5-MoE falls slightly behind GPT-4o-mini but surpasses Gemini 1.5 Flash in benchmarks. The model supports multilingual applications, although the specific languages covered remain unclear.

Phi-3.5-MoE features 16 experts, with two being activated during generation, and has 6.6 billion parameters engaged in each inference. It supports multilingual capabilities and extends its context length to 128,000 tokens. The model was trained over 23 days using 512 H100-80G GPUs, with a total training dataset of 4.9 trillion tokens.
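To make the "16 experts, two active" idea concrete, here is a minimal sketch of top-2 expert routing in a mixture-of-experts layer. The dimensions are toy values, not Phi-3.5-MoE's actual sizes, and this is a generic illustration of the technique rather than Microsoft's implementation:

```python
import numpy as np

def top2_moe_layer(x, gate_w, expert_ws, k=2):
    """Sketch of top-k (here top-2) mixture-of-experts routing.

    x         : (d,) hidden state for one token
    gate_w    : (d, n_experts) router weight matrix
    expert_ws : list of n_experts (d, d) expert weight matrices
    """
    logits = x @ gate_w                       # one router score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts only
    # Only the chosen experts run, so compute scales with k, not n_experts --
    # this is why a 42B-parameter model can use ~6.6B parameters per inference.
    return sum(w * (expert_ws[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                          # 16 experts, 2 active, as in Phi-3.5-MoE
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
y = top2_moe_layer(x, gate_w, experts)
print(y.shape)  # (8,)
```

The output has the same shape as the input hidden state, so MoE layers drop into a transformer stack exactly where a dense feed-forward layer would sit.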

Its development included supervised fine-tuning, proximal policy optimisation, and direct preference optimisation to ensure precise instruction adherence and robust safety measures. The model is intended for use in memory and compute-constrained environments and latency-sensitive scenarios.

Key use cases for Phi-3.5-MoE include general-purpose AI systems, applications requiring strong reasoning in code, mathematics, and logic, and as a foundational component for generative AI-powered features.

The model’s tokeniser supports a vocabulary size of up to 32,064 tokens, with placeholders for downstream fine-tuning. Microsoft provided a sample code snippet for local inference, demonstrating its application in generating responses to user prompts.
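As a rough illustration of what such a prompt looks like, here is a sketch of a chat-prompt builder assuming the `<|role|> … <|end|>` template published in the Phi-3 model cards. In practice you should rely on the tokeniser's own `apply_chat_template` rather than hand-building strings:

```python
def build_phi_chat_prompt(messages):
    """Flatten chat messages into a Phi-3-style prompt string.

    Assumes the <|role|> ... <|end|> chat template from the Phi-3
    model cards; verify against the tokeniser's chat template before
    relying on it.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")           # cue the model to respond
    return "".join(parts)

prompt = build_phi_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarise the Phi-3.5 release in one line."},
])
print(prompt)
```

The trailing `<|assistant|>` turn is left open on purpose: generation continues from that marker until the model emits its end-of-turn token.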

With 3.82 billion parameters, Phi-3.5-mini-instruct is lightweight yet powerful, outperforming larger models such as Llama 3.1 8B and Mistral 7B. It supports a 128K token context length, significantly more than many competitors in its size class, which often support only up to 8K.

Recently, NVIDIA announced the launch of Mistral-NeMo-Minitron 8B, which outperforms other similarly sized models across nine key benchmarks, delivering top-tier accuracy in a compact form that enables real-time generative AI on edge devices while reducing computational costs and enhancing security.

Microsoft Phi-3.5 vs the world

Microsoft’s Phi-3.5-mini is positioned as a competitive option in long-context tasks such as document summarisation and information retrieval, outperforming several larger models like Llama-3.1-8B-instruct and Mistral-Nemo-12B-instruct-2407 on various benchmarks.

The model is intended for commercial and research use, particularly in memory and compute-constrained environments, latency-bound scenarios, and applications requiring strong reasoning in code, maths, and logic.

The Phi-3.5-mini model was trained over 10 days using 512 H100-80G GPUs. The training process involved processing 3.4 trillion tokens, leveraging a combination of synthetic data and filtered publicly available websites to enhance the model’s reasoning capabilities and overall performance.
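A quick back-of-the-envelope check of those training numbers, assuming all 512 GPUs were fully occupied for the whole 10 days:

```python
tokens = 3.4e12                      # training tokens reported for Phi-3.5-mini
gpus = 512                           # H100-80G GPUs
days = 10
gpu_seconds = gpus * days * 24 * 3600
throughput = tokens / gpu_seconds    # tokens processed per GPU per second
print(f"{throughput:,.0f} tokens per GPU-second")  # 7,686 tokens per GPU-second
```

Several thousand tokens per GPU-second is a plausible training throughput for a model of this size, so the reported figures are internally consistent.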

Phi-3.5 Vision is a 4.2 billion parameter model and it excels in multi-frame image understanding and reasoning. It has shown improved performance in benchmarks like MMMU, MMBench, and TextVQA, demonstrating its capability in visual tasks. It even outperforms OpenAI GPT-4o on several benchmarks.

The model integrates an image encoder, connector, projector, and the Phi-3 mini language model. It supports both text and image inputs and is optimised for prompts using a chat format, with a context length of 128K tokens.
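For multi-frame inputs, the Phi-3 vision model cards describe numbered image placeholders inside the chat prompt. The sketch below assumes that `<|image_k|>` convention; check the tokeniser's chat template before relying on it:

```python
def build_vision_prompt(question, n_images):
    """Build a multi-image chat prompt for a Phi-3.5-vision-style model.

    Assumes the numbered <|image_k|> placeholder convention from the
    Phi-3 vision model cards; the actual image tensors are supplied
    separately to the processor, not embedded in the string.
    """
    placeholders = "".join(f"<|image_{k}|>\n" for k in range(1, n_images + 1))
    return f"<|user|>\n{placeholders}{question}<|end|>\n<|assistant|>\n"

prompt = build_vision_prompt("What changes between these frames?", n_images=3)
print(prompt)
```

Each placeholder marks where one frame's encoded image tokens are spliced into the sequence, which is how a single prompt can reason across multiple frames of a video.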

The model was trained over six days using 256 A100-80G GPUs, processing 500 billion tokens that include both vision and text data.

The Phi-3.5 models are now available on the AI platform Hugging Face under an MIT licence, making them accessible for a wide range of applications. This release aligns with Microsoft’s commitment to providing open-source AI tools that are both efficient and versatile.

Check out the full story here.


Cursor AI Codes Better Than GitHub Copilot Ever Will

It’s crazy how AI tools are making everyone a developer. Earlier, it was GitHub Copilot, though its allure among developers has slowly been dying. Now, it is Cursor AI, which is being celebrated as the best developer tool for AI at the moment.

To give an example, Ricky Robinett, the VP of developer relations at Cloudflare, posted a video of his eight-year-old daughter building a chatbot on the Cloudflare developer platform in just 45 minutes using Cursor AI, documenting the whole process, even the spelling mistakes while giving prompts!

Learn more about Cursor AI here.


AI Bytes

  • OpenAI has partnered with Condé Nast to enhance news access by integrating AI with top publications like Vogue and The New Yorker, outpacing previous collaborations with media giants like Vox Media and Financial Times.
  • NVIDIA, in collaboration with Calsoft, recently automated India’s tollbooths using advanced vision AI, achieving 95% accuracy in licence plate detection despite non-standardised plates and environmental challenges.
  • Meta recently launched Metamate, an AI-powered assistant designed by Soumith Chintala and the team to boost internal productivity with custom agents, document summarisation, and workflow-specific tools.
