OmniML reposted
Inference efficiency is the key enabler for the #genai revolution. At the center of it are algorithmic optimizations such as quantizing models to lower precision or pruning model weights with structured sparsity. The newly announced #nvidia TensorRT Model Optimizer is the beginning of a unified platform for algorithmic inference optimization, starting with the best quantization recipes such as AWQ and SmoothQuant. Many years of research and engineering on #modelcompression have gone into this topic, and there is much more to add. We are super excited that our product has finally launched. Check out the blog post for details, and follow our public examples and documentation on GitHub: https://lnkd.in/gwaTDHj7
NVIDIA TensorRT Model Optimizer, the newest member of the #TensorRT ecosystem, is a library of post-training and training-in-the-loop model optimization techniques:
- Post-training quantization
- Quantization-aware training
- Sparsity
Read our blog: https://nvda.ws/3Wt7nUA
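To make the core idea behind post-training quantization concrete, here is a minimal NumPy sketch of symmetric per-tensor INT8 weight quantization. This is an illustration of the general technique only, not the Model Optimizer API; the function names are hypothetical, and real recipes like AWQ and SmoothQuant add calibration and per-channel scaling on top of this.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map float weights
    into the integer range [-127, 127] with a single scale factor."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Quantize a small random weight matrix and check the rounding error,
# which is bounded by half a quantization step (scale / 2).
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(float(np.max(np.abs(w - w_hat))) <= scale / 2 + 1e-6)
```

Storing `q` (1 byte per weight) plus one scale instead of FP32 weights cuts memory roughly 4x, which is the basic lever behind lower-precision inference.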