Quotient AI reposted this
“When we first started hooking Copilot Chat in, we realized we’d get everything under the sun—people asking for random stuff that had nothing to do with code. We had billions of requests, so we had to cluster the logs just to figure out what was actually happening. That’s how we discovered real usage patterns—and that’s how we got serious about building our eval harness [...] At the end of the day, without evaluations, you’re flying completely blind. If you can’t measure it, you can’t improve it.” Check out Freddie Vargus and Reid Mayo from OpenPipe dropping some knowledge in the latest podcast.
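If you’re wondering what “cluster the logs” can look like in practice, here’s a minimal sketch of the idea – my own illustration, not GitHub’s actual pipeline – using off-the-shelf sentence embeddings and k-means. The model name, sample prompts, and cluster count are all placeholder assumptions.

```python
# Minimal sketch: cluster raw chat prompts to surface real usage patterns.
# Assumes sentence-transformers and scikit-learn are installed; the model
# name, sample prompts, and cluster count are illustrative placeholders.
from collections import Counter

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

prompts = [
    "Write a unit test for this function",
    "Explain what this regex does",
    "What's a good restaurant near the office?",  # off-topic traffic shows up too
    "Refactor this loop to use a list comprehension",
    "Summarize this stack trace",
]

# Embed each prompt, then group semantically similar ones together.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(prompts)

kmeans = KMeans(n_clusters=3, n_init="auto", random_state=0)
labels = kmeans.fit_predict(embeddings)

# Inspect cluster sizes and a sample prompt per cluster to name the patterns.
for cluster_id, count in Counter(labels).most_common():
    example = next(p for p, l in zip(prompts, labels) if l == cluster_id)
    print(f"cluster {cluster_id}: {count} prompts, e.g. {example!r}")
```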
Engineers who know me know I’ve been on an Evals kick for a few months now – interviewing top Founders, Staff AI Engineers, and thought leaders in the space. I’ve traveled all over the US going to AI conferences back to back to back, and I keep doubling down on the Evals topic for one practical reason: it continues to come up, conversation after conversation, as the single most challenging problem in the Applied AI engineering space. That was my experience in late 2023 – and it’s still true in 2025.

For this reason I’m incredibly excited to announce my interview with one of the leading minds in Evals – Freddie Vargus.

Freddie (and his co-founder Julia Neagu) led the team that built the Evals for the first significant LLM-backed product post-ChatGPT. You know, the one whose name has defined "human in the loop" AI products ever since? GitHub Copilot. So they’ve been deeply serious about this topic for years. After GitHub, they went all-in by founding the evals company Quotient AI. Their mission since has been to make SOTA Evals techniques accessible to builders (who want to get sh*t done, but in a way that doesn’t compromise the future of their tech).

Key insights from our convo:

- CIAI (Continuous Improvement of AI): Audit usage logs to surface gaps in your Knowledge Base or other canonical sources of Ground Truth, then patch those gaps to systematically improve Agent/Copilot quality. (Jared Scheel knows all about this.)
- Monitor Outcome Distributions: Map real-world output distributions against expected output distributions to surface potential issues. Agent has four tools but calls one of them 99% of the time? Look into that. (Rough sketch at the bottom of this post.)
- Evals ARE your product: Measuring and monitoring quality is more than just “tech debt reduction.” Unless you are OpenAI, Anthropic, DeepSeek, or some other SOTA lab building foundational AI, the fundamental value of your GenAI product is its ability to STEER foundational AI and ALIGN IT to your end customer’s needs. Evals are critical to both.
- Two-Week Evals Sprint: Bootstrapping evals for a project can feel daunting. Take (balanced) action by predefining the evals objectives/tasks you will execute on, and set hard deadlines to avoid quagmires.
- Evaluate Subcomponents: Don’t just evaluate final outputs – isolate and test retrieval pipelines, tool calls, and everything else upstream, since each has potential side effects on the final output. (Sketch at the bottom of this post.)

Freddie is a hardcore technical founder and he’s unusually hardcore on this topic – don’t miss it.

https://lnkd.in/gHUx3wbD
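To make the “monitor outcome distributions” point concrete, here’s a rough sketch of the kind of check it implies: count observed tool calls and flag any tool whose share drifts far from what you expected. The tool names, expected shares, and alert threshold below are made-up assumptions for illustration, not anything from the episode.

```python
# Sketch: compare observed tool-call frequencies against the distribution
# you expected when you designed the agent. Tool names, expected shares,
# and the alert threshold are illustrative assumptions, not real numbers.
from collections import Counter

EXPECTED_SHARE = {  # what you intended when you gave the agent four tools
    "search_docs": 0.40,
    "run_code": 0.30,
    "open_ticket": 0.20,
    "send_email": 0.10,
}
ALERT_THRESHOLD = 0.25  # flag tools whose observed share drifts this far

def tool_call_drift(logged_calls: list[str]) -> dict[str, float]:
    """Return observed-minus-expected share per tool."""
    counts = Counter(logged_calls)
    total = sum(counts.values()) or 1
    return {
        tool: counts.get(tool, 0) / total - expected
        for tool, expected in EXPECTED_SHARE.items()
    }

# e.g. an agent that calls one tool almost all the time
logged = ["search_docs"] * 99 + ["run_code"]
for tool, drift in tool_call_drift(logged).items():
    if abs(drift) > ALERT_THRESHOLD:
        print(f"look into {tool}: observed share is off by {drift:+.0%}")
```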
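And for “evaluate subcomponents,” a minimal sketch of scoring the retrieval step on its own with recall@k, before the LLM ever sees the context. The labeled queries, doc IDs, and retrieve() stub are hypothetical placeholders for your own pipeline and ground truth.

```python
# Sketch: score the retrieval step on its own with recall@k, independent of
# whatever the LLM does downstream. The labeled examples and the retrieve()
# stub are hypothetical placeholders for your own pipeline and ground truth.

# query -> doc IDs a human marked as relevant (your ground truth)
LABELED = {
    "how do I rotate an API key": {"doc_security_12", "doc_api_03"},
    "what is the refund policy": {"doc_billing_07"},
}

def retrieve(query: str, k: int = 5) -> list[str]:
    """Stand-in for your real retriever (vector store, BM25, etc.)."""
    fake_index = {
        "how do I rotate an API key": ["doc_api_03", "doc_faq_01", "doc_security_12"],
        "what is the refund policy": ["doc_faq_01", "doc_shipping_02"],
    }
    return fake_index.get(query, [])[:k]

def recall_at_k(k: int = 5) -> float:
    """Average fraction of labeled-relevant docs that show up in the top k."""
    scores = []
    for query, relevant in LABELED.items():
        retrieved = set(retrieve(query, k))
        scores.append(len(retrieved & relevant) / len(relevant))
    return sum(scores) / len(scores)

print(f"retrieval recall@5: {recall_at_k(5):.2f}")  # 1.0 and 0.0 -> 0.50
```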