MAIHEM (YC W24)

Data Infrastructure and Analytics

San Francisco, California · 1,200 followers

Industry-leading quality testing for mission critical AI applications | YC W24 | www.maihem.ai

View all 7 employees

About us

At MAIHEM, we create AI agents that continuously test AI products, such as conversational AI chat- and voice bots. We help companies improve and stress-test their AI products – automating quality assurance, red-teaming, and customer experience optimization.

Website: https://www.maihem.ai/
Industry: Data Infrastructure and Analytics
Company size: 2-10 employees
Headquarters: San Francisco, California
Type: Privately held
Founded: 2023
Specialties: LLM analytics, generative AI analytics, AI safety, AI robustness, AI user analytics, NLP, LLMs, Predictive customer analytics, Business analytics, AI analytics, AI fairness, AI regulation readiness, risk evaluations, performance evaluations, fine-tuning, synthetic data, synthetic data simulations, AI risk simulations, AI agents, AI quality assurance, quality assurance, AI compliance, AI regulation, and automated quality assurance

Locations

Primary

2261 Market St

STE 5732

San Francisco, California 94114, US

124 City Road

London, England EC1V 2NX, GB

Employees at MAIHEM (YC W24)

Lorcan Delaney

Principal at firstminute capital | ex-Morgan Stanley IB | Oxford CS
Eduardo Candela

Co-Founder @ MAIHEM (YC W24) | PhD AI Safety, Imperial | MIT alum | ex-Tesla
Gabriele Morello

Founding AI Engineer @ MAIHEM (YC W24) | prev. CERN, KTH
[Jayce] Lye Jia Jun

AI Red Team Engineer @ MAIHEM.ai (YC W24) | AI Safety Researcher | SMU Information Systems Undergraduate | NYP Top Cybersecurity Graduate | Former…

Updates

MAIHEM (YC W24)

1,200 followers
1 month ago
We had a fantastic time hosting our “How (not) to test AI” event at TECH WEEK by a16z. A big thank you to all our wonderful attendees. We weren’t able to host everyone, as we received hundreds of applications, but we promise to host more events of this sort soon, in both San Francisco and London. Follow us to stay tuned!

The tl;dr of our event: the key framework for implementing successful and controllable AI applications:
1. Generate test datasets relevant to your AI use case and company
2. Test your AI application systematically with those datasets
3. Improve your AI models based on those tests, e.g. by adjusting prompts or fine-tuning
4. Run this process continuously

At Maihem, we provide you a platform to automate all of that, so you can build AI that works - responsibly and at scale. Reach out to us to learn how we can help your organization set up our industry-leading AI quality assurance and security testing solution. A big shoutout to our sponsors and supporters for this event at Urban Innovation Fund and Presidio Legal, P.C. See you at our next event!
#SFTechWeek #AI #AIsafety #AIsecurity #AIquality #qualityassurance #LLMs #YC
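The four steps in the post above can be sketched as a small loop. This is only a toy illustration, not Maihem's platform or API: the names `Case`, `generate_test_cases`, `run_tests`, `release_loop`, `app_v1`, and `app_v2` are all hypothetical, the "AI application" is a plain function, and the pass criterion (an expected phrase appearing in the answer) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

App = Callable[[str], str]  # hypothetical stand-in for an AI application


@dataclass
class Case:
    prompt: str
    expected_phrase: str  # assumed pass criterion: phrase must appear in the answer


def generate_test_cases() -> List[Case]:
    # Step 1: a hand-written dataset for a made-up support-bot use case.
    return [
        Case("How do I reset my password?", "reset link"),
        Case("What are your opening hours?", "9am"),
    ]


def run_tests(app: App, cases: List[Case]) -> List[Case]:
    # Step 2: run every case and return the ones that fail.
    return [c for c in cases if c.expected_phrase.lower() not in app(c.prompt).lower()]


def app_v1(prompt: str) -> str:
    return "Please contact support."


def app_v2(prompt: str) -> str:
    # Step 3: an "improved" version, standing in for a prompt change or fine-tune.
    if "password" in prompt.lower():
        return "We will email you a reset link."
    if "hours" in prompt.lower():
        return "We are open 9am to 5pm."
    return "Please contact support."


def release_loop(versions: List[Tuple[str, App]], cases: List[Case]) -> str:
    # Step 4: re-run the same suite on each new version; return the first that passes.
    for name, app in versions:
        failures = run_tests(app, cases)
        print(f"{name}: {len(failures)} failing case(s)")
        if not failures:
            return name
    raise RuntimeError("no version passed the test suite")


cases = generate_test_cases()
passing = release_loop([("v1", app_v1), ("v2", app_v2)], cases)
```

In a real setup the phrase check would likely be replaced by a richer evaluator, and the loop would run on every change rather than over a fixed list of versions.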
MAIHEM (YC W24)

1,200 followers
5 days ago
Exciting to be at KI.ckstart this week! Meet our CEO Max Ahrens there as he is speaking about the future and best practices for testing AI and making genAI applications ready to be deployed by companies.
Joshua Heller

AI in software development | Full-stack developer | Innovation consultant @ generic.de
6 days ago

It's the value that counts, not the hype: what generative AI really delivers. Yesterday, together with Artur Felic, I had the opportunity to attend the KI.ckstart Gen AI event, and I have to say: it was a concentrated dose of inspiration, practical knowledge, and networking around generative AI and its possible applications. A central message: "Don't start with the solution, start with the problem." This sentence from Alexander Britz (Microsoft Deutschland) ran like a common thread through the keynotes and discussions. Rémi Denoix and Kirstin Heinl (Zukunftszentrum Süd) also emphasized that the key to successful AI projects lies in starting with the problem rather than the solution. My highlight: the hands-on look at the secure development of LLM applications. Max Ahrens of MAIHEM (YC W24) showed how important testing and monitoring are for reducing risks such as hallucinations, prompt injections, or sensitive topics like politics in LLM applications. For me, as someone who prototypes and researches a lot in the LLM space, this is a very exciting topic! AI is a powerful tool, but it will only move us and our companies forward if we use it correctly. My takeaway: AI offers gigantic opportunities, but the first question should always be: which problem do I want to solve? How about you? Which problem are you currently tackling with generative AI?
MAIHEM (YC W24)

1,200 followers
2 weeks ago
Learn how to detect and mitigate AI hallucinations with our industry-leading AI testing and monitoring solutions. In our latest article, we discuss how we leverage cutting-edge AI research findings to build AI hallucination detectors that beat all benchmark models.

Our solution: a Two-Pass Detection Method inspired by Map-Reduce. Multiple LLM systems analyze and categorize each claim as supported, unsupported, contradicted, or inferred. The results are then automatically consolidated and discrepancies are resolved, yielding an AI hallucination detector that achieves state-of-the-art results on the most challenging claim verification benchmarks.

Benefits:
1. Highest accuracy: catching critical hallucinations in your AI product.
2. Scalable & efficient: cost-effective processing at large scale.
3. Reliable & interpretable outputs: enhancing the trustworthiness of your AI-generated content and of our evaluations.

Read the full article here to learn how we can help your organization with our industry-leading approach to AI hallucination detection: https://lnkd.in/e-AdfG5P
#AI #genAI #AItesting #AIqualityassurance #hallucinations #AIreliability #trustworthyAI #AIsafety

Detecting Hallucinations in Retrieval-Augmented Generation (RAG) Systems: A Two-Pass Approach

maihem.ai
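To make the two-pass idea described in the post concrete, here is a minimal sketch under stated assumptions. It is not Maihem's implementation: the three "judges" are simple rule-based functions standing in for LLM systems, all function names are hypothetical, and the consolidation rule (majority vote, with ties resolved toward the more severe label) is an assumption chosen for illustration. Only the four labels come from the post.

```python
import re
from typing import Callable, Dict, List

# The four labels named in the post; the least-to-most-severe ordering is an assumption.
LABELS = ["supported", "inferred", "unsupported", "contradicted"]
Judge = Callable[[str, str], str]  # (claim, context) -> label


def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_judge(claim: str, context: str) -> str:
    # Stand-in judge 1: supported only if the claim appears verbatim in the context.
    return "supported" if claim.lower() in context.lower() else "unsupported"


def overlap_judge(claim: str, context: str) -> str:
    # Stand-in judge 2: grades by the share of claim tokens found in the context.
    claim_tokens, context_tokens = tokens(claim), set(tokens(context))
    share = sum(t in context_tokens for t in claim_tokens) / len(claim_tokens)
    if share >= 0.8:
        return "supported"
    return "inferred" if share >= 0.4 else "unsupported"


def number_judge(claim: str, context: str) -> str:
    # Stand-in judge 3: a number that is absent from the context counts as contradicted.
    context_tokens = set(tokens(context))
    if any(t.isdigit() and t not in context_tokens for t in tokens(claim)):
        return "contradicted"
    return exact_judge(claim, context)


def first_pass(claims: List[str], context: str, judges: List[Judge]) -> Dict[str, List[str]]:
    # "Map": every judge labels every claim independently.
    return {claim: [judge(claim, context) for judge in judges] for claim in claims}


def second_pass(votes: Dict[str, List[str]]) -> Dict[str, str]:
    # "Reduce": majority vote per claim; ties go to the more severe label.
    return {
        claim: max(LABELS, key=lambda label: (labels.count(label), LABELS.index(label)))
        for claim, labels in votes.items()
    }


context = "The refund window is 30 days. Refunds are issued to the original payment method."
claims = ["The refund window is 30 days.", "Refunds take 5 days to arrive."]
votes = first_pass(claims, context, [exact_judge, overlap_judge, number_judge])
verdicts = second_pass(votes)
```

With these toy judges, the first claim is labeled supported by all three, while the second draws three different labels and is resolved to contradicted by the tie-break rule. A production system would presumably use model-based judges and a more careful resolution step.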

MAIHEM (YC W24)

1,200 followers
1 month ago
Nothing excites us more than happy customers! That’s why we’re putting in the hard work every single day: building something people want, building the future of testing AI applications. Meet Risotto (YC W24): they are building at the bleeding edge of technology, creating AI-powered IT co-pilots to streamline IT support. Companies like Medium and Retool already use Risotto. We’re incredibly proud to help Risotto build AI products that meet the highest quality and security standards. If you’re building AI-powered applications, from chatbots to co-pilots to AI agents, check out Maihem to future-proof your AI development and testing:
- test your AI products systematically, continuously, and automatically
- ship AI products and features faster
- build AI products that meet the highest quality standards

Book a call with us to learn how we can help your company: https://lnkd.in/e-YBay2p
Chris Paul

Co-Founder and CTO @ Risotto (YC W24) | AI-powered IT support
1 month ago

Let agents test your agents! Imagine my surprise when a test agent found an edge case in Risotto (YC W24) I hadn't considered: someone having trouble accessing an app they should have access to. Risotto applied the IT adage: turn it off and back on again. This week, I turned to MAIHEM (YC W24) to help us automate our smoke tests to prevent regressions, because we're shipping fast and growing more complex by the day. It was easy to set up and the MAIHEM team has been super responsive. This feels like the future of testing, especially if you're building a product with a natural language interface. How much of your QA process is automated? How’s it working?
MAIHEM (YC W24)

1,200 followers
1 month ago
Check out how Chris Paul, the Co-Founder of Risotto (YC W24), uses Maihem to test their very novel and complete IT support AI agent! Risotto is one of our favorite AI use cases to test: mission-critical AI agents.
MAIHEM (YC W24)

1,200 followers
1 month ago
This is a great short article to learn about the top 10 vulnerabilities of LLM applications, and how to test and address them.

MAIHEM (YC W24)

1,200 followers
1 month ago

AI applications are transforming industries, but they also bring new security challenges. In our latest article, we dive into how to test for and secure against the OWASP Top 10 vulnerabilities for LLMs using Maihem’s automated AI testing platform, so you can deploy your AI applications with confidence! Read the full article here: https://lnkd.in/eNKPWkZD
#AI #Security #LLM #Cybersecurity #MachineLearning #OWASP #LLMSecurity #AITesting #AISecurity

How to Test the OWASP Top 10 Critical Vulnerabilities for LLMs

maihem.ai

MAIHEM (YC W24)

1,200 followers
1 month ago
Overwhelmed by how to properly test your AI applications? Do you have a feeling that 'eyeballing it' might not be industry best practice? Your feeling is right. At Maihem, we provide you with a platform to automate quality and security testing, so you can build AI products that work - responsibly and at scale. Happy Friday, everyone.
MAIHEM (YC W24)

1,200 followers
1 month ago
Our CEO Max Ahrens spoke at the AI agent panel during TECH WEEK by a16z hosted by Untapped Ventures and Georgian! Check out the highlights in the post below!
George Bandarian

Driving AI Innovation as General Partner, Untapped Ventures | AI Keynote Speaker | Proud Husband & Father of 3 Boys
1 month ago

What an incredible event at SF TECH WEEK by a16z, with over 170 attendees! From thought-provoking discussions to building meaningful connections, this was truly an event to remember! We were honored to co-host with Georgian and bring together some of the brightest minds in VC and AI for an unforgettable gathering focused on Agentic and Vertical AI. A massive thank you to our amazing lineup of speakers: Jon Chu (Khosla Ventures), Nahim Nasser (Georgian), Pranav R. (Conviction), Alejandra Vergara (Bee Partners), Arash Afrakhteh (Pear VC), Andrew Brackin (Gradient), Saurabh Sharma (You.com), Ashar Rizqi (Bounti.ai), Jacky W. (Nullify), Aria Attar (TensorStax), Jonas Diezun (Beam AI) and Max Ahrens (MAIHEM (YC W24)). Check out some of the highlights below! If you're in LA next week, we're hosting another amazing panel at LA Tech Week and would love to have you there! Register here: https://lnkd.in/grCuhnZt
#SFTechWeek #AgenticAI #VerticalAI #AIInnovation #VC #AIVC #techweek
MAIHEM (YC W24) reposted this

MAIHEM (YC W24)

1,200 followers
1 month ago (edited)
We're hosting an exclusive #SFTechWeek lunch event for founders and senior technical leaders from VC-backed startups, scale-ups, and established companies that build LLM-based applications. Oct 9, 1pm PT, San Francisco. We'll be keeping the audience size small (max. 12 attendees, 1 attendee per company) to create an intimate gathering, focusing on candid conversations and valuable knowledge-sharing among senior decision makers around quality testing for AI applications. Chatham House rules. Attendees will be carefully selected to ensure a diverse mix of leaders actively shaping the future of AI. Apply here to attend: https://lnkd.in/erK6zyR6 Few spots remaining. #AI #AIquality #AIsafety #AIsecurity #SFTechWeek #YC

RSVP to AI pioneers lunch – how (not) to test AI products | Partiful

partiful.com

3 comments

Funding

MAIHEM (YC W24): 1 round in total

Last round

Pre-seed, May 3, 2024

US$500,000.00

Investors

Y Combinator

Similar pages

Infinity AI (YC W24)

Openmart (YC W24)

Sonia (YC W24)

Upsolve AI (YC W24)

Quivr (YC W24)

Miden (YC W24)

Tusk

GetCrux (YC W24)

Piramidal (YC W24)

CodeAnt AI (YC W24)
