Haize Labs

Haize Labs

Technology, Information and Internet

New York, NY · 665 followers

it's a bad day to be a language model

About us

Haize Labs is the trust, safety, and reliability layer underpinning AI models in every industry and use case. By haizing (i.e. stress-testing and red-teaming) to discover and eliminate all failure modes, we enable the risk-free adoption of AI.

Website
https://haizelabs.com/
Industry
Technology, Information and Internet
Company size
2-10 employees
Headquarters
New York, NY
Type
Privately held

Locations

Haize Labs employees

Updates

  • Haize Labs reposted this

    View Sahar Mor's profile

    I help researchers and builders make sense of AI | ex-Stripe | aitidbits.ai | Angel Investor

    Two new jailbreaking techniques highlight how fragile state-of-the-art LLMs like GPT-4 are.

    The first, from Haize Labs, introduces a new attack method called Bijection Learning. The irony? The more advanced the underlying model, the more successful the attack. Bijection Learning uses custom-encoded languages to trick models into unsafe responses. Unlike previous jailbreak methods, it dynamically adjusts complexity to exploit small and large models alike without manual intervention. In their tests, even Claude 3.5 Sonnet, a model heavily fine-tuned for safety, was compromised with a staggering 86.3% attack success rate on a challenging dataset (HarmBench). It works by generating a random mapping between characters (a "bijection language") and training the model to respond in this language. By adjusting the complexity of this mapping, such as changing how many characters map to themselves or using unfamiliar tokens, researchers can tune the attack to bypass safety measures, making it effective even against advanced models. Full post: https://lnkd.in/gtRysbTt

    The second method, by researchers at EPFL, targets refusal training. The researchers discovered that simply rephrasing harmful requests in the past tense can often bypass safety mechanisms, resulting in an alarmingly high jailbreak success rate. For instance, rephrasing a harmful query in the past tense boosts the success rate to 88% on leading models, including GPT, Claude, and Llama 3. This mainly happens because supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) don't always generalize well to subtle linguistic changes like tense modification. Neither technique consistently equips models to handle adversarial or unexpected reformulations, such as rephrasing harmful queries into the past tense.

    These studies highlight an alarming trend: as AI models become more capable, they also become more vulnerable to sophisticated jailbreaks.

    Attack #1: Bijection Learning https://lnkd.in/gtRysbTt
    Attack #2: Refusal training generalization to past tense https://lnkd.in/ggxnNGQ2

    Join thousands of world-class researchers and engineers from Google, Stanford, OpenAI, and Meta staying ahead on AI: https://aitidbits.ai
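    The encoding step behind such an attack is easy to sketch. Below is a minimal illustration, not Haize Labs' actual implementation: it builds a random "bijection language" over lowercase letters, with a `fixed_points` knob standing in for the complexity adjustment described above (more letters mapping to themselves means an easier encoding for a model to follow).

    ```python
    import random
    import string

    def make_bijection(fixed_points: int = 0, seed: int = 0) -> dict[str, str]:
        """Build a one-to-one character mapping (a 'bijection language').

        The first `fixed_points` letters map to themselves; the rest are
        randomly permuted. Higher `fixed_points` lowers the encoding's
        complexity."""
        rng = random.Random(seed)
        letters = list(string.ascii_lowercase)
        fixed, rest = letters[:fixed_points], letters[fixed_points:]
        shuffled = rest[:]
        rng.shuffle(shuffled)
        mapping = dict(zip(fixed, fixed))
        mapping.update(zip(rest, shuffled))
        return mapping

    def encode(text: str, mapping: dict[str, str]) -> str:
        """Rewrite text in the bijection language; unmapped chars pass through."""
        return "".join(mapping.get(c, c) for c in text.lower())

    def decode(text: str, mapping: dict[str, str]) -> str:
        """Invert the mapping to recover the original plaintext."""
        inverse = {v: k for k, v in mapping.items()}
        return "".join(inverse.get(c, c) for c in text)
    ```

    In the attack setting, the model is first taught the mapping in the prompt, then queried in the encoded language; the round-trip property (`decode(encode(s)) == s`) is what lets the attacker read the response back out.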

  • View Haize Labs' company page

    665 followers

    couldn't be more ecstatic to have Constantin Weisser, PhD on the team!

    View Constantin Weisser, PhD's profile

    AI safety testing @ Haize | MIT PhD Physics, Stats | ex McKinsey AI

    Hi all,

    After over three exciting years at McKinsey, it's time for me to move on to a new chapter. I'm incredibly grateful for the opportunity to drive business impact by building AI solutions across six different industries. During this time, thanks to the support of many generous colleagues, I've grown from an academic into a professional who can solve business problems with technical solutions, productionize them, and manage teams. A huge thank you to everyone who has been part of this journey!

    I am excited to come to New York City to join Haize Labs as a Member of Technical Staff and employee #1. Haize aims to revolutionize safety testing of large language models through automated red-teaming, precise evaluations, and guardrails for safer usage (https://shorturl.at/Mfr5j). I can't wait to see where this story takes us. If you are passionate about working towards safer AI systems, please reach out!

  • Haize Labs reposted this

    View Leonard Tang's profile

    co-founder & ceo @ haize labs

    Excited to share a deep dive into the red-teaming research we've been doing for OpenAI at Haize Labs! In the months before the release of the new o1 series, we rigorously haized (stress-tested) the safety and adversarial robustness of their models. Many thanks to Nathan Labenz for having us on the Cognitive Revolution podcast to chat about this work! Listen for the details of our research and engineering intuitions for automated red-teaming of frontier AI systems. Shoutout especially to Brian Huang and Aidan Ewart for their amazing automated red-teaming efforts!

    Red Teaming o1 Part 1/2 – Automated Jailbreaking w/ Haize Labs' Leonard Tang, Aidan Ewart & Brian Huang


    cognitiverevolution.ai

  • Haize Labs reposted this

    View Akhil Paul's profile

    Startup Helper | Advisor | Caparo Group

    Business Insider just dropped a list of 85 of the most promising startups to watch for 2024.

    Some of the largest outcomes we've seen in venture capital (Amazon, Airbnb, Stripe, Uber, etc.) have come on the back of hard times. Only the strongest and most resilient teams survive. Business Insider asked top venture capitalists at firms including Accel, GV, Founders Fund, Greylock, Khosla Ventures, and IVP to name the startups they're most excited by. Flagging 4 companies from the list I work with that are worth having on the radar:

    1) Gamma - An AI-powered content creation tool for enterprise customers. The tool enables users to create presentations, websites, and documents quickly.
    Founders: Grant Lee, Jon Noronha, James Fox
    Funding: $21.5Mn
    Investors: Accel, LocalGlobe, Script Capital
    Why it's on the list: Gamma has already amassed 20Mn+ users with 60Mn+ gammas created, and is profitable. It is reinventing the creation of websites, presentations, and documents.

    2) Sema4.ai - Building enterprise AI agents that can reason, collaborate, and act.
    Founders: Antti Karjalainen, Paul Codding, Sudhir Menon, Ram Venkatesh
    Funding: $54Mn
    Investors: Benchmark, Mayfield Fund, Canvas Ventures
    Why it's on the list: The company is developing AI agents that can move beyond simple repetitive tasks and actually solve real-world problems, taking into account the unique context of the organization and working seamlessly with existing teams.

    3) Mutiny - An account-based AI platform that helps companies unify sales and marketing to generate pipeline and revenue from their target accounts at scale.
    Founders: Jaleh Rezaei, Nikhil Mathew
    Funding: $72Mn
    Investors: Sequoia, Tiger, Insight, Cowboy Ventures
    Why it's on the list: Mutiny leverages AI to help B2B companies generate pipeline and revenue from their target accounts through AI-powered personalized experiences, 1:1 microsites, and account intelligence, more important than ever in the current software consolidation cycle and budget environment.

    4) Haize Labs - Automatic stress-testing of large language models.
    Founders: Leonard Tang, Steve Li
    Why it's on the list: Haize promises to "robustify" any large language model through automated red-teaming that continuously stress-tests and identifies vulnerabilities. As models evolve, the question of how to make sure they're secure becomes increasingly difficult to answer.

    The biggest themes on the list this year? Data infrastructure, security, and personalized agents.

    It's helpful to keep on top of lists like these to track and source new opportunities. Link to the FULL list in comments. Any companies not on the list that should be? Tag them in comments.

    PS - If you enjoyed this and found it valuable, follow me Akhil Paul for more! #startups #venturecapital #investing #technology #angelinvesting

  • Haize Labs reposted this

    View Akash Bajwa's profile

    Principal at Earlybird Venture Capital

    A consistent theme in discussions I've had with AI application founders is a request for red-teaming solutions. With regulatory frameworks like the EU AI Act adding further pressure, more attention is going towards both human and model-based ways of scaling red teaming of LLMs, where the immense combinatorial space is difficult to cover. Protect AI acquired SydeLabs to expand their AI security suite last week, but there are others, like Haize Labs, Promptfoo, and more, devising innovative new ways of scaling automated red teaming without the associated trade-off in model performance. https://lnkd.in/en5_9jMP

    Red Teaming As A Service


    akashbajwa.substack.com

  • Haize Labs reposted this

    View Marktechpost Media Inc.'s company page

    5,612 followers

    Haize Labs Introduced Sphynx: A Cutting-Edge Solution for AI Hallucination Detection with Dynamic Testing and Fuzzing Techniques

    Haize Labs has recently introduced Sphynx, an innovative tool designed to address the persistent challenge of hallucination in AI models. In this context, hallucinations refer to instances where language models generate incorrect or nonsensical outputs, which can be problematic in various applications. Sphynx aims to enhance the robustness and reliability of hallucination detection models through dynamic testing and fuzzing techniques.

    Hallucinations represent a significant issue in large language models (LLMs). Despite their impressive capabilities, these models can sometimes produce inaccurate or irrelevant outputs. This undermines their utility and poses risks in critical applications where accuracy is paramount. Traditional approaches to mitigating this problem have involved training separate LLMs to detect hallucinations. However, these detection models are not immune to the very issue they are meant to resolve. This paradox raises crucial questions about their reliability and the need for more robust testing methods.

    Haize Labs proposes a novel "haizing" approach: fuzz-testing hallucination detection models to uncover their vulnerabilities. The idea is to intentionally induce conditions that might lead these models to fail, thereby identifying their weak points. This method ensures that detection models are not only theoretically sound but also practically robust against various adversarial scenarios.

    Read our full take on Sphynx: https://lnkd.in/gQjNJ8Ww
    Check out their GitHub page: https://lnkd.in/gZ_fwhEe
    Haize Labs Leonard Tang

    Haize Labs Introduced Sphynx: A Cutting-Edge Solution for AI Hallucination Detection with Dynamic Testing and Fuzzing Techniques


    https://www.marktechpost.com

  • View Haize Labs' company page

    665 followers

    introducing Sphynx - the leading hallucination haizing algorithm

    - breaks SOTA hallucination detection models (HDMs)
    - open source, open data
    - surfaces critical hallucinations in high-stakes domains
    - enables adversarial training for more robust hallucination detection

    the failure cases that Sphynx produces can be used as adversarial training data to make HDMs more robust. rather than wait for HDMs to fail in the wild, Sphynx forces these failures to happen in development. only by haizing can you achieve truly reliable HDMs and AI systems.

    check out Sphynx here: https://lnkd.in/gZ_fwhEe
    twitter thread: https://lnkd.in/gXi5kRkN

    GitHub - haizelabs/sphynx: Sphynx Hallucination Induction


    github.com
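The fuzzing loop behind this idea can be sketched in a few lines. This is an illustrative toy, not Sphynx itself: `detector` stands in for a real hallucination detection model, and the mutations are simple stand-ins for the adversarial perturbations such a fuzzer would search over.

```python
import random

def perturb(answer: str, rng: random.Random) -> str:
    """Apply one small, meaning-preserving edit (illustrative mutations only)."""
    mutations = [
        lambda s: s.replace(" ", "  ", 1),   # whitespace noise
        lambda s: s.lower(),                 # casing change
        lambda s: s + " Indeed.",            # benign suffix
    ]
    return rng.choice(mutations)(answer)

def fuzz_detector(detector, question: str, answer: str,
                  trials: int = 50, seed: int = 0) -> list[str]:
    """Hunt for perturbations that flip the detector's verdict.

    Returns mutated answers whose verdict disagrees with the verdict on
    the original answer: candidate adversarial training examples."""
    rng = random.Random(seed)
    baseline = detector(question, answer)
    failures = []
    for _ in range(trials):
        mutated = perturb(answer, rng)
        if detector(question, mutated) != baseline:
            failures.append(mutated)
    return failures
```

Any flip found this way is, by construction, a detector failure on a semantically unchanged answer, which is exactly the kind of example the post describes feeding back as adversarial training data.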

  • View Haize Labs' company page

    665 followers

    thanks for believing in us, Ben!

    View Ben Lang's profile

    Angel investing, early at Notion

    *30 companies I'm an investor in are now hiring*

    I'd love to get you in front of these companies if you're exploring your next move. Reach out, and I will do my best to help. Full list here:

    1) PromptLayer - prompt engineering platform (NYC)
    2) Hero - AI selling (remote)
    3) Natoma - non-human identity protection (Boston / Bangalore / Bay Area)
    4) Pocus - signal-based selling (Bay Area / NYC / remote)
    5) Jitter - motion design (Paris)
    6) Masterschool - network of career training schools (Tel Aviv / Germany)
    7) Gelt - tax optimization and management (Tel Aviv / remote)
    8) Luma - events platform (remote)
    9) Gynger - tech payments on your terms (NYC / Tel Aviv)
    10) SwagUp - swag platform (NYC)
    11) Equals - next generation analysis and reporting (remote US)
    12) Landa - fractional real estate investing (NYC / Tel Aviv)
    13) Gestalt - human data interface (NYC)
    14) ServiceBell - video live chat (remote)
    15) Kick - self-driving bookkeeping (remote)
    16) Yuzu Health - custom-built employer health plans (NYC)
    17) Passionfroot - creator brand deals platform (Berlin / remote)
    18) Deeptune - AI dubbing (NYC)
    19) Alta Revenue Platform - revenue intel / business observability (Tel Aviv)
    20) Trace Machina - automation infrastructure (Bay Area)
    21) Modyfi - design platform (remote)
    22) SellScale - outbound automation (Bay Area)
    23) OwnID - password-free tech (Tel Aviv)
    24) Gamma - AI-powered decks / sites (remote)
    25) Balance - b2b transaction lifecycle (NYC / Tel Aviv)
    26) Givebutter - fundraising platform (remote / Austin)
    27) Hotplate - storefront for food drops (Bay Area)
    28) Deel - global payroll (remote)
    29) Daytona - dev environment manager (remote)
    30) Haize Labs - AI safety (NYC)

    Follow me Ben Lang for more lists like these
    ___
    #startups #careers #jobs

  • View Haize Labs' company page

    665 followers

    Today, we emerged from stealth to tackle the longstanding problem of AI reliability. In an era of grossly excessive AI hype and demoware, it is high time that someone recalibrated and revisited the difficult, unsexy, underlying problem that everybody is avoiding: the AI reliability and safety problem.

    We have never really believed in LLMs, or neural nets, or deep learning, favoring PPLs, graphical models, and interpretable architectures that could admit true reasoning. In the murky depths of LLM land, we lose structure and control. And this is how you end up with scenarios like Google AI search telling you to eat rocks, the Chai chatbot inducing a man's suicide, or the Air Canada lawsuit, to name a few.

    This is why we need to haize models. That is, we need to rigorously test them to discover all of their vulnerabilities, failure modes, and "gotchas" before they get deployed in production. Today we showcased one particular application of haizing to safety red-teaming (https://lnkd.in/eGWnevbb): eliciting harmful behaviors from safety-trained models. This is just a start (albeit a very illustrative one). Preventing misuse of AI systems is an absolute must given the pace at which models are accelerating.

    But we're not stopping here. In the very near future, we will be the underlying robustness and safety layer for any model, in any setting, for any use case. We hope you are excited to join us for the long and exciting journey ahead.

    Haize Labs (@haizelabs) on X


    x.com

Similar pages