When 1 is Bigger than 4 for AI
I asked ChatGPT about the numbers 1 & 4. Which one is bigger?
Sometimes, 1 was bigger. Other times, 4 was bigger.
Sharon Zhou ran this experiment at scale, showing that the order of yes & no matters in the response.
This is called a non-deterministic or stochastic answer. Similar inputs do not consistently produce identical outputs. The answers have inconsistent logic.
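To make the idea concrete, here is a minimal sketch of why the same prompt can yield different answers. It assumes the model picks its answer by sampling from a probability distribution, which is a common setup but not something confirmed for any specific product. The scores in `toy_scores` and the helper `sample_answer` are invented for illustration and are not ChatGPT's real values or API.

```python
import math
import random
from collections import Counter


def sample_answer(scores, temperature, rng):
    """Pick one answer from raw scores using temperature-scaled softmax."""
    if temperature <= 0:
        # Greedy choice: always return the highest-scoring answer.
        return max(scores, key=scores.get)
    # Higher temperature flattens the distribution, lower sharpens it.
    scaled = {answer: score / temperature for answer, score in scores.items()}
    top = max(scaled.values())
    weights = {answer: math.exp(value - top) for answer, value in scaled.items()}
    answers = list(weights)
    return rng.choices(answers, weights=[weights[a] for a in answers], k=1)[0]


# Hypothetical scores, made up for this example.
toy_scores = {"4 is bigger": 1.0, "1 is bigger": 0.0}

rng = random.Random(0)  # fixed seed so the demo is repeatable
sampled = Counter(sample_answer(toy_scores, 1.0, rng) for _ in range(1000))
greedy = Counter(sample_answer(toy_scores, 0.0, rng) for _ in range(1000))

print("temperature 1.0:", dict(sampled))
print("temperature 0.0:", dict(greedy))
```

With sampling turned on, both answers show up across repeated runs. With the greedy setting, the toy model always returns its top-scoring answer, which is consistent but not necessarily correct.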
We live with stochastic systems daily: weather reports, ETAs on Google Maps, stock portfolio construction. We are stochastic too - humans can be moody, err in our calculations, or change our minds with new information.
In these conversations, the robot is sometimes wrong, but never in doubt. When a system produces an answer, we should verify the answer is correct. It’s not just logical errors that occur: hallucinations, when the system invents answers that don’t exist, plagued about half of Bing chat results in a Stanford study.
We haven’t yet calibrated how much doubt to bring to these answers. Like working with a new colleague, we need to understand their strengths & weaknesses.
For consumers, the universe of acceptable outcomes can be quite broad. A rabbit on top of a fire truck has many acceptable answers.
But in the B2B world, consistency matters. Businesses using GenAI will demand consistent answers to prompts like these: what is the company’s revenue by region? Or how do I reset my password? Or how much would I pay if I used 1,000 units of a product?
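One rough way to put a number on consistency is to ask the same prompt many times and measure how often the most common answer appears. The sketch below assumes a hypothetical `ask` function standing in for whatever model call a product uses. Here it is replaced by a stub, `fake_model`, that cycles through canned replies, so the figures are illustrative and not measurements of any real system.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple


def consistency_rate(ask: Callable[[str], str], prompt: str, runs: int = 20) -> Tuple[str, float]:
    """Ask the same prompt `runs` times; return the most common answer and its share."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Normalize whitespace and case so trivially different replies count as equal.
    answers = [ask(prompt).strip().lower() for _ in range(runs)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / runs


# Stub model (hypothetical): answers "4" three times out of every four calls.
_canned_replies = cycle(["4", "4", "4", "1"])


def fake_model(prompt: str) -> str:
    return next(_canned_replies)


answer, rate = consistency_rate(fake_model, "Which is bigger, 1 or 4?", runs=20)
print(f"most common answer: {answer}, share of runs: {rate:.0%}")
```

A business could set its own threshold on this share before trusting a prompt in production, though agreement alone does not prove the answer is correct.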
GenAI will need to write, create, & calculate with a significantly lower error rate than humans.
I’m working with ProductBoard to understand how different B2B startups are planning to leverage AI with a survey. If you’re integrating GenAI into your product & interested to hear others’ plans, please fill it out, & we’ll send you the anonymized raw data. Look for the results to be published in a few weeks.
Analyst
1y ago: Let me tell you: 4 doesn't exist at all, but 4 plus 1 does.
Passionate business coach for start ups and scale ups based in Belgium. Focus 360° - growth strategy, marketing, finance, funding, HR, technology, "Feet on the ground, Head in the sky"
1y ago: Thanks for sharing. Quite confronting for many tech companies jumping on AI. Your survey outcome will be useful.
Analytics and AI Exec | Innovative mindset, technical depth, and business leadership
1y ago: Tomasz Tunguz The prompting challenges are solvable by calling plugins like the Python code interpreter. The bigger problem for Enterprise, I think, is that the underlying data will need to be highly abstracted (cleansed, merged, outliers removed, field headings clearly labeled, etc.) in order for it to be usable by a GPT interface. Data analysts/scientists refer to this data prep as the "80%" of the work, and it remains (for now) a highly manual task.
Founder, advisor, investor | Working on LLM-powered products
1y ago: Great point. I think developer tools for LLMs, especially around robustness and predictability, are so critical and yet very underdeveloped today. We need these to actually integrate LLMs directly into our workflows.