ChatGPT 4.0, first experiences

First experiences with the new LLM from OpenAI

I looked into the following aspects of the new ChatGPT 4, comparing it to its older sibling, ChatGPT 3.5:

  • Reasoning: Can it explain why a joke is funny?
  • Creativity: Can it tell me a joke? A genuinely new joke?
  • Analytics: Can it solve a riddle?
  • Self know-how: How much does it know about itself?

[Image: ChatGPT version comparison as documented by OpenAI]

Here are my findings:

  • Reasoning: ChatGPT 4 got the joke; ChatGPT 3.5 did not.
  • Creativity: ChatGPT 4 does a better job.
  • Analytics: ChatGPT 4 is a better programmer than ChatGPT 3.5.
  • Self know-how: Because its training data is outdated, it does not know who it is.

The details follow:

Reasoning

I read some articles about the model's new capabilities. They mentioned a few interesting facts, the first being that the model can explain jokes. So I ran an experiment and asked both 3.5 and 4 the following question:

Why is this funny: Why did the tomato turn red? Because it saw the salad dressing!

[Image: Explained joke, ChatGPT 3.5 vs ChatGPT 4]

I wondered what the older version would tell me about the joke, so I asked ChatGPT 3.5 to explain it. This is the fumbled answer that I got:

This joke is funny because it plays with the idea of a tomato changing color as a result of an emotional response. The punchline, “Because it saw the salad dressing!”, is a play on the phrase “Because it saw the light!”, which is often used to describe a sudden realization or change in perception. In this case, the joke is that the tomato turned red because it was excited or stimulated by the sight of salad dressing, which is a common condiment for vegetables. The humor comes from the absurdity of the idea that a vegetable could have such a strong emotional response.

Compare this to the correct answer given by ChatGPT 4:

This joke is considered funny because it relies on a play on words and anthropomorphism. The humor comes from giving human-like qualities to the tomato, such as the ability to feel embarrassment.
In the joke, the tomato “turns red” because it “sees the salad dressing.” The punchline can be interpreted in two ways:
1) Literally, tomatoes turn red as they ripen, which is a natural process.
2) Figuratively, the tomato feels embarrassed (like a person might blush or turn red) because it “saw” the salad dressing, as if it caught someone undressing.
By mixing the literal meaning with a humorous, unexpected twist, the joke creates a mental image that many people find amusing.

So, if we can judge reasoning by the ability to explain a joke, ChatGPT 4 is the winner hands down.

Creativity

Then I asked both to create a joke that does not exist yet:

[Image: New joke by ChatGPT 3.5 (left) and ChatGPT 4 (right)]

It looks as if the humor skills of both systems still need improvement.

I then asked it to create a digital ad announcing the launch of ChatGPT-4, describing the graphical elements it would use, including captions and text:

Title: “Introducing ChatGPT-4: A Revolution in AI Conversations”
Caption: “Experience the next-generation AI with seamless communication, enhanced creativity, and unparalleled intelligence.”
Visual Elements:
Background: A blend of radiant colors to symbolize the fusion of human-like thinking and advanced technology. The color palette consists of electric blue and purple, transitioning from dark to lighter shades.
ChatGPT-4 Logo: The logo is prominently displayed in the center, showcasing a combination of a speech bubble and a stylized “4”, which indicates the fourth version of ChatGPT. The logo uses a sleek and modern font, with a gradient matching the background colors.
AI Illustration: A holographic human-like figure, with a futuristic headset or glasses, represents the seamless integration of AI and human communication. The figure appears to be interacting with a virtual chat interface, emphasizing the primary function of ChatGPT-4.
Feature Highlights: A list of key features displayed using modern icons and concise text:
Enhanced Context Understanding
Unparalleled Creativity
Improved Multilingual Support
Robust Data Privacy & Security
Call to Action (CTA): A prominent and eye-catching button encouraging users to “Try ChatGPT-4 for Free” or “Learn More”. The button uses a contrasting color to stand out against the background, such as orange or green.
Testimonial Carousel: A rotating carousel featuring quotes from satisfied users or industry leaders to build trust and credibility. The carousel is designed with a clean and modern layout, using circular portraits of the individuals along with their quotes.
Footer: The footer includes the OpenAI logo, a tagline that says “Powered by OpenAI”, and links to the Privacy Policy and Terms of Service.
Text: “Discover a world of endless possibilities with ChatGPT-4. Our latest AI language model brings you an unparalleled conversational experience, empowering you to connect, create, and collaborate like never before. Join the AI revolution today and unlock your potential with ChatGPT-4.”

And then of course, I went to its digital cousin Dall-E and asked it to create the digital art that ChatGPT 4 had described:

[Image: Dall-E version of digital art for a ChatGPT launch poster]

I am no marketing specialist, but I have to say that I am impressed with the results. ChatGPT 3.5 was a bit less eloquent and convincing. It also appended the closing explanation that seems to be part of every answer and sounds more like a politician than an expert:

Ad Title: Introducing Chat GPT-4: The Future of AI Conversations
Caption: Say Hello to Smarter Conversations
Ad Text: The future of AI conversations is here with Chat GPT-4. Get ready for a more natural and intuitive experience with faster and more accurate responses. Say goodbye to generic responses and hello to smarter conversations.
Graphical Elements:
A futuristic illustration of a robot or AI device
A smartphone or laptop screen displaying a conversation with Chat GPT-4
Bold and modern typography for the title and caption
A color scheme that emphasizes technology and innovation, such as blue and metallic silver
This ad aims to emphasize the cutting-edge technology and advanced capabilities of Chat GPT-4, while also highlighting its potential to enhance the way we interact with AI. The use of a futuristic illustration and a smartphone or laptop screen displaying a conversation helps to bring the concept to life and showcase the practical application of the product. The bold and modern typography and technology-focused color scheme reinforce the message of innovation and progress.

Analytics

Then I had to test its riddle-solving skills. I have used this interview question for many years:

The four members of U2 (Bono, the Edge, Larry and Adam) need to get across a narrow bridge to play a concert. Since it’s dark, a flashlight is required to cross, but the band has only one flashlight, and only two people can cross the bridge at a time. (This is not to say, of course, that if one of the members of the band has crossed the bridge, he can’t come back by himself with the flashlight.) Adam takes only a minute to get across, Larry takes two minutes, the Edge takes five minutes, and slowpoke Bono takes 10 minutes. A pair can only go as fast as the slowest member. They have 17 minutes to get across. How should they do it?

To get all four members of U2 across the bridge in 17 minutes, they can follow these steps:
- Adam and Larry cross the bridge together, taking 2 minutes (2 minutes elapsed).
- Adam goes back with the flashlight, taking 1 minute (3 minutes elapsed).
- Bono and the Edge cross the bridge together, taking 10 minutes (13 minutes elapsed).
- Larry goes back with the flashlight, taking 2 minutes (15 minutes elapsed).
- Adam and Larry cross the bridge together again, taking 2 minutes (17 minutes elapsed).
This way, all four members, Bono, the Edge, Larry, and Adam, make it across the bridge in exactly 17 minutes.
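The plan above is easy to check mechanically. The following is a minimal sketch, not part of either model's answer: it replays the five steps while tracking who holds the flashlight, then searches for the fastest possible schedule. The names `simulate` and `best_time` are made up for this illustration, and the search assumes that two people always cross forward and one person always walks back.

```python
from itertools import combinations

# Crossing times in minutes, taken from the riddle text.
TIMES = {"Adam": 1, "Larry": 2, "Edge": 5, "Bono": 10}

# The five steps from the answer above, in order.
PLAN = [("Adam", "Larry"), ("Adam",), ("Bono", "Edge"), ("Larry",), ("Adam", "Larry")]


def simulate(plan):
    """Replay a plan and return the elapsed minutes; raise on an illegal move."""
    near, far = set(TIMES), set()
    flashlight_near = True
    elapsed = 0
    for group in plan:
        src, dst = (near, far) if flashlight_near else (far, near)
        # At most two people may cross, and they must be where the flashlight is.
        if not 1 <= len(group) <= 2 or not set(group) <= src:
            raise ValueError(f"illegal move: {group}")
        src.difference_update(group)
        dst.update(group)
        elapsed += max(TIMES[p] for p in group)  # a pair moves at the slower pace
        flashlight_near = not flashlight_near
    if near:
        raise ValueError(f"still on the near side: {sorted(near)}")
    return elapsed


def best_time(near):
    """Fastest crossing for the people in `near`, flashlight on the near side.

    Assumption: two people cross forward, one person returns.
    """
    if len(near) <= 2:
        return max(TIMES[p] for p in near)
    best = float("inf")
    for pair in combinations(sorted(near), 2):
        far_after = (frozenset(TIMES) - near) | set(pair)
        for back in far_after:
            cost = max(TIMES[p] for p in pair) + TIMES[back]
            best = min(best, cost + best_time((near - set(pair)) | {back}))
    return best


print(simulate(PLAN))                # 17
print(best_time(frozenset(TIMES)))   # 17
```

Both numbers agree, so under the stated assumption the plan is not only valid but also as fast as any schedule can be.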

What actually surprised me was the speed at which it came up with the answer.

What did not surprise me is that ChatGPT 3.5 was able to come up with the same solution.

So I went to Interviewcake.com and got their latest programming question:

Write a function fib() that takes an integer n and returns the nth Fibonacci number.

[Image: fib(n) code by ChatGPT 3.5 (above) and ChatGPT 4 (below)]

In this case the code of ChatGPT 4.0 wins. It has better error handling and solves the problem in a more elegant way.
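The code from the screenshots is not reproduced here, so the following is not either model's output. It is only a minimal sketch of what an iterative fib() with basic error handling might look like, assuming the common convention that fib(0) is 0 and fib(1) is 1:

```python
def fib(n: int) -> int:
    """Return the nth Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    # Reject non-integers (bool is a subclass of int, so exclude it explicitly).
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Iterate instead of recursing: linear time, constant extra memory.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The iterative form avoids the exponential blow-up of the naive recursive version, and the explicit checks are the kind of error handling that makes an answer to this interview question stand out.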

Self know-how

What do you do when you want to learn about the newly launched ChatGPT 4? Of course, you ask the model itself. Except that the model was trained on data up to September 2021, so it does not know about itself.

Yet if you ask it which features it would have, it gives a reasonable answer:

1. Model size and architecture: GPT-4 could potentially have a larger model size, consisting of more parameters and layers than GPT-3, which has 175 billion parameters. A larger model size generally leads to better performance, as the model can learn more complex patterns and relationships in the data.
2. Training data: GPT-4 might be trained on a more diverse and up-to-date dataset, which would lead to improved knowledge and understanding of recent events, trends, and technologies.
3. Fine-tuning and task-specific adaptation: GPT-4 might have improved capabilities to be fine-tuned and adapted for specific tasks or domains, making it more versatile and effective across a broader range of use cases.
4. Efficiency and computational requirements: GPT-4 could potentially be more efficient in terms of computational resources, allowing it to be run on smaller devices or reducing the environmental impact of training and deploying large-scale AI models.
5. Ethical considerations and safety measures: GPT-4 might incorporate more advanced safety measures to reduce biases, address ethical concerns, and prevent harmful outputs or malicious uses.
6. Improved context handling and reasoning: GPT-4 could potentially have a better understanding of context, allowing it to provide more relevant and coherent responses, as well as perform tasks that require reasoning and problem-solving abilities.

It reads like a text that a well-groomed Product Owner would have written. Compared with the way OpenAI is promoting the model, it looks like they did improve the reasoning and context handling, yet they are still working on the speed improvements.

Conclusions

  • Reasoning: ChatGPT 4 got the joke; ChatGPT 3.5 did not.
  • Creativity: ChatGPT 4 does a better job creating a campaign for itself.
  • Analytics: ChatGPT 4 is a better programmer than ChatGPT 3.5.
  • Self know-how: Because of outdated training data, neither of them knows who ChatGPT 4 is. Yet they try to guess which improvements the team would make.


More articles by Miguel Rodriguez
