Two experiments with ChatGPT in generating algorithms

I have been experimenting with ChatGPT (#chatgpt) for a few weeks now, and I am impressed by its capabilities.

Knowing a bit about how ChatGPT works and about the presence of many opportunities in integrating it with other online systems such as information retrieval systems, it seems almost obvious to me that ChatGPT will keep getting better. So I am excited about its future and many ways we will be able to use it directly or indirectly as part of another system.

Yet judging from my limited set of experiments, I think ChatGPT is still work in progress. One of the biggest issues I see with it is lack of consistency and assurances about the quality of its results. Here by results I mean those whose quality can be judged more objectively; I do not mean results such as poems or essays. Of course, the results quality issue is no secret; it is already acknowledged as a limitation by OpenAI on the ChatGPT opening page.

For now, if you need to use ChatGPT for "real work", it seems like a good idea to be more careful and use ChatGPT for the cases where you will be able to validate the quality of the generated results. Maybe use it as a seed for your own iterations of creating a better solution.

Interestingly this observation on the results quality is also an opportunity for ChatGPT to get better at because if the results are such that they may be validated somewhat objectively, ChatGPT may actually be able to do it on its own, say, with the help of other systems for validation, before presenting the results to the user. This not only will improve the results quality but, if ChatGPT pairs the presentation with the validation information, this will also increase users' confidence in ChatGPT.

Here I want to share my experience with two computer science focused experiments I ran with ChatGPT. Both experiments are about generating an algorithm for a problem.

  • The first one was a simple one: Getting an algorithm for generating the Fibonacci sequence.
  • The second one was a more technical one: Getting an algorithm for computing the minimum cycle time of a weighted directed graph.

Getting an algorithm for generating the Fibonacci sequence

The first experiment was about getting "the" fastest algorithm for generating the Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, etc. This sequence consists of integers where every number after the first two, 0 and 1, is the sum of the two preceding numbers.

There are many algorithms to generate this sequence. A presentation of twelve known algorithms is given in [1] with their implementation in the Python programming language in [2]. Each of these algorithms take an integer n as the input and generate the nth Fibonacci number in the sequence.

If there is no bound on how big n can be and only integer arithmetic is allowed, the fastest algorithms for computing the sequence run in time logarithmic in n. Using floating-point arithmetic and allowing potentially approximate answers, there are faster algorithms as discussed in [1].

In this first experiment, I asked ChatGPT to show me the fastest algorithm in Python for generating the Fibonacci sequence. The session is shown below. The answer was returned in a few seconds with a complete program in Python and a good and short explanation.

No alt text provided for this image
The Python program with one of the fastest algorithms for the Fibonacci sequence, as returned by ChatGPT. This program has a bug at line 7 where the variable 'result' is a 1x2 matrix while it needs to be a 2x2 matrix due to line 11 and the function 'matrix_multiply'.

Unfortunately, the returned program had a simple bug. I pointed out the bug to ChatGPT. I think what ChatGPT did in response was amazing: It acknowledged the bug, corrected it, commented the line with the correction, produced the correct program and even apologized for the bug. Running the corrected program then generated the Fibonacci sequence correctly.

No alt text provided for this image
The corrected program for the Fibonacci sequence, as returned by ChatGPT. ChatGPT even marked the with the label "Corrected version" the line that had the bug, line 7.

Now imagine what ChatGPT could have done better here: It could have run Python on its own, discovered the bug by itself, and corrected it before presenting it to me. This is an example of what I mean by "integrating with other systems" or "validating its results." It seems like a no-brainer that these kinds of auto-validation will or can be integrated into ChatGPT soon.

Getting an algorithm for computing the minimum cycle mean of a weighted directed graph

This problem is a more technical and/or specialized one. Let us start with the definitions first. If you know what a graph is, these definitions should be easy for you.

  • The weight of a cycle is the sum of the weights of its edges.
  • The length of a cycle is the number of its edges.
  • The mean of a cycle is the ratio of the weight of the cycle to its length. The mean represents the average weight per edge on the cycle.
  • The minimum cycle mean of a graph is the minimum over all cycle means in the graph.
  • The maximum cycle mean is defined analogously. The optimum cycle mean is one of the minimum or maximum.

I first asked ChatGPT the definition of the minimum cycle mean, the generated text had the correct definition though I had difficulty to figure out the vertices and edges in the example graph shown using ASCII characters.

No alt text provided for this image
The definition of the minimum cycle mean of a graph, as returned by ChatGPT.

Next I asked ChatGPT to give me an algorithm to compute the minimum cycle mean of a graph. Here I was unsuccessful to get a correct algorithm: ChatGPT answered using one of the shortest-paths algorithms: Bellman-Ford, Floyd-Warshall, and Johnson. These algorithms can compute single-source or all-pairs shortest paths but not the minimum cycle mean of the input graph.

No alt text provided for this image
A wrong algorithm for computing the minimum cycle mean of a graph, as returned by ChatGPT. The other algorithms cited in the text above are also the wrong algorithms for the problem.

Also note in the returned result ChatGPT defines the minimum cycle mean incorrectly even though it gave the correct definition when the query was asking for the definition.

I then started a conversation with ChatGPT to see if it could correct its answer. When I pointed out the error in the previous answer, ChatGPT apologized for the mistake, acknowledged that Bellman-Ford and Floyd-Warshall were indeed not the correct algorithms for the problem but incorrectly presented Johnson algorithm as the correct algorithm.

A simple way to see that the returned algorithm is incorrect without running it on your computer is to note that the algorithm immediately returns when it detects negative weight cycles in the input graph. The presence of such cycles in the input graph is a headache for shortest paths problems because shortest paths hitting such cycles will diverge to negative infinity; however, their presence is not a headache for the minimum cycle mean problem because in the presence of such cycles, the minimum cycle mean of the input graph will simply be negative.

No alt text provided for this image
Despite the query saying the previous algorithms were the wrong ones, ChatGPT presents one of the algorithms as the right algorithm for computing the minimum cycle mean of a graph.

I warned ChatGPT again that Johnson algorithm was not the correct algorithm to compute the minimum cycle mean in a graph. ChatGPT again apologized for the mistake, acknowledged the mistake, and presented an unnamed "cycle mean algorithm" that I do not recall having seen before. Once I find some time, I will actually test this algorithm in comparison to other algorithms for the problem [3]. Interestingly, in this new algorithm, there is at least a division operation that tries to compute the average weight of a path.

No alt text provided for this image
No alt text provided for this image
For a query pointing out the mistake again, ChatGPT presents what seems to be a generic algorithm for computing the minimum cycle mean of a graph. I have not tested but from a quick review of the pseudocode, the algorithm does not seem to be solving the problem correctly.

How could ChatGPT have done better for this question? As presented in [3] and many other references in the technical literature, there are many algorithms to compute the minimum cycle mean in a given graph. ChatGPT could have easily presented one of these algorithms as the correct solution.

Summary

As I mentioned in the introduction, ChatGPT is an impressive system and I strongly believe that it will keep getting better. One immediate opportunity I see is to integrate ChatGPT with other systems so that its results quality can be made more consistent and can potentially be validated. For additional confidence in ChatGPT, it may present its results together with how it validated them.

For now, I love using ChatGPT and I thank #openai for making it available for free. I am looking forward to how it will be used to enhance existing systems and create new ones. It is an exciting time to be alive indeed.

References

[1] A. Dasdan, "Twelve Simple Algorithms to Compute Fibonacci Numbers", https://arxiv.org/abs/1803.07199

[2] A. Dasdan, Code for the algorithms presented in [1], https://github.com/alidasdan/fibonacci-number-algorithms

[3] A. Dasdan, "Experimental analysis of the fastest optimum cycle ratio and mean algorithms", ACM Transactions on Design Automation of Electronic Systems (TODAES), 9(4), 2004.

[4] A. Dasdan, Code for the algorithms presented in [3], https://github.com/alidasdan/optimum-cycle-mean-algorithms

Along the lines of integrating ChatGPT with another system, please see this article on ChatGPT+Wolfram|Alpha by S. Wolfram on handling computational queries far better. There is also a related discussion on Hacker News. https://writings.stephenwolfram.com/2023/01/wolframalpha-as-the-way-to-bring-computational-knowledge-superpowers-to-chatgpt/

Thank you for such an insightful and detailed analysis. I am also very excited about everything related to GPT these days. The complexity involved in accurately providing answers for problems like the ones you mentioned is quite significant. Given that GPT can struggle with relatively simple math problems like solving quadratic equations, it is not surprising that it may also make mistakes in more complex problems that require reasoning. However, there are also cases where GPT is very useful and can be trusted right away. For example, check out this tweet: https://twitter.com/leonovco/status/1599581238687563776 Even now, with the right approach, GPT can provide a competitive edge. Instead of looking for definitive answers, GPT can help "unstuck" us by providing alternative opinions and smoothing the learning curve in domains that require aggregating information. For example, I recently used GPT to help my teenager learn Latin and it was great because all the information we needed was in one place, along with clarifications when needed.

Matt Oden

Founder / CEO | Execify

1 年

Very interesting Ali Dasdan - learned a lot from this. Eye opening.

要查看或添加评论,请登录

社区洞察

其他会员也浏览了