String concatenation complexity in python p3 - python vs AI

Ok, so after first two parts (1, 2), that ended up catching many people (including myself), I decided to ask a few questions to AI and test modern stuff in practice.

So, what I was going for? I was wandering, how hard it is to combine growing strings into ever growing strings.

To do the trick lazy me asked a few available AIs to:

"write a python function that takes 2 arguments - n and m, and returns a string, containing n repetitions of first m letters of english alphabet"

Obviously, with this level of unclarity in demands, no one got it right, however after some clarifications Google's copilot-like AI in vscode gave me something working:

def generate2d(n: int, m: int) -> str:
    res = ""
    for i in range(n):
        for j in range(m):
            res += chr(ord('a') + j % 26)
    return res        

which is obviously not optimal, but does what I wanted. Manual optimisation of this thing turns it from being O(n*m) into O(n+m) at least for not-super-huge n and m, which is surprising.

Interestingly enough, another variant of Google's AI comes up with right ideas to make this thing faster, but it fails miserably at returning expected result - e.g.(Gemini 1.5 pro):

def repeat_alphabet(n, m):
  alphabet = "abcdefghijklmnopqrstuvwxyz"
  total_letters = n * m
  return (alphabet * (total_letters // 26)) + alphabet[:total_letters % 26]         

this is just plain wrong. However, pre-indexed string is a nice idea.

ChatGPT did it right:

def repeat_alphabet(n, m):
    # Ensure n and m are valid
    if not (isinstance(n, int) and isinstance(m, int)):
        return "Both n and m must be integers."
    if n < 0 or m < 0:
        return "Both n and m must be non-negative integers."
    
    # Create the full alphabet string
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    
    # Calculate the full substring needed by repeating and slicing the alphabet
    full_repeats = m // 26
    remainder = m % 26
    
    # Create the substring with full repeats and the remainder
    substring = alphabet * full_repeats + alphabet[:remainder]
    
    # Repeat the substring n times
    result = substring * n
    
    return result        

This is quite good (however - it did a bit of defensive programming, that I didn't ask it to do).

Moreover - it is quite fast - my testing script gave me this:

But what is even more amazing - it is surprisingly non-linear - looks like I need to tease it independently from others to be able to measure practical complexity for it - microseconds are practical lower interval of time measurements, that is easily doable.

Several crashed browsers later

Ok, what we got:


This sounds like O(m*n) + O(1), with high number behind O(1) and low number inside O(m*n)

Let's see what time/(m*n) looks like:

Yes! We definitely hit the plato of O(m*n) with high numbers, however looks like for lower/reasonable sizes python is capable of doing some aggressive optimisations here.

So, to conclude the series - key takeaways:

  • O-notation is easy for simple things, and may involve all kinds of magic for complex things.
  • It is possible and meaningful to observe and measure time complexity
  • AI can code. Confidently. But keep your eyes wide open with what it gives you - it tries to get away with incorrect code from time to time.

All code, some data and some more stuff - https://github.com/mickvav/string_mystery

You may also find interesting what #coderabbit thought about my code in https://github.com/mickvav/string_mystery/pull/1

If you found this series interesting/insightful - let me know, like/share/repost is always welcome, you know!

要查看或添加评论,请登录

社区洞察

其他会员也浏览了