String concatenation complexity in python p3 - python vs AI
Ok, so after first two parts (1, 2), that ended up catching many people (including myself), I decided to ask a few questions to AI and test modern stuff in practice.
So, what I was going for? I was wandering, how hard it is to combine growing strings into ever growing strings.
To do the trick lazy me asked a few available AIs to:
"write a python function that takes 2 arguments - n and m, and returns a string, containing n repetitions of first m letters of english alphabet"
Obviously, with this level of unclarity in demands, no one got it right, however after some clarifications Google's copilot-like AI in vscode gave me something working:
def generate2d(n: int, m: int) -> str:
res = ""
for i in range(n):
for j in range(m):
res += chr(ord('a') + j % 26)
return res
which is obviously not optimal, but does what I wanted. Manual optimisation of this thing turns it from being O(n*m) into O(n+m) at least for not-super-huge n and m, which is surprising.
Interestingly enough, another variant of Google's AI comes up with right ideas to make this thing faster, but it fails miserably at returning expected result - e.g.(Gemini 1.5 pro):
def repeat_alphabet(n, m):
alphabet = "abcdefghijklmnopqrstuvwxyz"
total_letters = n * m
return (alphabet * (total_letters // 26)) + alphabet[:total_letters % 26]
this is just plain wrong. However, pre-indexed string is a nice idea.
ChatGPT did it right:
def repeat_alphabet(n, m):
# Ensure n and m are valid
if not (isinstance(n, int) and isinstance(m, int)):
return "Both n and m must be integers."
if n < 0 or m < 0:
return "Both n and m must be non-negative integers."
# Create the full alphabet string
alphabet = 'abcdefghijklmnopqrstuvwxyz'
# Calculate the full substring needed by repeating and slicing the alphabet
full_repeats = m // 26
remainder = m % 26
# Create the substring with full repeats and the remainder
substring = alphabet * full_repeats + alphabet[:remainder]
# Repeat the substring n times
result = substring * n
return result
This is quite good (however - it did a bit of defensive programming, that I didn't ask it to do).
Moreover - it is quite fast - my testing script gave me this:
领英推荐
But what is even more amazing - it is surprisingly non-linear - looks like I need to tease it independently from others to be able to measure practical complexity for it - microseconds are practical lower interval of time measurements, that is easily doable.
Several crashed browsers later
Ok, what we got:
This sounds like O(m*n) + O(1), with high number behind O(1) and low number inside O(m*n)
Let's see what time/(m*n) looks like:
Yes! We definitely hit the plato of O(m*n) with high numbers, however looks like for lower/reasonable sizes python is capable of doing some aggressive optimisations here.
So, to conclude the series - key takeaways:
All code, some data and some more stuff - https://github.com/mickvav/string_mystery
You may also find interesting what #coderabbit thought about my code in https://github.com/mickvav/string_mystery/pull/1
If you found this series interesting/insightful - let me know, like/share/repost is always welcome, you know!