Ask explicitly for efficient code from ChatGPT
The code ChatGPT generates is often not the most efficient. Here I want to remove the second word in a sentence. The initial code works, but it splits the string into words and then recombines them, which is slow.
Me: Write R code to remove the second word in a string.
string <- "Hello World and Universe"
words <- strsplit(string, "\\s")[[1]]
words <- words[-2]
new_string <- paste(words, collapse = " ")
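One caveat with this approach, separate from speed: splitting on "\\s" treats every single whitespace character as a delimiter, so two spaces in a row produce an empty string that gets counted as a word. The sketch below wraps the same idea in a hypothetical helper, remove_second_word_split (my own name, not part of ChatGPT's answer), and splits on runs of whitespace instead. It assumes the input has no leading whitespace.

```r
# Hypothetical helper: split on runs of whitespace, drop word 2, rejoin.
remove_second_word_split <- function(string) {
  words <- strsplit(string, "\\s+")[[1]]
  if (length(words) < 2) return(string)  # nothing to remove
  paste(words[-2], collapse = " ")
}

print(remove_second_word_split("Hello World and Universe"))    # "Hello and Universe"
print(remove_second_word_split("Hello   World and Universe"))  # "Hello and Universe"
```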
I then asked it to avoid splitting. I got slightly faster but ugly-looking code.
Me: Can you do that using regular expression not strsplit?
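# substring() checks each character position in turn; which() returns the positions of the spaces
first_space <- which(substring(string, 1:nchar(string), 1:nchar(string)) == " ")[1]
second_space <- which(substring(string, 1:nchar(string), 1:nchar(string)) == " ")[2]
# Keep everything up to the first space, then everything after the second space
new_string <- paste(substr(string, 1, first_space), substr(string, second_space + 1, nchar(string)), sep = "")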
I tried again, asking explicitly for efficient code, and ChatGPT pointed me to a function in the stringr package that uses a regular expression. It is just one line and should be much faster. But when I tried it, it gave me the wrong result: it removed the first word too!
Me: Great. Are there more efficient methods with less computing?
library(stringr)
new_string <- str_replace(string, "^\\S+\\s\\S+\\s", "")
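To see why the result is wrong, it helps to look at what the pattern actually matches. A minimal sketch, using base R's regexpr() and sub() with perl = TRUE as stand-ins for stringr (an assumption on my part so that it runs without extra packages; for this simple pattern the behavior should be the same):

```r
string <- "Hello World and Universe"
pattern <- "^\\S+\\s\\S+\\s"

# The text the pattern matches: first word, space, second word, space.
matched <- regmatches(string, regexpr(pattern, string, perl = TRUE))
print(matched)  # "Hello World "

# Replacing the whole match with "" deletes both words, not just the second.
result <- sub(pattern, "", string, perl = TRUE)
print(result)   # "and Universe"
```

Nothing in the pattern is captured and put back, so everything it matches is deleted.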
I had to Google around to find the correct regular expression. The final solution still uses the stringr package, but with a correct regular expression.
library(stringr)
new_string <- str_replace(string, '^(\\s*(?:\\S+\\s+){1})\\S+', '\\1')
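A quick check of this pattern, again as a sketch with base R's sub() and perl = TRUE standing in for str_replace(). One thing I would watch for: the pattern keeps the whitespace on both sides of the removed word, so the result contains a double space. The second pattern is my own variant, not the one I found by searching, and it also consumes the whitespace after the second word.

```r
string <- "Hello World and Universe"

# Capture the first word (plus its trailing whitespace), drop the second word.
kept_gap <- sub("^(\\s*(?:\\S+\\s+){1})\\S+", "\\1", string, perl = TRUE)
print(kept_gap)  # "Hello  and Universe" (two spaces)

# Assumed variant: also consume the whitespace after the second word.
tidy <- sub("^(\\s*\\S+\\s+)\\S+\\s*", "\\1", string, perl = TRUE)
print(tidy)      # "Hello and Universe"
```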
You would think more efficient code would be prioritized, but apparently not. You have to be explicit. The lesson: if the code is important, ask repeatedly for alternative solutions, and then test them. Don't trust ChatGPT.
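Testing the alternatives can be as simple as running each candidate against an input with a known answer. A minimal sketch, where the candidate names and the expected string are my own choices for illustration, and base R's sub() again stands in for stringr:

```r
# Hypothetical candidates, each wrapping one approach from this post.
candidates <- list(
  split_join    = function(s) paste(strsplit(s, "\\s+")[[1]][-2], collapse = " "),
  first_regex   = function(s) sub("^\\S+\\s\\S+\\s", "", s, perl = TRUE),
  # Variant of the final pattern that also removes the trailing whitespace.
  regex_variant = function(s) sub("^(\\s*\\S+\\s+)\\S+\\s*", "\\1", s, perl = TRUE)
)

input    <- "Hello World and Universe"
expected <- "Hello and Universe"

# TRUE where a candidate returns the expected string.
passed <- vapply(candidates, function(f) identical(f(input), expected), logical(1))
print(passed)

# Rough timing of the candidates that passed (numbers vary by machine).
for (name in names(candidates)[passed]) {
  elapsed <- system.time(for (i in 1:10000) candidates[[name]](input))["elapsed"]
  cat(name, ":", elapsed, "seconds\n")
}
```

Checking correctness first and timing second matters here: the fastest-looking one-liner was also the one that returned the wrong string.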
Sr Data Scientist at Great West Casualty
1 year ago: I appreciate you sharing your findings. Just today I used it to learn how to convert a column of lists to a column of strings using a lambda function in Python. I probably would not have thought to do that without seeing your posts.