Trying Google Gemini for Data & Code Analysis
This is a quick overview and my first real attempt to see how well Gemini works. It is not a proper test, and it does not provide a final verdict. It can, I hope, provide some hints and insights.
Approach
I used Gemini for two tasks, both involving coding. The first is analyzing data using very well-known and popular libraries, and the second is writing code to crawl a website to a set of specifications, using a lesser-known library. The idea is to see how well it can figure things out when it has far fewer examples to draw on.
First task: Data analysis with two of Python's most popular libraries (pandas and plotly)
Prompt:
The following is a table called "data" of a Basketball player's career stats. Can you please perform some exploratory data analysis on it? Please use pandas and plotly for data processing and visualization. Please provide the code you used to run the analysis
This is the actual response, which I just copied and pasted, without any changes, and ran the code:
The code is very clear and systematic, uses the right type of chart in each case, and took the liberty of combining two metrics (assists and steals) in one chart. It also numbered the steps and provided clarifications before each step as comments.
Running the code produced these charts:
Not only did it provide the explained code, it also provided its own analysis. Here is a sample of what it said:
The "Field Goal Percentage" was the only metric it got wrong. The rest it described very nicely. Note how well it described the "Points Trend" chart.
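For anyone who wants to try something similar, here is a minimal sketch of this kind of exploratory analysis. It is not Gemini's output: the column names (Season, PTS, AST, STL) and all the numbers are made up for illustration, and the real table may look quite different. It follows the same general pattern described above: numbered steps, a line chart for the points trend, and assists and steals combined in one chart.

```python
import pandas as pd

# Made-up numbers for illustration only; not the table used in this article.
data = pd.DataFrame({
    "Season": ["2019-20", "2020-21", "2021-22", "2022-23"],
    "PTS": [18.5, 21.0, 24.3, 22.8],
    "AST": [4.1, 5.0, 5.6, 6.2],
    "STL": [1.0, 1.2, 1.1, 1.4],
})

# 1. Summary statistics for the numeric columns
summary = data.describe()

# 2. Season with the highest scoring average
best_season = data.loc[data["PTS"].idxmax(), "Season"]

# 3. Reshape assists and steals to long form so both fit in one chart
long_form = data.melt(
    id_vars="Season",
    value_vars=["AST", "STL"],
    var_name="Metric",
    value_name="Per game",
)


def make_figures(df, long_df):
    """Build a points trend line chart and a grouped assists/steals bar chart."""
    # plotly is imported here so the pandas steps above also run without it
    import plotly.express as px

    points = px.line(df, x="Season", y="PTS", markers=True, title="Points Trend")
    playmaking = px.bar(
        long_df,
        x="Season",
        y="Per game",
        color="Metric",
        barmode="group",
        title="Assists and Steals",
    )
    return points, playmaking
```

Calling `make_figures(data, long_form)` returns two figures, and calling `.show()` on each one displays it. The long-form reshape is just one way to get two metrics into a single chart; passing a list of columns as `y` would also work.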
Second task: Crawling a website with advertools, a much lesser-known library than the previous ones
Prompt:
please write code to crawl a large website with advertools:
It was extremely surprising how well it knows the library and its functions:
All function parameters are correct, the setting names are exactly right.
Only, this code doesn't work.
For some reason it felt that it had to unnecessarily complicate things with a strange loop that achieves nothing. If you don't know how to use advertools, or you don't know enough Python, this code is not useful. Actually, it would be really confusing for a beginner.
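For comparison, a basic crawl with advertools needs no loop at all, because a single call to `adv.crawl` follows links on its own. Below is a minimal sketch, not the code Gemini produced. The helper names, file names, and default values are my own placeholders; the settings keys are standard Scrapy settings, which advertools passes through via `custom_settings`.

```python
def build_crawl_settings(max_pages=None, delay_seconds=1.0,
                         log_file="crawl.log", job_dir=None):
    """Return a dict of Scrapy settings for a polite crawl of a large site.

    All defaults are illustrative placeholders, not recommendations.
    """
    settings = {
        "DOWNLOAD_DELAY": delay_seconds,   # pause between requests
        "AUTOTHROTTLE_ENABLED": True,      # slow down if the server struggles
        "LOG_FILE": log_file,              # keep logs out of the console
    }
    if max_pages is not None:
        # Stop after this many pages; handy for a trial run
        settings["CLOSESPIDER_PAGECOUNT"] = max_pages
    if job_dir is not None:
        # Lets a long crawl be paused and resumed later
        settings["JOBDIR"] = job_dir
    return settings


def crawl_site(start_url, output_file="crawl_output.jl", **settings_kwargs):
    """Crawl a site starting from start_url and write one JSON line per page."""
    # Imported here so the settings helper can be used without advertools
    import advertools as adv

    adv.crawl(
        url_list=start_url,
        output_file=output_file,   # advertools expects a .jl file
        follow_links=True,         # discover and crawl linked pages
        custom_settings=build_crawl_settings(**settings_kwargs),
    )
```

A trial run might look like `crawl_site("https://example.com", max_pages=500)`, where the URL is a placeholder. The output can then be loaded with `pd.read_json("crawl_output.jl", lines=True)`.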
This is not a comprehensive study, just two random examples.
So, the observation so far is that on topics with tons of data and millions of examples, it does quite well. The data analysis is very smart, both the code and the descriptions (I asked for exploratory analysis only). At the same time, for things that do not have many examples, it can make stupid mistakes, even while writing Python code (for which there are many millions of examples).
My opinion remains: LLMs are great text and natural language processing tools. They are just not "intelligent".