Comprehensive Guide to ChatGPT Conversation Analysis - A quick 5-Minute Exercise
Introduction
Analyzing your conversations with ChatGPT can offer fascinating insights into your chat patterns, topics of interest, and engagement habits. The process involves extracting the raw data from your ChatGPT sessions and processing it with Python on your own system to derive meaningful statistics. You can find my own stats at the end.
If you already have Python set up, this will take you no more than 5 minutes to get interesting insights into your ChatGPT usage.
Privacy Note:
Always keep privacy in mind. If your chat data contains personal or sensitive information, ensure that your analysis respects privacy considerations and that the data is securely handled.
These analyses can provide valuable insights into your interaction styles and preferences, as well as a bit of fun in looking back at your conversational journey with ChatGPT.
For me, ChatGPT has become a co-pilot on various projects and tasks.
It also assisted me in making this analysis: it provided the Python code below that helped analyze the raw conversation data.
Step 1: Obtaining your Raw Data
In ChatGPT, open Settings, go to Data controls, and choose Export data. You will receive an email with a download link; the archive contains a conversations.json file with your full chat history, which is the file the scripts below work on.
Step 2: Setting Up Python Environment
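As a minimal setup sketch (assuming Python 3 and pip are already installed on your system), the following prepares everything the two scripts below need:

```shell
# Verify a Python 3 interpreter is available
python3 --version

# Install the third-party libraries used in Step 4
# (the Step 3 script needs only the standard library)
python3 -m pip install --quiet pandas nltk
```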
Step 3: Initial Analysis with Python
Use the following Python script as your starting point. This script calculates the total number of main and sub-chats, total messages, 'As an AI' warnings, thank yous, and an estimated total duration of your chats:
import json

def load_data(file_path):
    with open(file_path, 'r') as file:
        data = json.load(file)
    return data

def parse_conversations(data, avg_time_per_message=30):
    stats = {
        'total_main_chats': 0,
        'total_sub_chats': 0,
        'total_messages': 0,
        'as_ai_warnings': 0,
        'thank_yous': 0,
        'estimated_total_duration': 0  # in seconds
    }
    for conversation in data:
        mapping = conversation.get('mapping', {})
        for conv_id, conv_data in mapping.items():
            message = conv_data.get('message')
            if message:
                num_messages = len(conv_data.get('children', []))
                stats['total_messages'] += num_messages
                content = message.get('content', {}).get('parts', [])
                if content:
                    text = extract_text(content)
                    # Count both phrases case-insensitively
                    stats['as_ai_warnings'] += text.lower().count('as an ai')
                    stats['thank_yous'] += text.lower().count('thank you')
                # A node counts as a main chat if it has no parent,
                # or if its parent is the root of the conversation tree
                parent_id = conv_data.get('parent')
                is_main_chat = not parent_id or mapping.get(parent_id, {}).get('parent') is None
                if is_main_chat:
                    stats['total_main_chats'] += 1
                else:
                    stats['total_sub_chats'] += 1
                # Estimate duration for each message
                stats['estimated_total_duration'] += num_messages * avg_time_per_message
    return stats

def extract_text(content_parts):
    # Message parts can be plain strings or dicts with a 'text' key
    text_parts = []
    for part in content_parts:
        if isinstance(part, str):
            text_parts.append(part)
        elif isinstance(part, dict) and 'text' in part:
            text_parts.append(part['text'])
    return " ".join(text_parts)

def main():
    file_path = 'path_to_your_json_file.json'  # Replace with the actual file path
    data = load_data(file_path)
    stats = parse_conversations(data)
    # Convert estimated total duration from seconds to hours
    stats['estimated_total_duration'] = stats['estimated_total_duration'] / 3600
    print(f"Total Main Chats: {stats['total_main_chats']}")
    print(f"Total Sub Chats: {stats['total_sub_chats']}")
    print(f"Total Messages: {stats['total_messages']}")
    print(f"'As an AI' Warnings: {stats['as_ai_warnings']}")
    print(f"Thank Yous: {stats['thank_yous']}")
    print(f"Estimated Total Duration (hours): {stats['estimated_total_duration']}")

if __name__ == "__main__":
    main()
Replace 'path_to_your_json_file.json' with the actual path to your JSON file.
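If you want to sanity-check the counting logic before pointing the script at your real export, you can run a condensed version of it against a tiny hand-made structure that mimics the export format (the node IDs and message texts below are invented for illustration):

```python
# A tiny fake export with one conversation; the structure mirrors
# the real conversations.json, but all ids and texts are made up
sample = [
    {
        "mapping": {
            "root": {"message": None, "parent": None, "children": ["a"]},
            "a": {
                "message": {"content": {"parts": ["Thank you, as an AI test"]}},
                "parent": "root",
                "children": ["b"],
            },
            "b": {
                "message": {"content": {"parts": ["thank you again"]}},
                "parent": "a",
                "children": [],
            },
        }
    }
]

# Condensed form of the counting loop from the script above
stats = {"total_messages": 0, "thank_yous": 0}
for conversation in sample:
    for node in conversation["mapping"].values():
        message = node.get("message")
        if message:
            stats["total_messages"] += len(node.get("children", []))
            text = " ".join(p for p in message["content"]["parts"] if isinstance(p, str))
            stats["thank_yous"] += text.lower().count("thank you")

print(stats)  # → {'total_messages': 1, 'thank_yous': 2}
```

The same walk over `mapping` is what the full script performs for every conversation in your export.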
Step 4: Further Analysis
Once you have an initial understanding of your chat data, you can extend your analysis to explore the most active hours, average message length, and common themes.
import json
from datetime import datetime
import pandas as pd
from collections import Counter
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')

def load_data(file_path):
    with open(file_path, 'r') as file:
        data = json.load(file)
    return data

def analyze_data(data):
    timestamps = []
    message_lengths = []
    words = []
    for conversation in data:
        mapping = conversation.get('mapping', {})
        for conv_id, conv_data in mapping.items():
            message = conv_data.get('message')
            # Only consider messages you wrote, not the assistant's replies
            if message and message.get('author', {}).get('role') == 'user':
                content = message.get('content', {}).get('parts', [])
                if content:
                    text = " ".join([part for part in content if isinstance(part, str)])
                    message_lengths.append(len(text.split()))
                    words.extend(text.lower().split())
                create_time = message.get('create_time')
                if create_time:
                    timestamps.append(datetime.fromtimestamp(create_time))
    return timestamps, message_lengths, words

def main():
    file_path = 'path_to_your_json_file.json'  # Replace with your file path
    data = load_data(file_path)
    timestamps, message_lengths, words = analyze_data(data)

    times_df = pd.DataFrame({'timestamp': timestamps})
    times_df['hour'] = times_df['timestamp'].dt.hour
    most_active_hour = times_df['hour'].mode()[0]

    avg_message_length = sum(message_lengths) / len(message_lengths)

    # Drop English stop words and non-alphabetic tokens before counting
    stop_words = set(stopwords.words('english'))
    filtered_words = [word for word in words if word not in stop_words and word.isalpha()]
    word_freq = Counter(filtered_words)
    most_common_words = word_freq.most_common(10)

    print(f"Most Active Hour: {most_active_hour}")
    print(f"Average Message Length: {avg_message_length:.1f} words")
    print("Most Common Words:", most_common_words)

if __name__ == "__main__":
    main()
Replace 'path_to_your_json_file.json' with the path to your JSON file containing the chat data.
Conclusion
This guide offers a structured approach to analyzing your ChatGPT conversation data. Starting from extracting raw data to performing initial analysis with Python, you can gain valuable insights into your interaction with AI.
You can also go further: for example, an hourly activity heatmap showing when you are most active in your chats, a histogram of message lengths to visualize their distribution, or a bar chart of the most common words used in your chats. For these additional insights you will need to extend the code; GPT will help you with that ;-)
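As one small example of such an extension, a quick text-based view of hourly activity needs no plotting library at all. This sketch assumes you already have the timestamps list produced by the Step 4 script; the datetimes below are invented stand-ins:

```python
from collections import Counter
from datetime import datetime

# Stand-in for the timestamps list returned by analyze_data();
# these values are invented for illustration
timestamps = [
    datetime(2024, 1, 1, 9, 5),
    datetime(2024, 1, 1, 9, 40),
    datetime(2024, 1, 2, 14, 12),
    datetime(2024, 1, 3, 9, 55),
]

# Tally messages per hour of day and draw a simple text bar chart
hour_counts = Counter(ts.hour for ts in timestamps)
for hour in sorted(hour_counts):
    print(f"{hour:02d}:00  {'#' * hour_counts[hour]}")
```

Swapping the `#` bars for a matplotlib bar chart or a day-by-hour heatmap is a natural next step once the tally works.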
My Results:
I have been using ChatGPT for a long time, since autumn 2022. I had the pleasure of testing it early and watching the platform adapt and evolve. Naturally, that hooked me into trying out much more and using it more often to assist me on my various projects and tasks.
Interpretation:
Note:
Remember, the estimated total duration is based on a preset average time per message. If your actual message interaction time differs from this average, the total estimated hours might vary.
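To make that note concrete, here is the arithmetic with an invented message count: at 30 seconds per message, 1,000 messages come out to roughly 8.3 hours, and assuming 60 seconds per message doubles the estimate:

```python
# Hypothetical message count; replace with your own total
total_messages = 1000

# Compare two assumed per-message durations (in seconds)
for avg_seconds in (30, 60):
    hours = total_messages * avg_seconds / 3600
    print(f"{avg_seconds}s per message -> {hours:.1f} hours")
```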
Like this? Follow me.
I'm Stephan Götze, a MarTech HERO, and I help companies to "Unlock Marketing", providing strategies for Modern Marketing Leaders in the Data-Driven Age.
Click my name + Follow.