Table stakes: businesses bet big on AI and data
Welcome to the first one.
It's been a while in the making.
And I'm happy you joined me for the ride.
Love innovation, disruption, productivity, efficiency, and a bucket full of humour to wash it all down?
You're home.
Today's missive is all about a missed opportunity for many businesses:
Parallel processing.
We all ran away from anything parallel when we failed our driving test twice. Speaking from bitter experience...
But it's time to give it a second chance.
Parallel processing is how you make AI work with many big documents at once.
It used to be expensive - prohibitively so. But with Gemini 2.0 Flash, it's not.
First, I'll tell you what it is and how to do it.
Then, in the spirit of Buy my chAI, I'll show you an experiment using parallel processing - with a brilliant result.
I'd love to hear what you think. Drop me a line at [email protected] or a message here on LinkedIn.
Before we get started, you might want some background intel.
Here's the latest episode of AI Today, my podcast diving deep on all things AI in business...
I have read a lot of verbose and pointless articles about how Gemini 2.0 Flash lets us save money and time building a corpus of AI-ready documents - documents that previously would have been damn near impossible to process without a huge financial and temporal cost.
So I thought I would write one that not only pitched the benefits — but showed you how to make it work.
This guide provides a technical framework for analyzing 5 large documents (4 million tokens in all) using the 1-million-token input limit and parallel processing capabilities of Google’s Gemini 2.0 Flash.
We’ll use document-level parallelism, hierarchical summarisation, and Gemini’s native tool integration to achieve comprehensive synthesis without traditional chunking.
Why? To save up to 80% compared with sequential processing, while maintaining full document context.
Technical architecture for multi-document analysis
1. Document preparation and parallel processing setup
Processing each document in full context while respecting token limits.
Implementation
Token calculation: Use Gemini’s countTokens API to verify each document is within Gemini 2.0 Flash’s 1M token input limit.
For oversize documents:
import google.generativeai as genai
from google.ai import generativelanguage as glm
model = genai.GenerativeModel('gemini-2.0-flash-001')
document = glm.Content(parts=[glm.Part(text=your_text)])
token_count = model.count_tokens(document).total_tokens
If a document exceeds the limit, use Gemini’s native document segmentation via split: “SPLIT_MODE_SECTION”.
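If native segmentation isn't available in your SDK version, here's a minimal fallback sketch of my own. It assumes plain-text documents and reuses the model object from the snippet above; split_by_sections is my name, not a Gemini API.

def split_by_sections(text, max_tokens=1_000_000):
    # Greedily pack paragraph blocks into sub-documents under the token limit
    sections, current = [], ""
    for block in text.split("\n\n"):
        candidate = f"{current}\n\n{block}" if current else block
        if current and model.count_tokens(candidate).total_tokens > max_tokens:
            sections.append(current)
            current = block
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections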
Parallel API configuration: Implement asynchronous processing with Python’s asyncio and rate limit management.
import asyncio
from google import generativeai as genai

async def process_document(doc_path):
    # load_document is a placeholder for your own PDF/text extraction helper
    doc_content = load_document(doc_path)
    model = genai.GenerativeModel('gemini-2.0-flash-001')
    return await model.generate_content_async(
        f"Analyze this document and extract:\n1. Key themes\n2. Statistical patterns\n3. Novel insights\n{doc_content}"
    )

async def main():
    # Fire off all five documents concurrently
    tasks = [process_document(f"doc_{i}.pdf") for i in range(5)]
    return await asyncio.gather(*tasks)
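The snippet above fires every request at once. For the rate-limit management I mentioned, here is a minimal sketch using an asyncio.Semaphore; the cap of 5 concurrent calls and the process_with_limit name are my own choices, not Gemini defaults.

# Cap concurrent Gemini calls so a burst of documents doesn't trip rate limits
semaphore = asyncio.Semaphore(5)

async def process_with_limit(doc_path):
    async with semaphore:
        return await process_document(doc_path)

async def main_limited():
    tasks = [process_with_limit(f"doc_{i}.pdf") for i in range(5)]
    return await asyncio.gather(*tasks)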
2. Intermediate analysis and cross-document indexing
Creating a searchable knowledge graph from the parallel outputs.
Implementation
1. Structured extraction: Convert each parallel output into a consistent JSON structure.
response = model.generate_content(
    "Extract entities, relationships, and statistics as JSON. Structure:\n"
    "{'themes':[], 'findings':[], 'sources':[]}",
    generation_config={"response_mime_type": "application/json"}
)
Uses Gemini’s native JSON mode for structured parsing.
2. Vector indexing: Store the results in PostgreSQL with the pgvector extension for hybrid search.
CREATE TABLE doc_analysis (
    id SERIAL PRIMARY KEY,
    content JSONB,
    embedding VECTOR(768)
);
Enables semantic search across all documents.
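To make that concrete, here's a hedged sketch of writing one analysis into the table above and running a semantic search with psycopg. The connection string, the text-embedding-004 model choice, and the example query are my assumptions; response is the JSON-mode result from the extraction step above.

import psycopg  # assumes PostgreSQL with the pgvector extension installed

def embed(text):
    # 768-dim Gemini embedding to match VECTOR(768) in the table above
    vec = genai.embed_content(model="models/text-embedding-004", content=text)["embedding"]
    return "[" + ",".join(str(x) for x in vec) + "]"  # pgvector literal

with psycopg.connect("dbname=analysis") as conn:
    # Store one structured analysis alongside its embedding
    conn.execute(
        "INSERT INTO doc_analysis (content, embedding) VALUES (%s::jsonb, %s::vector)",
        (response.text, embed(response.text)),
    )
    # Semantic search across everything indexed so far
    rows = conn.execute(
        "SELECT content FROM doc_analysis ORDER BY embedding <-> %s::vector LIMIT 5",
        (embed("renewable energy adoption trends"),),
    ).fetchall()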
3. Hierarchical synthesis process
Combining insights while respecting token limits.
Implementation
1. Pairwise comparison: Compare every pair of document analyses.
import asyncio, itertools

synthesis_prompt = """Compare these two document analyses:
- Identify 3 shared themes
- Resolve conflicting data points
- List novel combined insights
Document A: {0}
Document B: {1}"""

# 5 documents yield 10 unique pairs; synthesise them concurrently
async def synthesise(pairs):
    tasks = [model.generate_content_async(synthesis_prompt.format(a.text, b.text))
             for a, b in pairs]
    return await asyncio.gather(*tasks)

batch_results = asyncio.run(synthesise(itertools.combinations(document_responses, 2)))
With five documents, that's 10 unique pairs, processed concurrently.
2. Final synthesis: Use recursive summarisation with context caching.
from google.generativeai import caching
# Cache the intermediate syntheses, then run the final pass against the cache
cache = caching.CachedContent.create(
    model='models/gemini-2.0-flash-001',
    contents=[r.text for r in batch_results])
final_model = genai.GenerativeModel.from_cached_content(cached_content=cache)
final_response = final_model.generate_content(
    "Synthesize these research findings into 10 key conclusions:")
Leverages Gemini’s 2M-token context caching for cost efficiency.
Performance optimisation
This is a big one. I was ecstatic to see the huge time and efficiency savings using parallel processing compared to the old and laborious manual approaches to chunking and all that linear faff.
This was a brilliant experiment and one I have been wanting to try for the longest time.
Most of the effort was actually in finding the datasets that would snugly fit within the boundaries of this test.
But with so many datasets freely available, as I discovered, I won’t be wasting any time there again.
Implementation checklist
Baseline testing
Parallel workflow deployment
Validation system
Cost management strategy
import hashlib, redis
# Skip reprocessing by caching responses keyed on a hash of the document
cache_key = hashlib.sha256(doc_content.encode()).hexdigest()
redis.Redis().set(cache_key, processed_response)
Tiered processing
Projected 4M token cost: $6.00 + $2.80 + $1.50 = $10.30.
Peanuts, right? And yeah, sorry about the formatting. Every time I tackle an article like this I face existential dread because there are so many styling limitations. If you forgive me, I will forgive myself.
Limitations and mitigations
Cross-document contradictions
# Ask Gemini to surface contradictions between two document summaries
conflicts = model.generate_content(
    ["Identify factual conflicts in these statements:",
     doc1_summary, doc2_summary]
)
Temporal analysis
{"finding": "Market growth", "value": 15%, "timeframe": "2023-2024"}
Style variance
# Normalise stylistic differences by rewriting each document into one format
parsed_doc = model.generate_content(
    ["Convert this document to a standardized research format", file]
)
Recommended architecture
We haven’t yet got code formatting for Mermaid diagrams. But you can plug the stuff below right in for a great visual demo.
graph TD
A[Raw docs] --> B{Token check}
B -->|≤1M tokens| C[Parallel processing pool]
B -->|&gt;1M tokens| D[Gemini native segmentation]
C --> E[Structured JSON storage]
D --> E
E --> F[Cross-doc index]
F --> G[Hierarchical synthesis]
G --> H[Final report]
style C fill:#e1f5fe,stroke:#039be5
style G fill:#e8f5e9,stroke:#43a047
See how I changed the > symbol to &gt; so you wouldn’t get an Unsupported markdown: blockquote error in Mermaid? I would give up my last slice of chocolate banana loaf for you, ducky.
You wanna see the diagram without bothering to put in the work?
This approach maintains full document context while enabling cost-effective analysis of massive datasets.
Gemini 2.0 Flash’s native parallelism and multimodal capabilities can process documents 3–5x faster than traditional chunk-based methods.
Putting it into practice
Ok. Let’s pretend I am looking to debunk climate misinformation. I am a man of many facets — I can’t just bake apple cobbler and write Python scripts for podcasters all the time now, can I?
Problem: Detect and debunk climate change misinformation across social media, news articles, and public discourse at scale.
Goal: Identify common false claims, analyse their prevalence, and generate structured debunking responses using parallel document processing with Gemini 2.0 Flash.
Datasets used
Shall we do this? Shall we BLOODY DO THIS?
Let’s dance, Frank.
Parallel processing workflow
Step 1: Dataset ingestion
Process each dataset independently using Gemini 2.0 Flash (1M-token capacity per document).
import json

# Example: Parallel analysis of ClimateMiSt news articles
async def analyze_news(article):
    response = await model.generate_content_async(
        f"Extract misinformation claims as JSON from this news text: {article}",
        generation_config={"response_mime_type": "application/json"}
    )
    return json.loads(response.text)
Step 2: Cross-document synthesis
Merge results into a misinformation “heatmap” using vector similarity.
It’s at this point I wonder where the world would be without Python. It’s already a burning heap of slop and nastiness, so God only knows.
# Identify overlapping claims across datasets
common_claims = find_similar_vectors(tweet_data, news_data, threshold=0.85)
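find_similar_vectors is a placeholder, so here's a minimal sketch of what it might do, assuming each claim record already carries an embedding; the claim and embedding field names are my own.

import numpy as np

def find_similar_vectors(claims_a, claims_b, threshold=0.85):
    # Each item is assumed to look like {"claim": str, "embedding": list[float]}
    matches = []
    for a in claims_a:
        va = np.array(a["embedding"])
        for b in claims_b:
            vb = np.array(b["embedding"])
            cosine = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
            if cosine >= threshold:
                matches.append((a["claim"], b["claim"], float(cosine)))
    return matches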
Step 3: Debunking generation
Use CARDS-style examples to structure responses.
Fact: CO₂ enhances plant growth but harms ecosystems.
Myth: “CO₂ is just plant food — more is better!”
Fallacy: Cherry-picking (ignores climate impacts).
Fact: Rising CO₂ increases invasive species and worsens droughts [2][12].
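And a hedged sketch of generating that Fact-Myth-Fallacy-Fact structure for each overlapping claim, assuming common_claims holds (claim, claim, score) tuples as in the sketch above; the prompt wording is mine, not taken from the CARDS work.

debunk_prompt = """Debunk the claim below using this structure:
Fact: the accurate scientific context
Myth: the false claim, quoted
Fallacy: the reasoning error it relies on
Fact: evidence that corrects the myth

Claim: {claim}"""

debunks = [
    model.generate_content(debunk_prompt.format(claim=claim_a))
    for claim_a, claim_b, score in common_claims
]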
Expected results and advantages
Key outputs:
Why this works
*allegedly
Running this process revealed that 32% of viral tweets (SWEAR DOWN ONE MORE TIME) falsely claim “climate policies cause job losses”. That would allow targeted debunking campaigns using economic data from Copernicus, and could shift public opinion in key regions.
Tools I used
Wrap
4× faster analysis and 2.5× cost savings compared to chunk-based methods, while maintaining scientific rigour?
Parallel processing. It all lines up!
Beyond parallel processing
Now you have got your fingers well and truly dirty doing something spectacular with very little effort or cost, you're doubtless salivating over smart silicon sand and how it can propel success for every business.
Here are a few great examples of businesses already incorporating AI into their processes to enjoy massive gains:
For more incredible examples of how data and AI are revolutionising industry - and how your business can follow in the footsteps of Suzano, Enpal, and Hiscox - drop me a line at [email protected]
Thanks to Enpal, Hiscox, Suzano, and of course the big Google for the inspiration to write this long-winded guide to parallel processing and a look at how amazing businesses are already making AI and data table stakes (couldn't resist the nerdy data ref, ha ha!). The latest AI Today podcast episode is also championing the cause of those digital denizens betting big on the "supercharged excavator" of needles in data haystacks that is artificial intelligence... https://podcasts.apple.com/us/podcast/ai-today/id1770010162