Our Data Problem - and a cheap way around it

Mika: "Guru, how will we defeat the almighty large tech and data in our quest for the holy grail that is dynamic game?"
Guru: "Seek not in others' words but go to the Source and you will soon harness knowledge and enlightenment of data. Go now Mika, for I need to code, that is, meditate."

Facing the limitations of data and Bloomberg's power

The most pressing problem with our neural network and the connected game theory code has been the lack of extensive, relevant data. You can use APIs to get basic financial and trading data - a few Yahoo Finance-based APIs and Alpha Vantage, for example. Then you realise the free tiers allow roughly 5 queries per minute. Unless you automate everything and take a six-month vacation, that is a problem.
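For illustration, here is a minimal Python sketch of what working inside such a free-tier limit looks like. The endpoint and query parameters are Alpha Vantage's public ones, but the API key, the tickers and the 13-second pause are placeholder assumptions, not our actual setup.

```python
# Minimal sketch of fetching prices while respecting an assumed ~5 requests/minute cap.
import time
import requests

API_KEY = "YOUR_ALPHAVANTAGE_KEY"   # placeholder key
TICKERS = ["AAPL", "MSFT", "NOKIA.HE"]  # illustrative symbols

def fetch_daily(symbol):
    """Fetch daily prices for one symbol from Alpha Vantage's public endpoint."""
    url = "https://www.alphavantage.co/query"
    params = {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "apikey": API_KEY,
    }
    return requests.get(url, params=params, timeout=30).json()

results = {}
for symbol in TICKERS:
    results[symbol] = fetch_daily(symbol)
    time.sleep(13)  # ~4-5 calls per minute keeps us under the free-tier throttle
```

At this pace, pulling data for a few thousand tickers already takes days, which is exactly the problem described above.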

Then there are excellent commercial databases - very costly if you need hundreds of millions of data points. And then good old Bloomberg devastated us minor players by launching BloombergGPT, trained on decades of extensive data. Should we just give in? Unlikely. So, there are two options:

  1. compete head-on with them, or
  2. eventually purchase API rights and build on their work (extremely tempting).

However, you still have to train your own models to reach the ultimate objective. For us, that is being able to play dynamic financial games with a trained neural network - knowing what to do while taking everyone else's optimal actions in a changing environment into account.

Get to the Basics and Do your Scraping

The solution was clear. Not having a Guru to resort to, I did the next best thing: what I was taught at Oxford.

You do not use secondary sources. You go to the original sources of everything - no shortcuts. We did that. We have been doing the tedious work of scraping extensive, free financial data from mandatory securities filings and other public sites, pre-processing it, building pandas DataFrames, and arranging and evaluating the data throughout. As a result, we are currently at hundreds of millions of data items. Thanks, SEC!
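To give a flavour of the "go to the Source" approach, here is a minimal sketch of pulling one filer's XBRL facts from SEC EDGAR's public companyfacts endpoint and flattening them into a pandas DataFrame. The CIK (Apple's, as an example) and the User-Agent string are illustrative; our actual pipeline loops over thousands of filers and many more fields.

```python
# Minimal sketch: one company's XBRL facts from SEC EDGAR -> flat pandas DataFrame.
import requests
import pandas as pd

HEADERS = {"User-Agent": "research contact@example.com"}  # SEC asks for a contact in the User-Agent
CIK = "0000320193"  # Apple, purely as an example

url = f"https://data.sec.gov/api/xbrl/companyfacts/CIK{CIK}.json"
facts = requests.get(url, headers=HEADERS, timeout=30).json()

rows = []
for tag, concept in facts["facts"].get("us-gaap", {}).items():
    for unit, observations in concept["units"].items():
        for obs in observations:
            rows.append({
                "tag": tag,          # e.g. AssetsCurrent, LiabilitiesCurrent
                "unit": unit,        # e.g. USD
                "end": obs.get("end"),
                "value": obs.get("val"),
                "form": obs.get("form"),  # 10-K, 10-Q, ...
            })

df = pd.DataFrame(rows)
print(df.head())
```

One filer alone yields thousands of rows like this, which is how the item count climbs into the hundreds of millions once you cover the whole filing universe.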

Because we are not building large language models - unless that proves to be the only game in town - we'll crunch the data using low-key supervised learning. In addition, we will add a (legal) reinforcement learning module and a Q-learning module to our software over the summer. Thus, we do not need a hundred NVIDIA GPUs to dance with the data - I'm going to regret saying this, though.
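As a taste of what the planned Q-learning module is about - this is a generic textbook sketch, not our production code, and the states, actions and rewards are purely illustrative assumptions - here is tabular Q-learning on a toy environment.

```python
# Tabular Q-learning sketch: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
import numpy as np

n_states, n_actions = 5, 2          # e.g. coarse "liquidity regime" states, two toy actions
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    """Toy environment: random next state, reward favours action 1 in high states."""
    next_state = rng.integers(n_states)
    reward = 1.0 if (action == 1 and state >= 3) else 0.0
    return next_state, reward

state = 0
for _ in range(10_000):
    # epsilon-greedy action selection
    action = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[state]))
    next_state, reward = step(state, action)
    # Q-learning update
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(np.round(Q, 3))
```

The point of the sketch: the whole thing runs in seconds on a CPU, which is why we do not need that rack of GPUs - yet.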

[Image] The cash saved by getting to the basics buys my daughters a lot of ice cream - the highest game-theoretic utility.

But where, where does it all lead (in a Scottish accent)!

Next month we'll get additional hands just for quantitative-finance machine learning - with a cool title: Machine Learning Quantitative Finance Analyst, or 'Maleqfa' - perhaps not a good acronym. I'll think again.

Pre-processing the data is a huge task, as we are not investing in providers of ready-made data. It may appear tedious, but it combines financial skills with a bit of Python, pandas and NumPy acrobatics, and a general analysis of what is relevant for the solvency and liquidity of the companies.
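A tiny example of what that acrobatics looks like in practice - with illustrative column names and made-up numbers, not our actual schema: clean a few raw balance-sheet items and derive the current ratio.

```python
# Minimal pre-processing sketch: drop incomplete rows, guard against divide-by-zero,
# and compute a basic liquidity measure (current ratio) with pandas/NumPy.
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],           # illustrative companies
    "current_assets": [1_200_000, np.nan, 850_000],
    "current_liabilities": [800_000, 500_000, 0],
})

clean = raw.dropna(subset=["current_assets", "current_liabilities"]).copy()
clean = clean[clean["current_liabilities"] > 0]  # avoid division by zero
clean["current_ratio"] = clean["current_assets"] / clean["current_liabilities"]
print(clean)
```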


Such work is material. See what happens when the data is not properly vetted, pre-processed and 'relevancy-checked': it leads to the demise of the coder and to mandatory meditative practices to restore the willingness to plough through the intricate neural networks built over the months.

What we are testing at the moment is the simple categorisation of very large company datasets based on target categories derived from certain financial ratios. We started with solvency/liquidity measures such as the current ratio, before moving on to multiple financial ratios. Oddly enough (not a surprise), good old corporate finance basics provide the best solutions. The 'bad thing': a lot of financial analysis for the Quant Analyst(s).
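A minimal sketch of that categorisation step, with random data, made-up ratio thresholds and an off-the-shelf classifier rather than our actual model: label companies by current-ratio band and fit a simple supervised model on a handful of ratios.

```python
# Supervised categorisation sketch: derive a liquidity class from the current ratio
# (illustrative thresholds) and train a classifier on a few financial ratios.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "current_ratio": rng.uniform(0.2, 4.0, 1_000),
    "quick_ratio": rng.uniform(0.1, 3.0, 1_000),
    "debt_to_equity": rng.uniform(0.0, 5.0, 1_000),
})
# Target categories: derived here from the current ratio purely to keep the example self-contained
df["liquidity_class"] = pd.cut(df["current_ratio"],
                               bins=[0, 1.0, 2.0, np.inf],
                               labels=["weak", "adequate", "strong"])

X = df[["current_ratio", "quick_ratio", "debt_to_equity"]]
y = df["liquidity_class"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```

In our case the interesting part is not the classifier but the financial analysis behind the labels - which is exactly the 'bad thing' mentioned above.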


As for myself, given that our systems are slow compared to most competitors', I'll just leave the computers to iterate through 100 epochs and head for the best place in Hämeenlinna - 'Appara', as we locals say. There, I do not need a Guru, for Appara makes you a Guru. Happy weekend!

