Aging code

Thomas Schmelzer

Portfolio construction and technology @ ADIA | Commodities and LS Equities | Visiting Scholar at Stanford.

Published: December 23, 2023

Code does not age well, in particular if it is written in Python.

This year I had the pleasure to conduct research on a paper on the critical line algorithm by Marcos Lopez de Prado and David Bailey. The paper is here. I appreciate when academics publish code (fragments). My PhD supervisor Lloyd Nick Trefethen is a big proponent of this approach. He published an entire collection of his gems in the paper Ten digit algorithms.

Code is (for me) a lot more expressive and clearer than any mathematical poetry academics and practitioners come up with. The ultimate truth is in the code, not in the paper. Prof. Alexander Lipton may disagree with me. Plenty of papers manage to get an unhealthy mismatch between the two. Code makes a paper stronger...

The paper by de Prado and Bailey does a good job here and the code is embedded in very detailed comments. However, the code did not age well. I failed to get it running. The code is approximately 10 years old.

I moved from Matlab to Python in 2011, pushed by developer legend Marcin Snarski. I first went back to my old code from the very same period. My old code is available on GitHub. My old code isn't exactly in perfect shape either. Sometimes I go back to my old code at night and address the biggest problems...

Today, I can only offer suggestions on how to mitigate the effect. The problem with Python is that the environment and the actual code are often treated as separate entities. Because of this weakness I fell in love with containers. There we can embed both code and environment together in the same unit. This works great for apps, and the result stays immutable for ages.
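
Short of a container, one lightweight way to keep code and environment closer together is to record the interpreter and package versions next to every result. The sketch below is only an illustration of that idea: the function name `snapshot_environment` and the example package list are assumptions, not something taken from the paper or from my repositories.

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Return the Python version and the installed version of each package.

    Packages that are not installed are recorded as None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "packages": versions,
    }


if __name__ == "__main__":
    # "numpy" is only an example; list whatever your code actually imports.
    print(json.dumps(snapshot_environment(["numpy"]), indent=2, sort_keys=True))
```

Storing such a snapshot with the output does not stop the code from aging, but it does tell a future reader which environment the numbers came from.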

Some may refuse containers, and they don't make sense if you just develop a package used in a bigger application. In this case, you may want to drown your package in tests. Those tests will help to identify the point where your package will eventually fall over, and it will. My first tests check whether the code can reproduce the results stated in the paper... There should always be a paper somewhere with your code. Even if it's not intended for publication.
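
A test that pins the code to the numbers in a paper can be very small. The sketch below uses a toy stand-in: minimum-variance weights for uncorrelated assets, where the answer is known in closed form. The function `min_variance_weights` and the constant `REFERENCE_WEIGHTS` are hypothetical names for this illustration. In a real project the reference values would be copied from the tables of the paper.

```python
import math

# Stand-in for a table in a paper: minimum-variance weights for three
# uncorrelated assets with variances 1, 2 and 4. The values follow from
# w_i being proportional to 1 / variance_i; they are NOT taken from any
# publication.
REFERENCE_WEIGHTS = [4 / 7, 2 / 7, 1 / 7]


def min_variance_weights(variances):
    """Fully invested minimum-variance weights for uncorrelated assets."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


def test_reproduces_reference_table():
    """Fail loudly once the code no longer reproduces the stored numbers."""
    weights = min_variance_weights([1.0, 2.0, 4.0])
    assert math.isclose(sum(weights), 1.0, rel_tol=1e-12)
    for got, expected in zip(weights, REFERENCE_WEIGHTS):
        assert math.isclose(got, expected, rel_tol=1e-9)
```

If the file is named along the lines of `test_paper.py`, pytest will collect the `test_` function automatically. The comparison uses a tolerance rather than exact equality, because a new version of a numerical dependency may change the last digits without changing the result in any meaningful way.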

Today there are also tools like dependabot that update your packages following a schedule you define. I use it for most of my packages. Dependabot does not blindly update dependencies. It generates a pull request, which triggers all the aforementioned tests. My little bot goes wild every Sunday...

But leaving all technical helpers aside, we need to accept that code needs love and attention. We should append "But it worked 10 years ago" to "But it worked on my computer"...

There's a lot of hype around "reproducible science", but without addressing the aging issue this will fail even with the very best intentions.

I wish you all a wonderful Christmas and a great start to 2024...

Thomas

p.s.: If you are interested in the critical line algorithm: GitHub. With code and dependabot in action...

Thomas Schmelzer

Portfolio construction and technology @ ADIA | Commodities and LS Equities | Visiting Scholar at Stanford.

1 month ago

Today I am using renovate rather than dependabot. Both are good tools though.

Nazia Khan

Founder & CEO SimpleAccounts.io at Data Innovation Technologies | Partner & Director of Strategic Planning & Relations at HiveWorx

8 months ago

Thomas, great insights! Thanks for sharing!

Steve Cannon

Two Sigma, ADIA, AQR, IBKR | quant for every investor | trunc.ai

1 year ago

Love people who publish code as well.

Daniel Uhlemann

Strategy-Analytics-Investments "Not everything that can be counted counts, and not everything that counts can be counted." by William Bruce Cameron

1 year ago

Whatever happened to the Universal Compiling System? Also a very relevant issue: some of those legacy codes may have an application for quantum computing.

Tobin Driscoll

Unidel Chaired Professor of Mathematical Sciences at the University of Delaware

1 year ago

More than once I've found that a published code does something not explained in, or even explained differently in, the prose of its related paper. For all its drawbacks, I will give MATLAB some credit. I've maintained the SC Toolbox since 1994, and only once did the language introduce breaking changes (in graphics). That is a remarkable achievement!
