Using Python to explore FX market microstructure
Like most readers, I'm sure, I've lost count of the number of weeks since things were "normal". When will things get back to normal? Maybe weeks, or more likely months? I suppose that calling the Python function random.random() might be the best approach here! In the meantime, I've been working hard at home, doing client projects and also building out my open source Python libraries.
A few weeks ago I open sourced tcapy, a Python library for doing transaction cost analysis (available on GitHub here). I've also put up lots of Jupyter notebooks to demonstrate tcapy, which can be run interactively in your browser using Binder (thanks to Thomas Schmelzer for his help here). I'm also currently working on using Docker to make tcapy easier to deploy (not quite there yet!). Whilst the main focus of tcapy is transaction cost analysis, that is, analysing your own trade data against market data as a benchmark, I've recently rewritten it so you can also use it to explore high frequency tick data. I'll discuss that now, elaborating on some short posts I've already made on Twitter and LinkedIn.
The difficulty with doing calculations on high frequency tick data is that the datasets are huge. As a result, computations can be time consuming, and you have to split the work into batches. I've written tcapy so that it does its computation in a distributed manner, and there's lots of smart caching of data to speed it up. You have 20 cores? Great, kick off tcapy with lots of Celery workers, and see your computation take advantage of the compute power at your disposal. I'm also keen to extend the computation engine so it can be used easily with AWS Lambda and similar serverless compute services, if I can find a sponsor for this work.
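To give a flavour of what batching a tick calculation looks like, here is a minimal sketch using only the Python standard library. It is not tcapy's code and doesn't use Celery: the function names, the synthetic ticks and the thread pool are all my assumptions for illustration. The idea is simply that each batch returns a partial result (sum of spreads, tick count) that can be merged afterwards.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta


def make_ticks(n):
    """Build n synthetic (timestamp, bid, ask) ticks, one per second (made-up data)."""
    start = datetime(2020, 4, 1)
    return [(start + timedelta(seconds=i), 1.1000, 1.1000 + 0.0001 * (1 + i % 3))
            for i in range(n)]


def split_into_batches(ticks, batch_size):
    """Cut the tick list into consecutive batches of at most batch_size ticks."""
    return [ticks[i:i + batch_size] for i in range(0, len(ticks), batch_size)]


def partial_spread_stats(batch):
    """Return (sum of spreads, tick count) for one batch, so results can be merged."""
    return sum(ask - bid for _, bid, ask in batch), len(batch)


def mean_spread(ticks, batch_size=1000, workers=4):
    """Compute the mean spread by mapping batches over a worker pool and merging."""
    batches = split_into_batches(ticks, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_spread_stats, batches))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


ticks = make_ticks(10_000)
batched = mean_spread(ticks, batch_size=1000, workers=4)
# Same statistic computed in one pass, to check the merge step
direct = sum(ask - bid for _, bid, ask in ticks) / len(ticks)
```

The thread pool here only shows the split-and-merge pattern. For CPU-bound work on real tick data you would want separate processes or a task queue, which is where something like Celery workers comes in.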
In terms of generating results for market microstructure, I've made it easy to use high frequency market data in tcapy to calculate statistics such as the bid/ask spread throughout the data...
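As a rough illustration of the kind of statistic I mean, here is a small standalone sketch that averages the bid/ask spread by hour of day. It is plain Python rather than tcapy's API, and the function names and the quotes are made up for the example. The spread is expressed in basis points of the mid price, which is one common convention, though not the only one.

```python
from collections import defaultdict
from datetime import datetime


def spread_bp(bid, ask):
    """Bid/ask spread in basis points of the mid price."""
    mid = (bid + ask) / 2.0
    return (ask - bid) / mid * 10_000.0


def mean_spread_by_hour(ticks):
    """Average the spread (in bp) of (timestamp, bid, ask) ticks by hour of day."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, bid, ask in ticks:
        sums[ts.hour] += spread_bp(bid, ask)
        counts[ts.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


# Made-up EURUSD-style quotes: tight in the morning, wide late in the day
ticks = [
    (datetime(2020, 4, 1, 8, 0, 1), 1.0999, 1.1001),
    (datetime(2020, 4, 1, 8, 30, 0), 1.0998, 1.1002),
    (datetime(2020, 4, 1, 22, 0, 5), 1.0995, 1.1005),
]
hourly = mean_spread_by_hour(ticks)
for hour, bp in hourly.items():
    print(f"{hour:02d}:00  {bp:.2f} bp")
```

On real tick data you would do the same grouping over millions of quotes, which is exactly where batching the work starts to matter.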
To read the rest of this article on the Cuemacro website, please click here!