Text and trading
People usually ask me several types of questions. The most common one is where I think you can find the best burger. I do have an answer to this (which may or may not be a secret), but in order to validate it, I'll need to do a lot more burger-based research in the coming years. There's obviously a financial (and possibly health-related) cost to this research, but I'm willing to make the sacrifice to find the perfect burger.
When it comes to work-related questions, the subjects I usually get asked about revolve around FX, alt data, Python and so on. On alt data, a pretty common question is some variation of the following: what dataset should I look at when trading? There is no generic answer that works for every use case. In The Book of Alternative Data, Alexander Denev and I went through quite a few different use cases across a number of asset classes and datasets to give a flavour. At the current time, there are thousands of alt datasets from many different vendors, so picking a single dataset from this list is extremely difficult. Doing it properly would require a lot of due diligence and research time. Large funds have dedicated teams of data strategists sifting through datasets to shortlist appropriate ones, and then research teams to do the number crunching on a number of those.
The answer I usually give is that it depends on how much structuring work they want to do on a dataset. If they are willing to spend the time (and have a budget), text-based datasets can be a nice place to start. One reason is that there are a lot of text datasets out there. In many cases, vendors will have already structured the text for you, adding useful labels to help you make sense of it.