Build You a Library

This post was originally published on The (new) Sampler, my personal blog for technical case studies and general business thoughts at sedar.co. Posts are admittedly infrequent and lean towards statistics and insurtech: subscribe to the RSS feed for updates.

When I started to build out the data science / actuarial team at my current job, I was keen to make sure we had a small library of technical reference books at our immediate disposal. It's not unusual for companies to have such a resource [^1], but in my opinion a good reference library is underrated. Our line of work is technical, creative, and experimental, so it's wise to have inspiration and a good guide. [^2]

The list is far from canonical or exhaustive: it is drawn from my technical education & professional experience, with a few crowd-sourced titles for good measure. [^3] I hope it helps similar data science / statistics / insurance practitioners looking to build out a technical in-house library. The common thread is modern, leading technical references under the general themes of Machine Learning, Bayesian Statistics, Insurance and Data Science.

Each entry has the format:

  • [Title](link-to-publisher) by [Author](social-handle) [Amazon](link-to-buy) <--- commentary with links to online materials etc


General Machine Learning

A solid grounding in Un/Supervised Learning from Data (using & developing algorithms that adapt to observations to provide useful representations, infer behaviours or make predictions):

You might also consider:


Bayesian Statistics & Probabilistic Programming

Specialised statistical modelling, esp. where we care about parsimony, parametric and/or functional form, handling uncertainty in a principled way, and learning from what the data-generating process(es) and observational process(es) tell us:

You might also consider:


Further Statistics Reading for the Insurance Domain

As a generalist ML / stats practitioner I'm biased towards solving technical problems with modern reproducible research and, in particular, Bayesian inference. I believe I'm not alone in finding standard actuarial tools / techniques somewhat archaic and suffering from 'professionalisation', where over-simplified models are unnecessarily implemented by hand, learnt by rote for the sole purpose of passing an exam, rarely questioned and quickly forgotten [^4].

I want to overcome this bias because "Statistics is applied statistics" [^5] and it's vital to understand one's domain in detail: the business processes, the nuances of the data-generating processes, and the hard-won lessons of the domain experts. The following texts appear to lead in very much the right direction:


General Data Analysis / Python / R / Software Dev

It's impossible to cover all ground here, but these are good references for day-to-day "data science work":


General Reading and Data Viz

Inspiration and casual interest - loan these out across the company to help spark ideas and bridge gaps:


Do shout if you have recommendations worth adding!

---


[^1]: These purchases were well-supported internally as part of our wider T&D program, and represent a powerful investment for relatively little money.

[^2]: Proper references can also help to justify the use of non-traditional techniques if you can show that other people (usually smarter than you) think the same way. [It's dangerous to go alone!](https://en.wikipedia.org/wiki/It%27s_dangerous_to_go_alone!)

[^3]: Thanks in particular to Mick Crawford and the folks on the [Pandas Arms](https://thepandasarms.slack.com/) Slack channel.

[^4]: Thanks also to Kenny Holms and the folks on the [Actuaries Anonymous](https://actuariesanonymous.slack.com/) Slack channel for opinions and recommendations on the actuarial collection.

[^5]: Gelman usually has an apposite [quote](https://www.stat.columbia.edu/~gelman/book/gelman_quotes.pdf).

