Interesting data model problems #1 - temporal reference data

I want to write about a common problem, but we often think it as a one-off issue when we encounter it in consulting. I came across it again at a government project recently.

Consider the dropdown below, the data like (1 Family, 2 Family et. c) is what we typically refer to as Reference Data.

Reference Data changes very slowly relative to other data in the application. It may sometimes be hard coded in the program, and at other times it is stored in a database table. A reasonable solution to this might be:

BuildingTypeId	Description
   1	         1 Family
   2	         2 Family
   3	         3-4 Family
   4	         Motel-Hotel

But how do we deal with values that might no longer be valid? Perhaps something like this?

BuildingTypeId	Description    IsActive
   1	         1 Family        1
   2	         2 Family        1
   3	         3-4 Family      1
   4	         Motel-Hotel     0

With a government project, we needed to know when those reference values were actually valid, so we we can change IsActive to a date range.

BuildingTypeId	Description   FromDate     ToDate
   1	         1 Family     1-Jan-2008    NULL2	         2 Family     1-Jan-2008    NULL3	         3-4 Family   1-Jan-2008    NULL4	         Motel-Hotel  1-Jan-2008  1-Jan-20185             Restaurant   1-Jan-2008  1-Jan-2018

Suddenly, our problem space has become complicated.

  1. When we display old data, we need to include all past values in our drop down, i.e. include Motel-Hotel
  2. When we display a blank data entry form, we should only display current values. i.e. exclude Motel-Hotel
  3. If we allow users to edit old data, our UI and validation rules will have to allow for building type to be Motel-Hotel, but not other building types that were valid.
  4. When data is shifted across boundaries and systems, what is the date of record?

I am not trying to solve the problem here, but to enumerate how a simple business requirement can gather complexity very quickly, how this can lead to mis-estimates.

Useful questions to ask are:

  1. Are these values subject to change over time?
  2. Are these values subject to removal over time?
  3. Does this mean this attribute is static and should not change once it is committed?
  4. Which field represents the time event for the building type entry? (see Martin Fowlers post on temporal data on the difference between actual time and record time).

Solutions

1. Look up fields

The simplest and most flexible way of dealing with temporal reference data is to stop dealing with simple methods like this

getValidBuildingTypes -> BuildingTypes[]

instead we need to deal with

getValidBuildingTypes(recordDate) -> BuildingTypes[]

2. Clear documentation

The table BuildingTypes should be properly annotated with the name of the field which constitutes the recordDate. It could be the date the application was lodged, it could be the date the application was paid, or the date application was granted?


要查看或添加评论,请登录

Chui Tey的更多文章

  • The PolyNetwork $600m hack.

    The PolyNetwork $600m hack.

    Kelvin explains it below, but here's a summary for my future reference. Source: https://twitter.

  • Bots, lies and Hoaxy

    Bots, lies and Hoaxy

    This write up is purely for the students of how information / disinformation is propagated. The notion of bots…

  • Interesting data model problems #2 - querying journal/transactional data

    Interesting data model problems #2 - querying journal/transactional data

    In these series of articles, I write about modelling problems that result in underestimates. You may also be interested…

  • Getting out of the poverty trap

    Getting out of the poverty trap

    The New York Times has an article (paywalled) about the value of social signalling in giving poor people the initiative…

  • 457 visas and a country of makers

    457 visas and a country of makers

    I want to chime in on the debate over the abolition of 457 visas in Australia. The issue surrounding 457s is a hard…

  • The Fateful Pull request

    The Fateful Pull request

    Your colleague has created the following code change and requested you review and approve it. Would you have looked a…

  • P-Invoke on OS X with PowerShell

    P-Invoke on OS X with PowerShell

    This is probably of interest to a very small section of the programming community. I wanted to learn a little bit about…

  • I tried out dokku today and this is what I learnt

    I tried out dokku today and this is what I learnt

    What does dokku do? If you have a dokku service running on your VM, you can "git push" to the server and dokku will run…

  • The mainstreaming of exploratory programming

    The mainstreaming of exploratory programming

    Can we apply technology to improve technology itself? Chas Emerick writes of a conversation with Prof Sussman of MIT…

    4 条评论
  • Recognising technological transitions

    Recognising technological transitions

    When there are technical shifts like the picture above, the game is not to find a faster horse, or a better jockey…

社区洞察

其他会员也浏览了