Handling dates

Handling dates is tough in real life.

Date handling is probably tougher in the data engineering world.

Very few people seem to agree on how to store and handle dates and timestamps. I know that the complexity is due to real life situations of time zones and local processing and also due to formats like 'mm/dd/yyyy' vs 'dd/mm/yyyy'.

Personally, when file names have timestamp in them, and we wish to store such files in a directory, I believe 'yyyy/mm/dd' is the best format, as this allows us to quickly sort the files and arrange them by date.

Interestingly and logically, from the database perspective, a timestamp is neither 'dd/mm/yyy' or 'yyyy/mm/dd' nor any other format. It is usually one or two numbers (longs values), where the value represents microseconds gone by since an epoch date. What we see on the screen are these values formatted for our consumption. As an analogy, if we were dark glasses, the whole world looks very dark. If we were Red coloured glasses, everything appears to have a shade of Red - in the worst case, some complimentary colours will look Black. The same goes for dates and timestamps. The way a date/timestamp is displayed depends completly on the format we specify. If we simply print the raw date/timestamp value, a long number will be displayed, as it is stored in the database.

Hence, the next time we argue that the data is displayed in 'dd/mm/yyy' format, whereas specification mentions it to be 'yyyy/mm/dd', the most likely culprit is the format specified when creating a display equivalent, rather than the data stored in the database

Links

  1. https://www.quora.com/How-is-date-stored-internally-in-databases
  2. https://dba.stackexchange.com/questions/288933/how-does-postgresql-store-timestamp-internally

And one more article -- https://errorprone.info/docs/time

#date_format #date #timestamp #date_handling #timestamp_handling

要查看或添加评论,请登录

Bipin Patwardhan的更多文章

社区洞察

其他会员也浏览了