Caching helps. How you do it matters too. My story!

1. I decided to cache an array of objects.

2. Logs showed that repeated calls needed the same data again and again. Perfect for caching.

3. One caveat: the array could contain more than 1 million elements.

4. Keeping them in memory for extended periods was not an option because the size was huge, and a process restart would force the cache to be repopulated anyway.

5. Hence, persistence was needed.

6. Decision: use the file system to cache the data. (The project already implemented the required interfaces, so let's experiment.)

7. Next decision: how do I write the in-memory object to a file, read it back later, and recreate the same object?

8. Since time immemorial, people have called this Serialization and Deserialization (S&D). I just had to implement two functions, Serialize() and Deserialize(), and my job was done.

9. Searched for a good Microsoft reference to copy-paste code from :-D. The sample used BinaryFormatter for S&D.

10. Copy-paste successful. A little local testing. Things looked fine. Deployed to Dev.

11. Test 1: array length 600K. S is fast. Data cached to file. Cool.

12. Ran the test again to check whether caching worked. Expected a super fast response time.

13. Checked the logs. The story was entirely different: deserialization was super slow, slower than the actual network call. The network call took ~17 sec. Deserialization took 48 sec! May as well make the network call every time, right?

14. Oh my god! What have I done?

15. Took a step back and researched multiple S&D libraries and their performance. Looked into how S&D actually works and which parameters matter. I found out that BinaryFormatter has O(n^2) complexity for D of large objects. Another major factor was the type of data being serialized and deserialized. This clicked.

16. Generic solutions are built for the general populace. In this case, for "ANY" type of object to be S&D.

17. Specific logic is faster than a generic solution. A generic solution has to take care of a lot of things. A specific solution, not so much.

18. My objects were string arrays of length > 1M, with each string shorter than 100 characters.

19. Decided to throw out BinaryFormatter. Wrote a custom S&D. Just plain string read/write. Nothing fancy.

20. Boy, did this work.

21. Insane drops. S is even faster than before. D drops from 48 sec to 1.5 sec for the same object. :-D
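For illustration, the "plain string read/write" idea from step 19 might look roughly like the sketch below. It is written in Python rather than the .NET used in the project, and it is not the original code: the function names, the one-string-per-line layout, and the count header on the first line are all assumptions made for this example. It relies on one assumption about the data, namely that no string contains a line break.

```python
from pathlib import Path
from typing import List


def serialize(strings: List[str], path: Path) -> None:
    """Write a count header followed by one string per line.

    Assumption (hypothetical format): no string contains a line break.
    """
    for s in strings:
        if "\n" in s or "\r" in s:
            raise ValueError("strings must not contain line breaks")
    # newline="\n" keeps the file layout identical on every platform.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(f"{len(strings)}\n")
        f.writelines(s + "\n" for s in strings)


def deserialize(path: Path) -> List[str]:
    """Read the strings back and check them against the count header."""
    with open(path, "r", encoding="utf-8", newline="\n") as f:
        expected = int(f.readline())
        strings = [line.rstrip("\n") for line in f]
    if len(strings) != expected:
        raise ValueError(f"expected {expected} strings, found {len(strings)}")
    return strings
```

The count header is only a cheap integrity check against a truncated file. There is no type metadata and no object graph to walk, which is the kind of work a generic serializer cannot skip.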

Lesson learnt: think about the data you are handling. Can you make some assumptions that will always hold true? If yes, exploit them to the fullest.
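The file-backed cache from steps 4 to 6 can be sketched in the same spirit. This is a minimal Python sketch, not the project's implementation: `load_cached` and the `fetch` callback (standing in for the slow network call) are hypothetical names, and it again assumes the strings contain no line breaks.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, List


def load_cached(path: Path, fetch: Callable[[], List[str]]) -> List[str]:
    """Return the strings from the cache file, or fetch and persist them.

    Assumption: no string contains a line break (one string per line).
    """
    if path.exists():
        with open(path, "r", encoding="utf-8", newline="\n") as f:
            return [line.rstrip("\n") for line in f]

    data = fetch()  # the expensive call, made only on a cache miss

    # Write to a temporary file in the same directory, then rename it into
    # place, so a crash mid-write never leaves a half-written cache file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as f:
        f.writelines(s + "\n" for s in data)
    os.replace(tmp_name, path)
    return data
```

Because the cache lives on disk, it survives a process restart, and the second call for the same path never invokes `fetch`.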

Souparno Daripa

Sr. Software Engineer at Gainsight

4 years ago

This is great! Sometimes we find our code works better than most of the libraries. And this is just because we look to fulfill a specific requirement, not the generic one.

Jas Arora

Software Engineer 2 at Microsoft

4 years ago

Please keep writing these tips and insights; they help us.
