Read more to design better
One habit I built during my early days was to read design docs, even if they did not belong to my team. The first thing I did after joining Amazon, back in 2016, was to go through their internal Wiki portal. The portal hosted all the public design docs and documentation written by various teams. The portal was a goldmine of information.
One thing that I absolutely love about design docs is how practical they are. The designs are not just some random set of boxes drawn on a piece of paper; they describe a practical approach to solving a real problem, and the solution actually gets shipped to production.
With multiple engineers writing them and several tech leads reviewing them, these docs hold all the required context, trade-offs made, alternate designs, implementation nuances, and potential pitfalls. Reading them gives a deeper understanding of the domain, the problem, and the system.
To be honest, I was initially quite overwhelmed reading them. But over time I got used to it and started connecting the dots. So, if you try to do this, do not be discouraged by the initial complexity, because things will get easier over time.
So, if your company also practices writing design docs, do spend time reading them, even if they are from different teams. If not, then be the one who initiates and drives this process.
Forming a habit of reading design docs consistently rewired my thought process and made me a better engineer; hence, I would highly recommend you pick this habit up.
By the way, being hands-on is the best way for you to learn. Practice interesting programming challenges like building your own BitTorrent client, Redis, DNS server, and even SQLite from scratch on CodeCrafters.
Video I posted this week
This week I posted "How Rockset achieves zero data latency and workload isolation at scale".
Rockset has been a great database for me to dissect, and one of the most interesting things that I learned about it is how it achieves zero data latency and workload isolation at scale.
After talking about their architecture, storage layer, and query execution in the last three videos, I published the 4th one about their most amazing feature - Compute Compute Separation - and its internal details.
Instead of just talking about what it does, I have covered the intuition behind the approach and evolution of architecture. It will give you a ton of insights into how distributed databases are designed, built, and scaled.
Paper I read this week
This week I read "Query Attribute Recommendation at Amazon Search".
It is a research paper by Amazon that solves a search problem very similar to the one that I solved for Unacademy.
A search engine returns good results when it gets high-quality input, but people type really short queries like "iPhone 13". Although this information seems complete to us, it is not sufficient for search engines. So there is a need to add Query Understanding.
The core idea is to build a model that expands the search query by generating and extracting relevant query attributes, which are then passed to the search engine for better ranking, advertising, and recommendation.
For example: iPhone 13 -> brand:apple, os:ios, model:iphone, etc.
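To make the shape of the output concrete, here is a minimal sketch of query expansion. It is not the paper's method: the paper builds a model, while this sketch swaps in a tiny hand-written lookup table. The names `KNOWN_ATTRIBUTES` and `expand_query` are hypothetical and exist only for illustration.

```python
# Hypothetical lookup table standing in for a learned model.
# A real system would infer these attributes from data.
KNOWN_ATTRIBUTES = {
    "iphone": {"brand": "apple", "os": "ios", "model": "iphone"},
    "pixel": {"brand": "google", "os": "android", "model": "pixel"},
}


def expand_query(query: str) -> dict:
    """Attach attributes to a raw query so a search engine can use them."""
    attributes = {}
    for token in query.lower().split():
        # Tokens with no known attributes contribute nothing.
        attributes.update(KNOWN_ATTRIBUTES.get(token, {}))
    return {"query": query, "attributes": attributes}


if __name__ == "__main__":
    print(expand_query("iPhone 13"))
```

The search engine would then receive both the raw query and the attribute map, and could use the attributes as filters or ranking signals.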
The information retrieval domain is fascinating and something I focussed on during my master's and during my first two years at Unacademy. The paper is pretty short and crisp and something you can even prototype, hence putting it out as a recommendation.
You can download this and other papers I recommend from my papershelf.
How does PostgreSQL store large rows?
While exploring PostgreSQL internals, I stumbled upon an interesting internal detail about how it stores long rows.
PostgreSQL stores table rows in fixed-size pages (8 KB by default) and does not allow a single row to span multiple pages. So, when a row grows too large (beyond roughly 2 KB by default), it first tries to compress the large column values using its built-in compression algorithms. If that works, then great!
If the row still does not fit, then TOAST (The Oversized-Attribute Storage Technique) comes into the picture. The core idea is to segment the data into chunks and store them in a dedicated TOAST table, while the original table holds only a small pointer (the TOAST table's OID and the value's chunk ID, along with size information) to ensure efficient retrieval.
The TOAST table for a particular table has a name that starts with pg_toast_; you can find it with a simple query. This table stores the TOAST chunks (compressed if required), each identified by a chunk ID and a sequence number for ordered retrieval.
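The compress-then-chunk flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL's implementation: `zlib` stands in for the built-in compression algorithms, `LIMIT` and `CHUNK_SIZE` are arbitrary illustration values, and `store`, `load`, and the list-based `toast_table` are hypothetical names.

```python
import zlib

LIMIT = 2000       # illustrative inline size limit in bytes (assumption)
CHUNK_SIZE = 1000  # illustrative chunk size in bytes (assumption)


def store(value: bytes, chunk_id: int, toast_table: list) -> dict:
    """Keep a value inline, compress it inline, or chunk it into toast_table."""
    if len(value) <= LIMIT:
        return {"kind": "inline", "data": value}

    # Step 1: try compression (zlib is a stand-in, not what PostgreSQL uses).
    compressed = zlib.compress(value)
    if len(compressed) <= LIMIT:
        return {"kind": "compressed", "data": compressed}

    # Step 2: still too large, so split the value into ordered chunks.
    # Keep the compressed form only if it is actually smaller.
    is_compressed = len(compressed) < len(value)
    payload = compressed if is_compressed else value
    for seq, start in enumerate(range(0, len(payload), CHUNK_SIZE)):
        toast_table.append((chunk_id, seq, payload[start:start + CHUNK_SIZE]))
    return {"kind": "pointer", "chunk_id": chunk_id, "compressed": is_compressed}


def load(ref: dict, toast_table: list) -> bytes:
    """Follow the reference and rebuild the original value."""
    if ref["kind"] == "inline":
        return ref["data"]
    if ref["kind"] == "compressed":
        return zlib.decompress(ref["data"])
    # Collect this value's chunks and order them by sequence number.
    chunks = sorted(
        (seq, data) for cid, seq, data in toast_table if cid == ref["chunk_id"]
    )
    payload = b"".join(data for _, data in chunks)
    return zlib.decompress(payload) if ref["compressed"] else payload
```

The extra lookup in `load` for the pointer case is the layer of indirection that makes reading out-of-line values a little slower than reading inline ones.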
We as users of PostgreSQL need not worry about storage, as the database manages it transparently. Still, TOASTed data can incur a slight performance overhead due to the additional layer of indirection and lookup.
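We cannot completely avoid TOAST, but we can minimize the need for it by designing the schema well and choosing a suitable storage strategy for each column.
For example, the MAIN strategy, which allows compression but keeps data inline whenever possible, offers a good balance for frequently accessed, compressible data, while EXTENDED, which allows both compression and out-of-line storage, works well for large, infrequently accessed data.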
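PostgreSQL documents four storage strategies for TOAST-able columns, and they differ in whether compression and out-of-line storage are allowed. The sketch below summarizes them as a lookup table and builds the matching `ALTER TABLE ... SET STORAGE` statement. The helper `set_storage_sql` and the table and column names in the usage line are hypothetical, and the helper does no identifier quoting, so treat it as an illustration rather than production code.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    compression: bool  # may the value be compressed?
    out_of_line: bool  # may the value be moved to the TOAST table?


STRATEGIES = {
    "PLAIN": Strategy(compression=False, out_of_line=False),
    # MAIN values are moved out of line only as a last resort.
    "MAIN": Strategy(compression=True, out_of_line=False),
    "EXTERNAL": Strategy(compression=False, out_of_line=True),
    # EXTENDED is the default for most TOAST-able data types.
    "EXTENDED": Strategy(compression=True, out_of_line=True),
}


def set_storage_sql(table: str, column: str, strategy: str) -> str:
    """Build the statement that changes a column's storage strategy."""
    name = strategy.upper()
    if name not in STRATEGIES:
        raise ValueError(f"unknown storage strategy: {strategy}")
    # No identifier quoting here; names are assumed to be simple and trusted.
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET STORAGE {name};"


if __name__ == "__main__":
    # "articles" and "body" are made-up names for illustration.
    print(set_storage_sql("articles", "body", "external"))
```

EXTERNAL is worth a look for large text or bytea columns that are often read by substring, since skipping compression lets the database fetch only the chunks it needs.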
Interesting articles I read this week
I read a few engineering blogs almost every single day, and here are the three articles I would recommend reading.
Thank you so much for reading this edition of the newsletter. If you found it interesting, you will also love my courses.
System Design for SDE-1s: https://arpitbhayani.me/sys-design
System Design for SDE-2s and above: https://arpitbhayani.me/course
Redis Internals: https://arpitbhayani.me/redis-internals
My knowledge base: https://arpitbhayani.me/knowledge-base
Bookshelf: https://arpitbhayani.me/bookshelf
Research Papers: https://arpitbhayani.me/papershelf