Browser Performance, Stability, and the Bane of HTML Thrashing
Jeremy Streeter
Senior Technology Leader | MS in Software Engineering | Empathetic | Heuristic | People Focused
You may hear web developers talk about ReactJS or the virtual DOM and wonder why it matters. It helps solve a problem many web applications suffer from, particularly something called HTML Thrashing.
Let us walk through a scenario to understand thrashing. You have a data entry screen with repeated data, laid out in a table with columns. Somewhere in this content, the developers have written some code to address a styling bug, because the right side of the table is supposed to stick to the left side of a container. Fine.
When that code executes, it triggers a cascade of HTML changes: each cell in the table resizes, and each may or may not have to handle that resize event. The more data there is, and the more HTML on the screen displaying it, the more elements on the page must react to that original event. The problem compounds because much of this kind of code is recursive: it checks children, those children check their children, and so on, to make sure the HTML elements look the way we expect. This recursive handling and sudden cascade of events is HTML Thrashing. It temporarily consumes the browser's rendering memory to perform all the operations, and it feels and looks slow.
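To make the pattern concrete, here is a hedged TypeScript sketch of the kind of read/write interleaving that triggers this cascade. The table, the cell selector, and the 200px cap are all hypothetical; the important part is that the first version alternates layout reads and style writes (forcing a fresh layout per cell), while the batched variant does all of its reads before any of its writes.

```typescript
// Hypothetical cell-resizing code in the style that causes thrashing.
function resizeCellsThrashing(table: HTMLTableElement): void {
  const cells = Array.from(table.querySelectorAll<HTMLTableCellElement>("td"));
  for (const cell of cells) {
    // Read: forces the browser to compute an up-to-date layout...
    const width = cell.offsetWidth;
    // Write: ...then immediately invalidates that layout again.
    cell.style.width = `${Math.min(width, 200)}px`;
  }
}

// Same hypothetical logic, but with reads and writes batched into two phases.
function resizeCellsBatched(table: HTMLTableElement): void {
  const cells = Array.from(table.querySelectorAll<HTMLTableCellElement>("td"));
  // Phase 1: do all the reads while the current layout is still valid.
  const widths = cells.map((cell) => cell.offsetWidth);
  // Phase 2: do all the writes; layout is recalculated once, not once per cell.
  cells.forEach((cell, i) => {
    cell.style.width = `${Math.min(widths[i], 200)}px`;
  });
}
```

With a small table the difference is negligible; with thousands of cells, the first version forces the browser to re-lay out the page on every iteration.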
In some cases, this cascade fires for every cell in every row and column of the table whenever a change occurs. It is often associated with two-way "dirty check" data binding, which runs loops that constantly check for (dirty) changes.
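For context, here is a minimal, hypothetical sketch of that dirty-check loop in TypeScript. Real frameworks (AngularJS-style digest loops, for example) are far more involved, but the shape is the same: every binding is re-evaluated on every pass until nothing changes.

```typescript
// One watcher per bound value: an expression to evaluate, the last value
// seen, and a DOM update to run when the value changes.
type Watcher = {
  get: () => unknown;
  last: unknown;
  onChange: (value: unknown) => void;
};

const watchers: Watcher[] = [];

function watch(get: () => unknown, onChange: (value: unknown) => void): void {
  watchers.push({ get, last: undefined, onChange });
}

// The "digest": keep looping over every watcher until nothing is dirty.
// With thousands of bound cells, every pass re-evaluates every binding, and
// each DOM update can dirty other bindings, so the loop repeats.
function digest(maxPasses = 10): void {
  for (let pass = 0; pass < maxPasses; pass++) {
    let dirty = false;
    for (const w of watchers) {
      const value = w.get();
      if (value !== w.last) {
        w.last = value;
        w.onChange(value); // a DOM write for every dirty binding
        dirty = true;
      }
    }
    if (!dirty) return; // model and DOM are in sync
  }
  throw new Error("digest did not stabilize");
}

// Hypothetical usage: one watcher per bound cell, digest after every event.
// watch(() => model.price, (v) => { priceCell.textContent = String(v); });
// digest();
```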
You can probably see from that description why thrashing is bad, but to put it plainly: thrashing makes a web page feel slow because the browser cannot render fast enough.
There are several ways to reduce thrashing.
One of the most popular solutions is the virtual DOM. Many developers build custom virtual DOM solutions to fit their needs; ReactJS is a well-marketed, robust, and actively maintained virtual DOM library.
ReactJS, one of the many virtual DOM solutions out there, applies changes to a virtual DOM (a code-only representation of the HTML that has not yet been rendered), updates it in response to state changes, and then commits those changes to the actual DOM all at once (this is an intentional simplification). These are the basics of a virtual DOM in general.
The virtual DOM is all about making changes to a code-only copy of the HTML currently displayed, and then handing those changes to the browser at intervals, reducing HTML thrashing and decreasing render time. The result feels fast: the HTML still thrashes once in a while, but considerably less, because the browser does not have to compensate for individual changes to the DOM as frequently.
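As an illustration of the idea (not of how ReactJS is implemented internally), here is a minimal virtual DOM sketch in TypeScript: build a cheap in-memory tree, diff it against the previous tree, and touch the real DOM only where something actually changed. The h, create, and patch helpers are invented for this sketch.

```typescript
// A virtual node: just a tag name and children (elements or text).
type VNode = {
  tag: string;
  children: (VNode | string)[];
};

function h(tag: string, ...children: (VNode | string)[]): VNode {
  return { tag, children };
}

// Build a real DOM subtree from a virtual one.
function create(node: VNode | string): Node {
  if (typeof node === "string") return document.createTextNode(node);
  const el = document.createElement(node.tag);
  node.children.forEach((child) => el.appendChild(create(child)));
  return el;
}

// Patch `dom` in place so it matches `next`, reusing what already matches `prev`.
function patch(parent: Node, dom: Node, prev: VNode | string, next: VNode | string): void {
  // Text changed, or the element tag changed: replace the whole subtree.
  const replaced =
    typeof prev === "string" || typeof next === "string"
      ? prev !== next
      : prev.tag !== next.tag;
  if (replaced) {
    parent.replaceChild(create(next), dom);
    return;
  }
  if (typeof prev === "string" || typeof next === "string") return; // identical text

  // Same tag: recurse into the children both trees share...
  const shared = Math.min(prev.children.length, next.children.length);
  for (let i = 0; i < shared; i++) {
    patch(dom, dom.childNodes[i], prev.children[i], next.children[i]);
  }
  // ...append children that are new...
  for (let i = shared; i < next.children.length; i++) {
    dom.appendChild(create(next.children[i]));
  }
  // ...and drop children that no longer exist (from the end, so indexes stay valid).
  for (let i = prev.children.length - 1; i >= next.children.length; i--) {
    dom.removeChild(dom.childNodes[i]);
  }
}

// Hypothetical usage: build a new tree per state change, patch the DOM once.
// let prevTree = h("tr", h("td", "1"), h("td", "old"));
// const root = document.body.appendChild(create(prevTree));
// const nextTree = h("tr", h("td", "1"), h("td", "new"));
// patch(document.body, root, prevTree, nextTree); // only the changed text node is touched
```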
The less an application needs to maintain two-way data binding with dirty-check loops, the less likely it is to need such strict UI management. Angular 2 and a number of other templating platforms take their own approaches to limiting direct DOM updates. The solution an organization picks should depend on whether the software is new or existing and how far along the application is in the development process. HTML Thrashing typically becomes an issue with complex, large data lists that require many data interactions.
An excellent way to manage thrashing is server-side paging, which inherently reduces the size of the data sets hitting the browser, making the page considerably easier to render and quicker to run. This places a lot of responsibility on the web servers and affects how they perform, making it important that the web application undergoes thorough and robust load testing for both speed and stability.
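From the browser's side, server-side paging can be as simple as the sketch below. The endpoint, the page and pageSize query parameters, and the response shape are all hypothetical; the point is that the browser only ever receives and renders one page of rows.

```typescript
// Hypothetical paged response shape returned by the server.
type Page<T> = {
  items: T[];
  page: number;
  pageSize: number;
  totalItems: number;
};

// Ask the server for one page of data; the server does the filtering,
// sorting, and slicing before anything reaches the browser.
async function fetchPage<T>(baseUrl: string, page: number, pageSize = 50): Promise<Page<T>> {
  const url = new URL(baseUrl);
  url.searchParams.set("page", String(page));
  url.searchParams.set("pageSize", String(pageSize));

  const response = await fetch(url.toString());
  if (!response.ok) {
    throw new Error(`Paged request failed: ${response.status}`);
  }
  return (await response.json()) as Page<T>;
}

// Hypothetical usage: render 50 rows instead of 50,000.
// const page = await fetchPage<OrderRow>("https://example.com/api/orders", 1);
// renderTable(page.items);
```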
I am certain there are other ways to mitigate HTML Thrashing and improve the performance and stability of a web application, but these are the two I see as the opposing sides of the board.
With my experience dealing with this problem, I am a fan of the server paging approach. Placing responsibility for data management and data interactions squarely on a server keeps the concerns separated and lets the browser do what it is good at: displaying the data sent to and from a server. Even if the number of requests to and from the server increases, reducing the browser's responsibilities, and the potential problems that come with them, is well worth the effort.
Whatever choice an engineering team makes, it is worth building a rapid prototype of each of the options above and comparing them.