Chapter 2: Improving the efficiency of your locked down eCommerce Site

In the last article in this series we took a look at trimming the session duration associated with the shopping cart, giving up resources faster to support a higher number of users on the same hardware. We are looking to bring efficiency levels on the allocation of hardware to a lean & mean level in advance of the holiday shopping period to ensure that every user has the best possible user experience on the site. This article’s recommendations continue down this same path on improving the efficiency of the existing system, ideally without code changes.

Caching outliers - These should be handled by your CDN

At this point we expect that an eCommerce site will have a CDN employed in front of the site servers to shorten the network path to the end user for content delivery. This shortened path leads directly to faster response times at the client, which in turn leads to a higher conversion rate. There are usually a couple of outliers whose cost is not commonly tracked and which are not appropriately staged in the CDN for resolution to the client. These are HTTP 301 (Permanent Redirect) and HTTP 404 (File Not Found).

There is a very straightforward way to tell if your site can be improved by adjusting the cache settings for these two types of requests. You may recall that the last post recommended enabling the HTTP time-taken field so that time cost information could be tracked for each and every request. We will now take advantage of that data, which has been collected for a little over a week. Use your choice of log processing tool to produce a rollup of the cost of all requests by status. Our firm recommends Splunk for this work. We also use Microsoft LogParser when Splunk is not available. Here is what the query looks like in LogParser:

SELECT sc-status, SUM(time-taken) AS TotalTimeTaken FROM HTTP_Requests.log GROUP BY sc-status

You can choose to output this data in table or chart form. I prefer the chart form, since it gives a view of the scale of the issue.

Together HTTP 301 & 404 account for about 15% of the total request time in the above example, but these values can be significantly higher. In the case of the website The Onion, HTTP 404 responses related to missing articles were the majority of the load on the website. Fixing missing resource requests for pages just isn’t “sexy” in many organizations. Developers would rather work on new features than clean up behind their peers for missing page elements. A missing resource then resolves to an expensive request which cannot be handled in the CDN and must be resolved by the origin web server, the server most distant from the client, which slows the client. The result is some amount of dead requests on the website which can be trimmed out.

HTTP 301, Permanent Redirects, are expensive for a related reason. If not handled by the CDN then these requests are always handled by the origin web servers. This is not just a single request, but a request which results in a redirect to the new location of the resource. The client then makes a request to the new location. Hopefully this new location is in cache and can be resolved by the CDN.

We can recover a substantial portion of the load from each of these request types by using the CDN to cache HTTP 301 & 404 responses for us. In some cases the CDN can even resolve the 301 and keep the final destination in cache, resulting in a single request from the client which returns the final content, rather than a handshake and a pass off to the final file. By pushing the response to a location closer to the client we both remove load from the origin servers, saving those resources for other users during high load conditions, and improve the response time by shortening the network path to where the request is resolved. A faster client converts at a higher rate.

The mechanics of how to do this become a bit tricky. This is where you need to understand whether your CDN settings or your origin server settings have priority in setting the cache policy for your various website elements. If your CDN has priority then the changes are pretty mechanical. You sign on to your CDN console, find the settings related to the caching of HTTP 301 & 404 responses, and set the policy for these types of requests. And then you are done.

On the other hand, if your policy is set to respect the origin web servers then you will need a custom error handler for 404 and the ability to modify the cache header for the generated 301 redirect. The HTTP 301 change would unfortunately be a site modification, something very difficult to implement during a code freeze. The graph above will provide some idea of the impact of the change on the load of the web servers, which can help justify the variance during this period. For the HTTP 404, the custom error handler, while implemented in code, is external to the website code.
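To make the origin-side option concrete, here is a minimal sketch of a wrapper that adds a cache header to 301 and 404 responses without touching the application behind it. It assumes a Python WSGI stack purely for illustration; your platform will have its own equivalent (a custom error page, a server module, or a rewrite rule). The class name `CacheRedirectsAndMisses` and the `max_age_seconds` parameter are hypothetical, and whether your CDN honors `Cache-Control` on these statuses depends on its configuration.

```python
CACHEABLE_STATUSES = ("301", "404")


class CacheRedirectsAndMisses:
    """WSGI middleware: mark 301 and 404 responses as cacheable by the CDN."""

    def __init__(self, app, max_age_seconds):
        self.app = app
        self.header = ("Cache-Control", f"public, max-age={max_age_seconds}")

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # status looks like "404 Not Found"; compare the numeric code only.
            if status.split(" ", 1)[0] in CACHEABLE_STATUSES:
                # Replace any existing Cache-Control header with our policy.
                headers = [(k, v) for k, v in headers
                           if k.lower() != "cache-control"]
                headers.append(self.header)
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping the existing application object, for example `application = CacheRedirectsAndMisses(application, 86400)`, leaves the site code itself unchanged, which is the property that matters during a freeze.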

In both cases set the cache aging to the same period as your regular build push, such as expiring every Sunday at 1:00am. This will result in a substantial reduction of writes to the HTTP error log, since the maximum number of HTTP 404 errors reaching the origin is limited to the number of CDN nodes that need to have their cache populated inside your cache aging window. The number of disk writes to the main HTTP log will also be reduced. Overall this should reduce the dependency on the finite resources of both network (through CDN satisfaction of the request) and disk (through the reduced number of log writes) - improved efficiency.
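Aligning the cache lifetime with the build push comes down to a little date arithmetic. The sketch below assumes the Sunday 1:00am example above and expresses the lifetime as a `Cache-Control: max-age` value in seconds; the function name `seconds_until_next_push` is illustrative, and many CDN consoles let you set an equivalent expiry directly.

```python
from datetime import datetime, timedelta


def seconds_until_next_push(now, push_weekday=6, push_hour=1):
    """Seconds from `now` until the next build push.

    push_weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    days_ahead = (push_weekday - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=push_hour, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # This week's push time has already passed; use next week's.
        candidate += timedelta(days=7)
    return int((candidate - now).total_seconds())


if __name__ == "__main__":
    max_age = seconds_until_next_push(datetime.now())
    print(f"Cache-Control: public, max-age={max_age}")
```

Because the value shrinks as the push approaches, every cached 301 and 404 ages out at roughly the same moment the new build lands.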

Stop Storing Empty Shopping Carts!

The storage of empty shopping carts is something which happens often in eCommerce shops. The reasons are multiple, but it usually comes down to a default shopping cart being handed out to every user arriving on the site, combined with a persistent cart policy which allows a user to return at some indeterminate point in the future and have access to the same shopping cart. Unfortunately this combination of default cart and cart persistence typically results in thousands upon thousands of empty carts being kept in cart storage. This leads to longer and more complex indices on carts plus a radical expansion of the storage needs for carts.

How best to address this? There is both a policy aspect and a technical aspect which need to be implemented. On the policy front empty carts should not be stored. As we are inside of the code freeze window for Holiday sites the best way to address this empty cart storage is to handle this with a scheduled job to query the cart storage system. Essentially, at midnight of each day delete all carts (and related records) which have no contents, zero cart elements. Your architecture is going to determine the exact nature of the query executed, but what will result is a removal of the dead wood in cart storage. Any lookup of a cart will have to traverse a shorter index which directly leads to less disk action and faster response from cart storage.
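As an illustration of the nightly cleanup, here is a minimal sketch using Python's built-in `sqlite3` module. The schema is entirely hypothetical: a `carts` table keyed by `cart_id` and a `cart_items` table referencing it. Your cart store will differ, and any other related records would need their own delete statements.

```python
import sqlite3


def delete_empty_carts(conn):
    """Delete every cart that has no rows in cart_items; return the count."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            """
            DELETE FROM carts
            WHERE NOT EXISTS (
                SELECT 1 FROM cart_items
                WHERE cart_items.cart_id = carts.cart_id
            )
            """
        )
    return cursor.rowcount


if __name__ == "__main__":
    # Demo against an in-memory database with the assumed schema.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE carts (cart_id INTEGER PRIMARY KEY);
        CREATE TABLE cart_items (cart_id INTEGER, sku TEXT);
        INSERT INTO carts (cart_id) VALUES (1), (2), (3);
        INSERT INTO cart_items (cart_id, sku) VALUES (2, 'ABC-123');
        """
    )
    print(f"Removed {delete_empty_carts(conn)} empty carts")
```

Scheduling a script like this with cron or your platform's job scheduler, timed ahead of the backup job, keeps the change outside the website code.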

After the Holiday period ends and code changes are beginning to be incorporated into production then the requirements set for new code can include the provision to never store an empty cart. Until then you can have the job run to remove the empty carts just before you run your backups of the cart system. As a plus your backup window needs should also shrink with the removal of the empty carts.

Test in Prod - This is likely your last week to schedule

If you plan on conducting a production performance test prior to Thanksgiving then this is the week to reach out to your internal performance testing team or your external performance testing services provider. It may take four to five weeks to get your test in the queue for execution, which leaves a small window of opportunity before the onset of Holiday shopping to make any changes.

We expect most IT organizations already have a full plate at this time of year as a part of Holiday readiness. If you need any assistance at all with the above three items (measuring and then adjusting cache settings for HTTP 301 & 404 requests, setting up a query for the removal of empty carts, or staging a test in your production environment prior to the Holiday shopping period) then give us a ring at LiteSquare, 888-212-1104 or [email protected]

James Pulley, LiteSquare
