Demystify ETag
Abhishek Rajendra Prasad
Goldman Sachs Engineering Associate | MS in CS @ UTD | ex AirAsia | IIT-Dh
By Abhishek R and Naveen S.R
What is ETag?
The ETag (entity tag) is an HTTP response header that identifies a specific version of a resource. It is one of the mechanisms that HTTP provides for web cache validation. It makes caches more efficient and saves bandwidth, because a web server or backend service does not need to resend the full response if the content has not changed. ETags also help prevent simultaneous updates of a resource from overwriting each other.
If the resource at a given URL changes, a new ETag value must be generated. ETags are similar to fingerprints: comparing them quickly determines whether two representations of a resource are the same. They might also be set to persist indefinitely by a server/service.
Where can we use ETag?
Avoiding mid-air collisions
Suppose multiple clients are trying to change a wiki page. How can we detect a mid-air edit collision?
We can hash the current wiki content and send it in the ETag response header:
ETag: "006540df2072ef320c644e61720c754f3"
When saving the wiki page, we send a POST request with the If-Match request header containing the ETag value we received earlier, so the server can check that the page has not changed in the meantime.
If-Match: "006540df2072ef320c644e61720c754f3"
If the hashes don't match, the document has been edited in between, and the server returns a 412 Precondition Failed error.
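The If-Match check described above can be sketched in plain Java. This is only an illustration under stated assumptions: `WikiPage`, `etag()` and `save()` are hypothetical names, the page lives in memory, and the method returns the status code a server would send instead of writing a real HTTP response.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical in-memory wiki page guarded by an If-Match precondition. */
class WikiPage {
    private String content;

    WikiPage(String initialContent) {
        this.content = initialContent;
    }

    /** Current ETag: the quoted MD5 hex digest of the page content. */
    synchronized String etag() {
        return "\"" + md5Hex(content) + "\"";
    }

    /**
     * Applies the edit only if the caller's If-Match value equals the current ETag.
     * Returns the status a server would send: 200 on success, 412 on a collision.
     */
    synchronized int save(String ifMatch, String newContent) {
        if (!etag().equals(ifMatch)) {
            return 412; // Precondition Failed: the page changed after the caller read it
        }
        content = newContent;
        return 200;
    }

    private static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 must be available on every Java platform", e);
        }
    }
}
```

If two editors read the same ETag, the first `save` succeeds and changes the hash, so the second `save` with the stale value gets 412 and has to reload the page before retrying.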
Validation of Cached data
Suppose a mobile app or browser has cached a response from the server. How can it check the freshness of the cache and decide whether to show the cached copy or fetch a fresh response from the server?
ETag, when used in conjunction with the If-None-Match request header, lets the client take advantage of its cache. The server generates the ETag, which can be used to determine whether a page has changed. Essentially, the client asks the server to validate its cache by passing the ETag back to the server.
The process looks like this:
ETag: "006540df2072ef320c644e61720c754f3"
If-None-Match: "006540df2072ef320c644e61720c754f3"
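As a rough sketch of the server-side decision for these two headers, the helper below compares an If-None-Match value with the current ETag and picks the status code. `ConditionalGet`, `weakMatch` and `status` are hypothetical names, and splitting the header on commas is a simplification that assumes the tags themselves contain no commas.

```java
import java.util.Arrays;

/** Hypothetical helper that decides between 200 and 304 for a conditional GET. */
class ConditionalGet {

    /** Weak comparison: ignore a leading "W/" and compare the quoted tags. */
    static boolean weakMatch(String a, String b) {
        return stripWeakPrefix(a).equals(stripWeakPrefix(b));
    }

    private static String stripWeakPrefix(String tag) {
        String trimmed = tag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    /** Returns 304 when any tag in If-None-Match matches the current ETag, else 200. */
    static int status(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null || ifNoneMatch.trim().isEmpty()) {
            return 200; // unconditional request: send the full response
        }
        if (ifNoneMatch.trim().equals("*")) {
            return 304; // "*" matches any existing representation
        }
        boolean matched = Arrays.stream(ifNoneMatch.split(","))
                .anyMatch(candidate -> weakMatch(candidate, currentEtag));
        return matched ? 304 : 200;
    }
}
```

A 304 response carries no body, so the client keeps using its cached copy; a 200 response carries the new body together with a new ETag.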
How we use ETag in our back-end service
Let’s look into how we can actually implement this on the back-end side!!
ETag Generation
We use the MD5 hash of response from our service, which happens at the servlet filter level.
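A minimal sketch of that hashing step, outside any servlet filter, might look like this. `ETagGenerator` and `generate` are hypothetical names; the exact format the service emits (for example any prefix added to the digest) is not shown in this article, so the output here is simply the quoted MD5 hex digest, optionally marked weak.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that turns a response body into an ETag header value. */
class ETagGenerator {

    /** Returns the quoted MD5 hex digest of the body, prefixed with W/ for a weak ETag. */
    static String generate(byte[] responseBody, boolean weak) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(responseBody);
            // %032x keeps leading zeros so the tag is always 32 hex characters long
            String hex = String.format("%032x", new BigInteger(1, digest));
            return (weak ? "W/" : "") + "\"" + hex + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 must be available on every Java platform", e);
        }
    }
}
```

Because the digest depends only on the bytes of the body, two identical responses always produce the same ETag, which is what makes the later comparison with If-None-Match meaningful.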
How can we improve the server time?
Comparing the ETag with If-None-Match and sending 304 Not Modified works, but the server time stayed the same, if not higher, because the service still has to build the full response before it can hash it.
So the trick is to introduce caching and an interceptor in the Spring Boot service.
Caching
We introduce a caching mechanism in the service that stores the request hash as part of the key and the response hash as the value, so that we can map one to the other.
Cache Namespace:- ETag
Cache key:- consumerID::userID::MD5Hash(Request Payload)
Cache value:- MD5Hash(Response Body) — ETag value
At any point in time, there will be a single entry for a particular userID for a particular consumerID.
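The key layout and the single-entry rule can be illustrated with a small in-memory map. This is a sketch only: `ETagCache` and its methods are hypothetical, and a real deployment would use whatever cache backs the service rather than a `HashMap`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the ETag cache namespace. */
class ETagCache {
    private final Map<String, String> entries = new HashMap<>();

    /** Builds the key in the consumerID::userID::requestHash layout. */
    static String key(String consumerId, String userId, String requestHash) {
        return consumerId + "::" + userId + "::" + requestHash;
    }

    /** Stores the ETag, first evicting any older entry for the same consumer and user. */
    synchronized void put(String consumerId, String userId, String requestHash, String etag) {
        String userPrefix = consumerId + "::" + userId + "::";
        entries.keySet().removeIf(existing -> existing.startsWith(userPrefix));
        entries.put(key(consumerId, userId, requestHash), etag);
    }

    /** True when the cached ETag for this exact request equals the client's ETag. */
    synchronized boolean matches(String consumerId, String userId, String requestHash, String etag) {
        return etag != null && etag.equals(entries.get(key(consumerId, userId, requestHash)));
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Evicting by the `consumerID::userID::` prefix before each write is one way to keep at most one entry per user and consumer; the trailing `::` avoids accidentally matching a different user whose ID merely starts with the same characters.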
Servlet Filters and Interceptor
Servlet Filter
A filter is an object used to intercept the HTTP requests and responses of your application. By using a filter, we can perform operations at two points: before the request reaches the controller, and before the response is sent to the client.
A servlet filter that is instantiated outside of Spring has no access to the Spring context, so you cannot use @Autowired inside it to obtain Spring beans.
Register the filter as a bean using the following configuration, or, if the filter is defined as a component, inject it with @Autowired.
@Configuration
public class ETagConfig {

    @Bean
    public CustomETagHeaderFilter customETagHeaderFilter() {
        return new CustomETagHeaderFilter();
    }

    // Alternatively, if CustomETagHeaderFilter is itself a @Component, inject it:
    // @Autowired
    // CustomETagHeaderFilter customETagHeaderFilter;
    // and pass that field to the registration bean below.

    @Bean
    public FilterRegistrationBean<CustomETagHeaderFilter> customETagHeaderFilterRegistrationBean() {
        FilterRegistrationBean<CustomETagHeaderFilter> filterRegistrationBean =
                new FilterRegistrationBean<>(customETagHeaderFilter());
        // Apply the filter only to the endpoints that should support ETags
        filterRegistrationBean.addUrlPatterns("/cs/v1/content/data", "/cs/v1/content/userintent/data");
        filterRegistrationBean.setName("etagFilter");
        return filterRegistrationBean;
    }
}
Interceptor (HandlerInterceptor)
HandlerInterceptor is very similar to a servlet filter, but it only allows custom pre-processing, with the option of prohibiting the execution of the handler itself, and custom post-processing.
This interface contains three main methods: preHandle(), postHandle(), and afterCompletion().
The following piece of code registers the interceptor with the InterceptorRegistry. Note that you can also choose the URL patterns to which the interceptor applies.
@Component
public class WebMvcConfig implements WebMvcConfigurer {

    @Autowired
    ETagInterceptor eTagInterceptor;

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        List<String> pattern = new ArrayList<>();
        pattern.add("/cs/v1/content/userintent/data");
        pattern.add("/cs/v1/content/data");
        registry.addInterceptor(eTagInterceptor).addPathPatterns(pattern);
    }
}
Design
From the design, you can make out the following three cases:
Implementation
Filter Implementation
Here are a few methods from the CustomETagHeaderFilter class, which extends the OncePerRequestFilter abstract class:
@Override
protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
        throws ServletException, IOException {
    String previousToken = request.getHeader(HttpHeaders.IF_NONE_MATCH);
    if (previousToken != null) {
        HttpServletResponse responseToUse = response;
        // Wrap the request so its body can be read more than once
        HttpServletRequest requestWrapper = new BodyHttpServletRequestWrapper(request);
        if (!isAsyncDispatch(request) && !(response instanceof ContentCachingResponseWrapper)) {
            // Buffer the response body so it can be hashed before it is sent
            responseToUse = new ConditionalContentCachingResponseWrapper(response, request);
        }
        filterChain.doFilter(requestWrapper, responseToUse);
        if (!isAsyncStarted(request) && !isContentCachingDisabled(request)) {
            updateResponse(requestWrapper, responseToUse);
        }
    } else {
        // If the header is not present, just pass control to the next filter in the chain
        filterChain.doFilter(request, response);
    }
}
private void updateResponse(HttpServletRequest request, HttpServletResponse response) throws IOException {
    ContentCachingResponseWrapper wrapper =
            WebUtils.getNativeResponse(response, ContentCachingResponseWrapper.class);
    Assert.notNull(wrapper, "ContentCachingResponseWrapper not found");
    HttpServletResponse rawResponse = (HttpServletResponse) wrapper.getResponse();
    if (isEligibleForEtag(request, wrapper, wrapper.getStatus(), wrapper.getContentInputStream())) {
        String previousToken = request.getHeader(HttpHeaders.IF_NONE_MATCH);
        String eTag = wrapper.getHeader(HttpHeaders.ETAG);
        if (!StringUtils.hasText(eTag)) {
            // No ETag set by the handler: derive one from the buffered response body
            eTag = generateETagHeaderValue(wrapper.getContentInputStream(), this.writeWeakETag);
            rawResponse.setHeader(HttpHeaders.ETAG, eTag);
            logger.info("ETAG: " + eTag);
        }
        // The client opts out of caching by sending Cache-Control: no-store in the request
        String cacheControl = request.getHeader(HttpHeaders.CACHE_CONTROL);
        if (cacheControl == null || !cacheControl.contains(DIRECTIVE_NO_STORE)) {
            String finalCacheKey = String.valueOf(request.getAttribute(Constants.RequestAttributes.NAMED_ATTR_REQUEST_HASH));
            cacheHelper.setCache(Constants.Cache.ETAG_CACHE_NAMESPACE, finalCacheKey, eTag.replace("\"", ""));
        }
        if (compareETagHeaderValue(previousToken, eTag)) { // compare previous token with current one
            // use the same date we sent when we created the ETag the first time through
            String ifModifiedSince = request.getHeader(HttpHeaders.IF_MODIFIED_SINCE);
            if (ifModifiedSince != null) {
                rawResponse.setHeader(HttpHeaders.LAST_MODIFIED, ifModifiedSince);
            }
            logger.info("ETag match: returning 304 Not Modified");
            // 304 is not an error and must not carry a body: set the status and skip the body copy
            rawResponse.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
        } else { // first time through - set last modified time to now
            Calendar cal = Calendar.getInstance();
            cal.set(Calendar.MILLISECOND, 0);
            Date lastModified = cal.getTime();
            rawResponse.setDateHeader(HttpHeaders.LAST_MODIFIED, lastModified.getTime());
        }
    }
    wrapper.copyBodyToResponse();
}
Note that the client can tell the filter not to store the ETag in the cache by sending the Cache-Control request header with a value containing no-store.
To read the request body multiple times in the filters and interceptors, we wrote a wrapper called BodyHttpServletRequestWrapper and pass it through the filter chain instead of the original HttpServletRequest, whose body can be read only once.
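public class BodyHttpServletRequestWrapper extends HttpServletRequestWrapper {

    // Request body captured once so it can be replayed on every read
    private final byte[] body;

    public BodyHttpServletRequestWrapper(HttpServletRequest request) {
        super(request);
        this.body = HttpRequestHelper.getBodyString(request).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public BufferedReader getReader() throws IOException {
        // Decode with the same charset that was used to capture the body
        return new BufferedReader(new InputStreamReader(this.getInputStream(), StandardCharsets.UTF_8));
    }

    @Override
    public ServletInputStream getInputStream() throws IOException {
        // Each call gets a fresh stream over the same cached bytes
        final ByteArrayInputStream bais = new ByteArrayInputStream(this.body);
        return new ServletInputStream() {
            @Override
            public boolean isFinished() {
                return bais.available() == 0; // true once every byte has been read
            }

            @Override
            public boolean isReady() {
                return true; // in-memory data can always be read without blocking
            }

            @Override
            public void setReadListener(ReadListener readListener) {
                // Non-blocking reads are not supported by this wrapper
            }

            @Override
            public int read() throws IOException {
                return bais.read();
            }
        };
    }
}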
Here we store the request body in a field, which serves as a cache for the request body that can be read as many times as needed.
Interceptor
@Slf4j
@Component
public class ETagInterceptor extends HandlerInterceptorAdapter {

    @Autowired
    HashRequestKey hashRequestKey;

    @Autowired
    CacheHelper cacheHelper;

    @Autowired
    CacheEvictService cacheEvictService;

    private static final String DIRECTIVE_NO_STORE = "no-store";

    @Override
    public final boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws IOException {
        // The client opts out by sending Cache-Control: no-store; a missing header must not cause an NPE
        String cacheControl = request.getHeader(HttpHeaders.CACHE_CONTROL);
        if (cacheControl == null || !cacheControl.contains(DIRECTIVE_NO_STORE)) {
            String method = request.getMethod();
            if (!"GET".equalsIgnoreCase(method) && !"POST".equalsIgnoreCase(method)) {
                return true;
            }
            String previousETag = request.getHeader(HttpHeaders.IF_NONE_MATCH);
            if (previousETag != null) { // If the If-None-Match header is present
                HttpServletRequest requestWrapper = request;
                if (!(request instanceof BodyHttpServletRequestWrapper)) {
                    requestWrapper = new BodyHttpServletRequestWrapper(request);
                }
                String finalCacheKey = hashRequestKey.getCacheKey(requestWrapper);
                if (finalCacheKey != null) {
                    // check whether the cache already holds this ETag for this request
                    if (cacheHelper.cacheCheck(Constants.Cache.ETAG_CACHE_NAMESPACE, finalCacheKey, previousETag.replace("\"", ""))) {
                        response.setHeader(HttpHeaders.ETAG, previousETag);
                        // re-use the original last modified timestamp, if the client sent one
                        String ifModifiedSince = request.getHeader(HttpHeaders.IF_MODIFIED_SINCE);
                        if (ifModifiedSince != null) {
                            response.setHeader(HttpHeaders.LAST_MODIFIED, ifModifiedSince);
                        }
                        log.info("ETag match: returning 304 Not Modified");
                        response.sendError(HttpServletResponse.SC_NOT_MODIFIED);
                        return false; // no further processing required
                    }
                    log.info("ETag no match found");
                    String[] splitCacheKey = finalCacheKey.split("::", 3);
                    // Store the request hash as a request attribute so the filter can reuse it
                    request.setAttribute(Constants.RequestAttributes.NAMED_ATTR_REQUEST_HASH, finalCacheKey);
                    // Evict the user's old cache entry if the userID is not null
                    if (!splitCacheKey[1].equals("null")) {
                        String pattern = Constants.Cache.ETAG_CACHE_NAMESPACE + "::" + splitCacheKey[0] + "::" + splitCacheKey[1] + "*";
                        cacheEvictService.clearCacheByType(pattern);
                    }
                }
            }
        }
        return true;
    }
}
Note that the client can skip all processing at the interceptor level by sending the Cache-Control request header with a value containing no-store.
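To see why the interceptor saves server time, the combined flow can be reduced to a plain-Java sketch with no Spring involved. Everything here is an assumption made for illustration: `ShortCircuitingEndpoint`, `Result` and `handlerCalls` are hypothetical, `hashCode()` stands in for the MD5 hash, and unlike the filter above the sketch generates an ETag even when no If-None-Match header is sent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the interceptor and filter pair, without Spring. */
class ShortCircuitingEndpoint {

    /** Simplified response: status code plus the ETag and body that would be sent. */
    static final class Result {
        final int status;
        final String etag;
        final String body;

        Result(int status, String etag, String body) {
            this.status = status;
            this.etag = etag;
            this.body = body;
        }
    }

    private final Map<String, String> etagByRequestKey = new HashMap<>();
    private final Supplier<String> handler;
    int handlerCalls = 0; // counts how often the expensive handler actually ran

    ShortCircuitingEndpoint(Supplier<String> handler) {
        this.handler = handler;
    }

    Result handle(String requestKey, String ifNoneMatch) {
        // Interceptor step: a cache hit answers the request before the handler runs
        String cached = etagByRequestKey.get(requestKey);
        if (ifNoneMatch != null && ifNoneMatch.equals(cached)) {
            return new Result(304, cached, null);
        }
        // Filter step: run the handler, derive the ETag, remember it for next time
        handlerCalls++;
        String body = handler.get();
        String etag = "\"" + Integer.toHexString(body.hashCode()) + "\"";
        etagByRequestKey.put(requestKey, etag);
        boolean unchanged = etag.equals(ifNoneMatch);
        return unchanged ? new Result(304, etag, null) : new Result(200, etag, body);
    }
}
```

When the client repeats a request with the ETag it received, the cached value matches and `handlerCalls` does not increase, which is the saving in server time that the cache and interceptor are meant to provide.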
How can a client use the ETag?
For a client to enable the ETag and use it for cache validation, they can simply do so by sending in two request headers:
When they get the response from the server, they have to store the following response headers for future uses:
So in the future when they again call the server, they should send the following
Validating the cache
Make sure the client actually holds a cached response for the user before using the ETag feature.
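On the client side, the same idea can be sketched with the JDK's `java.net.http` API (Java 11 or later). `CachedResponse`, `ETagClient`, `conditionalGet` and `resolveBody` are hypothetical names, and the sketch only covers the If-None-Match header; it builds the request and interprets a status code without performing any network call.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical holder for what the client stored from the previous response. */
class CachedResponse {
    final String etag;
    final String body;

    CachedResponse(String etag, String body) {
        this.etag = etag;
        this.body = body;
    }
}

/** Hypothetical client-side helper for ETag-based cache validation. */
class ETagClient {

    /** Builds a GET request that carries If-None-Match only when a cached copy exists. */
    static HttpRequest conditionalGet(URI uri, CachedResponse cached) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(uri).GET();
        if (cached != null) {
            builder.header("If-None-Match", cached.etag);
        }
        return builder.build();
    }

    /** Picks the body to show: the cached one on 304, the fresh one otherwise. */
    static String resolveBody(int status, String freshBody, CachedResponse cached) {
        if (status == 304) {
            if (cached == null) {
                throw new IllegalStateException("Received 304 without a cached response");
            }
            return cached.body;
        }
        return freshBody;
    }
}
```

The guard in `resolveBody` reflects the point above: a 304 is only useful if the client still has the body that belongs to the ETag it sent.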