Functional Cache - high performance on web portals
Wojciech Zielinski
This article was originally published on 07.10.2011 on the LinkedPHPers blog. Due to problems with the Blogspot platform, it has been moved to the LinkedIn articles area.
I would like to share an idea for a new caching architecture implemented in the COBAEX CMS system, prepared by the company I manage - COBA Solutions. It has already proven that it really rocks, and I think sharing an idea of how to speed up page generation might be interesting.
But getting to the point - the very first question is
Why are the standard caching mechanisms not enough?
Generally, standard caching mechanisms work in such a way that the page served by a server is copied either to the client or to caching (proxy) servers, and stays there until it changes. To determine whether the page has changed, the server needs either to rebuild the page or to use some information about when it was changed. If the page has not changed, the previously prepared page or page element is served; otherwise, the new version of the page is served (and overwrites the one in the cache).
You can read much more about this in a nice article by Larry Garfield published on his blog, so I think there is no need to describe the whole mechanism in detail.
Instead, I will focus on why this mechanism is not enough right now. Generally, it addresses only page transfer issues. It does not address the speed of page generation on the server.
Additionally, this mechanism needs the server to prepare a page many times, even though the page might be recreated again and again in exactly the same, unmodified form and content.
The question I asked myself was: why does the server need to recreate the same page again and again? Why can't the server keep copies of this page and serve only an HTML page?
The answer was that this is not the way most CMSes work :) So we decided to change it and prepare a mechanism, which we called
Functional Cache
The word "functional" in this name is quite important. In simple words, we prepared something like a cache, kept on the server, that is rebuilt (pages or page fragments are regenerated) based not on page views but on the actual system functionality. What I mean is that the cache is rebuilt only if a system functionality (e.g. page editing or some dynamic portal functionality) triggers it, and only the pages that actually changed are rebuilt (the cache is rebuilt only partially).
A simple example of such regeneration (based on an actual COBAEX CMS implementation) might be the page https://www.cobasolutions.com/business_software/en_cms - this page will be regenerated when somebody modifies the page content, or when the top or bottom menu is modified. In the case of the top or left menu, the page will be regenerated only if the specific level of this menu has changed (on this page the top and left menus are actually the same menu object shown at different levels, but that is a different topic, and I don't think it's worth going into more detail than this article needs).
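The regeneration rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the COBAEX CMS code: the class name `FunctionalCache`, the `render` callable and the dependency keys such as `"menu:top:2"` are all hypothetical names made up for this example.

```python
from collections import defaultdict


class FunctionalCache:
    """Keeps generated pages and rebuilds them only when a dependency changes."""

    def __init__(self, render):
        self._render = render                # hypothetical callable: page_id -> HTML
        self._pages = {}                     # page_id -> generated HTML
        self._dependents = defaultdict(set)  # dependency key -> page_ids using it
        self.render_count = 0                # how many times a page was generated

    def register(self, page_id, depends_on):
        """Record what a page depends on and generate it for the first time."""
        for key in depends_on:
            self._dependents[key].add(page_id)
        self._rebuild(page_id)

    def _rebuild(self, page_id):
        self._pages[page_id] = self._render(page_id)
        self.render_count += 1

    def get(self, page_id):
        """Serve the stored HTML; a page view never triggers generation."""
        return self._pages[page_id]

    def on_change(self, key):
        """Called by system functionality (e.g. a content edit); rebuilds only affected pages."""
        affected = sorted(self._dependents.get(key, ()))
        for page_id in affected:
            self._rebuild(page_id)
        return affected
```

In this sketch a page such as `en_cms` would be registered with keys for its own content and for the menu levels it shows, so editing a different menu level leaves it untouched.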
Architecture Approach
Implementing such an approach required a complete revision and redesign of the whole architecture. The "standard" architecture of CMS systems works in such a way that we have 3 elements:
In most solutions all these elements reside on a single physical server.
Generally, this means the database is a kind of link between the administration panel and the actual page.
In the architecture that uses the functional cache approach we have very similar elements, but used in a different way:
The way it works in most implementations (the larger ones) is that the first 3 elements run on one physical server (which is accessed only by site admins, not site users) and the last one is a different server (or even standard hosting, as it does not use any sophisticated functionality) accessed by all site users.
The link between the presentation and administration layers is the publishing mechanism: a mechanism implemented on the secured administration server that puts the static HTML files (using FTP, SFTP or SCP) on the presentation server.
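A minimal sketch of such a publishing step might look as follows. It is an illustration under stated assumptions, not the actual COBAEX CMS publisher: the function name `publish` is hypothetical, and a local target directory stands in for the presentation server, where a real deployment would upload over FTP, SFTP or SCP instead.

```python
import os
import tempfile
from pathlib import Path


def publish(pages, target_dir):
    """Write each generated page as a static HTML file under target_dir.

    `pages` maps a relative URL path (e.g. "business_software/en_cms") to HTML.
    Returns the list of files written, in sorted order of URL path.
    """
    target = Path(target_dir)
    written = []
    for url_path, html in sorted(pages.items()):
        destination = target / f"{url_path}.html"
        destination.parent.mkdir(parents=True, exist_ok=True)
        # Write next to the final file first, then swap it in, so a visitor
        # never receives a half-written page.
        temporary = destination.with_name(destination.name + ".tmp")
        temporary.write_text(html, encoding="utf-8")
        os.replace(temporary, destination)
        written.append(destination)
    return written


if __name__ == "__main__":
    # Usage example with a throwaway directory standing in for the web root.
    with tempfile.TemporaryDirectory() as web_root:
        for path in publish({"index": "<html>home</html>"}, web_root):
            print(path)
```

Because the presentation side only ever receives finished files, it needs nothing beyond the ability to serve static HTML.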
The main advantages of this architecture are:
Performance tests
Even though I believe the architecture description shows that these systems simply must be faster than "standard" CMSes, we decided to run several tests comparing pages created in Joomla and in our system.
The test procedure was to install, on the same virtual machine, a standard Joomla installation and our system serving the same pages, which were prepared by the Joomla team. We used JMeter, restarting the VM before each test, and the results were as follows:
These tests show that even when the compared system, designed in the standard architecture, serves the pages in real time, the architecture proposed by COBAEX CMS is still faster - 5.86 times faster even in the worst-case scenario.
Implementations
This system is not something brand new - this architecture has already been proven in several implementations. I think it might be useful to mention them together with a simple description of the advantages for these specific implementations.
Akademickie Biuro Karier Uczelni Heleny Chodkowskiej (High School Academic Career Portal) - the COBAEX CMS system allowed us to create a site that is very secure in terms of user authorization and CV storage. The actual CVs are stored on the administration server, and the presentation server requests a single CV only when it is needed. This means that if somebody wished to get, e.g., all the CVs from the system, it would be hardly possible, as they would have to break into the administration system to get them. Breaking into the presentation system would give them no more than one CV.
As for authorization, it is done using LDAP servers working directly in the school network, and access to those servers is allowed only from the administration server. This means that the actual user authorization process goes through 3 servers, which makes it quite secure :)
Uczelnia Heleny Chodkowskiej w Warszawie (High School Homepage) - the COBAEX CMS system is used to administer a website with 500+ subpages, using subdomains and different news sections and menus, and administered by 20+ users with full user rights management. The 2-server architecture additionally allowed us to put the administration servers in the closed high school network, while the page itself is served using standard hosting services.
Generally, the management system is centralized not only for this site, but also for several other sites - 2 additional schools and the academic career portal mentioned before. This administration is done using a single interface and a single sign-on system (while the pages are placed on several different hosting sites).
twoje-zdanie.pl opinion portal - actually the first implementation of the COBAEX CMS system that really makes good use of the functional cache mechanisms in terms of performance. The portal allows users to add and publish opinions about different companies. This means it stores a large number of companies, each having a details page, opinion lists, opinion details and comment details pages. Thanks to the "compilation" mechanisms the site is really very fast, even using quite limited resources.
Actually, this portal was based on a quite old version of the system - the compilation mechanisms we have right now are incomparable to the ones used on twoje-zdanie.pl. Still, this portal rocks in terms of speed and performance :)
Domoklik.pl real estate offers portal - this is the newest implementation of the COBAEX CMS system. The newest version allowed us to create a portal that is actually the fastest one in Poland, even though it has the largest number of offers in its database.
The implementations mentioned are of course not all of the COBAEX CMS implementations. I thought, however, that there is no need to mention all of them (especially since for some I cannot even say they are ours :) ), and the ones mentioned already show that the architecture is proving itself.
Some final words
OK - the article got quite long, but I hope it is still interesting.
One very important thing about the described solution is that it can work in parallel with existing, commonly used technologies such as caching proxies, request compression and others. The functional cache architecture works only on the server side, so there are no problems with adding other optimization methods.
Also, this article describes only the general idea - the way it is used varies a lot depending on the specific implementation. For example, in the case of Domoklik we compile parts of the page, not whole pages. The engine, however, allows us to add full page compilation, which might make it much faster. At the moment we have decided not to implement that, as even without it we have a portal that loads in 1.4 sec :)
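To make the difference between compiling page parts and compiling whole pages more concrete, here is a small Python sketch. It is not how Domoklik or the COBAEX engine is implemented: the `CompiledSite` class, the `LAYOUT` template and the fragment names `menu` and `offers` are assumptions made up for this example.

```python
from string import Template

# Hypothetical page layout with two placeholders for compiled fragments.
LAYOUT = Template("<html><body>${menu}${offers}</body></html>")


class CompiledSite:
    def __init__(self):
        self.fragments = {}   # fragment name -> compiled HTML
        self.full_pages = {}  # page name -> fully compiled HTML

    def compile_fragment(self, name, html):
        """Store a precompiled page part (done when the underlying data changes)."""
        self.fragments[name] = html

    def serve_partial(self):
        """Partial compilation: join the precompiled fragments on each request."""
        return LAYOUT.substitute(self.fragments)

    def compile_full_page(self, page):
        """Full compilation: do the join once, at publish time."""
        self.full_pages[page] = LAYOUT.substitute(self.fragments)

    def serve_full(self, page):
        """Serve the stored page as-is, with no work per request."""
        return self.full_pages[page]
```

The trade-off visible in the sketch is that a fully compiled page costs nothing per request, but it has to be recompiled whenever any of its fragments changes, while the partial variant always reflects the latest fragments at the price of a cheap join on each request.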