Geo-Distributed Alfresco - AWS - Let's Replicate?
Today I was at Web Summit in Lisbon, attending an AWS partner workshop, "How to build multi-region applications in the cloud" by Adrian Hornsby, where AWS announced that early next year Aurora will support write replication across database replicas in multiple AWS regions.
Alfresco has customers that require full global replication (imagine London and Sydney). Their current, highly evolved deployments use read replicas of the repository and database, which ensure good read performance, but when it comes to writing the experience changes dramatically: write requests to the repository ALWAYS have to hit the master Aurora database, and network latency severely degrades performance when those write requests come from a different region.
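To get a feel for why cross-region writes hurt so much, here is a back-of-the-envelope sketch. The round-trip times are illustrative assumptions (a couple of milliseconds within a region, roughly 280 ms between London and Sydney), not measurements:

```python
# Back-of-the-envelope latency impact. Both RTT values are assumptions
# for illustration, not measured figures.
RTT_LOCAL_MS = 2.0    # assumed round trip to an Aurora master in-region
RTT_CROSS_MS = 280.0  # assumed London <-> Sydney round trip

def max_sequential_writes_per_sec(rtt_ms: float) -> float:
    """Upper bound for a client issuing one write at a time:
    each write costs at least one round trip to the master."""
    return 1000.0 / rtt_ms

local = max_sequential_writes_per_sec(RTT_LOCAL_MS)   # ~500 writes/s
remote = max_sequential_writes_per_sec(RTT_CROSS_MS)  # ~3.6 writes/s
print(f"in-region: {local:.0f}/s, cross-region: {remote:.1f}/s")
```

Even with these rough numbers, a single-threaded writer in Sydney talking to a London master is capped at a few writes per second, which is why write-local replication matters.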
As far as I understand, the foundation blocks (in order of importance) for distributing Alfresco Content Services across AWS regions are:
- The database is accessible to all nodes for read and write operations with the same latency.
- The Solr cluster of shards is accessible to all nodes with similar latency.
- The contentStore is shared across the nodes, and all nodes access it with similar latency.
To address these three fundamentals we can:
1) Have Aurora multi-master with read and write replicas, so both reads and writes stay within the same region as the requests that issue them.
2) Have two separate clusters of shards, one in each region, each tracking its local database. All search requests will hit shards within their own region.
3) Our Alfresco S3 ContentStore lives in an S3 bucket in a single region (imagine London), so when it comes to downloading binary data (previews, thumbnails and document downloads) users in Australia will suffer some latency that they can normally live with. Even in this situation, AWS offers asynchronous Cross-Region Replication for the S3 bucket: https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html
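As a sketch of point 3, this is roughly what an S3 Cross-Region Replication configuration looks like when applied with boto3. The bucket names and the IAM role ARN are hypothetical, and CRR additionally requires versioning to be enabled on both buckets:

```python
# Hypothetical S3 Cross-Region Replication configuration: source bucket
# in London (eu-west-2), destination in Sydney (ap-southeast-2).
# Bucket names and the role ARN are placeholders, not real resources.
replication_configuration = {
    "Role": "arn:aws:iam::123456789012:role/s3-crr-role",  # hypothetical
    "Rules": [
        {
            "ID": "replicate-contentstore",
            "Status": "Enabled",
            "Prefix": "",  # empty prefix = replicate the whole contentStore
            "Destination": {
                "Bucket": "arn:aws:s3:::alfresco-contentstore-sydney",
                "StorageClass": "STANDARD",
            },
        }
    ],
}

# Applying it would look roughly like this (requires AWS credentials):
# import boto3
# s3 = boto3.client("s3", region_name="eu-west-2")
# s3.put_bucket_replication(
#     Bucket="alfresco-contentstore-london",
#     ReplicationConfiguration=replication_configuration,
# )
```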
This is just the theory, but I would surely like to test it. My only doubt here is whether Amazon S3 replication is fast enough to avoid any contentStore-database synchronisation issues in Alfresco.
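One way to probe that doubt: S3 reports per-object replication state in the `x-amz-replication-status` header, which boto3's `head_object` surfaces as `ReplicationStatus` (`PENDING`, `COMPLETED`, `FAILED`, or `REPLICA` on the destination side). A minimal check over that response could look like:

```python
def is_replicated(head_response: dict) -> bool:
    """True once S3 reports the object as replicated to the
    destination bucket. head_response is the dict returned by
    boto3's s3.head_object(); objects not covered by a replication
    rule carry no ReplicationStatus at all."""
    return head_response.get("ReplicationStatus") == "COMPLETED"

# Usage sketch (requires AWS credentials and a CRR-enabled bucket):
# import boto3
# s3 = boto3.client("s3", region_name="eu-west-2")
# resp = s3.head_object(Bucket="alfresco-contentstore-london",
#                       Key="contentstore/2018/11/05/doc.bin")
# print(is_replicated(resp))
```

Timing `PENDING` → `COMPLETED` for a sample of freshly written content would give a concrete number for the replication lag the repository has to tolerate.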
So after seeing this, I would like to formulate my letter to Santa.
Dear Santa, I've been a good boy this year, so please bring me enough AWS credits to set up a benchmarking playground hosting one billion documents:
* Aurora as the Database with cross-region multi-master replication
* A cluster of 4 Alfresco nodes in London eu-west-2 (3 nodes behind a load balancer serving user requests and a separate node for bulk ingestion), connected to the London Aurora master database and a London S3 ContentStore.
* A cluster of 4 Solr 6 nodes in London eu-west-2, using sharding policies that allow ultra-fast search response times for users in Europe.
* A cluster of 4 Alfresco nodes in Sydney ap-southeast-2 (3 nodes behind a load balancer serving user requests and a separate node for bulk ingestion), connected to the Sydney Aurora replica database and a Sydney S3 ContentStore (a replica of the London S3 ContentStore).
* A cluster of 4 Solr 6 nodes in Sydney ap-southeast-2, using sharding policies that allow ultra-fast search response times for users in Australia.
* An S3 ContentStore hosted in an S3 bucket that can replicate seamlessly across my two regions (eu-west-2, ap-southeast-2).
If replicating the entire contentStore across regions becomes unacceptable due to cost, the Alfresco S3 connector could be adjusted to work with multiple buckets; this way we could choose which parts of the repository we want to replicate (those that need the fastest access) and avoid having to duplicate the entire repository.
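The S3 connector does not do this today, so the following is purely a sketch of the routing idea: pick a bucket per content URL, sending only a chosen subset to the CRR-enabled bucket. The bucket names and the prefix rule are hypothetical:

```python
# Hypothetical multi-bucket routing for a content store. The bucket
# names and the "shared" prefix convention are assumptions for
# illustration; the current Alfresco S3 connector uses one bucket.
REPLICATED_BUCKET = "alfresco-contentstore-replicated"   # CRR-enabled
LOCAL_ONLY_BUCKET = "alfresco-contentstore-london-only"  # single region

def bucket_for(content_url: str) -> str:
    """Route content that needs fast global access to the replicated
    bucket; everything else stays in the cheaper local-only bucket."""
    if content_url.startswith("store://shared/"):
        return REPLICATED_BUCKET
    return LOCAL_ONLY_BUCKET

print(bucket_for("store://shared/policy.pdf"))   # replicated bucket
print(bucket_for("store://2018/11/05/doc.bin"))  # local-only bucket
```

The interesting design question is what the routing key should be: a path prefix as above, a site or folder, or metadata on the node itself.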
#websummit2018