Geo-Distributed Alfresco - AWS - Let's Replicate?
Today I was at Web Summit in Lisbon, attending an AWS partner workshop, "How to build multi-region applications in the cloud" by Adrian Hornsby, where AWS announced that early next year Aurora will support write replication across database replicas in multiple AWS regions.
Alfresco has customers that require full global replication (imagine London and Sydney). Their current, highly evolved deployments use read replicas of the repository and database, which ensure good read performance, but when it comes to writing the experience changes dramatically: write requests to the repository ALWAYS have to hit the master Aurora database, and network latency severely degrades performance when those write requests come from a different region.
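To get a feel for why cross-region writes hurt so much, here is a back-of-the-envelope sketch. The round-trip times are illustrative assumptions (a couple of milliseconds within a region, roughly 280 ms between London and Sydney), not measurements:

```python
# Back-of-the-envelope latency impact. Both RTT values are assumptions
# for illustration, not measured figures.
RTT_LOCAL_MS = 2.0    # assumed round trip to an Aurora master in-region
RTT_CROSS_MS = 280.0  # assumed London <-> Sydney round trip

def max_sequential_writes_per_sec(rtt_ms: float) -> float:
    """Upper bound for a client issuing one write at a time:
    each write costs at least one round trip to the master."""
    return 1000.0 / rtt_ms

local = max_sequential_writes_per_sec(RTT_LOCAL_MS)   # ~500 writes/s
remote = max_sequential_writes_per_sec(RTT_CROSS_MS)  # ~3.6 writes/s
print(f"in-region: {local:.0f}/s, cross-region: {remote:.1f}/s")
```

Even with these rough numbers, a single-threaded writer in Sydney talking to a London master is capped at a few writes per second, which is why write-local replication matters.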
As far as I understand, the foundation blocks (in order of importance) for distributing Alfresco Content Services across AWS regions are:
- The database is accessible to all nodes for read and write operations with the same latency.
- The Solr cluster of shards is accessible to all nodes with similar latency.
- The contentStore is shared across the nodes, and all nodes access it with similar latency.
To address these three fundamentals we can:
1) Have Aurora multi-master with read and write replicas, so both reads and writes stay within the same region as the requests that issue them.
2) Have two separate clusters of shards, one in each region, each tracking its local database. All search requests will hit shards within their own region.
3) Our Alfresco S3 ContentStore lives in an S3 bucket in a single region (imagine London), so when it comes to downloading binary data (previews, thumbnails and document downloads) users in Australia will suffer some latency that they can normally live with. Even in this situation, AWS offers asynchronous Cross-Region Replication for the S3 bucket: https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html
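As a sketch of point 3, this is roughly what an S3 Cross-Region Replication configuration looks like when applied with boto3. The bucket names and the IAM role ARN are hypothetical, and CRR additionally requires versioning to be enabled on both buckets:

```python
# Hypothetical S3 Cross-Region Replication configuration: source bucket
# in London (eu-west-2), destination in Sydney (ap-southeast-2).
# Bucket names and the role ARN are placeholders, not real resources.
replication_configuration = {
    "Role": "arn:aws:iam::123456789012:role/s3-crr-role",  # hypothetical
    "Rules": [
        {
            "ID": "replicate-contentstore",
            "Status": "Enabled",
            "Prefix": "",  # empty prefix = replicate the whole contentStore
            "Destination": {
                "Bucket": "arn:aws:s3:::alfresco-contentstore-sydney",
                "StorageClass": "STANDARD",
            },
        }
    ],
}

# Applying it would look roughly like this (requires AWS credentials):
# import boto3
# s3 = boto3.client("s3", region_name="eu-west-2")
# s3.put_bucket_replication(
#     Bucket="alfresco-contentstore-london",
#     ReplicationConfiguration=replication_configuration,
# )
```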
This is just the theory, but I would surely like to test it. My only doubt here is whether Amazon S3 replication is fast enough to avoid any contentStore-database synchronisation issues in Alfresco.
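One way to probe that doubt: S3 reports per-object replication state in the `x-amz-replication-status` header, which boto3's `head_object` surfaces as `ReplicationStatus` (`PENDING`, `COMPLETED`, `FAILED`, or `REPLICA` on the destination side). A minimal check over that response could look like:

```python
def is_replicated(head_response: dict) -> bool:
    """True once S3 reports the object as replicated to the
    destination bucket. head_response is the dict returned by
    boto3's s3.head_object(); objects not covered by a replication
    rule carry no ReplicationStatus at all."""
    return head_response.get("ReplicationStatus") == "COMPLETED"

# Usage sketch (requires AWS credentials and a CRR-enabled bucket):
# import boto3
# s3 = boto3.client("s3", region_name="eu-west-2")
# resp = s3.head_object(Bucket="alfresco-contentstore-london",
#                       Key="contentstore/2018/11/05/doc.bin")
# print(is_replicated(resp))
```

Timing `PENDING` → `COMPLETED` for a sample of freshly written content would give a concrete number for the replication lag the repository has to tolerate.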
So after seeing this, I would like to formulate my letter to Santa.
Dear Santa, I've been a good boy this year, so please bring me enough AWS credits to set up a benchmarking playground hosting one billion documents:
* Aurora as the Database with cross-region multi-master replication
* A cluster of 4 Alfresco nodes in London eu-west-2 (3 nodes behind a load balancer serving user requests and a separate node for bulk ingestion), connected to the London Aurora master database and a London S3 ContentStore.
* A cluster of 4 Solr 6 nodes in London eu-west-2, using sharding policies that allow ultra-fast search response times for users in Europe.
* A cluster of 4 Alfresco nodes in Sydney ap-southeast-2 (3 nodes behind a load balancer serving user requests and a separate node for bulk ingestion), connected to the Sydney Aurora replica database and a Sydney S3 ContentStore (a replica of the London S3 ContentStore).
* A cluster of 4 Solr 6 nodes in Sydney ap-southeast-2, using sharding policies that allow ultra-fast search response times for users in Australia.
* An S3 ContentStore hosted in an S3 bucket that can replicate seamlessly across my two regions (eu-west-2, ap-southeast-2).
If replicating the entire contentStore across regions becomes unacceptable due to cost, the Alfresco S3 connector could be adjusted to work with multiple buckets; this way we could choose which parts of the repository we want to replicate (those that need the fastest access) and avoid having to duplicate the entire repository.
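The S3 connector does not do this today, so the following is purely a sketch of the routing idea: pick a bucket per content URL, sending only a chosen subset to the CRR-enabled bucket. The bucket names and the prefix rule are hypothetical:

```python
# Hypothetical multi-bucket routing for a content store. The bucket
# names and the "shared" prefix convention are assumptions for
# illustration; the current Alfresco S3 connector uses one bucket.
REPLICATED_BUCKET = "alfresco-contentstore-replicated"   # CRR-enabled
LOCAL_ONLY_BUCKET = "alfresco-contentstore-london-only"  # single region

def bucket_for(content_url: str) -> str:
    """Route content that needs fast global access to the replicated
    bucket; everything else stays in the cheaper local-only bucket."""
    if content_url.startswith("store://shared/"):
        return REPLICATED_BUCKET
    return LOCAL_ONLY_BUCKET

print(bucket_for("store://shared/policy.pdf"))   # replicated bucket
print(bucket_for("store://2018/11/05/doc.bin"))  # local-only bucket
```

The interesting design question is what the routing key should be: a path prefix as above, a site or folder, or metadata on the node itself.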
#websummit2018