Consider Alluxio as a Remote Cache Service?
Xinli Shang
ex-Apache Parquet PMC Chair & PrestoDB committer, Senior Engineering Manager @ Kafka, ex-TLM @ Data Infra
Alluxio is a virtual distributed storage system that connects to numerous storage systems through a common interface and orchestrates reads and writes between compute and storage clusters. It is also used as a cache layer in several use cases, such as the Presto local cache. We evaluated Alluxio as a remote cache service and as a unified layer over the underlying storage, and we see early positive results: with the cache enabled, Alluxio almost doubled the throughput of HDFS.
More work is needed to establish a remote cache service with Alluxio, but the initial results are promising.