Business Technology for the Win - Cloud Databases
Troy Hiltbrand
Chief Information Officer | International Experience | Data & Analytics Industry Leader | Award-winning Enterprise Architect | IT Strategy
In the digital world, data is of primary interest to businesses. Smart businesses know that it is a prerequisite to success. Today's data has been compared to oil in the 18th century: an immensely valuable, untapped asset that is key to a functioning economy and without which progress would stop.
The challenge is that so much data is generated every day, and all of it has to be managed to be useful. Studies show that we will end this year with 97 zettabytes (ZB) of data worldwide. A separate, widely quoted estimate puts daily data creation at 2.5 quintillion bytes, or 2,500 petabytes (PB). Some of this data has immediate value to the business, but much of it won't have value until some time in the future, if ever.
As a CIO, your challenge is where to store all of this data. Your objective is to make it accessible so that it can be sorted through to identify what is valuable and what is not. As great a challenge as this is, the three major cloud providers (Amazon Web Services (AWS), Google Cloud, and Microsoft Azure) provide multiple data storage options that simplify the process and do it in a scalable manner. These offerings include a combination of relational and NoSQL databases, each with different strengths and weaknesses.
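The unit conversion behind the daily figure is easy to check. The short Python sketch below assumes decimal (SI) units, where a petabyte is 10^15 bytes and a zettabyte is 10^21 bytes; the variable names are illustrative only.

```python
# Decimal (SI) byte units, as normally used for petabyte and zettabyte figures.
PETABYTE = 10**15
ZETTABYTE = 10**21

# 2.5 quintillion bytes per day, the daily figure quoted in the text.
daily_bytes = 25 * 10**17

# Convert the daily figure to petabytes.
daily_petabytes = daily_bytes // PETABYTE

# Annualize the daily figure, assuming a constant rate for 365 days.
yearly_zettabytes = daily_bytes * 365 / ZETTABYTE

print(f"{daily_petabytes:,} PB per day")
print(f"{yearly_zettabytes:.2f} ZB per year at that rate")
```

At a constant 2,500 PB per day, a year adds a little under 1 ZB, so the daily figure and the 97 ZB total are probably best read as two separate estimates rather than one derived from the other.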
Amazon Web Services
Relational Database Service (RDS) - makes it easy to set up, operate, and scale a PostgreSQL, MySQL, MariaDB, Oracle Database, or SQL Server relational database in the cloud. It provides cost-efficient and resizable capacity. It also automates time-consuming administration tasks such as hardware provisioning, database setup and configuration, upgrades and patching, and backups.
Aurora - is a MySQL- and PostgreSQL-compatible relational database engine. It combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. According to AWS, Amazon Aurora delivers up to five times the throughput of standard MySQL and up to three times the throughput of standard PostgreSQL. Amazon Aurora features a distributed, fault-tolerant, self-healing storage system.
Redshift - uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes. It uses AWS-designed cloud-based hardware and machine learning to deliver optimal price performance at any scale.
DynamoDB - is a key-value and document database that delivers single-digit millisecond performance. It is a fully managed, multi-region, multi-master database with built-in security, backup and restore capabilities, and in-memory caching for internet-scale applications. Load testing has shown that DynamoDB can handle more than 10 trillion requests per day and support peaks of more than 20 million requests per second.
ElastiCache - is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.
Keyspaces - is a scalable, highly available, and managed Apache Cassandra-compatible database service. Amazon Keyspaces encrypts your data by default. It also lets you back up your table data continuously using point-in-time recovery.
MemoryDB for Redis - is a Redis-compatible, durable, in-memory database service. It gives you ultra-fast performance and is purpose-built for modern applications with microservices architectures. Using MemoryDB as your primary database for your microservices applications eliminates the need to separately manage both a cache and durable database.
Neptune - is a fast, reliable, fully managed graph database service. This makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latency. Amazon Neptune supports the popular graph models Property Graph and W3C's RDF, and their respective query languages, Apache TinkerPop Gremlin and SPARQL.
Timestream - is a fast, scalable, fully managed time series database service for IoT and operational applications. It allows you to store and analyze trillions of events per day at a fraction of the cost of comparable relational databases. Time-series data has specific characteristics: it typically arrives in time order, it is append-only, and queries against it almost always cover a time interval.
DocumentDB - is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads.
Quantum Ledger Database (QLDB) - is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. It tracks every application data change and maintains a complete and verifiable history of changes over time, without the complex development effort of building your own ledger-like applications. With QLDB, your data's change history is immutable (it cannot be altered or deleted), and, using cryptography, you can easily verify that there have been no unintended modifications to your application's data.
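To make "cryptographically verifiable" concrete, here is a minimal Python sketch of a hash-chained change log. It illustrates the general idea only and is not how QLDB is implemented internally; the function names, the GENESIS placeholder, and the sample account records are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder hash that precedes the first entry


def entry_hash(prev_hash, change):
    """Hash a change together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "change": change}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger, change):
    """Add a change to the end of the ledger, chaining it to the last entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append(
        {"change": change, "prev": prev_hash, "hash": entry_hash(prev_hash, change)}
    )


def verify(ledger):
    """Recompute every hash in order; any edit to history breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        expected = entry_hash(prev_hash, entry["change"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "A-100", "balance": 50})
append(ledger, {"account": "A-100", "balance": 75})
intact = verify(ledger)

# Quietly rewrite history, then verify again.
ledger[0]["change"]["balance"] = 5000
tampered_ok = verify(ledger)

print(intact, tampered_ok)
```

Because each entry's hash depends on the entry before it, changing any past record invalidates the verification from that point on, which is the property a managed ledger service gives you without having to build it yourself.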
Google Cloud Platform
Cloud SQL - is a fully managed, relational database service that is compatible with SQL Server, MySQL, and PostgreSQL. It includes features for automated backups, data replication, and disaster recovery to ensure high availability and resilience.
Cloud Spanner - is another fully managed, relational database service. It differs from Cloud SQL by focusing on enabling you to combine the benefits of relational structure and non-relational scalability. It provides strong consistency across rows and high-performance operations. It includes features for automatic replication, built-in security, and multi-language support.
AlloyDB for PostgreSQL - is a fully managed, PostgreSQL-compatible database service offering superior performance, availability, and scale for your most demanding enterprise workloads.
BigQuery - is a fully managed, serverless data warehouse. You can use it to perform data analyses via SQL and query streaming data. BigQuery includes features for machine learning, business intelligence, and geospatial analysis.
Cloud Bigtable - is a fully managed NoSQL Google Cloud database service. It is designed for large operational and analytics workloads. Cloud Bigtable includes features for high availability, zero-downtime configuration changes, and ultra-low-latency data access.
Firestore - is a fully managed, serverless NoSQL Google Cloud database designed for the development of serverless apps. With it, you can store, sync, and query data for web, mobile, and IoT applications. It includes features for offline support, live synchronization, and built-in security.
Memorystore - is a fully managed, in-memory data store. It is designed to be secure, highly available, and scalable. Memorystore enables you to create application caches with sub-millisecond latency for data access. It is compatible with Memcached and Redis protocols.
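The performance benefit of an in-memory cache such as Memorystore or ElastiCache usually comes from a pattern often called cache-aside: check the cache first and go to the slower database only on a miss. The Python sketch below illustrates that pattern with a plain dictionary standing in for the managed cache; slow_lookup, CacheAside, and the key names are hypothetical and do not correspond to any provider's API.

```python
import time


def slow_lookup(key):
    """Stand-in for a query against a slower, disk-based database (hypothetical)."""
    time.sleep(0.05)
    return f"record-for-{key}"


class CacheAside:
    """Check the in-memory store first; load from the backing source on a miss."""

    def __init__(self, loader, ttl_seconds=60.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry time on the monotonic clock)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        cached = self._store.get(key)
        if cached is not None and cached[1] > now:
            self.hits += 1
            return cached[0]
        # Miss or expired entry: fetch from the source and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._store[key] = (value, now + self._ttl)
        return value


cache = CacheAside(slow_lookup)
first = cache.get("customer:42")   # miss: pays the cost of the slow lookup
second = cache.get("customer:42")  # hit: served from memory

print(first, second, cache.hits, cache.misses)
```

In a real deployment the dictionary would be replaced by calls to the managed cache, and the time-to-live would be tuned to how stale the data is allowed to be.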
Microsoft Azure
Azure SQL Database - is a fully managed relational database service built on the Microsoft SQL Server engine. It automates upgrades, patching, backups, and monitoring, and it offers built-in high availability and automatic scaling.
Azure Cosmos DB - is a fully managed, multi-model, serverless distributed database that allows you to develop high-performance applications of any size or scale. It offers turnkey global distribution, multi-master replication, and single-digit millisecond read/write latency, and it supports open-source PostgreSQL, MongoDB, and Apache Cassandra.
Azure Cache for Redis - is a distributed, in-memory, scalable solution providing super-fast data access. It supports RedisBloom, RediSearch, RedisJSON, and RedisTimeSeries module integration to enable data analysis, search, and streaming.
With all of these options for storing structured and semi-structured data in the cloud, your job as CIO in managing the business's data has become more achievable. In the end, success requires that you understand what your options are, where they fit, and what they can enable you and your teams to do to drive the business forward. Data has huge potential to help your organization succeed in a digital world, and cloud databases are one of the tools that you should understand and be able to leverage.
Gartner publishes reports called Magic Quadrants. These reports look across the vendor space for a specific technology area and provide meaningful scoring of a vendor's ability to execute and the completeness of their vision. The Magic Quadrant for Cloud Database Management Systems was released in December 2021 and covers the complete picture of cloud databases. In addition, Gartner for IT has experts such as Philip Russom, Ph.D., Adam Ronthal, Rick Greenwald, Merv Adrian, Henry Cook, and Yefim Natis who can provide more context and background on these and other technologies in that Magic Quadrant report.