What is Azure Data Explorer?

Rory McManus

Owner @Data Mastery | Need help with Data?

Published: April 17, 2022

Azure Data Explorer (ADX) is a fully managed data analytics service for real-time analysis on large volumes of data streaming from applications, websites and IoT devices.

ADX is primarily used to ingest structured, semi-structured and unstructured data for big data analytics. It ingests at up to 200 MB/sec per node (up to 1000 nodes) and returns query results across billions of records in less than a second.
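To put those figures in perspective, here is a back-of-envelope sketch of what they imply. It assumes throughput scales perfectly linearly with node count, which is an idealisation of mine and not a claim about real clusters; the function names are illustrative.

```python
# Figures quoted in the text above.
MB_PER_SEC_PER_NODE = 200   # peak ingestion rate per node
MAX_NODES = 1000            # maximum node count

def peak_throughput_mb_per_sec(nodes: int) -> int:
    """Upper bound on ingestion rate, ASSUMING perfectly linear scaling."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    return nodes * MB_PER_SEC_PER_NODE

def seconds_to_ingest(total_gb: float, nodes: int) -> float:
    """Best-case time to ingest total_gb gigabytes (1 GB taken as 1000 MB)."""
    return total_gb * 1000 / peak_throughput_mb_per_sec(nodes)

print(peak_throughput_mb_per_sec(MAX_NODES))  # 200000 MB/sec, i.e. 200 GB/sec
print(seconds_to_ingest(1000, 10))            # 1 TB on 10 nodes: 500.0 seconds
```

Real workloads will land below these bounds, since format, compression and transformation cost all affect the achieved rate.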

As more businesses open their networks to a wide variety of IoT devices and applications, it becomes increasingly vital for them to react to events in a timely and cost-effective manner.

I recently used ADX with a government client to migrate an existing Kafka workload that ingests and transforms Fortinet, Palo Alto, and Blue Coat web security logs. During Covid-19, their workload increased 10-fold, with an associated 5-fold increase in costs. Migrating this workload resulted in a 60% cost reduction, a simpler solution and improved data reliability.

How can data be ingested into Azure Data Explorer?

Azure Data Explorer supports server-side stored functions, continuous ingestion, and continuous export to Azure Data Lake Storage. It also supports server-side ingestion-time mapping transformations, update policies, and precomputed scheduled aggregates with materialized views.
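As a rough illustration of two of these features, the sketch below builds the text of an update policy command and a materialized view command. The table, function and view names (RawLogs, ParsedLogs, ParseRawLogs, HourlyRequests) are hypothetical, and the command shapes should be checked against the current Kusto documentation before use. The script only produces strings; it does not connect to a cluster.

```python
import json

def update_policy_command(target_table: str, source_table: str, query: str) -> str:
    """Build an '.alter table ... policy update' command.

    The policy runs `query` whenever data lands in `source_table`
    and appends the result to `target_table`.
    NOTE: `query` must not contain single quotes, because the policy is
    wrapped in a single-quoted @'...' literal here.
    """
    policy = [{
        "IsEnabled": True,
        "Source": source_table,
        "Query": query,
        "IsTransactional": False,
    }]
    return f".alter table {target_table} policy update @'{json.dumps(policy)}'"

def materialized_view_command(view_name: str, source_table: str, query: str) -> str:
    """Build a '.create materialized-view' command over one summarize query."""
    return (
        f".create materialized-view {view_name} on table {source_table}\n"
        "{\n"
        f"    {query}\n"
        "}"
    )

# Hypothetical names, for illustration only.
policy_cmd = update_policy_command("ParsedLogs", "RawLogs", "ParseRawLogs()")
view_cmd = materialized_view_command(
    "HourlyRequests",
    "ParsedLogs",
    "ParsedLogs | summarize Requests = count() by SourceIp, bin(Timestamp, 1h)",
)
print(policy_cmd)
print(view_cmd)
```

Here ParseRawLogs() stands in for a server-side stored function, so the transformation logic lives in the database and is applied as data arrives.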

Automated Pipelines - Ingestion Methods

Event Grid Blob Created - When a 'blob' is created on the Azure storage account it results in the firing of an event that triggers the Data Explorer ingestion pipeline.
Event Hub
IoT Hub
Azure Data Factory
LightIngest - Command-line tool for historical loads that helps minimise cost.
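To make the Event Grid option more concrete, the sketch below picks out the blobs that would be candidates for ingestion from a batch of storage events. ADX handles this itself through its data connection, so this is only an illustration of the event shape. The sample event is simplified, and the storage account, container and blob names are made up.

```python
from typing import Any

BLOB_CREATED = "Microsoft.Storage.BlobCreated"

def blobs_to_ingest(events: list[dict[str, Any]]) -> list[str]:
    """Return the URLs of newly created, non-empty blobs in an event batch."""
    urls = []
    for event in events:
        if event.get("eventType") != BLOB_CREATED:
            continue  # ignore deletes and other event types
        data = event.get("data", {})
        if data.get("contentLength", 0) > 0 and "url" in data:
            urls.append(data["url"])
    return urls

# Simplified sample events; all names are hypothetical.
sample_events = [
    {
        "eventType": "Microsoft.Storage.BlobCreated",
        "subject": "/blobServices/default/containers/logs/blobs/fortinet-001.json.gz",
        "data": {
            "api": "PutBlob",
            "contentLength": 524288,
            "url": "https://examplestorage.blob.core.windows.net/logs/fortinet-001.json.gz",
        },
    },
    {
        "eventType": "Microsoft.Storage.BlobDeleted",
        "subject": "/blobServices/default/containers/logs/blobs/old.json.gz",
        "data": {"url": "https://examplestorage.blob.core.windows.net/logs/old.json.gz"},
    },
]

print(blobs_to_ingest(sample_events))
```

Only the first event qualifies, because the second one is a delete notification.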

Supported Formats

Uncompressed Formats - ApacheAvro, Avro, CSV, JSON, MultiJSON, ORC, Parquet, PSV, RAW, SCsv, SOHsv, TSV, TSVE, TXT, W3CLOGFILE

When the source data carries its own schema (e.g. Avro, Parquet, W3CLOGFILE), it can be inserted directly into the final destination table with the expected data types and column names.

Compressed Formats - GZip, Zip
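As a small illustration of preparing a compressed file for ingestion, the sketch below writes records as line-separated JSON inside a GZip file and reads them back to check the round trip. The record fields and the file name are hypothetical, and the sketch assumes the file would later be uploaded to storage by some other means.

```python
import gzip
import json
import pathlib
import tempfile

def write_gzip_json_lines(records: list[dict], path: pathlib.Path) -> pathlib.Path:
    """Write one JSON document per line into a GZip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return path

def read_gzip_json_lines(path: pathlib.Path) -> list[dict]:
    """Read the file back, mainly to verify it before upload."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Hypothetical log records, for illustration only.
records = [
    {"Timestamp": "2022-04-17T00:00:00Z", "SourceIp": "10.0.0.1", "Action": "blocked"},
    {"Timestamp": "2022-04-17T00:00:01Z", "SourceIp": "10.0.0.2", "Action": "allowed"},
]

with tempfile.TemporaryDirectory() as tmp:
    target = write_gzip_json_lines(records, pathlib.Path(tmp) / "fortinet-001.json.gz")
    restored = read_gzip_json_lines(target)

print(restored == records)
```

Keeping the original format extension before the .gz suffix makes it easy to tell what the file contains once it is decompressed.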

Transformations

Data is transformed in ADX using its native query language, KQL (Kusto Query Language). This is a simple yet powerful language for querying structured, semi-structured and unstructured data. It assumes a relational data model of tables and columns, with a minimal set of data types. The language is very expressive, and queries are easy to read, so their intent is easy to understand.
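To give a feel for how a KQL query reads, the sketch below assembles one from its pipeline stages and prints it. The table and column names (FortinetLogs, Timestamp, Action, SourceIp) are hypothetical, and the helper kql_pipeline is my own convenience function, not part of any Kusto library.

```python
def kql_pipeline(table: str, *stages: str) -> str:
    """Join a source table and its operators with KQL's pipe syntax."""
    return "\n| ".join([table, *stages])

# Hypothetical table and columns, for illustration only.
query = kql_pipeline(
    "FortinetLogs",
    "where Timestamp > ago(1h)",                 # keep the last hour of data
    "where Action == 'blocked'",                 # keep blocked requests only
    "summarize Requests = count() by SourceIp",  # count requests per source IP
    "top 10 by Requests desc",                   # the ten busiest sources
)
print(query)
```

Each stage takes the output of the previous one, which is why the query can be read top to bottom as a sequence of steps.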

Visualisations

Use different visual displays of your data in the native Azure Data Explorer Dashboards. You can also display your results using connectors to some of the leading visualisation services, such as Power BI and Grafana. Azure Data Explorer also has ODBC and JDBC connector support for tools such as Tableau and Sisense.

Use Cases

For a use case on Fortinet web security log files with ADX, see my article "Azure Data Explorer: Real-Time Analytics - Fortinet Logs".

Final Thoughts

I hope you have found this helpful and that it helps your company understand the basics of Azure Data Explorer.

Please share your thoughts, questions, corrections and suggestions by dropping me a message on LinkedIn.
