SQL Statement Execution API by Databricks
Recently, Databricks released an API for the execution of SQL statements. as of now, this is available on AWS and Azure clouds (Databricks Premium and Enterprise tiers).
The restful API call can trigger a SQL statement, and we can manage and manipulate the data and databases from external clients.
These are some samples of the various automation processes this API will open the door.
I created a Python SDK and GitHub code process to trigger the Restful calls from the client to Databricks;
Article Link:
https://www.dhirubhai.net/pulse/building-python-sdk-databricks-rest-api-saikrishna-cheruvu/
GitHub Link:
https://github.com/odbckrishna/databricks-rest-api-python/
Below is the sample code and below are the steps for Bash execution.
Trigger the SQL Statement using POST method:
curl --request POST
https://${DATABRICKS_HOST}/api/2.0/sql/statements/ \
--header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
--header "Content-Type: application/json" \
--data '{
? "warehouse_id": "'"$DATABRICKS_SQL_WAREHOUSE_ID"'",
? "catalog": "samples",
? "schema": "tpch",
? "statement": "SELECT l_orderkey, l_extendedprice, l_shipdate FROM lineitem LIMIT 2"
}' \
--output 'sql-execution-response.json' \
&& jq . 'sql-execution-response.json' \
&& export SQL_STATEMENT_ID=$(jq -r .statement_id 'sql-execution-response.json') \
&& export NEXT_CHUNK_INTERNAL_LINK=$(jq -r .result.next_chunk_internal_link 'sql-execution-response.json') \
&& echo SQL_STATEMENT_ID=$SQL_STATEMENT_ID \
&& echo NEXT_CHUNK_INTERNAL_LINK=$NEXT_CHUNK_INTERNAL_LINK\
once the SQL is executed it will return the JSON below is the sample.
领英推荐
? "statement_id": "01ed92c5-3583-1f38-b21b-c6773e7c56b3",
? "status": {
? ? "state": "SUCCEEDED"
? },
? "manifest": {
? ? "format": "ARROW_STREAM",
? ? "schema": {
? ? ? "column_count": 3,
? ? ? "columns": [
? ? ? ? {
? ? ? ? ? "name": "l_orderkey",
? ? ? ? ? "type_name": "LONG",
? ? ? ? ? "position": 0
? ? ? ? },
? ? ? ? {
? ? ? ? ? "name": "l_extendedprice",
? ? ? ? ? "type_name": "DECIMAL",
? ? ? ? ? "position": 1,
? ? ? ? ? "type_precision": 18,
? ? ? ? ? "type_scale": 2
? ? ? ? },
? ? ? ? {
? ? ? ? ? "name": "l_shipdate",
? ? ? ? ? "type_name": "DATE",
? ? ? ? ? "position": 2
? ? ? ? }
? ? ? ]
? ? },
? ? "total_chunk_count": 1,
? ? "chunks": [
? ? ? {
? ? ? ? "chunk_index": 0,
? ? ? ? "row_offset": 0,
? ? ? ? "row_count": 100000
? ? ? }
? ? ],
? ? "total_row_count": 100000,
? ? "total_byte_count": 2848312
? },
? "result": {
? ? "external_links": [
? ? ? {
? ? ? ? "chunk_index": 0,
? ? ? ? "row_offset": 0,
? ? ? ? "row_count": 100000,
? ? ? ? "external_link": "<presigned-url-to-data-stored-externally>",
? ? ? ? "expiration": "<presigned-url-expiration-timestamp>"
? ? ? }
? ? ]
? }
}{
We can review the SQL Process using the below service call.
it is a GET call
curl --request GET
https://${DATABRICKS_HOST}/api/2.0/sql/statements/${SQL_STATEMENT_ID} \
--header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
--output 'sql-execution-response.json' \
&& jq . 'sql-execution-response.json'
Results JSON:
{
? "statement_id": "01ed92c5-3583-1f38-b21b-c6773e7c56b3",
? "status": {
? ? "state": "PENDING"
? }
}
Cancel or STOP the SQL Execution :
curl --request POST
https://${DATABRICKS_HOST}/api/2.0/sql/statements/${SQL_STATEMENT_ID}/cancel \
--header "Authorization: Bearer ${DATABRICKS_TOKEN}"\
Reference :
Thank you
Principle Data Engineer Lead
1 年Very helpful Artical Sai Thank you