Building Python SDK for Databricks REST API

This article is about a project I've started working on lately. Please welcome Databricks REST API - Python: a vanilla-flavored Python SDK for the Azure Databricks REST API 2.0.

"dbrest" is a python module that I created this module for the data bricks API You can download the code base from the Link: https://github.com/odbckrishna/databricks-rest-api-python/

Installation:

pip install dbrest

dbrest.connect(
    domain = 'adb-dummy-adb.net',
    username = [username (optional)],
    password = [password (optional)],
    bearer = 'xxxxxxxxxxxxxxxxxxxxxxx'
)
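
Under the hood, connect is essentially configuring an authenticated HTTP session. A minimal raw-requests sketch of the same setup (the header and URL pattern follow the standard Azure Databricks REST conventions; the session object itself is my own illustration, not part of dbrest):

import requests

DOMAIN = "adb-dummy-adb.net"         # placeholder workspace URL
TOKEN = "xxxxxxxxxxxxxxxxxxxxxxx"    # personal access token

# Azure Databricks REST calls authenticate with a bearer token header
session = requests.Session()
session.headers.update({"Authorization": f"Bearer {TOKEN}"})
BASE_URL = f"https://{DOMAIN}/api/2.0"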

Example: "Start a cluster using a REST call"

dbrest.cluster_start(cluster_id)

{ "cluster_id" : "fgfkdgnmk-dfd", "status": "RUNNING" }        

Example: "Get a list of user groups."

dbrest.get_groups()        
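
This is a thin wrapper over the Groups API 2.0 list endpoint (the wrapper internals here are my assumption); the equivalent raw call looks roughly like this:

import requests

DOMAIN = "adb-dummy-adb.net"         # placeholder workspace URL
TOKEN = "xxxxxxxxxxxxxxxxxxxxxxx"    # personal access token

# GET /api/2.0/groups/list returns {"group_names": [...]}
resp = requests.get(
    f"https://{DOMAIN}/api/2.0/groups/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print(resp.json())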

Before jumping into the code base or the API, note that Databricks provides API controls for almost all of its services. Below are the tasks we can perform using the Databricks API controls.

You can access all of your Databricks assets in a Databricks workspace. The workspace provides access to data and computational resources such as clusters and jobs, and organizes objects such as notebooks, libraries, and experiments into folders. The workspace can be managed through the workspace UI, the Databricks Command Line Interface (CLI), and the Databricks REST API; this code base helps you work with the REST API.

Data Governance:

  • We can get a list of user groups and access-related information. We can integrate this with the web UI or generate an automated email for user entitlement validation.
  • Automate the creation, revocation, and granting of user access (user entitlement automation with API calls).
  • With an API call, we can find the users who have not logged in or used the workspace in the last 90 days.
  • Validate user concurrency in Databricks. If query load increases, we can scale the cluster up at runtime with an API call.
  • Catch vulnerable commands and inform users not to use them (Query History API).
  • Deployment and cluster monitoring become easy.
  • Token creation can be completed with a REST call (see the sketch after this list).
  • For data governance auditing standards, we can create reports with REST calls.
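
As an example of the token automation item above, here is a minimal sketch against the Token API 2.0 (POST /api/2.0/token/create); the domain and token values are placeholders:

import requests

DOMAIN = "adb-dummy-adb.net"         # placeholder workspace URL
TOKEN = "xxxxxxxxxxxxxxxxxxxxxxx"    # personal access token

# Create a new personal access token with a 90-day lifetime
resp = requests.post(
    f"https://{DOMAIN}/api/2.0/token/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"lifetime_seconds": 90 * 24 * 3600, "comment": "automation token"},
)
print(resp.json())   # contains token_value and token_info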

Data Orchestration:

API calls can trigger Databricks notebooks with parameters, so they can be integrated with orchestration tools like Airflow, Apache NiFi, etc.
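
For example, an Airflow task (or any scheduler) can kick off an existing notebook job through the Jobs API 2.1 run-now endpoint; the job_id and parameter names below are placeholders:

import requests

DOMAIN = "adb-dummy-adb.net"         # placeholder workspace URL
TOKEN = "xxxxxxxxxxxxxxxxxxxxxxx"    # personal access token

# Trigger an existing notebook job and pass widget parameters to it
resp = requests.post(
    f"https://{DOMAIN}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": 12345, "notebook_params": {"run_date": "2023-01-01"}},
)
print(resp.json())   # returns the run_id of the triggered run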

Runtime jobs can be created and managed with API calls.

A cluster start/restart/stop can be a REST call.
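
These cluster lifecycle operations map directly onto the Clusters API 2.0 start, restart, and delete (terminate) endpoints; a quick sketch with a placeholder cluster_id:

import requests

DOMAIN = "adb-dummy-adb.net"         # placeholder workspace URL
TOKEN = "xxxxxxxxxxxxxxxxxxxxxxx"    # personal access token
CLUSTER_ID = "fgfkdgnmk-dfd"         # placeholder cluster id
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
BASE = f"https://{DOMAIN}/api/2.0/clusters"

# Start, restart, and terminate are all simple POSTs carrying the cluster_id
requests.post(f"{BASE}/start", headers=HEADERS, json={"cluster_id": CLUSTER_ID})
requests.post(f"{BASE}/restart", headers=HEADERS, json={"cluster_id": CLUSTER_ID})
requests.post(f"{BASE}/delete", headers=HEADERS, json={"cluster_id": CLUSTER_ID})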

Write a query, create the payload, execute it with a REST call, and get the result in JSON.
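
One way to do this is the SQL Statement Execution endpoint, which runs a statement on a SQL warehouse and returns JSON; the warehouse_id below is a placeholder, and this endpoint sits alongside (not inside) the SQL APIs listed further down:

import requests

DOMAIN = "adb-dummy-adb.net"         # placeholder workspace URL
TOKEN = "xxxxxxxxxxxxxxxxxxxxxxx"    # personal access token

# Submit a SQL statement to a SQL warehouse and read the JSON response
resp = requests.post(
    f"https://{DOMAIN}/api/2.0/sql/statements",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"warehouse_id": "1234567890abcdef", "statement": "SELECT 1 AS x"},
)
print(resp.json())   # statement_id, status, and (when ready) the result rows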

And a lot more. Below are the available REST API controls.

REST API Documentation from Azure Databricks

  • Account API 2.0
  • Clusters API 2.0
  • Cluster Policies API 2.0
  • Data Lineage API 2.0
  • Databricks SQL Queries and Dashboards API 2.0
  • Databricks SQL Query History API 2.0
  • Databricks SQL Warehouses API 2.0
  • DBFS API 2.0
  • Databricks SQL API 2.0
  • Delta Live Tables API 2.0
  • Git Credentials API 2.0
  • Global Init Scripts API 2.0
  • Groups API 2.0
  • Instance Pools API 2.0
  • IP Access List API 2.0
  • Jobs API 2.1
  • Libraries API 2.0
  • MLflow API 2.0
  • Permissions API 2.0
  • Repos API 2.0
  • SCIM API 2.0
  • Secrets API 2.0
  • Token API 2.0
  • Token Management API 2.0
  • Unity Catalog API 2.1
  • Workspace API 2.0

Thank you.
