Connect to external HTTP services

Important

This feature is in Public Preview.

This article describes how to set up Lakehouse Federation to run federated queries on external service data that is not managed by Azure Databricks. To learn more about Lakehouse Federation, see What is Lakehouse Federation?.

To connect to your external service database using Lakehouse Federation, you must create the following in your Azure Databricks Unity Catalog metastore:

  • A connection to your external service database.
  • A foreign catalog that mirrors your external service database in Unity Catalog so that you can use Unity Catalog query syntax and data governance tools to manage Azure Databricks user access to the database.

Before you begin

Workspace requirements:

  • Workspace enabled for Unity Catalog.

Compute requirements:

  • Network connectivity from your compute resource to the target database systems. See Networking recommendations for Lakehouse Federation.
  • Azure Databricks compute must use Databricks Runtime 15.4 LTS or above and Standard or Dedicated access mode.
  • SQL warehouses must be pro or serverless and must use 2023.40 or above.

Permissions required:

  • To create a connection, you must be a metastore admin or a user with the CREATE CONNECTION privilege on the Unity Catalog metastore attached to the workspace.
  • To create a foreign catalog, you must have the CREATE CATALOG permission on the metastore and be either the owner of the connection or have the CREATE FOREIGN CATALOG privilege on the connection.

Additional permission requirements are specified in each task-based section that follows.

  • Set up authentication to the external service using one of the following methods:

    • Bearer token: Obtain a bearer token for simple token-based authentication.
    • OAuth 2.0 Machine-to-Machine: Create and configure an app to enable machine-to-machine authentication.
    • OAuth 2.0 User-to-Machine Shared: Authenticate with user interaction to share access between service identity and machine.

Authentication methods for external services

Bearer token: A bearer token is a simple token-based authentication mechanism where a token is issued to a client and used to access resources without requiring additional credentials. The token is included in the request header and grants access as long as it is valid.
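
As a minimal sketch of what "included in the request header" means, the following Python snippet builds a request that carries a bearer token without sending it. The URL and token value are hypothetical placeholders. With a Unity Catalog connection you do not normally build this header yourself.

```python
import urllib.request


def build_bearer_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries a bearer token."""
    request = urllib.request.Request(url, method="GET")
    # The token travels in the Authorization header using the "Bearer" scheme.
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Hypothetical endpoint and token, for illustration only.
example = build_bearer_request("https://api.example.com/v1/items", "example-token")
print(example.get_header("Authorization"))
```

Anyone who holds the token can make the same request, which is why the token should be stored as a secret rather than in plain text.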

OAuth Machine-to-Machine (recommended): OAuth Machine-to-Machine (M2M) authentication is used when two systems or applications communicate without direct user involvement. Tokens are issued to a registered machine client, which uses its own credentials to authenticate. This is ideal for server-to-server communication, microservices, and automation tasks where no user context is needed. Databricks recommends using OAuth Machine-to-Machine when it is available.
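
To illustrate the machine-to-machine flow, the sketch below builds the token request that a client typically sends in the OAuth 2.0 client credentials grant. It assumes the authorization server accepts client credentials in the form body (some servers require HTTP Basic authentication instead), and the client ID and secret are hypothetical. The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(
    token_endpoint: str, client_id: str, client_secret: str, scope: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",  # no user is involved
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,  # space-delimited, case-sensitive scope strings
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical client credentials, for illustration only.
token_request = build_token_request(
    "https://authorization-server.com/oauth/token",
    "example-client-id",
    "example-client-secret",
    "channels:read chat:write",
)
print(token_request.data.decode("ascii"))
```

The server's response normally contains an access token, which is then presented as a bearer token on later requests.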

OAuth User-to-Machine Shared: OAuth User-to-Machine Shared authentication allows a single user identity to authenticate and share the same set of credentials across multiple clients or users. All users share the same access token. This approach is suitable for shared devices or environments where a consistent user identity is sufficient, but it reduces individual accountability and tracking. In cases where identity login is required, select User-to-Machine Shared.
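
The user sign-in step of this flow usually starts with a redirect to the authorization endpoint. The following sketch shows how such a URL is commonly assembled for the OAuth 2.0 authorization code grant. The client ID and redirect URI are hypothetical, and Azure Databricks builds the actual URL for you when you create the connection.

```python
import secrets
import urllib.parse


def build_authorization_url(
    authorization_endpoint: str,
    client_id: str,
    scope: str,
    redirect_uri: str,
    state: str,
) -> str:
    """Return the URL that a user's browser is sent to for sign-in and consent."""
    query = urllib.parse.urlencode(
        {
            "response_type": "code",  # request an authorization code
            "client_id": client_id,
            "scope": scope,
            "redirect_uri": redirect_uri,
            "state": state,  # opaque value used to detect forged callbacks
        }
    )
    return f"{authorization_endpoint}?{query}"


# Hypothetical values, for illustration only.
state = secrets.token_urlsafe(16)
auth_url = build_authorization_url(
    "https://authorization-server.com/oauth/authorize",
    "example-client-id",
    "channels:read chat:write",
    "https://example.com/oauth/callback",
    state,
)
print(auth_url)
```

After the user signs in, the authorization code returned to the redirect URI is exchanged at the token endpoint for the access token that all users of the connection then share.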

Create a connection to the external service

First, create a Unity Catalog connection to the external service that specifies a path and credentials to access the service.

Benefits of using a Unity Catalog connection include the following:

  • Secure credential management: Secrets and tokens are securely stored and managed in Unity Catalog, ensuring they are never exposed to users.
  • Granular access control: Unity Catalog allows fine-grained control over who can use or manage connections with the USE_CONNECTION and MANAGE_CONNECTION privileges.
  • Host-specific token enforcement: Tokens are restricted to the host_name specified during connection creation, ensuring they cannot be used with unauthorized hosts.

Permissions required: Metastore admin or user with the CREATE CONNECTION privilege.

Create a connection using one of the following methods:

Catalog Explorer

Use the Catalog Explorer UI to create a connection.

  1. In your Azure Databricks workspace, click Catalog.

  2. At the top of the Catalog pane, click the Add icon and select Add a connection from the menu.

    Alternatively, from the Quick access page, click the External data > button, go to the Connections tab, and click Create connection.

  3. Click Create connection.

  4. Enter a user-friendly Connection name.

  5. Select a Connection type of HTTP.

  6. Select an Auth type from the following options:

    • Bearer token
    • OAuth Machine to Machine
    • OAuth User to Machine Shared

  7. On the Authentication page, enter the following connection properties for the HTTP connection.

    For a bearer token:

    • Host: For example, https://databricks.com
    • Port: For example, 443
    • Bearer Token: For example, bearer-token
    • Base Path: For example, /api/

    For OAuth Machine-to-Machine token:

    • Client ID: Unique identifier for the application you created.
    • Client secret: Secret or password generated for the application that you created.
    • OAuth scope: Scope to grant during user authorization. The scope parameter is expressed as a list of space-delimited, case-sensitive strings. For example, channels:read channels:history chat:write
    • Token endpoint: Used by the client to obtain an access token by presenting its authorization grant or refresh token. Usually in the format https://authorization-server.com/oauth/token

    For OAuth User-to-Machine Shared token:

    • Client ID: Unique identifier for the application you created.
    • Client secret: Secret or password generated for the application that you created.
    • OAuth scope: Scope to grant during user authorization. The scope parameter is expressed as a list of space-delimited, case-sensitive strings. For example, channels:read channels:history chat:write
    • Authorization endpoint: To authenticate with the resource owner via user-agent redirection, usually in the format https://authorization-server.com/oauth/authorize
    • Token endpoint: Used by the client to obtain an access token by presenting its authorization grant or refresh token. Usually in the format https://authorization-server.com/oauth/token

    Note

    For OAuth User-to-Machine Shared, you are prompted to sign in with HTTP using your OAuth credentials.

  8. Click Create connection.

SQL

Use the CREATE CONNECTION SQL command to create a connection.

Note

You cannot use the SQL command to create a connection that uses OAuth User-to-Machine Shared. Instead, see the Catalog Explorer UI instructions.

To create a new connection using a Bearer token, run the following command in a notebook or the Databricks SQL query editor:

CREATE CONNECTION <connection-name> TYPE HTTP
OPTIONS (
  host '<hostname>',
  port '<port>',
  base_path '<base-path>',
  bearer_token '<bearer-token>'
);

Databricks recommends using secrets instead of plaintext strings for sensitive values like credentials. For example:

CREATE CONNECTION <connection-name> TYPE HTTP
OPTIONS (
  host '<hostname>',
  port '<port>',
  base_path '<base-path>',
  bearer_token secret ('<secret-scope>','<secret-key-password>')
);

To create a new connection using OAuth Machine-to-Machine, run the following command in a notebook or the Databricks SQL query editor:

CREATE CONNECTION <connection-name> TYPE HTTP
OPTIONS (
  host '<hostname>',
  port '<port>',
  base_path '<base-path>',
  client_id '<client-id>',
  client_secret '<client-secret>',
  oauth_scope '<oauth-scope-1> <oauth-scope-2>',
  token_endpoint '<token-endpoint>'
);

Send an HTTP request to the external system

Now that you have a connection, learn how to send HTTP requests to the service using the http_request built-in SQL function.

Permissions required: USE CONNECTION on the connection object.

Run the following SQL command in a notebook or the Databricks SQL editor. Replace the placeholder values:

  • connection-name: The connection object that specifies the host, port, base_path, and access credentials.
  • http-method: The HTTP request method used to make the call. For example: GET, POST, PUT, DELETE
  • path: The path to concatenate after the base_path to invoke the service resource.
  • json: The JSON body to send with the request.
  • headers: A map to specify the request headers.

SQL

SELECT http_request(
  conn => '<connection-name>',
  method => '<http-method>',
  path => '<path>',
  -- Example body with a single field; replace it with your own payload.
  json => to_json(named_struct(
    'text', '<text>'
  )),
  headers => map(
    'Accept', 'application/vnd.github+json'
  )
);

Python

Alternatively, send the request from Python using the Databricks SDK:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ExternalFunctionRequestHttpMethod

# Send a POST request through the Unity Catalog connection named "connection-name".
WorkspaceClient().serving_endpoints.http_request(
  conn="connection-name",
  method=ExternalFunctionRequestHttpMethod.POST,
  path="/api/v1/resource",
  json={"key": "value"},
  headers={"extra-header-key": "extra-header-value"},
)
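
To make the relationship between the connection settings and the path argument concrete, the following Python sketch joins a host, port, base path, and request path into one URL. The compose_url helper is hypothetical, and the exact normalization that Azure Databricks applies (for example, how it handles duplicate slashes) is an assumption here.

```python
def compose_url(host: str, port: int, base_path: str, path: str) -> str:
    """Join connection settings and a request path into a single URL."""
    # Strip surrounding slashes so the segments join with exactly one "/".
    segments = [part.strip("/") for part in (base_path, path)]
    joined_path = "/".join(segment for segment in segments if segment)
    return f"{host.rstrip('/')}:{port}/{joined_path}"


# Values taken from the examples in this article.
url = compose_url("https://databricks.com", 443, "/api/", "/v1/resource")
print(url)
```

With these inputs the sketch prints https://databricks.com:443/api/v1/resource. Only the trailing part is supplied per request. The host, port, and base path are fixed when the connection is created.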

Use HTTP connections for agent tools

AI agents can use the HTTP connection to access external applications like Slack, Google Calendar, or any service with an API using HTTP requests. Agents can use externally connected tools to automate tasks, send messages, and retrieve data from third-party platforms.

See Connect AI agent tools to external services.
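
As a rough sketch of the idea, an agent tool can be a small function that fixes the connection, method, and path, and exposes only the payload to the agent. In the example below, make_http_tool and fake_send are hypothetical names, and the connection name and path are placeholders. In a real workspace you would pass WorkspaceClient().serving_endpoints.http_request as the send argument and ExternalFunctionRequestHttpMethod.POST as the method.

```python
from typing import Any, Callable, Dict


def make_http_tool(
    send: Callable[..., Any], conn: str, method: Any, path: str
) -> Callable[[Dict[str, Any]], Any]:
    """Return a tool that sends a JSON payload through a fixed HTTP connection."""

    def tool(payload: Dict[str, Any]) -> Any:
        # The agent supplies only the payload; connection details stay fixed.
        return send(conn=conn, method=method, path=path, json=payload)

    return tool


def fake_send(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for the real request function so the sketch runs offline."""
    return kwargs


# Hypothetical connection name and path, for illustration only.
post_message = make_http_tool(
    fake_send, conn="slack_conn", method="POST", path="/chat.postMessage"
)
result = post_message({"text": "hello"})
print(result)
```

Because the credentials live in the Unity Catalog connection, the tool code never handles tokens directly.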