This package contains a client library for the de-identification service in Azure Health Data Services which enables users to tag, redact, or surrogate health data containing Protected Health Information (PHI). For more on service functionality and important usage considerations, see the de-identification service overview.
Getting started
Prerequisites
- Install the Java Development Kit (JDK) with version 8 or above.
- Have an Azure Subscription.
- Deploy the de-identification service.
- Configure Azure role-based access control (RBAC) for the operations you will perform.
Adding the package to your product
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-health-deidentification</artifactId>
    <version>1.0.0</version>
</dependency>
Authentication
Both the asynchronous and synchronous clients can be created by using DeidentificationClientBuilder. Invoking buildClient creates the synchronous client, while invoking buildAsyncClient creates its asynchronous counterpart.
You will need a service URL to instantiate a client object. You can find the service URL for a particular resource in the Azure portal, or using the Azure CLI:
# Get the service URL for the resource
az deidservice show --name "<resource-name>" --resource-group "<resource-group-name>" --query "properties.serviceUrl"
Optionally, save the service URL as an environment variable named DEID_ENDPOINT for the sample client initialization code.
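If DEID_ENDPOINT is unset or malformed, the failure may only surface later, when the first request is sent. The following is a minimal sketch of checking the value up front, using only the JDK. The class and method names (DeidEndpointCheck, requireHttpsEndpoint) are hypothetical and not part of the SDK, and the requirement that the service URL is an absolute https URL is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative helper (not part of the SDK): validates a service URL before it is given to a client builder. */
final class DeidEndpointCheck {

    /**
     * Returns the value as a URI, or throws IllegalStateException with an
     * actionable message when it is missing or is not an absolute https URL.
     * Assumption: the service URL always uses the https scheme.
     */
    static URI requireHttpsEndpoint(String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("DEID_ENDPOINT is not set");
        }
        final URI uri;
        try {
            uri = new URI(value.trim());
        } catch (URISyntaxException e) {
            throw new IllegalStateException("DEID_ENDPOINT is not a valid URI: " + value, e);
        }
        if (!"https".equalsIgnoreCase(uri.getScheme()) || uri.getHost() == null) {
            throw new IllegalStateException("DEID_ENDPOINT must be an absolute https URL: " + value);
        }
        return uri;
    }

    public static void main(String[] args) {
        try {
            // Reads the same environment variable that the samples use.
            URI endpoint = requireHttpsEndpoint(System.getenv("DEID_ENDPOINT"));
            System.out.println("Using endpoint: " + endpoint);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The validated value could then be passed to the builder's endpoint method as a string, for example with endpoint.toString().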
The Azure Identity package provides the default implementation for authenticating the client.
You can use DefaultAzureCredential to automatically find the best credential to use at runtime.
DeidentificationClient deidentificationClient = new DeidentificationClientBuilder()
    .endpoint(Configuration.getGlobalConfiguration().get("DEID_ENDPOINT"))
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();
Key concepts
De-identification operations:
Given an input text, the de-identification service can perform three main operations:
- Tag returns the category and location within the text of detected PHI entities.
- Redact returns output text where detected PHI entities are replaced with placeholder text. For example, John is replaced with [name].
- Surrogate returns output text where detected PHI entities are replaced with realistic replacement values. For example, My name is John Smith could become My name is Tom Jones.
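To make the difference between the three operations concrete, here is a toy illustration that uses only the JDK. It does not call the service and performs no detection: the single entity, its offsets, and the replacement value are hard-coded, and every name in it (DeidOperationsSketch, Entity, and the tag, redact, and surrogate methods) is hypothetical. The output strings only mimic the examples above and are not the service's actual response format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the three operations; entity detection is hard-coded, not performed. */
final class DeidOperationsSketch {

    /** A detected entity: category plus character offset and length in the input text. */
    static final class Entity {
        final String category;
        final int offset;
        final int length;
        final String replacement; // hypothetical surrogate value

        Entity(String category, int offset, int length, String replacement) {
            this.category = category;
            this.offset = offset;
            this.length = length;
            this.replacement = replacement;
        }
    }

    /** Tag: reports category and location of each entity and leaves the text untouched. */
    static List<String> tag(String text, List<Entity> entities) {
        List<String> out = new ArrayList<>();
        for (Entity e : entities) {
            String covered = text.substring(e.offset, e.offset + e.length);
            out.add(e.category + " at offset " + e.offset + ", length " + e.length + ": " + covered);
        }
        return out;
    }

    /** Redact: replaces each entity with a bracketed category placeholder. */
    static String redact(String text, List<Entity> entities) {
        return replace(text, entities, false);
    }

    /** Surrogate: replaces each entity with a realistic-looking value. */
    static String surrogate(String text, List<Entity> entities) {
        return replace(text, entities, true);
    }

    // Entities are assumed to be sorted by offset and non-overlapping.
    private static String replace(String text, List<Entity> entities, boolean useReplacement) {
        StringBuilder sb = new StringBuilder();
        int cursor = 0;
        for (Entity e : entities) {
            sb.append(text, cursor, e.offset);
            sb.append(useReplacement ? e.replacement : "[" + e.category + "]");
            cursor = e.offset + e.length;
        }
        sb.append(text.substring(cursor));
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "My name is John Smith";
        List<Entity> entities = Arrays.asList(new Entity("name", 11, 10, "Tom Jones"));
        System.out.println(tag(text, entities));       // [name at offset 11, length 10: John Smith]
        System.out.println(redact(text, entities));    // My name is [name]
        System.out.println(surrogate(text, entities)); // My name is Tom Jones
    }
}
```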
Available endpoints
There are two ways to interact with the de-identification service. You can send text directly, or you can create jobs to de-identify documents in Azure Storage.
You can de-identify text directly using the DeidentificationClient:
String inputText = "Hello, my name is John Smith.";
DeidentificationContent content = new DeidentificationContent(inputText);
content.setOperationType(DeidentificationOperationType.SURROGATE);
DeidentificationResult result = deidentificationClient.deidentifyText(content);
System.out.println("De-identified output: " + (result != null ? result.getOutputText() : null));
// De-identified output: Hello, my name is <synthetic name>.
To de-identify documents in Azure Storage, see Tutorial: Configure Azure Storage to de-identify documents for prerequisites and configuration options. In the sample code below, populate the STORAGE_ACCOUNT_NAME and STORAGE_CONTAINER_NAME environment variables with your desired values. To refer to the same job between multiple examples, set the DEID_JOB_NAME environment variable.
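The client exposes a beginDeidentifyDocuments method that returns a SyncPoller (synchronous client) or PollerFlux (asynchronous client) instance. Callers should wait for the operation to complete, for example by calling getFinalResult() on the SyncPoller. The sample below instead calls waitForCompletion() and reads the job from the returned poll response: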
String storageLocation = "https://" + Configuration.getGlobalConfiguration().get("STORAGE_ACCOUNT_NAME") + ".blob.core.windows.net/" + Configuration.getGlobalConfiguration().get("STORAGE_CONTAINER_NAME");
DeidentificationJob job = new DeidentificationJob(
    new SourceStorageLocation(storageLocation, "data/example_patient_1"),
    new TargetStorageLocation(storageLocation, "_output")
        .setOverwrite(true)
);
job.setOperationType(DeidentificationOperationType.REDACT);
String jobName = Configuration.getGlobalConfiguration().get("DEID_JOB_NAME", "MyJob-" + Instant.now().toEpochMilli());
DeidentificationJob result = deidentificationClient.beginDeidentifyDocuments(jobName, job)
    .waitForCompletion()
    .getValue();
System.out.println(jobName + " - " + result.getStatus());
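The job sample above builds the container URL by string concatenation, so a missing STORAGE_ACCOUNT_NAME or STORAGE_CONTAINER_NAME would silently produce a URL containing the text null. The following is a minimal sketch of failing fast instead, using only the JDK. The class and method names (StorageLocationSketch, containerUrl) are hypothetical, and the values in main are placeholders rather than real resources.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative helper (not part of the SDK): builds the blob container URL used by the job sample. */
final class StorageLocationSketch {

    /** Builds https://<account>.blob.core.windows.net/<container> from the given variables. */
    static String containerUrl(Map<String, String> env) {
        String account = require(env, "STORAGE_ACCOUNT_NAME");
        String container = require(env, "STORAGE_CONTAINER_NAME");
        return "https://" + account + ".blob.core.windows.net/" + container;
    }

    // Throws instead of letting a null value end up inside the URL.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Environment variable " + name + " is not set");
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Placeholder values for demonstration; pass System.getenv() to use the real environment.
        Map<String, String> demo = new HashMap<>();
        demo.put("STORAGE_ACCOUNT_NAME", "mystorageaccount");
        demo.put("STORAGE_CONTAINER_NAME", "mycontainer");
        System.out.println(containerUrl(demo));
        // https://mystorageaccount.blob.core.windows.net/mycontainer
    }
}
```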
Examples
The following sections provide several code snippets covering some of the most common client use cases, including:
- Create a client
- De-identify text
- Begin a job to de-identify documents in Azure Storage
- Get the status of a de-identification job
- List all de-identification jobs
- List all documents in a de-identification job
Create a DeidentificationClient
DeidentificationClient deidentificationClient = new DeidentificationClientBuilder()
    .endpoint(Configuration.getGlobalConfiguration().get("DEID_ENDPOINT"))
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();
De-identify text
String inputText = "Hello, my name is John Smith.";
DeidentificationContent content = new DeidentificationContent(inputText);
content.setOperationType(DeidentificationOperationType.SURROGATE);
DeidentificationResult result = deidentificationClient.deidentifyText(content);
System.out.println("De-identified output: " + (result != null ? result.getOutputText() : null));
// De-identified output: Hello, my name is <synthetic name>.
Begin a job to de-identify documents in Azure Storage
String storageLocation = "https://" + Configuration.getGlobalConfiguration().get("STORAGE_ACCOUNT_NAME") + ".blob.core.windows.net/" + Configuration.getGlobalConfiguration().get("STORAGE_CONTAINER_NAME");
DeidentificationJob job = new DeidentificationJob(
    new SourceStorageLocation(storageLocation, "data/example_patient_1"),
    new TargetStorageLocation(storageLocation, "_output")
        .setOverwrite(true)
);
job.setOperationType(DeidentificationOperationType.REDACT);
String jobName = Configuration.getGlobalConfiguration().get("DEID_JOB_NAME", "MyJob-" + Instant.now().toEpochMilli());
DeidentificationJob result = deidentificationClient.beginDeidentifyDocuments(jobName, job)
    .waitForCompletion()
    .getValue();
System.out.println(jobName + " - " + result.getStatus());
Get the status of a de-identification job
String jobName = Configuration.getGlobalConfiguration().get("DEID_JOB_NAME");
DeidentificationJob result = deidentificationClient.getJob(jobName);
System.out.println(jobName + " - " + result.getStatus());
List all de-identification jobs
PagedIterable<DeidentificationJob> result = deidentificationClient.listJobs();
for (DeidentificationJob job : result) {
    System.out.println(job.getJobName() + " - " + job.getStatus());
}
List all documents in a de-identification job
String jobName = Configuration.getGlobalConfiguration().get("DEID_JOB_NAME");
PagedIterable<DeidentificationDocumentDetails> result = deidentificationClient.listJobDocuments(jobName);
for (DeidentificationDocumentDetails documentDetails : result) {
    System.out.println(documentDetails.getInputLocation().getLocation() + " - " + documentDetails.getStatus());
}
Troubleshooting
A DeidentificationClient throws HttpResponseException when a request fails. For example, if you provide an invalid service URL, an HttpResponseException is thrown with an error indicating the failure cause. In the following code snippet, the error is handled gracefully by catching the exception and displaying additional information about the error.
try {
    DeidentificationContent content = new DeidentificationContent("input text");
    deidentificationClient.deidentifyText(content);
} catch (HttpResponseException e) {
    System.out.println(e.getMessage());
    // Do something with the exception
}
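One possible way to "do something with the exception" is to turn the HTTP status code into a hint about what to check. The sketch below uses only the JDK. The class and method names (DeidErrorHints, hintFor) are hypothetical, and the mapping follows general HTTP status semantics rather than any documented behavior of the de-identification service.

```java
/** Illustrative helper (not part of the SDK): maps an HTTP status code to a troubleshooting hint. */
final class DeidErrorHints {

    /** Returns a short hint based on generic HTTP semantics; treat it as a starting point only. */
    static String hintFor(int statusCode) {
        switch (statusCode) {
            case 401:
                return "Authentication failed: check the credential used to build the client.";
            case 403:
                return "Authorization failed: check the RBAC role assignment for this operation.";
            case 404:
                return "Not found: check the service URL and the job name.";
            case 429:
                return "Too many requests: retry after a delay.";
            default:
                return statusCode >= 500
                    ? "Service error: retry later."
                    : "Unexpected status " + statusCode + ": inspect the exception message.";
        }
    }

    public static void main(String[] args) {
        System.out.println(hintFor(403));
        System.out.println(hintFor(503));
    }
}
```

Inside the catch block above, the status code could be read with e.getResponse().getStatusCode(). Guard against a null response first, since an exception raised before any HTTP response arrives may not carry one.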
Next steps
See the samples for several code snippets illustrating common patterns used in the de-identification service Java SDK. For more extensive documentation, see the de-identification service documentation.
Contributing
For details on contributing to this repository, see the contributing guide.