Azure Event Grid Metadata Output change for Fivetran

Mathious Willie 0 Reputation points
2025-03-12T14:15:06.7933333+00:00

I use Fivetran to lift data from on-prem servers to Azure Data Lake. When a file lands, Azure Event Grid sends the event to a Logic App, which then triggers a Databricks job. When I first started using Fivetran, the event metadata indicated, in the data.api field, whether the file was FlushWithClose (completed) or CreateFile (incomplete). I use these two states to tell the Logic App whether or not to trigger a job. However, I recently noticed that this piece of information is no longer being sent. There are still two events per file, but there is now no way to tell them apart. Without this mechanism, each file that Fivetran drops into ADLS triggers two jobs, which I want to avoid.

I tried using Azure Table Storage to store the state of each event. For example, if I have a file called my_file, Azure sends the CreateFile event for this file and it is stored in the table. When the FlushWithClose event arrives, the table is queried, the existing state is returned, and the Databricks job is triggered. This mechanism does not work as intended.

Azure Data Lake Storage

1 answer

  1. Keshavulu Dasari 4,750 Reputation points Microsoft External Staff Moderator
    2025-03-12T20:20:46.8566667+00:00

    Hi Mathious Willie ,

    Your question concerns the metadata output from Azure Event Grid when using Fivetran to transfer files to Azure Data Lake Storage. The event metadata should include the data.api key, indicating whether the event came from FlushWithClose (completed) or CreateFile (incomplete).

    According to the documentation, when using Azure Data Lake Storage Gen2, the data.api key is set to CreateFile or FlushWithClose for the Microsoft.Storage.BlobCreated event. If you want the Microsoft.Storage.BlobCreated event to trigger your job only when a block blob is completely committed, filter the event for the FlushWithClose REST API call. This filter is what distinguishes the two states.
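
    As a rough illustration of that filter, here is a minimal Python sketch of the check a handler could apply to each incoming event. It assumes the event has already been parsed into a dictionary and that data.api is present; the function name should_trigger_job is hypothetical. The type is read from eventType (Event Grid schema) or type (CloudEvents schema).

```python
from typing import Any, Mapping

# API values that indicate the blob has been fully committed.
COMMIT_APIS = {"FlushWithClose"}


def should_trigger_job(event: Mapping[str, Any]) -> bool:
    """Return True only for BlobCreated events raised by FlushWithClose.

    Hypothetical helper: `event` is one parsed Event Grid event.
    """
    # Event Grid schema uses "eventType"; CloudEvents schema uses "type".
    event_type = event.get("eventType", event.get("type"))
    if event_type != "Microsoft.Storage.BlobCreated":
        return False

    data = event.get("data") or {}
    # A missing data.api (the situation described in the question)
    # yields False, so no job is started for that event.
    return data.get("api") in COMMIT_APIS
```

    The same condition can also be expressed as an advanced filter on the Event Grid subscription (key data.api, a string match on FlushWithClose), which keeps CreateFile events from reaching the Logic App at all. Neither approach helps if data.api is genuinely absent from the events.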

    If the metadata is no longer being sent as you described, it could be due to changes in how Fivetran interacts with Azure Event Grid or how events are being generated. You may want to check if there have been any updates or changes in the Fivetran configuration or in Azure Event Grid that could affect the event metadata.

    Additional information:

    Using Azure Table Storage to track the state of events is a valid approach, but it may require careful management of state transitions to ensure that the correct job is triggered only when the file is fully uploaded.
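
    One way to picture the state management involved is the sketch below. It is only an assumption-laden illustration, not a tested design: it uses an in-memory dictionary as a stand-in for the table, assumes exactly two distinct events arrive per file, and uses the event id to ignore redelivered copies of the same event. The class name TwoEventGate and its method are hypothetical.

```python
from typing import Dict, Set


class TwoEventGate:
    """Signal a trigger on the second distinct event seen for a blob.

    In-memory stand-in for a state table keyed by blob URL.
    Assumption: each file produces exactly two distinct events.
    """

    def __init__(self) -> None:
        self._seen: Dict[str, Set[str]] = {}

    def observe(self, blob_url: str, event_id: str) -> bool:
        ids = self._seen.setdefault(blob_url, set())
        if event_id in ids:
            # Same event delivered again: do not count it twice.
            return False
        ids.add(event_id)
        if len(ids) >= 2:
            # Second distinct event: clear the state and trigger.
            del self._seen[blob_url]
            return True
        return False
```

    Because the state is cleared after the trigger, a late redelivery of either event would start a new count for that file, so a real implementation would likely need an expiry or a completed marker. With a shared table, the read and update would also need to be atomic across concurrent Logic App runs.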

    For more information:

    https://learn.microsoft.com/en-us/azure/event-grid/event-schema-blob-storage?tabs=cloud-event-schema#data-lake-storage-gen-2-events

    If you have any other questions or are still running into more issues, let me know in the "comments" and I would be glad to assist you.
