Azure Data Factory Scheduling
Scheduling jobs in Azure Data Factory.
Set up the resources for this tutorial
Create resources
- Create a Blob Storage
- Create two containers: `input` and `output`
- Upload a CSV file to `input`
- Create a Data Factory
- Create a Linked Service to Blob
- Create a Dataset for Blob input, with the file path pointing at the uploaded CSV file
- Create a Dataset for Blob output, with the file path pointing at the `output` container
- Create a pipeline with a Copy Activity whose source/sink are set to the input/output datasets
- Click `Debug` to test the initial setup
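
If you prefer to script the storage side of this setup, here is a minimal sketch using the `azure-storage-blob` package. The connection string, the local `file.csv`, and the container names are assumptions that mirror the steps above.

```python
# pip install azure-storage-blob
from azure.storage.blob import BlobServiceClient

# Assumed: connection string of the storage account created above.
conn_str = "<storage-account-connection-string>"
service = BlobServiceClient.from_connection_string(conn_str)

# Create the two containers used in this tutorial.
for name in ("input", "output"):
    service.create_container(name)  # raises if the container already exists

# Upload the sample CSV into the input container.
with open("file.csv", "rb") as data:
    service.get_blob_client(container="input", blob="file.csv").upload_blob(data)
```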
Set parameters for the datasets
- In the Input dataset
  - Create a parameter `fileName` (String) and use it to replace the file name in the `Connection` file path
- In the Output dataset
  - Create a parameter `fileName` (String) and use it to replace the file name in the `Connection` file path
- In the Copy Activity
  - Set the source `fileName` value to the `file.csv` uploaded to the input blob
  - Set the sink `fileName` value to the same `file.csv`, which will be copied to the output blob
- In the Pipeline
  - Save all and Publish
  - Click `Debug` to test this setup
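
For reference, the same dataset parameterization can be expressed with the `azure-mgmt-datafactory` SDK. This is only a sketch: the subscription id, resource group, factory name, linked service name `BlobLinkedService`, and dataset name `InputDataset` are assumptions, and it uses the simpler `AzureBlobDataset` type rather than the delimited-text dataset the UI creates.

```python
# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobDataset, DatasetResource, LinkedServiceReference, ParameterSpecification,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<data-factory-name>"

blob_ls = LinkedServiceReference(type="LinkedServiceReference",
                                 reference_name="BlobLinkedService")

# Input dataset: the file name is an expression bound to the dataset parameter.
input_ds = AzureBlobDataset(
    linked_service_name=blob_ls,
    parameters={"fileName": ParameterSpecification(type="String")},
    folder_path="input",
    file_name={"value": "@dataset().fileName", "type": "Expression"},
)
adf.datasets.create_or_update(rg, factory, "InputDataset",
                              DatasetResource(properties=input_ds))

# The output dataset is defined the same way, pointing at the output container.
```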
Add a Schedule Trigger
Before creating the trigger, make sure the pipeline has been published.
On the pipeline UI, click `Add trigger`, then `New/Edit`.
- Click on the `Choose trigger` field and then `New`
- Enter a name like `WeeklySchedule`
- Enter a description like `Run pipeline daily at 5AM UTC`
- Select `Schedule` as the Type
  - Other options are `Tumbling window`, `Storage events`, and `Custom events`
- Enter a `Start date`
- In `Recurrence`, select an interval
  - For example, Every `1 Day`
- In `Execute at these times`
  - A warning says: `Hours must be in the range 0-23 and minutes in the range 0-59. The time specified follows the timezone setting above.`
  - Hours: enter `5`
  - Minutes: enter `0`
  - This populates the field `Schedule execution times 5:00`
- Optionally, select an end date
- Keep the default `Start trigger on creation`
- Click `OK`
- Click on `Save all` and `Publish` for the trigger to take effect
- The trigger can be found in the `Manage` hub, under `Triggers`
Be aware of the `Recurrence` setting, as it affects the billing cost of running the pipeline. See Azure Data Factory Pricing.
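
The trigger built in the steps above can also be created programmatically. Below is a hedged sketch with the `azure-mgmt-datafactory` SDK; the subscription id, resource group, factory, pipeline name, and start date are placeholders, and the `fileName` parameter value only applies if your pipeline defines that parameter.

```python
# pip install azure-identity azure-mgmt-datafactory
from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference, RecurrenceSchedule, ScheduleTrigger,
    ScheduleTriggerRecurrence, TriggerPipelineReference, TriggerResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<data-factory-name>"

# Daily recurrence that fires at 05:00 UTC, matching the UI settings above.
recurrence = ScheduleTriggerRecurrence(
    frequency="Day",
    interval=1,
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),  # assumed start date
    time_zone="UTC",
    schedule=RecurrenceSchedule(hours=[5], minutes=[0]),
)

trigger = ScheduleTrigger(
    description="Run pipeline daily at 5AM UTC",
    recurrence=recurrence,
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(type="PipelineReference",
                                             reference_name="<pipeline-name>"),
        parameters={"fileName": "file.csv"},
    )],
)

adf.triggers.create_or_update(rg, factory, "WeeklySchedule",
                              TriggerResource(properties=trigger))
# Triggers are created stopped; starting them is the API equivalent of
# "Start trigger on creation" in the UI.
adf.triggers.begin_start(rg, factory, "WeeklySchedule")
```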
Add a Storage Event Trigger
You can execute a trigger as soon as files are loaded into Blob Storage.
Register the Event Grid resource provider in your Azure subscription
- In Azure, search for `subscription`
- Select your subscription
- On `Settings / Resource providers`
- Search for `EventGrid`
- Click on `Register`
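
The same registration can be scripted; a small sketch with the `azure-mgmt-resource` package, assuming only a subscription id:

```python
# pip install azure-identity azure-mgmt-resource
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Register the Event Grid resource provider required by storage event triggers.
client.providers.register("Microsoft.EventGrid")

# Registration is asynchronous; check until the state becomes "Registered".
print(client.providers.get("Microsoft.EventGrid").registration_state)
```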
Set a Pipeline parameter
- Click on the pipeline canvas
- Under `Parameters`, create one called `fileName` (String)
- In the Copy Activity
  - In `Source`, for the `fileName` parameter, select dynamic content, then select the pipeline parameter
  - In `Sink`, do the same, which should produce the expression `@pipeline().parameters.fileName`
- Click `Save all`, then `Debug` to test this setup. Then `Publish`
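
Put together, the pipeline parameter and the dynamic-content mapping look roughly like this through the `azure-mgmt-datafactory` SDK. Again a sketch under assumptions: the dataset names `InputDataset`/`OutputDataset` come from the earlier sketch, the pipeline name is a placeholder, and the generic `BlobSource`/`BlobSink` types stand in for the delimited-text source/sink the UI would generate.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobSink, BlobSource, CopyActivity, DatasetReference,
    ParameterSpecification, PipelineResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<data-factory-name>"

# Expression that forwards the pipeline parameter into the dataset parameter.
file_name_expr = {"value": "@pipeline().parameters.fileName", "type": "Expression"}

copy = CopyActivity(
    name="CopyInputToOutput",
    inputs=[DatasetReference(type="DatasetReference",
                             reference_name="InputDataset",
                             parameters={"fileName": file_name_expr})],
    outputs=[DatasetReference(type="DatasetReference",
                              reference_name="OutputDataset",
                              parameters={"fileName": file_name_expr})],
    source=BlobSource(),
    sink=BlobSink(),
)

pipeline = PipelineResource(
    parameters={"fileName": ParameterSpecification(type="String")},
    activities=[copy],
)
adf.pipelines.create_or_update(rg, factory, "<pipeline-name>", pipeline)
```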
Create the event trigger
On the ADF pipeline UI, click `Add trigger`, then `New/Edit`.
- Click on the `Choose trigger` field and then `New`
- Enter a name like `NewBlobEventTrigger`
- Enter a description like `Run pipeline as soon as new files are loaded to Storage Blob`
- Select `Storage events` as the Type
- Select the `Azure subscription`
- Select the `Storage account name`
- Enter the `Container name` with leading and trailing forward slashes, such as `/your-container/`
- Enter a `Blob path begins with` (optional)
  - The hint reads: `'Blob path' must begin with the path. For example: '2018/april/shoes.csv' or '2018/'`
  - In my case the container had two folders, `input` and `output`
  - I entered the path as `input/`
- Enter a `Blob path ends with` (optional)
  - The hint reads: `'Blob path' must end in a file name or extension. For example, 'shoes.csv' or '.csv'. Container and folder name are optional but when specified they must be separated by a '/blobs/' segment. For example '/orders/blobs/2018/april/shoes.csv'. To specify a folder in any container omit the leading '/' character. For example, 'april/shoes.csv'.`
  - I entered `.csv`
- Select the `Event`
  - The hint reads: `Choose which events are associated with this trigger`
  - Select `Blob created`
- Keep the default `Ignore empty blobs` as `Yes`
- Keep the default `Start trigger on creation`
- Click `OK`
- A new window shows `Data preview`
  - `Event trigger filters` show the setup
    - Container name: `scheduletrigger` (this is how I named my container)
    - Starts with: `input/`
    - Ends with: `.csv`
  - Below this it said `3 blobs matched in "scheduletrigger"`
    - input/alabama-dbe.csv
    - input/alaska-dbe.csv
    - input/oklahome-dbe.csv
- Click `Continue`
- A new window shows `New trigger`
  - For the pipeline parameter `fileName`, enter the value `@triggerBody().fileName`
  - The Event Grid API returns a `@triggerBody()`
    - As seen in the MS docs here.
    - The storage event trigger captures the folder path and file name of the blob into the properties `@triggerBody().folderPath` and `@triggerBody().fileName`
    - In Synapse, use `@trigger().outputs.body.fileName`
- Click `Save all`, then `Publish`
- The trigger can be found in the `Manage` hub, under `Triggers`
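
The same event trigger, including the `@triggerBody().fileName` mapping from the steps above, can be sketched with the `azure-mgmt-datafactory` SDK. The subscription, resource group, factory, storage account, and pipeline names are placeholders; note that the begins-with path is serialized with a `/blobs/` segment, matching the hint shown in the UI.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobEventsTrigger, PipelineReference, TriggerPipelineReference, TriggerResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, factory = "<resource-group>", "<data-factory-name>"

# Resource id of the storage account the trigger listens to (assumed).
storage_account_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account>"
)

trigger = BlobEventsTrigger(
    description="Run pipeline as soon as new files are loaded to Storage Blob",
    scope=storage_account_id,
    events=["Microsoft.Storage.BlobCreated"],
    blob_path_begins_with="/scheduletrigger/blobs/input/",
    blob_path_ends_with=".csv",
    ignore_empty_blobs=True,
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(type="PipelineReference",
                                             reference_name="<pipeline-name>"),
        # Map the blob file name captured by the trigger to the pipeline parameter.
        parameters={"fileName": "@triggerBody().fileName"},
    )],
)

adf.triggers.create_or_update(rg, factory, "NewBlobEventTrigger",
                              TriggerResource(properties=trigger))
adf.triggers.begin_start(rg, factory, "NewBlobEventTrigger")
```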
Event trigger activates on new blobs
The trigger won't activate for blobs that were uploaded before it was created. It fires once a new blob ending in `.csv` is loaded to the container/directory `scheduletrigger/input`.

For example, if a new file `file.csv` is added to `scheduletrigger/input`:

- `@triggerBody().fileName` has a value of `file.csv`
- This value is mapped to the pipeline parameter `fileName` via `@pipeline().parameters.fileName`
Upload a file to the container
Test the trigger by uploading a `.csv` file to the blob container.

Then go to the `Monitor` hub to see if the pipeline was triggered.
The trigger doesn’t run
- Make sure that you `Publish` the pipeline
- Check the pipeline parameter expression
- Check the `triggerBody` expression
This tutorial is based on Adam’s video here where he explains how to use triggers in ADF.