Azure Data Factory: Copy Data from Azure Data Lake Store to Azure SQL

Objective: From Azure Data Lake Store, copy data in a .tsv file to a database table in Azure SQL DB.

My take on Azure Data Factory is that of ETL but less of the transformation and move data from on-premises and cloud sources to Azure data-oriented services.

Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-1

Read further about Azure Data Factory https://docs.microsoft.com/en-us/azure/data-factory/data-factory-introduction

In my Azure Data Lake Store, I have the 227mb tsv file which will be the source data.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-2
In my Azure SQL DB, I have the following table which will be the target.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-3

The approach I will take is to use the Copy data Wizard. This is much more easier and convenient than the Author and deploy approach which is based on editing JSON. Hopefully in the near future, there will be more UI based tools.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-4

Click Copy data
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-5

Click Next.
Select Azure Data Lake Store as the source

Enter connection information. For Authentication Type, I selected Service Principal over OAuth. The advantage is that the authentication does not expire as it does for OAuth. So for recurring schedules, you are best off with creating a Service Principal. For details on creating a service principal https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal#get-application-id-and-authentication-key

Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-6
Choose the file or folder as the source data
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-7

Configure File Format settings. Most have been auto-detected. I chose to edit the schema to apply some proper field names
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-8

Select Destination data store as Azure SQL DB
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-9

Enter connection and authentication information
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-10

Select table
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-11

Configure column mapping. Note: I haven’t setup for repeatability such as defining a date field to help manage this scenario.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-12

Performance settings. I choose to leave as default.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-13

Confirm Summary
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-14

Validating and Deployment. This will create the linked services, data sets, and pipelines the same way in the Author and deploy editor
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-15
Monitor pipeline

Confirm data copy by querying the Azure SQL JobPostings table.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-16

In Azure SQL, a quick look of the Resource Utilization during this ADF pipeline activity we can see it has been a substantial load. Although the sizing is at the lowest at 5 DTUs.
Azure Data Factory- Copy Data from Azure Data Lake Store to Azure SQL-17

I find Azure Data Factory I neat tool with basic capabilities. I don’t think this should be compared with SSIS with more sophisticated capabilities. For now, Copy data wizard is good for general bulk data movement to get your data into the Azure cloud.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s