Bringing Folder Structure via Azure Data Factory

In this blog post, we will see how to create a folder structure dynamically with Azure Data Factory, using some sample files in different file formats. We will turn each file format into a folder, build a year–>month–>date folder structure inside each of those folders, and finally copy every file into its respective folder.
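Before moving to Azure Data Factory, the target layout can be sketched locally in plain Python. This is only an illustration of the idea, not the ADF implementation: the file names, the `unknown` fallback folder, and the choice of a fixed run date are all assumptions made for the example.

```python
from datetime import date
from pathlib import PurePosixPath


def target_path(file_name: str, run_date: date) -> str:
    """Return '<extension>/<yyyy>/<MM>/<dd>/<file_name>' for one source file."""
    # The extension (without the dot) becomes the top-level folder.
    # 'unknown' is a hypothetical fallback for files that have no extension.
    extension = PurePosixPath(file_name).suffix.lstrip(".").lower() or "unknown"
    # Year, month and day become nested folders below the extension folder.
    return f"{extension}/{run_date:%Y}/{run_date:%m}/{run_date:%d}/{file_name}"


# Hypothetical sample file names and run date, used only for illustration.
sample_files = ["orders.csv", "customers.json", "events.parquet"]
run_date = date(2021, 3, 7)

for name in sample_files:
    print(target_path(name, run_date))
```

Each file ends up under a folder named after its format, followed by the year, month and day, which is the structure the pipeline in this post produces.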

Prerequisites

  • Active Azure subscription. If you don’t have one, create a free account.
  • Subscription-level Contributor or Owner access, plus a basic understanding of Azure Data Factory and its components. To check our access, type Subscriptions in the portal's global search box, select the subscription, and go to Access Control (IAM)–>View my access.

Sample Files kept in Azure Data Lake Storage Gen 2

0 – sample files

sample files

Get the sample files above from the GitHub repos.

1 – Linked Services

Linked services – ADLS Gen 2
  • Create an Azure Blob Storage account, or enable hierarchical namespace to create ADLS Gen 2 (the latest generation, recommended by Microsoft). Either works; for large volumes, ADLS with a folder structure is recommended.
  • If we created Blob Storage, we can upload files directly from the Azure portal. If we created ADLS, we can upload local files with either Azure Storage Explorer (a downloadable desktop application) or AzCopy v10.
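For reference, a linked service to ADLS Gen 2 is stored by Data Factory as a JSON definition of type `AzureBlobFS`. The sketch below builds such a definition in Python. The linked service name, the storage account name, and the use of account key authentication are assumptions for illustration; the post does not say which authentication method was used.

```python
import json


def adls_gen2_linked_service(name: str, account_name: str) -> dict:
    """Build a minimal ADLS Gen 2 (AzureBlobFS) linked service definition."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBlobFS",
            "typeProperties": {
                # ADLS Gen 2 uses the 'dfs' endpoint of the storage account.
                "url": f"https://{account_name}.dfs.core.windows.net",
                # Placeholder only: never hard-code a real key in source files.
                "accountKey": {
                    "type": "SecureString",
                    "value": "<storage-account-key>",
                },
            },
        },
    }


# 'ls_adls_gen2' and 'mystorageaccount' are hypothetical names.
linked_service = adls_gen2_linked_service("ls_adls_gen2", "mystorageaccount")
print(json.dumps(linked_service, indent=2))
```

In practice the same definition is created through the Linked services screen in the ADF UI; the JSON view there shows the equivalent structure.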

2 – Factory Resources – General

Factory Resources

Factory Resources contains both datasets and pipelines. To build this folder structure dynamically, we need at least one pipeline and three datasets.

2-A – Datasets

a) allfiles

allfiles Dataset

b) sourcebinary

sourcebinary Dataset

c) targetbinary

targetbinary Dataset
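As a rough sketch of what these three datasets could look like in JSON, the Python below builds Binary dataset definitions on an ADLS Gen 2 location. Only the dataset names come from this post. The linked service name, the file systems `source` and `target`, the folder `samplefiles`, and the parameters `fileName` and `folderPath` are assumptions for illustration; the actual settings are in the screenshots and the repos.

```python
import json


def expr(value: str) -> dict:
    """Wrap a string as an ADF dynamic-content expression."""
    return {"value": value, "type": "Expression"}


def binary_dataset(name, linked_service, file_system, folder_path,
                   file_name=None, parameters=()):
    """Build a Binary dataset definition on an ADLS Gen 2 location."""
    location = {
        "type": "AzureBlobFSLocation",
        "fileSystem": file_system,
        "folderPath": folder_path,
    }
    if file_name is not None:
        location["fileName"] = file_name
    return {
        "name": name,
        "properties": {
            "type": "Binary",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # Every dataset parameter is declared as a string.
            "parameters": {p: {"type": "string"} for p in parameters},
            "typeProperties": {"location": location},
        },
    }


LINKED_SERVICE = "ls_adls_gen2"  # hypothetical linked service name

datasets = [
    # Points at the folder that holds all sample files (for Get Metadata).
    binary_dataset("allfiles", LINKED_SERVICE, "source", "samplefiles"),
    # One source file, chosen at run time through the fileName parameter.
    binary_dataset("sourcebinary", LINKED_SERVICE, "source", "samplefiles",
                   file_name=expr("@dataset().fileName"),
                   parameters=("fileName",)),
    # Target file whose folder and name are both supplied at run time.
    binary_dataset("targetbinary", LINKED_SERVICE, "target",
                   expr("@dataset().folderPath"),
                   file_name=expr("@dataset().fileName"),
                   parameters=("folderPath", "fileName")),
]

print(json.dumps(datasets, indent=2))
```

Parameterizing the source and target datasets is what lets a single Copy data activity handle every file, instead of needing one dataset per file.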

2-B – Pipeline

Pipeline

This pipeline needs two activities for the dynamic behavior:

  • Get Metadata
  • ForEach (with a Copy data activity inside the loop)

a) Get Metadata Activity

Get Metadata activity

b) ForEach Activity

ForEach activity

c) Copy data Activity within ForEach Activity

Source and Sink of Copy data activity
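To show how the three activities could fit together, here is a sketch of a pipeline definition built in Python. It is one possible arrangement under stated assumptions, not necessarily the exact pipeline shown in the screenshots: the pipeline and activity names, the dataset parameters `fileName` and `folderPath`, and the folder expression (file extension, then the current UTC year, month and day) are all hypothetical.

```python
import json


def expr(value: str) -> dict:
    """Wrap a string as an ADF dynamic-content expression."""
    return {"value": value, "type": "Expression"}


def dataset_ref(name: str, **parameters) -> dict:
    """Reference a dataset, optionally passing dataset parameters."""
    ref = {"referenceName": name, "type": "DatasetReference"}
    if parameters:
        ref["parameters"] = parameters
    return ref


# Assumed folder expression: <extension>/<yyyy>/<MM>/<dd>
TARGET_FOLDER = (
    "@concat(last(split(item().name, '.')), '/', "
    "formatDateTime(utcNow(), 'yyyy'), '/', "
    "formatDateTime(utcNow(), 'MM'), '/', "
    "formatDateTime(utcNow(), 'dd'))"
)

get_metadata = {
    "name": "Get Metadata1",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": dataset_ref("allfiles"),
        # childItems lists the files and folders inside the dataset's folder.
        "fieldList": ["childItems"],
    },
}

copy_data = {
    "name": "Copy data1",
    "type": "Copy",
    "inputs": [dataset_ref("sourcebinary", fileName=expr("@item().name"))],
    "outputs": [dataset_ref("targetbinary",
                            folderPath=expr(TARGET_FOLDER),
                            fileName=expr("@item().name"))],
    "typeProperties": {
        "source": {"type": "BinarySource",
                   "storeSettings": {"type": "AzureBlobFSReadSettings"}},
        "sink": {"type": "BinarySink",
                 "storeSettings": {"type": "AzureBlobFSWriteSettings"}},
    },
}

for_each = {
    "name": "ForEach1",
    "type": "ForEach",
    # Run only after Get Metadata has succeeded.
    "dependsOn": [{"activity": "Get Metadata1",
                   "dependencyConditions": ["Succeeded"]}],
    "typeProperties": {
        # Loop over every child item returned by Get Metadata.
        "items": expr("@activity('Get Metadata1').output.childItems"),
        "activities": [copy_data],
    },
}

pipeline = {
    "name": "pl_dynamic_folders",  # hypothetical pipeline name
    "properties": {"activities": [get_metadata, for_each]},
}

print(json.dumps(pipeline, indent=2))
```

Get Metadata returns the list of files, ForEach iterates over that list, and inside the loop the Copy data activity passes the current file name to the source dataset and the computed folder path to the sink dataset.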

If you would like to reuse the method above and need the ADF expressions used within the activities, please check the AzureStuffs repos.

Recent Related Posts

Azure Resource Lock

We can also apply an Azure Resource Lock, which prevents accidental deletion and modification of resources.

Summary

Thus, we saw at a high level how to create a folder structure dynamically using Azure Data Factory, along with the relevant settings and options. We mainly used Azure Data Lake Storage Gen 2 and Azure Data Factory, along with Azure Storage Explorer.
