Azure Batch for Internet Data Collection Part 3: Jobs and Tasks

In Part 2 of this series, I created a Batch account, deployed my .NET console application as an Application Package, and created a Batch pool of nodes. In this blog post, I will run a simple job against that pool.

A job is a collection of one or more tasks. It manages how computation is performed by its tasks on the compute nodes in a pool.

To read more about jobs, go to https://docs.microsoft.com/en-us/azure/batch/batch-api-basics#job

I will show how to create and configure a job to run my deployed .NET console application.

Click on the Jobs blade
Click on Add
Set the Pool to the pool just created
Configure the Job manager task so that it runs the .NET console application.
The command line in my case was in the format:

cmd /c %AZ_BATCH_APP_PACKAGE_<Package Name>%\\<app name>.exe -args {set of command line arguments}

Click Select and then OK
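For readers who would rather script this step than click through the portal, here is a minimal sketch of the same configuration. It is not what I ran. It only builds the command line in the format shown above and wraps it in a dictionary shaped like the request body the Batch REST API accepts when adding a job (`id`, `poolInfo.poolId`, `jobManagerTask`). The package name, executable name, arguments, job id and pool id are hypothetical placeholders, and the sketch writes the path separator as a single backslash where the format above shows it doubled.

```python
import json


def build_command_line(package_name, exe_name, args):
    """Build the Windows command line in the format used in this post.

    package_name is the <Package Name> part of the
    AZ_BATCH_APP_PACKAGE_<Package Name> environment variable.
    The "-args" switch belongs to my console application, not to Batch.
    """
    app_dir = "%AZ_BATCH_APP_PACKAGE_{}%".format(package_name)
    command = "cmd /c {}\\{}".format(app_dir, exe_name)
    if args:
        command += " -args " + " ".join(args)
    return command


def build_job_body(job_id, pool_id, command_line):
    """Return a dict shaped like the body of a Batch 'add job' request."""
    return {
        "id": job_id,
        "poolInfo": {"poolId": pool_id},
        "jobManagerTask": {
            "id": "jobmanager",  # hypothetical task id
            "commandLine": command_line,
        },
    }


# Hypothetical names, for illustration only.
command_line = build_command_line("COLLECTOR", "Collector.exe", ["arg1", "arg2"])
job_body = build_job_body("collector-job", "collector-pool", command_line)
print(json.dumps(job_body, indent=2))
```

Sending this body to the service still needs an authenticated request against the Batch account, which is left out here.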

By clicking into the created job, you can see the task has been created for execution.


By clicking into the Batch pool, you can see one running node that is executing the job’s task.


Once the single task finishes, its state changes to completed.
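One caveat: a task in the completed state has finished running, but that alone does not say whether it succeeded. The exit code does. As a rough sketch, assuming you have the JSON from a Batch REST "list tasks" call (a `value` array whose items carry `id`, `state` and `executionInfo.exitCode`), the check could look like this. The sample response below is hand-written for illustration and is not output from my job.

```python
def summarize_tasks(list_tasks_response):
    """Return (all_completed, failed_ids) for a 'list tasks' response dict.

    all_completed is True only when there is at least one task and every
    task is in the "completed" state. failed_ids lists completed tasks
    whose exit code is present and non-zero.
    """
    tasks = list_tasks_response.get("value", [])
    all_completed = bool(tasks) and all(
        task.get("state") == "completed" for task in tasks
    )
    failed_ids = [
        task["id"]
        for task in tasks
        if task.get("state") == "completed"
        and task.get("executionInfo", {}).get("exitCode") not in (0, None)
    ]
    return all_completed, failed_ids


# Hand-written sample, shaped like the service response.
sample = {
    "value": [
        {"id": "jobmanager", "state": "completed",
         "executionInfo": {"exitCode": 0}},
    ]
}
all_completed, failed_ids = summarize_tasks(sample)
print(all_completed, failed_ids)
```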


In the list of jobs, you can see that the job has completed.


As a result, you can see some of the files stored in my Azure Data Lake Store. My .NET console application is implemented to write its data to this Azure Data Lake Store.

If there are no further jobs to run, I save on compute costs by deleting the Batch pool. Note that you can go to each individual node and click Disable, but this won’t save on compute costs because the VM is not deallocated. Disabling only takes the node offline for task scheduling.
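To make the difference concrete, the two actions map to two different Batch REST operations: deleting a pool is a DELETE on the pool resource, while disabling a node is a POST to that node's `disablescheduling` action. The sketch below only builds the method and URL for each and sends nothing. The account URL, pool id and node id are hypothetical, and the `api-version` value is a placeholder you would replace with a version your account supports.

```python
def delete_pool_request(batch_url, pool_id, api_version):
    """Method and URL that remove the pool, which releases its VMs."""
    url = "{}/pools/{}?api-version={}".format(
        batch_url.rstrip("/"), pool_id, api_version
    )
    return ("DELETE", url)


def disable_scheduling_request(batch_url, pool_id, node_id, api_version):
    """Method and URL that stop task scheduling on one node.

    The node stays allocated, so this does not reduce compute cost.
    """
    url = "{}/pools/{}/nodes/{}/disablescheduling?api-version={}".format(
        batch_url.rstrip("/"), pool_id, node_id, api_version
    )
    return ("POST", url)


# Hypothetical values, for illustration only.
account_url = "https://myaccount.myregion.batch.azure.com"
print(delete_pool_request(account_url, "collector-pool", "<api-version>"))
print(disable_scheduling_request(account_url, "collector-pool", "node-1", "<api-version>"))
```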

Conclusion

I have shown how to deploy a .NET console application, create a pool of VMs and run a job to execute the .NET console application. However, the job ran on only one node. What about taking advantage of the rest of the nodes? A job can run many tasks in parallel, thereby using many or all of the nodes in the pool. My next blog article will demonstrate parallel task execution.

Next: Azure Batch for Internet Data Collection Part 4: ParallelTask Execution
