Azure Machine Learning Workspace Review

April 24, 2023

The first step in preparing a machine learning project in Azure is to set up your Azure Machine Learning workspace, which provides all the tools, components, triggers, and actions you need to develop, train, deploy, and monitor your machine learning model. In this article, the second in the series, I will talk about workspaces in general, the Azure Machine Learning workspace, how to create one, the components you get within an ML workspace, and the security, monitoring, and cost management features of a workspace.

What Is a Workspace?

An Azure workspace brings together all the components of a project so that multiple people can collaborate using the different technical tools available within it. It is a recommended best practice to create one workspace per project so that cost, security, and access are easier to monitor. You can have multiple projects in a single workspace, and multiple people working in a single workspace as well, but it is easier to manage the resources of a workspace if they all belong to one project.

Machine Learning Workspace

In an Azure Machine Learning workspace, multiple data scientists, analysts, and collaborators work on a single machine learning project from a single place. The workspace provides several types of compute, machine learning experiment management, job pipelines, dataset management, data source connections, secured container handling, model environment preparation, and scalable inferencing options. To begin, we’ll need an Azure account, an Azure subscription, and a resource group in which to create the workspace. We can either create a new resource group or use an existing one.

You can create a machine learning workspace in Azure in multiple ways:

  • Using Azure web portal
  • Using Azure Machine Learning studio
  • Using Azure Machine Learning Python SDK
  • Using Azure CLI extension on the command line
  • Using VS Code extension for Azure Machine Learning

In this article, I’ll demonstrate how to use the Azure web portal option to create the workspace.
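
For comparison, here is a minimal sketch of what the Python SDK route might look like. It assumes the azure-ai-ml (SDK v2) and azure-identity packages are installed and that you are signed in to Azure. The helper names (workspace_settings, create_workspace), the workspace name, the region, and the tag are all hypothetical placeholders, not values from the portal walkthrough.

```python
def workspace_settings(name, location, project_tag):
    """Collect the same basic details the portal form asks for.

    All values passed in here are illustrative placeholders.
    """
    return {
        "name": name,
        "location": location,
        "tags": {"project": project_tag},
    }


def create_workspace(subscription_id, resource_group, settings):
    """Create the workspace with the azure-ai-ml (SDK v2) package.

    The imports live inside the function so that the helper above
    still works on a machine without the Azure packages installed.
    """
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Workspace
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
    )
    workspace = Workspace(**settings)
    # begin_create returns a poller; result() waits for the deployment.
    return ml_client.workspaces.begin_create(workspace).result()


# Build the settings only; nothing is sent to Azure at this point.
settings = workspace_settings("ml-workspace-demo", "eastus", "demo-project")
print(settings)
```

Calling create_workspace with your own subscription ID and resource group would submit the deployment. As with the portal flow, the supporting resources are created for you when you do not supply existing ones.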

Creating an Azure Machine Learning Workspace

To create an Azure Machine Learning workspace from the web portal, I first need to create a resource. In the search box I’ll type azure machine learning; Figure 1 shows the options that appear under the search bar.

Screenshot in Microsoft Azure to create a resource, and filter to Azure Machine Learning.
Figure 1: Azure Machine Learning workspace creation  |  Used with permission from Microsoft.

Select the first option, Azure Machine Learning, and click Create. On the next page (Figure 2) you can review your choice, usage information, available plans, etc.

Screenshot of the page in Microsoft Marketplace where you confirm choices before you create an Azure Machine Learning workspace.
Figure 2: Azure Machine Learning service overview.  |  Used with permission from Microsoft.

From here simply click Create to go to the next page of creating the Azure Machine Learning workspace.

Screenshot that shows details of the Azure Machine Learning workspace that is about to be created. The screenshot shows a subscription from Visual Studio Enterprise with MSDN. The workspace is named ML workspace. Other details include the region, a new storage account, a new key vault, a new application insights, and for now, not setting the container registry.
Figure 3: Azure Machine Learning workspace creation detail.  |  Used with permission from Microsoft.

Here I can see the detailed options for creating this Azure Machine Learning workspace. I selected my Visual Studio Enterprise subscription, but you can select any of the subscriptions available to you from the drop-down menu.

You can either choose an existing resource group for this workspace or click Create new to create a new resource group, as I’ve done in Figure 3. After that, I need to complete the required workspace details: a unique workspace name, region, storage account, key vault, Application Insights, and container registry.

For the storage account, key vault, and Application Insights, you can choose existing resources if you have already set them up within this resource group. But you cannot select components of a different resource group here. If the resource group is new (which is my case here), I need to create a new key vault, storage account, and Application Insights instance. The storage account security keys will be stored in the key vault. All the artifacts generated during the machine learning development process will be stored, by default, in the storage account of this resource group.

You can click Networking if you already have a virtual network deployed to peer your new machine learning workspace in order to make your overall development more secure. In this case, I clicked Review + create to create and deploy the workspace.

Screenshot showing successful deployment of the Azure Machine Learning workspace named Microsoft.MachineLearningServices in the resource group ML workspace1.
Figure 4: Deployment page of Azure Machine Learning workspace  |  Used with permission from Microsoft.

Once the workspace is deployed I will see the page in Figure 4, though the deployment may take a few seconds to complete. From that deployment page I will click the button labeled Go to resource.

Screenshot of the Azure Machine Learning resource page. This is the hub where you can see the details of the resources in the workspace.
Figure 5: Azure Machine Learning resource page  |  Used with permission from Microsoft.

I can now see the page in Figure 5, where I can review the resources of my Azure Machine Learning workspace. Here I see links to the resource information, location, region, subscription ID, storage account, and Application Insights information. It is a best practice to add tags to your workspace and to your other Azure services, based on your requirements; tags help you monitor the costs of the projects they identify.

If you want multiple team members to collaborate on a single project under one machine learning workspace, click Access control (IAM) in the left panel and add new contributors or owners according to the team’s needs. All user access management, including access to the storage account, can be done from the Access control (IAM) page.
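
The same kind of role assignment can also be scripted. The sketch below only builds the resource scope of a workspace and the matching az role assignment create command; it does not run anything against Azure. The subscription ID, resource group, workspace name, and email address are hypothetical placeholders, and the two helper functions are my own illustration.

```python
import shlex


def workspace_scope(subscription_id, resource_group, workspace_name):
    """Build the Azure resource ID that identifies one ML workspace."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace_name}"
    )


def role_assignment_command(assignee, role, scope):
    """Return the Azure CLI arguments that grant `role` on `scope`."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ]


# Placeholder values; replace them with your own before running the command.
scope = workspace_scope("<subscription-id>", "ml-rg-demo", "ml-workspace-demo")
command = role_assignment_command("teammate@example.com", "Contributor", scope)
print(shlex.join(command))
```

Scoping the assignment to the workspace rather than to the whole resource group keeps the teammate’s access limited to this one project.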

Now to open Azure Machine Learning studio from the workspace, simply click Studio web URL and this will open a new tab to take you to the Azure Machine Learning studio portal.

Screenshot of the Azure Machine Learning Studio portal.
Figure 6: Azure Machine Learning Studio portal  |  Used with permission from Microsoft.

Azure Machine Learning studio portal gives you all the tools and components you will need to create your machine learning experiments, compute for those experiments, deployment processes, automated jobs, etc.

Components Created and Provisioned for You in Your Azure Machine Learning Workspace

Let’s go back to the resource group in Azure portal to see all the components that have been provisioned along with Azure Machine Learning workspace.

Screenshot of the resource group and the components that have been added, including the workspace, Application Insights, the storage account, and the key vault.
Figure 7: Components within the resource group for Azure Machine Learning workspace  |  Used with permission from Microsoft.

Right after creating the workspace in an empty resource group, if you go back to the resource group you will see multiple services, even though you have provisioned only Azure Machine Learning. The reason is that when you create an Azure Machine Learning workspace, the resources listed below are automatically created for you within the same resource group.

Application Insights

Azure Application Insights is used for collecting monitoring and diagnostics data. If you deploy an endpoint after model training, Application Insights will log the number of successful requests handled by your deployed model, the number of failed requests, and the runtime information. You can see the data under the Application Insights section. If you delete the Application Insights service from your workspace, you cannot re-create it unless you create a new workspace.

Storage Account

The storage account created with the Azure Machine Learning workspace stores all the datasets uploaded by users to Azure Machine Learning studio. During the experiment phase, all the data artifacts, images, screenshots, and metrics that you create for your machine learning project are stored in this storage account by default. Even the notebooks you create for writing code are stored there.

Key Vault

Azure creates an Azure Key Vault during the creation of the Azure Machine Learning workspace. The key vault is used to store the secret keys and tokens of data stores, data connections, and compute targets. This gives an extra level of security to your machine learning project.

Azure ML Workspace

This is your Azure Machine Learning studio, provisioned under this workspace, which gives you all the tools you need for this machine learning project.

Components to Help Train and Develop the ML Model

An Azure Machine Learning workspace (Figure 6) contains many components that help us throughout the machine learning life cycle to complete our AI application development loop.

Model Development

Within the Azure Machine Learning workspace, you can train and develop the machine learning model in three ways:

  • Azure notebooks
  • Automated machine learning
  • Machine learning designer

In this series, I will only use Azure notebooks for the experiments. Azure notebooks use the underlying technology of Jupyter notebooks, where one can work on a machine learning project following a modularized approach. With automated machine learning, all I need to do is select the dataset and some configurations, and everything else is handled by the backend. In the machine learning designer, all the machine learning development components are available as drag-and-drop modules; all I need to do is drop them onto the canvas and configure them based on my needs.

Data Source

You can either connect different Azure data sources, such as Blob Storage, SQL, ADLS, and Cosmos DB, or upload a tabular file as a data asset into the machine learning studio portal. Many data connectors and data asset management options are available in machine learning studio. I will cover this topic in article six, where I’ll show a step-by-step demo of data connections.

Jobs

Jobs are scheduled machine learning model runs. Suppose you have developed a training pipeline in which, every week, your experiment collects the last seven days of data from different data sources, prepares a complete dataset, and then re-trains the machine learning model. That weekly run is a scheduled job.
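
To make the weekly example concrete, here is a small sketch of how such a job might work out which days of data to collect on each run. The function name training_window and the run date are my own illustrative choices; the scheduling itself would be configured in Azure Machine Learning, which this snippet does not touch.

```python
from datetime import date, timedelta


def training_window(run_date, days=7):
    """Return (start, end) covering the `days` full days before run_date.

    Both dates are inclusive, and run_date itself is excluded because
    that day's data is still incomplete when the job starts.
    """
    end = run_date - timedelta(days=1)
    start = run_date - timedelta(days=days)
    return start, end


# Hypothetical run date for one weekly job.
start, end = training_window(date(2023, 4, 24))
print(start, end)
```

Each weekly run would pass its own run date, so consecutive runs cover back-to-back seven-day windows without overlap.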

Reusable Workflow Pipelines

You can build pipelines with multiple actions (training, resource maintenance, re-training, etc.); such a collection of actions is called a workflow pipeline in Azure Machine Learning. Once you build a workflow pipeline, you can re-use it in other experiments. These pipelines are used for both model training and re-training.

Environments and Model Registration

Once a model is trained, you need to deploy it to make predictions in production. If you used multiple libraries to train the model, you should create a Docker container with those libraries so that the model can make predictions properly. The environment you create to run the trained model is called a machine learning model environment. A YAML file lists the libraries that the environment needs, matching those used during model training.
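
As an illustration of such a YAML file, the sketch below generates a conda-style environment definition from a list of libraries. The environment name, Python version, and package names are placeholders I chose for the example; your own list should match whatever you used during training.

```python
def conda_environment_yaml(name, python_version, conda_packages, pip_packages):
    """Render a conda environment definition as YAML text."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - python={python_version}",
    ]
    lines += [f"  - {pkg}" for pkg in conda_packages]
    if pip_packages:
        # pip itself must be listed before the nested pip section.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {pkg}" for pkg in pip_packages]
    return "\n".join(lines) + "\n"


# Placeholder libraries; list the ones your training code actually imports.
yaml_text = conda_environment_yaml(
    "model-env", "3.9", ["scikit-learn", "pandas"], ["joblib"]
)
print(yaml_text)
```

Keeping this file under version control next to the training code makes it easier to rebuild the same environment at deployment time.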

Once an experiment is done, you take the machine learning model to validation and testing. You need to build a Docker container with the right environment, the model scoring file, and the trained model file, and then register it in a container registry so that you can deploy it as a container instance later, during the testing process.
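
A scoring file in Azure Machine Learning conventionally exposes two functions: init(), which runs once when the container starts, and run(), which handles each request. The sketch below follows that shape but uses a stand-in "model" (a hard-coded set of weights) so that it runs anywhere. The weights, the input format, and the response format are assumptions made for illustration only.

```python
import json

model = None


def init():
    """Runs once at container start; a real script loads the model file here."""
    global model
    # Stand-in for a trained model: hypothetical linear weights and bias.
    model = {"weights": [0.5, 0.25], "bias": 1.0}


def run(raw_data):
    """Runs per request; raw_data is the JSON request body as a string."""
    rows = json.loads(raw_data)["data"]
    predictions = [
        sum(w * x for w, x in zip(model["weights"], row)) + model["bias"]
        for row in rows
    ]
    return {"predictions": predictions}


# Local smoke test with a made-up request body.
init()
response = run(json.dumps({"data": [[2.0, 4.0], [0.0, 0.0]]}))
print(response)
```

Running the script locally like this before building the container is a cheap way to catch input-format mistakes early.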

Model Deployment

Azure Machine Learning provides two options to deploy the trained machine learning model:

  • Azure Container Instances (ACI): A single instance of the model used for testing purposes.
  • Azure Kubernetes Service (AKS): Provides options to deploy our model in multiple pods/instances for scalability, mostly for QA/production purposes.

Managed Compute

Compute is one of the most important parts of machine learning model training, testing and deployment. In Azure Machine Learning you may select multiple types of compute, choosing both CPU- and GPU-based machines. You can select a single instance of a machine or a cluster of multiple machines for bigger machine learning jobs. You can configure the complete scalable machine learning production environment from the workspace with the right number of compute instances.

Machine Learning Workspace Interaction

You can use the machine learning workspace components in four ways:

  • Azure Machine Learning studio (ML studio)
  • Python SDK
  • Azure ML CLI
  • VS Code Extension

You can directly configure jobs, create experiments, and connect data sources from the ML studio portal. Alternatively, you can use the Python SDK, the command line-based ML CLI, or the Visual Studio Code extension for Azure Machine Learning to work with ML components. In this series I will mostly use the ML studio web portal, and in some cases I will use the Python SDK to configure some machine learning jobs.

Azure ML Workspace Cost

Monitoring cost is particularly important for a machine learning project, as you can easily spend a lot of money if you misconfigure one or more machines and keep them running for hours. I think the best way to monitor the cost of a workspace is to go back to the resource group, select Cost analysis under Cost Management, and see which components are running, which resource is costing the most, and what the forecasted cost is for the month. You can also configure different types of cost alerts from the Cost alerts option under Cost Management, so that whenever the cost crosses a certain threshold, Azure sends you an email alert with the numbers.
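
The threshold idea behind a cost alert can be sketched in a few lines. This is only an illustration of the logic, not how Azure Cost Management is implemented; the resource names, cost figures, and threshold are made-up sample values.

```python
def cost_summary(costs_by_resource, alert_threshold):
    """Summarize month-to-date cost and flag a threshold breach."""
    total = sum(costs_by_resource.values())
    most_expensive = max(costs_by_resource, key=costs_by_resource.get)
    return {
        "total": total,
        "most_expensive": most_expensive,
        "alert": total > alert_threshold,
    }


# Made-up month-to-date costs per resource, in dollars.
sample_costs = {
    "compute-cluster": 120.0,
    "storage-account": 8.5,
    "container-registry": 5.0,
}
summary = cost_summary(sample_costs, alert_threshold=100.0)
print(summary)
```

In this sample the compute cluster dominates the bill, which is the usual pattern when a machine is left running after an experiment.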

Screenshot of the left list of menu options for the Resource Group with the first option in the Cost Management section selected.
Figure 8: Cost analysis of Azure Machine Learning workspace.  |  Used with permission from Microsoft.

In this article I have reviewed the Azure Machine Learning workspace: how to create an ML workspace, the different components of an ML workspace, and four ways to interact with those components. This is the first step in preparing your machine learning environment. In my next article I will talk about Azure Machine Learning compute in detail for both model training and deployment.

Next Stop: Azure Machine Learning Compute Review, article 3.

Series articles (will have links as they are published)

Article 1: MLOps Components and Machine Learning Platform Selection 

Article 2: Azure Machine Learning Workspace Review (this article)

Article 3: Azure Machine Learning Compute Review

Article 4: Machine Learning Workflow Review

Article 5: Azure ML Notebook Selection and Development Process

Article 6: Connecting Data Sources with Azure ML Workspace

Article 7: Azure ML Security Review

Article 8: Azure ML Model Training

Article 9: Azure ML Model Registration and ML Job Automation

Article 10: Azure ML Model Deployment in ACI

Article 11: Azure ML Model Deployment in AKS

Article 12: Azure ML Model Health Monitoring

Article 13: Azure ML Model Drift and Data Drift Review

Article 14: Azure ML Model Retraining Pipeline

Article 15: Azure ML Model Result Analysis Dashboard with PowerBI

Rahat Yasir

Rahat Yasir works at ISAAC Instruments as Director of Data Science & AI, where he leads their Data & AI initiatives for a data-driven, AI-powered transportation industry. He was selected as one of Canada's top 30 software developers under 30 in 2018. He is an eight-time Microsoft Most Valuable Professional award holder in the Artificial Intelligence category. He has years of experience in imaging and data analysis application development, cross-platform technologies, and enterprise system design.