Designing and Implementing a Data Science Solution on Azure - Labs for DP-100
Lab
Intermediate
5 h 0 m
2020-01-20
Pausable for 72 hours
Lab Overview
In this hands-on lab, you will step through ten exercises that use Azure Machine Learning to accomplish several tasks essential to the DP-100 Designing and Implementing a Data Science Solution on Azure exam. You will learn how to create and deploy a training pipeline, run experiments and manage models, work with datastores and datasets, work with environments and compute targets, create and configure a publishing pipeline, automate machine learning, monitor with Application Insights, and detect data drift.

Related Learning Path(s):
DP-100: Designing and Implementing a Data Science Solution on Azure
Objectives
  • Understand Azure ML Workspaces and Tools
  • Create and Deploy a Training Pipeline
  • Run Experiments and Manage Models
  • Work with Datastores and Datasets
  • Work with Environments and Compute Targets
  • Create a Publishing Pipeline
  • Create Real-time and Batch Inferencing Services
  • Tune Hyperparameters and Automated Machine Learning
  • Review Automated ML Explanations and Interpret Models
  • Monitor a Model and Detect Data Drift
Exercises
In this exercise, you will create the Azure Machine Learning workspace that you will use throughout the rest of this lab and you will explore various tools for working with an Azure Machine Learning workspace.
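A workspace can be created through the Azure portal or programmatically. As a minimal sketch using the v1 azureml-core SDK (the workspace name, resource group, subscription ID, and region below are placeholders):

from azureml.core import Workspace

# Create a new workspace (placeholder names; requires an Azure subscription)
ws = Workspace.create(name='my-workspace',
                      subscription_id='<subscription-id>',
                      resource_group='my-resource-group',
                      create_resource_group=True,
                      location='eastus')

# Save the workspace details to a local config.json so later scripts
# can reconnect with Workspace.from_config()
ws.write_config()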
The Designer interface provides a drag-and-drop environment in which you can define a workflow, or pipeline, of data ingestion, transformation, and model training modules to create a machine learning model. You can then publish this pipeline as a web service that client applications can use for inferencing (generating predictions from new data).
Experiments are at the core of a data scientist's work. In Azure Machine Learning, an experiment is used to run a script or a pipeline, and usually generates outputs and records metrics.
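For example, a minimal inline experiment with the v1 SDK might look like the following (the experiment name and metric value are illustrative):

from azureml.core import Workspace, Experiment

ws = Workspace.from_config()
experiment = Experiment(workspace=ws, name='my-experiment')

# Start an inline run, record a metric, and complete the run
run = experiment.start_logging()
run.log('accuracy', 0.85)   # illustrative metric value
run.complete()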
Although it's fairly common for data scientists to work with data on their local file system, in an enterprise environment it can be more effective to store the data in a central location where multiple data scientists can access it. In this lab, you'll store data in the cloud, and use an Azure Machine Learning datastore to access it. Datasets provide a way to encapsulate data for experiments and training. You can use tabular and file datasets to define versioned sources of data that can easily be consumed in experiments.
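A rough sketch of that workflow with the v1 SDK (the local file, target path, and dataset name are placeholders):

from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Upload a local file to the datastore
datastore.upload_files(files=['./data/sample.csv'], target_path='data/', overwrite=True)

# Define a tabular dataset over the uploaded file and register a version of it
dataset = Dataset.Tabular.from_delimited_files(path=(datastore, 'data/sample.csv'))
dataset = dataset.register(workspace=ws, name='sample-dataset', create_new_version=True)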
All Python code runs in the context of an environment, which determines the Python packages available. When you run a script as an experiment in Azure Machine Learning, you can configure the environment in which it runs as well as the compute targets for it to execute on.
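As an illustrative sketch with the v1 SDK (the environment name, cluster name, packages, and train.py script are placeholders):

from azureml.core import Workspace, Environment, Experiment, ScriptRunConfig
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.compute import ComputeTarget, AmlCompute

ws = Workspace.from_config()

# Define an environment with the packages the training script needs
env = Environment('my-environment')
env.python.conda_dependencies = CondaDependencies.create(
    pip_packages=['scikit-learn', 'azureml-defaults'])

# Provision a managed compute cluster (or reuse one with ComputeTarget(ws, name))
compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS2_V2', max_nodes=2)
compute_target = ComputeTarget.create(ws, 'my-cluster', compute_config)
compute_target.wait_for_completion(show_output=True)

# Run a training script in that environment on that compute target
src = ScriptRunConfig(source_directory='.', script='train.py',
                      environment=env, compute_target=compute_target)
run = Experiment(ws, 'training-experiment').submit(src)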
You can use the Azure Machine Learning SDK to perform all of the tasks required to create and operate a machine learning solution in Azure. Rather than perform these tasks individually, you can use pipelines to orchestrate the steps required to prepare data, run training scripts, register models, and other tasks. After you've created a pipeline, you can publish a REST endpoint through which the pipeline can be initiated. This enables you to run the pipeline on-demand or at scheduled times.
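A condensed sketch of building and publishing a two-step pipeline with the v1 SDK (the scripts, step names, and cluster name are placeholders):

from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

# Two script steps; prep.py and train.py are placeholder scripts
prep_step = PythonScriptStep(name='prepare data', source_directory='.',
                             script_name='prep.py', compute_target='my-cluster')
train_step = PythonScriptStep(name='train model', source_directory='.',
                              script_name='train.py', compute_target='my-cluster')
train_step.run_after(prep_step)  # enforce step order

pipeline = Pipeline(workspace=ws, steps=[prep_step, train_step])
run = Experiment(ws, 'pipeline-experiment').submit(pipeline)
run.wait_for_completion()

# Publish the pipeline to get a REST endpoint for on-demand or scheduled runs
published = run.publish_pipeline(name='training-pipeline',
                                 description='Model training pipeline', version='1.0')
print(published.endpoint)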
There's no point in training and registering models if you don't plan to make them available for applications to use. In this lab, you'll deploy a model as a web service for real-time inferencing. In many scenarios, inferencing is performed as a batch process that uses a predictive model to score a large number of cases. To implement this kind of inferencing solution in Azure Machine Learning, you can create a batch inferencing pipeline.
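For the real-time case, a minimal deployment sketch with the v1 SDK looks roughly like this (the model name, entry script, environment, and service name are placeholders; a batch inferencing solution instead wraps a scoring script in a pipeline, typically with ParallelRunStep):

from azureml.core import Workspace, Environment
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = ws.models['my-model']  # assumes a model registered under this name

# score.py is a placeholder entry script that defines init() and run() functions
env = Environment.get(ws, 'AzureML-Minimal')  # a curated environment; names vary by SDK version
inference_config = InferenceConfig(entry_script='score.py', environment=env)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(ws, 'my-service', [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)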
Hyperparameters are variables that affect how a model is trained, but which can't be derived from the training data. Choosing the optimal hyperparameter values for model training can be difficult and usually involves a great deal of trial and error. Likewise, determining the right algorithm and preprocessing transformations can involve a lot of guesswork and experimentation. In this exercise, you'll use Azure Machine Learning to tune hyperparameters by performing multiple training runs in parallel, and then use automated machine learning to determine the optimal algorithm and preprocessing steps for a model in the same way.
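A compact sketch of both approaches with the v1 SDK (the script arguments, metric names, dataset, and label column are placeholders, and the training script is assumed to log the primary metric):

from azureml.core import Workspace, Experiment, ScriptRunConfig
from azureml.train.hyperdrive import (HyperDriveConfig, RandomParameterSampling,
                                      PrimaryMetricGoal, choice, uniform)

ws = Workspace.from_config()

# Random sampling over two hypothetical script arguments
param_sampling = RandomParameterSampling({
    '--learning_rate': uniform(0.01, 0.1),
    '--n_estimators': choice(50, 100, 150)
})

src = ScriptRunConfig(source_directory='.', script='train.py', compute_target='my-cluster')
hyperdrive_config = HyperDriveConfig(run_config=src,
                                     hyperparameter_sampling=param_sampling,
                                     primary_metric_name='AUC',  # must match a metric the script logs
                                     primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                     max_total_runs=6,
                                     max_concurrent_runs=2)
run = Experiment(ws, 'hyperdrive-experiment').submit(hyperdrive_config)

# Automated ML is configured analogously with AutoMLConfig, e.g.:
from azureml.train.automl import AutoMLConfig
automl_config = AutoMLConfig(task='classification',
                             training_data=ws.datasets['sample-dataset'],
                             label_column_name='Label',  # placeholder label column
                             primary_metric='AUC_weighted',
                             iterations=6,
                             compute_target='my-cluster')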
When you use automated machine learning to train a model, you can configure the experiment to generate explanations of feature importance. When you train your own models, you can use explainers to determine feature importance.
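As a small, self-contained illustration of an explainer (using a toy scikit-learn model rather than an Azure ML run; TabularExplainer comes from the azureml-interpret / interpret-community packages):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from interpret.ext.blackbox import TabularExplainer

# Train a simple model on a toy dataset purely for illustration
data = load_iris()
model = DecisionTreeClassifier().fit(data.data, data.target)

# Compute and print global feature importance for the trained model
explainer = TabularExplainer(model, data.data,
                             features=data.feature_names,
                             classes=list(data.target_names))
explanation = explainer.explain_global(data.data)
print(explanation.get_feature_importance_dict())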
When you deploy a model as a service, it's useful to be able to track information about the requests it processes. Changing trends in data over time can reduce the accuracy of the predictions made by a model. Monitoring for this data drift and retraining as necessary is an important way to ensure your machine learning solution continues to predict accurately.
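A rough sketch of both tasks with the v1 SDK (the service, monitor, cluster, and dataset names are placeholders, and the target dataset is assumed to be a time-series dataset with a timestamp column):

from datetime import datetime
from azureml.core import Workspace
from azureml.datadrift import DataDriftDetector

ws = Workspace.from_config()

# Turn on Application Insights logging for a deployed service
service = ws.webservices['my-service']
service.update(enable_app_insights=True)

# Monitor drift between a baseline dataset and more recent target data
monitor = DataDriftDetector.create_from_datasets(
    ws, 'my-drift-monitor',
    ws.datasets['baseline-dataset'],   # placeholder registered datasets
    ws.datasets['target-dataset'],
    compute_target='my-cluster',
    frequency='Week',
    drift_threshold=0.3)

# Backfill the monitor over a past date range to compute drift metrics
backfill_run = monitor.backfill(datetime(2020, 1, 1), datetime.today())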