Perform Cloud Data Science with Azure Machine Learning (70-774)
- Set up and configure Azure Machine Learning Studio.
- Navigate Machine Learning Studio and understand basic functionality such as managing projects, experiments, and datasets.
- Import data into Machine Learning Studio from local and external data sources.
- Summarize data and view basic statistics about datasets in Azure Machine Learning Studio.
- Clean and prepare data for training machine learning algorithms in Azure Machine Learning Studio.
- Classify data using decision trees in Azure Machine Learning Studio.
- Leverage existing R scripts in your machine learning experiments in Azure Machine Learning Studio.
- Operationalize your machine learning experiments with Azure Machine Learning Studio.
An Azure Machine Learning Studio Workspace allows you to use Machine Learning Studio to create and manage machine learning experiments and predictive web services. You can create multiple Workspaces, each one containing a set of your experiments, datasets, trained predictive models, web services, and notebooks. As the owner of a Workspace, you can invite other users to share the Workspace so you can collaborate with them on predictive analytics solutions.
In this exercise, you will create an Azure Machine Learning Studio Workspace.
Azure Machine Learning Studio is a powerful, browser-based, visual drag-and-drop authoring environment for machine learning in Azure that requires no code. It allows you to build, deploy, and share predictive analytics solutions in a fully managed cloud service with minimal overhead and fast time to insight.
In this task, you will walk through the Azure Machine Learning Studio interface, where you will create and configure machine learning projects with imported datasets and other assets.
In this exercise, you will create an Azure ML experiment in Azure Machine Learning Studio that uses a classification algorithm to help you build a targeted mailing list.
The type of algorithm we will use is called a binary classifier: an algorithm that assigns each element to one of two groups. In our case, the two groups are individuals we should send an advertisement to and individuals we should not (read: marketing wants to know whether it is worth the cost of the stamp to send an advertisement to a potential customer). Other example use cases include deciding whether an email is junk or good, whether a patient’s lab value is positive or negative, or whether sentiment is positive or negative.
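To make the two-group idea concrete, here is a minimal Python sketch of a binary decision. It is illustrative only and is not how Azure Machine Learning Studio works internally: the `Prospect` type, the `score` field (a stand-in for a model's estimated chance of a response), and the 0.5 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    score: float  # hypothetical: estimated probability that this person responds


def should_mail(prospect: Prospect, threshold: float = 0.5) -> bool:
    """Binary decision: True means send the advertisement, False means do not."""
    return prospect.score >= threshold


# Hypothetical prospects with made-up scores.
prospects = [
    Prospect("A", 0.82),
    Prospect("B", 0.35),
    Prospect("C", 0.50),
]

# Every prospect lands in exactly one of the two groups.
mailing_list = [p.name for p in prospects if should_mail(p)]
skipped = [p.name for p in prospects if not should_mail(p)]
```

In a real experiment the score would come from a trained model rather than being typed in by hand; the point here is only that every input ends up in one of exactly two classes.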
The specific algorithm we will be using is the Two-Class Boosted Decision Tree. Decision trees are a great entry point into machine learning because they are intuitive and easy to understand. The Two-Class Boosted Decision Tree is one of the easiest ways to get good predictive performance. However, it is constrained by the amount of available memory and may not be well suited to larger datasets.
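The general idea of boosting decision trees can be sketched in a few lines of Python. This is not the implementation behind the Two-Class Boosted Decision Tree module; it is a small AdaBoost-style example built from one-split trees ("stumps"), swapped in purely to show how several weak trees combine into a stronger two-class model. The function names and the toy data are assumptions made for illustration.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """A one-split tree: returns polarity if x[feature] <= threshold, else -polarity."""
    return polarity if x[feature] <= threshold else -polarity


def fit_stump(X, y, weights):
    """Try every feature/threshold/polarity and keep the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def fit_boosted(X, y, rounds=3):
    """Train `rounds` stumps; each round up-weights the rows the last stump got wrong."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, feature, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero / log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # this stump's vote strength
        ensemble.append((alpha, feature, threshold, polarity))
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps; labels are +1 and -1."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data: one feature, with the -1 class sitting in the middle of the range,
# so no single split can separate the two classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

single = fit_boosted(X, y, rounds=1)
boosted = fit_boosted(X, y, rounds=3)
single_preds = [predict(single, row) for row in X]
boosted_preds = [predict(boosted, row) for row in X]
```

On this toy data a single stump mislabels the last two points, while the three-round ensemble labels all six correctly. Note that the sketch keeps the whole training set in Python lists, which mirrors the in-memory constraint mentioned above.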
In Azure ML Studio, you can use the Execute R Script module to embed R code in your experiments and run it as part of the experiment. This lets you use custom R functions and packages that are not available out of the box in Azure ML Studio.
In this exercise, we are going to use an R script to sample our dataset. You might want to do this if you have a large dataset and want to use an algorithm, such as the Two-Class Boosted Decision Tree, that operates in memory and therefore needs a smaller dataset. We will run the R script by using the Execute R Script module in Azure ML Studio.
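The sampling idea itself is independent of the language. The following Python sketch is not the R script used in the exercise; it only illustrates drawing a reproducible random subset of rows. The function name `sample_rows`, the `fraction` parameter, and the seed value are assumptions made for this example.

```python
import random


def sample_rows(rows, fraction, seed=42):
    """Return a random subset of `rows` without replacement.

    `fraction` is the share of rows to keep (0 < fraction <= 1).
    A fixed `seed` makes the sample reproducible between runs.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    if not rows:
        return []
    rng = random.Random(seed)  # private generator; global random state is untouched
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


# Hypothetical dataset: 1,000 rows represented by their row ids.
dataset = list(range(1000))
subset = sample_rows(dataset, fraction=0.1)
```

Sampling without replacement keeps every selected row unique, and fixing the seed means the same subset is drawn each time, which makes later experiment runs comparable.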