課程名稱:Microsoft Certified Azure Data Scientist Associate (1科 Azure Machine Learning 及 MLflow) 國際認可證書課程 - 簡稱:Azure Data Scientist Training Course |
DP-100 Designing and Implementing a Data Science Solution on Azure
1. Design a Data Ingestion Strategy
1.1 Identify your data source and format
1.2 Identify the data source
1.3 Identify the data format
1.3.1 Tabular or structured data
1.3.2 Semi-structured data
1.3.3 Unstructured data
1.4 Identify the desired data format
1.5 Choose how to serve data to machine learning workflows
1.5.1 Separate compute from storage
1.5.2 Store data for model training workloads
1.6 Design a data ingestion solution
1.6.1 Azure Synapse Analytics
1.6.2 Azure Databricks
1.6.3 Azure Machine Learning
1.6.4 Design a data ingestion solution
1.7 Scenario-based Exercise for Designing a Data Ingestion Technology
1.7.1 The Scenario: Give advice on how to ingest and serve the data
1.7.2 What type of data do we currently have?
1.7.3 Which storage solution would you recommend to store the data?
1.7.4 Which tool would you recommend we use to move the data?
1.7.5 Which tool should we use to anonymize the patient data?
1.7.6 Which architecture represents the proposed data ingestion solution?
2. Design a machine learning model training solution
2.1 Identify machine learning tasks
2.2 Capabilities of Azure Machine Learning
2.3 Regression with Azure Machine Learning
2.3.1 Create an Azure Machine Learning workspace
2.3.2 Create compute resources
2.3.3 Explore data
2.3.4 Train a machine learning model
2.3.5 Review the best model
2.3.6 Deploy a model as a service
2.3.7 Test the deployed service
2.3.8 Clean-up
2.4 Choose a service to train a machine learning model
2.5 Decide between compute options
2.5.1 CPU or GPU
2.5.2 General purpose or memory optimized
2.5.3 Spark
2.5.4 Monitor the compute utilization
2.6 Scenario-based Exercise for Designing a Model Training Strategy
2.6.1 How should we train the model to predict diabetes?
2.6.2 Which tool should we use to train the diabetes model?
2.6.3 Which compute would you recommend to train the model?
2.6.4 Which virtual machine size should we use for model development?
2.6.5 How should we train another model to predict skin disorders?
3. Design a model deployment solution
3.1 Understand how model will be consumed
3.1.1 Deploy a model to an endpoint
3.1.2 Get real-time predictions
3.1.3 Get batch predictions
3.2 Decide on real-time or batch deployment
3.2.1 Identify the necessary frequency of scoring
3.2.2 Decide on the number of predictions
3.2.3 Consider the cost of compute
3.2.4 Decide on real-time or batch deployment
3.3 Scenario-based Exercise for Designing a Model Deployment Solution
4. Design a machine learning operations solution
4.1 Explore an MLOps architecture
4.1.1 Set up environments for development and production
4.1.2 Organize Azure Machine Learning environments
4.1.3 Design an MLOps architecture
4.2 Design for monitoring
4.2.1 Monitor the model
4.2.2 Monitor the data
4.2.3 Monitor the infrastructure
4.3 Design for retraining
4.3.1 Prepare your code
4.3.2 Automate your code
4.4 Scenario-based Exercise for Machine Learning Operations
5. Azure ML Workspace Resources and Assets
5.1 Create an Azure Machine Learning workspace
5.1.1 Understand the Azure Machine Learning service
5.1.2 Create the workspace
5.1.3 Give access to the Azure Machine Learning workspace
5.1.4 Organize your workspaces
5.2 Identify Azure Machine Learning resources
5.2.1 Create and manage the workspace
5.2.2 Create and manage compute resources
5.2.3 Create and manage datastores
5.3 Identify Azure Machine Learning assets
5.3.1 Create and manage models
5.3.2 Create and manage environments
5.3.3 Create and manage data
5.3.4 Create and manage components
5.4 Train models in the workspace
5.4.1 Explore algorithms and hyperparameter values with Automated Machine Learning
5.4.2 Run a notebook
5.4.3 Run a script as a job
5.5 Explore the Azure Machine Learning workspace
5.5.1 Provision an Azure Machine Learning workspace
5.5.2 Explore the Azure Machine Learning studio
5.5.3 Author a training pipeline
5.5.4 Create a compute target
5.5.5 Run your training pipeline
5.5.6 Use jobs to view your history
6. Developer tools for workspace interaction
6.1 New Features in v2
6.2 Explore the Python SDK
6.2.1 Install the Python SDK
6.2.2 Connect to the workspace
6.3 Explore the CLI
6.3.1 Install the Azure CLI
6.3.2 Install the Azure Machine Learning extension
6.3.3 Work with the Azure CLI
6.4 Explore developer tools for workspace interaction
6.4.1 Before you start
6.4.2 Provision the infrastructure with the Azure CLI
6.4.3 Create a compute instance with the Azure CLI
6.4.4 Create a compute cluster with the Azure CLI
6.4.5 Configure your workstation with the Azure Machine Learning studio
6.4.6 Use the Python SDK to train a model
6.4.7 Review your job history in the Azure Machine Learning studio
7. Make Data Available in Azure Machine Learning
7.1 Understand URIs
7.2 Datastore
7.2.1 Understand types of datastores
7.2.2 Use the built-in datastores
7.2.3 Create a datastore
7.2.4 Create a datastore for an Azure Blob Storage container
7.3 Data Asset
7.3.1 Understand data assets
7.3.2 When to use data assets
7.3.3 Create a URI file data asset
7.3.4 Create a URI folder data asset
7.3.5 Create a MLTable data asset
7.4 An Exercise of Making data available in Azure Machine Learning
7.4.1 Provision an Azure Machine Learning workspace
7.4.2 Create the workspace and compute resources
7.4.3 Explore the default datastores
7.4.4 Copy the access key
7.4.5 Clone the lab materials
7.4.6 Create a datastore and data assets
7.4.7 Optional: Explore the data assets
8. Compute Targets in Azure Machine Learning
8.1 Choosing the appropriate compute target
8.1.1 Understand the available types of compute
8.1.2 When to use which type of compute?
8.1.3 Choose a compute target for experimentation
8.1.4 Choose a compute target for production
8.1.5 Choose a compute target for deployment
8.2 Create and use a compute instance
8.2.1 Create a compute instance with the Python SDK
8.2.2 Assign a compute instance to a user
8.2.3 Minimize compute time
8.2.4 Use a compute instance
8.3 Create and use a compute cluster
8.3.1 Create a compute cluster with the Python SDK
8.3.2 Using a compute cluster
8.4 Exercise of Working with compute resources in Azure Machine Learning
8.4.1 Provision an Azure Machine Learning workspace
8.4.2 Create the compute setup script
8.4.3 Create the compute instance
8.4.4 Configure the compute instance
8.4.5 Create a compute cluster
9. Working with environments in Azure Machine Learning
9.1 Understanding Environments
9.1.1 What is an environment in Azure Machine Learning?
9.2 Explore and use curated environments
9.2.1 Use a curated environment
9.2.2 Test and troubleshoot a curated environment
9.3 Create and use custom environments
9.3.1 Create a custom environment from a Docker image
9.3.2 Create a custom environment with a conda specification file
9.3.3 Use an environment
9.4 Work with environments in Azure Machine Learning
9.4.1 Provision an Azure Machine Learning workspace
9.4.2 Create the workspace and compute resources
9.4.3 Clone the lab materials
9.4.4 Work with environments
10. Finding the best Classification model with Automated Machine Learning
10.1 Preprocess data and configure featurization
10.1.1 Understand scaling and normalization
10.1.2 Configure optional featurization
10.2 Run an Automated Machine Learning experiment
10.2.1 Restrict algorithm selection
10.2.2 Configure an AutoML experiment
10.2.3 Specify the primary metric
10.2.4 Set the limits
10.2.5 Set the training properties
10.2.6 Submit an AutoML experiment
10.3 Evaluate and compare models
10.3.1 Explore preprocessing steps
10.3.2 Retrieve the best run and its model
10.4 Exercise of Finding the best classification model with Automated Machine Learning
10.4.1 Provision an Azure Machine Learning workspace
10.4.2 Create the workspace and compute resources
10.4.3 Clone the lab materials
10.4.4 Train a classification model with automated machine learning
11. Tracking model training in Jupyter notebooks with MLflow
11.1 Configure MLflow for model tracking in notebooks
11.1.1 Configure MLflow in notebooks
11.1.2 Use MLflow on a local device
11.2 Train and track models in notebooks
11.2.1 Create an MLflow experiment
11.2.2 Log results with MLflow
11.2.3 Enable autologging
11.2.4 Use custom logging
11.3 Exercise of Tracking Model Training in Notebooks by using MLflow
11.3.1 Provision an Azure Machine Learning workspace
11.3.2 Create the workspace and compute resources
11.3.3 Clone the lab materials
11.3.4 Track model training with MLflow
12. Training Script as Command Job
12.1 Convert a notebook to a script
12.1.1 Remove all nonessential code
12.1.2 Refactor your code into functions
12.1.3 Test your script
12.2 Run a script as a command job
12.2.1 Configure and submit a command job
12.3 Use parameters in a command job
12.3.1 Working with script arguments
12.3.2 Passing arguments to a script
12.4 Exercise on Running a training script as command job
12.4.1 Provision an Azure Machine Learning workspace
12.4.2 Create the workspace and compute resources
12.4.3 Clone the lab materials
12.4.4 Convert a notebook to a script
12.4.5 Test a script with the terminal
12.4.6 Run a script as a command job
13. Track model training with MLflow in jobs
13.1 View metrics and evaluate models
13.1.1 View the metrics in the Azure Machine Learning studio
13.1.2 Retrieve metrics with MLflow in a notebook
13.1.3 Search all the experiments
13.1.4 Retrieve runs
13.2 Exercise Using MLflow to track training jobs
13.2.1 Provision an Azure Machine Learning workspace
13.2.2 Create the workspace and compute resources
13.2.3 Clone the lab materials
13.2.4 Submit MLflow jobs from a notebook
14. Hyperparameter tuning with Azure Machine Learning
14.1 Concepts on Hyperparameter
14.1.1 Hyperparameters
14.1.2 Parameters
14.1.3 Tuning hyperparameters
14.2 Define a search space
14.2.1 Discrete hyperparameters
14.2.2 Continuous hyperparameters
14.2.3 Defining a search space
14.3 Three common Hyperparameters in Regression
14.3.1 Learning rate
14.3.2 Batch size
14.3.3 Epochs
14.4 Configure a sampling method
14.4.1 Grid sampling
14.4.2 Random sampling
14.4.3 Sobol
14.4.4 Bayesian sampling
14.5 Early Termination
14.5.1 When to use an early termination policy
14.5.2 Configure an early termination policy
14.5.3 Bandit policy
14.5.4 Median stopping policy
14.5.5 Truncation selection policy
14.6 Using a Sweep Job for Hyperparameter tuning
14.6.1 Configure and run a sweep job
14.6.2 Monitor and review sweep jobs
14.7 Exercise of Perform hyperparameter tuning with a sweep job
14.7.1 Provision an Azure Machine Learning workspace
14.7.2 Create the workspace and compute resources
14.7.3 Clone the lab materials
14.7.4 Tune hyperparameters with a sweep job
15. Run pipelines in Azure Machine Learning
15.1 Create components
15.2 Create a pipeline
15.3 Run a pipeline job
15.4 Exercise of Running pipelines in Azure Machine Learning
15.4.1 Provision an Azure Machine Learning workspace
15.4.2 Create the workspace and compute resources
15.4.3 Clone the lab materials
15.4.4 Run scripts as a pipeline job
16. Register an MLflow model in Azure Machine Learning
16.1 Log models with MLflow
16.1.1 Why use MLflow?
16.1.2 Use autologging to log a model
16.1.3 Manually log a model
16.1.4 Customize the signature
16.2 The MLflow model format
16.2.1 Explore the MLmodel file format
16.2.2 MLmodel file
16.2.3 Flavor
16.2.4 Configure the signature
16.3 Register an MLflow model
16.4 Exercise of Logging and Registering models with MLflow
16.4.1 Provision an Azure Machine Learning workspace
16.4.2 Create the workspace and compute resources
16.4.3 Clone the lab materials
16.4.4 Submit MLflow jobs from a notebook
17. Responsible AI in Azure Machine Learning
17.1 Understanding Responsible AI
17.1.1 Fairness and inclusiveness
17.1.2 Reliability and safety
17.1.3 Transparency
17.1.4 Privacy and security
17.1.5 Accountability
17.2 Create the Responsible AI dashboard
17.2.1 Create a Responsible AI dashboard
17.2.2 Explore the Responsible AI components
17.2.3 Build and run the pipeline to create the Responsible AI dashboard
17.2.4 Using the Python SDK to build and run the pipeline
17.2.5 Exploring the Responsible AI dashboard
17.3 Evaluate the Responsible AI dashboard
17.3.1 Explore error analysis
17.3.2 Explore explanations
17.3.3 Explore counterfactuals
17.3.4 Explore causal analysis
17.4 Exercise Creating and Exploring the Responsible AI dashboard
17.4.1 Provision an Azure Machine Learning workspace
17.4.2 Create the workspace and compute resources
17.4.3 Clone the lab materials
17.4.4 Create a pipeline to evaluate models and submit from a notebook
18. Deploying a model to Managed Online Endpoint
18.1 Explore managed online endpoints
18.1.1 Real-time predictions
18.1.2 Managed online endpoint
18.1.3 Deploy your model
18.1.4 Blue/green deployment
18.1.5 Create an endpoint
18.2 Deploy your MLflow model to a managed online endpoint
18.3 Deploy a model to a managed online endpoint
18.3.1 Create the scoring script
18.3.2 Create an environment
18.3.3 Create the deployment
18.4 Testing Managed Online Endpoints
18.4.1 Use the Azure Machine Learning Python SDK
18.5 Exercise Deploying an MLflow model to an online endpoint
18.5.1 Provision an Azure Machine Learning workspace
18.5.2 Create the workspace and compute resources
18.5.3 Clone the lab materials
18.5.4 Deploy a model to an online endpoint
19. Deploying model to Batch Endpoint
19.1 Understanding Batch endpoints
19.1.1 Batch predictions
19.1.2 Create a batch endpoint
19.1.3 Deploy a model to a batch endpoint
19.1.4 Using compute clusters for batch deployments
19.2 Deploy your MLflow model to a batch endpoint
19.2.1 Register an MLflow model
19.2.2 Deploy an MLflow model to an endpoint
19.3 Deploy a custom model to a batch endpoint
19.3.1 Create the scoring script
19.3.2 Create an environment
19.3.3 Configure and create the deployment
19.4 Invoke and troubleshoot batch endpoints
19.4.1 Trigger the batch scoring job
19.4.2 Troubleshoot a batch scoring job
19.5 Exercise Deploying Model to a Batch endpoint
19.5.1 Provision an Azure Machine Learning workspace
19.5.2 Create the workspace and compute resources
19.5.3 Clone the lab materials
19.5.4 Deploy a model to Batch Endpoint
20. Azure AI Search solution
20.1 Manage Capacity
20.1.1 Service tiers and capacity management
20.1.2 Replicas and partitions
20.2 Search Components
20.2.1 Data source
20.2.2 Skillset
20.2.3 Indexer
20.2.4 Index
20.3 The indexing process
20.4 Search an index
20.4.1 Full text search
20.5 Filtering and Sorting
20.5.1 Filtering results
20.5.2 Filtering with facets
20.5.3 Sorting results
20.6 Enhance the index
20.6.1 Search-as-you-type
20.6.2 Custom scoring and result boosting
20.6.3 Synonyms
20.7 Creating an Azure AI Search Solution
20.7.1 Create Azure resources
20.7.2 Create an Azure AI Search resource
20.7.3 Create an Azure AI Services resource
20.7.4 Create a storage account
20.7.5 Prepare to develop an app in Visual Studio Code
20.7.6 Upload Documents to Azure Storage
20.7.7 Index the documents
20.7.8 Search the index
20.7.9 Explore and modify definitions of search components
20.7.10 Get the endpoint and key for your Azure AI Search resource
20.7.11 Review and modify the skillset
20.7.12 Review and modify the index
20.7.13 Review and modify the indexer
20.7.14 Use the REST API to update the search solution
20.7.15 Query the modified index
20.7.16 Create a search client application
20.7.17 Get the endpoint and keys for your search resource
20.7.18 Prepare to use the Azure AI Search SDK
20.7.19 Explore code to search an index
20.7.20 Explore code to render search results
20.7.21 Run the web app
21. Vector search and retrieval in Azure AI Search
21.1 Concepts about Vector Search
21.2 Prepare your search
21.2.1 Check your index has vector fields
21.2.2 Convert a query input into a vector
21.3 Understand embedding
21.3.1 Embedding models
21.3.2 Embedding space
21.4 Using the REST API to run vector search queries
21.4.1 Set up your project
21.4.2 Create an Index
21.4.3 Upload Documents
21.4.4 Run Queries
22. Introduction to Azure OpenAI Service
22.1 Access Azure OpenAI Service
22.1.1 Create an Azure OpenAI Service resource in the Azure portal
22.1.2 Create an Azure OpenAI Service resource in Azure CLI
22.2 Use Azure AI Studio
22.2.1 Types of generative AI models
22.3 Deploy generative AI models
22.3.1 Deploy using Azure AI Studio
22.3.2 Deploy using Azure CLI
22.3.3 Deploy using the REST API
22.4 Use prompts to get completions from models
22.4.1 Prompt types
22.4.2 Completion quality
22.4.3 Making calls
22.5 Test models in Azure AI Studio's playground
22.5.1 Completions playground
22.5.2 Completions Playground parameters
22.5.3 Chat playground
23. Introduction to Prompt Engineering
23.1 Understand prompt engineering
23.1.1 Considerations for API endpoints
23.1.2 Adjusting model parameters
23.2 Write more effective prompts
23.2.1 Provide clear instructions
23.2.2 Format of instructions
23.2.3 Use section markers
23.2.4 Primary, supporting, and grounding content
23.2.5 Cues
23.3 Provide context to improve accuracy
23.3.1 Request output composition
23.3.2 System message
23.3.3 Conversation history
23.3.4 Few shot learning
23.3.5 Break down a complex task
23.3.6 Chain of thought
24. Retrieval Augmented Generation (RAG) with Azure OpenAI Service
24.1 Concepts of Retrieval Augmented Generation (RAG) with Azure OpenAI Service
24.1.1 Fine-tuning vs. RAG
24.2 Add your own data source
24.2.1 Connect your data
24.3 Chat with your model using your own data
24.3.1 Token considerations and recommended settings
24.3.2 Using the API
24.4 Lab Exercise – Implementing Retrieval Augmented Generation with Azure OpenAI Service
24.4.1 Provision Azure resources
24.4.2 Upload your data
24.4.3 Deploy AI models
24.4.4 Create an index
24.4.5 Prepare to develop an app in Visual Studio Code
24.4.6 Configure your application
24.4.7 Add code to use the Azure OpenAI service
24.4.8 Run your application
25. Transformer Architecture and Large Language Models
25.1.1 Understand artificial intelligence
25.1.2 Understand natural language processing
25.2 Understand statistical techniques used for natural language processing (NLP)
25.2.1 The beginnings of natural language processing (NLP)
25.2.2 Understanding tokenization
25.2.3 Statistical techniques for NLP
25.2.4 Understanding Naive Bayes
25.2.5 Understanding TF-IDF
25.3 Understanding Deep Learning techniques used for Natural Language Processing (NLP)
25.3.1 Word embeddings
25.3.2 Adding memory to NLP models
25.3.3 Using RNNs to include the context of a word
25.3.4 Improving RNNs with Long Short-Term Memory
25.4 Understanding Transformer Architecture used for natural language processing (NLP)
25.4.1 Understand multi-head attention
25.4.2 Using the scaled dot-product to compute the attention function
25.4.3 Explore the Transformer architecture
25.5 Explore foundation models in the model catalog
25.5.1 Explore the model catalog
25.5.2 Explore foundation models
25.6 Exercise Exploring Foundation models in the Model Catalog
25.6.1 Provision an Azure Machine Learning workspace
25.6.2 Explore the model catalog
25.6.3 Explore question answering models
25.6.4 Explore fill mask models
25.6.5 Explore translation models
25.6.6 Deploy the model to a real-time endpoint
25.6.7 Test the deployed translation model
26. Fine-tune a foundation model with Azure Machine Learning
26.1 Understand when to Fine-tune a Foundation Model
26.1.1 Explore foundation models in the model catalog
26.1.2 Fine-tuning foundation models for specific tasks
26.2 Procedures for Fine-tuning Foundation Models
26.2.1 Prepare your data and compute
26.2.2 Choose a foundation model
26.2.3 Configure a fine-tuning job
26.3 Evaluate, deploy, and test a fine-tuned foundation model
26.3.1 Evaluate your fine-tuned model
26.3.2 Deploy your fine-tuned model
26.3.3 Register your model using the Azure Machine Learning studio
26.3.4 Deploy your model using the Azure Machine Learning studio
26.3.5 Test the model in the Azure Machine Learning studio
26.4 Exercise Fine-tuning a foundation model in the Azure Machine Learning studio
26.4.1 Provision an Azure Machine Learning workspace
26.4.2 Create the workspace and upload the dataset
26.4.3 Explore the data
26.4.4 Create a compute cluster
26.4.5 Fine-tune the model
26.4.6 Test the model
26.4.7 Register the model
26.4.8 Deploy the model
26.4.9 Test the endpoint
27. Prompt Flow
27.1 Understand the development lifecycle of a large language model (LLM) app
27.1.1 Initialization
27.1.2 Experimentation
27.1.3 Evaluation and refinement
27.1.4 Production
27.1.5 Explore the complete development lifecycle
27.2 Core components and explore flow types
27.2.1 Understand a flow
27.2.2 Tools available in prompt flow
27.2.3 Types of flows
27.3 Connections and Runtimes
27.3.1 Connections
27.3.2 Runtimes
27.4 Variants and Monitoring options
27.4.1 Variants
27.4.2 Deploy your flow to an endpoint
27.4.3 Monitor evaluation metrics
27.4.4 Metrics
27.5 prompt flow in the Azure Machine Learning Studio
27.5.1 Create the Azure OpenAI service to deploy a GPT model
27.5.2 Provision an Azure OpenAI resource
27.5.3 Save the endpoint and key
27.5.4 Deploy a model
27.5.5 Provision the necessary resources to create a flow
27.5.6 Provision an Azure Machine Learning resource
27.5.7 Create a compute instance
27.5.8 Create a connection
27.5.9 Create and run a flow in Azure Machine Learning
27.5.10 Configure the inputs
27.5.11 Configure the LLM node
27.5.12 Configure the Python node
27.5.13 Configure the output
27.5.14 Run the flow
|