Developing a good model

There are different ways to approach the problem:

Here are some examples of choosing the right approach

Confusion Matrix

Bloom's Taxonomy

Data Preparation

K-Fold Cross-Validation Method

AWS Mechanical Turk

The name Mechanical Turk was inspired by "The Turk", an 18th-century chess-playing automaton made by Wolfgang von Kempelen that toured Europe, beating both Napoleon Bonaparte and Benjamin Franklin. It was later revealed that this "machine" was not an automaton at all, but was, in fact, a human chess master hidden in the cabinet beneath the board and controlling the movements of a humanoid dummy. Likewise, the Mechanical Turk online service uses remote human labour hidden behind a computer interface to help employers perform tasks that are not possible using a true machine.

Hyperparameters vs Parameters

Why GPUs?

Model design

Select a model that is a good fit for the objective
Choose the proper ML approach for your objective (regression, binary classification, etc.); this requires knowing both your algorithm and your data
Choose proper evaluation strategies for your model based on your objective
Know the steps for training a model

Data Preparation

Understand concepts of Training Data and Testing Data
Identify potential biases introduced by an inadequate split strategy
Know when to use sequential splits versus randomized splits, and what additional measures could be used to increase the value of the training data. A sequential split is what we would use for time-series data; for example, we might carve off the last 3 months as the test dataset. In other cases we would use a randomized split or k-fold cross-validation so that both the training and test datasets contain good, representative data.
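
The three split strategies mentioned above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a SageMaker API: the function names (`sequential_split`, `randomized_split`, `k_fold_splits`), the 25% test fraction, and the toy `months` list are all assumptions made for the example.

```python
import random

def sequential_split(records, test_fraction=0.25):
    """Hold out the most recent records; assumes records are already in time order."""
    cut = len(records) - int(len(records) * test_fraction)
    return records[:cut], records[cut:]

def randomized_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records, then hold out a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_splits(records, k=4):
    """Yield (train, test) pairs so that every record is tested exactly once."""
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]

# Hypothetical time-series data: one record per month of a year.
months = list(range(1, 13))

# Sequential: train on months 1-9, test on the last 3 months.
seq_train, seq_test = sequential_split(months)

# Randomized: any month can land in either set.
rnd_train, rnd_test = randomized_split(months)

# K-fold: 4 rounds, each holding out a different quarter of the data.
folds = list(k_fold_splits(months, k=4))
```

With time-series data, the randomized and k-fold variants would let the model train on records that come after some of its test records, which is why the sequential split is preferred there.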

Model Training

Multiple options for training: SageMaker Console, Apache Spark, custom code via the SDK, Jupyter Notebook
Be familiar with default data types SageMaker algorithms support and the recommended format for best performance
Know the difference between a hyperparameter and a parameter
Understand the repository and container image concept for SageMaker training
Understand the process if you wish to provide your own algorithm
Understand the process for using Apache Spark to interact with SageMaker
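
To make the hyperparameter versus parameter distinction in the list above concrete, here is a minimal sketch in plain Python (not SageMaker code). The function `train_linear` and the toy data are hypothetical. `learning_rate` and `epochs` are hyperparameters because we choose them before training starts; the weight `w` is a parameter because the training loop learns it from the data.

```python
def train_linear(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # parameter: learned from the data during training
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

# Toy data generated from y = 2x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Hyperparameters: set by us, not learned.
w = train_linear(xs, ys, learning_rate=0.01, epochs=500)
```

Changing the hyperparameters changes how (and whether) training converges, while the parameter value is whatever training produces; on this toy data, a learning rate of 0.2 makes the updates overshoot and diverge instead of settling near 2.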

{% embed url="https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html" %}

{% embed url="https://docs.aws.amazon.com/machine-learning/latest/dg/amazon-machine-learning-key-concepts.html" %}

{% embed url="https://d1.awsstatic.com/whitepapers/aws-managing-ml-projects.pdf" %}
