There are different ways to approach the problem:
Here are some examples of choosing the right approach.
The name Mechanical Turk was inspired by "The Turk", an 18th-century chess-playing automaton made by Wolfgang von Kempelen that toured Europe, beating both Napoleon Bonaparte and Benjamin Franklin. It was later revealed that this "machine" was not an automaton at all, but was, in fact, a human chess master hidden in the cabinet beneath the board and controlling the movements of a humanoid dummy. Likewise, the Mechanical Turk online service uses remote human labour hidden behind a computer interface to help employers perform tasks that are not possible using a true machine.
- Select a model that is a good fit for the objective
- Choose the proper ML approach for your objective (regression, binary classification, etc.); this requires knowing both your algorithm and your data
- Choose proper evaluation strategies for your model based on your objective
- Know the steps for training a model
- Understand concepts of Training Data and Testing Data
- Identify potential biases introduced in an insufficient split strategy
- Know when to use sequential splits versus randomized splits, and what additional measures could increase the value of the training data. A sequential split is what we would use for time-series data; for example, we might carve off the last 3 months as the test dataset. In other cases we would use randomized splits or k-fold cross-validation so that both the training and test datasets are representative.
- Multiple options for training: SageMaker Console, Apache Spark, custom code via the SDK, Jupyter Notebook
- Be familiar with default data types SageMaker algorithms support and the recommended format for best performance
- Know the difference between a hyperparameter and a parameter
- Understand the repository and container image concept for SageMaker training
- Understand the process if you wish to provide your own algorithm
- Understand the process for using Apache Spark to interact with SageMaker
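
The split strategies mentioned in the list above can be sketched in plain Python. This is a minimal illustration, not SageMaker functionality: the helper names (`sequential_split`, `randomized_split`, `k_fold_indices`) and the monthly `records` data are hypothetical and exist only for this example.

```python
import random
from datetime import date


def sequential_split(records, cutoff):
    """Time-series split: records before `cutoff` train, the rest test."""
    train = [r for r in records if r["day"] < cutoff]
    test = [r for r in records if r["day"] >= cutoff]
    return train, test


def randomized_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records, then hold out `test_fraction` for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; every index is tested exactly once.

    Folds are interleaved by position, so shuffle non-time-series data first.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]


# Hypothetical data: one record per month for a single year.
records = [{"day": date(2023, m, 1), "value": m * 10} for m in range(1, 13)]

# Carve off the last 3 months as the test set.
seq_train, seq_test = sequential_split(records, cutoff=date(2023, 10, 1))

# Randomly hold out a quarter of the records.
rand_train, rand_test = randomized_split(records, test_fraction=0.25, seed=42)

# Four folds over the twelve records.
folds = list(k_fold_indices(len(records), k=4))
```

With the sequential split, every test record is later than every training record, which avoids leaking future information into training. The randomized split and k-fold variants give no such ordering guarantee, which is why they suit data without a time dimension.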
{% embed url="https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html" %}
{% embed url="https://docs.aws.amazon.com/machine-learning/latest/dg/amazon-machine-learning-key-concepts.html" %}
{% embed url="https://d1.awsstatic.com/whitepapers/aws-managing-ml-projects.pdf" %}
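
To make the hyperparameter versus parameter distinction from the list above concrete, here is a small sketch using plain gradient descent on a one-variable linear model. It is an assumption-laden toy, not a SageMaker algorithm: `train_linear` and the sample data are hypothetical. The values chosen before training (`learning_rate`, `epochs`) are hyperparameters; the values learned from the data (`w`, `b`) are parameters.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on mean squared error.

    Hyperparameters: learning_rate, epochs (chosen by us before training).
    Parameters: w, b (learned from the data during training).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = train_linear(xs, ys)
print(f"learned parameters: w={w:.4f}, b={b:.4f}")
```

Changing `learning_rate` or `epochs` changes how training proceeds, while `w` and `b` are what training produces; hyperparameter tuning searches over the former to improve the latter.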