Create Your First Experiment

Follow these steps to run your first experiment.

Prerequisites

You must have a running HPE Machine Learning Development Environment cluster with the CLI installed.

  • To set up a local cluster, visit Quick Installation.

  • To set up a remote cluster, visit the Installation Guide where you’ll find options for On Prem, AWS, GCP, Kubernetes, and Slurm.

Run an Experiment

In this example, you train a single model for a fixed number of batches, using constant values for all hyperparameters, on a single slot. A slot is a computing device (a CPU or a GPU) on which the HPE Machine Learning Development Environment master schedules work.

Note

To run an experiment in a local training environment, your HPE Machine Learning Development Environment cluster requires only a single CPU or GPU. A cluster is made up of a master and one or more agents. A single machine can serve as both a master and an agent.
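
Before creating an experiment, it can help to confirm that the CLI can reach the master and that at least one agent with a slot is registered. The following is a minimal sketch, assuming the local master address used later in this guide (http://localhost:8080) and that the det CLI is on your PATH; the exact output depends on your installation.

```shell
# Assumption: the master runs locally at the address used in this guide.
export DET_MASTER="http://localhost:8080"

if command -v det >/dev/null 2>&1; then
    # List the agents registered with the master.
    det agent list || echo "Could not list agents; is the master running?" >&2
    # List the slots (CPU or GPU devices) those agents provide.
    det slot list || echo "Could not list slots; is the master running?" >&2
else
    echo "det CLI not found on PATH; install it before continuing." >&2
fi
```

If no agents or slots are listed, the experiment is likely to stay queued until an agent connects to the master.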

  1. Download and extract the tar file: mnist_pytorch.tgz.

  2. Open a terminal window and navigate to the directory where you extracted the tar file.

    The const.yaml file is a YAML-formatted experiment configuration file that corresponds to an example experiment.

  3. Create an experiment from the const.yaml configuration file by running the following CLI command:

    det experiment create const.yaml .
    

    The final dot (.) argument uploads all of the files in the current directory as the context directory for your model. HPE Machine Learning Development Environment copies the model context directory contents to the trial container working directory.

  4. To view the experiment in your browser:

    • Enter the following URL: http://localhost:8080/. This is the cluster address for your local training environment.

    • Accept the default username of determined, and click Sign In. A password is not required.

  5. Navigate to the home page and then visit your Uncategorized experiments.

    [Figure: Determined AI WebUI Dashboard showing a user's recent experiment submissions]

  6. Select the experiment to display its details, such as Metrics.

    [Figure: Determined AI WebUI Dashboard showing details for a local experiment]
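
Steps 1 through 3 above can be sketched as a small shell script. This is an illustration under stated assumptions, not an official workflow: the function names find_context_dir and run_first_experiment and the quickstart working directory are hypothetical, and the script assumes that mnist_pytorch.tgz has already been downloaded and that the det CLI is installed and pointed at your cluster.

```shell
# Print the directory under the given root that contains const.yaml.
# Returns a non-zero status if no const.yaml is found.
find_context_dir() {
    config_path=$(find "$1" -name const.yaml -type f | head -n 1)
    [ -n "$config_path" ] || return 1
    dirname "$config_path"
}

# Extract the tar file, locate the configuration, and create the experiment.
#   $1: path to the downloaded tar file
#   $2: directory to extract into (hypothetical working directory)
run_first_experiment() {
    tarball=$1
    workdir=$2
    mkdir -p "$workdir" || return 1
    tar -xzf "$tarball" -C "$workdir" || return 1
    context_dir=$(find_context_dir "$workdir") || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    # The final dot uploads the current directory as the model context.
    (cd "$context_dir" && det experiment create const.yaml .)
}
```

For example, calling run_first_experiment mnist_pytorch.tgz quickstart extracts the example into quickstart and then runs the same det experiment create const.yaml . command shown in step 3.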

Learn More

Want to learn how to adapt your existing model code to HPE Machine Learning Development Environment?

An experiment's behavior is defined by its experiment configuration file, which is written in YAML. You typically pass the configuration file as a command-line argument when you create an experiment with the CLI.

  • Visit the Experiment Configuration Reference for a complete description of the experiment configuration file.

  • Visit the Core API User Guide for a walk-through of how to adapt your existing model code to HPE Machine Learning Development Environment using the PyTorch MNIST model.
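
To make the shape of an experiment configuration file concrete, the sketch below writes a small YAML file modeled on the experiment described in this guide: a single trial, constant hyperparameters, a fixed number of batches, and one slot. The file name const_sketch.yaml, the hyperparameter names and values, and the entrypoint are illustrative assumptions, not the contents of the const.yaml shipped in mnist_pytorch.tgz, and some fields may differ between releases; check the Experiment Configuration Reference for the fields your version supports.

```shell
# Write an illustrative configuration file (hypothetical name and values).
cat > const_sketch.yaml <<'EOF'
name: mnist_pytorch_const_sketch
# Constant values: every hyperparameter is fixed rather than searched.
hyperparameters:
  learning_rate: 1.0
  global_batch_size: 64
# A "single" searcher trains exactly one trial.
searcher:
  name: single
  metric: validation_loss
  smaller_is_better: true
  # Train for a fixed number of batches.
  max_length:
    batches: 1000
# Request one slot (one CPU or GPU) for the trial.
resources:
  slots_per_trial: 1
# Assumed entrypoint; the real example defines its own.
entrypoint: model_def:MNistTrial
EOF
```

A file like this would be passed to the CLI in the same way as const.yaml, from a directory that contains the matching model code.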

Deep Dive Quick Start

To learn how to change your configuration settings to run a distributed training job on multiple GPUs, visit the Quickstart for Model Developers.

More Tutorials

For more quick-start guides, including API guides, visit the Tutorials.