Commands and Shells
HPE Machine Learning Development Environment commands and shells provide support for running code on an HPE Machine Learning Development Environment cluster without writing a model. This page describes how to manage GPU-powered batch commands and interactive shells.
Commands and shells are started through the HPE Machine Learning Development Environment command-line interface (CLI). To learn more, including installation instructions, visit the HPE Machine Learning Development Environment CLI user guide or HPE Machine Learning Development Environment CLI Reference.
Commands execute a user-specified program on the cluster and are useful for running existing code in batch mode. Shells start SSH servers on the cluster, giving you interactive access to cluster resources through SSH sessions.
HPE Machine Learning Development Environment commands are managed with CLI commands starting with det command, abbreviated as det cmd. The main subcommand is det cmd run, which runs a command in the cluster and streams its output. For example, the following CLI command uses nvidia-smi to display information about the GPUs available to tasks in the container:
det cmd run nvidia-smi
You can also run more complex commands, including shell constructs, provided they are quoted to prevent interpretation by the local shell:
det cmd run 'for x in a b c; do echo $x; done'
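The effect of quoting can be checked locally without a cluster. The following is a minimal sketch that uses only standard shell features; the variable names are illustrative and not part of the CLI. It shows what the local shell would hand to a program for each quoting style.

```shell
# A variable that happens to be set in the local shell.
x=local

# Double quotes: the local shell substitutes $x before the string is passed on.
double_quoted="for x in a b c; do echo $x; done"

# Single quotes: the string is passed on verbatim, so $x is not expanded locally.
single_quoted='for x in a b c; do echo $x; done'

printf '%s\n' "$double_quoted"   # for x in a b c; do echo local; done
printf '%s\n' "$single_quoted"   # for x in a b c; do echo $x; done
```

Only the single-quoted form reaches det cmd run with the loop variable intact, which is why the example above uses single quotes.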
det cmd run streams output from the command until it finishes, but the command continues executing and occupying cluster resources even if the CLI is interrupted or killed, for example by Ctrl-C. To stop the command or view additional output, you need the command UUID, which you can get from the output of the original det cmd run or from det cmd list. After you have the UUID, run:
det cmd logs <UUID> to view a snapshot of logs.
det cmd logs -f <UUID> to view the current logs and continue streaming future output.
det cmd kill <UUID> to stop the command.
Shell-related CLI commands start with det shell. To start a persistent SSH server container in the HPE Machine Learning Development Environment cluster and connect an interactive session to it, run det shell start:
det shell start
After starting a server with det shell start, you can make another independent connection to the same server by running det shell open <UUID>. You can get the UUID from the output of the det shell start or det shell list command:
$ det shell list
 Id                                   | Owner      | Description                  | State   | Exit Status
--------------------------------------+------------+------------------------------+---------+---------------
 d75c3908-fb11-4fa5-852c-4c32ed30703b | determined | Shell (annually-alert-crane) | RUNNING | N/A
$ det shell open d75c3908-fb11-4fa5-852c-4c32ed30703b
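When scripting, the UUID can be pulled out of the table rather than copied by hand. The sketch below assumes the pipe-delimited layout shown in the sample output above, which may differ between CLI versions. running_ids is a hypothetical helper, not part of the det CLI, and the heredoc stands in for real det shell list output so the snippet runs without a cluster.

```shell
# Hypothetical helper: read a det shell list table on stdin and print the
# Id column of every row whose State column is RUNNING.
running_ids() {
  awk -F'|' '$4 ~ /RUNNING/ { gsub(/[[:space:]]/, "", $1); print $1 }'
}

# Sample table copied from the output above; replace the heredoc with
# "det shell list | running_ids" on a real cluster.
ids=$(running_ids <<'EOF'
 Id                                   | Owner      | Description                  | State   | Exit Status
--------------------------------------+------------+------------------------------+---------+---------------
 d75c3908-fb11-4fa5-852c-4c32ed30703b | determined | Shell (annually-alert-crane) | RUNNING | N/A
EOF
)
printf '%s\n' "$ids"
```

The header and separator rows are skipped because their fourth field does not contain RUNNING, so the helper does not depend on a fixed number of header lines.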
Optionally, you can provide extra options to pass to the SSH client when using det shell start or det shell open by including them after --. For example, this command starts a new shell and forwards a port from the local machine to the container:
det shell start -- -L8080:localhost:8080
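The option after -- is the standard SSH client local-forwarding flag, whose form is -L followed by local port, host, and remote port. As a small sketch of how the pieces fit together, the hypothetical helper forward_flag below (not part of the det CLI) builds the flag from the two port numbers and prints the resulting command instead of running it.

```shell
# Hypothetical helper: build an SSH local-forward option.
# $1 = port on the local machine, $2 = port inside the container.
forward_flag() {
  printf '%s%s:localhost:%s' '-L' "$1" "$2"
}

flag=$(forward_flag 8080 8080)

# Print the command rather than executing it, so the sketch works without a cluster.
echo "det shell start -- $flag"   # det shell start -- -L8080:localhost:8080
```

Here localhost is resolved on the remote side of the SSH connection, so it refers to the container rather than to the local machine.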
To stop the SSH server container and free cluster resources, run det shell kill <UUID>.