Agent Configuration Reference
Path to the agent configuration file. Normally this should only be set via an environment variable
or command-line option. Defaults to
Required. The hostname or IP address of the Determined master.
The port of the Determined master. Defaults to
443 if TLS is enabled and
The ID of this agent; defaults to the hostname of the current machine. Agent IDs must be unique within a cluster.
Master hostname that containers started by this agent will connect to. Defaults to the value of
Master port that containers started by this agent will connect to. Defaults to the value of
Which resource pool the agent should join. Defaults to the value of
default, which will work if
and only if there is a resource pool named
default. For more information please see
The GPUs that should be exposed as slots by the agent. A comma-separated list of GPUs, each
specified by a 0-based index, UUID, PCI bus ID, or board serial number. The 0-based index of NVIDIA
GPUs or AMD GPUs can be obtained via the
The slot type that should be exposed. Dynamic agents having GPUs will be configured to
agents without GPUs with
cpu_slots_allowed: true provisioner option will be configured to
none otherwise. For static agents this field defaults to
auto: Automatically detects the slot type. The agent will detect if there are NVIDIA GPUs or AMD GPUs. If there are GPUs, it maps each GPU to one slot. Otherwise, it maps all the CPUs to a slot.
none: The agent will not create any slots for detected devices.
cuda: The agent will map each detected NVIDIA GPU to a slot. Prior to Determined 0.17.6, this
option was called
cpu: Map all the CPUs to a slot, even when GPUs are present.
rocm: The agent will map each detected ROCm AMD GPU to a slot.
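The slot type values above can be summarized as a small mapping from detected devices to slots. The following is an illustrative sketch only, not the agent's implementation: the function name `slots_for`, the slot labels, and the choice to check NVIDIA GPUs before AMD GPUs under `auto` are all assumptions made for the example.

```python
from typing import List


def slots_for(slot_type: str, nvidia_gpus: List[str], amd_gpus: List[str]) -> List[str]:
    """Return one label per slot for the given slot type (hypothetical helper)."""
    if slot_type == "none":
        # No slots are created, whatever devices were detected.
        return []
    if slot_type == "cuda":
        # One slot per detected NVIDIA GPU.
        return [f"cuda:{gpu}" for gpu in nvidia_gpus]
    if slot_type == "rocm":
        # One slot per detected ROCm AMD GPU.
        return [f"rocm:{gpu}" for gpu in amd_gpus]
    if slot_type == "cpu":
        # All CPUs collapse into a single slot, even when GPUs are present.
        return ["cpu"]
    if slot_type == "auto":
        # Assumption: NVIDIA GPUs are checked first; the reference does not
        # state what happens when both vendors are present.
        if nvidia_gpus:
            return slots_for("cuda", nvidia_gpus, amd_gpus)
        if amd_gpus:
            return slots_for("rocm", nvidia_gpus, amd_gpus)
        return ["cpu"]
    raise ValueError(f"unknown slot type: {slot_type!r}")


if __name__ == "__main__":
    print(slots_for("auto", ["0", "1"], []))
    print(slots_for("cpu", ["0", "1"], []))
```

The point of the sketch is the contrast between `auto` and `cpu`: with GPUs present, `auto` yields one slot per GPU, while `cpu` always yields a single slot.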
The HTTP proxy address for the agent’s containers.
The HTTPS proxy address for the agent’s containers.
The FTP proxy address for the agent’s containers.
The addresses that the agent’s containers should not proxy.
Security-related configuration settings.
Configuration settings for TLS.
enabled: Whether to use TLS to connect to the master. Defaults to
skip_verify: Skip verifying the master certificate when using TLS. Defaults to
false. Enabling this setting will reduce the security of your Determined cluster.
master_cert: CA cert file for the master when using TLS.
master_cert_name: A hostname for which the master’s TLS certificate is valid, if the value of the
master_host option is an IP address or is not contained in the certificate.
client_key: Paths to files containing the client TLS certificate and key to use when connecting to the master.
Docker image to use for the managed Fluent Bit daemon. Defaults to
TCP port for the Fluent Bit daemon to listen on. Defaults to port 24224. Should be unique when running multiple agents on the same node.
Name for the Fluent Bit container. Defaults to
determined-fluent. Should be unique when running
multiple agents on the same node.
Maximum number of times the agent will attempt to reconnect to the master after a connection failure. Defaults to 5.
Time interval between reconnection attempts, in seconds. Defaults to 5 seconds.
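How the two reconnection settings interact can be sketched as a bounded retry loop with a fixed wait between attempts. This is a minimal illustration under stated assumptions, not the agent's actual logic: the function `reconnect`, its parameter names, and the decision not to wait after the final failed attempt are hypothetical. The defaults mirror the documented values of 5 attempts and 5 seconds.

```python
import time
from typing import Callable


def reconnect(
    connect: Callable[[], bool],
    attempts: int = 5,
    backoff_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try `connect` up to `attempts` times, waiting between tries.

    Returns True as soon as one attempt succeeds, False if all fail.
    """
    for attempt in range(attempts):
        if connect():
            return True
        # Assumption: wait only between attempts, not after the last one.
        if attempt < attempts - 1:
            sleep(backoff_seconds)
    return False


if __name__ == "__main__":
    # A stand-in connection function that always fails; short waits for the demo.
    ok = reconnect(lambda: False, attempts=3, backoff_seconds=0.1)
    print("connected" if ok else "gave up")
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays.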
Whether to disable setting the
AutoRemove flag on task containers. Defaults to false.
Configuration for commands to run when certain events occur. The value of each option in this section is an array of strings specifying the command and its arguments.
A command to run when the agent fails to either connect to the master on startup or reconnect after
a loss of connection. When reconnecting, the agent will make several attempts as specified by the
agent_reconnect_backoff configuration options.
In order to shut down the machine on which the agent is running, set this to
"now"], or just
["shutdown", "now"] if the agent is running as root. Additional system
configuration may be required in order to allow the agent to execute the command from inside a
Docker container or without the need to enter a password.
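Because each hook value is an array of strings holding the command and its arguments, it corresponds directly to an argument vector that can be executed without a shell. The sketch below illustrates that idea only; `run_hook` is a hypothetical helper, not part of Determined, and the example command is a harmless placeholder.

```python
import subprocess
import sys
from typing import List


def run_hook(command: List[str]) -> int:
    """Run a hook given as [program, arg1, ...] and return its exit code.

    The list is passed as an argument vector, so no shell parsing or
    quoting is involved.
    """
    if not command:
        raise ValueError("hook command must contain at least the program name")
    completed = subprocess.run(command, check=False)
    return completed.returncode


if __name__ == "__main__":
    # Placeholder hook: invoke the current Python interpreter and do nothing.
    print(run_hook([sys.executable, "-c", "pass"]))
```

Since the list form bypasses the shell, features such as pipes or variable expansion would not be available inside a hook written this way.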
Deprecated. This field has been deprecated and will be ignored. Use