Install Determined Using Linux Packages#

This user guide provides step-by-step instructions for installing and upgrading Determined using Linux packages.

Determined releases Debian and RPM packages for installing the Determined master and agent as systemd services on machines running Linux.

You have two options for installing the Determined master and agent:

  • Using Debian packages on Ubuntu 20.04 or 22.04, or

  • Using RPM packages on Enterprise Linux distributions (e.g., AlmaLinux, Oracle Linux, Red Hat Enterprise Linux, or Rocky Linux).

Preliminary Setup#

PostgreSQL#

Determined uses a PostgreSQL database to store experiment and trial metadata. You may either use a Docker container or your Linux distribution’s package and service.

Note

If you are using an existing PostgreSQL installation, we recommend confirming that max_connections is at least 96, which is sufficient for Determined.
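
One way to check this setting on an existing server is to query it with psql and compare the result against the recommended minimum. The following is a minimal sketch; the helper name `has_enough_connections` is hypothetical, and the example invocation assumes the `postgres` superuser account is reachable through sudo.

```shell
# Hypothetical helper: succeeds if the given max_connections value
# meets the minimum of 96 recommended above.
has_enough_connections() {
    [ "$1" -ge 96 ]
}

# Example (run manually; requires a local PostgreSQL server):
#   current=$(sudo -u postgres psql -tAc 'SHOW max_connections;')
#   has_enough_connections "$current" || echo "max_connections is $current; raise it to at least 96"
```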

Run PostgreSQL in Docker#

  1. Pull the official Docker image for PostgreSQL. We recommend using version 10 or greater.

    docker pull postgres:10
    

    This image is not provided by Determined AI; please see its Docker Hub page for more information.

  2. Start PostgreSQL as follows:

    docker run \
        -d \
        --restart unless-stopped \
        --name determined-db \
        -p 5432:5432 \
        -v determined_db:/var/lib/postgresql/data \
        -e POSTGRES_DB=determined \
        -e POSTGRES_PASSWORD=<Database password> \
        postgres:10
    

    If the master will connect to PostgreSQL via Docker networking, exposing port 5432 via the -p argument isn’t necessary; however, you may still want to expose it for administrative or debugging purposes. In order to expose the port only on the master machine’s loopback network interface, pass -p 127.0.0.1:5432:5432 instead of -p 5432:5432.
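
The two port-publishing choices described above can be captured in a small wrapper. This is only a sketch: the function name `start_determined_db` and the `DOCKER` override variable are hypothetical, and the wrapper defaults to the loopback-only form rather than `-p 5432:5432`.

```shell
# Hypothetical wrapper around the docker run command shown above.
# Set DOCKER=echo to print the command instead of running it.
start_determined_db() {
    # $1: database password
    # $2: publish spec for -p (optional; defaults to loopback only)
    ${DOCKER:-docker} run \
        -d \
        --restart unless-stopped \
        --name determined-db \
        -p "${2:-127.0.0.1:5432:5432}" \
        -v determined_db:/var/lib/postgresql/data \
        -e POSTGRES_DB=determined \
        -e POSTGRES_PASSWORD="$1" \
        postgres:10
}
```

For example, `start_determined_db '<Database password>'` publishes the port on the loopback interface only, while `start_determined_db '<Database password>' 5432:5432` publishes it on all interfaces.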

Install PostgreSQL using apt or yum#

  1. Install PostgreSQL 10 or greater.

    Debian Distributions

    On Debian distributions, use the following command:

    sudo apt install postgresql-10
    

    Enterprise Linux Distributions

    On Enterprise Linux distributions, first configure the PostgreSQL yum repository as described in the Enterprise Linux documentation. Then install the server package, initialize the database cluster, and start and enable the service:

    sudo yum install postgresql-server -y
    sudo postgresql-setup initdb
    sudo systemctl start postgresql.service
    sudo systemctl enable postgresql.service
    
  2. The authentication methods enabled by default may vary depending on the provider of your PostgreSQL distribution. To enable the determined-master to connect to the database, ensure that an appropriate authentication method is configured in the pg_hba.conf file.

    When configuring the database connection as described in Configure and Start the Cluster, note the following:

    • If you specify the db.host property, you must use a PostgreSQL host (TCP/IP) connection.

    • If you omit the db.host property, you must use a PostgreSQL local (Unix domain socket) connection.

  3. Finally, create a database for Determined’s use and configure a system account that Determined will use to connect to the database.

    For example, executing the following commands will create a database named determined, create a user named determined with the password determined-password, and grant the user access to the database:

    sudo -u postgres psql
    postgres=# CREATE DATABASE determined;
    postgres=# CREATE USER determined WITH ENCRYPTED PASSWORD 'determined-password';
    postgres=# GRANT ALL PRIVILEGES ON DATABASE determined TO determined;
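
Before configuring the master, it can help to confirm that the new account can actually log in over a host (TCP/IP) connection. The sketch below builds a libpq connection URI for psql. The helper name `pg_conn_uri` is hypothetical, and the values in the example are the sample ones used above. A password containing characters such as `@` or `/` would need percent-encoding, which this sketch does not do.

```shell
# Hypothetical helper: build a libpq connection URI from its parts.
pg_conn_uri() {
    # $1: user, $2: password, $3: host, $4: port, $5: database
    printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example (run manually; requires the psql client and a reachable server):
#   psql "$(pg_conn_uri determined determined-password 127.0.0.1 5432 determined)" -c 'SELECT 1;'
```

If the query fails with an authentication error, the pg_hba.conf entries for host connections are the first thing to review.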
    

Install the Determined Master and Agent#

  1. Find the latest release of Determined by visiting the Determined repo.

  2. Download the appropriate Debian or RPM package file, which will have the name determined-master_VERSION_linux_amd64.[deb|rpm] (where VERSION is the actual version, e.g., 0.31.0). Similarly, the agent package is named determined-agent_VERSION_linux_amd64.[deb|rpm].

  3. Install the master package on one machine in your cluster, and the agent package on each agent machine.

    Debian Distributions

    On Debian distributions, use the following command:

    sudo apt install <path to downloaded package>
    

    Enterprise Linux Distributions

    On Enterprise Linux distributions, use the following command during the initial installation:

    sudo rpm -i <path to downloaded package>
    

    When upgrading, follow the instructions in the upgrade section.

    Before running the Determined agent, install Docker on each agent machine. If the machine has GPUs, ensure that the NVIDIA Container Toolkit is working as expected.
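
The package naming scheme from step 2 can be expressed as a small helper, which is convenient when scripting installs across several machines. The function name `determined_pkg_name` is hypothetical; the version in the example is the sample value mentioned above, not necessarily the latest release.

```shell
# Hypothetical helper: print the package file name for a component.
determined_pkg_name() {
    # $1: component (master or agent)
    # $2: version, e.g. 0.31.0
    # $3: package type (deb or rpm)
    printf 'determined-%s_%s_linux_amd64.%s\n' "$1" "$2" "$3"
}

# Example (run manually in the directory containing the download):
#   sudo apt install "./$(determined_pkg_name master 0.31.0 deb)"
#   sudo rpm -i "$(determined_pkg_name agent 0.31.0 rpm)"
```

Note that apt treats an argument as a local file only when it contains a slash, hence the leading `./` in the Debian example.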

Configure and Start the Cluster#

  1. Ensure that an instance of PostgreSQL is running and accessible from the machine where the Determined master will run.

  2. Edit the YAML configuration files at /etc/determined/master.yaml (for the master) and /etc/determined/agent.yaml (for each agent) as appropriate for your setup.

    Important

    Ensure that the user, password, and database name correspond to your PostgreSQL configuration.

    db:
      host: <PostgreSQL server IP or hostname, e.g., 127.0.0.1 if running on the master>
      port: <PostgreSQL port, e.g., 5432 by default>
      name: <Database name, e.g., determined>
      user: <PostgreSQL user, e.g., postgres>
      password: <Database password>
    
  3. Start the master by typing the following command:

    sudo systemctl start determined-master
    

    Note

    You can also run the master directly using the command determined-master. This may be useful when experimenting with Determined, such as when you want to quickly test different configuration options before writing them to the configuration file.

  4. Optionally, configure the master to start on boot.

    sudo systemctl enable determined-master
    
  5. Verify that the master started successfully by viewing the log.

    journalctl -u determined-master
    

    You should see logs indicating that the master can successfully connect to the database, and the last line should indicate http server started on the configured WebUI port (8080 by default). You can also validate that the WebUI is running by navigating to http://<master>:8080 with your web browser (or https://<master>:8443 if TLS is enabled). You should see No Agents on the right-hand side of the top navigation bar.

  6. Start the agent on each agent machine.

    sudo systemctl start determined-agent
    

    Similarly, the agent can be run with the command determined-agent.

  7. Optionally, configure the agent to start on boot.

    sudo systemctl enable determined-agent
    
  8. Verify that each agent started successfully by viewing the log.

    journalctl -u determined-agent
    

    You should see logs indicating that the agent started successfully, detected compute devices, and connected to the master. On the Determined WebUI, you should now see available slots on the right-hand side of the top navigation bar and in the Cluster view, which you can select from the left-hand navigation panel.
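
The master checks in step 5 can also be scripted. The following sketch assumes the default ports mentioned above (8080, or 8443 with TLS); the helper names `master_url` and `master_log_ok` are hypothetical.

```shell
# Hypothetical helper: print the WebUI URL for a master host.
# $1: master hostname or IP; $2: "tls" if TLS is enabled (optional)
master_url() {
    if [ "${2:-}" = "tls" ]; then
        printf 'https://%s:8443\n' "$1"
    else
        printf 'http://%s:8080\n' "$1"
    fi
}

# Hypothetical helper: read log text on stdin and succeed if it
# contains the "http server started" message described in step 5.
master_log_ok() {
    grep -q 'http server started'
}

# Example (run manually on the master machine):
#   journalctl -u determined-master | master_log_ok && echo "master log looks healthy"
#   curl -sf "$(master_url localhost)" >/dev/null && echo "WebUI is reachable"
```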

Socket Activation#

The master can be configured to use systemd socket activation, allowing it to be started automatically on demand (e.g., when a client makes a network connection to the port) and restarted with reduced loss of connection state. To switch to socket activation, run the following commands:

sudo systemctl disable --now determined-master
sudo systemctl enable --now determined-master.socket

When socket activation is in use, the port on which the master listens is configured differently; the port listed in the master config file is not used, since systemd manages the listening socket. The default socket unit for Determined is configured to listen on port 8080. To use a different port, run:

sudo systemctl edit determined-master.socket

This command opens a text editor. To change the listening port, insert the following text (substituting the desired port number), then save the file and exit the editor:

[Socket]
ListenStream=
ListenStream=0.0.0.0:<port>

For example, you might want to configure the master to listen on port 80 for HTTP traffic or on port 443 if using TLS.

After updating the configuration, run the following commands to put the change into effect (this will restart the master):

sudo systemctl stop determined-master
sudo systemctl restart determined-master.socket
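
For unattended setups where an interactive editor is inconvenient, the same override can be generated by a script. This is a sketch under the assumption that the drop-in lives at the standard systemd location `/etc/systemd/system/determined-master.socket.d/override.conf`, which is where `systemctl edit` normally writes it; the helper name `socket_override` is hypothetical.

```shell
# Hypothetical helper: print a drop-in override that makes the socket
# unit listen on the given port. The empty ListenStream= line clears
# the default address before the new one is added.
socket_override() {
    printf '[Socket]\nListenStream=\nListenStream=0.0.0.0:%s\n' "$1"
}

# Example (run manually; assumes the standard drop-in directory):
#   sudo mkdir -p /etc/systemd/system/determined-master.socket.d
#   socket_override 443 | sudo tee /etc/systemd/system/determined-master.socket.d/override.conf
#   sudo systemctl daemon-reload
```

After writing the file this way, the stop and restart commands above are still needed to apply the change.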

See the systemd documentation on socket unit files or systemctl for more information.

Manage the Cluster#

To configure a service to start running automatically when its machine boots up, run sudo systemctl enable <service>, where the service is determined-master or determined-agent. You can also use sudo systemctl enable --now <service> to enable and immediately start a service in one command.

To view the logging output of a service, run journalctl -u <service>.

To manually stop a service, run sudo systemctl stop <service>.
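
To see at a glance whether the services are enabled at boot and currently running, the `is-enabled` and `is-active` subcommands of systemctl can be combined. The helper below is a sketch; the name `service_report` and the `SYSTEMCTL` override variable are hypothetical.

```shell
# Hypothetical helper: report the boot and run state of each named service.
# Set SYSTEMCTL to another command to substitute for systemctl when testing.
service_report() {
    for svc in "$@"; do
        # Both subcommands exit non-zero for disabled/inactive units,
        # so failures are tolerated and reported as text instead.
        enabled=$(${SYSTEMCTL:-systemctl} is-enabled "$svc" 2>/dev/null) || true
        active=$(${SYSTEMCTL:-systemctl} is-active "$svc" 2>/dev/null) || true
        printf '%s: enabled=%s active=%s\n' "$svc" "${enabled:-unknown}" "${active:-unknown}"
    done
}

# Example (run manually):
#   service_report determined-master determined-agent
```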

Upgrade the Cluster#

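To upgrade, install the newer master and agent packages over the existing installation, following the steps in Install the Determined Master and Agent.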

Note

Enterprise Linux Distributions

When installing the Determined master and agent during the upgrade process, use the following command:

sudo rpm -U <path to downloaded package>

Once the upgrade is completed, reload and restart determined-master.service:

sudo systemctl daemon-reload
sudo systemctl restart determined-master.service

Note

Upgrading does not interrupt jobs that are running on the cluster.
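
The upgrade steps for the master can be summarized as a dry-run helper that prints the commands for a given package file. This is a sketch, not an official procedure: the function name `upgrade_master_steps` is hypothetical, and it assumes the reload and restart apply after both Debian and RPM upgrades, which the text above does not state explicitly.

```shell
# Hypothetical helper: print the commands for upgrading the master from
# a downloaded package, chosen by the package file's extension.
upgrade_master_steps() {
    case "$1" in
        *.deb) printf 'sudo apt install %s\n' "$1" ;;
        *.rpm) printf 'sudo rpm -U %s\n' "$1" ;;
        *) printf 'unsupported package: %s\n' "$1" >&2; return 1 ;;
    esac
    # Assumption: reload unit files and restart the master in both cases.
    printf 'sudo systemctl daemon-reload\n'
    printf 'sudo systemctl restart determined-master.service\n'
}
```

Review the printed commands before running them; paths containing spaces would need quoting.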