Slurm Compute Node Deployment Guide

The following instructions are for deploying the Slurm Compute/Client agents.

Prerequisites

This guide is written for a Red Hat Enterprise Linux 8 based operating system running within a cluster of systems. The following are the prerequisites:

Deployment Scripts

It is assumed that the slurm-22.05.6.rpm.tar.gz file generated by the Slurm Controller Deployment Guide has been uploaded to the system on which the compute node will be deployed.

An example Bash script implementing these instructions is provided: deploy-slurm-compute-node.sh

Deployment Steps

Note

Instructions assume execution using the root account.
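
Because every step below modifies system files or services, a scripted version of these instructions may want to stop early when it is not run as root. The following is a minimal sketch only; `is_root` is a hypothetical helper name and is not taken from the provided deploy-slurm-compute-node.sh script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the effective UID is 0 (root).
is_root() {
    [ "$(id -u)" -eq 0 ]
}

# Example guard for the top of a deployment script (commented out so that
# sourcing this snippet has no side effects):
# is_root || { echo "This script must be run as root." >&2; exit 1; }
```

Placing the guard before the first step avoids a partially applied deployment when the script is started from an unprivileged account.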

  1. Install dependencies:

dnf -y install munge munge-libs hwloc hwloc-libs ncurses curl rrdtool mariadb \
    mariadb-devel ibacm infiniband-diags
  1. Exclude DNF/Yum Slurm packages:

cat >> /etc/dnf/dnf.conf <<EOL

# Exclude slurm Packages
excludepkgs=slurm*

EOL
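
Note that `cat >>` appends unconditionally, so running the step above a second time adds a second copy of the exclude line. If the step may be re-run, an idempotent append is one option. This is a sketch under that assumption; `ensure_line` is a hypothetical helper, not part of DNF.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if an identical line is not
# already present, so that re-running the step does not create duplicates.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (commented out; it would modify the real DNF configuration):
# ensure_line /etc/dnf/dnf.conf 'excludepkgs=slurm*'
```

The single quotes in the example keep the shell from expanding `slurm*` against files in the current directory.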
  1. Create the deploy directory and go to it:

mkdir -p /root/tmp
cd /root/tmp
  1. Upload TAR file:

Note

Any file transfer method can be used.

scp slurm-22.05.6.rpm.tar.gz root@comp01.engwsc.example.com:/root/tmp
  1. Extract RPMs:

Note

It is recommended that the original installation files be deleted at the completion of this guide.

tar -xvf slurm-22.05.6.rpm.tar.gz
  1. Copy Munge authentication key:

mkdir -p /etc/munge
cp -f munge.key /etc/munge/munge.key

chmod 700 /etc/munge
chmod 600 /etc/munge/munge.key
chown -R munge:munge /etc/munge
  1. Enable Munge service:

systemctl enable --now munge
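
If the Munge service fails to start, the key file's ownership and permissions are a reasonable first thing to check. The sketch below verifies a file's octal mode; `has_mode` is a hypothetical helper, and it assumes GNU `stat` as shipped with RHEL 8.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the file or directory given as the first
# argument has exactly the octal mode given as the second argument.
# GNU stat: -c '%a' prints the permission bits in octal.
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Example checks (commented out; they need the real Munge files):
# has_mode /etc/munge 700           || echo "unexpected mode on /etc/munge" >&2
# has_mode /etc/munge/munge.key 600 || echo "unexpected mode on munge.key" >&2
```

Once the service is running, `munge -n | unmunge` performs a local encode and decode round trip and should report a successful status.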
  1. Create Slurm user account and group:

groupadd -g 64030 slurm
useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u 64030 -g slurm -s /bin/bash slurm
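
The fixed UID and GID (64030) matter because the slurm account is expected to have the same numeric ID on every node in the cluster. As a sketch, a check like the following could confirm the result on each node; `has_uid` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the account named by the first argument
# exists and its numeric UID equals the second argument.
has_uid() {
    actual=$(id -u "$1" 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}

# Example (commented out; requires the slurm account created above):
# has_uid slurm 64030 || echo "slurm account is missing or has another UID" >&2
```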
  1. Install Slurm Compute/Client RPMs:

rpm -ivh \
    slurm-22.05.6-1.el8.x86_64.rpm \
    slurm-perlapi-22.05.6-1.el8.x86_64.rpm \
    slurm-slurmd-22.05.6-1.el8.x86_64.rpm \
    slurm-torque-22.05.6-1.el8.x86_64.rpm
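
Before running `rpm -ivh`, it can help to confirm that every expected RPM was actually extracted from the archive. The following sketch reports any missing files; `missing_files` is a hypothetical helper, and the file names in the example are the ones listed in the step above.

```shell
#!/bin/sh
# Hypothetical helper: the first argument is a directory, the remaining
# arguments are file names. Prints each name that does not exist in the
# directory and returns non-zero if at least one is missing.
missing_files() {
    dir=$1
    shift
    status=0
    for name in "$@"; do
        if [ ! -f "$dir/$name" ]; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    return "$status"
}

# Example (commented out; run from the directory holding the extracted RPMs):
# missing_files . \
#     slurm-22.05.6-1.el8.x86_64.rpm \
#     slurm-perlapi-22.05.6-1.el8.x86_64.rpm \
#     slurm-slurmd-22.05.6-1.el8.x86_64.rpm \
#     slurm-torque-22.05.6-1.el8.x86_64.rpm \
#     || echo "some RPMs are missing; see the list above" >&2
```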
  1. Copy configuration files:

mkdir -p /etc/slurm/

cp -f cgroup.conf /etc/slurm/
cp -f slurm.conf  /etc/slurm/
chmod 755 /etc/slurm/
chmod 600 /etc/slurm/cgroup.conf
# slurm.conf is read by the Slurm client commands, so keep it readable by all users
chmod 644 /etc/slurm/slurm.conf
chown -R slurm:slurm /etc/slurm/
  1. Create log files:

mkdir -p /var/log/slurm/

touch /var/log/slurm/slurmd.log
chmod 755 /var/log/slurm/
chmod 644 /var/log/slurm/*.log
chown -R slurm:slurm /var/log/slurm
  1. Create the spool directory:

mkdir -p /var/spool/slurmd
chmod 700 /var/spool/slurmd
chown -R slurm:slurm /var/spool/slurmd
  1. Set up Slurm log rotation:

cat >> /etc/logrotate.d/slurm <<EOL
/var/log/slurm/slurmd.log
{
    missingok
    notifempty
    rotate 4
    weekly
    create
}

EOL

chmod 644 /etc/logrotate.d/slurm
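
Because the rule uses `missingok`, a path that does not match the real log location is skipped silently. One way to catch that is to list the paths named in the rule and confirm each one exists. This is a simplified sketch: `logrotate_paths` is a hypothetical helper and only handles rules with one absolute path per line.

```shell
#!/bin/sh
# Hypothetical helper: print every absolute path that starts a line in the
# logrotate configuration file given as the first argument. Anything after
# the path on the same line (a space or an opening brace) is dropped.
logrotate_paths() {
    sed -n 's|^\(/[^ {]*\).*|\1|p' "$1"
}

# Example (commented out; requires the rule file created above):
# for p in $(logrotate_paths /etc/logrotate.d/slurm); do
#     [ -e "$p" ] || echo "logrotate target does not exist: $p" >&2
# done
```

In addition, `logrotate -d /etc/logrotate.d/slurm` runs logrotate in debug mode, which reports what would be rotated without changing any files.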
  1. Configure firewalld rules:

Important

Replace the IPv4 address and subnet mask with the values for your network.

systemctl enable --now firewalld
firewall-cmd --zone=public --add-source=192.168.1.0/24 --permanent
firewall-cmd --zone=public --add-port={6817/tcp,6818/tcp} --permanent
firewall-cmd --reload
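
After the reload, the permanent rules should be visible in the runtime configuration. The sketch below checks that both ports appear in the output of `firewall-cmd --zone=public --list-ports`, which prints a space-separated list; `ports_listed` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: the first argument is a space-separated port listing;
# succeed only when every remaining argument appears in it as a whole word.
ports_listed() {
    listing=$1
    shift
    for port in "$@"; do
        case " $listing " in
            *" $port "*) ;;      # found, check the next port
            *) return 1 ;;       # not found
        esac
    done
    return 0
}

# Example (commented out; requires a running firewalld):
# ports_listed "$(firewall-cmd --zone=public --list-ports)" 6817/tcp 6818/tcp \
#     || echo "Slurm ports are not open in the public zone" >&2
```

Padding the listing with spaces before matching keeps a port such as 16817/tcp from being mistaken for 6817/tcp.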
  1. Enable Slurm Compute/Client services:

systemctl enable --now slurmd
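
Once the service is enabled, `systemctl is-active slurmd` reports whether it is running, and `slurmd -C` prints the hardware configuration detected on the node as `Key=Value` pairs. As a sketch, a single value can be pulled out of that output for comparison with the node definition in slurm.conf; `node_field` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read "Key=Value" pairs (separated by spaces or
# newlines) on standard input and print the value of the first pair whose
# key equals the first argument. Prints nothing when the key is absent.
node_field() {
    tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Example (commented out; requires Slurm to be installed on the node):
# systemctl is-active slurmd
# slurmd -C | node_field CPUs
# slurmd -C | node_field RealMemory
```

A mismatch between these values and the node's entry in slurm.conf is one possible reason for a node not reaching an idle state after the service starts.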