Slurm
Revision as of 22:07, 12 January 2016
install Slurm under Fedora 21
Build Slurm.
rpmbuild -ta slurm*.tar.bz2
Install rpms.
yum -y install munge slurm slurm-plugins slurm-munge
configure munge
dd if=/dev/random bs=1 count=1024 > /etc/munge/munge.key
chmod 0600 /etc/munge/munge.key
chown munge /etc/munge/munge.key
systemctl start munge
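The key creation steps above can be wrapped in a small helper so that the file is never readable by other users, even briefly. This is a minimal sketch, not the official munge procedure: `make_munge_key` is a hypothetical name, and reading from /dev/urandom instead of /dev/random is an assumption made here so the command does not block while waiting for entropy.

```shell
#!/usr/bin/env bash
# make_munge_key is a hypothetical helper, not part of munge.
# It writes 1024 random bytes to the given path with mode 0600.
make_munge_key() {
    local dest="$1"
    (
        # Restrictive umask so a newly created file starts out private.
        umask 077
        # Assumption: /dev/urandom instead of /dev/random (non-blocking).
        dd if=/dev/urandom of="$dest" bs=1 count=1024 2>/dev/null
    ) || return 1
    # Also tighten permissions in case the file already existed.
    chmod 0600 "$dest"
}
```

As root, it could be used as `make_munge_key /etc/munge/munge.key && chown munge /etc/munge/munge.key`, followed by `systemctl start munge`.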
troubleshooting
Restore node.
scontrol update nodename=www state=resume
test installation
Generate a credential on stdout.
munge -n
Check if a credential can be locally decoded.
munge -n | unmunge
Check if a credential can be remotely decoded.
munge -n | ssh somehost unmunge
Run a quick benchmark.
remunge
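The local check above can be turned into a pass/fail test for scripts. This is a sketch under assumptions: `check_munge` is a hypothetical helper name, and it relies on `unmunge` exiting non-zero when a credential cannot be decoded.

```shell
#!/usr/bin/env bash
# check_munge is a hypothetical helper; it only wraps the commands listed above.
check_munge() {
    if ! command -v munge >/dev/null 2>&1 || ! command -v unmunge >/dev/null 2>&1; then
        echo "munge or unmunge not found in PATH" >&2
        return 2
    fi
    # Encode an empty payload and decode it locally.
    # The pipeline's status is the exit status of unmunge.
    if munge -n | unmunge >/dev/null; then
        echo "local munge round trip: ok"
    else
        echo "local munge round trip: FAILED" >&2
        return 1
    fi
}
```

Calling `check_munge` after `systemctl start munge` gives a quick indication of whether the daemon and the key are usable on this host.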
how does it work
scontrol show config
Check the priorities of jobs with:
scontrol show job
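To compare priorities across jobs, the `scontrol show job` output can be reduced to one line per job. This is a sketch: `job_priorities` is a hypothetical helper, and it assumes the output contains `JobId=` and `Priority=` fields with `JobId=` appearing first for each job, which may differ between Slurm versions.

```shell
#!/usr/bin/env bash
# job_priorities is a hypothetical helper: it reads "scontrol show job"
# output on stdin and prints "<JobId> <Priority>" for each job.
job_priorities() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            # Remember the most recent job ID seen.
            if ($i ~ /^JobId=/)    id = substr($i, 7)
            # Emit a pair as soon as the matching priority appears.
            if ($i ~ /^Priority=/) print id, substr($i, 10)
        }
    }'
}
```

For example, `scontrol show job | job_priorities | sort -k2,2nr` would list jobs with the highest priority first.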
job control
Submit a job:
sbatch /tmp/slurm_test_1
List jobs:
squeue
Get job details:
scontrol show job 106
Suspend a job (root only):
scontrol suspend 135
Resume a job (root only):
scontrol resume 135
Kill a job. Users can kill their own jobs; root can kill any job.
scancel 135
Hold a job:
scontrol hold 139
Release a job:
scontrol release 139
List partitions:
sinfo
example job script.
#!/usr/bin/env bash
#SBATCH -p defq
#SBATCH -J simple
sleep 60
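When a job script like this is submitted from another script, it helps to capture the job ID for later `squeue` or `scancel` calls. A minimal sketch, assuming an `sbatch` that supports the `--parsable` option; `submit_job` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# submit_job is a hypothetical helper: it submits a script and prints the job ID.
submit_job() {
    local script="$1" jobid
    # --parsable makes sbatch print the job ID, optionally followed by ";cluster".
    jobid=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix.
    jobid=${jobid%%;*}
    echo "$jobid"
}
```

Usage might look like `jobid=$(submit_job /tmp/slurm_test_1) && squeue -j "$jobid"`.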