Submitting Matlab scripts to the SGE cluster

From Center for Cognitive Neuroscience
Revision as of 17:00, 25 August 2011 by Kerr (talk) (Submitting Multiple Jobs without Using Multiple Licenses)


Submitting MATLAB jobs to the SGE cluster is straightforward: copy the script below and replace runme6 with the name of your own MATLAB .m file.

#!/bin/sh
#$ -cwd
#$ -j y
#$ -S /bin/sh
#$ -V
. /etc/profile
/Volumes/local/bin/matlab6 -nojvm -nosplash -nodisplay -r runme6

Explanation of script line by line:

#!/bin/sh
this is a sh script
#$ -cwd
run the job from the current directory (the directory you submitted it from)
#$ -j y
the output and errors will be put in the same file (in your current directory)
#$ -S /bin/sh
SGE will use the /bin/sh shell
#$ -V
export the current environment to SGE
. /etc/profile
make sure all our paths are right
/Volumes/local/bin/matlab6 -nojvm -nosplash -nodisplay -r runme6
Use the matlab6 binary (change this to the version you need) with no GUI, no splash screen, and no display, and run the runme6.m script found in the current directory (change this to your own script).
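
The job script above can also be generated for any script name. The following is a minimal sketch, not part of the cluster's tooling: the function name write_matlab_job and the file name runme6_job.sh are hypothetical, and the final qsub step assumes the standard SGE submit command is available on your login node.

```shell
#!/bin/sh
# Print an SGE job script that runs one MATLAB script.
#   $1: MATLAB script name, with or without the .m extension
#   $2: MATLAB binary (defaults to the matlab6 path used on this page)
write_matlab_job() {
    script=${1%.m}    # -r takes a MATLAB command, so drop the .m extension
    matlab_bin=${2:-/Volumes/local/bin/matlab6}
    printf '%s\n' \
        '#!/bin/sh' \
        '#$ -cwd' \
        '#$ -j y' \
        '#$ -S /bin/sh' \
        '#$ -V' \
        '. /etc/profile' \
        "$matlab_bin -nojvm -nosplash -nodisplay -r $script"
}

# Example (hypothetical file name; qsub is assumed to be on your PATH):
# write_matlab_job runme6 > runme6_job.sh
# qsub runme6_job.sh
```

Printing the script instead of submitting it directly lets you review the generated file before it reaches the queue.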

Wrappers for general script submission

A set of scripts has been created to ease the use of matlab & clustering. The available scripts are as follows:

matsub6
submits jobs using the matlab 6.5 version
matsub7
submits jobs using the matlab 7.0 version
matsub75
submits jobs using the matlab 7.5 version

Usage

$ matsub6 myAwesomeMatlabScript.m

Output is written to a file called mat_log in the current directory.
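
If you switch between MATLAB versions, a small helper can pick the wrapper for you. This is only a sketch: the function name matsub_for_version is hypothetical, and the mapping simply restates the wrapper list above.

```shell
#!/bin/sh
# Map a MATLAB version number to the matching wrapper from the list above.
matsub_for_version() {
    case $1 in
        6.5) echo matsub6 ;;
        7.0) echo matsub7 ;;
        7.5) echo matsub75 ;;
        *)   echo "no matsub wrapper for MATLAB $1" >&2
             return 1 ;;
    esac
}

# Example: submit with the 7.5 wrapper, then look at the end of the log.
# wrapper=$(matsub_for_version 7.5) && "$wrapper" myAwesomeMatlabScript.m
# tail mat_log
```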

Licensing and SGE Matlab Jobs

Please note that each job you submit to SGE uses one of our available license seats. Submitting several jobs at once could use up every seat, leaving no one but you able to run MATLAB.

Please be courteous to other matlab users and only submit one job at a time.
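
One way to stay within a single job is to run several scripts back to back in one MATLAB session. The sketch below builds the argument for -r; the function name build_r_arg and the script names are hypothetical, and it assumes that one MATLAB session holds one license seat for its whole run.

```shell
#!/bin/sh
# Build the -r argument that runs several scripts in order inside one
# MATLAB session and then exits.
build_r_arg() {
    arg=""
    for f in "$@"; do
        arg="${arg}${f%.m}; "    # drop .m; -r takes MATLAB statements
    done
    printf '%s\n' "${arg}exit"
}

# Example (hypothetical script names); use this as the last line of the
# job script shown at the top of this page:
# /Volumes/local/bin/matlab6 -nojvm -nosplash -nodisplay -r "$(build_r_arg step1.m step2.m)"
```

Note that if one script stops with an error, the scripts after it will not run.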

Submitting Multiple Jobs without Using Multiple Licenses

The key to this approach is compiling your code with the MATLAB Compiler (mcc) and then running the compiled executable through the queue. Note that the MATLAB Compiler has some limitations:

  • Each job must have a single associated .m file.
  • You CANNOT use the path() function inside the .m file.
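
Before compiling, you can scan a .m file for calls to path(). This is a rough sketch with a hypothetical function name, check_no_path_calls. It only looks for the literal call, so it also flags path( inside comments and does not follow other files that your code calls.

```shell
#!/bin/sh
# Return 0 when FILE contains no call to path(), 1 otherwise.
# Offending lines are printed with their line numbers.
check_no_path_calls() {
    file=$1
    # Match "path(" only when it is not the tail of a longer identifier
    # such as addpath( or genpath(.
    if grep -nE '(^|[^A-Za-z0-9_])path[[:space:]]*\(' "$file"; then
        echo "$file: path() call found; remove it before running mcc" >&2
        return 1
    fi
    return 0
}

# Example:
# check_no_path_calls my_m_file.m && echo "no path() calls found"
```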

Commands to do it:

qrsh
module load matlab
mcc -m my_m_file.m -a top_path
matexe.q my_m_file
mcc -m my_m_file.m -a top_path
This will create an executable called my_m_file
-a top_path adds all files inside top_path and inside its subfolders
Note that the -a option can make compilation take a long time
Consider using "-I folder1 -I folder2"
Note that folder1 and folder2 must contain ALL of the dependencies of the compiled function, excluding the default base that MATLAB gives you.
This step takes time, but don't try to parallelize it if you're compiling in the same directory. This will lead to errors.
-R -singleCompThread
This specifies that the job needs only one CPU to run instead of the default 8.
If you include this, each job will take more time, but it won't need to wait for 8 CPUs to be free on a single node.
matexe.q my_m_file
This submits to the queue and also does the "module load matlab" command before your script is executed.
This is the ONLY way to submit an mcc-compiled job. Other attempts to load the module manually will fail.
-t <hours>
The default is 8 hours. Values from 1 to 336 are allowed for highp.
If you think your job may take longer than 8 hours, you MUST include this, or else the job will fail at exactly 8 hours.
Without using -a
qrsh
module load matlab
mcc -m my_m_file.m -I folder1 -I folder2
matexe.q my_m_file
L33T tip
While your jobs wait in the queue, request up to 8 CPUs' worth of qrsh sessions at a time. This ensures that you have at least 8 CPUs of processing power at any one time.
ATS limits you to eight 1-CPU sessions or one 8-CPU session.
To request a single CPU, use "qrsh -l i,time=24:00:00".
Also do this to check that your compiled code works as you expect.
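
The compile and submit commands above can be assembled by a small helper and reviewed before you run them. This is a sketch under stated assumptions: the function names mcc_command and submit_command and the SINGLE_THREAD variable are hypothetical, and the position of -t before the executable name is a guess about matexe.q that you should confirm on the cluster.

```shell
#!/bin/sh
# Print the mcc command for FILE with one -I per dependency folder.
# Set SINGLE_THREAD=1 to add "-R -singleCompThread".
mcc_command() {
    file=$1
    shift
    cmd="mcc -m $file"
    if [ "${SINGLE_THREAD:-0}" = "1" ]; then
        cmd="$cmd -R -singleCompThread"
    fi
    for dir in "$@"; do
        cmd="$cmd -I $dir"
    done
    printf '%s\n' "$cmd"
}

# Print the matexe.q command for an executable and a time limit in hours.
# The limit defaults to 8 and must be a whole number from 1 to 336.
# ASSUMPTION: -t goes before the executable name; confirm this locally.
submit_command() {
    exe=${1%.m}
    hours=${2:-8}
    case $hours in
        ''|*[!0-9]*) echo "hours must be a whole number" >&2; return 1 ;;
    esac
    if [ "$hours" -lt 1 ] || [ "$hours" -gt 336 ]; then
        echo "hours must be between 1 and 336" >&2
        return 1
    fi
    printf 'matexe.q -t %s %s\n' "$hours" "$exe"
}

# Example: print both commands, then run them by hand inside qrsh
# after "module load matlab".
# mcc_command my_m_file.m folder1 folder2
# submit_command my_m_file 24
```

The helper only prints the commands, so nothing is compiled or submitted until you run the printed lines yourself. Folder names containing spaces are not handled.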

If you have more questions and/or want example scripts, ask Wesley and/or Hong-jing.