Submitting Matlab scripts to the SGE cluster

From Center for Cognitive Neuroscience

Submitting MATLAB jobs to the SGE cluster is straightforward. Use the following command:

matlab.q [-t HOURS] [-d MEMORY] [-m MESSAGING] MY_SCRIPT.m
-t HOURS
(optional) This is the number of hours your script will take to run. Choose a high number, but overestimating by too much can make you wait unnecessarily in the queue.
-d MEMORY
(optional) This is the number of megabytes of memory you would like dedicated to your job (e.g. a value of 1024 will give you 1 GB).
-m MESSAGING
(optional) Place the desired messaging flags here:
b = beginning
e = end
a = abort
n = none
So "-m bea" would send messages to the email account on file for you with Hoffman2: a separate message when the job begins, when the job ends, and if the job is aborted.
MY_SCRIPT.m
The m-file you want to execute.
other options
Try typing "matlab.q" at the command prompt on Hoffman2 and use the Info page to see the other available options.
Keep in mind that you can only run one of these at a time since it uses a full MATLAB license. If you are already running MATLAB on Hoffman2 and try to submit a job with matlab.q, it will fail because you are already using a license.
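As an illustration of how the options above combine, here is a minimal sketch that assembles a matlab.q command line and prints it for review. The script name and all values are placeholders, not recommendations; only the -t, -d, and -m options described above are used.

```shell
# All values below are placeholders; adjust them for your own job.
script="my_analysis.m"          # hypothetical m-file name
hours=24                        # -t: expected run time in hours
memory_mb=$((2 * 1024))         # -d: 2 GB expressed in megabytes
flags="bea"                     # -m: mail at beginning, end, and abort

# Assemble the command line and print it instead of running it.
cmd="matlab.q -t $hours -d $memory_mb -m $flags $script"
echo "$cmd"
```

Printing the command first makes it easy to check the arithmetic for the memory value before actually submitting the job on Hoffman2.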

matlab.q will first try to compile your code and then run it. If the compilation fails, it will run it as a job on its own. To directly compile your code, see the instructions below.

Submitting Multiple Jobs without Using Multiple Licenses (aka Compiling)

The key to this approach is compiling your code using the MATLAB Compiler (MCC) and then running it through the queue. Note that the MATLAB Compiler has some limitations:

  • Each job must have a single associated .m file.
  • You CANNOT use the path() function inside the .m file.

Commands to do it:

module load matlab
mcc -m my_m_file.m -R -singleCompThread -a top_path
matexe.q my_m_file

Alternative commands:

module load matlab
mcc -m my_m_file.m -R -singleCompThread -a top_path
matexe.q -ns my_m_file
qsub my_m_file.cmd
mcc -m my_m_file.m -a top_path
This will create an executable called my_m_file
-a top_path adds all files inside top_path and inside its subfolders.
Note that the -a option can make compilation take a long time.
Consider using "-I folder1 -I folder2" instead.
Note that folder1 and folder2 must contain ALL of the dependencies of the compiled function, excluding the default base that MATLAB gives you.
This step takes time, but don't try to parallelize it if you're compiling in the same directory. This will lead to errors.
-R -singleCompThread
This specifies that the job only needs one CPU to run instead of the default 8.
If you include this, each job will take more time, but it won't need to wait for 8 CPUs to be free on a single node.
matexe.q my_m_file
This submits to the queue and also does the "module load matlab" command before your script is executed.
This is the ONLY way to submit an mcc-compiled job. Other attempts to load the module manually will fail.
-t <hours>
This argument is 8 hours by default. Values from 1 to 336 are allowed for highp.
If you think your job may take longer than 8 hours, you MUST include this or else the job will fail precisely at 8 hours.
-ns
This stands for "not submit". It allows you to edit your .cmd file as you desire before submitting it yourself with qsub. This is particularly useful for job arrays.
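For job arrays, SGE provides a task-range directive and sets the SGE_TASK_ID environment variable for each task. Note that qsub's own "-t" denotes a task range, unlike the "-t <hours>" option of matexe.q. The layout of the .cmd file that matexe.q generates is not shown here, so the following is only a sketch of per-task selection logic you might add to it; the subject list and the pick_input helper are hypothetical.

```shell
# Standard SGE array-job directive (an assumed addition to the .cmd file): tasks 1 to 3.
#$ -t 1-3

# Hypothetical list of inputs, one per array task.
subjects="sub01 sub02 sub03"

# Hypothetical helper: print the entry at the given 1-based position.
pick_input() {
    echo "$subjects" | cut -d' ' -f"$1"
}

# SGE sets SGE_TASK_ID for each task; default to 1 so the snippet also runs outside the queue.
task_id="${SGE_TASK_ID:-1}"
subject=$(pick_input "$task_id")
echo "task $task_id processes $subject"
```

The selected value could then be passed to the compiled executable as a command-line argument, assuming your m-file is written as a function that accepts one (arguments to compiled MATLAB programs arrive as strings).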
Without using -a
module load matlab
mcc -m my_m_file.m -I folder1 -I folder2
matexe.q my_m_file
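When there are more than a couple of dependency folders, typing each "-I" by hand gets tedious. The sketch below builds the flags from a list and prints the resulting mcc command for review. The folder names are the placeholders used above, and the loop assumes folder names without spaces.

```shell
# Placeholder folder names; each must hold dependencies of the compiled function.
folders="folder1 folder2"

# Build one "-I <folder>" flag per entry.
include_flags=""
for folder in $folders; do
    include_flags="$include_flags -I $folder"
done

# Assemble the command line and print it instead of running it.
cmd="mcc -m my_m_file.m -R -singleCompThread$include_flags"
echo "$cmd"
```

After checking the printed command, run it following "module load matlab" as in the commands above.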
L33T tip
While your jobs are waiting in the queue, request up to 8 CPUs' worth of qrsh sessions at a time. This ensures that you will get, at minimum, 8 CPUs of processing power at any one time.
You are limited by ATS to eight 1-CPU sessions or one 8-CPU session.
To request a single cpu, use "qrsh -l i,time=24:00:00".
Do this also to ensure that your compiled code works as you expect it to.

If you have more questions and/or want example scripts, ask Wesley and/or Hong-jing.