Hoffman2:Compiling MATLAB

From Center for Cognitive Neuroscience

Running the MATLAB GUI is not always desirable. If you already know that your code works and no visuals or user interaction are needed, then using the MATLAB GUI is unnecessary and adds extra work for the computer. Instead, it may be better to compile your MATLAB code into an executable that can be submitted as a job on Hoffman2.

Furthermore, if you need to run the same MATLAB script on multiple subjects and would like to do so in parallel, you will find that you can't simply open up multiple instances of MATLAB on Hoffman2. This is because each user is only allotted one MATLAB license. To parallelize your processing, it would be better to compile your MATLAB script and submit a job for each subject using the compiled code. Compiled code does not require a MATLAB license in order to run.


Version Notice - 2012.04.23

As of 2012.04.23 ATS updated the default MATLAB version from 7.11 (R2010b) to 7.14 (R2012a).

This means that any previously compiled MATLAB code needs the old MATLAB module loaded in order to run, e.g.

$ module load matlab/7.11
$ /path/to/my/old/compiled/code

The matexe.q tool uses the newer MATLAB module, so you can't use it to submit previously compiled code. Please recompile your code with the new MATLAB version before submitting jobs with this tool.


Compiling

There is a handy tool, mcc, which can take a MATLAB .m file and compile it into an executable that can be run outside of MATLAB. It is expensive to buy, but luckily Hoffman2 has already done the purchasing for you.

Remember that to use mcc, you need an open MATLAB license. If you are running MATLAB already and open up a new Hoffman2 session intending to compile something with mcc, know that you will see errors and compilation will fail because you are already using your one allotted MATLAB license.

We have two example MATLAB scripts:

/u/home/FMRI/apps/examples/mcc/subfunctions/exampleSubfunction.m

This function can take any number of arguments and the text it prints out is based on the type of variables it receives. It is kept in a separate directory from the exampleMatlabScript.m file to illustrate how to work with dependent files in different parts of the filesystem.

function exampleSubfunction(varargin)

fprintf('This is a MATLAB function that takes input arguments.\n');

for i=1:length(varargin)
    if( ~isempty(varargin{i}) )
        switch class(varargin{i})
            case 'double'
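                % Note: %s prints an integer-valued double as the character
                % with that code, so 1, 2, 3 appear as invisible control
                % characters. Use %g instead to print the number itself.
                fprintf('Argument #%d\t -- %s -- is a double\n', i, varargin{i});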
            case 'char'
                fprintf('Argument #%d\t -- %s -- is a char\n', i, varargin{i});
            otherwise
                fprintf('Argument #%d\t -- is an unknown\n', i);
        end
    else
        fprintf('Argument #%d\t -- is empty\n', i);
    end
end
end


/u/home/FMRI/apps/examples/mcc/exampleMatlabScript.m

Makes use of the exampleSubfunction() function.

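function exampleMatlabScript(varargin)
% The function name matches the file name, exampleMatlabScript.m.

fprintf('Passing numbers to the subfunction...\n');
exampleSubfunction(1, 2, 3);

fprintf('Passing characters to the subfunction...\n');
exampleSubfunction('a', 'b', 'c');

fprintf('Passing whatever arguments you gave me on to the subfunction...\n');
% Each command-line argument is forwarded on its own, one call per argument.
for i=1:length(varargin)
    exampleSubfunction(varargin{i});
end

end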


To compile this example script, first check out an interactive node because this can be a computationally intensive process. Then run the following commands:

$ mkdir ~/mccExample
$ cd ~/mccExample
$ mcc -m /u/home/FMRI/apps/examples/mcc/exampleMatlabScript.m -R -singleCompThread -I /u/home/FMRI/apps/examples/mcc/subfunctions

-m /u/home/FMRI/apps/examples/mcc/exampleMatlabScript.m
    This flag indicates which file is to be compiled.
-R -singleCompThread
    The -R flag passes a runtime option to the compiled program; here -singleCompThread limits it to a single computational thread. The default is to ask for all available cores, but by only requesting a single core you allow yourself more flexibility. If your code needs all the cores on a node, any jobs you submit to the queue will probably wait much longer to execute. If your code only needs one core, your job can be scheduled more flexibly and will probably execute sooner.
-I /u/home/FMRI/apps/examples/mcc/subfunctions
    This flag indicates a directory where support files are located so that they can be included when compiling. If you have support .m files in multiple directories, you need to include each directory separately, e.g.
    mcc -m /path/to/main/file -R -singleCompThread -I /path/to/file/one -I /path/to/file/two -I /path/to/file/three
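
If your support files live in many directories, the mcc command line can be assembled by a short shell script instead of typed by hand. The following is only a minimal sketch, not an ATS-provided tool: the function name build_mcc_cmd is hypothetical, it assumes none of the paths contain spaces, and it prints the command for you to inspect rather than running it.

```shell
#!/bin/bash
# Sketch: assemble an mcc command with one -I flag per dependency directory.
# build_mcc_cmd is a hypothetical helper, not part of MATLAB or Hoffman2.
# Assumes paths do not contain spaces.

build_mcc_cmd() {
    local main_file="$1"
    shift
    local cmd="mcc -m $main_file -R -singleCompThread"
    local dir
    for dir in "$@"; do
        cmd="$cmd -I $dir"
    done
    printf '%s\n' "$cmd"
}

# Print the command for the example script from this page.
build_mcc_cmd /u/home/FMRI/apps/examples/mcc/exampleMatlabScript.m \
    /u/home/FMRI/apps/examples/mcc/subfunctions
```

Once the printed command looks right, copy it to the prompt on your interactive node to do the actual compiling.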

The output will probably start with

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.

which can be ignored and will be followed by any other messages related to the compiling, or error messages if something went wrong.



Testing

Now that you've compiled your code, you should test it to make sure it works before submitting 100,000 jobs using it.

  1. Check out an interactive node
  2. Execute
    $ module load matlab
    so that the computing node knows how to speak MATLAB
  3. Then change to the directory where the compiled code is
    $ cd ~/mccExample
  4. And run your compiled code
    $ ./exampleMatlabScript 
    Feel free to add any arguments to your function call, e.g.
    $ ./exampleMatlabScript ARG1 2 3 4
    to see how the output changes.
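
If you recompile often, the manual check above can be wrapped in a small helper. This is a sketch under our own assumptions: smoke_test is a hypothetical function name, and the check only looks at the exit status and for one expected string in the output. The demonstration line uses echo as a stand-in so the script runs anywhere.

```shell
#!/bin/bash
# Sketch: smoke-test a compiled executable before submitting jobs with it.
# smoke_test is a hypothetical helper. It runs the given command, then checks
# the exit status and that an expected string appears in the output.

smoke_test() {
    local expected="$1"
    shift
    local output
    if ! output="$("$@" 2>&1)"; then
        echo "FAIL: '$*' exited with a non-zero status"
        return 1
    fi
    case "$output" in
        *"$expected"*) echo "PASS: found '$expected'" ;;
        *) echo "FAIL: '$expected' not found in output"; return 1 ;;
    esac
}

# Demonstration with echo standing in for a compiled program.
smoke_test "hello" echo "hello from a stand-in program"

# On Hoffman2 (after 'module load matlab') you might instead run:
#   cd ~/mccExample
#   smoke_test "Passing numbers to the subfunction..." ./exampleMatlabScript ARG1 2 3 4
```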


Submitting

You've compiled code and tested that it works. Time to submit it as a job.

matexe.q

ATS has a tool for this (similar to job.q) called matexe.q. It's a step-by-step tool that's pretty self-explanatory, but we have provided an example anyway.

Example

  1. Once on Hoffman2, you'll need to edit one file so pull out your favorite text editor and edit the file
    ~/.queuerc
  2. Add the line (if it isn't already there)
    set qqodir = ~/job-output
  3. You've just set the default directory where your job command files will be created. Save the configuration file and close your text editor.
  4. Make that directory using the command
    $ mkdir ~/job-output
  5. Now run
    $ matexe.q
  6. Press enter to acknowledge any messages that may appear (READ IT FIRST THOUGH).
  7. Type Build <ENTER> to begin creating an SGE command file.
  8. The program now asks you which script you'd like to run, enter the following text to use our example script
    ~/mccExample/exampleMatlabScript
  9. The program now asks how much memory the job will need (in Megabytes). This script is really simple, but let's go with the default 1024.
  10. The program now asks how long will the job take (in hours). Go with the minimum 1 hour; it will complete in much less than this.
  11. The program now asks if your job should be limited to only your resource group's cores. Answer n because you do not need to be limiting yourself here and the job is not going to be running for more than 24 hours.
  12. It will ask for any arguments you'd like to supply. Feel free to put random values here or nothing at all to see how the example script treats your arguments.
  13. Soon, the program will tell you that exampleMatlabScript.cmd has been built and saved.
  14. When it asks you if you would like to submit your job, say no. Then type Quit <ENTER> to leave the program.
  15. Now you should be able to run
    $ ls ~/job-output
    and see exampleMatlabScript.cmd. This file will stay there until you delete it and can be run over and over again. Making a command file like this is especially useful if there is a task you'll be running repeatedly on Hoffman2. But if this is something you only need to run once, you should delete the file so you don't needlessly approach your quota.
  16. The time has come to actually run the program (thought we'd never get to that, didn't you?). Type
    $ qsub ~/job-output/exampleMatlabScript.cmd
    and after hitting enter, a message similar to this will pop up:
    Your job 1882940 ("exampleMatlabScript.cmd") has been submitted
    where the number is your JobID, a unique numerical identifier for the computer job you have submitted to the queue.
  17. Now you can check if the job has finished running by doing
    $ ls ~/job-output
  18. When two files named exampleMatlabScript.output.[JOBID] and exampleMatlabScript.joblog.[JOBID] (where JOBID is your job's unique identifier) appear, your job has run.
    exampleMatlabScript.output.[JOBID]
    This file has all the standard output generated by your script. If you supplied the arguments 1 a 2 b 3 c in step 12, it will be something like
    Passing numbers to the subfunction...
    This is a MATLAB function that takes input arguments.
    Argument #1 -- -- is a double
    Argument #2 -- -- is a double
    Argument #3 -- -- is a double
    Passing characters to the subfunction...
    This is a MATLAB function that takes input arguments.
    Argument #1 -- a -- is a char
    Argument #2 -- b -- is a char
    Argument #3 -- c -- is a char
    Passing whatever arguments you gave me on to the subfunction...
    This is a MATLAB function that takes input arguments.
    Argument #1 -- 1 -- is a char
    This is a MATLAB function that takes input arguments.
    Argument #1 -- a -- is a char
    This is a MATLAB function that takes input arguments.
    Argument #1 -- 2 -- is a char
    This is a MATLAB function that takes input arguments.
    Argument #1 -- b -- is a char
    This is a MATLAB function that takes input arguments.
    Argument #1 -- 3 -- is a char
    This is a MATLAB function that takes input arguments.
    Argument #1 -- c -- is a char
    Warning: No display specified. You will not be able to display graphics on the screen.
    exampleMatlabScript.joblog.[JOBID]
    This file has all the details about when, where, and how your job was processed. Useful information if you are going to be running this job over and over and need to fine tune the resources it uses.
  19. There are better ways of checking on your job than listing files, such as the SGE qstat command.
  20. Finally, go check the inbox of the email you used to sign up for your Hoffman2 account. There will be two emails from "root@mail.hoffman2.idre.ucla.edu" that indicate when the job was started and when the job was completed. This is one of the neat features of the queue: you can be alerted about the progress of your job without having to stay logged into Hoffman2 and check on it constantly.
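
The check in steps 17 and 18 can be scripted rather than repeated by hand. The sketch below assumes the NAME.output.JOBID and NAME.joblog.JOBID naming described above; job_files_exist is a hypothetical helper, and it only reports that both files are present, which by itself does not prove the job ran without errors.

```shell
#!/bin/bash
# Sketch: report whether a job's output and joblog files have appeared.
# job_files_exist is a hypothetical helper. The file names follow the
# NAME.output.JOBID / NAME.joblog.JOBID pattern described in this example.

job_files_exist() {
    local dir="$1" name="$2" jobid="$3"
    [ -f "$dir/$name.output.$jobid" ] && [ -f "$dir/$name.joblog.$jobid" ]
}

# Usage: poll once a minute until both files exist, then show the output, e.g.
#   until job_files_exist ~/job-output exampleMatlabScript 1882940; do sleep 60; done
#   cat ~/job-output/exampleMatlabScript.output.1882940
```

Substitute your own JobID for 1882940, which is just the number from the sample message above.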


By hand

You could also make a shell script that contains

#!/bin/bash
source /u/local/Modules/default/init/modules.sh
module load matlab
/path/to/compiled/matlab/script

and submit this shell script using qsub or q.sh to achieve similar results.
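
To run one job per subject, as suggested at the top of this page, you could generate one such shell script per subject and submit each with qsub. What follows is a sketch, not a tested Hoffman2 recipe: write_wrapper, the subject IDs, and the temporary output directory are all hypothetical, and any qsub resource options your jobs need are left out. The loop only prints the qsub commands, so nothing is submitted until you remove the echo.

```shell
#!/bin/bash
# Sketch: write one wrapper script per subject and print the qsub commands.
# write_wrapper, the subject IDs, and the output directory are hypothetical.
# The wrapper body follows the shell script shown in this section.

write_wrapper() {
    local exe="$1" subject="$2" outdir="$3"
    local script="$outdir/run_$subject.sh"
    cat > "$script" <<EOF
#!/bin/bash
source /u/local/Modules/default/init/modules.sh
module load matlab
$exe $subject
EOF
    chmod +x "$script"
    printf '%s\n' "$script"
}

outdir="$(mktemp -d)"                # replace with a directory of your own
for subject in sub01 sub02 sub03; do
    script="$(write_wrapper /path/to/compiled/matlab/script "$subject" "$outdir")"
    echo "qsub $script"              # dry run: remove 'echo' to really submit
done
```

Each subject ID is passed to the compiled program as its first argument, so your MATLAB function needs to accept it.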


About those arguments

If you played around with submitting different arguments to the sample script we compiled above, you will have noticed that compiled MATLAB treats all command-line arguments as strings. So if you plan to leverage arguments in your code to make your functions more useful, and we recommend that you do, you may need to make use of the command

str2num()

to convert certain arguments into usable numbers. Because str2num() evaluates its input, str2double() is a safer choice when an argument is expected to be a single number.
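
Since every argument arrives as a string, a typo in a numeric argument may only surface once the job is already running. One option is to check arguments in the submitting shell script first. The following is a sketch with a hypothetical helper, is_number, that accepts only plain decimal forms such as 3, -2.5, or .75 (no exponents); adjust it to whatever your own function expects.

```shell
#!/bin/bash
# Sketch: check that an argument looks like a plain decimal number before
# passing it to a compiled MATLAB program. is_number is a hypothetical helper
# and accepts only forms such as 3, -2.5, or .75 (no exponents).

is_number() {
    case "$1" in
        *[!0-9.+-]*) return 1 ;;   # reject anything but digits, sign, and dot
    esac
    printf '%s\n' "$1" | grep -Eq '^[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)$'
}

for arg in 12 -2.5 .75 abc "3; rm"; do
    if is_number "$arg"; then
        echo "'$arg' looks numeric"
    else
        echo "'$arg' is not numeric"
    fi
done
```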


Known Issues

  • The official mcc documentation cites a -a flag for including a directory and all of its subdirectories recursively. We haven't found this flag to work very effectively and strongly suggest you use -I to explicitly name all the directories that have .m files your code depends on.
  • Compiling EEGLAB has thus far failed on Hoffman2. Compiling SPM5 or SPM8 may fail similarly. If you figure out how to do this, let us know.
  • You cannot compile something while you have another MATLAB session open. Each Hoffman2 user is allotted one MATLAB license at a time, and mcc requires a license to run.

