Finding FSL Debug Info

From Center for Cognitive Neuroscience

Gathering Required Debug Info for FSL

Jobs you submit will sometimes fail. Because they run on the cluster, the most important information often never appears on your screen. Once you know where this information is stored, it is simple to find. Often the error is obvious, such as "Could not find file XYZ". Even if the log reads like Greek to you, including it is essential for anyone attempting to help you.

Finding the Cluster Logs

FSL <= 3.x

For FSL <= 3.x, the logs are stored in /Volumes/local/tmp/error and /Volumes/local/tmp/output. The log for your job submission is uniquely identified by its job id. The job id is displayed after you execute feat, or can be found with the qstat utility.

You'll find many, many log files there. The easiest way to list only your files is to run the following command:

$ find /Volumes/local/tmp/ -name "*5116*"

Where 5116 is your unique job id.
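
If you look up logs often, the search can be wrapped in a small shell function. This is only a sketch: the function name find_job_logs and its optional second argument are hypothetical, and it assumes the job id appears in the log file name, as the find command above does.

```shell
# Hypothetical helper: list log files whose names contain a given job id.
# Usage: find_job_logs <jobid> [log_root]
find_job_logs() {
    jobid="$1"
    # Default to the FSL <= 3.x log location named above.
    log_root="${2:-/Volumes/local/tmp}"
    if [ -z "$jobid" ]; then
        echo "usage: find_job_logs <jobid> [log_root]" >&2
        return 1
    fi
    # -type f restricts the results to regular files.
    find "$log_root" -type f -name "*${jobid}*" 2>/dev/null | sort
}
```

For example, find_job_logs 5116 lists the same files as the command above, sorted by path.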

FSL >= 4.x

For these versions of FSL, all the logs are kept in the logs directories of the .feat/.gfeat projects. If your project was named my_nobel.feat, you would find the logs at my_nobel.feat/logs. If the project contains cope directories, each of these will also have a logs directory.

Occasionally, logs are found in the top level of the .feat/.gfeat directories. However, the FSL folks were very clever and usually gave these names containing the words "report" or "log". For example, report.html or report_log.html.
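
To see every log location in a project at once, find can do the walking. A minimal sketch, assuming the log directories are all named logs; the function name list_feat_logs is hypothetical, and the name patterns simply mirror the "report" and "log" naming described above, so they may catch unrelated files.

```shell
# Hypothetical helper: show the log directories and the top-level
# report/log files of a .feat/.gfeat project.
# Usage: list_feat_logs <project_dir>
list_feat_logs() {
    project="$1"
    if [ ! -d "$project" ]; then
        echo "usage: list_feat_logs <project_dir>" >&2
        return 1
    fi
    # Every directory named "logs", including those inside cope directories.
    find "$project" -type d -name logs
    # Top-level files with "report" or "log" in the name.
    find "$project" -maxdepth 1 -type f \( -name "*report*" -o -name "*log*" \)
}
```

For example, list_feat_logs my_nobel.feat prints my_nobel.feat/logs along with any report files at the top level.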

All Versions

There are two extremely important tools that can be used to find elusive debug information.

The first is grep. grep can often be used to dig out useful information. Two of the most appropriate grep commands would be

  • grep -ir "error" myproject.feat
  • grep -ir "warn" myproject.feat

The -i switch means "case insensitive" and the -r switch means "search through the directory and all subdirectories".
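
The two searches can also be combined into one pass. The sketch below is one way to do it, using the standard -e option to give grep both patterns and -c to count matching lines per file; the function name scan_project is hypothetical.

```shell
# Hypothetical helper: list files under a project that mention
# "error" or "warn", with the number of matching lines in each.
# Usage: scan_project <project_dir>
scan_project() {
    # -i: case insensitive, -r: recurse, -c: count matching lines per file,
    # -e: add a pattern (may be repeated).
    # The second grep drops files whose count is zero.
    grep -irc -e "error" -e "warn" "$1" | grep -v ":0$"
}
```

Running scan_project myproject.feat prints one line per file in the form path:count, which shows where to start reading.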

If this fails, open up the log files and read them. Not just the HTML files, but the text files as well. You don't have to read every line, just scan the page for any oddities, warnings, or error type entries.

The second tool is "qstat" and friends. The 'q' utilities provide the ability to get detailed information about your cluster jobs. To see a list of all 'q' utilities, just type 'q' in the terminal and hit <TAB>. There are quite a few, but we'll focus on qstat.

Every cluster job has a jobid. Once you've run feat, type:

$ qstat

This lists the queued jobs. You'll see some with your username in the "user" column. The first column is the "Job-ID" column. To see the details of a single job, type:

$ qstat -j <jobid>

Here, <jobid> is the number you saw in the qstat list. This command provides very detailed information about the job, including any errors. Read over it and look for something meaningful. One of the most common errors I see is "chdir failed: /Volumes/Songbook1/data/somelab/someuser/foo does not exist", which usually indicates a typo in your design.fsf file.
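
Because the qstat -j output can be long, it may help to filter it. A minimal sketch, assuming only that the interesting lines contain the word "error" or "failed" (as the chdir message above does); the function name show_job_errors is hypothetical, and the exact layout of the qstat output can differ between cluster installations.

```shell
# Hypothetical helper: keep only lines that mention "error" or "failed".
# It reads standard input, so it works on piped or saved output alike.
show_job_errors() {
    grep -i -e "error" -e "failed"
}

# Usage (replace <jobid> with the number from the qstat list):
#   qstat -j <jobid> | show_job_errors
```

If the filter prints nothing, read the full output anyway; not every problem is labelled with one of these two words.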

qstat is a very powerful utility. For more information, see:

$ man qstat

In addition, in the feat/gfeat log files described above, each submitted job is followed by its jobid, which can be used with qstat to uncover more debugging information.

Why?

This data is excellent information to include in any request for assistance. In fact, anyone you ask for help will usually respond without hesitation: "Can I see your logs?" By supplying them up front, you eliminate one round of the support cycle.
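
One way to supply the logs is to pack them into a single archive you can attach to your request. A minimal sketch for FSL >= 4.x projects, assuming the logs live in a logs directory at the top of the project; the function name bundle_logs and the archive name are hypothetical.

```shell
# Hypothetical helper: pack a project's top-level logs directory into a
# compressed archive.
# Usage: bundle_logs <project_dir> <archive.tar.gz>
bundle_logs() {
    project="$1"
    archive="$2"
    if [ ! -d "$project/logs" ]; then
        echo "no logs directory in $project" >&2
        return 1
    fi
    # -C changes into the project first so the archive holds relative paths.
    tar -czf "$archive" -C "$project" logs
}
```

For example, bundle_logs my_nobel.feat my_nobel_logs.tar.gz creates an archive containing only the logs directory.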