=What is Hoffman2?=
The Hoffman2 Cluster is a campus computing resource at UCLA, named for Paul Hoffman (1947-2003). It is maintained by the Academic Technology Services (ATS) Department at UCLA, which hosts a webpage about it [http://www.ats.ucla.edu/clusters/hoffman2/ here]. With many high-end processors and data storage and backup technologies, it is a useful tool for executing research computations, especially when working with large datasets. More than 1000 users are currently registered and the cluster sees tremendous usage; in February 2012 alone, more than 4 million compute hours were logged. See more usage statistics [http://www.ats.ucla.edu/clusters/hoffman2/h2stat/statistics.htm here].
=Anatomy of the Computing Cluster=
What does Hoffman2 consist of?
* Login Nodes
* Computing Nodes
* Storage Space
* Sun Grid Engine (a brain of sorts)
[[File:hoffman2layout.png]]
==Login Nodes==
There are four login nodes which allow you to access and interact with the Hoffman2 Cluster. These are essentially four dedicated computers that you can SSH into and use to look at your files or submit computing jobs to the queue (more on what the queue is in a bit). It is important to remember that these four computers are shared by ALL Hoffman2 users, so doing ANY type of heavy computing on them is frowned upon. If you are:
*calculating the inverse solution to an EEG signal, or
*running a bunch of Python iterations to extract tractography of a brain,
you should NOT be doing this on a login node. If the sysadmins at ATS find any process taking up too many resources on the login nodes, they reserve the right to terminate it immediately.
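As a quick illustration, connecting from a terminal looks something like the sketch below; the hostname and user ID are placeholders given only to show the form an SSH login takes, so check the ATS Hoffman2 pages for the current login address and substitute your own user ID.
<pre>
# Log in to the cluster (hostname and user ID are illustrative placeholders)
ssh joebruin@hoffman2.idre.ucla.edu

# Light tasks such as browsing your files are fine on a login node
ls -lh ~
</pre>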
==Computing Nodes==
There are more than 800 computing nodes, each with 8, 12, or 16 cores and 1GB, 4GB, or 8GB of RAM per core. This means there are more than 6500 computing cores available, and this number continues to grow as the cluster is expanded. The individual cores of the computing nodes are where your programs get executed when you submit a job to the cluster. There are ways to request that you be given only one core, or that you be given many cores.
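For example, a multi-core request is typically expressed through the scheduler at submission time. The sketch below is illustrative only (the "shared" parallel-environment name is an assumption, not a confirmed Hoffman2 setting); a fuller job-script example appears in the Sun Grid Engine section.
<pre>
qsub myjob.sh                # default: a single core
qsub -pe shared 8 myjob.sh   # ask for 8 cores on one node (PE name is assumed)
</pre>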
There is also a GPU cluster that has more than 300 nodes, but access to this must be requested separately from a normal Hoffman2 account. Look [http://www.ats.ucla.edu/clusters/hoffman2/computing/gpuq.htm here] for how to request access.
==Storage Space==
===Home Directory===
When you log in to Hoffman2, you get dropped into your home directory immediately. This is where you can keep your data and the scripts you work with. Data in your home directory is accessible on all login and computing nodes.
ATS maintains high-end storage systems (BlueArc and Panasas) for your home directory. These have built-in redundancies and are fault tolerant. On top of that, ATS does tape backups regularly. If all of that sounded Greek to you, the important thing to understand is that there is a lot of disk space on Hoffman2 and ATS takes great pains to make sure your data is safe. If you are a general campus user, you have 20GB of space to play with. If you are part of a cluster-contributing group, your storage space is tied to the contributions made by your specific group.
[[Hoffman2 Helpful Commands#Space|Find out how much space you are using.]]
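For a quick check from the command line, the standard Linux tools below work anywhere; they are not Hoffman2-specific, and quota reporting details may differ on the BlueArc/Panasas systems.
<pre>
# Total size of your home directory
du -sh $HOME

# Per-directory breakdown, largest last
du -sh $HOME/* | sort -h
</pre>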
===Temporary Storage===
When running a computing job on Hoffman2, reading and writing a bunch of files in your home directory can be slow, so faster temporary storage is available for ongoing jobs to use.
====/work====
Each computing node has its own unique "work" directory. This is only accessible by jobs on that specific node. Any data your job may put on it will be removed as soon as your job finishes. There is a 100GB limit for the amount of space you may take up here.
====/u/scratch/[UserID]====
Data here is accessible on all login and computing nodes. You can use up to 1TB of space here, but data is not kept here for more than 7 days and can be overwritten sooner if there is a high demand for scratch space.
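As a rough sketch of how these areas are typically used inside a job script: stage input to the fast node-local area, compute there, and copy results back before the job ends. The $TMPDIR variable and the my_analysis program are assumptions for illustration; check the Hoffman2 documentation for the exact node-local path your jobs should use.
<pre>
# Copy input from your home directory to fast node-local storage
# ($TMPDIR is assumed to point at a per-job directory under /work)
cp $HOME/project/input.dat $TMPDIR/
cd $TMPDIR

# Heavy reading and writing happens on the local copy
$HOME/project/my_analysis input.dat > output.dat

# Save results back to your home directory before the job finishes,
# because node-local data is removed when the job ends
cp output.dat $HOME/project/

# Large intermediates that other nodes also need can go in scratch instead
cp output.dat /u/scratch/$USER/
</pre>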
==Sun Grid Engine==
The Sun Grid Engine (SGE) is the brains behind how jobs get executed on the cluster. When you request that a script be run on Hoffman2, the SGE looks at the resources you requested (how much memory, how many computing cores, how many computing hours, etc.) and puts your job in a queue (a waiting line, for those not familiar with British English) based on your requirements. Less demanding jobs generally get scheduled sooner, while more demanding ones must wait for adequate resources to free up. The SGE tries to schedule jobs on computing nodes in order to make the most efficient use of the resources available.
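To make the resource-request idea concrete, here is a minimal sketch of an SGE job script and its submission. The memory, core count, and time values are illustrative, the "shared" parallel-environment name is an assumption rather than a confirmed Hoffman2 setting, and my_analysis again stands in for your own program.
<pre>
#!/bin/bash
# myjob.sh -- minimal SGE job script (all values illustrative)
#$ -cwd                          # run from the submission directory
#$ -N myjob                      # job name shown in the queue
#$ -l h_data=4G,h_rt=8:00:00     # 4GB of RAM per core, 8 hours of runtime
#$ -pe shared 4                  # 4 cores (PE name is an assumption)

./my_analysis input.dat > output.dat
</pre>
Submit the script and see where it sits in the queue:
<pre>
qsub myjob.sh
qstat -u $USER
</pre>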
[[Hoffman2 Jobs|Find out how to submit computing jobs to the cluster.]]