Directory structure
Home
Your home directory is where you start each session on the HPC cluster.
This is usually where you keep your scripts and other files you are working on.
User home directories are stored under /rhome/, which is a shortcut to /bigdata/rhome.
Each home directory has a share folder that is a shortcut to the user's lab group directory at /bigdata/labname.
The home directory is available on all the nodes.
The home directory counts against the lab quota, which is 1TB. Additional storage can be purchased.
| Path | /rhome/username |
|---|---|
| User availability | All users |
| Node availability | All nodes |
| Quota responsibility | Lab |
Bigdata
This is where large files are kept. Files here are shared with everyone in your lab.
Each lab has a quota of 1TB and additional storage can be purchased.
| Path | /bigdata/labname |
|---|---|
| User availability | All users |
| Node availability | All nodes |
| Quota responsibility | Lab |
Non-Persistent
Frequently, there is a need for faster temporary storage. For example activities like the following would fall under this category:
1. Output a significant amount of intermediate data during a job
2. Access a dataset from a faster medium than bigdata or the home directories
3. Write out lock files
These types of activities are well suited to the use of fast non-persistent spaces. Below are the filesystems available on the HPC cluster that would be best suited for these actions.
Each space is created only when a job starts and is deleted when the job ends, whether it completes, errors out, fails, or is canceled.
SSD Backed Space
This space is much faster than the persistent space (/rhome, /bigdata).
Available storage varies from node to node, since other users could be using the same space.
Data is deleted after the job completes or crashes. Make sure to move or copy the files before the job finishes.
| Path | /scratch/$SLURM_JOB_USER/$SLURM_JOB_ID |
|---|---|
| User availability | All users |
| Node availability | All worker nodes |
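Because the SSD backed space is removed when the job ends, results have to be copied to persistent storage before the job script exits. The following is a minimal sketch of one way to do that, not a cluster-provided tool: `run_in_scratch` is a hypothetical helper name, and the program name and results path in the usage comment are placeholders.

```shell
#!/bin/bash
# Hypothetical helper (not a cluster-provided tool): run a command inside a
# scratch directory, then copy whatever it produced to a persistent location.
# Usage: run_in_scratch <scratch_dir> <results_dir> <command> [args...]
run_in_scratch() {
    local scratch_dir="$1" results_dir="$2"
    shift 2
    local status=0

    # The scratch directory should already exist inside a job; mkdir -p is
    # harmless in that case and lets the sketch be tried elsewhere.
    mkdir -p "$scratch_dir" "$results_dir"

    # Run the command in a subshell so the caller's working directory is unchanged.
    ( cd "$scratch_dir" && "$@" ) || status=$?

    # Copy results back even when the command failed, so partial output is kept.
    cp -r "$scratch_dir"/. "$results_dir"/

    return "$status"
}

# Inside a job script this might be called as (my_program is a placeholder):
#   run_in_scratch "/scratch/$SLURM_JOB_USER/$SLURM_JOB_ID" \
#       "/bigdata/labname/results_$SLURM_JOB_ID" my_program
```

The copy only happens if the job script is still running at that point, so a job that is canceled or hits its time limit while the command is running may still lose its scratch data.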
RAM Backed Space
This space is much faster than the persistent space (/rhome, /bigdata) and the SSD backed space (/scratch).
This space uses RAM, so make sure you request enough memory for both your application and the storage you need. The job will crash if your application's memory use plus the storage used in this directory exceeds the memory requested for the job.
Storage space depends on how much memory is requested for the job. This space is beneficial if your program reads and writes many files at once.
Data is deleted after the job completes or crashes. Make sure to move or copy the files before the job finishes.
| Path | /tmpfs/$SLURM_JOB_USER/$SLURM_JOB_ID |
|---|---|
| User availability | All users |
| Node availability | All worker nodes |
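The memory rule for the RAM backed space is simple addition: application memory plus the files stored in /tmpfs must stay within the memory requested for the job. The sketch below works through that arithmetic; `fits_in_job_memory` is a hypothetical helper, and the sizes are made-up example values in MB.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) only if the application's estimated
# memory plus the data kept in the RAM backed space fits in the job's memory.
# All three arguments are whole numbers in MB.
fits_in_job_memory() {
    local requested_mb="$1" app_mb="$2" tmpfs_mb="$3"
    [ $(( app_mb + tmpfs_mb )) -le "$requested_mb" ]
}

# Example: a 16 GB request, 10 GB for the application, 4 GB of files in /tmpfs.
if fits_in_job_memory 16384 10240 4096; then
    echo "fits"
else
    echo "request more memory"
fi
```

In practice the application estimate is the uncertain part, so it is safer to leave some headroom rather than request exactly the sum.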
Check Quota
Our filesystem uses BeeGFS. To check your quota, use the command below.
quota_check.sh
du command
Because we use BeeGFS, the du command does not report the correct folder size.
You will need to run du with the --apparent-size flag. More information and sample du commands can be found at linux command.
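As a short illustration of the flag, the sketch below wraps du in a small function that prints the apparent size of a path in bytes. The function name `apparent_size_bytes` is hypothetical, and the options assume GNU du, as found on typical Linux systems.

```shell
#!/bin/bash
# Hypothetical wrapper: print the apparent size of a file or directory in bytes.
# -s summarizes the whole path; --block-size=1 reports the size in bytes.
apparent_size_bytes() {
    du -s --apparent-size --block-size=1 "$1" | cut -f1
}

# For a human-readable total of a lab directory, one might instead run:
#   du -sh --apparent-size /bigdata/labname
```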