Data

On this cluster there is a /data directory that is used to store files for various projects. A subdirectory will be created for your lab or project, with permissions set so that only the users you specify have access. The /data directory is not meant for long-term storage. Within Globus, the /data directory is found under the collection named Duke RCC Scratch Storage.

Lab Shared Directory

Each lab is allocated a /data/<GroupName> directory with a quota of at least 10 TB for sharing within the lab. The GroupName is often the last name of your PI followed by the word “lab”.

All lab members have read-write-execute (rwx) permissions at the root of the lab’s shared scratch space. Any file or directory that a lab member creates at this location will have that lab member as user-owner, with rwx permissions. The lab group will be the group owner, with default read and execute (r-x) permissions. If group members other than the creator need write access to a file or directory, the creator must grant that permission with chmod. ‘chmod g+w <file>’ grants group write permission to a file. ‘chmod -R g+w <directory>’ grants group write permission recursively to the specified directory and to all files and subdirectories within it.
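The two chmod commands above can be tried safely with the short sketch below. It works in a throwaway directory created with mktemp rather than in a real /data/<GroupName> path, and the names demo_dir, results, run1 and output.txt are made up for illustration. It assumes GNU coreutils, as found on most Linux clusters.

```shell
# Sketch only: a temporary directory stands in for /data/<GroupName>.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/results/run1"
touch "$demo_dir/results/run1/output.txt"

# Start from the default described above: the group can read and
# traverse, but not write (capital X adds execute to directories only).
chmod -R u=rwX,g=rX,o= "$demo_dir/results"

# Grant group write permission on a single file.
chmod g+w "$demo_dir/results/run1/output.txt"

# Grant group write permission recursively on a directory and its contents.
chmod -R g+w "$demo_dir/results"

# Inspect the resulting permissions.
ls -lR "$demo_dir/results"
```

In the ls -l output, the second group of three permission characters (for example rw- or rwx) shows what the group owner can do.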

IRB Protocol Directory

The IRB protocol accounts are meant to secure the data controlled by an IRB and to enable billing specific to that work. A /data/irb/<deptname>/pro<irbnum> directory is created to store the data, with a quota of at least 10 TB.

User Home Directory

Each user is allocated a 50 GB quota for their home directory.

Some users like to create a symlink (analogous to a shortcut in Windows) to their project’s shared directory (under /data/ or /data/irb) for ease of access. This is OK to do, but it comes with some caveats.

To create a symlink in your home directory to project shared storage, make sure you are in your home directory ('cd ~') and then run 'ln -s /data/myproject myproject'. Adjust the path to reflect your actual project shared storage. You can name the link (the last argument in the command) anything you want.
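The sketch below walks through the same steps in a form that is safe to run anywhere. Two temporary directories stand in for your home directory and for /data/myproject. The variable names fake_home and fake_project are hypothetical, and on the cluster you would use the real paths instead.

```shell
# Stand-ins for $HOME and /data/myproject so nothing real is touched.
fake_home=$(mktemp -d)
fake_project=$(mktemp -d)

cd "$fake_home"

# ln -s <target> <link name>; on the cluster the target would be
# your project shared storage, e.g. /data/myproject.
ln -s "$fake_project" myproject

# Confirm where the link points.
readlink myproject
ls -ld myproject

# Use the link only to change directory; pwd -P prints the real
# (physical) path you ended up in.
cd myproject
pwd -P
```

Running 'ls -ld' on the link shows an arrow (->) followed by the target path, which is a quick way to check that the link points where you expect.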

If you decide to use a symlink, it is best practice to use it only to change into the target directory with ‘cd’. Certain commands, such as ‘cp’ and ‘mv’, can have unintended consequences when you use the symlink path in them explicitly. For that reason, you should never copy or move files or directories into a symlink, but rather into the target the symlink points to.

If my home directory contains a file called ‘myfile’ and a symlink to /data/itlab called ‘itlab’, and I want to move ‘myfile’ into ‘itlab’, these two commands will have different results (specifically, the second leaves the wrong group owner and permissions on the target):

mv /home/dss13/myfile /data/itlab/ (mv to target - the right way)

mv /home/dss13/myfile itlab (mv to symlink - wrong group owner and permissions on target)
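One way to follow this advice when you only remember the link name is to resolve the symlink to its real path first and then move into that path. The sketch below is an illustration under assumptions, not the cluster's prescribed procedure: temporary directories stand in for the home directory and /data/itlab, the variable names are hypothetical, and 'readlink -f' and 'stat -c' assume GNU coreutils.

```shell
# Stand-ins for the home directory and /data/itlab.
fake_home=$(mktemp -d)
fake_lab=$(mktemp -d)

cd "$fake_home"
touch myfile
ln -s "$fake_lab" itlab

# Resolve the symlink to the real path of its target.
target=$(readlink -f itlab)

# Move into the resolved target path rather than into the symlink.
mv myfile "$target/"

# Check the owner, group, and permissions of the moved file.
stat -c '%U %G %A %n' "$target/myfile"
```

On the cluster, the group shown by stat should be your lab group. If it is not, the file was probably moved some other way.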

 

 

 

 

 

See On Premises HPC for details about increasing quotas.

 

See Transferring Data for ways to transfer data in and out of the cluster.