Using RStudio Server
Note: The process described in this guide for using RStudio Server is currently broken (August 2023).
Running a web-server-based application such as RStudio Server on a compute node is slightly more complicated than running it on your own computer. RStudio Server is best run in a container, as it has many system-level dependencies. The easiest way to do this is through a Slurm batch job script. The cluster has a default Slurm job definition for running an RStudio Server container provided by the Rocker Project.
Run an RStudio Server Instance
Open a shell prompt on your local computer and log in to the SoM HPC scheduler node, replacing <NETID> with your Duke NetID.
ssh <NETID>@dash.duhs.duke.edu
Submit a batch job using the common RStudio Server Slurm job template. The default parameters for this job are set in the job template, but you can alter the sbatch command depending on your needs. The default parameters are:

--time=08:00:00
    Terminates the server instance after 8 hours
--ntasks=1
    Defines the server as a single task
--cpus-per-task=2
    Provides 2 CPUs for the server
--mem=8192
    Provides 8192 MB of memory for the server

sbatch /data/shared/jobs/rstudio-server.job
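The defaults above can be overridden on the sbatch command line without editing the template. As an illustrative sketch only (the flag values here are examples, not recommendations; check your actual resource needs and cluster limits first):

```
# Request 24 hours, 4 CPUs, and 16 GB of memory instead of the defaults
sbatch --time=24:00:00 --cpus-per-task=4 --mem=16384 /data/shared/jobs/rstudio-server.job
```

Command-line options passed to sbatch take precedence over the matching #SBATCH directives inside the job script.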
You will see Slurm output showing you the job ID:
Submitted batch job XXXXXX
In your home directory there will be a new file called rstudio-server.job.XXXXXX, where XXXXXX is the job ID. To display the connection instructions for this instance of the RStudio Server, replace XXXXXX with the value provided to you in Step 3:

cat rstudio-server.job.XXXXXX
Follow the connection instructions, which look like the following:

# NOTE: THIS IS AN EXAMPLE, you must follow the instructions in the rstudio-server.job.XXXXXX file in your home directory

1. Start an interactive session on the cluster node where RStudio is running (default duration is 8 hours; feel free to modify if needed):

   srun --mem 500 --time=8:00:00 -p exec --pty bash

2. SSH tunnel from the node back to the scheduler node using the internal URL. After entering your password, the process will continue with no additional output to the terminal:

   ssh -NR <PORT>:localhost:<PORT> <NETID>@somhpc-tunnel.azure.dhe.duke.edu

3. From your local workstation, SSH tunnel to the scheduler node. After entering your password, the process will continue with no additional output to the terminal:

   ssh -NL <PORT>:localhost:<PORT> <NETID>@somhpc-tunnel.azure.dhe.duke.edu

   and point your web browser to http://localhost:<PORT>

4. Log in to RStudio Server using the following credentials:

   user: <NETID>
   password: <PASSWORD>

When done using RStudio Server, terminate the job by:

1. Exit the RStudio session ("power" button in the top right corner of the RStudio window)
2. Type Ctrl-C to close the SSH tunnel on the interactive session
3. Type "exit" to end the interactive session
4. Issue the following command on the scheduler node: scancel -f <JOBID>
5. On your local workstation, type Ctrl-C to close the SSH tunnel to the scheduler node
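If you connect frequently, the local tunnel in step 3 can optionally be captured in an SSH client configuration entry so the forwarding flags do not need to be retyped. This is a sketch, not part of the provided instructions; the host alias "somhpc-tunnel" is a name of your choosing, and 8787 is a placeholder for the <PORT> value from your rstudio-server.job.XXXXXX file:

```
# ~/.ssh/config on your local workstation -- example only
Host somhpc-tunnel
    HostName somhpc-tunnel.azure.dhe.duke.edu
    User <NETID>
    # Replace 8787 (both values) with the <PORT> from your connection instructions
    LocalForward 8787 localhost:8787
```

With this entry, running ssh -N somhpc-tunnel is equivalent to the ssh -NL command shown in step 3.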
Altering the RStudio Server Environment
If additional Debian (or other) software packages are needed, the Rocker Project RStudio Server container base image we use can be extended in a Singularity definition file. Note that sudo privileges are required to use the singularity build command, unless you use a remote builder such as the Sylabs Cloud Remote Builder. Alternatively, a Rocker base image can be extended in a Dockerfile and a Singularity image built using the docker2singularity Docker image. Modifications to the base Rocker image are not needed for installing R packages into a personal library in the user's home directory.
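As a sketch, a Singularity definition file extending a Rocker base image might look like the following. The image tag and the Debian packages shown are illustrative assumptions, not tested choices; substitute the Rocker image and packages your work requires:

```
# rstudio-custom.def -- example Singularity definition (illustrative only)
Bootstrap: docker
From: rocker/rstudio:4.3.1

%post
    # Install additional Debian packages (example: geospatial system libraries)
    apt-get update && apt-get install -y \
        libgdal-dev \
        libudunits2-dev
    apt-get clean
```

Such a definition would be built with, e.g., sudo singularity build rstudio-custom.sif rstudio-custom.def, or via a remote builder or the fakeroot feature described below if sudo is not available.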
We have recently added the Singularity fakeroot feature, which allows users to build Singularity images on the cluster without real sudo privileges. To request this feature, please see the Using Singularity guide.
Once you have built a new container, copy the Slurm batch job script located at /data/shared/jobs/rstudio-server.job on the SoM Shared Cluster and modify it with your new container's location and name.
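The copy-and-modify step might look like the following sketch. The file name rstudio-custom.job is a name of your choosing, and the assumption that the job script references the container image by a path you can point at your own .sif file should be verified against the actual script contents before editing:

```
# Copy the shared job template into your home directory
cp /data/shared/jobs/rstudio-server.job ~/rstudio-custom.job

# Edit ~/rstudio-custom.job so the container path points at your new image,
# then submit the modified job
sbatch ~/rstudio-custom.job
```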