Using RStudio Server

The process described in this guidance document for using RStudio Server is currently broken (August 2023).

Running a web-server-based application such as RStudio Server on a compute node is slightly more complicated than running it on your own computer. RStudio Server is best run in a container because it has many system-level dependencies. The easiest way to do this is through a Slurm batch job script. The cluster has a default Slurm job definition for running an RStudio Server container provided by the Rocker Project.

Run an RStudio Server Instance

  1. Open a shell prompt on your local computer and log in to the SoM HPC scheduler node, replacing <NETID> with your Duke NetID.

    ssh <NETID>@dash.duhs.duke.edu
  2. Submit a batch job using the common RStudio Server Slurm job template. The default parameters are set in the job template, but you can alter the sbatch command depending on your needs (an example follows these steps). Default parameters are:

    • --time=08:00:00  Terminates the server instance after 8 hours
    • --ntasks=1  Defines the server as a single task
    • --cpus-per-task=2  Provides 2 CPUs for the server
    • --mem=8192  Provides 8192 MB of memory for the server

      sbatch /data/shared/jobs/rstudio-server.job
  3. You will see Slurm output showing the job ID:

    Submitted batch job XXXXXX
  4. In your home directory there will be a new file called rstudio-server.job.XXXXXX, where XXXXXX is the job ID. To display the connection instructions for this instance of RStudio Server, replace XXXXXX with the value provided in Step 3:

    cat rstudio-server.job.XXXXXX
  5. Follow the connection instructions, which will look like the following:

    # NOTE: THIS IS AN EXAMPLE. You must follow the instructions in the rstudio-server.job.XXXXXX file in your home directory.
    
    1. Start an interactive session on the cluster node where RStudio is running (default duration is 8 hours; feel free to modify if needed)
    
       srun --mem 500 --time=8:00:00 -p exec --pty bash
    
    2. SSH tunnel from the node back to the scheduler node using the internal URL. After entering your password, the process will continue with no additional output to the terminal.
    
       ssh -NR <PORT>:localhost:<PORT> <NETID>@somhpc-tunnel.azure.dhe.duke.edu
    
    3. From your local workstation, SSH tunnel to the scheduler node. After entering your password, the process will continue with no additional output to the terminal.
    
       ssh -NL <PORT>:localhost:<PORT> <NETID>@somhpc-tunnel.azure.dhe.duke.edu
    
       and point your web browser to http://localhost:<PORT>
    
    4. Log in to RStudio Server using the following credentials:
    
       user: <NETID>
       password: <PASSWORD>
    
    When done using RStudio Server, terminate the job as follows:
    
    1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
    2. Type Ctrl-C to close the SSH tunnel on the interactive session
    3. Type "exit" to end the interactive session
    4. Issue the following command on the scheduler node:
    
          scancel -f <JOBID>
    
    5. On your local workstation, type Ctrl-C to close the SSH tunnel to the scheduler node
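
To change the job's runtime or resources, override the defaults from Step 2 on the sbatch command line; in Slurm, command-line flags take precedence over the #SBATCH directives inside the job script, so the template itself does not need to be edited. For example (the values below are illustrative):

    sbatch --time=24:00:00 --cpus-per-task=4 --mem=16384 /data/shared/jobs/rstudio-server.job

You can check the state of a submitted job with squeue -j XXXXXX, replacing XXXXXX with your job ID.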

Altering the RStudio Server Environment

If additional Debian (or other) software packages are needed, the Rocker Project RStudio Server container base image we use can be extended in a Singularity definition file. Note that sudo privileges are required to use the singularity build command unless you use a remote builder such as the Sylabs Cloud Remote Builder. Alternatively, a Rocker base image can be extended in a Dockerfile and a Singularity image built using the docker2singularity Docker image. Modifications to the base Rocker image are not needed for installing R packages into a personal library in the user's home directory.
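
As a minimal sketch, a Singularity definition file extending a Rocker base image might look like the following (the image tag and package names are illustrative; substitute whatever your work requires):

    Bootstrap: docker
    From: rocker/rstudio:4.3.1

    %post
        # Install additional Debian packages (illustrative examples)
        apt-get update
        apt-get install -y libxml2-dev libcurl4-openssl-dev
        apt-get clean

With sudo privileges, such an image could then be built with, for example, sudo singularity build rstudio-custom.sif rstudio-custom.def (the file names here are hypothetical).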

We have recently added the Singularity fakeroot feature, which allows users to build Singularity images on the cluster without real sudo privileges. To request this feature, please see the Using Singularity guide.
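
With fakeroot enabled for your account, the same image can be built on the cluster without sudo; a sketch, using the hypothetical file names from the example above:

    singularity build --fakeroot rstudio-custom.sif rstudio-custom.def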

Once you have built a new container, you can copy the Slurm batch job script located at /data/shared/jobs/rstudio-server.job on the SoM Shared Cluster and modify it with your new container's location and name, as sketched below.
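
A sketch of that workflow (the copied file name is illustrative, and the exact line to edit depends on the template's contents):

    # Copy the shared job template into your home directory
    cp /data/shared/jobs/rstudio-server.job ~/rstudio-server-custom.job

    # Edit the copy so the container image path points at your new .sif
    # file, then submit the modified job
    sbatch ~/rstudio-server-custom.job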