Using Jupyter Notebook
Running a Jupyter Notebook on the cluster is slightly more complicated than on your own computer. The server must run on a compute node, not on the low-performance scheduler node, and compute nodes in the cluster are reachable only through interactive sessions (not through ssh). Below is an example of running a Jupyter Notebook on a DASH execute node.
Run a Jupyter Notebook Instance
Open a shell prompt on your local computer and log in to the SoM HPC scheduler node, replacing <NETID> with your Duke NetID.
ssh <NETID>@dash.duhs.duke.edu
Activate the conda or venv environment that contains the desired version of Jupyter Notebook.
conda activate <my-conda-env-name>
Submit a batch job using the common Jupyter Notebook slurm job template. The default parameters for this job are in the job template, but you can alter the sbatch command depending on your needs. Default parameters are:
--time=08:00:00: Terminates the notebook instance after 8 hours
--ntasks=1: Defines the notebook as a single task
--cpus-per-task=2: Provides 2 CPUs for the notebook
--mem=8192: Provides 8192 MB of memory for the notebook
sbatch /data/shared/jobs/jupyter-notebook.job
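Slurm gives options passed on the sbatch command line precedence over #SBATCH directives inside a job script, so the defaults can be changed without editing the shared template. The following is a minimal sketch, assuming the template sets its defaults through #SBATCH directives; the helper name sbatch_cmd_for_notebook and the example values are hypothetical. It only prints the command so you can review it before running it on the scheduler node.

```shell
# Hypothetical dry-run helper: print an sbatch command that overrides the
# template defaults. Arguments: walltime, CPU count, memory in MB.
# Missing arguments fall back to the defaults listed in this guide.
sbatch_cmd_for_notebook() {
  walltime="${1:-08:00:00}"
  cpus="${2:-2}"
  mem_mb="${3:-8192}"
  printf 'sbatch --time=%s --cpus-per-task=%s --mem=%s %s\n' \
    "$walltime" "$cpus" "$mem_mb" /data/shared/jobs/jupyter-notebook.job
}

# Example: request 4 hours, 4 CPUs, and 16384 MB instead of the defaults.
sbatch_cmd_for_notebook 04:00:00 4 16384
```

Copy the printed line into your shell on the scheduler node once the values look right.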
Slurm prints a confirmation line showing the job id.
Submitted batch job XXXXXX
In your home directory there will be a new file called jupyter-notebook.job.XXXXXX, where XXXXXX is the job id. To display the connection instructions for this instance of the Jupyter Notebook, run the following command, replacing XXXXXX with the job id reported when you submitted the job.
cat jupyter-notebook.job.XXXXXX
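If you prefer not to copy the job id by hand, it can be pulled out of the confirmation line that sbatch prints. This is a rough sketch rather than part of the cluster tooling: the helper name job_id_from_sbatch_output is hypothetical, and the job id 123456 is made up for illustration.

```shell
# Hypothetical helper: extract the job id from sbatch's confirmation line
# ("Submitted batch job <id>"). Prints nothing if the line does not match.
job_id_from_sbatch_output() {
  printf '%s\n' "$1" | awk '/^Submitted batch job / { print $4 }'
}

# Example with a made-up job id; on the scheduler node you would pass the
# real sbatch output instead of this literal string.
job_id=$(job_id_from_sbatch_output "Submitted batch job 123456")
echo "cat $HOME/jupyter-notebook.job.$job_id"
```

On the scheduler node, the same helper could be fed the captured output of the sbatch command, for example out=$(sbatch /data/shared/jobs/jupyter-notebook.job).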
Follow the connection instructions, which look like the following.
# NOTE: THIS IS AN EXAMPLE, you must follow the instructions in the jupyter-notebook.job.XXXXXX in your home directory

1. Start an interactive session on the cluster node where Jupyter Notebook is running
   (Default duration is 8 hours, feel free to modify if needed)

   srun --mem 500 --time=8:00:00 -p execute -w <NODENAME> --pty bash

2. SSH tunnel from the node back to the scheduler node using the internal URL.
   NOTE: After entering your password, the process will continue with no additional output to the terminal. (VERY IMPORTANT!!)

   ssh -NR <PORT>:localhost:<PORT> <NETID>@somhpc-tunnel.azure.dhe.duke.edu

3. From your local workstation, SSH tunnel to the scheduler node using the public URL.
   NOTE: After entering your password, the process will continue with no additional output to the terminal. (VERY IMPORTANT!!)

   ssh -NL <PORT>:localhost:<PORT> <NETID>@somhpc-tunnel.azure.dhe.duke.edu

   and point your web browser to http://localhost:<PORT>

4. log in to Jupyter Notebook using the following credentials:

   password: <PASSWORD>

When done using Jupyter Notebook, terminate the job by:

1. Choose Logout on your Jupyter Notebook session in the browser
2. Type Ctrl-C to close the SSH tunnel on the interactive session
3. Type "exit" to end the interactive session
4. Issue the following command on the scheduler node:

   scancel -f <JOBID>

5. On your local workstation, type Ctrl-C to close the SSH tunnel to the scheduler node

# Output from Jupyter Server
--------------------------------------------------------------------------------
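Because the same port must appear on both sides of the tunnel, a mistyped port is an easy way for the connection to fail silently. The sketch below builds the local-workstation tunnel command from a port and NetID and rejects values that are not valid port numbers. The helper name local_tunnel_cmd and the values 8888 and abc123 are hypothetical; always use the port and host given in your own jupyter-notebook.job.XXXXXX file.

```shell
# Hypothetical helper: print the local-workstation tunnel command for a given
# port and NetID. Returns non-zero if the port is not a number in 1-65535.
local_tunnel_cmd() {
  port="$1"
  netid="$2"
  case "$port" in
    ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range" >&2
    return 1
  fi
  printf 'ssh -NL %s:localhost:%s %s@somhpc-tunnel.azure.dhe.duke.edu\n' \
    "$port" "$port" "$netid"
}

# Example with made-up values; use the port and NetID from your own job file.
local_tunnel_cmd 8888 abc123
```

The helper only prints the command; run the printed line yourself, then open http://localhost:<PORT> in your browser.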
If you wish to set up Jupyter Notebook using a different kernel, please refer to this document for additional configuration.