Using Bowtie
One way to run Bowtie on an Azure HPC cluster is to create a Slurm job file, "bowtie.slurm", and edit the fastq and reference genome file names in it as needed. In more advanced setups, the file names can be taken from the user environment or substituted on the fly with "sed":
bowtie.slurm
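#!/bin/bash
#SBATCH -o bowtie.slurm.%j.%N.out
#SBATCH -e bowtie.slurm.%j.%N.err
#SBATCH -D /home/ter18/example_scripts
#SBATCH -J bowtie.slurm
#SBATCH -c 8
#SBATCH --get-user-env
#SBATCH --time=12:00:00
#SBATCH --exclusive
#
READ1_FASTQ=/path/to/file.fastq
# leave empty for single-end alignment
READ2_FASTQ=
REFERENCE_GENOME=/data/common/bowtie_genomes/hg19.ebwt/hg19
# Output name: the READ1 name with its .fastq extension replaced by .bowtie
OUTPUT=$(echo "$READ1_FASTQ" | sed -e 's/\.fastq$/.bowtie/')
# The variables are deliberately unquoted: an empty READ2_FASTQ then drops out,
# so bowtie_align.sh receives 4 arguments (single-end) instead of 5 (paired-end).
# The final argument is the thread count and should match "#SBATCH -c".
srun bowtie_align.sh $READ1_FASTQ $READ2_FASTQ $REFERENCE_GENOME $OUTPUT 8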
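The on-the-fly replacement via "sed" mentioned earlier could look roughly like the sketch below. It assumes the READ1_FASTQ= and READ2_FASTQ= assignments each start their own line in bowtie.slurm. The function name make_job_file and the sample file names are hypothetical, used only for illustration.

```shell
#!/bin/bash
# Sketch: write a per-sample copy of a job template with the fastq paths filled in.
# make_job_file is a hypothetical helper, not part of Bowtie or Slurm.
# Usage: make_job_file TEMPLATE READ1 READ2 OUTFILE   (READ2 may be "")
make_job_file() {
    local template=$1 read1=$2 read2=$3 out=$4
    # "|" is the sed delimiter so that "/" in the paths needs no escaping.
    # Paths containing "|" or "&" would need extra escaping (not handled here).
    sed -e "s|^READ1_FASTQ=.*|READ1_FASTQ=${read1}|" \
        -e "s|^READ2_FASTQ=.*|READ2_FASTQ=${read2}|" \
        "$template" > "$out"
}

# Example (hypothetical file names):
#   make_job_file bowtie.slurm /data/run1/sampleA.fastq "" sampleA.slurm
#   sbatch -N 1 ./sampleA.slurm
```

Writing a separate job file per sample, rather than editing the template in place, leaves a record of exactly what was submitted for each run.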
The job file calls a bash script that does the actual work:
bowtie_align.sh
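#!/bin/bash
BOWTIE=$(which bowtie)
#
# Parameters (single-end, 4 arguments):
#   $1 - READ1 fastq file
#   $2 - Bowtie reference genome (index basename)
#   $3 - Output file
#   $4 - Number of processors
#
# Parameters (paired-end, 5 arguments):
#   $1 - READ1 fastq file
#   $2 - READ2 fastq file
#   $3 - Bowtie reference genome (index basename)
#   $4 - Output file
#   $5 - Number of processors
#
if [ -z "$5" ]
then
    # single-end alignment
    "$BOWTIE" -p "$4" -t --chunkmbs 512 --best "$2" "$1" "$3"
else
    # paired-end alignment; -X 2000 sets the maximum insert size
    "$BOWTIE" -p "$5" -t --chunkmbs 512 --best -X 2000 "$3" -1 "$1" -2 "$2" "$4"
fi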
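The 4-versus-5 parameter convention relies on the shell dropping an empty, unquoted variable from the argument list. A minimal sketch of that behavior follows, using a hypothetical count_args helper and placeholder file names:

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

READ1_FASTQ=reads_1.fastq      # placeholder names
REFERENCE_GENOME=hg19
OUTPUT=reads_1.bowtie

READ2_FASTQ=                   # empty: single-end
single=$(count_args $READ1_FASTQ $READ2_FASTQ $REFERENCE_GENOME $OUTPUT 8)

READ2_FASTQ=reads_2.fastq      # set: paired-end
paired=$(count_args $READ1_FASTQ $READ2_FASTQ $REFERENCE_GENOME $OUTPUT 8)

# Quoting the empty variable passes an empty string as its own argument,
# so the count stays at 5 even though there is no second read file.
READ2_FASTQ=
quoted=$(count_args "$READ1_FASTQ" "$READ2_FASTQ" "$REFERENCE_GENOME" "$OUTPUT" 8)

echo "single-end: $single, paired-end: $paired, quoted empty: $quoted"
```

This is why the srun line in the job file leaves its variables unquoted. The trade-off is that file names containing spaces would be split into several arguments.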
Finally, to run the alignment, submit the job file to the Slurm queue:
$ sbatch -N 1 ./bowtie.slurm
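For the variant in which file names come from the environment rather than from edits to the job file, one possible sketch follows. It assumes bowtie.slurm is modified to keep its built-in values only when the variables are unset, and it passes the variables with sbatch's --export option. The helper name submit_cmd and the sample paths are hypothetical.

```shell
#!/bin/bash
# Assumed change inside bowtie.slurm (not the script as originally written):
#   READ1_FASTQ=${READ1_FASTQ:-/path/to/file.fastq}
#   READ2_FASTQ=${READ2_FASTQ:-}

# submit_cmd is a hypothetical helper that only prints the sbatch command line,
# so the command can be inspected before anything is submitted.
# Usage: submit_cmd READ1 [READ2]
submit_cmd() {
    local exports="ALL,READ1_FASTQ=$1"
    # Add READ2_FASTQ only for paired-end runs.
    if [ -n "${2:-}" ]; then
        exports="${exports},READ2_FASTQ=$2"
    fi
    # Paths containing commas would not survive the --export list.
    echo "sbatch -N 1 --export=${exports} ./bowtie.slurm"
}

submit_cmd /data/run1/sampleA_1.fastq /data/run1/sampleA_2.fastq
submit_cmd /data/run1/sampleB.fastq
```

Printing the command first is a deliberate choice here: the output can be checked by eye and then run as-is once it looks right.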