Aligns reads to a reference genome using hisat2. This requires an existing index, which can be built with hisat2 itself. Indices for commonly used genomes can also be downloaded from the HISAT2 homepage.
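Before aligning, it can help to confirm that index files actually exist for the basename you intend to pass as `idx`. The following is a minimal sketch, not part of this package: `find_hisat2_index()` is a hypothetical helper, the file names are placeholders, and the `hisat2-build` call is left commented out because it needs HISAT2 installed and a real FASTA file.

```r
# Hypothetical helper (not part of the package): list the HISAT2 index files
# that share the basename `idx`, e.g. "genome.1.ht2", "genome.2.ht2", ...
find_hisat2_index <- function(idx) {
  files <- list.files(dirname(idx))
  prefix <- paste0(basename(idx), ".")
  suffix <- substring(files, nchar(prefix) + 1)
  # Keep files named <basename>.<number>.ht2 (or .ht2l for large indices).
  keep <- startsWith(files, prefix) & grepl("^[0-9]+\\.ht2l?$", suffix)
  file.path(dirname(idx), files[keep])
}

# Demonstration with empty placeholder files in a temporary directory.
index_dir <- tempfile("index_")
dir.create(index_dir)
file.create(file.path(index_dir, c("genome.1.ht2", "genome.2.ht2", "genome.fa")))

idx <- file.path(index_dir, "genome")
index_files <- find_hisat2_index(idx)

if (length(index_files) == 0) {
  # Not run here: building an index needs hisat2-build on the PATH and a
  # real FASTA file.
  # system2("hisat2-build", c(file.path(index_dir, "genome.fa"), idx))
  message("No index found for basename: ", idx)
}
```

The helper matches the prefix literally rather than as a regular expression, so basenames containing dots (such as `UCSC.hg19`) are handled as expected.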
run_hisat2(hisat2 = "hisat2", idx = NULL, mate1 = NULL, mate2 = NULL,
  fastq = TRUE, fasta = FALSE, softClipPenalty = NULL, noSoftClip = FALSE,
  noSplice = FALSE, knownSplice = NULL, strand = NULL, tmo = FALSE,
  maxAlign = NULL, secondary = FALSE, minInsert = NULL, maxInsert = NULL,
  nomixed = FALSE, nodiscordant = FALSE, threads = 1, rgid = NULL,
  quiet = FALSE, non_deterministic = FALSE)
| Argument | Description |
|---|---|
| hisat2 | Path to the hisat2 executable. If using WSL, this should be the full path on the Linux subsystem. |
| idx | The basename of the index for the reference genome. The basename is the name of any of the index files up to but not including the final .1.ht2, etc. |
| mate1 | Comma-separated list of files containing mate 1s (filename usually includes _1). |
| mate2 | Comma-separated list of files containing mate 2s (filename usually includes _2). Sequences specified with this option must correspond file-for-file and read-for-read with those specified in mate1. |
| fastq | Logical indicating whether the reads are FASTQ files. |
| fasta | Logical indicating whether the reads are FASTA files. |
| softClipPenalty | Sets the maximum (MX) and minimum (MN) penalties for soft-clipping per base, both integers. Must be given in the format "MX,MN". |
| noSoftClip | Logical indicating whether to disallow soft-clipping. |
| noSplice | Logical indicating whether to switch off spliced alignment, e.g., for DNA-seq analysis. |
| knownSplice | Path to a text file containing known splice sites. |
| strand | Specifies strand-specific information. Default is unstranded. |
| tmo | Logical indicating whether to report only those reads aligning to known transcripts. |
| maxAlign | Integer indicating the maximum number of distinct primary alignments to search for each read. |
| secondary | Logical indicating whether to report secondary alignments. |
| minInsert | The minimum fragment length for valid paired-end alignments. This option is valid only with noSplice = TRUE. |
| maxInsert | The maximum fragment length for valid paired-end alignments. This option is valid only with noSplice = TRUE. |
| nomixed | By default, when hisat2 cannot find a concordant or discordant alignment for a pair, it then tries to find alignments for the individual mates. If TRUE, this option disables that behavior. |
| nodiscordant | By default, hisat2 looks for discordant alignments if it cannot find any concordant alignments. If TRUE, this option disables that behavior. |
| threads | Integer indicating the number of workers to use. If NULL, one less than the maximum number of cores is used. The default in the usage above is 1. |
| rgid | Character string to which the read group ID is set. |
| quiet | If TRUE, print nothing except alignments and serious errors. |
| non_deterministic | When set to TRUE, HISAT2 re-initializes its pseudo-random generator for each read using the current time. |
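Two of the arguments above expect specially formatted strings: `mate1` and `mate2` take comma-separated file lists that must pair up file-for-file, and `softClipPenalty` takes `"MX,MN"`. The sketch below shows one way to prepare them. The file names are hypothetical, `soft_clip_penalty()` is an illustrative helper rather than a package function, and the pairing check assumes the `_1` / `_2` naming convention mentioned in the table.

```r
# Hypothetical sample files; replace with real paths.
mate1_files <- c("sampleA_1.fastq.gz", "sampleB_1.fastq.gz")
mate2_files <- c("sampleA_2.fastq.gz", "sampleB_2.fastq.gz")

# mate1 and mate2 must correspond file-for-file, so check that each mate 2
# name is the mate 1 name with "_1." swapped for "_2." (an assumed convention).
stopifnot(identical(sub("_1\\.", "_2.", mate1_files), mate2_files))

# Collapse each vector into the comma-separated form the arguments expect.
mate1 <- paste(mate1_files, collapse = ",")
mate2 <- paste(mate2_files, collapse = ",")

# Illustrative helper: build the "MX,MN" string from two whole numbers.
soft_clip_penalty <- function(mx, mn) {
  stopifnot(mx == as.integer(mx), mn == as.integer(mn))
  paste(as.integer(mx), as.integer(mn), sep = ",")
}

penalty <- soft_clip_penalty(2, 1)
```

The resulting `mate1`, `mate2`, and `penalty` values could then be passed as the `mate1`, `mate2`, and `softClipPenalty` arguments.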
    
Alignment file in SAM format
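As a rough illustration of working with SAM output in base R, the sketch below separates header lines (which start with `@`) from alignment records and counts unmapped reads using the FLAG field. The SAM content here is a tiny hand-written stand-in, not real output from this function, and where the function writes its file is not assumed.

```r
# A tiny hand-written SAM file stands in for real output (hypothetical data).
sam_path <- tempfile(fileext = ".sam")
writeLines(c(
  "@HD\tVN:1.0\tSO:unsorted",
  "@SQ\tSN:chr1\tLN:1000",
  "read1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
  "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII"
), sam_path)

lines <- readLines(sam_path)

# Header lines begin with "@"; everything else is an alignment record.
is_header <- startsWith(lines, "@")
records <- strsplit(lines[!is_header], "\t", fixed = TRUE)

# The second tab-separated field of each record is the bitwise FLAG.
flags <- as.integer(vapply(records, `[`, character(1), 2))

# Bit 0x4 of FLAG marks an unmapped read.
n_unmapped <- sum(bitwAnd(flags, 4L) != 0)
```

For real data sets, a dedicated tool or package for SAM/BAM files is likely a better fit than line-by-line parsing.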
# NOT RUN {
run_hisat2(hisat2 = "hisat2",
           idx = "../prana/data-raw/index/UCSC.hg19",
           mate1 = "../prana/data-raw/seqFiles/HB1_sample_1.fastq.gz",
           mate2 = "../prana/data-raw/seqFiles/HB1_sample_2.fastq.gz",
           fastq = TRUE, fasta = FALSE,
           softClipPenalty = NULL, noSoftClip = FALSE,
           noSplice = FALSE, knownSplice = NULL,
           strand = NULL, tmo = FALSE,
           maxAlign = NULL, secondary = FALSE,
           minInsert = NULL, maxInsert = NULL,
           nomixed = FALSE, nodiscordant = FALSE,
           threads = (parallel::detectCores() - 1),
           rgid = NULL, quiet = FALSE,
           non_deterministic = TRUE)
# }
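The example above computes `threads` as `parallel::detectCores() - 1`. That expression can yield `NA` (when the core count is unknown) or `0` (on a single-core machine), so a guarded variant may be safer. This is a small sketch of one way to do that, with `threads` falling back to a single worker:

```r
# detectCores() returns NA when the core count cannot be determined.
n_cores <- parallel::detectCores()

# Leave one core free, but never go below one worker.
threads <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)
```

The resulting value could then be supplied as the `threads` argument.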