Wrapper script to run HISAT2.

Aligns reads to a reference genome using HISAT2. This requires an existing index, which can be built with HISAT2 itself (via hisat2-build). Indices for commonly used genomes can also be downloaded from the HISAT2 homepage.
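
As a rough illustration of creating an index with HISAT2 itself, the sketch below assembles the arguments for the hisat2-build command and runs it only when the tool and the input file are present. The helper name hisat2_build_args, the file genome.fa and the basename index/genome are hypothetical and not part of this package.

```r
# Hypothetical helper (not part of this package): assemble arguments for
# hisat2-build, whose usage is: hisat2-build [options] <reference_in> <ht2_base>
hisat2_build_args <- function(reference, idx, threads = 1) {
  stopifnot(is.character(reference), length(reference) >= 1,
            is.character(idx), length(idx) == 1)
  c("-p", as.character(threads),        # number of threads
    paste(reference, collapse = ","),   # comma-separated FASTA files
    idx)                                # basename of the index to write
}

# "genome.fa" and "index/genome" are placeholder names.
args <- hisat2_build_args("genome.fa", "index/genome", threads = 4)
print(args)

# Run only if hisat2-build is on the PATH and the placeholder FASTA exists.
if (nzchar(Sys.which("hisat2-build")) && file.exists("genome.fa")) {
  system2("hisat2-build", args)
}
```

The resulting basename (here index/genome) is what would then be passed as idx.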

run_hisat2(hisat2 = "hisat2", idx = NULL, mate1 = NULL,
  mate2 = NULL, fastq = TRUE, fasta = FALSE,
  softClipPenalty = NULL, noSoftClip = FALSE, noSplice = FALSE,
  knownSplice = NULL, strand = NULL, tmo = FALSE, maxAlign = NULL,
  secondary = FALSE, minInsert = NULL, maxInsert = NULL,
  nomixed = FALSE, nodiscordant = FALSE, threads = 1, rgid = NULL,
  quiet = FALSE, non_deterministic = FALSE)

Arguments

hisat2	Path to hisat2 (if using WSL, this should be the full path on the Linux subsystem).
idx	The basename of the index for the reference genome. The basename is the name of any of the index files up to but not including the final .1.ht2, etc.
mate1	Comma-separated list of files containing mate 1s (filename usually includes _1)
mate2	Comma-separated list of files containing mate 2s (filename usually includes _2). Sequences specified with this option must correspond file-for-file and read-for-read with those specified in mate1.
fastq	Logical indicating if reads are FASTQ files.
fasta	Logical indicating if reads are FASTA files.
softClipPenalty	Sets the maximum (MX) and minimum (MN) penalties for soft-clipping per base, both integers. Must be given in the format "MX,MN".
noSoftClip	Logical indicating whether to disallow soft-clipping.
noSplice	Logical indicating whether to switch off spliced alignment, e.g., for DNA-seq analysis.
knownSplice	Path to text file containing known splice sites.
strand	Specify strand-specific information. Default is unstranded.
tmo	Logical indicating whether to report only those reads aligning to known transcripts.
maxAlign	Integer indicating the maximum number of distinct primary alignments to search for each read.
secondary	Logical indicating whether to report secondary alignments.
minInsert	The minimum fragment length for valid paired-end alignments. This option is valid only with noSplice = TRUE.
maxInsert	The maximum fragment length for valid paired-end alignments. This option is valid only with noSplice = TRUE.
nomixed	By default, when hisat2 cannot find a concordant or discordant alignment for a pair, it then tries to find alignments for the individual mates. If TRUE, this option disables that behavior.
nodiscordant	By default, hisat2 looks for discordant alignments if it cannot find any concordant alignments. If TRUE, this option disables that behavior.
threads	An integer indicating the number of workers to use. If NULL, one less than the maximum number of cores is used. The default shown in the usage above is 1.
rgid	Character string, to which the read group ID is set.
quiet	If TRUE, print nothing except alignments and serious errors.
non_deterministic	When set to TRUE, HISAT2 re-initializes its pseudo-random generator for each read using the current time.
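
The idx, mate1 and mate2 arguments expect specific string formats. The following is a minimal sketch, under stated assumptions, of how those strings could be prepared in base R. The helpers index_basename and collapse_mates are hypothetical and not part of this package, and the HB2 file names are invented for illustration.

```r
# Derive the index basename by stripping the trailing ".<n>.ht2"
# (or ".<n>.ht2l" for large indices) from one index file name.
index_basename <- function(index_file) {
  sub("\\.[0-9]+\\.ht2l?$", "", index_file)
}

# Collapse paired file vectors into the comma-separated strings expected by
# mate1 and mate2, after checking that they pair up file-for-file.
collapse_mates <- function(m1, m2) {
  stopifnot(length(m1) == length(m2), length(m1) > 0)
  list(mate1 = paste(m1, collapse = ","),
       mate2 = paste(m2, collapse = ","))
}

idx <- index_basename("index/UCSC.hg19.1.ht2")

# The HB2 file names are placeholders for a second sample.
mates <- collapse_mates(
  c("HB1_sample_1.fastq.gz", "HB2_sample_1.fastq.gz"),
  c("HB1_sample_2.fastq.gz", "HB2_sample_2.fastq.gz")
)

print(idx)
print(mates)
```

The values idx, mates$mate1 and mates$mate2 could then be supplied to the corresponding arguments.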

Value

Alignment file in SAM format

Examples

# NOT RUN {
run_hisat2(hisat2 = "hisat2", idx = "../prana/data-raw/index/UCSC.hg19",
  mate1 = "../prana/data-raw/seqFiles/HB1_sample_1.fastq.gz",
  mate2 = "../prana/data-raw/seqFiles/HB1_sample_2.fastq.gz",
  fastq = TRUE, fasta = FALSE, softClipPenalty = NULL, noSoftClip = FALSE,
  noSplice = FALSE, knownSplice = NULL, strand = NULL, tmo = FALSE,
  maxAlign = NULL, secondary = FALSE, minInsert = NULL, maxInsert = NULL,
  nomixed = FALSE, nodiscordant = FALSE,
  threads = (parallel::detectCores() - 1), rgid = NULL, quiet = FALSE,
  non_deterministic = TRUE)
# }
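
Some argument combinations described in the Arguments section can be checked before launching an alignment. The sketch below is one possible pre-flight check in base R. The function check_hisat2_args is hypothetical, it is not known whether run_hisat2 performs similar checks itself, and the rule that minInsert should not exceed maxInsert is an added assumption.

```r
# Hypothetical pre-flight check; returns a character vector of problems
# (empty when nothing is flagged).
check_hisat2_args <- function(softClipPenalty = NULL, noSplice = FALSE,
                              minInsert = NULL, maxInsert = NULL) {
  problems <- character(0)

  # softClipPenalty must be two integers in the form "MX,MN".
  if (!is.null(softClipPenalty) &&
      !grepl("^[0-9]+,[0-9]+$", softClipPenalty)) {
    problems <- c(problems, "softClipPenalty must look like \"MX,MN\"")
  }

  # minInsert and maxInsert are documented as valid only with noSplice = TRUE.
  if (!noSplice && (!is.null(minInsert) || !is.null(maxInsert))) {
    problems <- c(problems, "minInsert/maxInsert require noSplice = TRUE")
  }

  # Assumption: a minimum larger than the maximum is a mistake.
  if (!is.null(minInsert) && !is.null(maxInsert) && minInsert > maxInsert) {
    problems <- c(problems, "minInsert must not exceed maxInsert")
  }

  problems
}

ok  <- check_hisat2_args(softClipPenalty = "2,1", noSplice = TRUE,
                         minInsert = 0, maxInsert = 500)
bad <- check_hisat2_args(softClipPenalty = "2", maxInsert = 500)

print(ok)
print(bad)
```

An empty result means none of these particular rules was violated; it does not guarantee that the alignment itself will succeed.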
