Abstract gRNA
To encapsulate the motifs that we search for on the genome, we use an abstract definition of a gRNA/off-target, which is implemented in the Motif data structure.
CHOPOFF.Motif — Type
Motif(
alias::String,
fwdmotif::String,
fwdpam::String,
forward_strand::Bool = true,
reverse_strand::Bool = true,
distance::Int = 4,
extends5::Bool = true,
ambig_max::Int = 5)
Motif(alias::String)
Motif defines what we search for on the genome, that is, what can be identified as an off-target.
Arguments
alias - alias of the motif for easier identification e.g. Cas9
fwdmotif - Motif that indicates where the PAM is located inside fwdpam; the PAM positions are marked with X. For example, for Cas9 it is 20*N + XXX: NNNNNNNNNNNNNNNNNNNNXXX
fwdpam - Motif in 5'-3' orientation that will be matched on the reference (without the X). For example, for Cas9 it is 20*X + NGG: XXXXXXXXXXXXXXXXXXXXNGG
forward_strand - If false, the motif will not be matched to the forward reference strand.
reverse_strand - If false, the motif will not be matched to the reverse reference strand.
distance - How many extra nucleotides are needed for a search. This sets the maximum distance within which we can search for off-targets. When those bases are not available, DNA_Gap is used instead.
extends5 - Defines how off-targets will be aligned to the guides and where the extra nucleotides will be added for alignment within distance: true extends in the 5' direction, false in the 3' direction. For Cas9, extends5 = true.
ambig_max - How many ambiguous bases are allowed in the pattern.
Example for Cas9, where we want to search for off-targets within a distance of 4:
alias: Cas9
fwdmotif: NNNNNNNNNNNNNNNNNNNNXXX
fwdpam: XXXXXXXXXXXXXXXXXXXXNGG
forward_strand: true
reverse_strand: true
distance: 4
extends5: true
ambig_max: 5
Alignments will be performed starting from the end opposite to the extension direction (which is defined by extends5).
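The relationship between fwdmotif and fwdpam can be easier to see when the two patterns are overlaid. The following is a minimal sketch in plain Julia, not CHOPOFF code: the helpers merge_patterns and pam_positions are hypothetical names introduced here for illustration, and the sketch assumes only what the argument descriptions state, namely that X is a placeholder and that both strings have the same length.

```julia
# Hypothetical helpers for illustration only; they are not part of CHOPOFF.

# Overlay two equal-length patterns, taking the non-X character at each position.
function merge_patterns(fwdmotif::AbstractString, fwdpam::AbstractString)
    length(fwdmotif) == length(fwdpam) ||
        throw(ArgumentError("fwdmotif and fwdpam must have equal length"))
    merged = map(collect(fwdmotif), collect(fwdpam)) do m, p
        # A position should be defined by at most one of the two patterns.
        m != 'X' && p != 'X' &&
            throw(ArgumentError("both patterns define the same position"))
        m != 'X' ? m : p
    end
    return join(merged)
end

# Positions marked with X in fwdmotif, i.e. where the PAM sits.
pam_positions(fwdmotif::AbstractString) =
    [i for (i, c) in enumerate(fwdmotif) if c == 'X']

fwdmotif = "N"^20 * "XXX"   # 20*N + XXX, as in the Cas9 example
fwdpam   = "X"^20 * "NGG"   # 20*X + NGG

println(merge_patterns(fwdmotif, fwdpam))  # full 23-nt search pattern
println(pam_positions(fwdmotif))           # indices occupied by the PAM
```

For the Cas9 patterns this yields 21 N followed by GG, with the PAM at positions 21 to 23, which corresponds to the 20N-NGG summary printed by Motif.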
Examples
julia> Motif("Cas9")
Alias: Cas9
Maximum search distance: 4
Number of allowed ambigous bp: 0
20N-NGG
julia> Motif("test name", "NNNNNNNNNNNNNNNNNNNNXXX", "XXXXXXXXXXXXXXXXXXXXNGG", true, true, 4, true, 5)
Alias: test name
Maximum search distance: 4
Number of allowed ambigous bp: 5
20N-NGG
CHOPOFF.length_noPAM — Function
length_noPAM(motif::Motif)
Calculate the length of the motif without the extension and without the PAM. Effectively, this is the size of the gRNA.
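As a rough illustration of what this length means, the guide size can be read off the fwdmotif string by counting the positions that are not PAM placeholders. This is a sketch in plain Julia under that assumption; toy_length_noPAM is a hypothetical name and is not how CHOPOFF necessarily computes the value.

```julia
# Hypothetical helper, not the CHOPOFF implementation: count the positions
# of fwdmotif that are not PAM placeholders (X).
toy_length_noPAM(fwdmotif::AbstractString) = count(!=('X'), fwdmotif)

# Cas9 pattern from the Motif documentation: 20*N + XXX.
println(toy_length_noPAM("N"^20 * "XXX"))
```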
Examples
julia> length_noPAM(Motif("Cas9"))
20
CHOPOFF.setdist — Function
setdist(motif::Motif, distance::Int)
Set the distance (the maximum number of mismatches, deletions and insertions) allowed during alignment.
Examples
julia> setdist(Motif("Cas9"), 2)
Alias: Cas9
Maximum search distance: 2
Number of allowed ambigous bp: 0
20N-NGG
CHOPOFF.setambig — Function
setambig(motif::Motif, ambig::Int)
Set the ambiguity level of the motif, that is, how many ambiguous bases are allowed, not counting the PAM and not counting the extension.
Examples
julia> setambig(Motif("Cas9"), 15)
Alias: Cas9
Maximum search distance: 4
Number of allowed ambigous bp: 15
20N-NGG
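The REPL outputs of setdist and setambig are consistent with setters that return an updated motif, which would allow them to be chained. The sketch below illustrates that pattern in plain Julia. It is an assumption, not a statement about the CHOPOFF internals: ToyMotif, toy_setdist and toy_setambig are hypothetical stand-ins with only three of the fields.

```julia
# Hypothetical stand-in for illustration only; this is not CHOPOFF.Motif.
struct ToyMotif
    alias::String
    distance::Int
    ambig_max::Int
end

# Return a copy with a new distance, leaving the input untouched.
toy_setdist(m::ToyMotif, distance::Int) =
    ToyMotif(m.alias, distance, m.ambig_max)

# Return a copy with a new limit on ambiguous bases.
toy_setambig(m::ToyMotif, ambig::Int) =
    ToyMotif(m.alias, m.distance, ambig)

cas9 = ToyMotif("Cas9", 4, 0)

# Chain both setters; cas9 itself keeps its original values.
strict = toy_setambig(toy_setdist(cas9, 2), 1)

println((strict.distance, strict.ambig_max))
println((cas9.distance, cas9.ambig_max))
```

With the real package, the equivalent chained call would be setambig(setdist(Motif("Cas9"), 2), 1), assuming both functions return a Motif as their printed results suggest.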