Package org.snpeff.interval
Class Transcript
- All Implemented Interfaces:
Serializable, Cloneable, Comparable<Interval>, Iterable<Exon>, TxtSerializable
Interval for a transcript, as well as some other information: exons, UTRs, CDSs, etc.
- Author:
- pcingola
-
Field Summary
Fields inherited from class org.snpeff.interval.Interval
chromosomeNameOri, end, id, parent, start, strandMinus
-
Constructor Summary
Constructors -
Method Summary
int[] aaNumber2Pos(): Calculate chromosome position as a function of amino acid number. Note that this returns the chromosomal position of the first base of each amino acid.
int aaNumber2Pos(int aaNum): Find the genomic position of the first base of amino acid 'aaNum'.
void add: Add a CDS.
void add: Add an intron.
void add(SpliceSite spliceSite): Add a SpliceSite.
void add: Add a UTR.
boolean adjust(): Adjust transcript coordinates.
apply: Create a new transcript after applying changes in variant.
baseAt(int pos): Find base at genomic coordinate 'pos'.
int baseNumber2MRnaPos(int pos): Calculate distance from transcript start to a position. mRNA is roughly the same as cDNA.
int baseNumberCds(int pos, boolean usePrevBaseIntron): Calculate the base number in the CDS to which 'pos' maps.
baseNumberCds2Codon(int cdsBaseNumber): Return a codon that includes 'cdsBaseNumber'.
int[] baseNumberCds2Pos(): Calculate chromosome position as a function of CDS number.
int baseNumberCds2Pos(int cdsBaseNum)
cds(): Retrieve coding sequence.
cdsMarker: Create a marker of the coding region in this transcript.
cloneShallow: Perform a shallow clone.
int[] codonNumber2Pos(int codonNum): Return an array of 3 genomic positions where codon number 'codonNum' maps.
boolean collapseZeroGap(): Collapses exons having gaps of zero (i.e. exons that are immediately followed by other exons).
double cpgExonBias(): Calculate CpG bias: number of CpG / expected[CpG].
int cpgExons(): Count total CpG in this transcript's exons.
void createSpliceSites(int spliceSiteSize, int spliceRegionExonSize, int spliceRegionIntronMin, int spliceRegionIntronMax): Find all splice sites.
void createUpDownStream(int upDownLength): Create a list of upstream/downstream regions (for each transcript). Upstream (downstream) is defined as upDownLength bases before (after) the transcript.
boolean deleteRedundant(): Deletes redundant exons (i.e. exons that are totally included in other exons).
findCds: Find a CDS that matches exactly the exon.
findExon(int pos): Return an exon that intersects 'pos'.
findExon: Return an exon intersecting 'marker' (first exon found).
findIntron(int pos): Return an intron overlapping position 'pos'.
findUtr(int pos): Return the UTR that hits position 'pos'.
findUtrs: Return the UTR that intersects 'marker' (null if not found).
boolean frameCorrection(): Correct exons based on frame information.
get3primeUtrs: Create a list of 3 prime UTRs.
get3primeUtrsSorted
get5primeUtrs: Create a list of 5 prime UTRs.
get5primeUtrsSorted
getBioType
getCds(): Get all CDSs.
int getCdsEnd()
int getCdsStart()
getDownstream
getExons(): A more intuitive name for 'subintervals'.
getFirstCodingExon: Get first coding exon.
getGene()
getTranscriptSupportLevel
getTss(): Create a TSS marker.
getUpstream
getUtrs(): Get all UTRs.
getVersion
boolean hasCds()
boolean hasError(): Does this transcript have any errors?
boolean hasErrorOrWarning(): Does this transcript have any errors or warnings?
boolean hasTranscriptSupportLevelInfo()
boolean hasWarning(): Does this transcript have any warnings?
introns(): Get all introns (lazy init).
boolean isAaCheck()
protected boolean isAdjustIfParentDoesNotInclude(Marker parent): Adjust parent if it does not include child?
boolean isCanonical()
boolean isChecked(): Has this transcript been checked against CDS/DNA/AA sequences?
boolean isCorrected()
boolean isDnaCheck()
boolean isDownstream(int pos)
boolean isErrorProteinLength(): Check that the coding length is a multiple of 3 in protein-coding transcripts.
boolean isErrorStartCodon(): Is the first codon a START codon?
boolean isErrorStopCodonsInCds(): Check if protein sequence has STOP codons in the middle of the coding sequence.
boolean isIntron(int pos)
boolean isProteinCoding()
boolean isRibosomalSlippage()
boolean isUpstream(int pos)
boolean isUtr(int pos)
boolean isUtr
boolean isUtr3(int pos)
boolean isUtr5(int pos)
boolean isWarningStopCodon(): Is the last codon a STOP codon?
markers(): A list of all markers in this transcript.
mRna(): Retrieve coding sequence AND the UTRs (mRNA = 5'UTR + CDS + 3'UTR), i.e. concatenate all exon sequences.
protein(): Protein sequence (amino acid sequence produced by this transcript).
query: Query all genomic regions that intersect 'marker'.
queryExon: Return the first exon that intersects 'interval' (null if not found).
boolean rankExons(): Assign ranks to exons.
void reset(): Remove all intervals.
void resetCache()
void resetExons()
sanityCheck(Variant variant): Perform some basic checks, return error type, if any.
void serializeParse(MarkerSerializer markerSerializer): Parse a line from a serialized file.
serializeSave(MarkerSerializer markerSerializer): Create a string to serialize to a file.
void setAaCheck(boolean aaCheck)
void setBioType(BioType bioType)
void setCanonical(boolean canonical)
void setDnaCheck(boolean dnaCheck)
void setProteinCoding(boolean proteinCoding)
void setRibosomalSlippage(boolean ribosomalSlippage)
void setTranscriptSupportLevel(TranscriptSupportLevel transcriptSupportLevel)
void setVersion(String version)
void sortCds()
spliceSites
toString()
toString(boolean full)
toStringAsciiArt(boolean full): Show a transcript as ASCII art.
boolean utrFromCds(boolean verbose): Calculate UTR regions from CDSs.
boolean variantEffect(Variant variant, VariantEffects variantEffects): Get some details about the effect on this transcript.
Methods inherited from class org.snpeff.interval.IntervalAndSubIntervals
add, addAll, addAll, clone, containsId, get, invalidateSorted, iterator, numChilds, remove, setStrandMinus, shiftCoordinates, sorted, sortedStrand, subIntervals
Methods inherited from class org.snpeff.interval.Marker
adjust, applyDel, applyDup, applyIns, applyMixed, codonTable, compareTo, compareToPos, distance, distanceBases, getParent, getType, idChain, idChain, idChain, includes, intersect, isDeferredAnalysis, isShowWarningIfParentDoesNotInclude, minus, query, readTxt, shouldApply, union, variantEffectNonRef
Methods inherited from class org.snpeff.interval.Interval
equals, findParent, getChromosome, getChromosomeName, getChromosomeNameOri, getChromosomeNum, getEnd, getGenome, getGenomeName, getId, getStart, getStrand, hashCode, intersects, intersects, intersects, intersects, intersectSize, isCircular, isSameChromo, isStrandMinus, isStrandPlus, isValid, setChromosomeNameOri, setEnd, setId, setParent, setStart, size, toStr, toStringAsciiArt, toStrPos
Methods inherited from class java.lang.Object
equals, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Details
-
Transcript
public Transcript() -
Transcript
-
-
Method Details
-
aaNumber2Pos
public int[] aaNumber2Pos()
Calculate chromosome position as a function of amino acid number. Note that this returns the chromosomal position of the first base of each amino acid.
If you need the chromosomal position of each base
-
aaNumber2Pos
public int aaNumber2Pos(int aaNum)
Find the genomic position of the first base of amino acid 'aaNum'.
-
add
Add a CDS -
add
Add an intron -
add
Add a SpliceSite -
add
Add a UTR -
adjust
public boolean adjust()
Adjust transcript coordinates
-
apply
Create a new transcript after applying changes in variant.
Note: If this transcript is unaffected, no new transcript is created (same transcript is returned).
- Overrides:
apply
in class IntervalAndSubIntervals<Exon>
- Returns:
- The marker result after applying variant
-
baseAt
Find base at genomic coordinate 'pos' -
baseNumber2MRnaPos
public int baseNumber2MRnaPos(int pos)
Calculate distance from transcript start to a position. mRNA is roughly the same as cDNA. Strictly speaking, mRNA has a poly-A tail and a 5' cap.
-
baseNumberCds
public int baseNumberCds(int pos, boolean usePrevBaseIntron)
Calculate the base number in the CDS to which 'pos' maps.
- Parameters:
usePrevBaseIntron - When 'pos' is intronic, this method returns:
  - if (usePrevBaseIntron == false): the first base in the exon after 'pos' (i.e. first coding base after the intron)
  - if (usePrevBaseIntron == true): the last base in the exon before 'pos' (i.e. last coding base before the intron)
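The intronic behaviour controlled by usePrevBaseIntron can be illustrated with a small standalone sketch. This is not SnpEff's implementation: the class CdsBaseNumberSketch and its method are hypothetical, and the sketch assumes a plus-strand transcript, zero-based inclusive coordinates, coding exons given as sorted {start, end} pairs, and a return value of -1 when no coding base exists on the requested side.

```java
import java.util.Arrays;
import java.util.List;

final class CdsBaseNumberSketch {

    /**
     * Maps genomic position 'pos' to a zero-based index in the concatenated CDS.
     * 'codingExons' holds sorted, non-overlapping {start, end} pairs (inclusive, plus strand).
     * Returns -1 when no coding base exists on the requested side of 'pos' (assumption).
     */
    static int baseNumberCds(List<int[]> codingExons, int pos, boolean usePrevBaseIntron) {
        int basesBefore = 0; // coding bases in exons that end before 'pos'
        for (int[] exon : codingExons) {
            int start = exon[0];
            int end = exon[1];
            if (pos < start) {
                // 'pos' is not exonic and lies just before this exon.
                // basesBefore is the CDS index of this exon's first base;
                // basesBefore - 1 is the last base of the previous exon (-1 if there is none).
                return usePrevBaseIntron ? basesBefore - 1 : basesBefore;
            }
            if (pos <= end) {
                return basesBefore + (pos - start); // exonic: direct offset
            }
            basesBefore += end - start + 1;
        }
        // 'pos' lies after the last exon.
        return usePrevBaseIntron ? basesBefore - 1 : -1;
    }

    public static void main(String[] args) {
        List<int[]> exons = Arrays.asList(new int[] {100, 109}, new int[] {200, 209});
        System.out.println(baseNumberCds(exons, 105, false)); // exonic position
        System.out.println(baseNumberCds(exons, 150, false)); // intronic: first base of next exon
        System.out.println(baseNumberCds(exons, 150, true));  // intronic: last base of previous exon
    }
}
```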
-
baseNumberCds2Codon
Return a codon that includes 'cdsBaseNumber' -
baseNumberCds2Pos
public int[] baseNumberCds2Pos()
Calculate chromosome position as a function of CDS number
-
baseNumberCds2Pos
public int baseNumberCds2Pos(int cdsBaseNum) -
cds
Retrieve coding sequence -
cdsMarker
Create a marker of the coding region in this transcript -
cloneShallow
Description copied from class: Marker
Perform a shallow clone
- Overrides:
cloneShallow
in class IntervalAndSubIntervals<Exon>
-
codonNumber2Pos
public int[] codonNumber2Pos(int codonNum)
Return an array of 3 genomic positions where codon number 'codonNum' maps.
- Returns:
- aa2pos[0], aa2pos[1], aa2pos[2] are the coordinates (within the chromosome) of the three bases forming codon 'codonNum'. Any aa2pos[i] = -1 means that the corresponding base in the codon could not be mapped. Bases in the array are sorted by chromosome position, so aa2pos[0] < aa2pos[1] < aa2pos[2].
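The mapping from a codon number to three genomic positions can be sketched independently of SnpEff. In the hypothetical class below, cdsToGenome builds a CDS-index to genome-position table from coding exons (zero-based inclusive {start, end} pairs, sorted by start), and codonNumber2Pos looks up CDS bases 3n, 3n+1 and 3n+2. This is an assumption-laden illustration, not the library's code: unmapped bases are left as -1 and, because the array is sorted, they end up at the front.

```java
import java.util.Arrays;
import java.util.List;

final class CodonPositionSketch {

    /** Genomic position of every CDS base, in transcription order. */
    static int[] cdsToGenome(List<int[]> codingExons, boolean strandMinus) {
        int total = 0;
        for (int[] e : codingExons) total += e[1] - e[0] + 1;
        int[] cds2pos = new int[total];
        int i = 0;
        for (int[] e : codingExons) {
            for (int p = e[0]; p <= e[1]; p++) cds2pos[i++] = p;
        }
        if (strandMinus) {
            // Minus strand: transcription runs from high to low coordinates.
            for (int a = 0, b = total - 1; a < b; a++, b--) {
                int tmp = cds2pos[a];
                cds2pos[a] = cds2pos[b];
                cds2pos[b] = tmp;
            }
        }
        return cds2pos;
    }

    /** Three genomic positions of codon 'codonNum' (zero-based), sorted ascending; -1 = unmapped. */
    static int[] codonNumber2Pos(int[] cds2pos, int codonNum) {
        int[] codon = {-1, -1, -1};
        for (int i = 0; i < 3; i++) {
            int cdsBase = 3 * codonNum + i;
            if (cdsBase >= 0 && cdsBase < cds2pos.length) codon[i] = cds2pos[cdsBase];
        }
        Arrays.sort(codon);
        return codon;
    }

    public static void main(String[] args) {
        List<int[]> exons = Arrays.asList(new int[] {100, 103}, new int[] {200, 204});
        int[] plus = cdsToGenome(exons, false);
        int[] minus = cdsToGenome(exons, true);
        System.out.println(Arrays.toString(codonNumber2Pos(plus, 1)));  // codon spanning the intron
        System.out.println(Arrays.toString(codonNumber2Pos(minus, 0))); // first codon on minus strand
    }
}
```

Note how a codon that spans an exon boundary yields non-contiguous positions, which is why three separate coordinates are returned rather than a single start.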
-
collapseZeroGap
public boolean collapseZeroGap()
Collapses exons having gaps of zero (i.e. exons that are immediately followed by other exons). Does the same for CDSs and UTRs.
- Returns:
- true if any exon in the transcript was 'collapsed'
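As an illustration of what collapsing zero-gap exons means, the sketch below merges intervals whose start immediately follows the previous interval's end. It is a hypothetical standalone helper, not the SnpEff method: it assumes zero-based inclusive {start, end} pairs already sorted by start, and it returns the merged list instead of modifying a transcript.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CollapseZeroGapSketch {

    /** Merges consecutive intervals separated by a gap of zero bases. */
    static List<int[]> collapseZeroGap(List<int[]> sortedExons) {
        List<int[]> out = new ArrayList<>();
        for (int[] exon : sortedExons) {
            int[] last = out.isEmpty() ? null : out.get(out.size() - 1);
            if (last != null && exon[0] == last[1] + 1) {
                last[1] = exon[1]; // zero gap: extend the previous interval
            } else {
                out.add(new int[] {exon[0], exon[1]}); // copy, so inputs are not mutated
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> exons = Arrays.asList(
                new int[] {100, 199}, new int[] {200, 299}, new int[] {400, 499});
        List<int[]> collapsed = collapseZeroGap(exons);
        boolean anyCollapsed = collapsed.size() < exons.size();
        for (int[] e : collapsed) System.out.println(Arrays.toString(e));
        System.out.println("collapsed: " + anyCollapsed);
    }
}
```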
-
cpgExonBias
public double cpgExonBias()
Calculate CpG bias: number of CpG / expected[CpG]
-
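The ratio described for cpgExonBias can be illustrated on a plain sequence string. The sketch below assumes the commonly used observed/expected definition, where the expected CpG count is (number of C × number of G) / sequence length; whether SnpEff computes the expectation exactly this way is not stated here, so treat the formula and the class name CpgBiasSketch as assumptions.

```java
final class CpgBiasSketch {

    /** Counts occurrences of the dinucleotide "CG" (case-insensitive). */
    static int countCpg(String seq) {
        String s = seq.toUpperCase();
        int count = 0;
        for (int i = 0; i + 1 < s.length(); i++) {
            if (s.charAt(i) == 'C' && s.charAt(i + 1) == 'G') count++;
        }
        return count;
    }

    /** Observed CpG divided by expected CpG, with expected = #C * #G / length (assumed definition). */
    static double cpgBias(String seq) {
        String s = seq.toUpperCase();
        long c = s.chars().filter(ch -> ch == 'C').count();
        long g = s.chars().filter(ch -> ch == 'G').count();
        if (c == 0 || g == 0) return 0.0; // no CpG is possible
        double expected = (double) c * g / s.length();
        return countCpg(s) / expected;
    }

    public static void main(String[] args) {
        String exonSequence = "ACGCGT";
        System.out.println("CpG count: " + countCpg(exonSequence));
        System.out.println("CpG bias:  " + cpgBias(exonSequence));
    }
}
```

A value above 1 indicates more CpG dinucleotides than the base composition alone would predict; a value below 1 indicates depletion.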
cpgExons
public int cpgExons()
Count total CpG in this transcript's exons
-
createSpliceSites
public void createSpliceSites(int spliceSiteSize, int spliceRegionExonSize, int spliceRegionIntronMin, int spliceRegionIntronMax)
Find all splice sites.
-
createUpDownStream
public void createUpDownStream(int upDownLength)
Create a list of upstream/downstream regions (for each transcript). Upstream (downstream) is defined as upDownLength bases before (after) the transcript.
-
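The definition of the upstream and downstream regions for createUpDownStream can be sketched with plain coordinates. The helper below is hypothetical and makes several assumptions not stated above: zero-based inclusive coordinates, "before" and "after" interpreted relative to the transcript's strand, and regions clipped to the chromosome bounds.

```java
import java.util.Arrays;

final class UpDownStreamSketch {

    /**
     * Returns {upstream, downstream}, each as {start, end} (inclusive).
     * A region with end < start is empty (e.g. transcript at the chromosome edge).
     */
    static int[][] upDownStream(int start, int end, boolean strandMinus,
                                int upDownLength, int chromosomeLength) {
        int[] left = {Math.max(0, start - upDownLength), start - 1};
        int[] right = {end + 1, Math.min(chromosomeLength - 1, end + upDownLength)};
        // On the minus strand the region at higher coordinates is upstream.
        return strandMinus ? new int[][] {right, left} : new int[][] {left, right};
    }

    public static void main(String[] args) {
        int[][] plus = upDownStream(1000, 2000, false, 500, 1_000_000);
        int[][] minus = upDownStream(1000, 2000, true, 500, 1_000_000);
        System.out.println("plus  upstream:   " + Arrays.toString(plus[0]));
        System.out.println("plus  downstream: " + Arrays.toString(plus[1]));
        System.out.println("minus upstream:   " + Arrays.toString(minus[0]));
        System.out.println("minus downstream: " + Arrays.toString(minus[1]));
    }
}
```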
deleteRedundant
public boolean deleteRedundant()
Deletes redundant exons (i.e. exons that are totally included in other exons). Does the same for CDSs and UTRs.
-
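The notion of a redundant exon used by deleteRedundant (one that is totally included in another) can be sketched as an interval filter. The class below is a hypothetical standalone illustration with zero-based inclusive {start, end} pairs; as an assumption of this sketch, when two intervals are identical only the first copy is kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeleteRedundantSketch {

    /** Returns the intervals that are not fully contained in another interval. */
    static List<int[]> deleteRedundant(List<int[]> exons) {
        List<int[]> kept = new ArrayList<>();
        for (int i = 0; i < exons.size(); i++) {
            int[] a = exons.get(i);
            boolean redundant = false;
            for (int j = 0; j < exons.size() && !redundant; j++) {
                if (i == j) continue;
                int[] b = exons.get(j);
                boolean included = b[0] <= a[0] && a[1] <= b[1];
                boolean identical = b[0] == a[0] && b[1] == a[1];
                // Identical intervals include each other: drop only the later copy.
                if (included && (!identical || j < i)) redundant = true;
            }
            if (!redundant) kept.add(a);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<int[]> exons = Arrays.asList(
                new int[] {100, 200}, new int[] {120, 150},
                new int[] {100, 200}, new int[] {300, 400});
        for (int[] e : deleteRedundant(exons)) System.out.println(Arrays.toString(e));
    }
}
```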
findCds
Find a CDS that matches exactly the exon -
findExon
Return an exon that intersects 'pos' -
findExon
Return an exon intersecting 'marker' (first exon found) -
findIntron
Return an intron overlapping position 'pos' -
findUtr
Return the UTR that hits position 'pos'
- Returns:
- A UTR intersecting 'pos' (null if not found)
-
findUtrs
Return the UTR that intersects 'marker' (null if not found) -
frameCorrection
public boolean frameCorrection()
Correct exons based on frame information. E.g. if the frame information (from a genomic database file, such as a GTF) does not match the calculated frame, we correct the exon's boundaries to make them match.
This is performed in two stages: i) the first exon is corrected by adding a fake 5'UTR; ii) other exons are corrected by changing the start (or end) coordinates.
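The comparison between annotated and calculated frame can be sketched as follows. This covers only the detection step, not the two correction stages, and it is not SnpEff's code. It assumes the GFF/GTF meaning of frame (the number of bases to skip at the start of a CDS segment to reach the first complete codon), so the expected frame of a segment is (3 - codingBasesBefore % 3) % 3. The class and method names are hypothetical; CDS lengths are given in transcription order.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameCheckSketch {

    /** Expected frame of each CDS segment, derived from the coding bases that precede it. */
    static int[] expectedFrames(int[] cdsLengths) {
        int[] frames = new int[cdsLengths.length];
        int codingBasesBefore = 0;
        for (int i = 0; i < cdsLengths.length; i++) {
            frames[i] = (3 - codingBasesBefore % 3) % 3;
            codingBasesBefore += cdsLengths[i];
        }
        return frames;
    }

    /** Indices of CDS segments whose annotated frame differs from the calculated one. */
    static List<Integer> mismatchedCds(int[] cdsLengths, int[] annotatedFrames) {
        int[] expected = expectedFrames(cdsLengths);
        List<Integer> mismatches = new ArrayList<>();
        for (int i = 0; i < expected.length; i++) {
            if (expected[i] != annotatedFrames[i]) mismatches.add(i);
        }
        return mismatches;
    }

    public static void main(String[] args) {
        int[] lengths = {10, 5, 9};
        int[] annotated = {0, 1, 0}; // second segment disagrees with the calculated frame
        System.out.println("segments needing correction: " + mismatchedCds(lengths, annotated));
    }
}
```

In this reading, a mismatch on the first segment corresponds to stage i), and a mismatch on a later segment corresponds to stage ii).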
-
get3primeUtrs
Create a list of 3 prime UTRs -
get3primeUtrsSorted
-
get5primeUtrs
Create a list of 5 prime UTRs -
get5primeUtrsSorted
-
getBioType
-
setBioType
-
getCds
Get all CDSs -
getCdsEnd
public int getCdsEnd() -
getCdsStart
public int getCdsStart() -
getDownstream
-
getExons
A more intuitive name for 'subintervals' -
getFirstCodingExon
Get first coding exon -
getGene
-
getTranscriptSupportLevel
-
setTranscriptSupportLevel
-
getTss
Create a TSS marker -
getUpstream
-
getUtrs
Get all UTRs -
getVersion
-
setVersion
-
hasCds
public boolean hasCds() -
hasError
public boolean hasError()
Does this transcript have any errors?
-
hasErrorOrWarning
public boolean hasErrorOrWarning()
Does this transcript have any errors or warnings?
-
hasTranscriptSupportLevelInfo
public boolean hasTranscriptSupportLevelInfo() -
hasWarning
public boolean hasWarning()
Does this transcript have any warnings?
-
introns
Get all introns (lazy init) -
isAaCheck
public boolean isAaCheck() -
setAaCheck
public void setAaCheck(boolean aaCheck) -
isAdjustIfParentDoesNotInclude
Description copied from class: Marker
Adjust parent if it does not include child?
- Overrides:
isAdjustIfParentDoesNotInclude
in class Marker
-
isCanonical
public boolean isCanonical() -
setCanonical
public void setCanonical(boolean canonical) -
isChecked
public boolean isChecked()
Has this transcript been checked against CDS/DNA/AA sequences?
-
isCorrected
public boolean isCorrected() -
isDnaCheck
public boolean isDnaCheck() -
setDnaCheck
public void setDnaCheck(boolean dnaCheck) -
isDownstream
public boolean isDownstream(int pos) -
isErrorProteinLength
public boolean isErrorProteinLength()
Check that the coding length is a multiple of 3 in protein-coding transcripts
- Returns:
- true on error
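The length check behind isErrorProteinLength amounts to a modulo test on the total coding length. The sketch below is a hypothetical standalone version, assuming CDS segments as zero-based inclusive {start, end} pairs and assuming that non-coding transcripts never report this error.

```java
import java.util.Arrays;
import java.util.List;

final class ProteinLengthCheckSketch {

    /** Returns true (error) when a protein-coding transcript's CDS length is not a multiple of 3. */
    static boolean isErrorProteinLength(List<int[]> cdsSegments, boolean proteinCoding) {
        if (!proteinCoding) return false; // assumption: check applies only to protein-coding transcripts
        int codingLength = 0;
        for (int[] cds : cdsSegments) codingLength += cds[1] - cds[0] + 1;
        return codingLength % 3 != 0;
    }

    public static void main(String[] args) {
        List<int[]> ok = Arrays.asList(new int[] {100, 104}, new int[] {200, 203});  // 5 + 4 = 9 bases
        List<int[]> bad = Arrays.asList(new int[] {100, 104}, new int[] {200, 204}); // 5 + 5 = 10 bases
        System.out.println(isErrorProteinLength(ok, true));
        System.out.println(isErrorProteinLength(bad, true));
    }
}
```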
-
isErrorStartCodon
public boolean isErrorStartCodon()
Is the first codon a START codon?
-
isErrorStopCodonsInCds
public boolean isErrorStopCodonsInCds()
Check if protein sequence has STOP codons in the middle of the coding sequence
- Returns:
- true on error
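A check for premature STOP codons can be sketched on the coding sequence itself. The helper below is hypothetical and assumes the standard genetic code (TAA, TAG, TGA); a real implementation would need the codon table of the genome in question. The final codon is excluded, since a STOP there is expected.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class StopCodonCheckSketch {

    // Standard genetic code only (assumption); other codon tables differ.
    private static final Set<String> STOP_CODONS =
            new HashSet<>(Arrays.asList("TAA", "TAG", "TGA"));

    /** Returns true (error) when a STOP codon occurs before the last codon of the CDS. */
    static boolean hasStopInMiddle(String cds) {
        String s = cds.toUpperCase();
        int lastCodonStart = (s.length() / 3 - 1) * 3;
        for (int i = 0; i < lastCodonStart; i += 3) {
            if (STOP_CODONS.contains(s.substring(i, i + 3))) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasStopInMiddle("ATGGGGTGA"));    // STOP only at the end
        System.out.println(hasStopInMiddle("ATGTAAGGGTGA")); // TAA in the second codon
    }
}
```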
-
isIntron
public boolean isIntron(int pos) -
isProteinCoding
public boolean isProteinCoding() -
setProteinCoding
public void setProteinCoding(boolean proteinCoding) -
isRibosomalSlippage
public boolean isRibosomalSlippage() -
setRibosomalSlippage
public void setRibosomalSlippage(boolean ribosomalSlippage) -
isUpstream
public boolean isUpstream(int pos) -
isUtr
public boolean isUtr(int pos) -
isUtr
-
isUtr3
public boolean isUtr3(int pos) -
isUtr5
public boolean isUtr5(int pos) -
isWarningStopCodon
public boolean isWarningStopCodon()
Is the last codon a STOP codon?
-
markers
A list of all markers in this transcript
- Overrides:
markers
in class IntervalAndSubIntervals<Exon>
-
mRna
Retrieve coding sequence AND the UTRs (mRNA = 5'UTR + CDS + 3'UTR), i.e. concatenate all exon sequences
-
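Concatenating exon sequences to obtain the mRNA can be sketched with a genome string and exon coordinates. The class below is a hypothetical illustration, not SnpEff's implementation: it assumes zero-based inclusive {start, end} pairs sorted by genomic start, and that minus-strand transcripts are reported as the reverse complement of the concatenated genomic sequence.

```java
import java.util.Arrays;
import java.util.List;

final class MrnaSketch {

    /** Concatenates exon sequences; reverse-complements the result on the minus strand. */
    static String mRna(String genome, List<int[]> exons, boolean strandMinus) {
        StringBuilder sb = new StringBuilder();
        for (int[] exon : exons) {
            sb.append(genome, exon[0], exon[1] + 1); // end is inclusive
        }
        return strandMinus ? reverseComplement(sb.toString()) : sb.toString();
    }

    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            sb.append(complement(seq.charAt(i)));
        }
        return sb.toString();
    }

    private static char complement(char base) {
        switch (Character.toUpperCase(base)) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:  return 'N'; // anything else is treated as unknown
        }
    }

    public static void main(String[] args) {
        String genome = "AACCGGTTAACC";
        List<int[]> exons = Arrays.asList(new int[] {2, 3}, new int[] {6, 7});
        System.out.println(mRna(genome, exons, false));
        System.out.println(mRna(genome, exons, true));
    }
}
```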
protein
Protein sequence (amino acid sequence produced by this transcript) -
query
Query all genomic regions that intersect 'marker'
- Overrides:
query
in class IntervalAndSubIntervals<Exon>
-
queryExon
Return the first exon that intersects 'interval' (null if not found) -
rankExons
public boolean rankExons()
Assign ranks to exons
-
reset
public void reset()
Description copied from class: IntervalAndSubIntervals
Remove all intervals
- Overrides:
reset
in class IntervalAndSubIntervals<Exon>
-
resetCache
public void resetCache() -
resetExons
public void resetExons() -
sanityCheck
Perform some basic checks, return error type, if any -
serializeParse
Parse a line from a serialized file
- Specified by:
serializeParse
in interface TxtSerializable
- Overrides:
serializeParse
in class IntervalAndSubIntervals<Exon>
-
serializeSave
Create a string to serialize to a file
- Specified by:
serializeSave
in interface TxtSerializable
- Overrides:
serializeSave
in class IntervalAndSubIntervals<Exon>
-
sortCds
public void sortCds() -
spliceSites
-
toString
-
toString
-
toStringAsciiArt
Show a transcript as ASCII art -
utrFromCds
public boolean utrFromCds(boolean verbose)
Calculate UTR regions from CDSs
-
variantEffect
Get some details about the effect on this transcript
- Overrides:
variantEffect
in class Marker
-