Difference between revisions of "Xenopus reference"
From Marcotte Lab
Revision as of 18:05, 22 September 2011
All of these files are derived from XenBase (downloaded on May 1, 2011).
Version 1. RefSeq of cDNA & protein
- Read the gene name for each NCBI id from the 'Ncbi...' file. Filter out genes with 'unnamed' in the gene name field.
- Read all sequences from the '.fasta' file. Convert all sequence characters to upper case.
- If a sequence has a '>gi|<gi number>|ref|<genbank accession>' header (which means it is a RefSeq entry), write it to the output file.
XENLA_cDNA_ref.v1.fasta (8,879 sequences)
- Used XenBase files: NcbiMrnaXenbaseGene_laevis.txt, xlaevisMRNA.fasta
XENLA_prot_ref.v1.fasta (8,878 sequences; the 'taf5' protein is not annotated as RefSeq, although its corresponding mRNA sequence is.)
- Used XenBase files: NcbiProteinXenbaseGene_laevis.txt, xlaevisProtein.fasta
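The three steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the script actually used to build the release files. Several details are assumptions: the 'Ncbi...' file is treated as tab-separated with the NCBI id and gene name in the columns given by the hypothetical `id_col` and `name_col` parameters, the id in that file is assumed to match either the accession or the gi number in the FASTA header, and sequences with no usable gene name are assumed to be dropped. The function names are made up for this example.

```python
import re

# Matches headers of the form '>gi|<gi number>|ref|<genbank accession>'.
REFSEQ_HEADER = re.compile(r"^>gi\|(\d+)\|ref\|([^|\s]+)")


def read_gene_names(lines, id_col=0, name_col=1):
    """Map NCBI id -> gene name, skipping genes with 'unnamed' in the name.

    The column positions are assumptions; adjust them to the real file layout.
    """
    names = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(id_col, name_col):
            continue  # malformed or short line
        ncbi_id, gene = fields[id_col], fields[name_col]
        if "unnamed" in gene.lower():
            continue
        names[ncbi_id] = gene
    return names


def read_fasta(lines):
    """Yield (header, sequence) pairs with the sequence upper-cased."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks).upper()
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def select_refseq(fasta_lines, gene_names):
    """Yield (gene, accession, sequence) for RefSeq entries with a known gene name."""
    for header, seq in read_fasta(fasta_lines):
        match = REFSEQ_HEADER.match(header)
        if match is None:
            continue  # not a RefSeq header
        gi, accession = match.groups()
        # Assumption: the mapping file is keyed by accession or by gi number.
        gene = gene_names.get(accession) or gene_names.get(gi)
        if gene is None:
            continue  # unnamed or unmapped gene
        yield gene, accession, seq
```

With real data, `read_gene_names(open("NcbiMrnaXenbaseGene_laevis.txt"))` and `select_refseq(open("xlaevisMRNA.fasta"), names)` would be the corresponding calls for the cDNA set, provided the column assumptions hold for the downloaded files.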