TXGP Data Description

From Marcotte Lab

Naming convention

  • Directory name: '(project group)_(species code)_(sample type)_(run ID)'
  • File name: '(run ID)_(species code)_(description)_(sample prep ID,barcode,F3/F5/R3)'
  • Species code
    • XENLA (Xenopus laevis)
    • XENTR (Xenopus tropicalis a.k.a. Silurana tropicalis)
    • ENGPU (Engystomops pustulosus a.k.a. Túngara frog or Physalaemus pustulosus)
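
The directory naming convention above can be checked mechanically. The following is a minimal Python sketch, assuming that none of the four fields contains an underscore (true of every directory name listed on this page). The names `RunDirectory`, `SPECIES_CODES`, and `parse_directory_name` are hypothetical and are not part of any TXGP tooling.

```python
from typing import NamedTuple

# Species codes listed on this page.
SPECIES_CODES = {
    "XENLA": "Xenopus laevis",
    "XENTR": "Xenopus tropicalis",
    "ENGPU": "Engystomops pustulosus",
}


class RunDirectory(NamedTuple):
    project_group: str
    species_code: str
    sample_type: str
    run_id: str


def parse_directory_name(name: str) -> RunDirectory:
    """Split '(project group)_(species code)_(sample type)_(run ID)'.

    Assumption: fields are separated by single underscores and do not
    contain underscores themselves.
    """
    parts = name.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields: {name!r}")
    project_group, species_code, sample_type, run_id = parts
    if species_code not in SPECIES_CODES:
        raise ValueError(f"unknown species code: {species_code!r}")
    return RunDirectory(project_group, species_code, sample_type, run_id)


if __name__ == "__main__":
    print(parse_directory_name("TXGP_XENLA_BAC2k_SA09023"))
```

File names are less regular than directory names (the last field mixes sample prep ID, barcode, and read tag), so this sketch does not attempt to parse them.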

Data pre-processing

  • Remove reads that contain any no-call ('N' in an Illumina fastq file; '.' in a SOLiD csfasta file).
  • Remove low-complexity reads, i.e. reads that use fewer than 4 distinct letters ('0123' for color space, 'ATGC' for base space).
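
The two filters above can be sketched as a single predicate. This is an illustrative Python sketch, not the script actually used for TXGP. It assumes that "fewer than 4 letters" means fewer than four distinct symbols of the alphabet, and that a color-space read may begin with a primer base (e.g. 'T') that is not part of the color sequence. The function name `keep_read` is hypothetical.

```python
BASE_ALPHABET = frozenset("ACGT")
COLOR_ALPHABET = frozenset("0123")


def keep_read(seq: str, color_space: bool = False) -> bool:
    """Return True if the read passes both pre-processing filters."""
    if color_space:
        # Assumption: a leading letter is the primer base, not a color call.
        if seq[:1].isalpha():
            seq = seq[1:]
        no_call, alphabet = ".", COLOR_ALPHABET
    else:
        seq = seq.upper()
        no_call, alphabet = "N", BASE_ALPHABET

    # Filter 1: drop reads with any no-call.
    if no_call in seq:
        return False

    # Filter 2: drop low-complexity reads (not all four symbols present).
    return len(alphabet & set(seq)) == 4


if __name__ == "__main__":
    for read in ["ACGTACGT", "ACGNACGT", "AAAACCCC"]:
        print(read, keep_read(read))
    for read in ["T01230123", "T0123.123", "T00110011"]:
        print(read, keep_read(read, color_space=True))
```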

TXGP X. laevis BAC data

  • SAMPLE: One plate of CHORI-219 BAC library.

TXGP_XENLA_BAC2k_SA09023: Mate-pair (F3=50bp; R3=50bp; insert_size=2kbp), SOLiD v2

  • SA09023_XENLA_96BAC2kb_F3.called.fastq.gz: read_count=35M, file_size=1.8GB
  • SA09023_XENLA_96BAC2kb_R3.called.fastq.gz: read_count=35M, file_size=1.9GB

TXGP_XENLA_BAC5k_SA09023: Mate-pair (F3=50bp; R3=50bp; insert_size=5kbp), SOLiD v2

  • SA09023_XENLA_96BAC5kb_F3.called.fastq.gz: read_count=28M, file_size=1.3GB
  • SA09023_XENLA_96BAC5kb_R3.called.fastq.gz: read_count=28M, file_size=1.4GB

TXGP X. laevis whole genome data

  • SAMPLE: J-strain from Mustafa Khokha (Yale University).

TXGP_XENLA_WG1500_SA10026: Mate-pair (F3=50bp; R3=50bp; insert_size=1500bp), SOLiD v3

  • SA10026_XENLA_WG1500_HiAmp1ManF3: read_count=80M, file_size=4.1GB
  • SA10026_XENLA_WG1500_HiAmp1ManR3: read_count=79M, file_size=4.0GB
  • SA10026_XENLA_WG1500_HiAmp2ManF3: read_count=77M, file_size=3.9GB
  • SA10026_XENLA_WG1500_HiAmp2ManR3: read_count=77M, file_size=3.9GB
  • SA10026_XENLA_WG1500_HiAmpEZF3: read_count=83M, file_size=4.3GB
  • SA10026_XENLA_WG1500_HiAmpEZR3: read_count=82M, file_size=4.1GB
  • SA10026_XENLA_WG1500_LoAmpManF3: read_count=65M, file_size=3.4GB
  • SA10026_XENLA_WG1500_LoAmpManR3: read_count=64M, file_size=3.2GB

TXGP X. laevis RNA-seq data

TXGP_XENLA_RNA_SA11017: Paired-end (50bp/35bp), SOLiD v3

  • SA11017_XENLA_Heart_JA11050v3BC10F3: read_count=24M, file_size=1.5GB
  • SA11017_XENLA_Heart_JA11050v3BC10F5: read_count=23M, file_size=889MB
  • SA11017_XENLA_Testis_JA11050v3BC04F3: read_count=33M, file_size=1.7GB
  • SA11017_XENLA_Testis_JA11050v3BC04F5: read_count=32M, file_size=1.3GB

TXGP_XENLA_RNA_SA11022: Paired-end (50bp/35bp), SOLiD v3

  • SA11022_XENLA_Egg_JA11015v4BC001F3: read_count=19.3M
  • SA11022_XENLA_Egg_JA11015v4BC001F5: read_count=19.4M
  • SA11022_XENLA_Stage24_JA11015v2BC13F3: read_count=16.5M
  • SA11022_XENLA_Stage24_JA11015v2BC13F5: read_count=16.6M

TXGP_XENLA_RNA_SA11024: Paired-end (50bp/35bp), SOLiD v3

  • SA11024_XENLA_Liver_JA11055v2BC12F3: read_count=21.0M
  • SA11024_XENLA_Liver_JA11055v2BC12F5: read_count=22.0M
  • SA11024_XENLA_Lung_JA11055v2BC11F3: read_count=35.1M
  • SA11024_XENLA_Lung_JA11055v2BC11F5: read_count=36.7M
  • SA11024_XENLA_Stomach_JA11055v4BC003F3: read_count=27.8M
  • SA11024_XENLA_Stomach_JA11055v4BC003F5: read_count=29.1M

TXGP_XENLA_RNA_OMRF20110730: Paired-end (100bp), Illumina

  • OMRF20110730_XENLA_EGG1_1.fastq.gz: read_count=94M, file_size=8.5GB
  • OMRF20110730_XENLA_EGG1_2.fastq.gz: read_count=94M, file_size=8.5GB
  • OMRF20110730_XENLA_EGG2_1.fastq.gz: read_count=128M, file_size=14GB
  • OMRF20110730_XENLA_EGG2_2.fastq.gz: read_count=128M, file_size=14GB

Contributed X. laevis Data

We are looking for X. laevis RNA-seq data to build a comprehensive set of gene models.

ConlonUNC_XENLA_RNA_Amin201106: Single-end (76bp), Illumina

Data from Frank Conlon lab at University of North Carolina at Chapel Hill.

  • Amin201106_XENLA_Stage38Heart_MO.fastq.gz: read_count=31M, file_size=2.3GB
  • Amin201106_XENLA_Stage38Heart_WT.fastq.gz: read_count=28M, file_size=2.0GB
  • Amin201106_XENLA_Stage45Heart_CtrlMO.fastq.gz: read_count=33M, file_size=2.2GB

HarlandUBC_XENLA_RNA_Park201106: Single-end (50bp), Illumina

Data from Richard Harland lab at University of California, Berkeley.

  • Park2011_XENLA_Arch1_WT.fastq.gz: read_count=101M, file_size=4.8GB
  • Park2011_XENLA_Arch2_WT.fastq.gz: read_count=102M, file_size=4.8GB
  • Park2011_XENLA_Arch3_WT.fastq.gz: read_count=96M, file_size=4.4GB
  • Park2011_XENLA_ArchD_WT.fastq.gz: read_count=115M, file_size=5.5GB
  • Park2011_XENLA_ArchV_WT.fastq.gz: read_count=103M, file_size=4.8GB

LauBrandeis_XENLA_RNA_Lau201109: Single-end (38bp), Illumina

Data from Nelson Lau lab at Brandeis University.

  • Lau201109_XENLA_TadpoleBrain6.fastq.gz: read_count=25M, file_size=953MB
  • Lau201109_XENLA_TadpoleBrain7.fastq.gz: read_count=22M, file_size=879MB
  • Lau201109_XENLA_TadpoleBrain8.fastq.gz: read_count=27M, file_size=984MB

Contributed X. tropicalis Data

ConlonUNC_XENTR_RNA_Amin201106 (Illumina HiSeq)

Data from Frank Conlon lab at University of North Carolina at Chapel Hill.

  • ConlonLab2011_XENTR_Heart_WT1
  • ConlonLab2011_XENTR_Heart_WT2

TXGP other Xenopus data

TXGP_XENTR_WG5k_SA09023 (SOLiD v2)

TXGP_ENGPU_RNA_SA11022: Paired-end (50bp/35bp), SOLiD v3

  • SA11022_ENGPU_Larnyx_JA11015v4BC002F3: read_count=21.1M
  • SA11022_ENGPU_Larnyx_JA11015v4BC002F5: read_count=21.1M