
We need to specify a list of required, recommended, and optional fields for the markerInfo, sampleInfo, and studyInfo slots of a geneSet object.

markerInfo:

  • Currently a data frame. Columns can be specific to at least three different levels: (1) the particular set of genotypes stored in the corresponding row of the matrix in the callCodes slot, i.e. a "callSet" in Scott's terminology; (2) the assay design used to generate that set of genotypes; and (3) the marker (i.e. polymorphism) whose genotypes the assay design measures.

Level 1 data:

  • Name - Marker name, or more generally, a unique name for the corresponding row of the callCodes matrix - Required, unique
  • CallRate - Genotyping call rate - Recommended
  • ErrorRate - Genotyping error rate - Optional
  • Polymorph - Indicator of whether row of genotypes contains more than one class - Optional
  • SegRatio - Indicator of whether row of genotypes conforms sufficiently to an expected segregation ratio - Optional
  • HWE - Indicator of whether row of genotypes is in Hardy-Weinberg equilibrium - Optional
  • UniqueLocus - Indicator of whether row of genotypes is from an assay that appears to be detecting a unique locus - Optional
  • OtherQC - Other QC measure for the row of genotypes - Optional
  • Description - Descriptive name - Optional
  • AssayRunID - Unique numeric ID for the row of genotypes - Optional
  • CallSetID - Unique numeric ID for the row of genotypes - Optional
  • ...
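
As a rough illustration of how two of the Level 1 fields could be derived from one row of the callCodes matrix, here is a minimal Python sketch. It is not the geneSet implementation (geneSet is an R object); the function names, the use of None for a missing call, the genotype strings, and the marker name "m1_run1" are all assumptions made for the example.

```python
from typing import Optional, Sequence


def call_rate(calls: Sequence[Optional[str]]) -> float:
    """Fraction of samples with a non-missing genotype call (None = missing)."""
    if not calls:
        return 0.0
    called = sum(1 for c in calls if c is not None)
    return called / len(calls)


def is_polymorphic(calls: Sequence[Optional[str]]) -> bool:
    """True if the called genotypes fall into more than one class."""
    return len({c for c in calls if c is not None}) > 1


def level1_record(name: str, calls: Sequence[Optional[str]]) -> dict:
    """Build the Level 1 columns for one row of callCodes."""
    return {
        "Name": name,                        # Required, unique
        "CallRate": call_rate(calls),        # Recommended
        "Polymorph": is_polymorphic(calls),  # Optional
    }


# Hypothetical row of genotype calls for four samples, one of them missing.
row = ["A/A", "A/G", None, "G/G"]
record = level1_record("m1_run1", row)
```

Fields such as SegRatio and HWE would need an expected segregation ratio or a statistical test and are left out of this sketch.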

Level 2 data:

  • AssayID - Identifier for assay design used to interrogate a particular polymorphism and thus generate the given set of genotypes - Optional
  • AssayTable - Name of an object storing information about assay designs. You should be able to do a lookup in this table using AssayID. - Optional
  • ...
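
The AssayTable/AssayID pair describes a lookup by name and then by key. A minimal Python sketch of that idea follows, assuming the named tables are held in a plain dictionary; the table name "assay_designs", the ID 101, and the contents of the assay record are hypothetical.

```python
# Hypothetical assay-design table keyed by AssayID. The record content is
# illustrative only; here it just links the assay to a marker (Level 3).
assay_designs = {
    101: {"MarkerID": 7},
}

# Registry of named lookup tables, so AssayTable can be resolved by name.
tables = {"assay_designs": assay_designs}

marker_info_row = {
    "Name": "m1_run1",
    "AssayID": 101,
    "AssayTable": "assay_designs",
}


def lookup_assay(row, tables):
    """Return the assay-design record for a markerInfo row, or None."""
    table_name = row.get("AssayTable")
    assay_id = row.get("AssayID")
    if table_name is None or assay_id is None:
        return None  # both fields are optional, so absence is not an error
    return tables.get(table_name, {}).get(assay_id)


assay = lookup_assay(marker_info_row, tables)
```

The same pattern would apply to a marker-level table keyed by a marker ID.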

Level 3 data:

  • MarkerID/SNPID - Unique ID for the polymorphism - Optional
  • MarkerTable - Name of an object storing information about markers. You should be able to do a lookup in this table using MarkerID. - Optional
  • Gene - Gene name - Optional
  • RelPosition - Marker position relative to start of gene - Optional
  • AbsPosition - Physical position - Optional
  • GenPosition - Genetic map position - Optional
  • Chromosome - Chromosome - Optional
  • RefSNPID - Optional
  • AccessionID - Optional
  • Strand - Optional
  • SequenceID - Optional
  • Allele1 - Code for first allele (e.g. base of allele 1 of a SNP) - Optional
  • Allele2 - Code for second allele (e.g. base of allele 2 of a SNP) - Optional
  • SexLinked - Categorical variable indicating mode of inheritance: sex-linked (S), pseudo-autosomal (P), autosomal (A) - Optional
  • ...
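
SexLinked is the one Level 3 field with a closed set of codes, so it is easy to check. The Python sketch below is one possible check, assuming a missing value is represented as None and that any other code should be rejected; the function name is hypothetical.

```python
# The three codes listed for SexLinked and the mode of inheritance each denotes.
SEX_LINKED_CODES = {
    "S": "sex-linked",
    "P": "pseudo-autosomal",
    "A": "autosomal",
}


def check_sex_linked(value):
    """Return the mode of inheritance for a SexLinked code.

    None is passed through unchanged because the field is optional.
    Any other unknown value raises ValueError.
    """
    if value is None:
        return None
    if value not in SEX_LINKED_CODES:
        allowed = sorted(SEX_LINKED_CODES)
        raise ValueError(f"SexLinked must be one of {allowed}, got {value!r}")
    return SEX_LINKED_CODES[value]
```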

sampleInfo:

  • SampleID - Required, unique
  • SubjectID - Optional
  • ...
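
A "required, unique" constraint such as the one on SampleID can be checked mechanically. The Python sketch below assumes sampleInfo rows are represented as dictionaries and that an absent or empty value counts as missing; the helper name and the sample data are hypothetical. SubjectID is not checked, since it is optional and nothing above says it must be unique.

```python
from collections import Counter


def check_required_unique(rows, field):
    """Return a list of problems for a field that must be present and unique."""
    problems = []
    values = [row.get(field) for row in rows]

    # Required: every row must carry a non-empty value.
    missing = [i for i, v in enumerate(values) if v is None or v == ""]
    if missing:
        problems.append(f"{field} missing in rows {missing}")

    # Unique: no non-missing value may appear more than once.
    counts = Counter(v for v in values if v is not None and v != "")
    dups = sorted(v for v, n in counts.items() if n > 1)
    if dups:
        problems.append(f"{field} duplicated: {dups}")

    return problems


# Hypothetical sampleInfo rows.
sample_info = [
    {"SampleID": "S1", "SubjectID": "P1"},
    {"SampleID": "S2", "SubjectID": "P1"},  # second sample, same subject
    {"SampleID": "S2"},                     # duplicate SampleID
    {"SubjectID": "P3"},                    # SampleID missing
]
problems = check_required_unique(sample_info, "SampleID")
```

The same helper could be pointed at the Name column of markerInfo, which carries the same constraint.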

studyInfo:

  • Platform - Technology platform used to collect the data - Optional (though for some data, platform may vary from marker to marker, in which case this information should go in markerInfo)
  • ...
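
Because Platform may live in either studyInfo or markerInfo, code that reads it has to decide where to look. The Python sketch below shows one possible rule: use the per-marker value when present, otherwise fall back to the study-level value. That precedence is an assumption, not something specified above, and the function name and platform values are hypothetical.

```python
def platform_for(marker_row, study_info):
    """Per-marker Platform if present, else study-level Platform, else None."""
    value = marker_row.get("Platform")
    if value is not None:
        return value
    return study_info.get("Platform")


# Hypothetical study-level and marker-level records.
study_info = {"Platform": "platform-A"}
marker_default = {"Name": "m1_run1"}
marker_override = {"Name": "m2_run1", "Platform": "platform-B"}

resolved = [
    platform_for(marker_default, study_info),
    platform_for(marker_override, study_info),
]
```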


 
