We need to specify a list of required and recommended fields for the markerInfo, sampleInfo, and studyInfo slots of a geneSet object.
markerInfo:
- Currently a data frame. Columns can describe any of at least three levels: (1) the particular set of genotypes stored in the corresponding row of the matrix in the callCodes slot, i.e. a "callSet" in Scott's terminology; (2) the assay design used to generate that set of genotypes; and (3) the marker (i.e. polymorphism) whose genotypes the assay design measures.
Level 1 data:
- Name - Marker name, or more generally, a unique name for the corresponding row of the callCodes matrix - Required, unique
- CallRate - Genotyping Call Rate - Recommended
- ErrorRate - Genotyping Error Rate - Optional
- Polymorph - Indicator of whether row of genotypes contains more than one class - Optional
- SegRatio - Indicator of whether row of genotypes conforms sufficiently to an expected segregation ratio - Optional
- HWE - Indicator of whether row of genotypes is in Hardy-Weinberg Equilibrium - Optional
- UniqueLocus - Indicator of whether row of genotypes is from an assay that appears to be detecting a unique locus - Optional
- OtherQC - Other QC measure for the row of genotypes - Optional
- Description - Descriptive name - Optional
- AssayRunID - Unique numeric ID for the row of genotypes - Optional
- CallSetID - Unique numeric ID for the row of genotypes - Optional
- ...
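As a rough illustration of how two of the Level 1 fields could be derived from a single row of callCodes, here is a minimal sketch in Python (chosen only for illustration; this page does not specify an implementation). It assumes that missing calls are represented as None and that genotype classes are plain strings such as "A/A"; both are assumptions, since the callCodes encoding is not defined here. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def call_rate(calls: Sequence[Optional[str]]) -> float:
    """Fraction of samples in one callCodes row that have a non-missing call."""
    if len(calls) == 0:
        return math.nan  # no samples, so the rate is undefined
    called = sum(1 for c in calls if c is not None)
    return called / len(calls)


def is_polymorphic(calls: Sequence[Optional[str]]) -> bool:
    """True if the non-missing calls fall into more than one genotype class."""
    return len({c for c in calls if c is not None}) > 1


# Made-up row of genotypes for four samples; None marks a missing call.
row = ["A/A", "A/G", None, "G/G"]
print(call_rate(row))       # 0.75
print(is_polymorphic(row))  # True
```

The other Level 1 indicators (SegRatio, HWE, UniqueLocus) need statistical tests and thresholds that this page does not define, so they are left out of the sketch.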
Level 2 data:
- AssayID - Identifier for assay design used to interrogate a particular polymorphism and thus generate the given set of genotypes - Optional
- AssayTable - Name of an object storing information about assay designs. You should be able to do a lookup in this table using AssayID. - Optional
- ...
Level 3 data:
- MarkerID/SNPID - Unique ID for the polymorphism - Optional
- MarkerTable - Name of an object storing information about markers. You should be able to do a lookup in this table using MarkerID. - Optional
- Gene - Gene name - Optional
- RelPosition - Marker position relative to start of gene - Optional
- AbsPosition - Physical position - Optional
- GenPosition - Genetic map position - Optional
- Chromosome - Chromosome - Optional
- RefSNPID - Optional
- AccessionID - Optional
- Strand - Optional
- SequenceID - Optional
- Allele1 - Code for first allele (e.g. base of allele 1 of a SNP) - Optional
- Allele2 - Code for second allele (e.g. base of allele 2 of a SNP) - Optional
- SexLinked - Categorical variable indicating mode of inheritance: sex-linked (S), pseudo-autosomal (P), autosomal (A) - Optional
- ...
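The AssayTable and MarkerTable fields each name an object that can be indexed by the matching ID. A minimal sketch of that lookup follows, written in Python for illustration only. The registry of named tables, the table names, the IDs, and the record contents are all made up; the page only says that a lookup by AssayID or MarkerID should be possible.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for "objects by name": table name -> {ID -> record}.
tables: Dict[str, Dict[str, Dict[str, Any]]] = {
    "assayDesigns": {"A001": {"Description": "hypothetical assay design"}},
    "markers": {"M001": {"Gene": "GENE1", "Chromosome": "1"}},
}

# One markerInfo row carrying Level 2 and Level 3 pointers (made-up values).
marker_info_row = {
    "Name": "row1",
    "AssayID": "A001",
    "AssayTable": "assayDesigns",
    "MarkerID": "M001",
    "MarkerTable": "markers",
}


def lookup(row: Dict[str, Any], id_field: str, table_field: str,
           registry: Dict[str, Dict[str, Dict[str, Any]]]) -> Optional[Dict[str, Any]]:
    """Return the record that row[id_field] points to in the table named by
    row[table_field], or None if either field or the record is absent."""
    table_name = row.get(table_field)
    key = row.get(id_field)
    if table_name is None or key is None:
        return None  # both fields are Optional in the list above
    return registry.get(table_name, {}).get(key)


print(lookup(marker_info_row, "MarkerID", "MarkerTable", tables))
print(lookup(marker_info_row, "AssayID", "AssayTable", tables))
```

Keeping assay-level and marker-level details in separate tables means they are stored once per assay or marker instead of being repeated on every markerInfo row that refers to them.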
sampleInfo:
- SampleID - Required, unique
- SubjectID - Optional
- ...
studyInfo:
- Platform - Technology platform used to collect the data - Optional (for some data the platform may vary from marker to marker, in which case this information should go in markerInfo instead)
- ...
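Once the field lists are settled, the Required and Recommended markings above could be enforced by a validity check. The following is a minimal sketch in Python, for illustration only: it models each slot as a mapping from column name to a list of values, which is an assumption about representation, and the function name and message wording are hypothetical. Only the fields marked Required or Recommended above are encoded.

```python
from typing import Dict, List, Sequence

# Fields taken from the lists above; everything else is Optional.
REQUIRED_UNIQUE = {"markerInfo": ["Name"], "sampleInfo": ["SampleID"]}
RECOMMENDED = {"markerInfo": ["CallRate"]}


def check_slot(slot_name: str, columns: Dict[str, Sequence]) -> List[str]:
    """Return a list of problems found in one slot; an empty list means OK."""
    problems = []
    for field in REQUIRED_UNIQUE.get(slot_name, []):
        if field not in columns:
            problems.append(f"error: {slot_name} is missing required field {field}")
            continue
        values = list(columns[field])
        if any(v is None for v in values):
            problems.append(f"error: {slot_name}.{field} has missing values")
        if len(set(values)) != len(values):
            problems.append(f"error: {slot_name}.{field} is not unique")
    for field in RECOMMENDED.get(slot_name, []):
        if field not in columns:
            problems.append(f"warning: {slot_name} lacks recommended field {field}")
    return problems


print(check_slot("sampleInfo", {"SampleID": ["s1", "s2", "s2"]}))
print(check_slot("markerInfo", {"Name": ["m1", "m2"]}))
```

Reporting missing recommended fields as warnings instead of errors keeps an object valid while still flagging what could be added.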