TODO For Alpha Release (0.8?)
-----------------------------

Goals
-----
- Basic support for loading and manipulating genetic data
Status:
- David will try importing his data week of 15th
- Scott will try importing his data week of 22nd
- Support for calculating allele frequencies, HWE, and LD.
Status:
- HWE: Inefficient implementation in place.
- alleleFreq: Done
- LD: Inefficient biallelic implementation in place.
- Ross has code extracted from LDMAX C++ code from the GOLD package
in place. Need to get appropriate permission from Goncalo Abecasis.
+ Ross will follow up with Goncalo
- Support for FBAT - see the fbat package
Status: Done. Vignette in place.
- Basic tools for including genetic data in models (ie, carrier,
allele.count, homozygote, ...)
Status: Need to create vignette based on R-News article.
- Ability to produce a nice marker summary report
Status: Done. Vignette in place, may need more editing.
- Access to haplo.score code.
- Basic Wrappers in place. Check with Weiliang to see if in SVN.
- Tool for applying a model to each marker
Status:
Greg has code from another project that does this.
We should benchmark.
- A vignette demonstrating how to
- read data from various formats (genotypes and phenotypes)
- a nice marker summary report for each marker
- perform HWE and LD calculations
- apply FBAT
- add annotation data to the markers
Look at SNPer? package.
- apply a statistical model to each marker & produce a nice report
Status:
- All present except for annotation task.
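
The allele frequency and HWE goals above can be illustrated with a minimal sketch. It is written in Python purely so it is self-contained; it is not the package's implementation. It assumes genotypes are stored as two-allele tuples with None for missing calls, and it uses a plain Pearson chi-square test with 1 degree of freedom for a biallelic marker. The helper names (allele_freq, hwe_chisq, allele_count, carrier, homozygote) are hypothetical and only echo the names used in this list.

```python
from collections import Counter
from math import erfc, sqrt


def allele_freq(genotypes):
    """Return {allele: frequency}; genotypes are (a1, a2) tuples, None = missing."""
    counts = Counter(a for g in genotypes if g is not None for a in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def hwe_chisq(genotypes, ref, alt):
    """Pearson chi-square test (1 df) of HWE for a biallelic marker.

    Assumes every called genotype contains only the alleles `ref` and `alt`.
    Returns (statistic, p_value).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    n = len(called)
    p = sum(g.count(ref) for g in called) / (2 * n)  # frequency of ref allele
    q = 1 - p
    # Observed counts of genotypes carrying 0, 1, or 2 copies of alt.
    observed = [sum(1 for g in called if g.count(alt) == k) for k in (0, 1, 2)]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    if min(expected) == 0:
        raise ValueError("marker is monomorphic; test is undefined")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


def allele_count(genotype, allele):
    """Number of copies of `allele` in one genotype (0, 1, or 2)."""
    return genotype.count(allele)


def carrier(genotype, allele):
    """True if the genotype has at least one copy of `allele`."""
    return allele in genotype


def homozygote(genotype, allele):
    """True if both alleles of the genotype are `allele`."""
    return genotype.count(allele) == 2
```

The indicator helpers at the end correspond to the kind of per-genotype covariates that would be included in a model; an exact test would be a reasonable alternative to the chi-square test for small samples.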

Specific Tasks
--------------
- Make list of required and recommended fields for markerInfo,
sampleInfo, and studyInfo.
markerInfo:
  "Name"        - Descriptive name - Required
  "Marker"      - Marker name - Optional
  "Gene"        - Gene name - Optional
  "RelPosition" - Marker position relative to start of gene - Optional
  "AbsPosition" - Physical position
  "CallRate"    - Genotyping Call Rate - Recommended
sampleInfo:
  "SampleID"    - Required
  "SubjectID"   - Optional
studyInfo:
  "Platform"    - Technology platform used to collect the data - Optional
...
- as.data.frame: creates a data frame by taking the sampleInfo data frame and adding genotype data as columns.
- as.geneSet for "data.frame": accepts a data frame and a vector/list of column
indexes/names containing gene information, plus optional information on how to
interpret the gene information, i.e. (..., format=sep, sep=/),
(..., format=sep, sep=''), (..., format=longstring, codes=c(r,a,h)),
(count';sep=1, format=
- Create a sample-size / experimental design package "GeneticsDesign"
pull in code from genetics / genutils / ...
+ Weiliang to create
+ Scott + Greg to toss things in
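
As a rough illustration of the as.geneSet task above, the separator-based formats (format=sep with sep=/ or sep='') could be parsed along the following lines. This is a Python sketch under stated assumptions, not the planned R interface: the function name parse_genotype, the `missing` parameter, and the choice of "" and "NA" as missing-value codes are all hypothetical.

```python
def parse_genotype(value, sep="/", missing=("", "NA")):
    """Split one genotype string into a pair of alleles.

    sep="/" handles strings such as "A/T"; sep="" treats each character
    as one allele, so "AT" becomes ("A", "T"). Values listed in `missing`
    (an assumed set of missing-value codes) are returned as None.
    """
    if value is None:
        return None
    value = value.strip()
    if value in missing:
        return None
    alleles = tuple(value.split(sep)) if sep else tuple(value)
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles, got {value!r}")
    return alleles


def parse_column(values, sep="/"):
    """Parse every genotype string in one column of a data table."""
    return [parse_genotype(v, sep=sep) for v in values]
```

Note that with sep='' a missing code such as "NA" is ambiguous with a genotype made of alleles N and A, so the set of missing-value codes would have to be chosen per data set.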

TODO for Release 1.0
--------------------

TODO for Eventually
-------------------
- Create a comprehensive set of unit tests

---------------
Completed Tasks
---------------
- Split useful code out of readGenes.* for use in geneCodeSet constructor.
- Write check methods?
- We have decided to require row and column names in callCodes and errorMetrics slots. We will need to check consistency with markerInfo and sampleInfo slots, and enforce that the dimnames exist.
- code to decodeCallCodes to properly handle missing value codes
- code to alleleNames and AlleleCodes to properly handle missing value codes
- methods for
subset '[' '[<-' '[[' '[[<-'
genetics to GeneticsBase.
- Should callCodes slot be of mode "integer", not "numeric"? Currently, I am using "integer". --> KEEP AS INTEGER
- Revise show method for class geneCallSet (see 7. above).
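
The dimnames decision recorded above (row and column names required in callCodes and errorMetrics, and consistent with markerInfo and sampleInfo) amounts to a check like the one sketched below. It is Python for illustration only, not the package's check method. The function name check_dimnames is hypothetical, and the orientation (rows are markers, columns are samples) is an assumption, since the notes do not state it.

```python
def check_dimnames(row_names, col_names, marker_names, sample_ids):
    """Return a list of problems; an empty list means the names are consistent.

    Assumes rows correspond to markers and columns to samples.
    """
    problems = []
    # Enforce that both sets of names exist before comparing them.
    for label, names in (("row", row_names), ("column", col_names)):
        if not names:
            problems.append(f"{label} names are missing")
    if problems:
        return problems
    if list(row_names) != list(marker_names):
        problems.append("row names do not match markerInfo names")
    if list(col_names) != list(sample_ids):
        problems.append("column names do not match sampleInfo IDs")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a validity check report every inconsistency at once.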

#####
# Old
#####