
TODO For Alpha Release (0.8?) -----------------------------

Goals -----

  • Basic support for loading and manipulating genetic data

    Status:

    • David will try importing his data week of 15th
    • Scott will try importing his data week of 22nd
  • Support for calculating allele frequencies, HWE, and LD.

    Status:

    • HWE: Inefficient implementation in place.
    • alleleFreq: Done
    • LD: Inefficient biallelic implementation in place.
    • Ross has code extracted from LDMAX C++ code from the GOLD package in place. Need to get appropriate permission from Goncalo Abecasis.

      + Ross will follow up with Goncalo

  • Support for FBAT - see the fbat package

    Status: Done. Vignette in place.

  • Basic tools for including genetic data in models (e.g., carrier, allele.count, homozygote, ...)

    Status: Need to create vignette based on R-News article.

  • Ability to produce a nice marker summary report

    Status: Done. Vignette in place, may need more editing.

  • Access to haplo.score code.
    • Basic Wrappers in place. Check with Weiliang to see if in SVN.
  • Tool for applying a model to each marker

    Status:

    Greg has code from another project that does this.

    We should benchmark.

  • A vignette demonstrating how to
    • read data from various formats (genotypes and phenotypes)
    • a nice marker summary report for each marker
    • perform HWE and LD calculations
    • apply FBAT
    • add annotation data to the markers

      Look at SNPer? package.

    • apply a statistical model to each marker & produce a nice report

    Status:

    • All present except for the annotation task.
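
As a rough illustration of the allele frequency and HWE goals above, here is a minimal sketch in Python. It is not the package's implementation: the function names `allele_freq` and `hwe_chisq`, the representation of a genotype as a pair of allele labels, and the choice of a one-degree-of-freedom chi-square test for a biallelic marker are all assumptions made for the example.

```python
from collections import Counter
from math import erfc, sqrt


def allele_freq(genotypes):
    """Return {allele: frequency} for a list of (allele1, allele2) pairs.

    Entries that are None (missing genotypes) are skipped.
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype is None:
            continue
        counts.update(genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def hwe_chisq(genotypes):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic marker.

    Returns (statistic, p_value) using 1 degree of freedom.
    Hypothetical helper; assumes exactly two alleles are observed.
    """
    # Treat A/T and T/A as the same unordered genotype.
    observed = Counter(tuple(sorted(g)) for g in genotypes if g is not None)
    n = sum(observed.values())
    freq = allele_freq(genotypes)
    if len(freq) != 2:
        raise ValueError("this sketch expects exactly two alleles")
    a, b = sorted(freq)
    p, q = freq[a], freq[b]
    expected = {(a, a): n * p * p, (a, b): 2 * n * p * q, (b, b): n * q * q}
    stat = sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())
    # Survival function of a chi-square variable with 1 df.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value
```

For 25 A/A, 50 A/T and 25 T/T samples the observed counts equal the expected ones, so the statistic is 0 and the p-value is 1.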

Specific Tasks --------------

  • Make list of required and recommended fields for markerInfo, sampleInfo, and studyInfo.

    markerInfo:

    • "Name" - Descriptive name - Required
    • "Marker" - Marker name - Optional
    • "Gene" - Gene name - Optional
    • "RelPosition" - Marker position relative to start of gene - Optional
    • "AbsPosition" - Physical position
    • "CallRate" - Genotyping call rate - Recommended

    sampleInfo:

    • "SampleID" - Required
    • "SubjectID" - Optional

    studyInfo: "Platform" - technology platform used to collect the data - Optional

    ...

  • as.data.frame: creates a data frame by taking the sampleInfo data frame and adding genotype data as columns.
  • as.geneSet for "data.frame": accepts a data frame, a vector/list of column indexes/names containing gene information, and optional information on how to interpret the gene information, e.g. (..., format=sep, sep=/), (..., format=sep, sep=''), (..., format=longstring, codes=c(r,a,h) ) (count';sep=1, format=
  • Create a sample-size / experimental design package "GeneticsDesign"

    pull in code from genetics / genutils / ...

    + Weiliang to create + Scott + Greg to toss things in
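
To make the separator-based formats in the as.geneSet item concrete, here is a small Python sketch of how a genotype column might be split into allele pairs. The names `parse_genotype` and `parse_columns` are hypothetical, rows are modeled as plain dictionaries rather than a data frame, and only the two `format=sep` variants are covered because the remaining formats are not fully specified above.

```python
def parse_genotype(text, sep="/"):
    """Split a genotype string into a tuple of two alleles.

    sep="/" reads "A/T"; sep="" reads "AT" (one character per allele).
    """
    alleles = tuple(text) if sep == "" else tuple(text.split(sep))
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles in {text!r}")
    return alleles


def parse_columns(rows, marker_columns, sep="/"):
    """Return copies of the rows with the listed columns parsed into allele pairs.

    Columns not named in marker_columns (e.g. sample identifiers) pass through.
    """
    wanted = set(marker_columns)
    return [
        {key: parse_genotype(value, sep) if key in wanted else value
         for key, value in row.items()}
        for row in rows
    ]
```

Leaving non-marker columns unchanged mirrors the as.data.frame item above, where genotype data sit alongside the sample information in one table.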

TODO for Release 1.0 --------------------

12. Document treatment of missing genotype or allele values!

    • NA --> No knowledge of this genotype
    • A/NA or NA/A --> partial knowledge
    • NA/NA mapped to NA

14. Add a quality column to marker.info slot
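
The missing-value rules in item 12 could be expressed roughly as follows. This is a sketch only: `normalize_genotype` is a hypothetical name, Python's `None` stands in for R's `NA`, and the "/" separator and the "NA" missing code are assumed defaults.

```python
def normalize_genotype(text, sep="/", missing="NA"):
    """Apply the missing-value rules to one genotype string.

    - a missing genotype (None or "NA")  -> None (no knowledge)
    - "A/NA" or "NA/A"                   -> pair with one None (partial knowledge)
    - "NA/NA"                            -> None (mapped to a missing genotype)
    """
    if text is None or text == missing:
        return None
    alleles = tuple(None if a == missing else a for a in text.split(sep))
    if all(a is None for a in alleles):
        return None
    return alleles
```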

TODO for Eventually -------------------

- Create a comprehensive set of unit tests

Completed Tasks ---------------

Split useful code out of readGenes.* for use in geneCodeSet constructor.

13. Translate the object manipulation functions in the R genetics package

2. When creating a geneCodeSet, check that every value in the TransTable column of the markerInfo slot matches a name in the transTables slot!

  1. Write check methods?
  2. We have decided to require row and column names in callCodes and errorMetrics slots. We will need to check consistency with markerInfo and sampleInfo slots, and enforce that the dimnames exist.
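
A validity check along these lines might look like the sketch below. It uses plain Python containers as stand-ins for the slots, and `check_consistency` is a hypothetical name. The orientation of callCodes is not stated here, so the sketch assumes rows are markers and columns are samples.

```python
def check_consistency(marker_info, sample_ids, trans_tables, row_names, col_names):
    """Return a list of problem descriptions; an empty list means consistent.

    marker_info:  list of dicts with "Name" and "TransTable" keys
    sample_ids:   list of sample identifiers from sampleInfo
    trans_tables: dict mapping translation table name -> table
    row_names, col_names: dimnames of callCodes, or None if absent
    (assumption: rows are markers, columns are samples)
    """
    problems = []
    known = set(trans_tables)
    for marker in marker_info:
        if marker["TransTable"] not in known:
            problems.append(
                f"marker {marker['Name']}: unknown translation table "
                f"{marker['TransTable']!r}"
            )
    if row_names is None or col_names is None:
        problems.append("callCodes must have both row and column names")
    else:
        if list(row_names) != [m["Name"] for m in marker_info]:
            problems.append("callCodes row names do not match markerInfo")
        if list(col_names) != list(sample_ids):
            problems.append("callCodes column names do not match sampleInfo")
    return problems
```

Returning every problem at once, rather than stopping at the first, is a design choice for the example; a real validity method could equally fail fast.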

Create 1 example data set each.

- allele list and missing value code specification to readGenes functions.

  • code to decodeCallCodes to properly handle missing value codes
  • code to alleleNames, and AlleleCodes to properly handle missing value codes
  • methods for subset '[' '[<-' '[[' '[[<-'

Port LD code from genetics to GeneticsBase.

Optimize performance of readGenes functions. (Use scan instead of read.table).

Function to read hapmap .ped files

add print method -- fixed show method

1. Break source file into multiple files?

14. Create R helpfile templates --> Jason + Nitin

  1. Should callCodes slot be of mode "integer", not "numeric"? Currently, I AM using "integer". --> KEEP AS INTEGER

7. Write function decodeCallCodes. Call this in, e.g. show method for class geneCallSet. This function should loop over components of the transTables slot, NOT over rows of the callCodes slot! The efficiency gain can be VERY large for large objects.

  1. Revise show method for class geneCallSet (see 7. above).
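
The loop structure described in item 7 can be sketched as follows. This is an illustration in Python with dictionaries standing in for the slots, and `decode_call_codes` here is a hypothetical reimplementation, not the package function. The outer loop runs over translation tables, so each table is looked up once and all markers that share it are decoded together; in R the same grouping would let one vectorized lookup per table replace one per row.

```python
def decode_call_codes(call_codes, marker_tables, trans_tables):
    """Decode integer call codes into genotype strings, one table at a time.

    call_codes:    {marker name: [integer code per sample]}
    marker_tables: {marker name: translation table name}
    trans_tables:  {table name: {integer code: genotype string}}
    Codes absent from a table decode to None (missing).
    """
    decoded = {}
    for table_name, table in trans_tables.items():
        # All markers that share this translation table are handled together.
        markers = [m for m, t in marker_tables.items() if t == table_name]
        for marker in markers:
            decoded[marker] = [table.get(code) for code in call_codes[marker]]
    return decoded
```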

10. Add ploidy slot?

11. Add phase slot? How will we encode phase info?

  1. Yes/No for all (logical scalar)
  2. Yes/No for each marker (logical vector)
  3. phaseObject (TBD): observation by marker by phase probabilities + definitions of contigs + probability of contigs

##### # Old #####


 
