Welcome to Rgenetics@Lazarus_Lab

Recent Bloggery

Development notes on using plink to analyse dosage files

 Rgalaxy Development site

Rgalaxy screencasts and demonstrations - unstable!

 Galaxy Wiki, Documentation, screencasts and source

 Main Galaxy instance

 Galaxy TEST site for Rgenetics Tools - development tools only

Other Lazarus_Lab activities:  ESP Electronic Support for Public Health

(The old Rgenetics development site is archived here)

Olds

November 2009
October 2009
April 2009
February 2009
Older news

Backups recovered to new hardware after ugly hardware meltdown
http://test.g2.bx.psu.edu Rgenetics on the Galaxy TEST instance
Some design notes for a new project
Competitive renewal due for review June 4th
News Archives

Rgenetics Project Summary

The Rgalaxy rgenetics and Rexpression toolkits for Galaxy are designed to give biologists easy access to a wide and growing range of popular third-party tools, such as Plink for handling whole genome genotype data and Bioconductor for gene expression data. This access is through a deceptively simple-looking web interface, provided by the Galaxy framework, which supports persistent, shareable workspaces where all steps are recorded, and transparent access to popular third-party data repositories such as UCSC and GEO. The Rgalaxy tools hide most of the ugly complexities of heterogeneous data layouts and command syntax and semantics from the biologist, and require little training for most users.

Doing Stuff

Essentially, if you have your genotype and pedigree data in Plink style linkage format (separate map and ped files), the steps to get them into Galaxy in a way that will allow the SNP/WGA tools to work are something like this:

1. Make yourself a new user account at the main Galaxy server (http://usegalaxy.org). This matters because if you are logged in, all your histories will be preserved between logins. Otherwise, as an anonymous user, your histories will change or vanish unpredictably.

2. From the analysis window, left (tool) pane, click the Get Data tool group header to expand the group, then click the 'upload file' tool. A form will appear in the center pane of your browser.

3. Change the file format (first field on the form) from Auto to "lped", as autodetection won't work for these multi-part datatypes.

4. Make the 'ped' and 'map' file upload fields point to the right ped and map files on your local machine, set the 'build' to hg18, change the name to something informative about your data, then click execute.

5. After the data are uploaded (should only take a minute or two for a small file) to your history, you can select the SNP/WGA QC LD Plots tool submenu in the tool pane and then click the QC tool. Another form will open in the center panel. Your new dataset should be the only one available in the drop-down list of files to process. Change the QC job name to a meaningful name, click 'execute'. For a small dataset, the whole process should run for a few minutes but you can safely log out and log back in later - your work will all be preserved.

6. The QC tool output (in the right side history pane) has an 'eye' icon which you can click to open up the report in the center panel - you should see HWE/missingness/Mendel and all sorts of other useful plots and there are some tabular files containing summary details by marker and by sample.
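Before uploading, it can save a failed job to check that the ped and map files agree with each other. The following is a minimal sketch only, assuming standard whitespace-delimited Plink linkage files (a map file with 3 or 4 columns per marker, and a ped file with 6 leading columns followed by two alleles per marker). The function names are hypothetical and are not part of Rgenetics, Galaxy or Plink.

```python
from typing import List


def count_map_markers(map_lines: List[str]) -> int:
    """Count markers in a Plink .map file (one marker per non-empty line)."""
    count = 0
    for line_no, line in enumerate(map_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # A .map line has 4 columns (chromosome, marker id, genetic
        # distance, bp position); the 3-column variant omits the distance.
        if len(fields) not in (3, 4):
            raise ValueError(
                f"map line {line_no}: expected 3 or 4 columns, got {len(fields)}"
            )
        count += 1
    return count


def check_ped_against_map(ped_lines: List[str], n_markers: int) -> int:
    """Check each .ped row has 6 leading columns plus 2 alleles per marker.

    Returns the number of samples (non-empty rows) seen.
    """
    expected = 6 + 2 * n_markers
    n_samples = 0
    for line_no, line in enumerate(ped_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != expected:
            raise ValueError(
                f"ped line {line_no}: expected {expected} columns, got {len(fields)}"
            )
        n_samples += 1
    return n_samples


if __name__ == "__main__":
    # Tiny made-up example: 2 markers, 2 samples.
    example_map = ["1 rs0001 0 1000", "1 rs0002 0 2000"]
    example_ped = [
        "FAM1 IND1 0 0 1 2 A A G T",
        "FAM1 IND2 0 0 2 1 A C G G",
    ]
    n_markers = count_map_markers(example_map)
    n_samples = check_ped_against_map(example_ped, n_markers)
    print(f"{n_markers} markers, {n_samples} samples")
```

For real files, pass the lines read from each file, for example `open("mydata.map").read().splitlines()`, where `mydata.map` is a placeholder for your own file name.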

Some Technical notes

Technical notes here including some notes on datatypes and installing a local mirror

The first products of this project, the rgenetics tools in BioConductor, are packaged for interactive use in R and distributed through the normal BioConductor channels. The Galaxy versions of the tools are currently available for testing here, or at the test Galaxy site. If you want your own private copy, you need to follow the instructions below. The rgalaxy data types are now in the main Galaxy trunk distribution. A large number of external packages are also needed to make full use of the rgenetics tools but an svn checkout will give you a flying start after you adjust your tool_conf.xml file appropriately based on the samples in the svn.
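As a rough illustration of the tool_conf.xml adjustment, the sketch below adds a section of tool entries to a Galaxy tool_conf.xml document. It assumes the usual layout of a `toolbox` root containing `section` elements, each holding `tool` elements with a `file` attribute. The section id, section name and tool file paths used here are hypothetical placeholders; the real entries should be copied from the samples in the svn.

```python
import xml.etree.ElementTree as ET
from typing import List


def add_tool_section(
    conf_xml: str, section_id: str, section_name: str, tool_files: List[str]
) -> str:
    """Return tool_conf.xml text with the given tools listed in a section.

    The section is created if it does not already exist, and tool files
    that are already listed in it are not added a second time.
    """
    root = ET.fromstring(conf_xml)
    if root.tag != "toolbox":
        raise ValueError(f"expected a <toolbox> root element, got <{root.tag}>")

    # Reuse an existing section with this id, otherwise append a new one.
    section = root.find(f"./section[@id='{section_id}']")
    if section is None:
        section = ET.SubElement(
            root, "section", {"id": section_id, "name": section_name}
        )

    existing = {tool.get("file") for tool in section.findall("tool")}
    for path in tool_files:
        if path not in existing:
            ET.SubElement(section, "tool", {"file": path})
            existing.add(path)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder configuration and tool paths, for illustration only.
    original = '<toolbox><section id="getext" name="Get Data" /></toolbox>'
    updated = add_tool_section(
        original,
        "example_section",
        "Example tools",
        ["example/tool_a.xml", "example/tool_b.xml"],
    )
    print(updated)
```

Working on a copy of tool_conf.xml and comparing it against the svn samples before restarting Galaxy is the safer route.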

All of the software we use (including this Trac/Subversion site) and produce is licensed under an approved open source license.

Support

Work available here is supported by NIH R01HG003646 “A Genetic Association Research Statistical Framework” - Ross Lazarus is the PI.

Odds and Ends

Galaxy Ubuntu Jeos vmbuilder wrapper

Channing Infrastructure

trac (this website) docs

For a complete list of local wiki pages, see TitleIndex.
