Welcome to Rgenetics@Lazarus_Lab
Recent Bloggery
- Best practice defaults for specific tool executable versions and tool use contexts
- Things you'll probably need when planning a new Galaxy tool
- graphical grammar of plotting in R
- Painful but possible - getting Eigensoft3 to run on an ubuntu desktop
- functional test enhancements for Galaxy continued
- functional test enhancements for Galaxy
- Some R dependencies on ubuntu (moved - old from june 2009)
- Travails trying to get the WTCCC data
- Testing shellfish with hapmap3 data
- Notes on Galaxy clustering setup as at November 2009
Development notes on using plink to analyse dosage files
Useful links
Rgalaxy Development site
Rgalaxy screencasts and demonstrations - unstable!
Galaxy Wiki, Documentation, screencasts and source
Galaxy TEST site for Rgenetics Tools - development tools only
Other Lazarus_Lab activities: ESP Electronic Support for Public Health
(The old Rgenetics development site is archived here)
Olds
- November 2009: Backups recovered to new hardware after ugly hardware meltdown
- October 2009: Rgenetics on the Galaxy TEST instance (http://test.g2.bx.psu.edu)
- April 2009: Some design notes for a new project
- February 2009: Competitive renewal due for review June 4th
- Older news: News Archives
Rgenetics Project Summary
The Rgalaxy rgenetics and Rexpression toolkits for Galaxy are designed to give biologists easy access to a wide and growing range of popular third-party tools, such as Plink for handling whole genome genotype data and Bioconductor for gene expression data. This access is through a deceptively simple-looking web interface, provided by the Galaxy framework, which supports persistent, shareable workspaces where all steps are recorded, and transparent access to popular third-party data repositories such as UCSC and GEO. The Rgalaxy tools hide most of the ugly complexities of working with heterogeneous data layouts and command syntax and semantics from the biologist, and require little training for most users.
Doing Stuff
Essentially, if you have your genotype and pedigree data in Plink style linkage format (separate map and ped files), the steps to get them into Galaxy in a way that will allow the SNP/WGA tools to work are something like this:
1. Make yourself a new user account at the main Galaxy server (http://usegalaxy.org). This matters because if you are logged in, all your histories will be preserved between logins. Otherwise, as an anonymous user, your histories may change or vanish at irregular intervals.
2. From the analysis window, left (tool) pane, click the Get Data tool group header to expand the group, then click the 'upload file' tool. A form will appear in the center pane of your browser.
3. Change the file format (first field on the form) from Auto to "lped" format, as autodetect won't work for these multi-part datatypes.
4. Make the 'ped' and 'map' file upload fields point to the right map and ped files on your local machine, set the 'build' to hg18, change the name to reflect something informative about your data, then click execute.
5. After the data are uploaded (should only take a minute or two for a small file) to your history, you can select the SNP/WGA QC LD Plots tool submenu in the tool pane and then click the QC tool. Another form will open in the center panel. Your new dataset should be the only one available in the drop-down list of files to process. Change the QC job name to a meaningful name, click 'execute'. For a small dataset, the whole process should run for a few minutes but you can safely log out and log back in later - your work will all be preserved.
6. The QC tool output (in the right side history pane) has an 'eye' icon which you can click to open up the report in the center panel - you should see HWE/missingness/Mendel and all sorts of other useful plots and there are some tabular files containing summary details by marker and by sample.
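Before uploading in step 4, it can save a failed job to confirm that the ped and map files actually agree with each other. The following is a minimal sketch, not part of the rgenetics tools, and the function names are made up for illustration. It assumes the standard Plink linkage layout: four whitespace-separated columns per map line (chromosome, marker ID, genetic distance, base-pair position), and six leading columns per ped line (family, individual, father, mother, sex, phenotype) followed by two allele columns per marker. It reads each file fully into memory, so it is only suitable for small files.

```python
from pathlib import Path

# Leading ped columns: family, individual, father, mother, sex, phenotype.
PED_FIXED_COLUMNS = 6


def read_map_markers(map_text):
    """Return the marker IDs from the text of a 4-column Plink map file."""
    markers = []
    for line_no, line in enumerate(map_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 4:
            raise ValueError(
                f"map line {line_no}: expected 4 columns, found {len(fields)}"
            )
        markers.append(fields[1])  # second column holds the marker ID
    return markers


def check_ped_against_map(ped_text, n_markers):
    """Return the sample count, or raise ValueError on a malformed ped line."""
    expected = PED_FIXED_COLUMNS + 2 * n_markers
    n_samples = 0
    for line_no, line in enumerate(ped_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != expected:
            raise ValueError(
                f"ped line {line_no}: expected {expected} columns, "
                f"found {len(fields)}"
            )
        n_samples += 1
    return n_samples


def check_lped(ped_path, map_path):
    """Check a ped/map pair on disk and return (n_samples, n_markers)."""
    markers = read_map_markers(Path(map_path).read_text())
    n_samples = check_ped_against_map(Path(ped_path).read_text(), len(markers))
    return n_samples, len(markers)
```

Calling `check_lped("mydata.ped", "mydata.map")` (hypothetical file names) returns the sample and marker counts if the pair is consistent, and raises a `ValueError` naming the first offending line otherwise.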
Some Technical notes
Technical notes here including some notes on datatypes and installing a local mirror
The first products of this project, the rgenetics tools in BioConductor, are packaged for interactive use in R and distributed through the normal BioConductor channels. The Galaxy versions of the tools are currently available for testing here, or at the test Galaxy site. If you want your own private copy, you need to follow the instructions below. The rgalaxy data types are now in the main Galaxy trunk distribution. A large number of external packages are also needed to make full use of the rgenetics tools but an svn checkout will give you a flying start after you adjust your tool_conf.xml file appropriately based on the samples in the svn.
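As a rough illustration of the tool_conf.xml adjustment, the sketch below adds a tool section to a Galaxy tool configuration using Python's standard library. It assumes the usual tool_conf.xml layout of a `toolbox` root containing `section` elements that hold `tool` entries with a `file` attribute. The section id, section name, and tool file paths in the usage note are hypothetical placeholders; the real entries should be copied from the samples in the svn checkout.

```python
import xml.etree.ElementTree as ET


def add_tool_section(tool_conf_xml, section_id, section_name, tool_files):
    """Return tool_conf XML text with the given tools listed under a section.

    An existing section with the same id is reused, and tool files that are
    already listed in it are not added a second time.
    """
    root = ET.fromstring(tool_conf_xml)

    # Look for an existing section with the requested id.
    section = None
    for candidate in root.findall("section"):
        if candidate.get("id") == section_id:
            section = candidate
            break

    # Create the section if it was not found.
    if section is None:
        section = ET.SubElement(
            root, "section", {"id": section_id, "name": section_name}
        )

    # Append only the tool entries that are not already present.
    existing = {tool.get("file") for tool in section.findall("tool")}
    for tool_file in tool_files:
        if tool_file not in existing:
            ET.SubElement(section, "tool", {"file": tool_file})
            existing.add(tool_file)

    return ET.tostring(root, encoding="unicode")
```

For example, `add_tool_section(open("tool_conf.xml").read(), "example_section", "Example section", ["example_dir/example_tool.xml"])` returns the updated XML as a string, which can be reviewed before it is written back. Keep a backup of the original file, since a malformed tool_conf.xml can stop tools from loading.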
All of the software we use (including this Trac/Subversion site) and produce is licensed under an approved open source license.
Support
Work available here is supported by NIH R01HG003646 “A Genetic Association Research Statistical Framework” - Ross Lazarus is the PI.
Odds and Ends
Galaxy Ubuntu Jeos vmbuilder wrapper Channing Infrastructure
trac (this website) docs
- TracGuide -- Built-in Documentation
- The Trac project -- Trac Open Source Project
- Trac FAQ -- Frequently Asked Questions
- TracSupport -- Trac Support
For a complete list of local wiki pages, see TitleIndex.
Attachments
- Galaxy_IRB_briefing_paper_dec15.pdf (135.1 KB), added by rerla 3 years ago: IRB Briefing notes
