Tuesday, July 13, 2010

Some useful SNP annotation databases:
Related web-tools
Gene-based test
  • VEGAS, or Versatile Gene-based Association Study

Monday, July 12, 2010

Convert PLINK file for Haploview

Suppose you have the PLINK binary fileset test.bed, test.bim, and test.fam.
1) Generate ped/map files
plink --noweb --bfile test --recode --alleleACGT --prune --out gene
This generates a ped file, gene.ped, with alleles coded as letters (instead of numbers), and a map file, gene.map, with the marker information.
The "--prune" option removes individuals with a missing phenotype. If you don't use "--prune", you may need to update the ped file yourself, because a PLINK ped file codes a missing phenotype as -9 while a Haploview/Linkage ped file uses 0. Here are some suggested R commands:
----------------------------------------------------------------------------
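# Read every column as text so an allele column made up entirely of "T" is not converted to TRUE
ped <- read.table("gene.ped", colClasses = "character")
# Column 6 is the phenotype: recode PLINK's missing value (-9) as 0
ped[ped[, 6] == "-9", 6] <- "0"
write.table(ped, file = "gene.ped2", row.names = FALSE, col.names = FALSE, quote = FALSE)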
----------------------------------------------------------------------------
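If R is not at hand, the same recoding can be sketched in the shell with awk. This is a minimal sketch, not part of the original recipe: the function name fix_missing_pheno is hypothetical, and it assumes the ped file is whitespace-delimited with the phenotype in column 6. Note that awk rewrites each changed line with single spaces between fields.

```shell
# Hypothetical helper: print a ped file with phenotype -9 recoded as 0.
# Assumes column 6 is the phenotype, as in a standard PLINK ped file.
fix_missing_pheno() {
    # Compare as a string so only an exact "-9" in column 6 is changed;
    # every other line is printed untouched.
    awk '{ if ($6 == "-9") $6 = "0"; print }' "$1"
}
```

Usage would look like: fix_missing_pheno gene.ped > gene.ped2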
2) Reduce the map file to marker name and position only
cut -f 2,4 gene.map > gene.map2
The resulting gene.ped (or gene.ped2, if you recoded the phenotype in R) and gene.map2 can be loaded into Haploview as "Linkage format".
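Before loading the files into Haploview, it may be worth checking that the two files agree. The sketch below relies on the ped layout of 6 leading columns followed by 2 allele columns per marker; the function name check_ped_map is hypothetical, and it assumes one marker per line in the two-column info file.

```shell
# Hypothetical helper: check that every ped row has 6 + 2 * (number of markers) columns.
# $1 = ped file, $2 = two-column marker info file (name, position).
check_ped_map() {
    # Count the non-empty lines in the info file (one marker per line).
    n_markers=$(awk 'NF > 0 { n++ } END { print n + 0 }' "$2")
    # Report any ped row whose column count does not match; exit non-zero if one is found.
    awk -v want="$((6 + 2 * n_markers))" \
        'NF > 0 && NF != want { printf "line %d: %d columns, expected %d\n", NR, NF, want; bad = 1 }
         END { exit bad + 0 }' "$1"
}
```

Usage would look like: check_ped_map gene.ped gene.map2 && echo "ped and map agree"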

Saturday, July 10, 2010

Shell command to convert PLINK output

PLINK results are separated by a variable number of spaces so that they print nicely, which is not convenient for programming.
1) Convert to a tab-separated file: first remove the spaces at the beginning of every line, then replace each remaining run of spaces with a tab (-r and \s require GNU sed)
sed -r 's/^\s+//g' plink.assoc | sed -r 's/\s+/\t/g' > plink.tab
2) Extract certain columns: cut uses tab as its default delimiter
sed -r 's/^\s+//g' plink.assoc | sed -r 's/\s+/\t/g' | cut -f 2 > SNPlist
If the file is separated by single spaces, like a fam file, use this:
cut -d " " -f 2
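An awk alternative to the sed pipeline above is sketched here. Assigning a field to itself makes awk rebuild the line with the output separator, which drops leading and trailing spaces and collapses each run of spaces in one step. The function names plink_to_tab and plink_col are hypothetical, and plink_col assumes the first line of the file is a header row.

```shell
# Hypothetical helper: rewrite a space-padded PLINK result file as tab-separated text.
plink_to_tab() {
    # "$1 = $1" forces awk to rejoin the fields using OFS (a tab here).
    awk -v OFS='\t' '{ $1 = $1; print }' "$1"
}

# Hypothetical helper: print one column chosen by its header name, without the header.
# $1 = PLINK result file, $2 = column name (for example SNP).
plink_col() {
    awk -v name="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i
                  if (!col) exit 1      # column name not found in the header
                  next }
        { print $col }' "$1"
}
```

Usage would look like: plink_to_tab plink.assoc > plink.tab, or plink_col plink.assoc SNP > SNPlist. Selecting by name avoids counting columns, at the cost of dropping the header line from the output.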

Resources for miRNA target prediction

Target prediction
  1. TargetScan (http://www.targetscan.org/): data can be directly downloaded
  2. miRanda (http://www.microrna.org/microrna/home.do)
  3. RNAhybrid (http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/): no downloadable data, but the software is available; there is also a web service that can be accessed through Perl or Java.
  4. PicTar (http://pictar.mdc-berlin.de/): online search, or data can be downloaded
  5. DIANA-MicroT (http://diana.cslab.ece.ntua.gr/microT/): data can be directly downloaded
Experimental database
  1. TarBase (http://diana.cslab.ece.ntua.gr/tarbase/): from the same DIANA lab
miRNA vs. disease
  1. miR2Disease Base (http://www.mir2disease.org/): data available
Comprehensive tools
  1. GOmir (http://www.bioacademy.gr/bioinformatics/projects/GOmir/): stand-alone Java tool

Next-generation sequencing seminar

Seminar Information:
--------------------------------
Tuesday, July 13, 2010
Vanderbilt University
Light Hall
2215 Garland Ave
Nashville, TN 37232
Room 512
-------------------------------

Seminar Schedule
-------------------------------
1:00 Registration
1:30 Introduction