Current approach for hmceuB36, etc.
Expression data:
ftp://ftp.sanger.ac.uk/pub/genevar/CEU_parents_norm_march2007.zip
etc.

SNP data:  note the getall.sh next door.

pull the files from http://ftp.hapmap.org/genotypes/2008-03/forward/non-redundant ...
They changed on 1 April from release 23 to 23a, to fix positions for
over 1000 snp!

The material at the moment is based on r23.  Stay tuned.


For previous versions, we used:

GETTING EXPRESSION DATA FOR Cheung/Spielman paper replications
--------------------------------------------------------------

On Oct 10 2006, the following command (named ko)

curl -O ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/samples/GSM$1/GSM$1.CEL.gz

was applied for a series of 5 digit accession numbers corresponding to samples identified
as coming from 58 unrelated individuals in the  CEPH

./ko 25349 ./ko 25356 ./ko 25358 ./ko 25360
./ko 25377 ./ko 25385 ./ko 25399 ./ko 25401 ./ko 25409 ./ko 25426 ./ko 25479 ./ko 25481
./ko 25524 ./ko 25526 ./ko 25528 ./ko 25530 ./ko 25540 ./ko 25542 ./ko 25548 ./ko 25550
./ko 25552 ./ko 25561 ./ko 25563 ./ko 25565 ./ko 25568 ./ko 25570 ./ko 25578 ./ko 25580
./ko 25624 ./ko 25626 ./ko 25628 ./ko 25630 ./ko 25632 ./ko 25634 ./ko 25656 ./ko 25658
./ko 25660 ./ko 25662 ./ko 25680 ./ko 25682 ./ko 25684 ./ko 25686 ./ko 48650 ./ko 48651
./ko 48652 ./ko 48653 ./ko 48654 ./ko 48655 ./ko 48656 ./ko 48657 ./ko 48658 ./ko 48659
./ko 48660 ./ko 48661 ./ko 48662 ./ko 48663 ./ko 48664 ./ko 48665

producing the collection

GSM25349.CEL.gz GSM25426.CEL.gz GSM25548.CEL.gz GSM25580.CEL.gz GSM25660.CEL.gz GSM48653.CEL.gz GSM48662.CEL.gz
GSM25356.CEL.gz GSM25479.CEL.gz GSM25550.CEL.gz GSM25624.CEL.gz GSM25662.CEL.gz GSM48654.CEL.gz GSM48663.CEL.gz
GSM25358.CEL.gz GSM25481.CEL.gz GSM25552.CEL.gz GSM25626.CEL.gz GSM25680.CEL.gz GSM48655.CEL.gz GSM48664.CEL.gz
GSM25360.CEL.gz GSM25524.CEL.gz GSM25561.CEL.gz GSM25628.CEL.gz GSM25682.CEL.gz GSM48656.CEL.gz GSM48665.CEL.gz
GSM25377.CEL.gz GSM25526.CEL.gz GSM25563.CEL.gz GSM25630.CEL.gz GSM25684.CEL.gz GSM48657.CEL.gz
GSM25385.CEL.gz GSM25528.CEL.gz GSM25565.CEL.gz GSM25632.CEL.gz GSM25686.CEL.gz GSM48658.CEL.gz
GSM25399.CEL.gz GSM25530.CEL.gz GSM25568.CEL.gz GSM25634.CEL.gz GSM48650.CEL.gz GSM48659.CEL.gz
GSM25401.CEL.gz GSM25540.CEL.gz GSM25570.CEL.gz GSM25656.CEL.gz GSM48651.CEL.gz GSM48660.CEL.gz
GSM25409.CEL.gz GSM25542.CEL.gz GSM25578.CEL.gz GSM25658.CEL.gz GSM48652.CEL.gz GSM48661.CEL.gz

to which RMA is applied, using bioconductor 1.9

Mapping from GSM number to NAxxxxx indexing of CEPH participants is in sample.IDs

"ceph_id" "sample_name" "geoacc"
"1" "1334.1" "12144" "GSM25548"
"2" "1334.1" "12144" "GSM25549"
"3" "1334.11" "12145" "GSM25550"
"4" "1334.11" "12145" "GSM25551"
"5" "1334.12" "12146" "GSM25552"
"6" "1334.12" "12146" "GSM25553"
"7" "1334.13" "12239" "GSM25570"
"8" "1334.13" "12239" "GSM25571"
"9" "1340.09" "06994" "GSM25358"
"10" "1340.09" "06994" "GSM25359"
"11" "1340.1" "07000" "GSM25360"
"12" "1340.1" "07000" "GSM25361"
"13" "1340.11" "07022" "GSM25377"
"14" "1340.11" "07022" "GSM25378"
"15" "1340.12" "07056" "GSM25401"
"16" "1340.12" "07056" "GSM25402"
"17" "1341.11" "07034" "GSM25385"
"18" "1341.11" "07034" "GSM25386"
"19" "1341.12" "07055" "GSM25399"
"20" "1341.12" "07055" "GSM25400"
"21" "1341.13" "06993" "GSM25356"
"22" "1341.13" "06993" "GSM25357"
"23" "1341.14" "06985" "GSM25349"
"24" "1341.14" "06985" "GSM25350"
"25" "1344.13" "12057" "GSM48660"
"26" "1345.12" "07357" "GSM25426"
"27" "1345.12" "07357" "GSM25427"
"28" "1345.13" "07345" "GSM25409"
"29" "1345.13" "07345" "GSM25410"
"30" "1346.11" "12043" "GSM25540"
"31" "1346.11" "12043" "GSM25541"
"32" "1346.12" "12044" "GSM25542"
"33" "1346.12" "12044" "GSM25543"
"34" "1347.14" "11881" "GSM25479"
"35" "1347.14" "11881" "GSM25480"
"36" "1347.15" "11882" "GSM25481"
"37" "1347.15" "11882" "GSM25482"
"38" "1349.13" "11839" "GSM48654"
"39" "1350.1" "11829" "GSM48650"
"40" "1350.11" "11830" "GSM48651"
"41" "1350.12" "11831" "GSM48652"
"42" "1350.13" "11832" "GSM48653"
"43" "1358.11" "12716" "GSM48662"
"44" "1358.12" "12717" "GSM48663"
"45" "1362.13" "11992" "GSM25524"
"46" "1362.13" "11992" "GSM25525"
"47" "1362.14" "11993" "GSM25526"
"48" "1362.14" "11993" "GSM25527"
"49" "1362.15" "11994" "GSM25528"
"50" "1362.15" "11994" "GSM25529"
"51" "1362.16" "11995" "GSM25530"
"52" "1362.16" "11995" "GSM25531"
"53" "1375.12" "12234" "GSM48661"
"54" "1408.1" "12154" "GSM25561"
"55" "1408.1" "12154" "GSM25562"
"56" "1408.11" "12236" "GSM25568"
"57" "1408.11" "12236" "GSM25569"
"58" "1408.12" "12155" "GSM25563"
"59" "1408.12" "12155" "GSM25564"
"60" "1408.13" "12156" "GSM25565"
"61" "1408.13" "12156" "GSM25566"
"62" "1416.11" "12248" "GSM25578"
"63" "1416.11" "12248" "GSM25579"
"64" "1416.12" "12249" "GSM25580"
"65" "1416.12" "12249" "GSM25581"
"66" "1420.09" "12003" "GSM48655"
"67" "1420.1" "12004" "GSM48656"
"68" "1420.11" "12005" "GSM48657"
"69" "1420.12" "12006" "GSM48658"
"70" "1444.13" "12750" "GSM25624"
"71" "1444.13" "12750" "GSM25625"
"72" "1444.14" "12751" "GSM25626"
"73" "1444.14" "12751" "GSM25627"
"74" "1447.09" "12760" "GSM25628"
"75" "1447.09" "12760" "GSM25629"
"76" "1447.1" "12761" "GSM25630"
"77" "1447.1" "12761" "GSM25631"
"78" "1447.11" "12762" "GSM25632"
"79" "1447.11" "12762" "GSM25633"
"80" "1447.12" "12763" "GSM25634"
"81" "1447.12" "12763" "GSM25635"
"82" "1454.12" "12812" "GSM25656"
"83" "1454.12" "12812" "GSM25657"
"84" "1454.13" "12813" "GSM25658"
"85" "1454.13" "12813" "GSM25659"
"86" "1454.14" "12814" "GSM25660"
"87" "1454.14" "12814" "GSM25661"
"88" "1454.15" "12815" "GSM25662"
"89" "1454.15" "12815" "GSM25663"
"90" "1459.09" "12872" "GSM25680"
"91" "1459.09" "12872" "GSM25681"
"92" "1459.1" "12873" "GSM25682"
"93" "1459.1" "12873" "GSM25683"
"94" "1459.11" "12874" "GSM25684"
"95" "1459.11" "12874" "GSM25685"
"96" "1459.12" "12875" "GSM25686"
"97" "1459.12" "12875" "GSM25687"
"98" "1463.15" "12891" "GSM48664"
"99" "1463.16" "12892" "GSM48665"
"100" NA "12056" "GSM48659"

The resulting expression matrix needs to be connected to the genotypes, using HMworkflow
in GGtools

GETTING GENOTYPE DATA
---------------------

see getall.sh, run on Sep 4 2006


wget http://www.hapmap.org/genotypes/latest/fwd_strand/non-redundant/genotypes_chr1_CEU_r21_nr_fwd.txt.gz
wget http://www.hapmap.org/genotypes/latest/fwd_strand/non-redundant/genotypes_chr2_CEU_r21_nr_fwd.txt.gz

...
