Identification and evaluation of causative genetic variants corresponding to a certain phenotype
Xidan Li
Outline
• SIT: identify and evaluate the causative genetic variants within a QTL/GWAS-defined region
• PASE: evaluate the effect of an amino acid substitution on the function of the host protein
• DIPT: identify causative genes underlying an expression phenotype
• Parallelizing computing
Working process of SIT
• Input: VCF file, Ensembl
• SNP analysis in non-coding regions: splicing sites, CpG islands, UTR regions
• SNP analysis in coding regions: non-synonymous SNPs, passed to PASE to give a ranked list of non-synonymous SNPs
• Output: candidate genes with candidate SNPs
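Formula for conservation calculation
• Conservation score: $(1 - 0.95^{N}) \times (n_{\text{observed}} / N_{\text{total}})$
• Pipeline: BLAST search, then ClustalW alignment
• The two factors are $1 - 0.95^{N}$ (probability of 20 different AAs in a position for N random, equally frequent sequences) and $n_{\text{observed}} / N_{\text{total}}$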
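The formula on this slide can be sketched in C as follows. This is a minimal illustration, not the PASE implementation: the function names are hypothetical, and the slide does not define its symbols, so the sketch assumes that N is the number of aligned sequences, n_observed is the number of them carrying the residue of interest at the position, and N_total is the number of sequences counted at that position. One reading of the first factor is that 0.95^N is the chance that a given amino acid is absent from all N random sequences when the 20 amino acids are equally frequent (probability 0.05 each).

```c
#include <assert.h>

/* 1 - 0.95^n_seqs: chance that a given amino acid appears at least once
 * at a position among n_seqs random sequences, assuming 20 equally
 * frequent amino acids (our reading of the slide, not a confirmed one). */
double random_coverage(int n_seqs)
{
    assert(n_seqs >= 0);
    double p_absent = 1.0;
    for (int i = 0; i < n_seqs; i++)
        p_absent *= 0.95;   /* residue missed by one more sequence */
    return 1.0 - p_absent;
}

/* Conservation score as written on the slide:
 * (1 - 0.95^N) * (n_observed / N_total).
 * The parameter meanings are assumptions stated in the text above. */
double conservation_score(int n_seqs, int n_observed, int n_total)
{
    assert(n_total > 0 && n_observed >= 0 && n_observed <= n_total);
    return random_coverage(n_seqs) * ((double)n_observed / (double)n_total);
}
```

Under these assumptions, a call such as conservation_score(100, 5, 5) gives a value close to 1, while a position supported by very few sequences is scaled down by the first factor even when every sequence agrees.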
Protein kinase AMP-activated gamma 3 (PRKAG3) gene
• R200Q in AMPKγ3 in purebred Hampshire pigs: the RN− allele
• V199I in AMPKγ3 participates in the effect together with R200Q
• RN− causes excess glycogen content in pig skeletal muscle
• Milan D, et al. (2000). A mutation in PRKAG3 associated with excess glycogen content in pig skeletal muscle. Science, 288(5469), 1248–1251.
• Ciobanu D, et al. (2001). Evidence for New Alleles in the Protein Kinase Adenosine Monophosphate-Activated γ3-Subunit Gene Associated With Low Glycogen Content in Pig Skeletal Muscle and Improved Meat Quality. Genetics, 159, 1151–1162.
• R200Q causes a major increase in muscle glycogen content
• V199I contributes a smaller effect
(Ciobanu et al., 2001)
Features
• Other tools, such as SIFT and PolyPhen, rely mainly on sequence conservation scores, which require finding homologous sequences.
• PASE uses a score for the change in physico-chemical properties and combines it with a sequence conservation score.
• PASE can therefore potentially analyze evolutionarily distant protein sequences.
Parallelizing computing
• Parallelization is usually applied to loops
• The data processed in each iteration must be independent
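The independence requirement can be illustrated with two small C loops. This is a sketch with hypothetical function names, not code from the talk: the first loop has fully independent iterations and could be split across CPU threads or GPU threads, while the second carries a dependence from one iteration to the next and cannot be split as written.

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: out[i] depends only on in[i], so the iterations
 * may run in any order or at the same time. A parallel runtime could hand
 * each index, or each chunk of indices, to a different thread. */
void scale_scores(const double *in, double *out, size_t n, double factor)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * factor;
}

/* Loop-carried dependence: iteration i needs the total produced by
 * iteration i - 1, so the iterations must run in order as written. */
void running_total(const double *in, double *out, size_t n)
{
    assert(in != NULL && out != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += in[i];
        out[i] = total;
    }
}
```

Per-element work such as scoring each SNP or each substitution separately has the shape of the first loop, which is why loops of that kind are the usual target for parallelization.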
CUDA vs. C

C:

    #include <stdio.h>

    int main(void)
    {
        printf("Hello World\n");
        return 0;
    }

CUDA:

    #include <cuda.h>
    #include <stdio.h>

    // Prototypes
    __global__ void helloWorld(char*);

    // Host function
    int main(int argc, char** argv)
    {
        int i;

        // desired output
        char str[] = "Hello World!";

        // mangle contents of output;
        // the null character is left intact for simplicity
        for (i = 0; i < 12; i++)
            str[i] -= i;

        // allocate memory on the device
        char *d_str;
        size_t size = sizeof(str);
        cudaMalloc((void**)&d_str, size);

        // copy the string to the device
        cudaMemcpy(d_str, str, size, cudaMemcpyHostToDevice);

        // set the grid and block sizes
        dim3 dimGrid(2);   // one block per word
        dim3 dimBlock(6);  // one thread per character

        // invoke the kernel
        helloWorld<<< dimGrid, dimBlock >>>(d_str);

        // retrieve the results from the device
        cudaMemcpy(str, d_str, size, cudaMemcpyDeviceToHost);

        // free up the allocated memory on the device
        cudaFree(d_str);

        // everyone's favorite part
        printf("%s\n", str);

        return 0;
    }

    // Device kernel
    __global__ void helloWorld(char* str)
    {
        // determine where in the thread grid we are
        int idx = blockIdx.x * blockDim.x + threadIdx.x;

        // unmangle output
        str[idx] += idx;
    }