Welcome to Introduction to Bioinformatics
Wednesday, 19 September 2014
Scenario 2: Regulatory protein / Simulations
- Array indexing
- List notation
- Push/Pop
- Shift/Unshift
- foreach / for
- Array-based DiceRoll.pl
Implementing a Simulation
General Strategy
Problem: How frequent are NtcA binding sites in a random DNA sequence?
Which random sequence?
Differentiation in cyanobacteria
Find primers to PCR out hetC
ttgtcagttgtcagacgtagtagcgcgtctagtctaatgtgttgttatat
tatttgctactagaaatgaggagagggttatttttctcactgcttcccaa
ttctatgagaatataaaattttccttaagtttctcatggcaataatggaa
aaaaccgaccattctgatgaataagtccggttttttccaaaaaatatttt
tgctttttcgctttatttatctatatttccaagttttagtacatcggtga
ggggtgacaactatcttgccaatattgtcgttattgttaggttgctatcg
gaaaaaatcTGTAacatgagaTACAcaatagcatttatatttgctttagt
atctctctcttgggtgggattctgcctgcaatttaaaaaccagtgttaac
aattttcggctttattttccgggagttaaatcaaccaagggaaaatgtaa
ctaatgtttaaatatcttcggatacacacaaagtaaaaccaatttttaca
gatgtcgatgttgctcacattttttagaaatattactaaattaaaaatgt
tattaaatttatgttcatagagaaccttttccaaataaaaaaataatttt
cctgatgttttaagaaaattactgttgttataaattaaaggtgattcaac
aaaatatagatagttctttcaataactatctacttttaccattaagtgaa
cttactcatgaataatcaacaggaattaaaaataaagttcatgaatactg
gttaaagattcagtaaagtttgaggaaataccggaataaatttccaccca
aatatgattttttaaaagatacattggcagtacattaaaatgccgatgtt
agataaatttgccttcatagctgttatctatttgctcagaactaagccaa
gagtttacacaccaaacagaaattaaactatgaatccctcttcgtcgtta
hetC...
GTA…(8)…TAC
Differentiation in cyanobacteria
ttctatgagaatataaaattttccttaagtttct
aaaaccgaccattctgatgaataagtccggtttt
tgctttttcgctttatttatctatatttccaagt
ggggtgacaactatcttgccaatattgtcgttat
gaaaaaatctGTAacatgagaTACacaatagcatttatatttgcttTAgtaTctctctcttgggtggg
NtcA binding site: GTA…(8)…TAC
Promoter: …(20-24)…TAnnnT
Implementing a Simulation
General Strategy
Problem: How frequent are NtcA binding sites in a random DNA sequence?
Which random sequence?
I still don't entirely understand why we only need to create 847 bp.
Implementing a Simulation
General Strategy
Problem: How frequent are NtcA binding sites in a random DNA sequence?
Strategy: Modify DiceRoll.pl
- (change to use arrays)
- Modify Make_random_sequence (SQ.1-3)
- Change Random_integer to Random nucleotide (SQ. 4)
- Modify Any_matches, test for exact match (SQ. 5)
- Modify Any_matches, allow inexact matches (SQ. 7-11)
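The strategy above can be sketched as a small stand-alone Perl program. This is only a sketch under stated assumptions, not the course's DiceRoll.pl: the sub names (random_nucleotide, make_random_sequence, has_ntca_site), the equal base frequencies, the 1000 trials, and the single-strand exact-match search are all illustrative choices. The 847 bp length and the GTA…(8)…TAC pattern come from the slides.

```perl
use strict;
use warnings;

# ASSUMPTION: all four nucleotides are equally likely.
my @nucleotides = ('a', 'c', 'g', 't');

# Pick one nucleotide at random (plays the role of Random_integer in a dice roll).
sub random_nucleotide {
    return $nucleotides[ int(rand(scalar @nucleotides)) ];
}

# Build a random sequence of the requested length.
sub make_random_sequence {
    my ($length) = @_;
    my @sequence;
    foreach my $position (1 .. $length) {
        push @sequence, random_nucleotide();
    }
    return join('', @sequence);
}

# Exact match only: GTA, any 8 nucleotides, then TAC (case-insensitive).
sub has_ntca_site {
    my ($sequence) = @_;
    return ($sequence =~ /gta.{8}tac/i) ? 1 : 0;
}

my $trials          = 1000;   # ASSUMPTION: illustrative number of trials
my $sequence_length = 847;    # length mentioned on the slide
my $hits            = 0;

foreach my $trial (1 .. $trials) {
    $hits++ if has_ntca_site(make_random_sequence($sequence_length));
}

printf "%d of %d random sequences contain GTA...(8)...TAC\n", $hits, $trials;
```

Changing @nucleotides (for example, repeating 'a' and 't' to mimic an AT-rich genome) is one way to explore the "Which random sequence?" question.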
The Alternative: Straight Math
Roll that doesn’t work | Roll that works
SQ1. Probability of getting at least one matched pair in a roll of five dice.
I don't remember combination and permutation math very well.
Probability (0 dice matching) =
Probability (1 dice matching) =
Probability (2 dice matching) =
Probability (3 dice matching) =
Probability (4 dice matching) =
Probability (5 dice matching) =
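One way to get the headline number in SQ1 without much combinatorics is to compute the complement: the chance that all five dice differ, then subtract from 1. The sketch below does that arithmetic in Perl; the sub name probability_all_different is hypothetical, and fair six-sided dice are assumed.

```perl
use strict;
use warnings;

# Probability that no two dice match:
# the first die can be anything (6/6), the second must differ (5/6), and so on.
sub probability_all_different {
    my ($dice, $sides) = @_;
    my $probability = 1;
    foreach my $already_rolled (0 .. $dice - 1) {
        $probability *= ($sides - $already_rolled) / $sides;
    }
    return $probability;
}

my $p_no_match = probability_all_different(5, 6);   # 720 / 7776

printf "P(no dice matching)          = %.4f\n", $p_no_match;       # 0.0926
printf "P(at least one matched pair) = %.4f\n", 1 - $p_no_match;   # 0.9074
```

A simulation with many rolls should approach the same value, which gives a useful check on DiceRoll.pl.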
SQ3: push / pop / shift / unshift
I am just learning about push, pop, shift, and unshift in 600. A quick review of all of these would greatly help.
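As a quick review: push and pop work on the END of an array, while unshift and shift work on the FRONT. The short sketch below uses a made-up array of numbers (not from the slides) so the effect of each call is easy to follow.

```perl
use strict;
use warnings;

my @list = (2, 3, 4);

push @list, 5, 6;          # add to the end:        (2, 3, 4, 5, 6)
my $last = pop @list;      # remove from the end:   $last is 6, list is (2, 3, 4, 5)
unshift @list, 1;          # add to the front:      (1, 2, 3, 4, 5)
my $first = shift @list;   # remove from the front: $first is 1, list is (2, 3, 4, 5)

print "last = $last, first = $first, list = @list\n";
# prints: last = 6, first = 1, list = 2 3 4 5
```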
Arrays: Assignment and Access
@codons = ATG GAT GCT TAT TTT CAA . . . TAA
Index:    0   1   2   3   4   5  . . .  n
Memory:   3200 (index 0) . . . ???? (index n)
Which $codons[ ] is GCT?
Where is $codons[1]? 3203 (3200 + 3)
Where is $codons[2]? 3206 (3200 + 6)
Where is $codons[n]? 3200 + 3*n
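The picture above amounts to "position = start + width * index". The sketch below mirrors that arithmetic for the six codons shown on the slide. It is a teaching model only: Perl does not actually store array elements this way, and the names $base_address, $codon_width, and slot_address are hypothetical.

```perl
use strict;
use warnings;

my @codons = ("ATG", "GAT", "GCT", "TAT", "TTT", "CAA");

my $base_address = 3200;   # start of the array, as drawn on the slide
my $codon_width  = 3;      # each codon occupies 3 positions

# Where the slide's model places element number $index.
sub slot_address {
    my ($index) = @_;
    return $base_address + $codon_width * $index;
}

foreach my $index (0 .. $#codons) {
    print "\$codons[$index] = $codons[$index] at ", slot_address($index), "\n";
}
```

Because the position is computed directly from the index, finding element n takes the same effort whether n is 2 or 2000.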
Arrays: Assignment and Access
Scalar assignment of array values:
my @days;
$days[0] = "Sun";
$days[1] = "Mon";
...
Array assignment of array values:
my @days = ("Sun", "Mon", ...);
my @numbers = (1 .. 47);
print @numbers;
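A small runnable sketch of the two assignment styles follows. The arrays are cut down (two days, five numbers) to keep the output short, and @weekend is a made-up name. It also shows what print does with a whole array.

```perl
use strict;
use warnings;

# Scalar assignment: fill the array one element at a time.
my @days;
$days[0] = "Sun";
$days[1] = "Mon";

# Array (list) assignment: fill the array in one statement.
my @weekend = ("Sat", "Sun");
my @numbers = (1 .. 5);      # the range operator builds the list 1, 2, 3, 4, 5

print @numbers, "\n";        # elements run together: 12345
print "@numbers\n";          # inside quotes they are separated by spaces: 1 2 3 4 5
print scalar(@days), "\n";   # number of elements in @days: 2
```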
Arrays: Assignment and Access
SQ2. @letters contains all uppercase letters. How to print the letter "J"?
my @letters =
print
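One possible answer to SQ2, offered as a sketch rather than the intended solution: the range operator can build the alphabet, and because indexing starts at 0, "J" (the tenth letter) sits at index 9.

```perl
use strict;
use warnings;

my @letters = ('A' .. 'Z');   # 'A' is index 0, 'B' is index 1, ...

print $letters[9], "\n";      # prints: J
```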
SQ3: push / pop / shift / unshift
SQ3. Predict output of:
@protein = ("cytochrome oxidase", "hexokinase", "glutamine synthetase");
push @protein, "phosphofructokinase", "albumin";
$protein[1] = "deleted";
unshift @protein, "globin";
$name1 = pop @protein;
$name2 = shift @protein;
$name3 = shift @protein;
print "name1 = $name1 name2 = $name2 name3 = $name3", $LF;
print "current protein[2] = $protein[2]", $LF;
print "remaining names: ", join(", ", @protein);
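To check a prediction, the SQ3 code can be run with the array contents traced in comments. In this sketch the variables are declared with my, and $LF is assumed to hold a newline (the slide does not show its definition).

```perl
use strict;
use warnings;

my $LF = "\n";   # ASSUMPTION: $LF is a line feed

my @protein = ("cytochrome oxidase", "hexokinase", "glutamine synthetase");
push @protein, "phosphofructokinase", "albumin";
# (cytochrome oxidase, hexokinase, glutamine synthetase, phosphofructokinase, albumin)
$protein[1] = "deleted";
# (cytochrome oxidase, deleted, glutamine synthetase, phosphofructokinase, albumin)
unshift @protein, "globin";
# (globin, cytochrome oxidase, deleted, glutamine synthetase, phosphofructokinase, albumin)
my $name1 = pop @protein;     # albumin (taken from the end)
my $name2 = shift @protein;   # globin (taken from the front)
my $name3 = shift @protein;   # cytochrome oxidase
# (deleted, glutamine synthetase, phosphofructokinase)

print "name1 = $name1 name2 = $name2 name3 = $name3", $LF;
# name1 = albumin name2 = globin name3 = cytochrome oxidase
print "current protein[2] = $protein[2]", $LF;
# current protein[2] = phosphofructokinase
print "remaining names: ", join(", ", @protein), $LF;
# remaining names: deleted, glutamine synthetase, phosphofructokinase
```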
SQ4: DiceRoll if with arrays
SQ4. Rewrite these lines to use an array
if ($number_of_ones >= $matches_wanted) { return $true }
if ($number_of_twos >= $matches_wanted) { return $true }
. . .
if ($number_of_sixes >= $matches_wanted) { return $true }
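One way the six if statements might collapse into a loop is sketched below. The array name @number_of, the sub name any_matches, and the convention that index 1 through 6 holds the count for that die face (index 0 unused) are assumptions for illustration, not the course's actual DiceRoll.pl.

```perl
use strict;
use warnings;

my $true  = 1;
my $false = 0;

# $number_of[$face] holds how many dice showed $face (index 0 is unused).
sub any_matches {
    my ($matches_wanted, @number_of) = @_;
    foreach my $face (1 .. 6) {
        if ($number_of[$face] >= $matches_wanted) { return $true }
    }
    return $false;
}

# Example roll of five dice: 1, 1, 3, 5, 6
my @number_of = (0, 2, 0, 1, 0, 1, 1);

print "pair found?   ", any_matches(2, @number_of), "\n";   # 1
print "triple found? ", any_matches(3, @number_of), "\n";   # 0
```

The loop body is written once, so changing to eight-sided dice would only mean changing the range.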
for loops
Problem: Add up the numbers from 1 to 100
for (my $number = 1; $number <= 100; $number = $number + 1) {
    $sum = $sum + $number;
}
- Where to begin?
- Where to end?
- How to get from here to there?
- What to do in between?
foreach loops
Problem: Add up the numbers from 1 to 100
foreach my $number (1 .. 100) {
    $sum = $sum + $number;
}
- Where to begin?
- Where to end?
- How to get from here to there?
- What to do in between?
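For comparison, here are both loop styles as a complete program. The variable names $sum_for and $sum_foreach are made up so the two results can be printed side by side.

```perl
use strict;
use warnings;

# for: begin, end test, and step are all spelled out.
my $sum_for = 0;
for (my $number = 1; $number <= 100; $number = $number + 1) {
    $sum_for = $sum_for + $number;
}

# foreach: the list (1 .. 100) supplies begin, end, and step.
my $sum_foreach = 0;
foreach my $number (1 .. 100) {
    $sum_foreach = $sum_foreach + $number;
}

print "for:     $sum_for\n";       # for:     5050
print "foreach: $sum_foreach\n";   # foreach: 5050
```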
foreach loops
SQ5. Write a loop that prints out a table of numbers from 1 to 20 and their squares.
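A possible solution sketch for SQ5 follows. The column widths in sprintf and the @rows array are arbitrary choices made for this example.

```perl
use strict;
use warnings;

my @rows;
foreach my $number (1 .. 20) {
    # %3d and %5d right-align the number and its square in fixed-width columns.
    push @rows, sprintf("%3d %5d", $number, $number * $number);
}

print "$_\n" foreach @rows;
```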
SQ6: Rewrite DiceRoll
SQ6. Replace $number_of_ones and similar variables with an array.
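A minimal sketch of the tallying step with an array in place of $number_of_ones, $number_of_twos, and so on. The sub names roll_dice and count_faces and the array name @number_of are hypothetical; the real DiceRoll.pl may be organized differently.

```perl
use strict;
use warnings;

# Roll the requested number of six-sided dice.
sub roll_dice {
    my ($number_of_dice) = @_;
    my @roll;
    foreach my $die (1 .. $number_of_dice) {
        push @roll, 1 + int(rand(6));
    }
    return @roll;
}

# Tally the roll: $number_of[$face] counts dice showing $face (index 0 unused).
sub count_faces {
    my (@roll) = @_;
    my @number_of = (0) x 7;
    foreach my $face (@roll) {
        $number_of[$face] = $number_of[$face] + 1;
    }
    return @number_of;
}

my @roll      = roll_dice(5);
my @number_of = count_faces(@roll);

print "roll:   @roll\n";
print "counts: @number_of[1 .. 6]\n";   # counts for faces 1 through 6
```

Using the die face itself as the index means no chain of if statements is needed to decide which counter to increase.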