Computational Biology Dr. Jens Allmer Lecture Slides Week 5
MBG404 Overview Processing Pipelining Generation Data Storage Mining
HTML • What you need to know about HyperText Markup Language • How to get to it • Right-click the document in your browser • Make sure you do not click on an image, a link, or some other element that has its own context menu • Choose View Source or View Page Source • What’s in the source • Sometimes things are not visible or accessible on the web page but can be retrieved from the source
HTML Structure
<HTML>
  <HEAD>
    <TITLE>Page title seen in the title bar</TITLE>
    <!-- Some other links and scripts can be here -->
  </HEAD>
  <BODY>
    Text and other visible elements go here
  </BODY>
</HTML>
HTML Input
<FORM action="destination" method="POST/GET">
  <INPUT type="TYPES" name="" id="" value="" />
  <TEXTAREA name="" id="">value</TEXTAREA>
  <SELECT name="" id="">
    <OPTION value="">display</OPTION>
  </SELECT>
</FORM>
TYPES: { text, password, checkbox, radio, submit, reset, file, hidden, image, button }
Why? • Why do you need this information? • Some information may be inaccessible on the website • In the HTML code it will be accessible • Sometimes you may be interested in all settings for the programs that you used online • Often these settings are in hidden input fields (you need to check the source then)
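Reading hidden input fields out of a saved page source can also be done programmatically. The following is a minimal sketch using Python's standard html.parser module; the sample form and its field names (program, max_hits) are made up for illustration and do not come from any real website.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag != "input":
            return
        attributes = dict(attrs)
        if (attributes.get("type") or "").lower() == "hidden":
            self.hidden[attributes.get("name")] = attributes.get("value")


def hidden_fields(html_source):
    """Return a dict of all hidden input fields found in html_source."""
    collector = HiddenInputCollector()
    collector.feed(html_source)
    collector.close()
    return collector.hidden


# Hypothetical page source; the field names are invented for this example.
SAMPLE = """
<FORM action="search.cgi" method="POST">
  <INPUT type="text" name="query" value="" />
  <INPUT type="hidden" name="program" value="demo" />
  <INPUT type="hidden" name="max_hits" value="100" />
</FORM>
"""

if __name__ == "__main__":
    print(hidden_fields(SAMPLE))
```

To try it on a real page, save the output of View Source to a file and pass the file contents to hidden_fields.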
NCBI Blast • Contains many hidden variables; here are some:
End Theory I • Let’s do some practice now
Make a Box Whisker Plot • Use Excel or OpenOffice to make a Box-Whisker Plot • Follow along as I do it in Excel using BWP.xlsz on the website • Use the data you analyzed before
Excel Box-Whisker Plotting • A stock chart needs four series in columns (open, high, low, close) and at least four rows; otherwise Excel 2007/2010 swaps rows and columns, and you do not have enough series for that chart type. • Workaround: make a line chart with your limited data. You will get fewer than four series, with four X-axis categories. Right-click the chart, choose Select Data, and in the dialog click Switch Row/Column. Now you have four series with fewer than four points each. Right-click the chart, choose Change Chart Type, and choose the stock chart type you want.
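Whatever tool draws the chart, a box-whisker plot is built from five numbers: minimum, first quartile, median, third quartile, and maximum. As a rough sketch, these can be computed with Python's standard statistics module; the height values are made-up example data, and the choice of the "inclusive" quartile method is an assumption (other tools may interpolate differently).

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # n=4 asks for the three cut points that split the data into quarters.
    # method="inclusive" treats the data as including its own min and max.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


heights_cm = [150, 160, 165, 170, 185]  # made-up example data

if __name__ == "__main__":
    print(five_number_summary(heights_cm))
```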
XML / HTML / XHTML • For us: • HTML looks a lot like XML • HTML has predefined elements and attributes • In XML you can define your own elements and attributes • HTML is loose (because browsers are forgiving) • XML requires well-formedness and, when a DTD or schema is given, validity • XHTML tries to achieve this for HTML as well
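The difference in strictness can be seen with Python's standard library. In this small sketch, the XML parser rejects markup with unclosed tags, while the HTML parser accepts the same text. It checks well-formedness only; validating against a DTD or schema is not covered here. The helper names are chosen for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed_xml(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class StartTagCollector(HTMLParser):
    """Record every start tag the lenient HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(text):
    collector = StartTagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


sloppy = "<p>first paragraph<p>second paragraph"  # closing tags missing
tidy = "<p>first paragraph</p>"

if __name__ == "__main__":
    print(is_well_formed_xml(sloppy))  # the XML parser refuses this input
    print(is_well_formed_xml(tidy))
    print(html_start_tags(sloppy))     # the HTML parser still reads both tags
```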
Data • Look around you • Heights, weights, male, female, .... • Collect the raw data • Sort, organize, eyeball the data, ... • Graph it • Choose a proper way to graph the data
Information • Analyse the plot together with a fellow student • How difficult is it to make your graph understood? • Aim: • A quick look without explanation should suffice
Caption • Write a proper caption for your graph
Different Data • http://www.statsci.org/datasets.html • http://www.vahealth.org/childadolescenthealth/data.htm#ta3 • Choose some data and repeat the previous steps • Create two interesting graphs with captions
End Practice I • 5 min mindmapping • 15 min break
Database Management Systems Back Then
Database Management Systems [Figure: users work with View 1, View 2, and View 3; the views are defined over the conceptual schema, which is mapped to the physical schema of the stored DB]
A Relation is a Table
Beers
name          manf
Winterbrew    Pete’s
Bud Lite      Anheuser-Busch
• Attributes: the column headers (name, manf) • Tuples: the rows • Domain: all possible values of an attribute • A table that contains data is an instance
Schemas • Relation schema = relation name and attribute list. • Optionally: types of attributes. • Example: Beers(name, manf) or Beers(name: string, manf: string) • Database = collection of relations. • Database schema = set of all relation schemas in the database. • Instance of a relation = a table in a database with data
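The distinction between schema and instance can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for any relational database, and TEXT is assumed as the closest match to the string type in the Beers(name: string, manf: string) example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Relation schema: relation name plus attribute list (with types).
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")

# Instance: the tuples currently stored in the relation.
conn.executemany(
    "INSERT INTO Beers VALUES (?, ?)",
    [("Winterbrew", "Pete's"), ("Bud Lite", "Anheuser-Busch")],
)

# PRAGMA table_info returns one row per attribute; columns 1 and 2
# of each row are the attribute name and its declared type.
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Beers)")]
instance = conn.execute("SELECT name, manf FROM Beers ORDER BY name").fetchall()

if __name__ == "__main__":
    print("schema:  ", schema)
    print("instance:", instance)
```

Inserting or deleting rows changes the instance, while the schema stays the same.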
Anomalies • Goal of relational schema design is to avoid anomalies and redundancy. • Update anomaly : one occurrence of a fact is changed, but not all occurrences. • Deletion anomaly : valid fact is lost when a tuple is deleted.
Example of Bad Design
Drinkers(name, addr, beersLiked, manf, favBeer)
name      addr        beersLiked  manf    favBeer
Janeway   Voyager     Bud         A.B.    WickedAle
Janeway   ???         WickedAle   Pete’s  ???
Spock     Enterprise  Bud         ???     Bud
Data is redundant, because each of the ???’s can easily be figured out from the other tuples.
This Bad Design Also Exhibits Anomalies
name      addr        beersLiked  manf    favBeer
Janeway   Voyager     Bud         A.B.    WickedAle
Janeway   Voyager     WickedAle   Pete’s  WickedAle
Spock     Enterprise  Bud         A.B.    Bud
• Update anomaly: if Janeway is transferred to Intrepid, will we remember to change each of her tuples? • Deletion anomaly: if nobody likes Bud, we lose track of the fact that Anheuser-Busch manufactures Bud.
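Both anomalies can be reproduced in a few lines with Python's built-in sqlite3 module. This is an illustrative sketch that loads the three Drinkers tuples into an in-memory database; the careless UPDATE and the DELETE are chosen to trigger the problems on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Drinkers (name TEXT, addr TEXT, beersLiked TEXT, manf TEXT, favBeer TEXT)"
)
conn.executemany(
    "INSERT INTO Drinkers VALUES (?, ?, ?, ?, ?)",
    [
        ("Janeway", "Voyager", "Bud", "A.B.", "WickedAle"),
        ("Janeway", "Voyager", "WickedAle", "Pete's", "WickedAle"),
        ("Spock", "Enterprise", "Bud", "A.B.", "Bud"),
    ],
)

# Update anomaly: the address is changed in only one of Janeway's tuples.
conn.execute(
    "UPDATE Drinkers SET addr = ? WHERE name = ? AND beersLiked = ?",
    ("Intrepid", "Janeway", "Bud"),
)
janeway_addresses = sorted(
    row[0]
    for row in conn.execute("SELECT DISTINCT addr FROM Drinkers WHERE name = 'Janeway'")
)

# Deletion anomaly: once nobody likes Bud, its manufacturer is gone too.
conn.execute("DELETE FROM Drinkers WHERE beersLiked = 'Bud'")
bud_manufacturers = conn.execute(
    "SELECT manf FROM Drinkers WHERE beersLiked = 'Bud'"
).fetchall()

if __name__ == "__main__":
    print(janeway_addresses)   # two conflicting addresses for one person
    print(bud_manufacturers)   # empty: the fact about Bud's maker is lost
```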
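1st Normal Form • All attributes need to be atomic (one value per field, no lists or repeating groups)
2nd Normal Form • Must be in 1st NF • A key must uniquely identify each tuple, and every non-key attribute must depend on the whole key, not just on part of it
3rd Normal Form • Must be in 2nd NF • Attributes that are not part of a key must depend directly on a key, not indirectly through another non-key attribute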
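One way to apply the normal forms to the earlier Drinkers(name, addr, beersLiked, manf, favBeer) relation is to split it into three smaller relations. The split below is an assumption made for illustration (the table names Likes and Beers and the chosen keys are not prescribed above); it is sketched with Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Facts about a drinker depend only on the drinker's name.
CREATE TABLE Drinkers (name TEXT PRIMARY KEY, addr TEXT, favBeer TEXT);
-- The manufacturer depends only on the beer.
CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT);
-- Who likes what: the whole (drinker, beer) pair is the key.
CREATE TABLE Likes (drinker TEXT, beer TEXT, PRIMARY KEY (drinker, beer));
""")
conn.executemany(
    "INSERT INTO Drinkers VALUES (?, ?, ?)",
    [("Janeway", "Voyager", "WickedAle"), ("Spock", "Enterprise", "Bud")],
)
conn.executemany(
    "INSERT INTO Beers VALUES (?, ?)",
    [("Bud", "A.B."), ("WickedAle", "Pete's")],
)
conn.executemany(
    "INSERT INTO Likes VALUES (?, ?)",
    [("Janeway", "Bud"), ("Janeway", "WickedAle"), ("Spock", "Bud")],
)

# The address is stored once, so one row changes and nothing can disagree.
rows_changed = conn.execute(
    "UPDATE Drinkers SET addr = ? WHERE name = ?", ("Intrepid", "Janeway")
).rowcount

# Nobody likes Bud any more, yet its manufacturer is still on record.
conn.execute("DELETE FROM Likes WHERE beer = ?", ("Bud",))
bud_manf = conn.execute(
    "SELECT manf FROM Beers WHERE name = ?", ("Bud",)
).fetchone()[0]

if __name__ == "__main__":
    print(rows_changed, bud_manf)
```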
One-One Relationships • In a one-one relationship, each entity of either entity set is related to at most one entity of the other set. • Example: Relationship Best-seller between entity sets Manfs (manufacturer) and Beers. • A beer cannot be made by more than one manufacturer, and no manufacturer can have more than one best-seller (assume no ties).
Many-One Relationships • Some binary relationships are many-one from one entity set to another. • Each entity of the first set is connected to at most one entity of the second set. • But an entity of the second set can be connected to zero, one, or many entities of the first set.
Many-Many Relationships • Focus: binary relationships, such as Sells between Bars and Beers. • In a many-many relationship, an entity of either set can be connected to many entities of the other set. • E.g., a bar sells many beers; a beer is sold by many bars.
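The three kinds of relationship usually end up as different table layouts. The sketch below uses Python's built-in sqlite3 module; the table and column names (Manfs, BestSeller, Sells, and so on) and the bar names are assumptions for this example. Many-one is a plain reference column, one-one adds uniqueness on both sides, and many-many gets its own table of pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manfs (name TEXT PRIMARY KEY);
-- Many-one: each beer row names at most one manufacturer,
-- while one manufacturer may appear in many beer rows.
CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT REFERENCES Manfs(name));
-- One-one: manf is the key (one best-seller per manufacturer) and
-- beer is UNIQUE (a beer is best-seller for at most one manufacturer).
CREATE TABLE BestSeller (manf TEXT PRIMARY KEY REFERENCES Manfs(name),
                         beer TEXT UNIQUE REFERENCES Beers(name));
CREATE TABLE Bars (name TEXT PRIMARY KEY);
-- Many-many: a separate table holding (bar, beer) pairs.
CREATE TABLE Sells (bar TEXT REFERENCES Bars(name),
                    beer TEXT REFERENCES Beers(name),
                    PRIMARY KEY (bar, beer));
""")
conn.executemany("INSERT INTO Manfs VALUES (?)", [("Anheuser-Busch",), ("Pete's",)])
conn.executemany(
    "INSERT INTO Beers VALUES (?, ?)",
    [("Bud", "Anheuser-Busch"), ("Bud Lite", "Anheuser-Busch"), ("WickedAle", "Pete's")],
)
conn.executemany("INSERT INTO Bars VALUES (?)", [("Bar A",), ("Bar B",)])  # made-up bars
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?)",
    [("Bar A", "Bud"), ("Bar A", "WickedAle"), ("Bar B", "Bud")],
)
conn.execute("INSERT INTO BestSeller VALUES (?, ?)", ("Anheuser-Busch", "Bud"))

# A second best-seller for the same manufacturer violates the one-one rule.
try:
    conn.execute("INSERT INTO BestSeller VALUES (?, ?)", ("Anheuser-Busch", "Bud Lite"))
    second_best_seller_rejected = False
except sqlite3.IntegrityError:
    second_best_seller_rejected = True

beers_at_bar_a = [
    row[0]
    for row in conn.execute("SELECT beer FROM Sells WHERE bar = ? ORDER BY beer", ("Bar A",))
]
bars_selling_bud = [
    row[0]
    for row in conn.execute("SELECT bar FROM Sells WHERE beer = ? ORDER BY bar", ("Bud",))
]

if __name__ == "__main__":
    print(second_best_seller_rejected, beers_at_bar_a, bars_selling_bud)
```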
End Theory II • 5 min mindmapping • 10 min break
MS Access • Create new Tables: • Plant • Features • FeatureTypes
Create the Three Tables • Plant • Features • FeatureTypes
Add Attributes • Plant • ID • Gender • Species • Strain • Clone
Add Attributes • Features • ID • FeatureType • Value
Add Attributes • FeatureTypes • ID • Type • Unit
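For comparison with the MS Access steps, the same three tables can be sketched in SQL through Python's built-in sqlite3 module. Several things here are assumptions: the column types, the reading that the Type and Unit attributes belong to FeatureTypes, and the idea that Features.FeatureType refers to FeatureTypes.ID. No link between Features and Plant is listed above, so none is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Plant (
    ID      INTEGER PRIMARY KEY,
    Gender  TEXT,
    Species TEXT,
    Strain  TEXT,
    Clone   TEXT
);
CREATE TABLE FeatureTypes (
    ID   INTEGER PRIMARY KEY,
    Type TEXT,
    Unit TEXT
);
CREATE TABLE Features (
    ID          INTEGER PRIMARY KEY,
    FeatureType INTEGER REFERENCES FeatureTypes(ID),  -- assumed link
    Value       TEXT                                  -- assumed type
);
""")


def column_names(table):
    """Return the attribute names of a table, in declaration order."""
    # Table names cannot be bound as parameters; only call this with
    # the fixed names used in this example.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


tables = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
}

if __name__ == "__main__":
    for table in sorted(tables):
        print(table, column_names(table))
```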