360 likes | 608 Views
MySQL/PHP Workshop. http://www.trii.org/courses/mysql/. 2 MySQL lectures 2 PHP lectures Each lecture builds on concepts taught and learned in the previous lectures.
E N D
MySQL/PHP Workshop http://www.trii.org/courses/mysql/ • 2 MySQL lectures • 2 PHP lectures • Each lecture builds on concepts taught and learned in the previous lectures. • The first two lectures discuss the concept of a relational database such as MySQL and show you how to manipulate the data stored in the database from the command line. It is essential to learn this first because PHP makes use of the language of the database. • The third and fourth lectures will introduce you to PHP, a server-side scripting language that allows you to interact with the MySQL database from a web browser and create fancy web pages to display the data. PHP is the go-between that fetches the data from the MySQL database and then spits it out dynamically as the nicely formatted HTML page that the browser expects.
Class #1: Introduction to MySQL Manda Wilson, BIC, cBio, MSKCC Trii Training: Bioinformatics and Computational Methods • Relational databases • Database design • SQL • Creating databases • Creating tables • Selecting from, deleting, and updating tables • Exercises • You should have • Class notes • Exercise handout • MySQL Pocket Reference • Useful resource: http://dev.mysql.com/doc/
Relational Databases • A database is a collection of tables • Columns define attributes of the data • All data in a column must have the same data type • A record is stored in a row table name row column
Use a Relational Database When… • You have a very large dataset • There is redundant data • Wastes disk space • Increases errors • Information must be updated in multiple locations • Security is important • Different users can be granted different permissions • Strict enforcement of data types is important
Spreadsheet Example Data is inconsistent! Now imagine you are designing the New York Public Library database which has tens of million books and well over a million cardholders.
Database Design Entity Relationship Design If you are interested in learning more formal design methods look up “normalization” and/or “third normal form”.
Our data What entities or “objects” are defined here? Is there any redundant data? What happens if we want to add another species attribute (e.g. genus)?
Our tables How do we know which organism a gene belongs to? Do we have unique identifiers? Can the organism have more than one gene? Can the gene have more than one organism?
Our tables Means each gene has “one and only one” organism, but that each organism can have more than one gene. This is an example of an entity relationship diagram.
Our tables • A primary key is a unique identifier for a record in a table. • A foreign key in one table refers to the primary key of another. • The gene table has a foreign key, Organism_ID. Each record in the gene table will have an organism id to link it to the correct species record in the organism table.
Database Design Caveat • Sometimes design is “sacrificed” for speed.
Data Types • float • integer • tinyint • varchar(size) • stores strings • size can be between 0 - 255, inclusive • datetime • + more What data types should our attributes (columns) be?
Complete Design Database name: ensmartdb Gene ColumnData Type gene_id integer ensembl_gene_id varchar(50) organism_id integer name varchar(35) locuslink varchar(10) chromosome tinyint chromo_start integer chromo_end integer description varchar(255) Organism ColumnData Type organism_id integer taxonomy_id integer common_name varchar(35) species varchar(35)
Connecting to MySQL from the Command Line mysql -uusername -p Example: >mysql -uroot To EXIT MySQL: EXIT;
Basic SQL Commands • SQL statements end with a semicolon • View databases SHOW DATABASES;
Importing a Database • Creating a database CREATE DATABASE trii; • From the command line: mysql -uusername -ppassworddatabasename < filename.sql • Example: • mysql -uroot trii < trii.sql
Basic SQL Commands • Use database databasename USE databasename; • Display all tables in a database SHOW TABLES;
Create Table database name CREATE TABLE organism ( organism_id INTEGER NOT NULL AUTO_INCREMENT, taxonomy_id INTEGER NOT NULL, common_name VARCHAR(35) NOT NULL, species VARCHAR(35) NOT NULL, PRIMARY KEY (organism_id), UNIQUE (taxonomy_id) ); column names
View column details for a table DESC tablename;
Selecting all data SELECT * FROM tablename;
Select only the columns you need (it will be faster) SELECT common_name, species FROM organism;
Limiting your data • Get the species name for a specific organism (you have the id) SELECT species FROM organism WHERE organism_id=1; How do we select all the gene names for chromosome 1?
Insert • Inserting a gene INSERT INTO gene (ensembl_gene_id, organism_id, name, chromosome, chromo_start, chromo_end) VALUES (‘MY_NEW_GENE’, 1, ‘MY GENE’, 1, 12345, 67890); • Get the id of the gene: SELECT gene_id FROM gene WHERE name='MY GENE';
Delete/Update • Deleting a gene DELETE FROM gene WHERE gene_id=19; • Updating a gene UPDATE gene SET name=‘NEW NAME’ WHERE name=‘AKIP’;
Table Joins • Sometimes you want data from more than one table. To do this we join the tables. The result is a new (temporary) table.
Cross Join • SELECT gene.name, organism.species FROM gene, organism; Note: There are only two records in the gene table with the name “TNFRSF18”. One is a mouse gene and the other is a human gene. What do you think happened?
Cross Join (Continued) Each row of the gene table was joined with every row in the organism table.
Cross Join (Continued) • We want to use the organism id to match a gene record with the correct organism record so that we get: Remember there are two gene records with the name “TNFRSF18”.
Cross Join (Continued) SELECT gene.name, organism.species FROM gene, organism WHERE gene.organism_id=organism.organism_id;
Notice that we have 18 rows and that there are 18 rows in the gene table.
Class #2: Introduction to MySQL (Continued) • Table Joins • Where clause (continued) • Aggregate Functions • Backing up your database • Indexes (if time) • Importing large datasets (if time)