260 likes | 392 Views
Introduction to Computing Using Python. Data Storage and Processing. How many of you have taken IT 240? Databases and Structured Query Language Python Database Programming. Data storage. Introduction to Computing Using Python.
E N D
Introduction to Computing Using Python Data Storage and Processing • How many of you have taken IT 240? • Databases and Structured Query Language • Python Database Programming
Data storage Introduction to Computing Using Python We wish to store data about Web pages in a way that Python programs can access the data conveniently three.html five.html Chicago × 3 • Beijing × 6 Nairobi× 7 • Bogota × 2 Beijing × 3 • Paris × 5 • Chicago × 5 Chicago× 3 • Paris × 2 • Nairobi× 1 two.html four.html one.html Bogota× 3 • Beijing × 2 • Paris× 1
Data storage Introduction to Computing Using Python To do this, we will use a database three.html five.html Chicago × 3 • Beijing × 6 Nairobi× 7 • Bogota × 2 Beijing × 3 • Paris × 5 • Chicago × 5 Chicago× 3 • Paris × 2 • Nairobi× 1 two.html four.html one.html Bogota× 3 • Beijing × 2 • Paris× 1
Databases Introduction to Computing Using Python Keywords A database consists of one or more tables Each table has a name and consists of rows (records) andcolumns (attributes) Each attribute has a name and contains data of a specific type Hyperlinks
Database files Introduction to Computing Using Python Database files are not text files – you can’t read from or write to them directly Instead, communication is performed by commands written in a database language called Structured Query Language (SQL)
SQL SELECT FROM statement Introduction to Computing Using Python SQL statement SELECT is used make queries into a database. The result called a result table Hyperlinks SELECT Link FROM Hyperlinks result table
SQL SELECT FROM statement Introduction to Computing Using Python SQL statement SELECT is used make queries into a database. Keywords SELECT Url, Word FROM Keywords Result
SQL SELECT FROM statement Introduction to Computing Using Python SELECT statements can use *, a wild card Hyperlinks SELECT * FROM Hyperlinks
SQL DISTINCT keyword Introduction to Computing Using Python SQL keyword DISTINCT removes duplicate records in the result table Hyperlinks SELECT DISTINCT Link FROM Hyperlinks
SQL WHERE clause Introduction to Computing Using Python Keywords “In which pages does word X appear in?” SQL clause WHERE is used to select only those records that satisfy a condition SELECT Url FROM Keywords WHERE Word = 'Paris'
SQL WHERE clause Introduction to Computing Using Python SQL clause WHERE is used to select only those records that satisfy a condition SELECT Column(s) FROM Table WHERE Column BETWEEN value1 AND value2 SELECT Column(s) FROM Table WHERE Column operator value
Exercise Introduction to Computing Using Python • Write an SQL query that returns: • The URL of every page that has a link to web page four.html Keywords SELECT DISTINCT Url FROM Hyperlinks WHERE Link = 'four.html' Hyperlinks
Exercise Introduction to Computing Using Python • Write an SQL query that returns: • The URL of every page that has an incoming link from page four.html Keywords SELECT DISTINCT Link FROM Hyperlinks WHERE Url = 'four.html' Hyperlinks
Exercise Introduction to Computing Using Python • Write an SQL query that returns: • The URL and word for every word that appears exactly three times in the web page associated with the URL Keywords SELECT Url, Word from Keywords WHERE Freq = 3 Hyperlinks
Exercise Introduction to Computing Using Python • Write an SQL query that returns: • The URL, word, and frequency for every word that appears between 3 and 5 times, inclusive, in the web page associated with the URL Keywords SELECT * from Keywords WHERE Freq BETWEEN 3 AND 5 Hyperlinks
SQL built-in functions Introduction to Computing Using Python Keywords “How many pages contain the word Paris?” SQL includes built-in math functions such as COUNT() and SUM() SELECT COUNT(*) FROM Keywords WHERE Word = 'Paris' There are 3 web pages that mention Paris
SQL built-in functions Introduction to Computing Using Python Keywords SQL includes built-in math functions such as COUNT(), SUM() and AVG() SELECT SUM(Freq) FROM Keywords WHERE Word = 'Paris' There are a total of 8 occurrances s of ‘Paris’ on these web pages
Another example database Introduction to Computing Using Python weather.db contains two tables: weatherdata (city text, country text, season int, temperature float) seasons (attributes name text, number int) weatherdata seasons
Introduction to Computing Using Python • SQL queries involving multiple tables “What is the average summer temperature in Mumbai’?” • Assume we don’t know the number coding of seasons, then this question requires a lookup of both tables: • Use seasons to find match to season name • Useweatherdata to find temperature
Standard Library module sqlite3 Introduction to Computing Using Python The Python Standard Library includes module sqlite3 that allows Python programs to access databases >>> import sqlite3 >>> con = sqlite3.connect('web.db') • sqlite3 function connect() takes as input the name of a database and returns an object of type Connection, a type defined in module sqlite3 • The Connection object con is associated with database file web.db • If database file web.db does not exists in the current working directory, • a new database file web.db is created
Standard Library module sqlite3 Introduction to Computing Using Python >>> import sqlite3 >>> con = sqlite3.connect('web.db') >>> cur = con.cursor() • Connection method cursor() returns an object of type Cursor, another type defined in the module sqlite3 • Cursor objects are responsible for executing SQL statements
Standard Library module sqlite3 Introduction to Computing Using Python • The Python Standard Library includes module sqlite3 provides an API for accessing database files • It is an interface to a library of functions that accesses the database files directly >>> import sqlite3 >>> con = sqlite3.connect('web.db') >>> cur = con.cursor() >>> cur.execute("CREATE TABLE Keywords (Url text, Word text, Freq int)") <sqlite3.Cursor object at 0x100575730> >>> import sqlite3 >>> con = sqlite3.connect('web.db') >>> cur = con.cursor() >>> cur.execute("CREATE TABLE Keywords (Url text, Word text, Freq int)") <sqlite3.Cursor object at 0x100575730> >>> cur.execute("INSERT INTO Keywords VALUES ('one.html', 'Beijing', 3)") <sqlite3.Cursor object at 0x100575730> The Cursor class supports method execute() which takes an SQL statement as a string, and executes it Hardcoded values
Parameter substitution Introduction to Computing Using Python In general, the values used in an SQL statement will not be hardcoded in the program but come from Python variables >>> cur.execute("INSERT INTO Keywords VALUES ('one.html', 'Beijing', 3)") <sqlite3.Cursor object at 0x100575730> >>> url, word, freq = 'one.html', 'Paris', 5 >>>
Querying a database Introduction to Computing Using Python The result of a query is stored in the Cursor object To obtain the result as a list of tuple objects, Cursor method fetchall() is used >>> import sqlite3 >>> con = sqlite3.connect('links.db') >>> cur = con.cursor() >>> cur.execute('SELECT * FROM Keywords') <sqlite3.Cursor object at 0x102686960> >>> cur.fetchall() [('one.html', 'Beijing', 3), ('one.html', 'Paris', 5), ('one.html', 'Chicago', 5), ('two.html', 'Bogota', 5), ('two.html', 'Beijing', 2), ('two.html', 'Paris', 1), ('three.html', 'Chicago', 3), ('three.html', 'Beijing', 6), ('four.html', 'Chicago', 3), ('four.html', 'Paris', 2), ('four.html', 'Nairobi', 5), ('five.html', 'Nairobi', 7), ('five.html', 'Bogota', 2)] >>>
Querying a database Introduction to Computing Using Python An alternative is to iterate over the Cursor object >>> cur.execute('SELECT * FROM Keywords') <sqlite3.Cursor object at 0x102686960> >>> for record in cur: print(record) ('one.html', 'Beijing', 3) ('one.html', 'Paris', 5) ('one.html', 'Chicago', 5) ('two.html', 'Bogota', 5) ('two.html', 'Beijing', 2) ('two.html', 'Paris', 1) ('three.html', 'Chicago', 3) ('three.html', 'Beijing', 6) ('four.html', 'Chicago', 3) ('four.html', 'Paris', 2) ('four.html', 'Nairobi', 5) ('five.html', 'Nairobi', 7) ('five.html', 'Bogota', 2) >>>
Exercises Introduction to Computing Using Python In week10exercisesstart.py