Brief introduction to relational database and big data analysis
Kunihiko Kaneko
Relational Database
• Problems in data sharing
  • Data is encoded in data files
  • Can other users understand the data files?
• A relational database is a standard for the following:
  • data format (i.e. the way to encode data)
  • data operations (query and update)
  • the way to describe the data format
  • the way to describe constraints
Describing the data format
• A relational database is a set of tables
• General form of a data format description:
  table_name(attribute name 1, attribute name 2, ...)
• Examples of data format descriptions:
  product(id, product_name, type, cost, created_at)
  score(name, score, student_name, created_at, updated_at)
Describing constraints
• Data format description:
  score(name, score, student_name, created_at, updated_at)
• Constraints description (SQL language)
  • Keywords: INTEGER, REAL, TEXT, DATETIME, NOT NULL, UNIQUE, PRIMARY KEY, etc.
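As a minimal sketch of how such a constraints description might look in practice, the following Python program uses the standard sqlite3 module to create the score table and then tries a few inserts. The column types, the UNIQUE constraint, and the sample rows are assumptions made for illustration; the slide gives only the attribute names and the list of keywords.

```python
import sqlite3

# Assumed types and constraints; the slide lists only the attribute names.
SCHEMA = """
CREATE TABLE score (
    name         TEXT     NOT NULL,
    score        INTEGER  NOT NULL,
    student_name TEXT     NOT NULL,
    created_at   DATETIME,
    updated_at   DATETIME,
    UNIQUE (name, student_name)
);
"""


def try_insert(conn, row):
    """Insert one row; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO score "
                "(name, score, student_name, created_at, updated_at) "
                "VALUES (?, ?, ?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample rows.
stamp = "2024-01-01 09:00:00"
ok = try_insert(conn, ("math", 85, "alice", stamp, stamp))
dup = try_insert(conn, ("math", 90, "alice", stamp, stamp))   # violates UNIQUE
null = try_insert(conn, ("math", None, "bob", stamp, stamp))  # violates NOT NULL

print(ok, dup, null)
```

Only the first row is stored. The database itself rejects the other two, so every program that shares the table sees the same guarantees.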
Data format of a relational database
• A relational database is a set of tables
• Each table is a set of rows
Database Browser (SQLiteman)
[Screenshot: the list of the table names in a database, the database command editor, and a table]
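The list of table names that a database browser displays can also be obtained with a query. The sketch below assumes an SQLite database and reads the built-in sqlite_master catalog; the helper name list_tables and the column types of the two example tables are hypothetical.

```python
import sqlite3


def list_tables(conn):
    """Return the names of the user tables in an SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
# Column types are assumptions; the slides give only the attribute names.
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, product_name TEXT, "
    "type TEXT, cost REAL, created_at DATETIME)"
)
conn.execute(
    "CREATE TABLE score (name TEXT, score INTEGER, student_name TEXT, "
    "created_at DATETIME, updated_at DATETIME)"
)

print(list_tables(conn))
```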
[Diagram: data sources in various data formats; a relational database for data storage, with its description of data formats and constraints; interactive commands (written in the SQL language); programs (SQL statements embedded in a programming language)]
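To illustrate what SQL statements embedded in a programming language can look like, here is a minimal sketch in Python with the standard sqlite3 module. It runs one query and one update against the product table. The column types, the sample rows, and the helper names products_of_type and raise_cost are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; the slides give only the attribute names.
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, product_name TEXT NOT NULL, "
    "type TEXT, cost REAL, created_at DATETIME)"
)
# Hypothetical sample rows.
with conn:
    conn.executemany(
        "INSERT INTO product (product_name, type, cost, created_at) "
        "VALUES (?, ?, ?, ?)",
        [
            ("apple", "fruit", 100.0, "2024-01-01 09:00:00"),
            ("orange", "fruit", 80.0, "2024-01-01 09:00:00"),
            ("milk", "drink", 150.0, "2024-01-01 09:00:00"),
        ],
    )


def products_of_type(conn, type_name):
    """Query: names and costs of all products of one type."""
    return conn.execute(
        "SELECT product_name, cost FROM product "
        "WHERE type = ? ORDER BY product_name",
        (type_name,),
    ).fetchall()


def raise_cost(conn, type_name, amount):
    """Update: add an amount to the cost of every product of one type."""
    with conn:
        cur = conn.execute(
            "UPDATE product SET cost = cost + ? WHERE type = ?",
            (amount, type_name),
        )
    return cur.rowcount  # number of rows changed


changed = raise_cost(conn, "fruit", 10.0)
print(changed, products_of_type(conn, "fruit"))
```

The `?` placeholders pass values separately from the SQL text, so the program does not have to build SQL strings by hand.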
Currency exchange data

Description of data formats and constraints:

cat >/tmp/a.$$.sql <<-SQL
create table quote (
  seq INTEGER PRIMARY KEY NOT NULL,
  at datetime,
  USD real, GBP real, EUR real, CAD real, CHF real, SEK real,
  DKK real, NOK real, AUD real, NZD real, ZAR real, BHD real,
  IDR100 real, CNY real, HKD real, INR real, MYR real, PHP real,
  SGD real, KRW100 real, THB real, KWD real, SAR real, AED real,
  MXN real, PGK real, HUF real, CZK real, PLN real, RUB real,
  TRY real, a01 real, IDR100b real, CNYb real, MYRb real,
  KRW100b real, TWD real
);
SQL
cat /tmp/a.$$.sql | sqlite3 /tmp/quotedb

A program to read the data source and store it into the database:

cat >/tmp/a.$$.sql <<-SQL
.mode csv
.import /tmp/a.$$.csv quote
SQL
# Drop the header line of the source CSV before importing it.
tail -n +2 /tmp/Book1.csv > /tmp/a.$$.csv
cat /tmp/a.$$.sql | sqlite3 /tmp/quotedb
date

Plot program:

library(ggplot2)
# table_to_melt is a helper that is not shown on the slide; T holds the quote table.
M <- table_to_melt(T, T$at, "%Y/%m/%d")
ggplot(M, aes(x=Date, y=Value, colour=factor(AttrNum))) + geom_point(size=1)
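The same load-and-reshape steps can be sketched in Python. This is an illustration under stated assumptions, not the author's program. The quote table is cut down to two currency columns, the CSV content is made-up sample data, and the melt function is only a guess at what the table_to_melt helper does: it turns each wide row into (Date, AttrNum, Value) records, which is the shape the plot call expects.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Made-up sample data standing in for the source CSV file.
SAMPLE_CSV = """at,USD,EUR
2020/01/06,108.5,121.1
2020/01/07,108.4,121.3
"""

conn = sqlite3.connect(":memory:")
# Reduced version of the quote table (only two currency columns).
conn.execute(
    "CREATE TABLE quote (seq INTEGER PRIMARY KEY NOT NULL, "
    "at DATETIME, USD REAL, EUR REAL)"
)

reader = csv.reader(io.StringIO(SAMPLE_CSV))
next(reader)  # skip the header line, like `tail -n +2`
with conn:
    conn.executemany("INSERT INTO quote (at, USD, EUR) VALUES (?, ?, ?)", reader)


def melt(conn, date_format):
    """Reshape wide quote rows into long (date, attr_num, value) records.

    Assumption: attributes are numbered from 1 in column order.
    """
    cur = conn.execute("SELECT at, USD, EUR FROM quote ORDER BY seq")
    value_columns = [d[0] for d in cur.description][1:]
    long_rows = []
    for at, *values in cur:
        day = datetime.strptime(at, date_format).date()
        for attr_num, value in enumerate(values, start=1):
            long_rows.append((day, attr_num, value))
    return value_columns, long_rows


columns, long_rows = melt(conn, "%Y/%m/%d")
print(columns)
for record in long_rows:
    print(record)
```

The long format has one row per date and attribute, so a plotting library can color the points by attribute number.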
Plot Examples using Relational Database
[Figures: Fukuoka City map data; digital elevation map data]
Three-dimensional Plot Examples using Relational Database
[Figures: point cloud data; polygon data]
Summary
• Relational databases are easy to use
  • Describing data formats and constraints is easy
  • Database browsers (such as SQLiteman) are available
• Relational databases can handle various types of data
  • Spatial
  • Temporal
• Many types of data analysis methods already exist