360 likes | 484 Views
Database Management Systems. Chapter 11 Distributed Databases. Definition Advantages / Uses Problems / Complications Client-Server / SQL Server Microsoft Access. Distributed Databases. SELECT Sales FROM Britain.Sales UNION SELECT Sales FROM France.Sales UNION SELECT Sales
E N D
Database Management Systems Chapter 11 Distributed Databases
Definition Advantages / Uses Problems / Complications Client-Server / SQL Server Microsoft Access Distributed Databases SELECT Sales FROM Britain.Sales UNION SELECT Sales FROM France.Sales UNION SELECT Sales FROM Italy.Sales Germany Britain France Italy
Multiple independent databases Each DBMS is a complete DBMS (engine, queries, locking, transactinos, etc.) Usually on different machines. Usually in different locations. Connected by a network. Might be different environments Hardware Operating System DBMS Software Distributed Database Definition Database Apollo Database Zeus England France Database Athena United States
C.J. Date Rule 0: Transparency: the user should not know or care that the database is distributed. Local autonomy. No reliance on a central site. Continuous operation. Location independence. Fragmentation independence (physical storage). Replication independence. Distributed query processing. Distributed transaction management. Hardware independence. Operating system independence. Network independence. DBMS independence. Distributed Database Rules
Each database can continue to run even if portion fails. Data and hardware can be moved without affecting operations or users. Expanding operations. Performance issues. System expansion and upgrades. Add new section without affecting others. Upgrade hardware, network and DBMS. Distributed Features
Business operations are often distributed Work and data are segmented by department. Work and data are segmented by geographical location. Improved performance Most updates and queries are performed locally. Maintain local control and responsibility over data. Can still combine data across the system. Scalability and expansion Add on, not replacement. Advantages and Applications local transactions future expansion
Creating a Distributed Database • Design administration plan. • Choose hardware and DBMS vendor, and network. • Set up network and DBMS connections. • Choose locations for data. • Choose replication strategy. • Create backup plan and strategy. • Create local views and synonyms. • Perform stress test: loads and failures.
Networks are slow Drives: 10 - 20 MB per sec. LANs: 1 - 10 MB per sec. WANs: 0.01 - 5 MB per sec. Faster is possible but expensive! Goal is to minimize transmissions. Each system must be capable of evaluating queries--preferably SQL. Results depend heavily on how the system joins tables. Distributed Query Processing WAN 0.1 - 5 MB 1 - 10 MB LAN 10 - 20 MB Disk drive
Example NY: Customers: 1 M rows LA: Production: 10 M rows Chicago: Sales: 20 M rows Query: List customers who bought blue products on 1-Mar-01 Bad idea #1 Transfer all rows to Chicago Then JOIN and select. Better idea #2 (probably) Transfer blue products from LA to Chicago Better idea #3 Get sale items on 1-Mar-01 Get blue products from LA Send C# to NY Distributed Query Processing NY Customers(C#, …) 1,000,000 C# list from desired P# Chicago Matching Customer data Sales(S#, C#, Sdate) 20,000,000 SaleItem(S#, P#,…) 50,000,000 P# sold on 1-Mar-01 Blue P# sold on 1-Mar-01 LA Products(P#, Color…) 10,000,000
Goals Minimize transmissions Improve performance Support heavy multiuser access. Problems Updating copies Bulk transmissions Site unavailable Concurrency Easier for two people to change the same data at the same time. Decision support systems. Data warehouse. Data Replication Market research & data corrections. Britain Britain: Customers & Sales France: Customers & Sales Spain: Customers & Sales Periodic updates Spain Britain: Customers & Sales France: Customers & Sales Spain: Customers & Sales Update data.
Each DBMS must maintain lock facility. To update, each DBMS must utilize and recognize other lock mechanisms and return codes. Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases. Random wait. Optimistic updates. Two-phase commit. Concurrency and Locks DBMS #1 Accounts Jones 8898 Transaction A Locked Waiting Transaction B Waiting Locked DBMS #2 Accounts Jones 3561
Two (or more) separate lock managers. DBMS initiating update serves as the coordinator. Two phases Coordinator sends message and data to all machines to “get ready.” Local machines save data in logs, verify update status and return message. If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message. Transactions & Two-Phase Commit Database 1 Initiate Transaction 1. Prepare to commit. All agree? 2. Commit Database 2 Lock tables. Save log. Database 3 Update all tables.
Database Links Full database names. CONNECT command. Linking through synonyms. CREATE SYNONYM … Central control over permissions. Linking through Views/queries. CREATE VIEW AS … Can assign local permissions. Linking through stored procedures. DELETE … Strong control over actions. Distributed Databases In Oracle Schema.Table@Location Scott.Emp@hq.acme.com Server database Synonym: Employee Procedure: DELETE FROM Employee WHERE ... View User can only run procedure. No other access. user permissions
Client-Server Server Server Shared Database Front-end User Interface Clients Clients
Not a distributed database. Data file stored on server. Server is passive, appears as giant disk drive to PC. PC processes all data. Retrieves all needed data across the network. Performance improvements. Indexes are crucial. Store some data on each PC (replication). Store applications on PC (graphics & forms). Convert to SQL-Server LAN File Server File Server DBMS data file Application Shared Data All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first. SELECT Name, SaleDate FROM Customer INNER JOIN Sales ON Customer.C# = Sales.C# WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;
LAN File Server: Slow File Server MyFile.mdb Forms CustID Name … 115 Jenkins … 125 Juarez ... Order ... Application and query transferred. DBMS software transferred. One row at a time transferred, until all rows are examined. SELECT * FROM Customer WHERE City = “Sandy”
One machine machine is dominant (server) and handles data for many clients. Client machines handle front-end tasks and small data tables that are not shared. Client-Server Databases File Server DBMS SQL Server Shared Data Return matching data. Send SQL statement. SELECT . . . application
Access is a client-side database. In file server environment. As a client to a database server. Attach or Link to other databases. Link to an Access database is file server. Link through ODBC is database server. ODBC: Open DataBase Connectivity Pass-Through Queries Microsoft Access
Microsoft connectivity standard. Most DBMS companies provide drivers. SQL Server, Oracle, Ingres, etc. Driver is installed on PC and Server. ODBC handles: Login to database. Send query. Interpret result codes. Exchange data. Open Database Connectivity: ODBC Server Computer Database Server ODBC driver SELECT … Results Client Computer ODBC driver Link Tables Access
Set the connection string. Datatype: odbc DSN: data source name holds linkage data--built with Windows Control Panel or RegisterDatabase. uid: login Username pwd: login Password database: full database name on remote system. ODBC Basics
ADO and Direct Connections Server Computer Database Server The Database vendor provides its own data transport (e.g,. Oracle or SQL Server) installed on the server and the client. ADO provides a driver that connects your application to the transport services. ODBC can serve as the data transport if nothing else is available DBMS transport SELECT … Results Client Computer DBMS transport ADO Visual Basic application
Server Databases Client front-end Middle Locate databases Business rules Program code Three-Tier Client-Server Database Servers Databases. Transactions. Legacy applications. Database links. Business rules. Program code. Middleware Application. Front-end. User Interface. Client
The Internet as Client-Server information Internet Router Router Server request Client Browser Web Server http://server.location/page HTML pages Forms Graphics
HTML Limited Clients <HTML> <HEAD> <TITLE>My main page</TITLE></HEAD> <BODY BACKGROUND=“graphics/back0.jpg”> <P>My text goes in paragraphs.</P> <P>Additional tags set <B>boldface</B> and <I>Italic</I>. <P>Tables are more complicated and use a set of tags for rows and columns.</P> <TABLE BORDER=1> <TR><TD>First cell</TD><TD>Second cell</TD></TR> <TR><TD>Next row</TD><TD>Second column</TD></TR> </TABLE> <P>There are form tags to create input forms for collecting data. But you need CGI program code to convert and use the input data.</P> </BODY> </HTML>
Web Server Database Fundamentals Request Server/Form.html 0 3 Client/Browser Database Page = Template + Result Data 2 1 2 3 CGI String SQL Server HTML Form Action= “Query.asp” Result 1 Query Form Web Server (IIS) Result Page 1 2 HTML form Query Template + Code Form.html Query.asp
Database Example: Client Side Request Server/Form.html 0 Server Initial form 1 Results 3 Call ASP page 2
Form Code 1 <HTML><HEAD><TITLE>Catalog Search</TITLE></HEAD> <BODY TEXT="#000000" BGCOLOR="#ffffff"> <FORM ACTION="PetStoreSearch.asp" METHOD="Put"> <P> <SELECT NAME="Category"> <OPTION SELECTED VALUE="Cat">Cat <OPTION VALUE="Dog">Dog <OPTION VALUE="Fish">Fish </SELECT> Category</P> <P> <INPUT TYPE="TEXT" NAME="Color" VALUE="Black"> Color</P> <P> <INPUT TYPE="SUBMIT" VALUE="Search" NAME="Submit"></P> </FORM></BODY> </HTML>
ASP Code Structure Connect to the Database Create the SQL Connect to the Database/Recordset Loop through the Recordset Get a Field Display it Move to the next row End Loop 2 Note: Access will create the basic structure. Export a query as asp. Then edit the file. Set objConn = CreateObject("ADODB.Connection") Set rst = CreateObject("ADODB.Recordset") sql = " SELECT …" rst.Open sql … rst.MoveFirst do while Not rs.eof Response.Write Server.HTMLEncode(rst(”LastName")) rst.MoveNext Loop
ASP Query: Build SQL <HTML><HEAD><TITLE>Animal Search Results</TITLE></HEAD> <BODY> <% Set objConn = Server.CreateObject("ADODB.Connection") objConn.open "PetStore","","" sql = "SELECT AnimalID, Name, Category, Breed, DateBorn, Color, ListPrice " sql = sql & "FROM Animal WHERE (Category = '" sql = sql & Request.Form("Category") & "’) AND (Color LIKE '*" sql = sql & Request.Form("Color") & "*’)" Set rst = Server.CreateObject("ADODB.Recordset") rst.Open sql, objConn, 3, 3 %>
ASP Query: Create Table <TABLE BORDER=1><CAPTION><B>ASPSearch1</B></CAPTION> <THEAD><TR> <TH>AnimalID</TH> <TH>Name</TH> <TH>Category</TH> <TH>Breed</TH> <TH>DateBorn</TH> <TH>Color</TH> <TH>ListPrice</TH> </TR></THEAD><TBODY> <% On Error Resume Next rst.MoveFirst do while Not rst.eof %> <TR VALIGN=TOP> <TD><%=Server.HTMLEncode(rst("AnimalID"))%> <BR></TD> <TD><%=Server.HTMLEncode(rst("Name"))%> <BR></TD> <TD><%=Server.HTMLEncode(rst("Category"))%> <BR></TD> <TD><%=Server.HTMLEncode(rst("Breed"))%> <BR></TD> <TD><%=Server.HTMLEncode(rst("DateBorn"))%> <BR></TD> <TD><%=Server.HTMLEncode(rst("Color"))%> <BR></TD> <TD><%=Server.HTMLEncode(rst("ListPrice"))%> <BR></TD> </TR> <% rst.MoveNext loop %> </TBODY></TABLE></BODY></HTML>
Client-Server Data Transfer Order Form Order ID 1015 Jones, Martha Customer Order Date 12-Aug What if there are 10,000 customers? How much time to load the combo box? How do you refresh/reload the combo box? Alternatives?
<!ELEMENT OrderList (Order+)> <!ELEMENT Order (OrderID,OrderDate,Comment?,ShippingCost,Items+)> <!ELEMENT OrderID (#PCDATA)> <!ELEMENT OrderDate (#PCDATA)> <!ELEMENT Comment (#PCDATA)> <!ELEMENT ShippingCost (#PCDATA)> <!ELEMENT Items (Item+)> <!ELEMENT Item (ItemID,Description,Quantity,Price)> <!ELEMENT ItemID (#PCDATA)> <!ELEMENT Description (#PCDATA)> <!ELEMENT Quantity (#PCDATA)> <!ELEMENT Cost (#PCDATA)> XML: Transferring Data DTD: Document Type Definition is hierarchical. Order: OrderID, OrderDate Item: ItemID, Quantity, Cost + Repeats 1 or more ? Optional * Repeats 0 or more #PCDATA: parsed character data Item: ItemID, Quantity, Cost Item: ItemID, Quantity, Cost
<?xml version="1.0"?> <!DOCTYPE OrderList SYSTEM "orderlist.dtd"> <OrderList> <Order> <OrderID>1</OrderID> <OrderDate>3/6/2001</OrderDate> <ShippingCost>$33.54</ShippingCost> <Comment>Need immediately.</Comment> <Items> <ItemID>30</ItemID> <Description>Flea Collar-Dog-Medium</Description> <Quantity>208</Quantity> <Cost>$4.42</Cost> <ItemID>27</ItemID> <Description>Aquarium Filter & Pump</Description> <Quantity>8</Quantity> <Cost>$24.65</Cost> </Items> </Order> </OrderList> XML Data Example XML: extensible markup language