200 likes | 360 Views
Management of External Data. SQL/MED and More. Database Seminar HS11/12. Overview. Introduction SQL/MED Linking PostgreSQL & MSSQL Further Information (about SQL/MED) Conclusion. Introduction (1/2). Different Database Managemenent Systems Each system has different benefits
E N D
Management ofExternal Data SQL/MED and More Database Seminar HS11/12
Overview • Introduction • SQL/MED • Linking PostgreSQL & MSSQL • Further Information (about SQL/MED) • Conclusion
Introduction (1/2) • Different Database Managemenent Systems • Each system has different benefits • Possible scenarios • … • …
Introduction (2/2) • SQL/MED gives new opportunities • Use other systems as needed • That’s possible? Really?
SQL/MED (1/3) • SQL/MED defined in ISO/IEC 9075-9:2003 • Management of External Data • Two concepts • Foreign Data Wrappers • Datalinks • At least 10 years old • Not very widespread • Most “googled” information is PostgreSQL related
SQL/MED (2/3) Foreign Data Wrappers AdvanceInitRequest AllocDescriptor AllocQueryContext AllocWrapperEnv Close ConnectServer FreeDescriptor FreeExecutionHandle FreeFSConnection FreeQueryContext FreeReplyHandle FreeWrapperEnv GetAuthorizationId GetBoolVE GetDescriptor GetDiagnostics GetDistinct GetNextReply GetNumBoolVE GetNumChildren GetNumOrderByElems GetNumReplyBoolVE GetNumReplyOrderBy GetNumReplySelectElems GetNumReplyTableRefs GetNumRoutMapOpts GetNumSelectElems GetNumServerOpts GetNumTableColOpts GetNumTableOpts GetNumTableRefElems GetNumUserOpts GetNumWrapperOpts GetOpts GetOrderByElem GetReplyBoolVE GetReplyCardinality GetReplyDistinct GetReplyExecCost GetReplyFirstCost GetReplyOrderElem GetReplyReExecCost GetReplySelectElem GetReplyTableRef GetRoutineMapping GetRoutMapOpt • Access external data • FDW is a library • Programming language neutral • Compile for different OS’s • Good idea – breakthrough? • API • Existing technologies GetRoutMapOptName GetSelectElem GetSelectElemType GetServerName GetServerOpt GetServerOptByName GetServerType GetServerVersion GetSPDHandle GetSQLString GetSRDHandle GetStatistics GetTableColOpt GetTableColOptByName GetTableOpt GetTableOptByName GetTableRefElem GetTableRefElemType GetTableRefTableName GetTableServerName GetTRDHandle GetUserOpt GetUserOptByName GetValExprColName GetValueExpDesc GetValueExpKind GetValueExpName GetValueExpTable GetVEChild GetWPDHandle GetWrapperLibraryName GetWrapperName GetWrapperOpt GetWrapperOptByName GetWRDHandle InitRequest Iterate Open ReOpen SetDescriptor TransmitRequest
SQL/MED (3/3) Data links • Link files like cell values • DBMS becomes “manager” • Only process allowed to change the file • Integrity mechanism • Good idea – breakthrough? • Very OS heavy • Existing technologies
Linking PostgreSQL & MSSQL (1/4) • Microsoft Linked Servers • SQL/MED: Foreign Data Wrappers
Linking PostgreSQL & MSSQL (2/4)Microsoft Linked Servers • OLE DB • Very similar to Foreign Data Wrappers • Connection to "wrappers" via interface • Related to ODBC • Not limited to SQL • C++ instead of C • Widespread • Many OLE DB providers available • Supports ODBC
Linking PostgreSQL & MSSQL (3/4)PostgreSQL Foreign Data Wrappers (1/2) • Using the OBDC_FDW extension • One time • Each time odbc_fdw.so CREATE FOREIGN DATA WRAPPER odbc_fdwLIBRARY 'odbc_fdw.so‘; CREATE EXTENSION odbc_fdw; CREATE SERVER odbc_serverFOREIGN DATA WRAPPER odbc_fdw OPTIONS (dsn‘…DSN…'); passed to FDW CREATE FOREIGN TABLE odbc_table( db_id integer, db_namevarchar(255) ) SERVER odbc_server OPTIONS (… sql_query'select id, name from `dbo`.`table`' …); SELECT passed to FDW
Linking PostgreSQL & MSSQL (4/4)PostgreSQL Foreign Data Wrappers (2/2) • PostgreSQL proprietary API for FDWs • ‘C’ Code • Method pointer in header odbc_fdw_handler(PG_FUNCTION_ARGS) { FdwRoutine *fdwroutine = makeNode(FdwRoutine); fdwroutine->PlanForeignScan = odbcPlanForeignScan; fdwroutine->ExplainForeignScan = odbcExplainForeignScan; fdwroutine->BeginForeignScan = odbcBeginForeignScan; fdwroutine->IterateForeignScan = odbcIterateForeignScan; fdwroutine->ReScanForeignScan = odbcReScanForeignScan; fdwroutine->EndForeignScan = odbcEndForeignScan; PG_RETURN_POINTER(fdwroutine); }
Further Information about SQL/MED (1/4) • Query costs • Interesting applications
Further Information about SQL/MEDQuery costs (1/2) • Consider the following tables • Row count of a JOIN statement (all employees) • Best case 500 rows • Worst case 100’000 rows • Best execution strategy • External system performs JOIN? • Perform JOIN locally? 200 500
Further Information about SQL/MEDQuery costs (2/2) SQL Server • Costs • 1.0 $ per transferred row • 0.1 $ per local join operation • Strategy #A • SELECT * FROM Employee JOIN City • Strategy #B • SELECT * FROM Employee • SELECT * FROM City • Clear win for #A • Important to implement PlanForeignScan PlanForeignScan FDW External System • 100’000$ Worst Case Scenario • Local JOIN 500x200 = 10’000$ • 500$ • 200$
Further Information about SQL/MEDInterestingapplications • Extension www_fdw to query all Restful Webservices CREATE SERVER google_server FOREIGN DATA WRAPPER www_fdw OPTIONS (uri 'https://ajax.googleapis.com/search/web?v=1.0'); CREATE FOREIGN TABLE google_table( titletext, snippettext, linktext, q text) SERVER google_server; Field legend • Response • Request select * from google_tablewhere q =’cat dog’ limit 1; title | snippet | link -------------------+----------------------------------------------------- CatDog – Wikipedia | CatDog is an American... | http://en.wikipedia...
Conclusion • Great concepts • FDW: Accessing external data via standard interfaces • Datalink: Create secure links fromtuplestofiles • Drawbackswhichpreventthebreakthrough • Fartoocomplex API • Existingtechnologies(Microsoft, Oracle) • Documentation • Do wereallyneedit? • Most environmentsarebased on 1 servertechnology • Usebuilt-in "MED" (Linked Servers, DBLink) • Other waystosolveproblems • Manyyearstostablerelease
Outlook • Relies on community • Stable wrappers needed • Other DBMS needto push it • Uncertainfuture
The End • Questions?