160 likes | 417 Views
Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers. Pat Patterson Principal Developer Evangelist ppatterson@salesforce.com @ metadaddy. Agenda. Foreign Data Wrappers Writing FDW’s in C Multicorn Database.com FDW for PostgreSQL FDW in action.
E N D
Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers Pat Patterson Principal Developer Evangelist ppatterson@salesforce.com @metadaddy
Agenda • Foreign Data Wrappers • Writing FDW’s in C • Multicorn • Database.com FDW for PostgreSQL • FDW in action
Why Foreign Data Wrappers? • External data sources look like local tables! • Other SQL database • MySQL, Oracle, SQL Server, etc • NoSQLdatabase • CouchDB, Redis, etc • File • LDAP • Web services • Twitter!
Why Foreign Data Wrappers? • Make the database do the work • SELECT syntax • DISTINCT, ORDER BY etc • Functions • COUNT(), MIN(), MAX() etc • JOIN external data to internal tables • Use standard apps, libraries for data analysis, reporting
Foreign Data Wrappers • 2003 - SQL Management of External Data (SQL/MED) • 2011 – PostgreSQL 9.1 implementation • Read-only • SELECT-clause optimization • WHERE-clause push-down • Minimize data requested from external source • Future Improvements • JOIN push-down • Where two foreign tables are in the same server • Support cursors
FDW’s in PostgreSQL • ‘Compiled language’ (C) interface • Implement a set of callbacks typedefstructFdwRoutine{NodeTagtype; /* These functions are required. */GetForeignRelSize_functionGetForeignRelSize;GetForeignPaths_functionGetForeignPaths;GetForeignPlan_functionGetForeignPlan;ExplainForeignScan_functionExplainForeignScan;BeginForeignScan_functionBeginForeignScan;IterateForeignScan_functionIterateForeignScan;ReScanForeignScan_functionReScanForeignScan;EndForeignScan_functionEndForeignScan; /* These functions are optional. */AnalyzeForeignTable_functionAnalyzeForeignTable;} FdwRoutine;
FDW’s in PostgreSQL • Much work! • CouchDBFDW • https://github.com/ZhengYang/couchdb_fdw/ • couchdb_fdw.c > 1700 LoC
Multicorn • http://multicorn.org/ • PostgreSQL 9.1+ extension • Python framework for FDW’s • Implement two methods…
Multicorn from multicorn import ForeignDataWrapper class ConstantForeignDataWrapper(ForeignDataWrapper): def __init__(self, options, columns): super(ConstantForeignDataWrapper, self).__init__(options, columns) self.columns = columns def execute(self, quals, columns): for index in range(20): line = {} for column_name in self.columns: line[column_name] = '%s %s' % (column_name, index) yield line
Database.com FDW for PostgreSQL • OAuth login to Database.com / Force.com • Refresh on token expiry • Force.com REST API • SOQL query • SELECT firstname, lastname FROM Contact • Request thread puts records in Queue, execute() method gets them from Queue • JSON parsing – skip embedded metadat • < 250 lines code
Conclusion • Foreign Data Wrappers make the whole world look like tables! • Writing FDW’s in C is hard! • Or, at least, time consuming! • Writing FDW’s in Python via Multicorn is easy! • Or, at least, quick! • Try it for yourself!
Resources • http://wiki.postgresql.org/wiki/SQL/MED • http://wiki.postgresql.org/wiki/Foreign_data_wrappers • http://multicorn.org/ • https://github.com/metadaddy-sfdc/Database.com-FDW-for-PostgreSQL