100 likes | 195 Views
SQL vs GPU. Richard Wilton Johns Hopkins University Department of Physics and Astronomy. Rationale. Terabyte-scale data is best managed in a DBMS Some classes of computation are best implemented using a GPU. How to do GPU computation in a terabyte-scale database.
E N D
SQL vs GPU Richard Wilton Johns Hopkins University Department of Physics and Astronomy
Rationale • Terabyte-scale data is best managed in a DBMS • Some classes of computation are best implemented using a GPU
How to do GPU computation in a terabyte-scale database • Do the computation “outside” the database (export the data / compute / re-import the data) • Do the computation “inside” the database • Something in between
SQLCLR out-of-process server The basic concept • Implement computational functionality in a separate process from the SQL Server process • Access that functionality using interprocess communication (IPC) SQL Server Out-of-process server IPC SQL code Special-case functionality SQLCLR procedure or function Why? • Avoid memory, threading, and permissions restrictions on SQLCLR implementations • Load dynamic-link libraries • Invoke native-code methods • Exploit lower-level APIs (e.g. SqlClient, bulk insert) to facilitate data movement between SQL Server and non-SQL computational resources
SQLCLR out-of-process server implementation SQL Server Out-of-process server IPC SQL code CUDA functionality SQLCLR procedure SQL declare @sql nvarchar(max) set @sql = N'exec SqA1.dbo.SWGPerf @qOffset=311, @chrNum=7, @dimTile=128' exec dbo.ExecSWG @sqlCmd=@sql, @targetTable='##tmpX08' SQLCLR implementation (in C#) • Allocates IPC buffer • Initializes IPC buffer with the result set from a specified SQL query • Signals out-of-process server • Waits for response from out-of-process server • Returns data from the IPC buffer as a SQL result set Out-of-process server implementation (in C#) • Loads the CUDA implementation (dynamic link library) • Invokes a method in the DLL with a reference to the IPC buffer • Waits for completion • Signals the SQLCLR implementation • Stand-alone application or Windows service CUDA implementation (in C++) • Implements GPU functionality • Uses the IPC buffer for both incoming and outgoing data
A test case: genomic sequence data • Nucleic acid sequence data • Data is processed in a “loose” workflow by a sequence of software tools • Example: • Align short sequences to a long reference sequence • Identify alignment regions of interest • Map regions of interest to annotation data (genome map, known variations, etc.)
A test case: genomic sequence data • Typical data-management scenario: • All data is maintained in file-system files • Processing software reads and writes files • “Database-like” operations (e.g. joins) are performed using command-line tools (e.g. Perl scripts) • Can we use a DBMS as a repository? • All data is maintained in the DBMS • Processing software reads and writes tables • “Database-like” operations (e.g. joins) are performed using SQL
A test case: genomic sequence data • Nucleic acid sequence data • DBMS: data repository • GPU: implementation of sequence alignment algorithms • Gapped alignment: Smith Waterman • Non-gapped alignment: bit vector mapping through a hash table
A test case: genomic sequence data • Results: so far, so good … • Sequence data can be managed using a DBMS • Use low-level (C#, C++) implementations to manipulate string and binary data • Sequence data from the DBMS can be processed using GPU implementations • GPU sequence alignment implementations can exploit task parallelism and execute 10x (or more) faster than equivalent CPU implementations • But … • DBMS data abstractions (tables, columns) must be mapped to and from native GPU data formats and layouts • Data movement out of and into the DBMS is slow
Going forward… • Compromise: maintain GPU-formatted data as BLOBs in the file system • BLOBs are in database “native” format • BLOBs correspond to database tables or table partitions • Much programming and testing remain to be done…