SPLITD
Tom Madden, NCBI
Problems
• Biological databases are growing faster than computer memory.
• The fastest CPUs get put into 1- or 2-CPU machines first.
• Most OSes work better with fewer CPUs in the box.
Splitd solution
• Partition the search so that it is spread over multiple machines.
• Break up the database “virtually” so the number of chunks can be adjusted on the fly depending upon load, query, etc.
• HSPs (start/stop/score) of alignments are calculated by the backends.
• Tracebacks are calculated after merging.
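The partition-and-merge idea can be sketched in a few lines of Python. This is an illustration only, not the Splitd code: it assumes a "virtual" chunk is a contiguous range of sequence ordinal ids, and the names `HSP`, `virtual_chunks`, and `merge_hsps` are hypothetical. The point is that the chunk count is just a parameter, and that merging needs only start/stop/score, so tracebacks can wait until after the merge.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HSP:
    """Hypothetical record for one hit: only start/stop/score, no traceback."""
    seq_id: int  # ordinal id of the database sequence (assumed addressing scheme)
    start: int
    stop: int
    score: int


def virtual_chunks(num_seqs: int, num_chunks: int) -> List[Tuple[int, int]]:
    """Split ordinal ids [0, num_seqs) into contiguous half-open ranges.

    No data is copied; changing num_chunks per search changes the split.
    """
    base, extra = divmod(num_seqs, num_chunks)
    chunks = []
    lo = 0
    for i in range(num_chunks):
        hi = lo + base + (1 if i < extra else 0)
        if hi > lo:  # skip empty ranges when there are more chunks than sequences
            chunks.append((lo, hi))
        lo = hi
    return chunks


def merge_hsps(per_chunk: List[List[HSP]], keep: int) -> List[HSP]:
    """Combine the backends' HSP lists and keep the `keep` best by score."""
    all_hsps = [h for chunk in per_chunk for h in chunk]
    return heapq.nlargest(keep, all_hsps, key=lambda h: h.score)
```

For example, `virtual_chunks(10, 3)` gives three ranges of 4, 3, and 3 sequences. Only the HSPs that survive `merge_hsps` would then need a traceback.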
Also…
• Use MSSQL to store queries and results rather than a home-grown system.
• Concatenate queries from different users à la megablast to minimize time spent scanning the database.
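Query concatenation can be illustrated with a small Python sketch. It is a simplified assumption of how such batching might work, not the megablast or Splitd implementation; `concatenate` and `locate` are hypothetical names. Several users' queries are joined into one sequence so the database is scanned once, and each hit position is mapped back to the query it came from.

```python
from bisect import bisect_right
from typing import List, Tuple


def concatenate(queries: List[Tuple[str, str]]) -> Tuple[str, List[int], List[str]]:
    """Join (query_id, sequence) pairs into one sequence.

    Returns the joined sequence, the start offset of each query, and the ids.
    """
    parts, offsets, ids = [], [], []
    pos = 0
    for qid, seq in queries:
        offsets.append(pos)
        ids.append(qid)
        parts.append(seq)
        pos += len(seq)
    return "".join(parts), offsets, ids


def locate(pos: int, offsets: List[int], ids: List[str]) -> Tuple[str, int]:
    """Map a position in the joined sequence back to (query_id, local position)."""
    i = bisect_right(offsets, pos) - 1  # last query whose start offset is <= pos
    return ids[i], pos - offsets[i]
```

With queries `u1 = ACGT`, `u2 = GG`, and `u3 = TTA`, position 5 of the joined sequence maps back to position 1 of `u2`. A real system would also have to discard hits that span the boundary between two queries; this sketch leaves that out.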
[Architecture diagram, shown over four animation frames: Browser, Blast.cgi, MSSQL12, MSSQL20, one SD, three PD, three BD, two MD, and two FR. The final frame adds the labels "hsp" and "traceback".]
SD = SplitDaemon
PD = PartsDaemon
BD = blastsrvd + blastsrv4 + nabrd
MD = MergeDaemon
FR = Formatter