Clusterpoint

Clusterpoint Margarita Sudņika ms11077

RDBMS & NoSQL
• Databases & tables → Document stores
• Columns, rows → Schemaless documents
• Scales UP → Scales UP & OUT
• Replication → Sharding & replication
• For table-like data → Unstructured data
• Legacy & mature → New

Clusterpoint
A scalable, high-speed NoSQL database technology with Google-like search
Manual ranking (weight assignment)
Solves 2 big data access problems:
• Long waits for query execution
  • Query execution takes 0.005-0.5 seconds
• Loads of information

Application Architecture
[Diagram: HTTP access to the Clusterpoint server software, deployed as a STAND-ALONE SERVER or as CLUSTER NODES (multi-server hardware); stored data: DOCUMENTS, CUSTOMERS, CONTACTS, PROJECTS, MAILS, EMPLOYEES]

Clusterpoint
Data storage model
• XML
Supported formats
• JSON
• XML
• HTML
• Text

Features
• Full context search
• Unlimited database size
• Guaranteed query response time < 0.5 s
• Clustering as a default feature
• Scalable database mirroring
• Snippets with search hits
• Web-friendly API
• Flexible data relevancy rules

Access

Search
• Free text
• Phrase
• Wildcards
• Pattern matches by lookup
  • John Smith
• Within the XML database structure
• "Did you mean ...?" feature
• Faceted search and navigation
• Full data index for XML data

API
• Simple, robust XML messaging
• XML request/response, similar to SOAP
• Transport
  • HTTP, HTTPS (POST, GET)
  • TCP
  • Unix domain socket
• More than 20 API commands
• Libraries: PHP, .NET (web service)
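To make the HTTP transport concrete, here is a minimal Python sketch that prepares (but does not send) a POST request carrying an XML message. The host, port, and URL path are assumptions for illustration only; the slides do not give an endpoint, and the content type is a guess at what a server of this kind would accept.

```python
import urllib.request


def prepare_post(url: str, xml_body: bytes) -> urllib.request.Request:
    """Build an HTTP POST request object carrying an XML message.

    Nothing is sent here; pass the result to urllib.request.urlopen()
    to actually perform the call.
    """
    return urllib.request.Request(
        url,
        data=xml_body,
        # Assumed content type; not confirmed by the slides.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical endpoint: localhost and the root path are placeholders.
req = prepare_post("http://localhost:5550/", b"<request/>")
```

Sending would then be a matter of calling urllib.request.urlopen(req) against a reachable server and reading the XML reply from the response body.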

API message
<?xml version="1.0" encoding="REQUEST-ENCODING"?>
<cpse:request xmlns:cpse="www.clusterpoint.com">
  <cpse:storage>storage name</cpse:storage>
  <cpse:command>command name</cpse:command>
  <cpse:timestamp>message date and time</cpse:timestamp>
  <cpse:requestid>message number</cpse:requestid>
  <cpse:application>creator of message</cpse:application>
  <cpse:user>user name</cpse:user>
  <cpse:password>user password</cpse:password>
  <cpse:reply_charset>reply encoding</cpse:reply_charset>
  <cpse:content> </cpse:content>
</cpse:request>

Lookup
<document>
  <id>document id</id>
</document>

Insert
<document>
  <id>document id</id>
  <title>document title</title>
  <rate>document rate</rate>
  <info>meta data</info>
  <site>document id</site>
  <text>textual information</text>
  <hidden>information that is not shown</hidden>
</document>

Search
<query>search query</query>
<docs>number of documents</docs>
<offset>offset from the beginning</offset>
<case_sensitive>boolean type parameter</case_sensitive>
<relevance>boolean type parameter</relevance>
<group_size>maximum from one group</group_size>
<rate_from>FROM value</rate_from>
<rate_to>TO value</rate_to>
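The message layout above can be assembled programmatically. The following Python sketch builds the request envelope and a search payload with the standard library. The element names and namespace URI come from the slide; the command name "search", the ISO 8601 timestamp format, the default application name, and the example credentials are assumptions, since the slide only shows placeholders.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

CPSE_NS = "www.clusterpoint.com"  # namespace URI as written on the slide
ET.register_namespace("cpse", CPSE_NS)


def _tag(name):
    """Return the namespace-qualified tag for an envelope element."""
    return f"{{{CPSE_NS}}}{name}"


def search_content(query, docs=10, offset=0):
    """Build the <query>, <docs>, <offset> elements of a search payload."""
    elements = []
    for name, value in (("query", query), ("docs", docs), ("offset", offset)):
        el = ET.Element(name)
        el.text = str(value)
        elements.append(el)
    return elements


def build_request(storage, command, user, password, content=(),
                  request_id="1", application="example-app"):
    """Serialize a cpse:request envelope to UTF-8 encoded XML bytes."""
    root = ET.Element(_tag("request"))
    fields = [
        ("storage", storage),
        ("command", command),
        # Timestamp format is an assumption; the slide does not specify one.
        ("timestamp", datetime.now(timezone.utc).isoformat()),
        ("requestid", request_id),
        ("application", application),
        ("user", user),
        ("password", password),
        ("reply_charset", "utf-8"),
    ]
    for name, value in fields:
        ET.SubElement(root, _tag(name)).text = value
    # The payload elements go inside <cpse:content>.
    ET.SubElement(root, _tag("content")).extend(content)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# "search" is a hypothetical command name used for illustration.
message = build_request("storage1", "search", "user", "secret",
                        search_content("John Smith", docs=5))
```

Using ElementTree rather than string concatenation keeps the query text properly escaped, which matters when user input contains characters such as < or &.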

Platform
• Runs on *nix (tested on Linux and FreeBSD)
• Written in C/C++
• Optimized for multi-core processors
• Source code is the IP of Clusterpoint, written from scratch
PORTS
• Data
  • TCP: 5550, 80
  • Unix domain sockets
• Cluster discovery
  • UDP: 234.25.25.25:5550
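The cluster discovery address falls in the IPv4 multicast range, so a process that wants to observe discovery traffic would need to join that group. The Python sketch below shows only the generic socket setup for that; the format of the discovery messages themselves is not described in the slides, and the function names are hypothetical.

```python
import socket
import struct

DISCOVERY_GROUP = "234.25.25.25"  # multicast group from the slide
DISCOVERY_PORT = 5550             # UDP port from the slide


def membership_request(group: str) -> bytes:
    """Pack an ip_mreq structure: group address plus the any-interface address."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))


def open_discovery_listener() -> socket.socket:
    """Open a UDP socket bound to the discovery port and join the group.

    Generic multicast setup only; it does not interpret any payload.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", DISCOVERY_PORT))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(DISCOVERY_GROUP))
    return sock
```

Because discovery relies on multicast, nodes would presumably need to sit on a network segment where multicast traffic for this group is delivered; that is an inference from the address, not a statement from the slides.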

Parameters
Disk space
• 1.5-2 times the size of the stored data
• 100 GB of data = 150-200 GB of disk
• This amount does not include space for log files, since they can be rotated and backed up
• During file load and indexing, usage can grow 3-4 times and then return to normal size
RAM
• More RAM means more cached data and better performance
• Usually recommended: > 4 GB
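The disk sizing rule can be turned into a quick estimate. This Python sketch applies the 1.5-2x steady-state factor and the 3-4x load/indexing peak from the slide. It assumes both factors are relative to the raw data size, which the slide does not state explicitly, and it ignores log files as the slide does.

```python
def estimate_disk_gb(data_gb: float) -> dict:
    """Estimate disk needs in GB from raw data size, excluding log files.

    Assumption: both the steady-state and the peak factors are multiples
    of the raw data size.
    """
    return {
        "steady_min": data_gb * 1.5,  # normal operation, lower bound
        "steady_max": data_gb * 2.0,  # normal operation, upper bound
        "peak_min": data_gb * 3.0,    # during load and indexing, lower bound
        "peak_max": data_gb * 4.0,    # during load and indexing, upper bound
    }


estimate = estimate_disk_gb(100)
```

For the 100 GB example on the slide this gives 150-200 GB in normal operation, and under the stated assumption up to 400 GB of temporary headroom while loading and indexing.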

Use
• Complementary: solving performance issues and bottlenecks of existing database systems
• Standalone: the application is implemented using the Clusterpoint DBMS
[Diagrams: USERS, APP server, Clusterpoint XML DBMS, SQL, XML]

Thank you

