NoSQL Databases: Oracle Berkeley DB

Content
• A brief intro to NoSQL
• About Berkeley DB
• About our application

What is NoSQL?
• Stands for Not Only SQL
• A class of non-relational data storage systems
• Usually do not require a fixed table schema and do not use joins, GROUP BY, ORDER BY and similar relational operations
• Many NoSQL offerings relax one or more of the ACID properties

What is NoSQL?
• Next-generation databases
• Characteristics:
  • Large data volumes
  • Non-relational
  • Distributed
  • Open-source
  • Scalable replication and distribution

History of NoSQL
• The term NoSQL was introduced by Carlo Strozzi in 1998 as the name of his file-based database.
• It was reintroduced in 2009 by Eric Evans, when an event was organized to discuss open-source distributed databases.

Why NoSQL Databases?
• Very large data volumes
• Massive write performance
• Fast key-value access
• Flexible schema and flexible data types
• No single point of failure
• Programming ease of use

Berkeley DB - Introduction
• An open-source, embedded, transactional data management system
• A key/value store
• Runs on everything from cell phones to large servers
• Distributed as a library that can be linked directly into an application
• Offers high reliability and high performance

Berkeley DB: The Design Philosophy
• Provide mechanisms without specifying policies
• For example, Berkeley DB is abstracted as a store of <key, value> pairs
• Both keys and values are opaque byte strings
• Berkeley DB has no schema
• The application that embeds Berkeley DB is responsible for imposing its own schema on the data
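
To make "opaque byte strings" concrete, here is a minimal sketch of an application imposing its own schema on keys and values. It uses only the Java standard library and makes no Berkeley DB calls. The `UserRecordCodec` class and its record layout (an 8-byte user id as the key; a 4-byte age followed by a UTF-8 name as the value) are assumptions made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical application-side schema (not part of Berkeley DB):
//   key   = 8-byte big-endian long (user id)
//   value = 4-byte int (age) followed by the UTF-8 bytes of the name
class UserRecordCodec {
    static byte[] encodeKey(long userId) {
        return ByteBuffer.allocate(Long.BYTES).putLong(userId).array();
    }

    static long decodeKey(byte[] key) {
        return ByteBuffer.wrap(key).getLong();
    }

    static byte[] encodeValue(int age, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + nameBytes.length)
                .putInt(age)
                .put(nameBytes)
                .array();
    }

    static int decodeAge(byte[] value) {
        return ByteBuffer.wrap(value).getInt();
    }

    static String decodeName(byte[] value) {
        // Everything after the 4-byte age field is the name.
        return new String(value, Integer.BYTES, value.length - Integer.BYTES,
                StandardCharsets.UTF_8);
    }
}
```

The store would see only the two byte arrays. Only the application knows that the first four bytes of the value are an age.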

Data Access Services
• Indexing methods
  • B-Tree
  • Hash
  • Queue
  • A record-number-based index
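
The practical difference between an ordered index and a hash index can be sketched with in-memory Java collections. This is only an analogy: `TreeMap` is a red-black tree rather than a B-tree, and neither class is part of Berkeley DB. The `IndexChoiceDemo` class is a hypothetical name used for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class IndexChoiceDemo {
    // An ordered index keeps keys sorted, so it can return a key range directly.
    static SortedMap<Integer, String> rangeScan(TreeMap<Integer, String> index,
                                                int fromInclusive, int toExclusive) {
        return index.subMap(fromInclusive, toExclusive);
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ordered = new TreeMap<>();
        Map<Integer, String> hashed = new HashMap<>();
        for (int id = 1; id <= 5; id++) {
            ordered.put(id * 10, "record-" + id);
            hashed.put(id * 10, "record-" + id);
        }
        // Both structures answer exact-match lookups.
        System.out.println(hashed.get(30));
        System.out.println(ordered.get(30));
        // Only the ordered structure answers "all keys in [20, 40)" without a full scan.
        System.out.println(rangeScan(ordered, 20, 40).keySet());
    }
}
```

As a rule of thumb, an ordered index suits workloads with range queries or sorted traversal, while a hash index suits pure exact-match lookups.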

Advantages of <key, value> pairs
• An application is free to store data in whatever form is most natural to it
  • Objects (like structures in the C language)
  • Rows in Oracle, SQL Server
  • Columns in C-Store
• Different data formats can be stored in the same database

Data Management Services
• Concurrency
• Transactions
• Recovery

Berkeley DB Applications
• Lightweight Directory Access Protocol (LDAP) servers
• Mail servers
• Managing access control lists
• Storing user keys in a public-key infrastructure
• Recording machine-to-network-address mappings in address servers

Berkeley DB for Computationally Intensive Algorithms
• Algorithms that repeatedly execute a computationally intensive operation
  • E.g. factorial
• It is useful to create a cache containing the already computed results
  • Cache = set of <key, value> pairs of the form <n, factorial(n)>
• Advantages:
  • Avoids recomputing results for the same input (even across different executions)
  • After a process crash, the process can be restarted and quickly return to the point where it stopped
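
The caching idea can be sketched as follows. This is a minimal in-memory version using a `HashMap` as a stand-in for the <n, factorial(n)> store. It shows only the lookup-before-compute pattern, not persistence across executions. The `FactorialCache` class and its miss counter are assumptions for this illustration.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

class FactorialCache {
    // Stand-in for the <n, factorial(n)> store; a persistent store would survive restarts.
    private final Map<Integer, BigInteger> cache = new HashMap<>();
    private int computations = 0; // number of cache misses, for demonstration

    BigInteger factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        BigInteger cached = cache.get(n);
        if (cached != null) {
            return cached; // cache hit: no recomputation
        }
        computations++;
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        cache.put(n, result);
        return result;
    }

    int computations() {
        return computations;
    }
}
```

Calling `factorial(5)` twice on the same instance computes the product once; the second call is served from the map.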

• In-memory map
  • Simple
  • Very efficient (because the data resides entirely in memory)
  • Needs a considerable amount of memory
  • No fault tolerance (the data must be saved to a file manually)
• Relational databases
  • ACID properties may not be necessary
  • Poorly suited to big data
  • Slower for simple key-value access
• NoSQL databases (Berkeley DB)
  • Fast key-value access
  • Flexible schema and flexible data types
  • Ease of use
  • Fault tolerance
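
The "no fault tolerance" point for the in-memory map can be made concrete with a sketch of what saving the data manually involves. The `MapSnapshot` class below is a hypothetical helper that writes the whole map to a properties file and reads it back, using only the Java standard library. Anything computed after the last `save` call is lost if the process crashes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class MapSnapshot {
    // Writes every <n, factorial(n)> entry to a text file in one pass.
    static void save(Map<Integer, BigInteger> cache, Path file) {
        Properties props = new Properties();
        for (Map.Entry<Integer, BigInteger> e : cache.entrySet()) {
            props.setProperty(e.getKey().toString(), e.getValue().toString());
        }
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "factorial cache snapshot");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuilds the in-memory map from a previously saved file.
    static Map<Integer, BigInteger> load(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<Integer, BigInteger> cache = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            cache.put(Integer.valueOf(key), new BigInteger(props.getProperty(key)));
        }
        return cache;
    }
}
```

The application has to decide when to call `save`, and each call rewrites the entire file. A transactional store takes over that bookkeeping.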

Open Environment:
• The EnvironmentConfig class specifies environment configuration parameters
Open Class Catalog:
• Class catalog: a specialized database that contains the Java class descriptions of all serialized objects stored in the database
• Create the Database and StoredClassCatalog objects
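
One way to see why class descriptions are kept in a separate catalog: standard Java serialization writes the class description into the byte stream alongside the field data. The sketch below measures that overhead using only the Java standard library, with no Berkeley DB calls. The `SerializedSizeDemo` class and its `Point` record type are assumptions for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializedSizeDemo {
    // Hypothetical record type, used only for this illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Size in bytes of one object written with standard Java serialization.
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // The two int fields hold 8 bytes of data; the remainder of the stream
        // is the serialization header and the class description.
        System.out.println("field data bytes: " + 2 * Integer.BYTES);
        System.out.println("serialized bytes: " + serializedSize(new Point(3, 4)));
    }
}
```

If every stored record carried its own copy of the class description, small records would be dominated by that overhead. Storing the descriptions once, in a catalog, avoids the repetition.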

Open Database:
Close Environment, Class Catalog and Databases: