This presentation gives an overview of the Apache CouchDB project. It explains CouchDB architecture in relation to replication, usage, its UI and the platforms it is available for.

Links for further information and connecting:
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/
What Is Apache CouchDB?
● A document-oriented NoSQL database
● Open source / Apache 2.0 license
● Written in Erlang, JavaScript, C, C++
● Stores documents as JSON
● Single node or cluster
● Takes an offline-first approach / uses bidirectional replication
● DB access via HTTP requests
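As a rough illustration of "DB access via HTTP requests", the sketch below builds (but does not send) the requests that would store and fetch a JSON document. The base URL assumes a local node on CouchDB's default port 5984; the database name `books`, the document id and the helper names are hypothetical.

```python
import json
from urllib.request import Request

# Assumed address of a local CouchDB node (5984 is the default port).
BASE_URL = "http://127.0.0.1:5984"


def build_put_document(db: str, doc_id: str, doc: dict) -> Request:
    """Build the HTTP PUT request that would store `doc` as JSON."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_get_document(db: str, doc_id: str) -> Request:
    """Build the HTTP GET request that would read the document back."""
    return Request(url=f"{BASE_URL}/{db}/{doc_id}", method="GET")


# Hypothetical database and document, for illustration only.
put_req = build_put_document("books", "spark-2015", {"title": "Mastering Apache Spark"})
print(put_req.get_method(), put_req.full_url)
```

Sending either request (for example with `urllib.request.urlopen`) needs a running CouchDB server and, on a typical install, valid credentials.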
How Does CouchDB Work?
● It provides ACID support (Atomic, Consistent, Isolated, Durable)
● It has a crash-only design: no shutdown procedure, just termination
● CouchDB uses Multi-Version Concurrency Control (MVCC)
● After an OS crash or power failure, either
  – Partially flushed updates are simply forgotten, or
  – A surviving copy of the previous identical headers remains
  – Both cases ensure coherency of all previously committed data
● Crash-friendly design
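The MVCC idea can be sketched with a toy in-memory store: every update must name the revision it was based on, and a stale revision is rejected instead of taking a lock. This is only an illustration of the technique, not CouchDB's storage engine. The class, method names and the revision hashing are assumptions; a real server reports such a conflict as HTTP 409.

```python
import hashlib
import json


class ConflictError(Exception):
    """Raised when an update is based on a stale revision."""


class ToyStore:
    """Toy model of revision-checked updates (not CouchDB's real engine)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body)

    @staticmethod
    def _next_rev(prev_rev, body):
        # Revision looks like "<generation>-<hash>"; the hash choice is illustrative.
        generation = int(prev_rev.split("-")[0]) + 1 if prev_rev else 1
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        return f"{generation}-{digest[:16]}"

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The writer must have seen the latest revision; otherwise reject.
        if rev != current_rev:
            raise ConflictError(f"stale revision for {doc_id!r}")
        new_rev = self._next_rev(current_rev, body)
        self._docs[doc_id] = (new_rev, body)
        return new_rev


store = ToyStore()
rev1 = store.put("doc", {"n": 1})            # create
rev2 = store.put("doc", {"n": 2}, rev=rev1)  # update based on the latest revision
try:
    store.put("doc", {"n": 3}, rev=rev1)     # based on an outdated revision
except ConflictError as exc:
    print("conflict:", exc)
```

Readers are never blocked in this model: they simply see the last committed revision while a writer prepares the next one.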
Cross Platform
● Available for
  – Linux / Unix
  – FreeBSD
  – Windows
  – macOS
  – Cloud
  – Mobile (iOS / Android, Lite version)
● Install from binary or source
● Install via Docker / Snap
● Install on Kubernetes
CouchDB Replication
● Synchronises two copies of the same database
● One source and one target database
● Can be on the same or different CouchDB instances
● Can be one-way or bidirectional (master-master)
● Controlling which documents replicate
  – Local documents are never replicated
  – Filter functions select documents
  – Selector objects
    ● A query object that tests each document for replication
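A replication is usually requested by sending a JSON body that names the source and the target, optionally with a selector. The sketch below only builds such bodies. The field names follow the replication request format as I understand it; the URLs, the `type` field in the selector and the helper names are hypothetical.

```python
import json


def build_replication_body(source, target, selector=None, continuous=False):
    """Build the JSON body for a one-way replication request."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True
    if selector is not None:
        # Selector object: only documents matching this query are replicated.
        body["selector"] = selector
    return body


def bidirectional(db_a, db_b, **options):
    """Bidirectional replication is two one-way replications, one per direction."""
    return [
        build_replication_body(db_a, db_b, **options),
        build_replication_body(db_b, db_a, **options),
    ]


# Hypothetical database URLs and selector, for illustration only.
jobs = bidirectional(
    "http://node-a.example:5984/books",
    "http://node-b.example:5984/books",
    selector={"type": "book"},
)
print(json.dumps(jobs, indent=2))
```

Each body would typically be POSTed to the node's `/_replicate` endpoint or stored as a document in the `_replicator` database.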
CouchDB Cluster
● CouchDB can be single node or clustered
● Cluster defined by
  – Number of shards, or parts, of each database (q)
  – Number of document copies / replicas (n)
● Since V3 the default is q=2, n=3
  – Each database (and secondary index) is split into 2 shards, with 3 replicas per shard
  – For a total of 6 shard replica files
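The arithmetic behind q and n can be sketched as follows. The replica count is taken directly from the slide. The hash-range labels assume that the key space is a 32-bit range divided evenly between shards, which is an assumption about the layout and not something stated above. The function names are hypothetical.

```python
def shard_replica_count(q: int, n: int) -> int:
    """Total shard replica files for one database: q shards times n copies."""
    return q * n


def shard_ranges(q: int) -> list:
    """Split an assumed 32-bit hash space into q contiguous, equal ranges."""
    span = 2 ** 32
    step = span // q
    ranges = []
    for i in range(q):
        start = i * step
        # The last shard always ends at the top of the hash space.
        end = span - 1 if i == q - 1 else (i + 1) * step - 1
        ranges.append(f"{start:08x}-{end:08x}")
    return ranges


# Default values since V3: q=2, n=3.
print(shard_replica_count(2, 3))
print(shard_ranges(2))
```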
CouchDB Cluster
● Replicas add failure resistance
● Some nodes can be offline without everything crashing down
  – n=1: All nodes must be up
  – n=2: Any 1 node can be down
  – n=3: Any 2 nodes can be down
● Using default values and a single database
  – q x n = 2 x 3 = 6 nodes
  – A maximum of six nodes
  – This defines the maximum number of nodes for horizontal scaling
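The two rules on this slide reduce to simple arithmetic: with n replicas, up to n - 1 nodes can be down, and one database can spread over at most q x n nodes. A minimal sketch, with hypothetical function names:

```python
def tolerated_node_failures(n: int) -> int:
    """Nodes that can be down while one copy of each document stays reachable."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n - 1


def max_nodes(q: int, n: int) -> int:
    """Upper bound on nodes one database can use: one shard replica per node."""
    if q < 1 or n < 1:
        raise ValueError("q and n must be at least 1")
    return q * n


for n in (1, 2, 3):
    print(f"n={n}: up to {tolerated_node_failures(n)} node(s) can be down")
print("max nodes with defaults:", max_nodes(2, 3))
```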
CouchDB UI
● Fauxton, the CouchDB UI, simplifies access
● Manage cluster or single node
● Manage CouchDB
  – Databases
  – Active tasks
  – Configuration
  – Replication
  – Users
● Access documentation
● Verify CouchDB install
CouchDB + CAP Theorem
● The CAP theorem examines
  – Consistency
    ● All database clients see the same data, even with concurrent updates.
  – Availability
    ● All database clients are able to access some version of the data.
  – Partition tolerance
    ● The database can be split over multiple servers.
● CouchDB provides eventual consistency
  – By balancing partition tolerance and availability
Available Books
● See “Big Data Made Easy”
  – Apress Jan 2015
● See “Mastering Apache Spark”
  – Packt Oct 2015
● See “Complete Guide to Open Source Big Data Stack”
  – Apress Jan 2018
● Find the author on Amazon
  – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
Connect
● Feel free to connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
  – open-source-systems.blogspot.com/
● I am always interested in
  – New technology
  – Opportunities
  – Technology based issues
  – Big data integration