Couchbase Server is an open-source, distributed, multi-model NoSQL document-oriented database that is optimized for interactive applications.  It has a long history of evolution.  It natively manipulates data in key-value form or as JSON documents.  Nevertheless, Couchbase may also be used to store non-JSON data for various use cases.


Hackolade was specially adapted to support the data modeling of multiple object types within one single bucket, while supporting multiple buckets as well.  The data model in the picture below results from the reverse-engineering of the sample travel application described here.

Buckets

There is a fundamental difference with many other NoSQL document databases: Couchbase strongly recommends storing documents of different kinds in the same "bucket".  A bucket is equivalent to a database, and objects with different characteristics or attributes are stored in it side by side.  It may seem counter-intuitive when moving from an RDBMS or MongoDB, but records from multiple tables should be stored in a single bucket, with a "type" attribute to differentiate the various objects stored in the bucket.
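As a minimal sketch of this idea, the snippet below models a bucket as a plain Python dictionary that maps document keys to JSON-like documents, and uses a "type" attribute to tell the kinds apart.  The keys, field values, and the `documents_of_type` helper are hypothetical and only illustrate the pattern; no Couchbase SDK is involved.

```python
# A bucket modelled as a plain dict: document key -> JSON-like document.
# Keys and field values are made up for illustration only.
bucket = {
    "airline_10": {"type": "airline", "name": "Example Air", "country": "France"},
    "airport_1254": {"type": "airport", "airportname": "Example Field", "city": "Lyon"},
    "airline_137": {"type": "airline", "name": "Sample Wings", "country": "Belgium"},
}


def documents_of_type(bucket, doc_type, type_field="type"):
    """Return the sorted keys of all documents whose type attribute equals doc_type."""
    return sorted(
        key for key, doc in bucket.items() if doc.get(type_field) == doc_type
    )


# Both airlines and airports live in the same bucket; the "type" attribute
# is what lets an application (or a query) select one kind of document.
airline_keys = documents_of_type(bucket, "airline")
```

The name of the type attribute is passed as a parameter because it is a modeling choice: any attribute can play this role as long as it is used consistently across the bucket.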


Most deployments have a small number of buckets (usually 2 or 3), and only a few have more than 5.  Although there is no hard limit in the software, the recommended maximum of 10 buckets comes from known CPU and disk IO overhead of the persistence engine, and from the fact that Couchbase allocates a specific amount of memory to each bucket.


Having multiple buckets can nevertheless be quite useful for several use cases:

- multi-tenancy: you want to be sure that all data is kept separate

- different types of data: for example, you can store all JSON documents in one bucket and use another bucket to store "binary" content.  In this setup, one bucket would have views and the other would have none.

- differing operational needs: for data with differing caching and RAM quota needs, compaction requirements, availability requirements, and IO priorities, buckets act as the control boundary.


For example, you might choose to create 1 replica for medical-codes data that contains drug, symptom, and operation codes for a standards-based electronic health record.  This data can be recovered easily from other sources, so a single replica may be fine.  However, patient data may require higher protection with 2 replicas.  To achieve better protection for patient data without wasting additional space on medical codes, you could choose separate buckets for these 2 types of information.


There are 2 types of buckets, each with its own properties: Couchbase buckets and Memcached buckets.


Document kind

When mixing different kinds of objects into the same bucket, it becomes necessary to specify a "type" attribute to differentiate the various objects stored in the bucket.  In Hackolade, each Document Kind is modeled as a separate entity or box, so its attributes can be defined separately.  A specific attribute name must be identified to differentiate the different document kinds.  The unique key and the document kind field are common to all document kinds in the bucket, and displayed at the top of each box in the ERD document:


Keys

Another modeling characteristic distinguishes Couchbase from some other NoSQL document databases: the unique key of each document is stored 'outside' the JSON document itself.  Couchbase was originally a key-value store.  With version 2.0, Couchbase bridged the gap to being a multi-model database supporting JSON documents.  In essence, the key part remains, and the value part can also be a JSON document.  The fundamental difference is that a pure key-value database doesn't understand what's stored in the value, while a document database understands the format in which documents are stored and can therefore provide richer functionality for developers, such as access to documents through queries.


Couchbase does not automatically generate IDs.  Document IDs are assigned by the software application.  A valid document ID must:

  • Conform to UTF-8 encoding
  • Be no longer than 250 bytes

Users are free to choose any ID for their documents, as long as it conforms to the above restrictions.  This feature can be leveraged to define natural keys where possible, so they can be human-readable, deterministic, and semantic.
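The two restrictions above can be checked in application code before a document is written.  The following is a minimal sketch in Python: the function names `is_valid_document_id` and `natural_key`, as well as the `::` separator, are illustrative assumptions and not part of Couchbase or Hackolade.

```python
MAX_ID_BYTES = 250  # limit stated in the list above


def is_valid_document_id(doc_id: str) -> bool:
    """Check that the ID can be encoded as UTF-8 and is at most 250 bytes long."""
    try:
        encoded = doc_id.encode("utf-8")
    except UnicodeEncodeError:
        # Some Python strings (e.g. lone surrogates) have no UTF-8 encoding.
        return False
    # The limit applies to bytes, not characters: "é" counts as 2 bytes.
    return len(encoded) <= MAX_ID_BYTES


def natural_key(doc_type: str, *parts: str, separator: str = "::") -> str:
    """Build a human-readable, deterministic key such as 'airline::10'.

    The separator is a hypothetical convention chosen for this sketch.
    """
    key = separator.join((doc_type, *parts))
    if not is_valid_document_id(key):
        raise ValueError(f"document ID is not valid: {key!r}")
    return key
```

For instance, `natural_key("airline", "10")` yields `airline::10`, which is readable, deterministic, and carries the document kind in the key itself.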


Attributes data types

Couchbase attributes support standard JSON data types, including lists and sets (arrays), and maps (objects).  The Hackolade menu items, contextual menus, toolbar icon tooltips, and documentation are adapted to Couchbase's terminology and feature set.  The following words are reserved.  


Hackolade was specially adapted to support the data types and attribute behavior of Couchbase.



Indexes (TBA)

An index is a data structure that provides a quick and efficient means to query and access data that would otherwise require scanning many more documents.  Couchbase Server provides three types of indexers to build indexes.  Architecturally, distributed databases can benefit from both local and global indexes.  Couchbase Server provides both.


More information can be found here.

Views (TBA)

Views and indexes support querying data in Couchbase Server.  Querying of Couchbase data is accomplished via the following:

  • MapReduce views accessed via the View API.
  • Spatial views accessed via the Spatial View API.
  • N1QL queries with Global Secondary Indexes (GSI) and MapReduce views.

There are a number of differences between views and GSIs.  At a high level, GSIs are built for N1QL queries, which are well suited to interactive applications that require fast response times.  Views, on the other hand, rely on user-defined functions that give great flexibility in indexing.  Views can support complex interactive reporting queries with a pre-calculated result.


More information on views and indexes can be found here and here.

Forward-Engineering

If you are developing Node.js applications on top of a Couchbase database, you may want to leverage Ottoman, an object document mapper (ODM) that allows you to define what your object model looks like, then auto-generates all the boilerplate logic that goes with it.  Hackolade dynamically generates the Ottoman script based on model attributes and constraints.  More information on Ottoman here and here.

Reverse-Engineering

The connection is established using a connection string that includes the (IP) address and port (typically 8091), with username/password authentication if applicable.


The Hackolade process for reverse-engineering Couchbase databases differs depending on the Couchbase version.  For version 3.x, the indexed views are queried via the REST API.  Starting with version 4.0, Hackolade uses N1QL syntax to perform the statistical sampling, followed by the schema inference.  Starting with version 4.5 Enterprise Edition, Hackolade leverages the INFER statement.
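To give an intuition of what schema inference from a sample involves, here is a simplified sketch in Python.  It is not Hackolade's actual algorithm and does not reproduce the output of the INFER statement: it only groups a handful of sampled documents by an assumed "type" attribute and records which JSON types were observed for each top-level attribute.  The sample documents and all function names are hypothetical.

```python
from collections import defaultdict


def json_type(value):
    """Map a Python value to the name of the corresponding JSON type."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def infer_schemas(sample, type_field="type"):
    """Group sampled documents by kind and collect the JSON types seen per attribute."""
    schemas = defaultdict(lambda: defaultdict(set))
    for doc in sample:
        kind = doc.get(type_field, "<unknown>")
        for name, value in doc.items():
            schemas[kind][name].add(json_type(value))
    # Convert to plain dicts with sorted type lists for stable results.
    return {
        kind: {name: sorted(types) for name, types in attrs.items()}
        for kind, attrs in schemas.items()
    }


# Hypothetical sample, standing in for documents drawn from a bucket.
sample = [
    {"type": "airline", "name": "Example Air", "callsign": None},
    {"type": "airline", "name": "Sample Wings", "callsign": "WINGS"},
    {"type": "airport", "airportname": "Example Field", "geo": {"lat": 45.7, "lon": 5.1}},
]
schemas = infer_schemas(sample)
```

In this sketch, `schemas["airline"]["callsign"]` contains both `null` and `string`, showing how a sample can reveal attributes whose type varies from one document to another.  A real inference would also descend into nested objects and arrays and keep occurrence statistics.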


For more information on Couchbase in general, please consult the website.