
    ScyllaDB

    ScyllaDB is an open-source distributed NoSQL column-oriented data store, designed to be compatible with Apache Cassandra.  It supports the same CQL query language but is written in C++ instead of Java, to increase raw performance and to take advantage of modern multi-core servers through self-tuning.

     

    ScyllaDB is also a hybrid between a key-value and a column-oriented database.  Rows are organized into tables.  The first component of a primary key is the partition key, and rows are clustered by the remaining columns of the key.  Other columns may be indexed separately from the primary key.
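
    As a minimal sketch of indexing a column outside the primary key, the statements below create a secondary index.  The keyspace `demo` (assumed to exist already), the table `users`, and the index name are hypothetical and used only for illustration.

```cql
-- Hypothetical table: user_id is the primary (partition) key.
CREATE TABLE IF NOT EXISTS demo.users (
    user_id uuid PRIMARY KEY,
    name    text,
    country text
);

-- Index a regular column separately from the primary key.
CREATE INDEX IF NOT EXISTS users_country_idx ON demo.users (country);

-- The index allows filtering on the non-key column.
SELECT user_id, name FROM demo.users WHERE country = 'FR';
```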

     

    To perform data modeling for ScyllaDB with Hackolade, you must first download the ScyllaDB plugin.

     

    Hackolade was specially adapted to support ScyllaDB, following the principles of Cassandra data modeling, including User-Defined Types and the concepts of Partitioning and Clustering keys.  It lets users define, document, and display Chebotko physical diagrams.  The application closely follows the ScyllaDB terminology, data types, and Chebotko notation.  

     

    The data model in the picture below results from reverse-engineering a sample application imported into ScyllaDB.

     

    ScyllaDB workspace

    Keyspace

    A keyspace is a ScyllaDB namespace that defines data replication on nodes.  A cluster typically contains one keyspace per application.  A keyspace is a logical grouping of tables, analogous to a database in relational database systems.
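
    For illustration, a keyspace and its replication settings might be declared as follows.  The keyspace name `demo`, the datacenter name `dc1`, and the replication factor of 3 are assumptions chosen for the example, not values taken from the model shown on this page.

```cql
-- Hypothetical keyspace: three replicas of each row in datacenter 'dc1'.
CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}
    AND durable_writes = true;
```

    The datacenter name must match the one reported by the cluster, so `dc1` would need to be adjusted for a real deployment.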

     

    Table

    Tables in ScyllaDB contain rows of columns, and a primary key identifies the location and order of stored data.  Tables can also be used to store JSON.  Tables are declared up front at schema definition time.

     

     

    ScyllaDB table schema tree view

     

    Primary, Partition, and Clustering Keys

    In ScyllaDB, primary keys can be simple or compound, with one or more partition key columns and, optionally, one or more clustering columns.  The partition key determines which node stores the data.  It is responsible for data distribution across the nodes.  The additional columns determine per-partition clustering.  Clustering is a storage engine process that sorts data within the partition.
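
    The sketch below shows how these roles appear in a table definition.  The keyspace `demo` (assumed to exist already), the table, and its columns are hypothetical.

```cql
-- (sensor_id, day) is a composite partition key: it decides which nodes hold the rows.
-- reading_ts is a clustering column: it orders rows inside each partition.
CREATE TABLE IF NOT EXISTS demo.sensor_readings (
    sensor_id   uuid,
    day         date,
    reading_ts  timestamp,
    temperature double,
    PRIMARY KEY ((sensor_id, day), reading_ts)
) WITH CLUSTERING ORDER BY (reading_ts DESC);

-- Reads that name the full partition key return rows already sorted by reading_ts.
SELECT reading_ts, temperature
FROM demo.sensor_readings
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
  AND day = '2024-01-15'
LIMIT 10;
```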

     

    Attribute data types

    ScyllaDB supports a variety of scalar and complex data types, including lists, maps, and sets.

     

    Hackolade was specially adapted to support the data types and attribute behavior of ScyllaDB.

    ScyllaDB data types, string modes, and numeric modes

    Some scalar types can be configured for different modes. 

     

    Hackolade also supports ScyllaDB User-Defined Types (UDTs) via its re-usable object definitions.
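
    As a rough example of how collections and a UDT look in CQL, consider the definitions below.  The keyspace `demo` (assumed to exist already), the type `address`, and the table `customers` are hypothetical names.

```cql
-- A user-defined type grouping related fields.
CREATE TYPE IF NOT EXISTS demo.address (
    street text,
    city   text,
    zip    text
);

-- A table mixing scalar types, collections, and the UDT above.
CREATE TABLE IF NOT EXISTS demo.customers (
    customer_id uuid PRIMARY KEY,
    name        text,
    emails      set<text>,        -- unique values, unordered
    phones      list<text>,       -- ordered, duplicates allowed
    preferences map<text, text>,  -- key/value pairs
    home        frozen<address>   -- UDT stored as a single value
);
```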

     

     

    Materialized Views

    ScyllaDB supports materialized views to handle automated server-side denormalization.  In theory, this removes the need for client-side handling and ensures consistency between base and view data.  Materialized views work particularly well with immutable insert-only data, but should not be used with low-cardinality data.  Materialized views are designed to alleviate the pain for developers, but are essentially a trade-off of performance for connectedness.  See more info in this article.

     

    Hackolade supports ScyllaDB materialized views, via a SELECT of columns of the underlying base table, to present the data of the base table with a different primary key for different access patterns.  
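
    A minimal sketch of such a view follows, assuming a hypothetical base table `users` in a keyspace `demo` that already exists.  The view exposes the same rows keyed by `email` instead of `user_id`.

```cql
-- Base table, keyed by user_id.
CREATE TABLE IF NOT EXISTS demo.users (
    user_id uuid PRIMARY KEY,
    email   text,
    name    text
);

-- View over the base table with a different primary key.
-- Every primary key column of the view must be restricted with IS NOT NULL,
-- and the base table's key columns must be part of the view's key.
CREATE MATERIALIZED VIEW IF NOT EXISTS demo.users_by_email AS
    SELECT user_id, email, name
    FROM demo.users
    WHERE email IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (email, user_id);

-- Lookup by the new access pattern.
SELECT user_id, name FROM demo.users_by_email WHERE email = 'ada@example.com';
```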

     

    Forward-Engineering

    Hackolade dynamically generates the CQL script to create keyspaces, tables, columns and their data types, and indexes for the structure created with the application.

     

    The script can also be exported to the file system via the menu Tools > Forward-Engineering, or via the Command-Line Interface.

     

    ScyllaDB forward-engineering

     

    As many people store JSON within text or blob columns, Hackolade allows for the schema design of those documents.  That JSON structure is not forward-engineered in the CQL script, but is useful for developers, analysts and designers.
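
    To make the case concrete, here is a sketch of a table holding a JSON document in a text column.  The keyspace `demo` (assumed to exist already), the table, and the payload are hypothetical.  From the database's point of view the column is only a string, which is why the structure of the document does not appear in the CQL.

```cql
CREATE TABLE IF NOT EXISTS demo.events (
    event_id uuid PRIMARY KEY,
    payload  text   -- JSON document stored as an opaque string
);

INSERT INTO demo.events (event_id, payload)
VALUES (uuid(), '{"type": "click", "x": 10, "y": 20}');
```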

     

    Reverse-Engineering

    The connection is established using a connection string that includes the host (IP) address and port (typically 9042), with username/password authentication if applicable.

     

    The Hackolade process for reverse-engineering of ScyllaDB databases includes the execution of cqlsh DESCRIBE statements to discover keyspaces, tables, columns and their types, and indexes.  If JSON is detected in string columns, Hackolade performs statistical sampling of records followed by probabilistic inference of the JSON document schema.
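
    The statements below illustrate the kind of cqlsh DESCRIBE commands that expose this metadata.  They are not necessarily the exact statements Hackolade issues, and the keyspace `demo` and table `users` are hypothetical names.

```cql
-- List all keyspaces in the cluster.
DESCRIBE KEYSPACES;

-- Show the full schema of one keyspace: tables, types, indexes, views.
DESCRIBE KEYSPACE demo;

-- Show the definition of a single table.
DESCRIBE TABLE demo.users;
```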

     

    For more information on ScyllaDB in general, please consult the website.