Documentation


    DynamoDB

    Amazon Web Services' DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.  It is derived from the ground-breaking 2007 Dynamo paper which, along with Google’s 2006 Bigtable paper, popularized the concept of NoSQL databases.

     

    Hackolade was specially adapted to support the data modeling of DynamoDB tables, including partition (hash) and sort (range) keys, as well as multiple regions.  The application closely follows the terminology of the database.  Support was enhanced in v5.0.8 for single-table storage through the use of views (a more graphical equivalent to the concept of facets in the NoSQL Workbench).

     

    The data model in the picture below results from the reverse-engineering of the sample application described here.

    DynamoDB workspace

     

    Table groups

    In DynamoDB, you can assign multiple tables to a single resource group to manage your resources.  To create the group, provide a group name and a key-value pair used to tag the tables that belong to it.  Resource groups can be created through the AWS CLI or the Console for Resource Groups and Tag Editor.

     

    Tables, items and attributes

    A table is a collection of data.  Each table contains zero or more items.  An item is a group of attributes that is uniquely identifiable among all of the other items.  There is no limit to the number of items you can store in a table.  Each item is composed of one or more attributes.

    Keys

    As is often the case with NoSQL databases, DynamoDB has a unique storage model:  When a table is created, in addition to the table name, a primary key for the table has to be specified.  The primary key uniquely identifies each item in the table, so that no two items can have the same key.  DynamoDB supports two different kinds of primary keys:

    • Partition key – A simple primary key, composed of one attribute.  DynamoDB uses the partition key's value as input to an internal hash function.  The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored.  No two items in a table can have the same partition key value.
    • Partition key and sort key – Referred to as a composite primary key, this type of key is composed of two attributes.  The first attribute is the partition key, and the second attribute is the sort key.  DynamoDB uses the partition key value as input to an internal hash function.  The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored.  All items with the same partition key are stored together, in sorted order by sort key value.  It is possible for two items to have the same partition key value, but those two items must have different sort key values.

     

    Note:

    The partition key of an item is also known as its 'hash' attribute.  The term hash attribute derives from DynamoDB's usage of an internal hash function to evenly distribute data items across partitions, based on their partition key values.

    The sort key of an item is also known as its 'range' attribute.  The term range attribute derives from the way DynamoDB stores items with the same partition key physically close together, in sorted order by the sort key value.
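
    The hash and range behavior described above can be illustrated with a small simulation.  This is only a sketch: the real hash function and partition count are internal to DynamoDB, so the MD5-based `partition_for`, the `NUM_PARTITIONS` constant, and the `Artist`/`SongTitle` attribute names are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages partitions internally


def partition_for(partition_key: str) -> int:
    """Map a partition key value to a partition number (illustrative hash only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def store(items, pk_name, sk_name):
    """Group items by partition key, sorted by sort key within each group."""
    groups = defaultdict(list)
    for item in items:
        groups[item[pk_name]].append(item)
    return {
        pk: sorted(group, key=lambda it: it[sk_name])
        for pk, group in groups.items()
    }


items = [
    {"Artist": "B", "SongTitle": "Zebra"},
    {"Artist": "A", "SongTitle": "Moon"},
    {"Artist": "B", "SongTitle": "Apple"},
]
stored = store(items, "Artist", "SongTitle")
```

    In this sketch, the two items sharing the partition key value "B" end up in the same group, ordered by their sort key, while the same key value always maps to the same partition number.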

     

    Each primary key attribute must be a scalar (meaning that it can only hold a single value).  The only data types allowed for primary key attributes are string, number, or binary.  There are no such restrictions for other, non-key attributes.
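
    As a minimal sketch of how a simple or composite primary key might be declared, the helper below builds the `KeySchema` and `AttributeDefinitions` structures used by the CreateTable operation and rejects key types other than string (S), number (N), or binary (B).  The helper name `key_definition` and the `Artist`/`SongTitle` attributes are hypothetical.

```python
ALLOWED_KEY_TYPES = {"S", "N", "B"}  # string, number, binary


def key_definition(partition_key, sort_key=None):
    """Return (KeySchema, AttributeDefinitions) for a simple or composite key.

    Each key is a (name, type) tuple, for example ("Artist", "S").
    """
    key_schema = []
    attribute_definitions = []
    for key, key_type in ((partition_key, "HASH"), (sort_key, "RANGE")):
        if key is None:
            continue  # simple primary key: no sort key
        name, attr_type = key
        if attr_type not in ALLOWED_KEY_TYPES:
            raise ValueError(f"{name}: key attributes must be of type S, N or B")
        key_schema.append({"AttributeName": name, "KeyType": key_type})
        attribute_definitions.append(
            {"AttributeName": name, "AttributeType": attr_type}
        )
    return key_schema, attribute_definitions


key_schema, attribute_definitions = key_definition(
    ("Artist", "S"), ("SongTitle", "S")
)
```

    Passing a non-scalar type such as a string set ("SS") for a key attribute raises a `ValueError` in this sketch, mirroring the restriction described above.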

     

    Other table properties

    Given the fully managed nature of DynamoDB, when you create a table, you need to specify how much provisioned throughput capacity you want to reserve for reads and writes. DynamoDB will reserve the necessary resources to meet the specified throughput needs while ensuring consistent, low-latency performance. The provisioned throughput settings can be changed at any time, increasing or decreasing capacity as needed.  The throughput settings can be documented in Hackolade.

     

    Many applications can benefit from the ability to capture changes to items stored in a DynamoDB table, at the point in time when such changes occur.  DynamoDB Streams captures a time-ordered sequence of item-level modifications in any DynamoDB table, and stores this information in a log for up to 24 hours. Other applications can access this log and view the data items as they appeared before and after they were modified, in near real time.  When you enable a stream on a table, DynamoDB captures information about every modification to data items in the table.  The stream settings can be documented in Hackolade.
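
    The throughput and stream settings could be expressed as the `ProvisionedThroughput` and `StreamSpecification` parameters of a CreateTable or UpdateTable request.  The sketch below only builds the parameter dictionary; the helper name `table_settings` and the capacity values are assumptions, not recommendations.

```python
STREAM_VIEW_TYPES = {"KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"}


def table_settings(read_units, write_units, stream_view_type=None):
    """Build throughput and (optional) stream settings for a table request."""
    if read_units < 1 or write_units < 1:
        raise ValueError("capacity units must be at least 1")
    settings = {
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    }
    if stream_view_type is not None:
        if stream_view_type not in STREAM_VIEW_TYPES:
            raise ValueError(f"unknown stream view type: {stream_view_type}")
        settings["StreamSpecification"] = {
            "StreamEnabled": True,
            # NEW_AND_OLD_IMAGES records items as they appeared before and after a change
            "StreamViewType": stream_view_type,
        }
    return settings


settings = table_settings(5, 5, stream_view_type="NEW_AND_OLD_IMAGES")
```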

     

    Data types

    DynamoDB supports many different data types for attributes within a table.  They can be categorized as follows:

    • Scalar Types – A scalar type can represent exactly one value.  The scalar types are number, string, binary, Boolean, and null.
    • Document Types – A document type can represent a complex structure with nested attributes—such as you would find in a JSON document.  The document types are list and map.
      • A list type attribute can store an ordered collection of values.  Lists are enclosed in square brackets: [ ... ]  A list is similar to a JSON array.  There are no restrictions on the data types that can be stored in a list element, and the elements in a list do not have to be of the same type.
      • A map type attribute can store an unordered collection of name-value pairs.  Maps are enclosed in curly braces: { ... }  A map is similar to a JSON object.  There are no restrictions on the data types that can be stored in a map element, and the elements in a map do not have to be of the same type.  Maps are ideal for storing JSON documents in DynamoDB. 
    • Set Types – A set type can represent multiple scalar values.  The set types are string set, number set, and binary set.  All of the elements within a set must be of the same type.  For example, an attribute of type Number Set can only contain numbers; String Set can only contain strings; and so on.
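
    To make the three categories concrete, the sketch below converts plain Python values into DynamoDB's attribute-value notation, where each value is wrapped in a type descriptor (S, N, B, BOOL, NULL, L, M, SS, NS, BS).  The function `to_attribute_value` is a hypothetical, simplified helper written for this page; it is not part of any SDK.

```python
from decimal import Decimal

NUMBER_TYPES = (int, float, Decimal)


def is_number(value):
    """True for numeric values; bool is excluded although it subclasses int."""
    return isinstance(value, NUMBER_TYPES) and not isinstance(value, bool)


def to_attribute_value(value):
    """Wrap a Python value in the matching DynamoDB type descriptor."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):
        return {"BOOL": value}
    if is_number(value):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}  # base64-encoded in the JSON wire format
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        # All elements of a set must share one scalar type.
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(is_number(v) for v in value):
            return {"NS": sorted(str(v) for v in value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"unsupported value: {value!r}")


item = to_attribute_value({"name": "x", "tags": ["a", 1, True], "note": None})
```

    A list may mix types freely, whereas a set with mixed element types (or an empty set) is rejected with a `TypeError` in this sketch.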

    Note that date values are stored as ISO-8601 formatted strings, shifted to UTC, with millisecond precision.
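
    A date could be converted to such a string as sketched below.  The function name `to_iso_utc_millis` is hypothetical, and the use of the "Z" suffix for UTC (rather than "+00:00") is an assumption; both are valid ISO-8601 forms.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc_millis(moment):
    """Format an aware datetime as ISO-8601 in UTC with millisecond precision."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


# 05:04:05.678 at UTC+2 is 03:04:05.678 in UTC
local = datetime(2024, 1, 2, 5, 4, 5, 678000, tzinfo=timezone(timedelta(hours=2)))
stamp = to_iso_utc_millis(local)
```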

     

    Hackolade was specially adapted to support the data types and attribute behavior of DynamoDB.  The data model in the picture below is the direct result of the reverse-engineering of the sample application described here.

     

    Image

     

    Indexes

    Whenever a write occurs on a table, all of the table's indexes must be updated. In a write-heavy environment with large tables, this can consume large amounts of system resources. In a read-only or read-mostly environment, this is not as much of a concern—however, one should ensure that the indexes are actually being used by the application, and not simply taking up space.

     

    Indexes in DynamoDB are different from their counterparts in relational databases. When you create a secondary index, you must specify its key attributes – a partition key and a sort key (the sort key is optional for a global secondary index). After you create the secondary index, you can query it or scan it just as you would a table. DynamoDB does not have a query optimizer, so a secondary index is only used when you query it or scan it.

     

    DynamoDB supports two different kinds of indexes:

    • Global secondary indexes – The primary key of the index can be any one or two attributes from its table.
    • Local secondary indexes – The partition key of the index must be the same as the partition key of its table. However, the sort key can be any other attribute.
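
    The difference between the two kinds of indexes shows up in how they would be declared in a CreateTable request.  The sketch below builds entries for the `LocalSecondaryIndexes` and `GlobalSecondaryIndexes` parameters; the helper names, index names, and attribute names are hypothetical, and the projection of all attributes is an arbitrary choice.

```python
def local_secondary_index(name, table_partition_key, sort_key):
    """A local index reuses the table's partition key with another sort key."""
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": table_partition_key, "KeyType": "HASH"},
            {"AttributeName": sort_key, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }


def global_secondary_index(name, partition_key, sort_key, read_units=1, write_units=1):
    """A global index may use key attributes unrelated to the table's key."""
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
            {"AttributeName": sort_key, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
        # With provisioned capacity, a global index has its own throughput.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Table key assumed to be Artist (partition) and SongTitle (sort).
lsi = local_secondary_index("ByAlbum", "Artist", "AlbumTitle")
gsi = global_secondary_index("ByGenre", "Genre", "AlbumTitle")
```

    Any attribute used as an index key would also need an entry in the table's `AttributeDefinitions`.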

     

    DynamoDB ensures that the data in a secondary index is eventually consistent with its table. You can request strongly consistent Query or Scan actions on a table or a local secondary index. However, global secondary indexes only support eventual consistency.
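
    Because there is no query optimizer, an index is used only when a request names it explicitly.  The sketch below assembles Query parameters and refuses a strongly consistent read against a global secondary index; the helper `query_params` and the table, index, and attribute names are assumptions for illustration.

```python
def query_params(table, key_condition, values, index=None,
                 index_is_global=False, consistent=False):
    """Build parameters for a Query request against a table or one of its indexes."""
    if consistent and index is not None and index_is_global:
        raise ValueError(
            "global secondary indexes only support eventually consistent reads"
        )
    params = {
        "TableName": table,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeValues": values,
        "ConsistentRead": consistent,
    }
    if index is not None:
        params["IndexName"] = index  # the index is used only because it is named
    return params


params = query_params(
    "Music",
    "Artist = :a",
    {":a": {"S": "B"}},
    index="ByAlbum",
    consistent=True,  # allowed: ByAlbum is assumed to be a local index
)
```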

     

    More information can be found here.

    Views

    In Hackolade, we have enabled the views feature.  While these views do not get forward-engineered to the database instance, they are useful in the context of single-table storage, in order to visualize and document access patterns for a table.  Views are a more visual equivalent to the concept of facets in the NoSQL Workbench.

     

    DynamoDB views

    Forward-Engineering

    Hackolade dynamically generates CreateTable and ConditionExpression scripts based on model attributes and constraints.  These can be applied directly to the DynamoDB instance:

     

    The script can also be exported to the file system via the menu Tools > Forward-Engineering, or via the Command-Line Interface.
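
    For readers unfamiliar with condition expressions, the sketch below shows the general shape of a PutItem request that carries one.  It is not the script Hackolade generates; the helper `put_if_absent` and the table and attribute names are hypothetical, and the condition shown (reject the write if an item with the same key already exists) is just one common example.

```python
def put_if_absent(table, item, partition_key, sort_key=None):
    """Build PutItem parameters that fail if the key already exists."""
    names = {"#pk": partition_key}
    condition = "attribute_not_exists(#pk)"
    if sort_key is not None:
        names["#sk"] = sort_key
        condition += " AND attribute_not_exists(#sk)"
    return {
        "TableName": table,
        "Item": item,
        "ConditionExpression": condition,
        # Placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": names,
    }


request = put_if_absent(
    "Music",
    {"Artist": {"S": "B"}, "SongTitle": {"S": "Apple"}},
    partition_key="Artist",
    sort_key="SongTitle",
)
```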

     

    DynamoDB apply to instance

    Reverse-Engineering

    DynamoDB can be installed locally or used as a managed database at AWS.  The connection is established using a connection string that includes an (IP) address and port (typically 8000 for a local installation) and, if managed at AWS, authentication using the IAM awsAccessKeyId/awsSecretAccessKey for the region where the DynamoDB instance is located.  You should consult this page for more information on how to connect to DynamoDB.
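
    Outside of Hackolade, the same two connection scenarios might look as follows when preparing a client with the boto3 SDK.  This sketch only builds the keyword arguments and does not open a connection; the helper `client_kwargs`, the region, and the placeholder credentials are assumptions.

```python
def client_kwargs(region, endpoint_url=None, access_key_id=None,
                  secret_access_key=None):
    """Build keyword arguments for boto3.client("dynamodb", **kwargs)."""
    kwargs = {"region_name": region}
    if endpoint_url is not None:
        # Local installation: point the client at the address and port.
        kwargs["endpoint_url"] = endpoint_url
    if access_key_id is not None and secret_access_key is not None:
        # Managed at AWS: authenticate with IAM credentials.
        kwargs["aws_access_key_id"] = access_key_id
        kwargs["aws_secret_access_key"] = secret_access_key
    return kwargs


local = client_kwargs("us-east-1", endpoint_url="http://localhost:8000")
managed = client_kwargs(
    "us-east-1",
    access_key_id="EXAMPLE_KEY_ID",          # placeholder, not a real key
    secret_access_key="EXAMPLE_SECRET_KEY",  # placeholder, not a real secret
)
```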

     

    The Hackolade reverse-engineering process of a DynamoDB table includes a query for a representative random sample of items, followed by a schema inference based on the sample.
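
    The general idea of sampling followed by schema inference can be sketched as below.  This is not Hackolade's actual algorithm: the function `infer_schema`, the sample size, and the output shape are assumptions, and items are expected in DynamoDB's attribute-value notation.

```python
import random
from collections import defaultdict


def infer_schema(items, sample_size=100, seed=None):
    """Infer attribute types and presence from a random sample of items."""
    rng = random.Random(seed)
    if len(items) <= sample_size:
        sample = list(items)
    else:
        sample = rng.sample(items, sample_size)

    types_seen = defaultdict(set)
    occurrences = defaultdict(int)
    for item in sample:
        for name, attribute_value in item.items():
            descriptor = next(iter(attribute_value))  # e.g. "S", "N", "M"
            types_seen[name].add(descriptor)
            occurrences[name] += 1

    return {
        name: {
            "types": sorted(types),
            # An attribute present in every sampled item is treated as required.
            "required": occurrences[name] == len(sample),
        }
        for name, types in types_seen.items()
    }


items = [
    {"Artist": {"S": "A"}, "Year": {"N": "1999"}},
    {"Artist": {"S": "B"}, "Year": {"S": "unknown"}},
    {"Artist": {"S": "C"}},
]
schema = infer_schema(items)
```

    In this example, `Artist` is inferred as a required string, while `Year` is optional and appears with two different types, which is the kind of polymorphism a sample-based inference can surface.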

     

    For more information on DynamoDB in general, please consult the website.