Apache TinkerPop is an open source, vendor-agnostic graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP). It is distributed under the Apache License 2.0. When a data system is TinkerPop-enabled, its users are able to model their domain as a graph and analyze that graph using the Gremlin graph traversal language.
Hackolade was specially built to support the data modeling of TinkerPop vertex labels and edge labels. The application closely follows the terminology of the database.
To be clear, Hackolade is not a graph visualization tool, but a tool for data modeling of TinkerPop graph databases.
To perform data modeling for TinkerPop with Hackolade, you must first download the TinkerPop plugin.
The data model in the picture below results from the reverse-engineering of the Gremlin TheCrew example. Two views of the data model are available:
1) a graph view, with familiar circular node labels
2) an Entity-Relationship Diagram (ERD) view, with the advantage of displaying properties for both vertex labels and edge labels:
Vertex labels are a semantic representation of vertices (nodes) in the graph. Vertex labels are used to represent the role of the vertex in the domain, making it possible to query the graph, to define constraints, and to add indexes for properties. Labels can also be used to mark temporary states of a vertex.
A vertex label usually has attributes, called "property keys", where the name (or key) is a string.
Gremlin is data type-agnostic. It supports several formats for data ingestion, in particular GraphML, GraphSON, and Gryo.
GraphML supports the following attribute types: string, float, double, int, long, and Boolean.
GraphSON is considered both a "graph" format and a generalized object serialization format. That characteristic makes it useful as a serialization format for Gremlin Server, where arbitrary objects of varying types may be returned as results. Without embedded types, the original type system was restricted to the standard JSON types of Object, List, String, Number, and Boolean, which leads to lossiness in the format (e.g. a float will be interpreted as a double).
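The lossiness described above can be illustrated with a minimal Python sketch. It only uses the standard json module; the "weight" key and the helper graphson_type are hypothetical names chosen for this illustration, while the "@type"/"@value" wrapper and the "g:Float" type name follow the typed GraphSON convention.

```python
import json

# Untyped JSON: the number 1.5 carries no hint of whether it was a float or a double.
untyped = json.loads(json.dumps({"weight": 1.5}))

# Typed GraphSON wraps the value in an object with "@type" and "@value" keys.
typed = json.loads('{"weight": {"@type": "g:Float", "@value": 1.5}}')


def graphson_type(value):
    """Return the declared GraphSON type, or None for a plain JSON value."""
    if isinstance(value, dict) and "@type" in value:
        return value["@type"]
    return None


print(graphson_type(untyped["weight"]))  # None
print(graphson_type(typed["weight"]))    # g:Float
```

With the plain JSON form, a reader of the data has to guess the numeric type; with the typed form, the declared type travels with the value.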
Version 3.0 of GraphSON was first introduced in TinkerPop 3.3.0 and is the default format when none is specified. In GraphSON 3.0, there is explicit typed support for Map, List, and Set, as Gremlin relies on those types in quite specific ways that are not directly compatible with the JSON definitions of those collections. Null, Timestamp, and UUID are also added.
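As a rough illustration of why typed collections matter, the sketch below decodes a small subset of GraphSON 3.0 values into Python objects. The decode function is a hypothetical helper written for this example, not part of TinkerPop or Hackolade. It assumes that a "g:Map" value is serialized as a flat list of alternating keys and values, which is what allows non-string keys that a plain JSON object cannot express.

```python
def decode(node):
    """Decode a small subset of GraphSON 3.0 typed values (illustrative only)."""
    if isinstance(node, list):
        return [decode(item) for item in node]
    if not (isinstance(node, dict) and "@type" in node):
        return node  # plain JSON string, number, boolean or null
    kind, value = node["@type"], node["@value"]
    if kind == "g:List":
        return [decode(item) for item in value]
    if kind == "g:Set":
        return set(decode(item) for item in value)
    if kind == "g:Map":
        # Keys and values alternate in one flat list: [k1, v1, k2, v2, ...]
        items = [decode(item) for item in value]
        return dict(zip(items[0::2], items[1::2]))
    if kind in ("g:Int32", "g:Int64"):
        return int(value)
    if kind in ("g:Float", "g:Double"):
        return float(value)
    return value  # other types are passed through unchanged in this sketch


doc = {
    "@type": "g:Map",
    "@value": [
        {"@type": "g:Int32", "@value": 1},
        {"@type": "g:Set", "@value": ["a", "b"]},
    ],
}
print(decode(doc))
```

The decoded map has an integer key and a set as its value, neither of which survives a round trip through untyped JSON.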
Multiple properties (multi-properties): a vertex property key can have multiple values. For example, a vertex can have multiple "name" properties.
Properties on properties (meta-properties): a vertex property can have properties (i.e. a vertex property can have key/value data associated with it).
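Multi-properties and meta-properties can be pictured with a small in-memory model. This is a sketch only: the classes VertexProperty and Vertex below are hypothetical and are not TinkerPop's Java API, and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VertexProperty:
    key: str
    value: object
    # Meta-properties: key/value data attached to this property itself.
    meta: dict = field(default_factory=dict)


@dataclass
class Vertex:
    label: str
    # A list (not a dict) so the same key may appear several times.
    properties: list = field(default_factory=list)

    def values(self, key):
        """Return every value stored under the given property key."""
        return [p.value for p in self.properties if p.key == key]


person = Vertex("person", [
    VertexProperty("name", "Marko"),
    VertexProperty("name", "Marko A."),  # second "name": a multi-property
    VertexProperty("location", "Santa Fe", meta={"startTime": 2005}),
])

print(person.values("name"))
print(person.properties[2].meta)
```

Storing properties as a list of objects, rather than as a single key/value map, is what leaves room for both repeated keys and per-property metadata.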
Edge labels are a semantic representation of edges in the graph. Every edge must have one and only one label, and two vertices can be linked by edges with several different labels. Edge labels are used during complex traversals across the graph, when only certain kinds of paths from vertex to vertex are necessary for a specific query.
In TinkerPop, edges are unidirectional, going from one vertex to another vertex. In Hackolade, we also represent edge labels that are implicitly bi-directional. For example, IS_MARRIED_TO should not require two edge labels (one per direction), but should instead be considered bi-directional. Since TinkerPop does not support the bi-directional concept, marking a relationship as bi-directional in Hackolade is for documentation purposes only.
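The following sketch shows how a single directed edge can still be read in both directions at query time, which is why a second, mirrored edge is not needed. The functions loosely mirror the idea behind Gremlin's out(), in() and both() steps, but they are plain Python over a hypothetical edge list, not Gremlin itself; the vertex names are invented.

```python
# Each edge is stored once, as (source, label, target).
edges = [("alice", "IS_MARRIED_TO", "bob")]


def out_neighbors(vertex, label):
    """Vertices reached by following edges in their stored direction."""
    return [t for s, l, t in edges if s == vertex and l == label]


def in_neighbors(vertex, label):
    """Vertices reached by following edges against their stored direction."""
    return [s for s, l, t in edges if t == vertex and l == label]


def both_neighbors(vertex, label):
    """Vertices reached regardless of the stored direction."""
    return out_neighbors(vertex, label) + in_neighbors(vertex, label)


print(out_neighbors("bob", "IS_MARRIED_TO"))   # []
print(both_neighbors("bob", "IS_MARRIED_TO"))  # ['alice']
```

The edge is stored from alice to bob only, yet a direction-agnostic lookup finds the spouse from either side.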
As TinkerPop is a type of graph database known as a 'property graph', edge labels may have attributes, or properties, just like vertex labels do:
TinkerPop is also agnostic when it comes to index management. It is, however, possible to create an index on a single property key for any given label. A so-called composite index can also be created on more than one property key for a given label.
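Because index syntax depends on the underlying TinkerPop-enabled system, the sketch below only illustrates the concept in plain Python: an index for one label, keyed by the values of one property key (single) or several property keys (composite). The function build_index and the sample records are hypothetical and do not correspond to any vendor's API.

```python
from collections import defaultdict


def build_index(vertices, label, keys):
    """Map each tuple of indexed property values to the matching vertex ids."""
    index = defaultdict(list)
    for vertex in vertices:
        if vertex["label"] == label and all(k in vertex for k in keys):
            index[tuple(vertex[k] for k in keys)].append(vertex["id"])
    return dict(index)


# Illustrative sample data only.
vertices = [
    {"id": 1, "label": "person", "name": "marko", "age": 29},
    {"id": 2, "label": "person", "name": "vadas", "age": 27},
    {"id": 3, "label": "software", "name": "lop"},
]

single = build_index(vertices, "person", ["name"])            # one property key
composite = build_index(vertices, "person", ["name", "age"])  # several property keys

print(single.get(("marko",)))
print(composite.get(("vadas", 27)))
```

A composite index answers lookups only when values for all of its keys are supplied, which is the main trade-off against several single-key indexes.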
TinkerPop does not provide an abstraction for schemas. Gremlin is a functional, data-flow language that enables users to succinctly express complex traversals on (or queries of) their application's property graph. In order to provide added value in forward-engineering, Hackolade provides a graph example in Gremlin syntax for the data model.
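To give an idea of what a graph example in Gremlin syntax could look like, here is a minimal Python sketch that renders a vertex definition as a Gremlin addV() traversal string. It is not Hackolade's actual generator or output: the functions literal and add_vertex are hypothetical, and only strings, booleans and numbers are handled.

```python
def literal(value):
    """Render a Python value as a Gremlin-Groovy literal (str, bool, numbers only)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return str(value)


def add_vertex(label, properties):
    """Build a g.addV(...).property(...) traversal string for one sample vertex."""
    steps = ["g.addV(" + literal(label) + ")"]
    for key, value in properties.items():
        steps.append(".property(" + literal(key) + ", " + literal(value) + ")")
    return "".join(steps)


print(add_vertex("person", {"name": "marko", "age": 29}))
# g.addV('person').property('name', 'marko').property('age', 29)
```

Since there is no schema to emit, a script of sample vertices and edges like this one is a practical way to communicate the intended shape of the graph.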
The reverse-engineering process uses Gremlin functions to query the database, infer the schema, and represent the vertex labels and edge labels with their respective properties and indexes.
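The inference step can be sketched as follows, assuming the sampled vertices or edges have already been fetched (for instance with traversals along the lines of g.V().label().dedup(), which is an assumption about the kind of query involved, not a description of Hackolade's internals). The function infer_schema and the record layout are hypothetical.

```python
from collections import defaultdict


def infer_schema(elements):
    """Collect, per label, the property keys seen and the value types observed."""
    schema = defaultdict(lambda: defaultdict(set))
    for element in elements:
        props = schema[element["label"]]  # registers labels even without properties
        for key, value in element["properties"].items():
            props[key].add(type(value).__name__)
    return {
        label: {key: sorted(types) for key, types in props.items()}
        for label, props in schema.items()
    }


# Illustrative sample of fetched vertices.
sample = [
    {"label": "person", "properties": {"name": "marko", "age": 29}},
    {"label": "person", "properties": {"name": "vadas"}},
    {"label": "software", "properties": {"name": "lop"}},
]

print(infer_schema(sample))
```

Because the result depends on which elements are sampled, a property key that appears only on unsampled vertices would be missed; a larger sample gives a more complete model.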