    Collibra Data Dictionary integration

    One of the primary challenges constraining organizations is making business sense of the technical data structures in applications and databases.  This makes it harder for organizations to identify critical data elements and bring them under governance.  


    The integration of data modeling with governance tools and processes enables solving this problem at the source, i.e. where data structures are designed to create schemas and their technical metadata.  


    Note: You may wish to view the how-to video on this subject.


    Collibra is one of the leaders in the space of data governance and metadata management solutions.  Metadata management is a core aspect of an organization’s ability to manage its data and information assets. The term “metadata” describes the various facets of an information asset that can improve its usability throughout its life cycle. Metadata is used as a reference for business-oriented and technical projects, and lays the foundations for describing, inventorying and understanding data for multiple use cases.

    Hackolade has partnered with Collibra to provide an officially supported integration with Collibra's Data Dictionary, using its Core, Import, and Output Module APIs.  With this integration, users can easily publish their Hackolade data models into Collibra domains, and keep them synchronized, for any of the many targets supported by Hackolade.  This includes the schema definitions of REST APIs documented in Swagger or OpenAPI.

    The process automatically:

    • checks the configuration in Collibra
    • creates the necessary custom scopes, attributeTypes and assignments to support the granularity of Hackolade features
    • then creates and keeps in sync assets for schemas, tables, views, columns, models, entities, attributes, and foreign key relationships.

    The integration specifically handles the complex data types, hierarchical structures, and polymorphism found in modern databases, JSON, Avro, Parquet, ProtoBuf, etc.  Custom properties defined for a plugin are also published as custom attributeTypes in Collibra.

    Hackolade Studio data models for physical targets are published to Physical Data Dictionaries in the form of schemas/tables/columns assets in Collibra.  Since v7.3.1 of Hackolade Studio, Polyglot models are published to Logical Data Dictionaries in the form of models/entities/attributes assets in Collibra.

    With v7.6.1 of Hackolade Studio, we added publishing of lineage relations between logical Polyglot models and their derived physical targets for all their assets (model/schema, entity/table, attribute/column).

    Important note: the Collibra integration is an add-on feature which requires a specific license key which can be purchased from us here.

    Publishing process flow

    To publish a Hackolade data model to your Collibra Data Dictionary, choose Tools > Forward-Engineering > Collibra Dictionary.  

    The diagram below describes the integration flow:

    Collibra integration flow


    Connect to your Collibra instance

    In order to feed data model information to the Collibra instance, it is assumed that you have sufficient credentials to do so.  If not, please contact your Collibra administrator.


    To connect to the Collibra instance, you must first specify connection settings:

    Collibra connection settings

    as well as authentication credentials:

    Collibra authentication


    User rights

    To successfully import a Hackolade model into Collibra, the user should have the Author license type. The role assigned to the user should be granted the following permissions:


    • For global role and permissions:
      • System administration - (This is necessary to apply the custom Hackolade configuration: attributeTypes, relationTypes, scope...)
    • For resource role and permissions:
      • Asset:
        • Add
        • Remove
        • Update
        • Attribute:
          • Add
          • Remove
          • Update
      • Attachment:
        • Add
      • Domain: (This is necessary for views and for working with the Hackolade Mapping Domain)
        • Add
        • Remove
        • Update


    We also recommend assigning a user with the above permissions to the parent community of the target domain. This is needed to create, update, and delete the Hackolade Mapping Domain.


    The Hackolade Mapping Domain is used to represent links between view columns and columns in tables, for example:

    Collibra Mapping Domain



    Check for the proper configuration

    To ensure successful processing of the Hackolade model information, the system uses the Core API to check that the Collibra setup is OK, and if not, asks the user for permission to create the necessary setup.


    The system will:

    1) confirm that the out-of-the-box assetTypes exist: model, entity, attribute, schema, table, database view, column, foreign key, mapping specification

    2) confirm that the out-of-the-box relationTypes exist: schema contains table, and table contains column

    3) confirm that the Hackolade setup exists:

    - custom attribute scope

    - custom attributeTypes to handle Hackolade-specific information

    - custom assignments of attributeTypes to out-of-the-box assetTypes: model, entity, attribute, schema, tables, database views, columns

    - custom characteristics for schema, table, columns, and database views

    - custom relationType "Column contains Column" to allow hierarchical view of nested objects, arrays, and polymorphism


    If the expected configuration cannot be found in Collibra, the user is prompted for confirmation that the setup should be automatically carried out in the Collibra instance.
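
    The checks listed above amount to comparing what an instance already contains against what the integration expects. The following is a minimal sketch of that comparison only, not the actual Hackolade implementation: the function names are hypothetical, the exact spelling of type names inside Collibra is an assumption, and the lists of existing types are assumed to have been fetched beforehand.

```python
# Names are taken from the checklist above; how Collibra spells them
# internally is an assumption made for this illustration.
REQUIRED_ASSET_TYPES = {
    "Model", "Entity", "Attribute", "Schema", "Table",
    "Database View", "Column", "Foreign Key", "Mapping Specification",
}
REQUIRED_RELATION_TYPES = {
    ("Schema", "contains", "Table"),
    ("Table", "contains", "Column"),
}
# Custom relation type added by the Hackolade setup for nested structures.
CUSTOM_RELATION_TYPES = {("Column", "contains", "Column")}


def check_configuration(asset_types, relation_types):
    """Return the expected items that are missing, grouped by kind."""
    asset_types = set(asset_types)
    relation_types = set(relation_types)
    return {
        "asset_types": sorted(REQUIRED_ASSET_TYPES - asset_types),
        "relation_types": sorted(REQUIRED_RELATION_TYPES - relation_types),
        "custom_relation_types": sorted(CUSTOM_RELATION_TYPES - relation_types),
    }


def needs_setup(report):
    """True when at least one expected item is missing."""
    return any(report.values())
```

    A non-empty report corresponds to the situation where the user is asked to confirm the automatic setup.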


    Fetch existing Communities and Domains

    If the configuration is correct, the application uses the Core API to retrieve the existing Communities and Domains and display them so the user can select where the Hackolade Data Model should be loaded.  If the domain does not exist, it should be created first.  It is recommended to create a new domain with type "Physical Data Dictionary" for physical models of Hackolade Studio, and "Logical Data Dictionary" for polyglot models.
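
    As an illustration of the recommendation above, the sketch below groups a list of domains by community and filters the ones whose type matches the kind of model being published. The record shape (`name`, `community`, `type` keys) and the function names are assumptions made for this example; they do not describe the Core API response format.

```python
from collections import defaultdict

# Recommended domain type per kind of Hackolade Studio model (from the text above).
RECOMMENDED_DOMAIN_TYPE = {
    "physical": "Physical Data Dictionary",
    "polyglot": "Logical Data Dictionary",
}


def group_domains_by_community(domains):
    """Build a {community: [domain names]} tree for display."""
    tree = defaultdict(list)
    for domain in domains:
        tree[domain["community"]].append(domain["name"])
    return {community: sorted(names) for community, names in sorted(tree.items())}


def candidate_domains(domains, model_kind):
    """Names of the domains whose type is recommended for this model kind."""
    wanted = RECOMMENDED_DOMAIN_TYPE[model_kind]
    return [d["name"] for d in domains if d["type"] == wanted]
```

    If `candidate_domains` returns an empty list, that is the case where a domain of the recommended type should be created first.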

    Collibra create domain

    Select the target domain

    All communities and domains are displayed in the box below so the user can select the one where the Hackolade data model should be loaded:

    Collibra resource selection


    Select the data model objects to be loaded

    The user then selects the entities to be loaded into the selected Collibra domain:

    Collibra object selection


    Publish data model to Collibra

    The application uses the Import API to bulk load the metadata of the selected objects and the Entity-Relationship picture into the selected Collibra domain.  The system leverages the Synchronization API to keep data in Collibra up-to-date with model evolutions when the integration is invoked repeatedly.  The synchronization is based on the internal model UUID.
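
    To illustrate why keying the synchronization on a stable UUID matters, here is a minimal sketch of a UUID-based comparison between the model and what was previously published. The function name and data shapes are hypothetical, and what the integration does with objects found only in Collibra is not covered by this sketch.

```python
def plan_sync(model_objects, published_uuids):
    """Classify objects by UUID.

    model_objects:   {uuid: current name} for the objects selected in the model
    published_uuids: set of UUIDs already present in the Collibra domain
    """
    model_uuids = set(model_objects)
    published_uuids = set(published_uuids)
    return {
        "create": sorted(model_uuids - published_uuids),
        "update": sorted(model_uuids & published_uuids),
        "only_in_collibra": sorted(published_uuids - model_uuids),
    }
```

    Because matching is done on the UUID rather than the name, an object renamed in the model falls under "update" instead of being treated as a new object.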

    Important note: According to the Collibra documentation, depending on the resource type, the Import API performs one of two operations: SET/REPLACE or MERGE.  For attributes, the operation is SET/REPLACE.  As a result, "if the resource exists with properties other than the ones defined in the input (i.e., the Hackolade data model), the resource is replaced with the one provided in the input."  This means that edits made in Collibra risk disappearing with subsequent publications from Hackolade.  With the ability to reverse-engineer from Collibra into Hackolade, two approaches are possible:

    1) reverse-engineer from Collibra into the master Hackolade data model, and let the conflict resolution kick in, letting the user decide whether to merge the information from Collibra.

    Conflict resolution

    Once the information is merged into the Hackolade model, the whole model can be published to Collibra again.

    2) a user might want to have more control over the granularity of what gets merged and what does not.  There is the possibility to reverse-engineer into an empty model, save it, then do a Model Compare & Merge with the master Hackolade model, followed by the publishing back to Collibra of the merged model.
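
    The difference between the two operations mentioned in the important note above can be illustrated with plain dictionaries. This is a conceptual sketch only: the function names are hypothetical and it does not call or reproduce the Collibra Import API.

```python
def set_replace(existing, incoming):
    """SET/REPLACE: the stored attributes become exactly the incoming ones."""
    return dict(incoming)


def merge(existing, incoming):
    """MERGE: incoming values win, but attributes absent from the input are kept."""
    return {**existing, **incoming}


def edits_at_risk(existing, incoming):
    """Attributes in Collibra that a SET/REPLACE publication would drop or overwrite."""
    return {
        name: value
        for name, value in existing.items()
        if name not in incoming or incoming[name] != value
    }
```

    Reviewing something like `edits_at_risk` before publishing is, in spirit, what the reverse-engineering step with conflict resolution gives the user a chance to do.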


    View data model in Collibra console

    The data model information can immediately be viewed inside Collibra:

    Collibra view data models


    In order to view nested objects in the above screen, it is suggested to enable multipath hierarchy for the relation types: schema contains table, table contains column, and column contains column:

    Collibra configure hierarchy


    You may also display the Full Name field to view the nesting path in dot.notation, as well as the Hackolade Data Type:

    Collibra fields config
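
    To show what a nesting path in dot notation looks like for hierarchical structures, the sketch below derives such paths from a nested column definition. The input shape (`name` and `children` keys) and the function name are assumptions for illustration, and the exact prefix Collibra shows in the Full Name field may differ.

```python
def full_names(columns, prefix=""):
    """Return dot-notation paths for columns and their nested children, depth-first."""
    names = []
    for column in columns:
        path = f"{prefix}.{column['name']}" if prefix else column["name"]
        names.append(path)
        # Nested objects and arrays are modeled as child columns.
        names.extend(full_names(column.get("children", []), path))
    return names
```

    For a `customer` column containing an `address` object with a `city` field, this yields `customer`, `customer.address`, and `customer.address.city`.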


    Users will notice that the data types shown are those of the specific target technology:

    Collibra data type view


    The Entity-Relationship Diagram image can also be viewed as a PNG file:

    Collibra view ERD file


    Reverse-engineer a Collibra Data Dictionary

    With v5.2.1, we introduced the possibility to reverse-engineer a Collibra physical data dictionary into a Hackolade data model for the target of your choice.  This operation can be done:

    - either into an empty model you wish to create,

    - or into an existing model, possibly the one used to originally publish to Collibra.  This is particularly handy if maintenance occurs in Collibra for models created in Hackolade Studio.  Refer to the important note above in the section "Publish data model to Collibra".


    Publish lineage relations

    With v7.6.1 of Hackolade Studio, it is now possible to publish lineage relations between logical Polyglot models and their derived physical targets for all their assets (model/schema, entity/table, attribute/column).


    This process requires the orchestration of several successive operations:

    - publish to Collibra the Polyglot model(s) from which physical target models are derived in Hackolade.  Each Polyglot model is typically published into a Collibra Logical Data Dictionary with the model/data entity/data attribute structure (possibly specifying the hierarchy "Data Attribute contains Data Attribute");

    - make sure to save the model(s) in Hackolade Studio, so the Collibra internal IDs are persisted in the Polyglot model(s);

    - open the derived model(s) in Hackolade Studio and make sure to refresh the references to parent Polyglot model(s), which will ensure that the links between objects are persisted in the derived model(s);

    - publish the derived model(s) to Collibra, generally into a Collibra Physical Data Dictionary (unless the derived model is itself a Polyglot model, in which case it would be published to a Logical Data Dictionary).

    In Collibra, it is now possible to display the lineage relations automatically created by Hackolade Studio during publishing.
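
    The ordering of the steps above matters because each lineage relation needs both ends to be resolvable. The following sketch illustrates that dependency under stated assumptions: the function name and data shapes are hypothetical, and it does not describe how Hackolade Studio stores or publishes these links.

```python
def lineage_pairs(logical_ids, derived_refs):
    """Pair derived physical objects with the Collibra assets of their logical parents.

    logical_ids:  {parent object UUID: Collibra asset ID}, as persisted in the
                  Polyglot model after it has been published and saved
    derived_refs: list of (physical object name, parent object UUID), as held
                  in the derived model after its references are refreshed
    """
    pairs, unresolved = [], []
    for physical_name, parent_uuid in derived_refs:
        if parent_uuid in logical_ids:
            pairs.append((logical_ids[parent_uuid], physical_name))
        else:
            unresolved.append(physical_name)
    return pairs, unresolved
```

    An object ending up in `unresolved` corresponds to a skipped step, for example a Polyglot model that was published but not saved, or a derived model whose references were not refreshed.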