Documentation

gitHub

Lineage capture

Starting with v4.3.10, Hackolade introduced some basic lineage capture capabilities.  To be clear, Hackolade is not a data governance suite with fancy lineage visualization and transformation features.  It just adds the possibility to enable audit trail capture, which can in turn be used to feed a specialized lineage tool.

 

Object lineage states where an object comes from, where it is going, and what (optional) transformations are applied to it as it flows through multiple processes.  It helps understand the life cycle of the object.  Lineage is the documentation of the life cycle.

 

A Hackolade model can be the result of a combination of action such as: 

  • reverse-engineering of a DDL, 
  • denormalization actions, 
  • reverse-engineering of an additional JSON document, 
  • referencing external definitions,
  • generation of an API model
  • etc.

 

Users may then feed these models, including their lineage capture data to a Data Catalog or a Data Management suite, such as Collibra, IBM InfoSphere DG, or erwin EDGE, etc…

 

Lineage capture can be enabled simply by checking the lineage checkbox property for any Hackolade model.  If you wish to enable lineage capture by default for all new models, you may do so in Tools > Options > General.

 

 

1) Sources are declared at model level

Once lineage capture is enabled for a model, lineage sources are generally created automatically through operations such as reverse-engineering, denormalization, etc..  Lineage sources received a unique identifier for internal tracking and linking purposes.  Lineage sources are viewed in the Lineage tab of the Properties Pane.  The default source name can be edited with a more friendly and descriptive name, if desired.

 

Image

 

The different source formats available are: 

- instance: typically the database instance from which reverse-engineering is performed

- file: see list below

- or model: see list below

 

the different file types are:

- JSON Document

- JSON Schema

- Data Definition Language (DDL)

- XML Schema Definition  (XSD)

- Excel template import

- API generation

- reference to an external definition in JSON Schema

- other

 

The different model types are:

- External reference

- Copy/Paste/Duplicate

- Denormalization

- Reference

- Other

 

While lineage sources are typically created automatically by certain operations, it is also possible to create sources manually.

2) Lineage sources are referenced at individual attribute level

A lineage entry is created at the attribute level when it is impacted, referencing the model-level lineage source.  It displays the busines name and technical name for the source entity and the source attribute.  It is possible to document further with a description and possibly with a transformation.

 

Image

 

 

3) Lineage at the entity level is just a view of attribute level entries

Aside from the description is possible transformation entries, which can be defined, the rest of the information is just a view of attribute-level lineage entries that cannot be edit here.  If editing is required, it should be done at the attribute level.

 

Image