Big Data Analytics

Delta Lake on Databricks
Hive
Redshift
Snowflake
Synapse
BigQuery
Presto

Delta Lake on Databricks

Delta Lake on Databricks data modeling

Deliver a reliable single source-of-truth for all data

With support for ACID transactions and schema enforcement, Delta Lake provides the reliability that traditional data lakes lack. This makes it possible to scale reliable data insights throughout the organization and to run analytics and other data projects directly on data lakes. The Databricks platform runs on Azure, AWS, and Google Cloud.

Hackolade was specially adapted to support the data modeling of Delta Lake, including the Databricks storage structure of clusters, databases, tables and views. It leverages Hive primitive and complex data types, plus user-defined types, and combines them with the usual capabilities: forward-engineering of HiveQL scripts, reverse-engineering, documentation generation, model comparison, command-line interface integration with CI/CD pipelines, etc. The application closely follows the Delta Lake terminology.
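As an illustration, here is a minimal sketch of the kind of Databricks SQL script that forward-engineering can produce for a Delta table; the sales.customers table and its columns are hypothetical examples, not output from an actual model:

  -- Hypothetical Delta table with complex types, partitioned by country
  CREATE TABLE IF NOT EXISTS sales.customers (
    customer_id BIGINT,
    name STRING,
    address STRUCT<street: STRING, city: STRING, zip: STRING>,
    orders ARRAY<STRUCT<order_id: BIGINT, amount: DECIMAL(10,2)>>,
    country STRING
  )
  USING DELTA
  PARTITIONED BY (country);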

View sample documentation

Learn more

Apache Hive

Hive data modeling

Hadoop Hive Data Modeling

Apache Hive is an open-source data warehouse system built on top of Hadoop for querying and analyzing large datasets stored in Hadoop files, using HiveQL (HQL), which is similar to SQL. HiveQL automatically translates SQL-like queries into MapReduce jobs. This provides a means of attaching structure to data stored in HDFS.

Hackolade was specially adapted to support the data modeling of Hive, including Managed and External tables and their metadata, partitioning, and primitive and complex data types. It dynamically forward-engineers HQL Create Table scripts as the structure is built in the application. You may also reverse-engineer Hive instances to display the corresponding ERD and enrich the model. The application closely follows the Hive terminology and storage structure.
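For example, a minimal HQL sketch of an external, partitioned table with complex data types, along the lines of what forward-engineering produces; the web_logs table, its columns, and the HDFS location are hypothetical:

  -- Hypothetical external table over files in HDFS, partitioned by date
  CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
    session_id STRING,
    user_info STRUCT<name: STRING, email: STRING>,
    page_views ARRAY<STRING>,
    properties MAP<STRING, STRING>
  )
  PARTITIONED BY (log_date STRING)
  STORED AS PARQUET
  LOCATION '/data/web_logs';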

View sample documentation

Learn more

Amazon Redshift

Redshift workspace

Amazon Redshift is a data warehouse product built on massively parallel processing (MPP) technology to handle complex queries against large datasets.

Hackolade was specially adapted to support the data modeling of Redshift, including schemas, tables and views, plus the generation of DDL Create Table syntax as the model is created via the application. In particular, Hackolade has the unique ability to model complex semi-structured objects stored in columns of the SUPER data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions.
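As an illustration, a minimal sketch of a table with a SUPER column and a query that navigates into it; the customer_events table, its columns, and the JSON paths are hypothetical:

  -- Hypothetical table storing semi-structured JSON in a SUPER column
  CREATE TABLE customer_events (
    event_id   BIGINT,
    event_time TIMESTAMP,
    payload    SUPER
  );

  -- Dot notation navigates the nested structure inside the SUPER column
  SELECT event_id, payload.device.os
  FROM customer_events
  WHERE payload.device IS NOT NULL;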

Learn more

Snowflake

Snowflake data modeling

Cloud-based data warehousing

Snowflake’s architecture is a hybrid of traditional shared-disk and shared-nothing database architectures. It supports ANSI SQL, the most common standardized version of SQL.

Hackolade was specially adapted to support the data modeling of Snowflake, including schemas, tables and views, indexes and constraints, plus the generation of DDL Create Table syntax as the model is created via the application. In particular, Hackolade has the unique ability to model complex semi-structured objects stored in columns of the VARIANT data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions.
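As an illustration, a minimal sketch of a table with a VARIANT column and a query using Snowflake's path notation; the raw_events table, its columns, and the JSON paths are hypothetical:

  -- Hypothetical table storing semi-structured JSON in a VARIANT column
  CREATE TABLE IF NOT EXISTS raw_events (
    event_id  NUMBER,
    loaded_at TIMESTAMP_NTZ,
    payload   VARIANT
  );

  -- Colon path notation plus a cast extracts typed values from the VARIANT
  SELECT event_id, payload:device.os::STRING AS device_os
  FROM raw_events
  WHERE payload:status::STRING = 'active';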

View sample documentation

Learn more

Azure Synapse Analytics

Azure Synapse data modeling

Data modeling for serverless analytics

Azure Synapse Analytics is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics. It uses either serverless or provisioned resources, with a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs.

Hackolade was specially adapted to support the data modeling of Azure Synapse Analytics and Parallel Data Warehouse, including schemas, tables and views, plus the generation of DDL Create Table syntax. In particular, Hackolade has the unique ability to model complex semi-structured objects stored in columns of the (N)VARCHAR(MAX) data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions.
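As an illustration, a minimal T-SQL sketch (assuming a dedicated SQL pool) of a table holding JSON in an NVARCHAR(MAX) column, queried with the built-in JSON functions; the dbo.CustomerEvents table, its columns, and the JSON path are hypothetical:

  -- Hypothetical dedicated SQL pool table storing JSON as text
  CREATE TABLE dbo.CustomerEvents (
    EventId   BIGINT        NOT NULL,
    EventTime DATETIME2     NOT NULL,
    Payload   NVARCHAR(MAX) NULL
  )
  WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);

  -- JSON_VALUE extracts scalar values from the JSON stored in Payload
  SELECT EventId, JSON_VALUE(Payload, '$.device.os') AS DeviceOs
  FROM dbo.CustomerEvents
  WHERE ISJSON(Payload) = 1;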

View sample documentation

Learn more
