Big Data Analytics

BigQuery
Delta Lake on Databricks
Hive
Redshift
Snowflake
Synapse
Teradata

Google BigQuery

Google BigQuery data modeling

Run analytics over vast amounts of data in near real time

BigQuery uses managed columnar storage, massively parallel execution, and automatic performance optimizations. With familiar ANSI-compliant SQL, BigQuery manages the technical aspects of storing structured data, including compression, encryption, replication, performance tuning, and scaling.

Hackolade was specially adapted to support the data modeling of BigQuery, including datasets, tables and views, plus the generation of DDL Create Table syntax, in Standard SQL or in JSON Schema. Hackolade natively supports the ability to represent nested complex data types: STRUCT (record) and ARRAY.
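
To make the nested types concrete, here is a minimal sketch of how a model containing STRUCT and ARRAY fields could be rendered as BigQuery Standard SQL DDL. It is not Hackolade's implementation: the dictionary-based model format, the function names bq_type and create_table, and the sales.customers table are all hypothetical and chosen only for illustration.

```python
def bq_type(field):
    """Render the BigQuery Standard SQL type of one field in the model."""
    kind = field["type"]
    if kind == "STRUCT":
        # A STRUCT lists its named sub-fields, which may themselves be nested.
        inner = ", ".join(f"{f['name']} {bq_type(f)}" for f in field["fields"])
        return f"STRUCT<{inner}>"
    if kind == "ARRAY":
        # An ARRAY wraps a single element type.
        return f"ARRAY<{bq_type(field['items'])}>"
    return kind  # scalar types such as INT64 or STRING pass through unchanged


def create_table(dataset, table, fields):
    """Build a CREATE TABLE statement for the given dataset and table."""
    cols = ",\n".join(f"  {f['name']} {bq_type(f)}" for f in fields)
    return f"CREATE TABLE {dataset}.{table} (\n{cols}\n);"


# Hypothetical model of a customer table (an assumption, not from the source).
customer_fields = [
    {"name": "id", "type": "INT64"},
    {"name": "address", "type": "STRUCT", "fields": [
        {"name": "street", "type": "STRING"},
        {"name": "city", "type": "STRING"},
    ]},
    {"name": "orders", "type": "ARRAY", "items": {
        "type": "STRUCT", "fields": [
            {"name": "sku", "type": "STRING"},
            {"name": "quantity", "type": "INT64"},
        ]}},
]

ddl = create_table("sales", "customers", customer_fields)
print(ddl)
```

The recursion in bq_type mirrors the nesting of the model, so an array of records comes out as ARRAY<STRUCT<...>> without any special casing.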

Learn more

Delta Lake on Databricks

Delta Lake on Databricks data modeling

Deliver a reliable single source-of-truth for all data

With support for ACID transactions and schema enforcement, Delta Lake provides the reliability that traditional data lakes lack. This makes it possible to scale reliable data insights throughout the organization and to run analytics and other data projects directly on data lakes. The Databricks platform runs on Azure, AWS, and Google Cloud.

Hackolade was specially adapted to support the data modeling of Delta Lake, including the Databricks storage structure of clusters, databases, tables and views. It leverages Hive primitive and complex data types, plus user-defined types, and combines them with the usual capabilities: forward-engineering of HiveQL scripts, reverse-engineering, documentation generation, model comparison, command-line interface integration with CI/CD pipelines, etc. The application closely follows the Delta Lake terminology.

View sample documentation Databricks Data Modeling

Learn more

Apache Hive

Hive data modeling

Hadoop Hive Data Modeling

Apache Hive is an open source data warehouse system built on top of Hadoop for querying and analyzing large datasets stored in Hadoop files, using HiveQL (HQL), which is similar to SQL. Hive automatically translates these SQL-like queries into MapReduce jobs. This provides a means of attaching structure to data stored in HDFS.

Hackolade was specially adapted to support the data modeling of Hive, including Managed and External tables and their metadata, partitioning, and primitive and complex datatypes. It dynamically forward-engineers HQL Create Table scripts as the structure is built in the application. You may also reverse-engineer Hive instances to display the corresponding ERD and enrich the model. The application closely follows the Hive terminology and storage structure.
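
As an illustration of what forward-engineering a Hive table involves, the sketch below assembles an HQL Create Table statement from a table description, covering the managed/external distinction, partition columns and a complex datatype. The function hive_create_table, its parameters, and the web.page_views example are assumptions made for this example, not the tool's actual generator.

```python
def hive_create_table(db, table, columns, partitions=(), external=False,
                      location=None, stored_as="PARQUET"):
    """Build an HQL CREATE TABLE statement from (name, type) pairs."""
    # External tables leave the underlying files in place when dropped;
    # managed tables are owned by Hive.
    keyword = "CREATE EXTERNAL TABLE" if external else "CREATE TABLE"
    cols = ",\n".join(f"  {name} {dtype}" for name, dtype in columns)
    lines = [f"{keyword} IF NOT EXISTS {db}.{table} (", cols, ")"]
    if partitions:
        # Partition columns are declared separately from regular columns.
        parts = ", ".join(f"{name} {dtype}" for name, dtype in partitions)
        lines.append(f"PARTITIONED BY ({parts})")
    lines.append(f"STORED AS {stored_as}")
    if location:
        lines.append(f"LOCATION '{location}'")
    return "\n".join(lines) + ";"


# Hypothetical external table with complex types and one partition column.
hql = hive_create_table(
    db="web",
    table="page_views",
    columns=[
        ("user_id", "BIGINT"),
        ("url", "STRING"),
        ("tags", "ARRAY<STRING>"),
        ("device", "STRUCT<os:STRING, browser:STRING>"),
    ],
    partitions=[("view_date", "STRING")],
    external=True,
    location="/data/web/page_views",
)
print(hql)
```

Calling the same function with external=False and no location yields a managed table, which is the only difference between the two table kinds at the DDL level in this sketch.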

View sample documentation

Learn more

Amazon Redshift

Redshift workspace

Amazon Redshift is a data warehouse product built on massively parallel processing (MPP) technology to handle complex queries against large data sets.

Hackolade was specially adapted to support the data modeling of Redshift, including schemas, tables and views, plus the generation of DDL Create Table syntax as the model is created via the application. In particular, Hackolade has the unique ability to model complex semi-structured objects stored in columns of the SUPER data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions.
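
The sampling-and-inference step can be pictured with a short sketch. The code below merges a handful of sampled JSON documents into one JSON-Schema-like description, recording the observed types, nested properties, and which keys appear in every sample. It is a simplified illustration under stated assumptions: the function names json_type and infer_schema, the output layout, and the sample rows are hypothetical, and the actual inference logic in the product is not described in this page.

```python
import json


def json_type(value):
    """Map a parsed JSON value to its JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int because bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def infer_schema(values):
    """Merge a non-empty list of sampled values into one schema description."""
    types = sorted({json_type(v) for v in values})
    schema = {"type": types[0] if len(types) == 1 else types}
    objects = [v for v in values if isinstance(v, dict)]
    if objects:
        keys = sorted({k for obj in objects for k in obj})
        schema["properties"] = {
            k: infer_schema([obj[k] for obj in objects if k in obj])
            for k in keys
        }
        # A key counts as required only if every sampled object carries it.
        schema["required"] = [k for k in keys
                              if all(k in obj for obj in objects)]
    items = [item for v in values if isinstance(v, list) for item in v]
    if items:
        schema["items"] = infer_schema(items)
    return schema


# Hypothetical rows sampled from a column holding JSON documents.
sampled_rows = [
    '{"id": 1, "name": "Ada", "tags": ["a", "b"]}',
    '{"id": 2, "name": "Bob", "address": {"city": "Paris"}}',
    '{"id": 3, "name": null, "tags": []}',
]
schema = infer_schema([json.loads(row) for row in sampled_rows])
print(json.dumps(schema, indent=2))
```

Because the result depends only on the sampled records, attributes that never occur in the sample are absent from the inferred schema, which is why a reverse-engineered model is usually reviewed and enriched afterwards.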

Learn more

Snowflake

Snowflake data modeling

Cloud-based data warehousing

Snowflake’s architecture is a hybrid of traditional shared-disk database architectures and shared-nothing database architectures. It supports the most common standardized version of SQL: ANSI.

Hackolade was specially adapted to support the data modeling of Snowflake, including schemas, tables and views, indexes and constraints, plus the generation of DDL Create Table syntax as the model is created via the application. In particular, Hackolade has the unique ability to model complex semi-structured objects stored in columns of the VARIANT data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions.
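
To show what the structure inside a VARIANT column looks like from the query side, the following sketch lists the leaf attributes of one document using Snowflake's path notation (a colon after the column name, then dots and bracketed indexes). The function variant_paths, the column name payload, and the sample document are hypothetical; the sketch also assumes the top-level value is an object and that keys are simple identifiers needing no quoting.

```python
def variant_paths(prefix, node, sep=":"):
    """Yield a Snowflake-style path for every leaf value under node."""
    if isinstance(node, dict):
        for key, child in node.items():
            # The first level below the column uses ':', deeper levels use '.'.
            yield from variant_paths(f"{prefix}{sep}{key}", child, ".")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from variant_paths(f"{prefix}[{index}]", child, ".")
    else:
        yield prefix  # scalar leaf: emit the accumulated path


# Hypothetical document stored in a VARIANT column named "payload".
document = {
    "customer": {"name": "Ada", "emails": ["ada@example.com"]},
    "total": 42.5,
}
paths = list(variant_paths("payload", document))
for path in paths:
    print(path)
```

Each emitted path corresponds to one attribute that a model of the VARIANT column would describe; empty objects and empty arrays produce no paths in this sketch.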

View sample documentation Snowflake Data Modeling

Learn more

Azure Synapse Analytics

Azure Synapse data modeling

Data modeling for serverless analytics

Azure Synapse Analytics is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics. It uses either serverless or provisioned resources with a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs.

Hackolade has the unique ability to model complex semi-structured objects stored in columns of the (N)VARCHAR(MAX) data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions. Hackolade was specially adapted to support the data modeling of Azure Synapse Analytics and Parallel Data Warehouse, including schemas, tables and views, plus the generation of DDL Create Table syntax.

View sample documentation

Learn more

Teradata Vantage

Big data analytics and semi-structured data in the cloud

Teradata Vantage is a suite of big data analytics solutions based on a core relational database. It allows deep analysis and manipulation of semi-structured data formats and can be deployed on multiple cloud providers, public or private, as well as on premises.

Hackolade has the unique ability to model complex semi-structured objects stored in columns of the JSON data type. The reverse-engineering function, if it detects JSON documents, will sample records and infer the schema to supplement the DDL table definitions. Hackolade was specially adapted to support the data modeling of Teradata Vantage, including schemas, tables and views, plus the generation of DDL Create Table syntax.

Learn more