Benefits of Data Modeling
In an age of data pipelines, data democratization, and self-service analytics, it becomes even more important to perform data modeling because models standardize information, enable interoperability, show intent, determine trust, and ensure proper data governance.
Businesses run on information and make decisions based on data. Data is a corporate asset, and data modeling is critical to understanding that data, its interrelationships, and its rules. Yet some people don’t understand the value that data modeling provides. Some perceive it as mere documentation, as a bottleneck to Agile development, or as too expensive to be worth it.
The buzz around the terms NoSQL, schemaless, and non-relational has further promoted the illusion of a silver bullet. But is it realistic to think that one can design an application with no structure, no schema, and no relationships? Isn’t it also ironic that schema design is one of NoSQL’s toughest challenges, triggering countless how-to videos, blogs, and books?
A data model is not just documentation, because it can be forward-engineered into a physical database. Nor is data modeling a bottleneck to application development: it has been shown time and again to accelerate development, significantly reduce maintenance, increase application quality, and lower execution risks across the enterprise. Experience has shown that relying on the intuition of software developers is neither a repeatable process nor one that ensures first-time-right success.
"In many ways, up-front data design with NoSQL databases can actually be more important than it is with traditional relational databases [...] Beyond the performance topic, NoSQL databases with flexible schema capabilities require more discipline in aligning to a common information model." Ryan Smith, Information Architect at Nike
Benefits of data modeling
Higher application quality
A data model is the equivalent of an architect’s blueprint, drawn before construction of a building starts. Data modeling is the visual expression of a development team’s understanding of the business and its rules. The data modeling process is the most effective way to gather correct and complete business data requirements and business rules, so as to ensure that the system will operate in the intended manner. The process generates more questions than any other modeling approach, leading to higher integrity and the discovery of relevant business rules. Its visual nature also facilitates communication and collaboration between business users and subject matter experts.
Quicker time to market
Thanks to proper data modeling, application developers don’t have to discover unknown requirements themselves, and can focus on developing with fewer errors and meeting their sprint commitments. This in turn leads to earlier delivery of high-quality, value-adding functionality, easier acceptance testing, and a quicker payback on development.
Lower development and maintenance costs
Data modeling catches errors and inconsistencies early in the process, when they are easy and cheap to correct. Given that bug-fixing costs rise exponentially as a project progresses, it’s always better to evaluate and think through options early, rather than after the software has been written. This holds even more in an Agile development environment, where development costs can be reduced significantly because a good data model reveals up front requirements that would otherwise be unknown or unanticipated. And with NoSQL’s flexibility, the data model can rapidly evolve in an organized manner.
Improved data quality
Data corruption and inaccurate data are even worse than application errors. A good data model defines the metadata so the data itself can be properly understood, queried, and reported on. To truly leverage the power and flexibility of NoSQL, it is still important to ensure the enforcement of domain definitions, field constraints, editing rules, and integrity of relationships. It actually turns out to be more important given that such enforcement is seldom possible at the database level, and needs to be maintained in the application code. A data model will provide the developers with a roadmap and checklist for such enforcement.
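As an illustration of enforcing field constraints in application code, here is a minimal Python sketch. The customer entity, its field names, and the FieldRule and validate helpers are hypothetical, invented for this example rather than taken from any particular tool or database driver.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """One attribute of the model: expected type, required flag, optional constraint."""
    type_: type
    required: bool = True
    check: Optional[Callable[[Any], bool]] = None


# Hypothetical "customer" entity; names and constraints are illustrative only.
CUSTOMER_MODEL: Dict[str, FieldRule] = {
    "email": FieldRule(str, check=lambda v: "@" in v),
    "status": FieldRule(str, check=lambda v: v in {"active", "suspended", "closed"}),
    "age": FieldRule(int, required=False, check=lambda v: 0 <= v <= 150),
}


def validate(doc: Dict[str, Any], model: Dict[str, FieldRule]) -> List[str]:
    """Return a list of violations; an empty list means the document conforms."""
    errors = []
    for name, rule in model.items():
        if name not in doc:
            if rule.required:
                errors.append(f"{name}: missing required field")
            continue
        value = doc[name]
        if not isinstance(value, rule.type_):
            errors.append(f"{name}: expected {rule.type_.__name__}")
        elif rule.check is not None and not rule.check(value):
            errors.append(f"{name}: constraint violated")
    # Fields the model does not know about are reported rather than silently stored.
    for name in sorted(doc.keys() - model.keys()):
        errors.append(f"{name}: not defined in the model")
    return errors


if __name__ == "__main__":
    print(validate({"email": "nope", "age": 200, "nick": "x"}, CUSTOMER_MODEL))
```

Calling validate before every write is one way to use the model as the checklist described above; a real project would more likely generate such rules from the model than write them by hand.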
Better performance
Data modeling provides DBAs with the means to understand the database and tune it for fast performance, without having to search through the code to discover the schema. Given the nature of NoSQL, the data modeling process outlines a method to start thinking in terms of queries and data representation, rather than in terms of storage.
GDPR & PII
Companies around the world need to demonstrate compliance with privacy regulations on personally identifiable information. To do so, they need to document the proper handling of the attributes concerned, and monitor daily that compliance is maintained. This monitoring becomes more of a challenge with Agile development, self-managed teams, and the dynamic schemas of NoSQL databases, but data modeling can rigorously manage this effort.
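One way to picture such monitoring is a routine that compares stored documents against a catalog of attributes flagged as PII in the data model. The sketch below is only an assumption of how this could look: the catalog contents, the dotted-path convention, and the flatten and audit helpers are all hypothetical.

```python
# Hypothetical attribute catalog derived from a data model:
# dotted attribute path -> handling metadata.
PII_CATALOG = {
    "customer.email": {"pii": True, "handling": "encrypt at rest"},
    "customer.name": {"pii": True, "handling": "mask in reports"},
    "customer.loyalty_tier": {"pii": False, "handling": None},
}


def flatten(doc, prefix=""):
    """Yield the dotted path of every leaf field in a nested document."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path


def audit(doc, catalog):
    """Return (documented PII fields, fields missing from the catalog)."""
    paths = sorted(flatten(doc))
    pii = [p for p in paths if catalog.get(p, {}).get("pii")]
    undocumented = [p for p in paths if p not in catalog]
    return pii, undocumented


if __name__ == "__main__":
    sample = {"customer": {"email": "a@b.c", "loyalty_tier": "gold", "phone": "555"}}
    print(audit(sample, PII_CATALOG))
```

Fields that appear in the data but not in the catalog are the interesting output here, since a dynamic schema lets them slip in without anyone documenting how they should be handled.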
Business intelligence
What’s the use of possessing a great deal of data, only to have no efficient way – or no way at all – to use it? In other words: how can anyone effectively query Big Data without knowing what is in it, or how it is structured? A good data model, built on query and reporting requirements, is a starting point for data mining, which can then spot trends and patterns and make predictions to help a business navigate challenges and opportunities.
Documentation and knowledge transfer
Data modeling provides documentation to facilitate communication between business stakeholders and technical experts, using a common vocabulary and a business domain glossary. A data model is effective at expressing abstractions in a clear and succinct manner, and it serves as a training aid through staff turnover.
Enhanced integration
When all corporate applications are modeled, the resulting meta repository provides a common vocabulary, identifies relationships and redundancies, and resolves discrepancies so that disparate systems are well integrated.
Components of an effective data model
Entity-Relationship diagram
An ER diagram is the blueprint of an application’s foundations, showing a map of the data. When designing a data model for NoSQL, it is important to think in terms of queries and of how data is represented in the application screens. Application scale and performance prescribe embedding and denormalization, which are illustrated in an ER diagram adapted to display JSON object nesting.
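To make embedding and denormalization concrete, the following Python sketch turns three normalized record sets into a single nested, JSON-style document shaped for an "order details" screen. The entities, field names, and sample values are hypothetical and chosen only for illustration.

```python
import json

# Normalized, relational-style records (hypothetical sample data).
orders = [{"order_id": 1, "customer_id": 10}]
customers = [{"customer_id": 10, "name": "Ada"}]
order_lines = [
    {"order_id": 1, "sku": "A-1", "qty": 2},
    {"order_id": 1, "sku": "B-7", "qty": 1},
]


def embed_order(order_id):
    """Build one nested document that answers the 'show order' query in a single read."""
    order = next(o for o in orders if o["order_id"] == order_id)
    customer = next(c for c in customers if c["customer_id"] == order["customer_id"])
    return {
        "_id": order["order_id"],
        # Denormalization: the customer name is copied into the order document.
        "customer": {"id": customer["customer_id"], "name": customer["name"]},
        # Embedding: the order lines become a nested array.
        "lines": [
            {"sku": line["sku"], "qty": line["qty"]}
            for line in order_lines
            if line["order_id"] == order_id
        ],
    }


if __name__ == "__main__":
    print(json.dumps(embed_order(1), indent=2))
```

The trade-off is that the copied customer name must be kept in sync if it changes, which is exactly the kind of decision a model with visible nesting helps a team discuss.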
Object metadata
Object properties and constraints are defined for all entities, attributes, identifiers, and relationships describing the data model. Detailed descriptions and a log of team comments are gathered iteratively while the model is built incrementally in an easily understood hierarchical schema view.
Business rules
NoSQL databases don’t typically feature referential integrity or validation rules, the theory being that these are supposed to be maintained in the application code. But with a complex application developed by a large team, it becomes a challenge to ensure that data is stored consistently. A data model limits inconsistencies and inaccuracies in the data, thereby increasing its value when it comes time to mine and report on it.
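Maintaining referential integrity in application code can be as simple as checking that every reference points at an existing record. The Python sketch below assumes in-memory lists of dictionaries and a hypothetical find_orphans helper; a real application would run the equivalent check against its database.

```python
def find_orphans(children, parents, foreign_key, primary_key):
    """Return child records whose foreign key matches no parent's primary key."""
    known = {parent[primary_key] for parent in parents}
    return [child for child in children if child.get(foreign_key) not in known]


if __name__ == "__main__":
    # Hypothetical sample data: the second order references a missing customer.
    customers = [{"customer_id": 10}, {"customer_id": 11}]
    orders = [
        {"order_id": 1, "customer_id": 10},
        {"order_id": 2, "customer_id": 99},
    ]
    print(find_orphans(orders, customers, "customer_id", "customer_id"))
```

The data model is what tells the team which pairs of collections and keys need such a check in the first place.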
ROI calculator
ROI (Return On Investment) is a widely used measure to compare the effectiveness of IT projects and investments. The basic ROI calculation is to divide the net return from an investment by the cost of the investment, and express the result as a percentage.
The ROI formula is: ROI % = (Benefit due to Data Modeling - Cost of Investment) / Cost of Investment × 100
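The formula translates directly into a few lines of Python. The function name roi_percent and the figures in the usage lines are made up for illustration; they are not measured results.

```python
def roi_percent(benefit, cost):
    """ROI % = (benefit - cost) / cost * 100, with both amounts in the same currency."""
    if cost <= 0:
        raise ValueError("cost of investment must be positive")
    return (benefit - cost) / cost * 100


if __name__ == "__main__":
    # Hypothetical figures: 150,000 of benefit for a 50,000 investment.
    print(roi_percent(150_000, 50_000))  # 200.0
```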
An alternative method is to calculate the Payback Period, the length of time it takes for the cumulative gains from an investment to equal its cumulative costs. In other words, how long does it take for an investment to pay for itself?
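A payback calculation can be sketched as follows, under the simplifying assumption that the whole cost is paid up front and gains arrive once per period. The payback_period name and the sample amounts are hypothetical.

```python
def payback_period(initial_cost, gains_per_period):
    """Return the first period (1-based) in which cumulative gains cover the cost.

    Returns None if the investment does not pay for itself within the given periods.
    """
    cumulative = 0.0
    for period, gain in enumerate(gains_per_period, start=1):
        cumulative += gain
        if cumulative >= initial_cost:
            return period
    return None


if __name__ == "__main__":
    # Hypothetical figures: a 50,000 investment returning 20,000 per year.
    print(payback_period(50_000, [20_000, 20_000, 20_000]))  # 3
```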
You may also use NPV (Net Present Value) to represent the return a project will make at a specified discount rate. Or you may calculate the IRR (Internal Rate of Return) to show the yearly return percentage of the investment.
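Both measures can be sketched in plain Python. The version below assumes one cash flow per year, with the initial investment as a negative amount at year 0, and finds the IRR by simple bisection. The function names, search bounds, and sample cash flows are assumptions made for this example.

```python
def npv(rate, cash_flows):
    """Net present value of yearly cash flows; cash_flows[0] occurs today (year 0)."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """Internal rate of return: the discount rate at which NPV is zero.

    Uses bisection and requires NPV to change sign between `low` and `high`.
    """
    f_low = npv(low, cash_flows)
    f_high = npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign in the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid  # root lies in [low, mid]
        else:
            low, f_low = mid, f_mid  # root lies in [mid, high]
    return (low + high) / 2


if __name__ == "__main__":
    flows = [-50_000, 20_000, 20_000, 20_000]  # hypothetical project
    print(round(npv(0.08, flows), 2))
    print(round(irr(flows) * 100, 2), "% per year")
```

Bisection is used only because it is short and needs no external library; a spreadsheet or a finance package would give the same figures.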
No project or approach has an automatic right to approval or budget. Decisions to invest in an IT methodology or software have to compete with all other business needs and initiatives. Make sure to use the arguments and demonstration best suited to your company.