Benefits of data modeling apply to NoSQL and Agile
The benefits of data modeling also apply to NoSQL and Agile development
"In many ways, up-front data design with NoSQL databases can actually be more important than it is with traditional relational databases... Beyond the performance topic, NoSQL databases with flexible schema capabilities similarly require more discipline in aligning to a common information model... The flexible schema is a great innovation for quick evolution of your data model, and yet it requires discipline to harvest the benefits without experiencing major data quality issues and frustrations as a result.“ Ryan Smith, Information Architect at Nike
- Higher quality: just as architects consider blueprints before constructing a building, you should consider data modeling before building an app. On average, about 70 percent of software development efforts fail, and a major source of failure is premature coding. A data model helps define the problem, enabling you to consider different approaches and choose the best one.
- Reduced cost: you can build applications at lower cost via data models. Data modeling typically consumes less than 10 percent of a project budget, and can reduce the 70 percent of budget that is typically devoted to programming. Data modeling catches errors and oversights early, when they are easy to fix. This is better than fixing errors once the software has been written or – worse yet – is in customer hands.
- Quicker time to market: you can also build software faster by catching errors early. In addition, a data model can automate some tasks – design tools can take a model as an input and generate the initial database structure, as well as some data access code.
- Clearer scope: a data model provides a focus for determining scope. It provides something tangible to help business sponsors and developers agree on precisely what is included in the software and what is omitted. Business staff can see what the developers are building and compare it with their understanding. Models promote consensus among developers, customers and other stakeholders. A data model also promotes agreement on vocabulary and jargon. The model highlights the chosen terms so that they can be driven forward into software artifacts. The resulting software becomes easier to maintain and extend.
- Faster performance: a sound model simplifies database tuning. A well-constructed database typically runs fast, often quicker than expected. To achieve optimal performance, the concepts in a data model must be crisp and coherent, and the model should cover denormalization, indexing, and sharding strategies. Modeling provides a means to understand a database so that you can tune it for fast performance without having to search through the code to discover the schema.
- Better documentation and knowledge transfer: models document important concepts and jargon, providing a basis for long-term maintenance. The documentation will serve you well through staff turnover. As a training aid, a data dictionary built from a well-executed data modeling exercise can be irreplaceable.
- Fewer application errors: a data model causes participants to crisply define concepts and resolve confusion. As a result, application development starts with a clear vision. Developers can still make detailed errors as they write application code, but they are less likely to make deep errors that are difficult to resolve.
- Fewer data errors: data errors are worse than application errors. It is one thing to have an application crash, necessitating a restart. It is another thing to corrupt data in a large database. A data model not only improves the conceptual quality of an application, it also lets you leverage database features that improve data quality.
- Understanding the business: the process of data modeling requires you and your teams to understand the details of how the business works in order to define the data that drives it. In order to build a customer database, for instance, you need to understand what data is gathered on customers and how it is used. The data and relationships represented in a data model provide a foundation on which to build an understanding of business processes.
- Manage data as a resource: without a good data model, you can find yourself in possession of a great deal of data, with no efficient way – or no way at all – to make use of it. With a good data model and well-designed database, business users can have access to information that – perhaps – they didn’t even realize was being collected.
- Integrate existing information systems: many businesses find themselves in the position of having data in a variety of systems that do not communicate with each other. By modeling the data in each of these systems, you can see relationships and redundancies, resolve discrepancies, and integrate disparate systems so they can work together.
- Sources: http://www.dataversity.net/data-models-many-benefits-10/ and https://datafloq.com/read/6-benefits-data-modeling-in-the-age-of-big-data/1479
- "Data modeling is still a priority" http://www.dataversity.net/data-modeling-age-nosql-big-data/
- "The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the database engine, and the data retrieval patterns. The key decision in designing data models for MongoDB applications revolves around the structure of documents and how the application represents relationships between data." https://docs.mongodb.com/v3.2/core/data-modeling-introduction/
- "Picking the right data model is the hardest part of using Cassandra." http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling
- "Flexible documents make schema evolution easier, but it’s still necessary to design the JSON for best performance and scalability." http://developer.couchbase.com/documentation/server/current/data-modeling/concepts-data-modeling-intro.html
- “There is no longer a question of whether NoSQL data stores are being used – they are at an ever growing rate. There is also not a question of whether the data stored within them still needs to be modeled. A data model of some kind is vital: definitions, elements, transformations, and relationships still need to be understood. There is no way for an organization to gain a 360° view of their entire business at all data, system, application, decision process, transaction, and customer levels without some sort of models or maps to explain them.” http://whitepapers.dataversity.net/content50274
- "Even when not required, it is often highly desirable to enforce conformance to some schema for some or all of the data being stored, in order to make it more likely that only valid data is stored, and to give guarantees to application code that a certain degree of sanity is present in the data." Ted Hills, author of "NoSQL and SQL Data Modeling"
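The schema-conformance point in the last quotation can be illustrated with a minimal sketch. It assumes a hypothetical customer document: the `Field` class, `CUSTOMER_MODEL`, and `validate` are illustrative names invented for this example, not part of any database's API. Some document stores offer their own validation features, and those would normally be preferred over hand-written checks like this one.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Field:
    """One attribute of a (hypothetical) document model."""
    name: str
    type_: type
    required: bool = True


# Hypothetical model for a "customer" document; field names are illustrative.
CUSTOMER_MODEL = [
    Field("customer_id", str),
    Field("email", str),
    Field("loyalty_points", int, required=False),
]


def validate(document: dict[str, Any], model: list[Field]) -> list[str]:
    """Return a list of conformance problems; an empty list means the document conforms."""
    problems = []
    known = {field.name for field in model}

    # Check each modeled field for presence and type.
    for field in model:
        if field.name not in document:
            if field.required:
                problems.append(f"missing required field: {field.name}")
        elif not isinstance(document[field.name], field.type_):
            problems.append(
                f"wrong type for {field.name}: expected {field.type_.__name__}"
            )

    # Flag anything the model does not know about (often a misspelled key).
    for key in document:
        if key not in known:
            problems.append(f"unexpected field: {key}")

    return problems


good = {"customer_id": "c-001", "email": "ada@example.com"}
bad = {"customer_id": 17, "emial": "ada@example.com"}

print(validate(good, CUSTOMER_MODEL))
print(validate(bad, CUSTOMER_MODEL))
```

The first document conforms, so an empty list is printed. The second yields three problems: a wrong type for `customer_id`, a missing `email`, and an unexpected `emial` key. A flexible-schema store would accept that misspelled key silently unless something checks documents against the model.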
Insights into modeling NoSQL - a Dataversity report
"Programmers and developers who are creating applications that pull data from a specific data store – whether its Key-Value, Wide Column, Document, Graph, Relational, or some hybrid thereof – still have to understand what is going on with the data inside that system. There needs to be some type of “model” or “map” so that the data stream from ingestion through analysis is understood at every step. Without such an understanding, all that exists is Big Data chaos [...]
Just because NoSQL systems have flexible data types and differing schemas, it doesn’t mean that everyone working with the data – from data modelers to business analysts to data stewards to DBA’s to end users to executives – doesn’t need to understand the data. They do now more than ever [...]
The need to capture, describe, define, and clearly explain the vast store of data elements within an enterprise’s data architecture is vital. That is not going to change. It is only the techniques and methodologies that are changing [...]
As NoSQL spreads from technology startups to large enterprises with demanding data establishments, the requirements of enterprise users must be given due attention. In order to be accepted and successful beyond just working applications, enterprise users need to see their concerns addressed: modeling, governance, documentation, and exemplary tooling [...]
Clearly, there is a need for a new generation of data modelers: modelers who understand the specifics of the data store and are able to provide the level of decision making and documentation that is needed for an enterprise project [...]
One of the most prominent criticisms against implementing any kind of NoSQL solution is the lack of sufficient tools and functionality that enterprises have relied upon for decades. Added to this issue is the complexity of so many different systems and their subsequent modeling challenges. The traditional mold has been unraveled. Trying to find adequate and understandable options without rewriting the entire book on Data Management (and Data Modeling) is a considerable challenge for many organizations."