Data architecture principles to help drive your modern data strategy!
We all know Ted, or a Ted-type architect. Ted believes deeply in technical excellence, and he sees the principles and policies that govern Data Architecture as commandments. He adheres to firm rules for collecting, integrating, using, and managing data assets at scale in the enterprise. Below are Ted’s six "commandments" regarding Data Architecture.
1. Thou shall make data accessible
Ted's belief is that data should be easily accessible to users so that they can efficiently perform their business functions. In our discussion on "Data Management Best Practices," we highlighted the importance of making data widely available for quick and efficient decision-making, as well as leveraging it as a catalyst for a successful digital strategy by unlocking its true value.
Of course, questions arise about which data should be made available to whom in the given environment. Additionally, there is a need for training on how to use the data, governance, disclosure, and other related components.
2. Honor thy data as an asset
Data as an asset is topping the hype cycle. Regardless, data is a valuable asset that ought to be managed accordingly: it has real, measurable value and carries equal risk if disclosed inappropriately.
The purpose of data has always been to aid decision-making, but it was often only historical in nature. Now, with Machine Learning and Artificial Intelligence, timely data can also be predictive, leading not just to timely decisions but to competitive advantages as well. Thus, like any other corporate asset, data must be well managed: we need to know not just where our data is located, but also how accurate it is and how best to access it when needed.
What is the relationship between the value of data, how it can be accessed and how it can be shared? Do we need data stewards? Do we need to update our processes? All good questions to ask.
3. Data shall be defined
In order to promote the sharing of data, a common vocabulary needs to be adopted throughout the enterprise. This simply means that data is defined consistently throughout the enterprise and that the definitions are well socialized and easily digestible to the data consumers.
From a development perspective, a common vocabulary will facilitate effective dialog between technology and the business. This raises questions about data administration, vocabulary definitions, handling ambiguity, data standardization, and more.
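One way to picture a common vocabulary is as a small business glossary that maps every label people use onto a single agreed definition. The sketch below is only an illustration under stated assumptions: the `Term` and `Glossary` names, the `steward` field, and the sample terms are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Term:
    """One agreed business definition (hypothetical structure)."""
    name: str                       # canonical name used across the enterprise
    definition: str                 # plain-language meaning for data consumers
    steward: str                    # who is accountable for the definition
    synonyms: Tuple[str, ...] = ()  # other labels that mean the same thing


class Glossary:
    """Maps every known label to exactly one canonical term."""

    def __init__(self) -> None:
        self._terms: Dict[str, Term] = {}
        self._aliases: Dict[str, str] = {}

    def register(self, term: Term) -> None:
        labels = [label.strip().lower() for label in (term.name, *term.synonyms)]
        # Reject ambiguity: a label may not point at two different terms.
        for label in labels:
            owner = self._aliases.get(label)
            if owner is not None and owner != term.name:
                raise ValueError(f"'{label}' is already defined as '{owner}'")
        for label in labels:
            self._aliases[label] = term.name
        self._terms[term.name] = term

    def resolve(self, label: str) -> Term:
        """Return the canonical term for any registered label."""
        canonical = self._aliases.get(label.strip().lower())
        if canonical is None:
            raise KeyError(f"'{label}' is not in the glossary")
        return self._terms[canonical]


glossary = Glossary()
glossary.register(Term(
    name="customer",
    definition="A party that has purchased at least one product.",
    steward="Sales Operations",
    synonyms=("client", "account holder"),
))
print(glossary.resolve("Client").name)
```

Rejecting a label that already belongs to another term is a deliberate choice here: it forces the ambiguity to be settled when the definition is registered rather than discovered later in a report.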
4. Data shall be governed
Data governance should be a consideration in the design of any modern data architecture. But who exactly dictates what data can be collected and how it will be collected? Data governance requires accountability. A data steward role can provide oversight with an eye toward standardization and definition, while a data trustee ensures data quality and fitness for purpose; a trustee’s responsibilities are narrower, focusing on the accuracy and currency of the data.
For the modern enterprise, the implication is understanding these differences and navigating the culture change from data ownership to data trusteeship.
5. Data shall be managed
All the business units within an enterprise take part in the information management decision-making process that supports the objectives of the business. In this sense, the management of data becomes every employee's responsibility; employees are the end users, or consumers, of the data and systems that support the business. The implication for traditional IT is that it is not enough to speak to a limited number of stakeholders: all business units within the organization must have a seat at the table for decisions about the underlying information systems or environment.
How to engage experts and technicians from across the enterprise to come together in support of the information environment, and work together as a cohesive team, can be a challenge.
6. Thy data shall be secured
This should have been the first of Ted’s commandments given how widespread and commonplace data breaches are today. Despite our best efforts, the single greatest vulnerability in our information environments is people. From a data perspective, we want to ensure that data is protected from unauthorized access and, ultimately, disclosure; this includes personal and non-personal information such as pre-decisional, sensitive, source-selection-sensitive, and proprietary information.
Security is complex, and it must be balanced so that it is not so cumbersome that it interferes with users’ ability to carry out their job functions. Aggregating data in a data lake, while great for data scientists, creates a rich target for hackers. Are your controls sufficient? Do you have separate systems for different data classifications, and how are they secured? Are you managing at the data level, rather than the application level? Are you designing security into data elements from the beginning?
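To make "managing at the data level" a little more concrete, here is a minimal sketch in which each data element carries its own classification and a record is filtered by the reader's clearance before it is returned. Everything in it is an assumption made for illustration: the four classification levels, the field names, and the `visible_fields` helper are hypothetical, and a real deployment would rely on the access controls of its own platform.

```python
from enum import IntEnum
from typing import Any, Dict


class Classification(IntEnum):
    """Hypothetical classification levels, lowest to highest."""
    PUBLIC = 0
    INTERNAL = 1
    SENSITIVE = 2
    RESTRICTED = 3


# Hypothetical labels attached to individual data elements, not to an application.
FIELD_LABELS: Dict[str, Classification] = {
    "region": Classification.PUBLIC,
    "customer_id": Classification.INTERNAL,
    "email": Classification.SENSITIVE,
    "tax_id": Classification.RESTRICTED,
}


def visible_fields(
    record: Dict[str, Any],
    clearance: Classification,
    labels: Dict[str, Classification] = FIELD_LABELS,
) -> Dict[str, Any]:
    """Return only the fields the given clearance is allowed to see.

    Unlabeled fields are treated as the highest classification, so a new
    column stays hidden until someone classifies it (default deny).
    """
    highest = max(Classification)
    return {
        name: value
        for name, value in record.items()
        if labels.get(name, highest) <= clearance
    }


record = {
    "region": "EMEA",
    "customer_id": 17,
    "email": "pat@example.com",
    "tax_id": "000-00-0000",
    "notes": "not yet classified",
}
print(visible_fields(record, Classification.INTERNAL))
```

Treating unlabeled fields as restricted is one way to design security into data elements from the beginning: the safe behavior is what happens when nobody has made a decision yet.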