Sparx Systems Forum

Enterprise Architect => Suggestions and Requests => Topic started by: Mats Gejnevall on April 05, 2018, 05:06:03 pm

Title: Master data models from information flows
Post by: Mats Gejnevall on April 05, 2018, 05:06:03 pm: In our architecture we model information flows. Now we want to create models and analyze where information is created (mastered) and where it is used.
That can be done by hand by looking at our information flow models and creating new models depicting this (or creating them in the relationship matrix).
That is a lot of work!
Is there a way to “automatically” create those models/matrices from the information flows? In most cases the direction of an information flow indicates the creator (master) and the user of the information.

Thanks for any suggestion :)
Mats
Title: Re: Master data models from information flows
Post by: Uffe on April 05, 2018, 06:40:58 pm: Hej!

I'm not quite sure what you're asking here. You say you've got the information flows modelled, but you want them in a model?

As for relationship matrices, they're on-the-fly views, not something that gets stored in the project. You can create a matrix profile, but that's it. There's no API to manage matrix profiles, so building something that'll create profiles based on a model would require a bit of digging in the data model.

Also, since a relationship matrix only ever shows one link in a chain, it's not a good tool for information flow analysis. Unless all your information flows only ever stretch from one source to one sink, in which case the analysis is trivial anyway.
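
To illustrate the "one link in a chain" point, here is a minimal Python sketch that follows a chain of flows end to end, which a single matrix cell cannot show. It assumes the flows have already been exported from the model as plain (source, target) pairs; the names `FLOWS` and `downstream` and the sample applications are hypothetical, and nothing here uses the EA API.

```python
from collections import defaultdict

# Hypothetical export: (source, target) pairs for flows conveying one item.
FLOWS = [("Sensor", "Gateway"), ("Gateway", "Planner"), ("Planner", "Archive")]

def downstream(flows, start):
    """Return every element reachable from `start` by following flows forward."""
    graph = defaultdict(list)
    for source, target in flows:
        graph[source].append(target)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:  # guard against cycles in the flow graph
                seen.add(nxt)
                stack.append(nxt)
    return seen

print(sorted(downstream(FLOWS, "Sensor")))
```

A matrix would show Sensor to Gateway and Gateway to Planner as separate cells; the traversal is what ties them into one chain.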

/Uffe
Title: Re: Master data models from information flows
Post by: Glassboy on April 06, 2018, 07:53:59 am: Quote from: Mats Gejnevall on April 05, 2018, 05:06:03 pm
In our architecture we model information flows. Now we want to create models and analyze where information is created (mastered) and where it is used.
That can be done by hand by looking at our information flow models and creating new models depicting this (or creating them in the relationship matrix).
That is a lot of work!
Is there a way to “automatically” create those models/matrices from the information flows? In most cases the direction of an information flow indicates the creator (master) and the user of the information.

Interesting idea, but what comes immediately to mind is that the data that makes up your information is likely to be captured or created in a multitude of locations that may not match the system that is authoritative for the information.
Title: Re: Master data models from information flows
Post by: Nizam on April 06, 2018, 08:36:06 am: You can generate a report based on the connector direction; a simple script can iterate through the Data Entities and print the Source and Target.
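
As a rough idea of what such a report could look like, here is a minimal Python sketch. It does not use the EA scripting API; it assumes the flows have been exported as (data entity, source, target) triples, and the names `FLOWS` and `creators_and_users` plus the sample data are hypothetical. It also takes the flow direction at face value (source creates, target uses), which is only a first approximation.

```python
from collections import defaultdict

# Hypothetical export of information flows: (conveyed data entity, source, target).
FLOWS = [
    ("Customer", "CRM", "Billing"),
    ("Customer", "CRM", "Warehouse"),
    ("Invoice", "Billing", "Ledger"),
]

def creators_and_users(flows):
    """Group flow ends per data entity: sources as creators, targets as users."""
    report = defaultdict(lambda: {"creators": set(), "users": set()})
    for entity, source, target in flows:
        report[entity]["creators"].add(source)
        report[entity]["users"].add(target)
    return dict(report)

for entity, roles in sorted(creators_and_users(FLOWS).items()):
    print(entity, "created by", sorted(roles["creators"]),
          "used by", sorted(roles["users"]))
```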

But if you want to assert which is your authoritative master, copy, etc., you might need to consider a slightly more detailed modeling option.

We use an extended information flow with a tagged value indicating if the Data Entity is a Master, Store, Source of Truth, etc. If this information is captured at the information flow level, you can easily extract it as a matrix.
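
A minimal Python sketch of the matrix extraction, assuming the tagged values have already been read out of the model as (data entity, application, role) triples. The names `TAGGED` and `role_matrix` and the sample roles are hypothetical, and how the tagged value is read from each flow is left out.

```python
# Hypothetical flows carrying a role tag: (data entity, application, role).
TAGGED = [
    ("Customer", "CRM", "Master"),
    ("Customer", "Billing", "Store"),
    ("Invoice", "Billing", "Master"),
]

def role_matrix(tagged):
    """Build rows (applications) x columns (data entities), role in each cell."""
    apps = sorted({app for _, app, _ in tagged})
    entities = sorted({entity for entity, _, _ in tagged})
    cells = {(app, entity): role for entity, app, role in tagged}
    rows = [[app] + [cells.get((app, e), "") for e in entities] for app in apps]
    return [[""] + entities] + rows  # first row is the header

for row in role_matrix(TAGGED):
    print("\t".join(row))
```

The tab-separated output can be pasted into a spreadsheet; an empty cell means no tagged flow links that application and entity.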
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 06, 2018, 09:58:22 am: Hi Mats,

If I understand you correctly, there are a couple of problems with what you are attempting to do. The first is conceptual. I hope you agree that the real Master data is the attribute or property, not the object or entity. "Objects" are "Master Data Objects" because they include one or more Master Data Items (attributes/properties/features). If you don't agree, good luck ;).

Each Master Data Item has a lifetime. Different sources can create/modify the value of the item at different points during its lifetime. Indeed, at the same point, multiple sources may change the value.

Normally, information flows document flows between objects/entities not between the attributes. There are ways around it, but you will need to decide what you are tracking.

Normally, you need to track which systems/business processes access the item and in what manner (CRUD). This suggests that an ArchiMate "Accesses" relationship may be more appropriate.

We have implemented a "source of Truth" relationship which is (effectively) derived from the accesses and indicates that the supplier can be considered a SoT for the related attribute.
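
One way such a derivation might be sketched in Python, purely as an illustration: the rule used here (a system that creates or updates an attribute is a candidate source) is an assumption for the example, not the actual rule behind the relationship described above. The names `ACCESSES` and `candidate_sources` and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical access records: (system, attribute, CRUD letters it performs).
ACCESSES = [
    ("CRM", "Customer.name", "CRU"),
    ("Billing", "Customer.name", "R"),
    ("Billing", "Invoice.total", "CU"),
    ("Ledger", "Invoice.total", "R"),
]

def candidate_sources(accesses, writing="CU"):
    """Per attribute, collect systems whose access includes a `writing` mode.

    Assumed rule: creating or updating makes a system a candidate source.
    """
    result = defaultdict(set)
    for system, attribute, modes in accesses:
        if set(modes) & set(writing):
            result[attribute].add(system)
    return dict(result)

print(candidate_sources(ACCESSES))
```

Because the records are kept per attribute rather than per entity, two systems writing different attributes of the same object do not collide.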

HTH,
Paolo
Title: Re: Master data models from information flows
Post by: Richard Freggi on April 06, 2018, 11:25:24 am: Sorry Mats I don't think you can be successful in this one (I'd be interested to know how it works out for you...)
Some considerations:
1. Data does not 'flow', it is queried. Data flows are a hang-on from 1970s mainframe philosophy that stuck around because people liked the idea although it only has very limited usefulness (a bit like flowcharts). I don't know if an architecture based on information flows can be efficient/effective/flexible (I'd be interested to learn more about this)
2. Since data is queried / provided by each table/class, each query typically contains a mix of different attributes from different classes / tables / whathaveyou + the query logic. These are called messages (hello sequence diagram!!!)
3. Can we reconstruct a data model from a sequence diagram? Yes with some effort and some modeler judgement / experience / assumptions, as long as the messages are between participants (I think a Data Flow Diagram maps poorly on to an interaction/collaboration diagram)

TL;DR: There be dragons where you are going, methinks. DAMA website has some good resources about data architecture.
Title: Re: Master data models from information flows
Post by: Glassboy on April 06, 2018, 11:54:13 am: Quote from: Richard Freggi on April 06, 2018, 11:25:24 am
1. Data does not 'flow', it is queried. Data flows are a hang-on from 1970s mainframe philosophy that stuck around because people liked the idea although it only has very limited usefulness (a bit like flowcharts). I don't know if an architecture based on information flows can be efficient/effective/flexible (I'd be

Rubbish. I've documented thousands of flat files transferred between systems in one organisation alone. There was nothing that remotely resembled a structured query anywhere. It was a wholesale flow of data between systems that was then used to create information in those systems.

It's a very common occurrence.
Title: Re: Master data models from information flows
Post by: KP on April 06, 2018, 04:26:21 pm: Quote from: Richard Freggi on April 06, 2018, 11:25:24 am
Sorry Paolo I don't think you can be successful in this one

Did you mean to address Mats, not Paolo?
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 06, 2018, 04:40:29 pm: Quote from: KP on April 06, 2018, 04:26:21 pm
Quote from: Richard Freggi on April 06, 2018, 11:25:24 am
Sorry Paolo I don't think you can be successful in this one

Did you mean to address Mats, not Paolo?
I hope so because Richard seems to be saying the same as me (AFAICT). Although I agree with Glassboy that messages are flows of data between entities.

I kept going round and round in circles (especially with my business users) until I realised that it was the attributes and not the entities that were "Master Data". After that, we were even able to resolve the "that's MY data" problem.

I personally think that the concept of "Source of Truth" is misleading⁽¹⁾. The implication is that there is one source of absolute truth. Neither of which is correct.

Paolo
⁽¹⁾ And if you try to find a consistent definition on the interweb, you're unlucky.
Title: Re: Master data models from information flows
Post by: Richard Freggi on April 06, 2018, 11:23:13 pm: Yes Paolo I got the names mixed up. Gotta stop posting before my 2nd cup of coffee in the morning. Original post edited with correct name (Mats).
Title: Re: Master data models from information flows
Post by: Glassboy on April 09, 2018, 08:42:44 am: Quote from: Paolo F Cantoni on April 06, 2018, 04:40:29 pm
I personally think that the concept of "Source of Truth" is misleading⁽¹⁾. The implication is that there is one source of absolute truth. Neither of which is correct.

It's just a symptom of people mixing data and information and not understanding the context of either. There generally is only one source for each context. There's a reason that Zachman had a list of "what" at the contextual level. You need to create the list before you start looking at the lower levels; but people never do.
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 09, 2018, 10:17:25 am: Quote from: Glassboy on April 09, 2018, 08:42:44 am
Quote from: Paolo F Cantoni on April 06, 2018, 04:40:29 pm
I personally think that the concept of "Source of Truth" is misleading⁽¹⁾. The implication is that there is one source of absolute truth. Neither of which is correct.

It's just a symptom of people mixing data and information and not understanding the context of either. There generally is only one source for each context. There's a reason that Zachman had a list of "what" at the contextual level. You need to create the list before you start looking at the lower levels; but people never do.
"System of Record" is another misunderstood term. People often conflate it with "source of truth" and even "system of authorship", which may or may not (and often aren't) Systems of Record.

Paolo
Title: Re: Master data models from information flows
Post by: Glassboy on April 09, 2018, 11:12:26 am: Quote from: Paolo F Cantoni on April 09, 2018, 10:17:25 am
"System of Record" is another misunderstood term. People often conflate it with "source of truth" and even "system of authorship", which may or may not (and often aren't) Systems of Record.

A system of record is there to meet the requirements of a "why". If you don't know the "why" - the contractual or legal obligation - there is no "record".
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 09, 2018, 01:50:51 pm: Quote from: Glassboy on April 09, 2018, 11:12:26 am
Quote from: Paolo F Cantoni on April 09, 2018, 10:17:25 am
"System of Record" is another misunderstood term. People often conflate it with "source of truth" and even "system of authorship", which may or may not (and often aren't) Systems of Record.

A system of record is there to meet the requirements of a "why". If you don't know the "why" - the contractual or legal obligation - there is no "record".
Can you expand on that? I've not heard it that way before.

Paolo
Title: Re: Master data models from information flows
Post by: Glassboy on April 10, 2018, 08:22:24 am: Quote from: Paolo F Cantoni on April 09, 2018, 01:50:51 pm
Quote from: Glassboy on April 09, 2018, 11:12:26 am
A system of record is there to meet the requirements of a "why". If you don't know the "why" - the contractual or legal obligation - there is no "record".
Can you expand on that? I've not heard it that way before.

There are two sorts of things organisations do. Things they want to do and things they have to do. When you have a good look at what a system of record is doing (in my experience) it is capturing data about entities or events that relate to something the organisation has to do.

For example (if you are using ArchiMate) at the motivation level you should have a Stakeholder and a Driver, such as "NZ Police" and "Comply with suspicious transaction reporting requirements of the Anti-Money Laundering and Countering Financing of Terrorism Act 2009". There should also be a Goal along the lines of not triggering the punitive damages associated with not meeting the obligations. These motivational elements will all connect somehow to a system of record for transactions. In an industry like banking this system probably predates business analysts and architects fucking things up. At some stage someone probably trained in systems analysis laid all the groundwork for a mature and robust data model for transactions.

Where we run into trouble is when legislation changes or a new concept is introduced and the design doesn't start at the conceptual layer. You don't know why you are making a change beyond what is in the project scope document. You end up with a system that records things, but not the record you need to meet the obligation. Or a very fragile record.
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 10, 2018, 10:02:44 am: Quote from: Glassboy on April 10, 2018, 08:22:24 am
Quote from: Paolo F Cantoni on April 09, 2018, 01:50:51 pm
Quote from: Glassboy on April 09, 2018, 11:12:26 am
A system of record is there to meet the requirements of a "why". If you don't know the "why" - the contractual or legal obligation - there is no "record".
Can you expand on that? I've not heard it that way before.

There are two sorts of things organisations do. Things they want to do and things they have to do. When you have a good look at what a system of record is doing (in my experience) it is capturing data about entities or events that relate to something the organisation has to do.

For example (if you are using ArchiMate) at the motivation level you should have a Stakeholder and a Driver, such as "NZ Police" and "Comply with suspicious transaction reporting requirements of the Anti-Money Laundering and Countering Financing of Terrorism Act 2009". There should also be a Goal along the lines of not triggering the punitive damages associated with not meeting the obligations. These motivational elements will all connect somehow to a system of record for transactions. In an industry like banking this system probably predates business analysts and architects fucking things up. At some stage someone probably trained in systems analysis laid all the groundwork for a mature and robust data model for transactions.

Where we run into trouble is when legislation changes or a new concept is introduced and the design doesn't start at the conceptual layer. You don't know why you are making a change beyond what is in the project scope document. You end up with a system that records things, but not the record you need to meet the obligation. Or a very fragile record.
Thanks, Glassboy,

That's an interesting take. My definition of a system of record is somewhat simpler, but the end result, I think, is close to yours.

I believe (without actual proof - but else why coin it?) that the term "System of Record" derives from the epithet "Newspaper of Record" - such as is/was applied to the Washington Post, the Times of London etc. These newspapers are so designated because they are general purpose "and their editorial and news-gathering functions are considered comprehensive, professional and typically authoritative". In addition, should one wish to access information about a past event, one can consult their archives and determine the "facts" at that point in time. That is, they create factual records and retain them for later consultation.

From my point of view, a System of Record needs to be able to hold past data and how that data (or the understanding of that data) has evolved via any appropriate state episodes. So far, this corresponds with your "capturing data about entities or events that relate to something the organisation has to do".

Now where I think I align with your view is that as the facts to be held (one could say the "editorial and news-gathering functions") need to change because the environment or context changes and the system doesn't change accordingly, it can no longer be accorded the epithet "System of Record", since it can no longer record the necessary facts.

How's that sound? I'd like to come to a useful definition because I can then add it to our Ontological Model and use it to educate our modellers, architects and users.

Paolo
Title: Re: Master data models from information flows
Post by: Glassboy on April 10, 2018, 12:56:07 pm: Quote from: Paolo F Cantoni on April 10, 2018, 10:02:44 am
That's an interesting take. My definition of a system of record is somewhat simpler, but the end result, I think, is close to yours.

I believe (without actual proof - but else why coin it?) that the term "System of Record" derives from the epithet "Newspaper of Record" - such as is/was applied to the Washington Post, the Times of London etc. These newspapers are so designated because they are general purpose "and their editorial and news-gathering functions are considered comprehensive, professional and typically authoritative". In addition, should one wish to access information about a past event, one can consult their archives and determine the "facts" at that point in time. That is, they create factual records and retain them for later consultation.

There's a simple test for that proposition and the answer seems to be no :-) https://books.google.com/ngrams/graph?content=system+of+record%2C+newspaper+of+record&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t1%3B%2Csystem%20of%20record%3B%2Cc0%3B.t1%3B%2Cnewspaper%20of%20record%3B%2Cc0

Quote
From my point of view, a System of Record needs to be able to hold past data and how that data (or the understanding of that data) has evolved via any appropriate state episodes. So far, this corresponds with your "capturing data about entities or events that relate to something the organisation has to do".

Now where I think I align with your view is that as the facts to be held (one could say the "editorial and news-gathering functions") need to change because the environment or context changes and the system doesn't change accordingly, it can no longer be accorded the epithet "System of Record", since it can no longer record the necessary facts.

How's that sound? I'd like to come to a useful definition because I can then add it to our Ontological Model and use it to educate our modellers, architects and users.

In that it may only contain a subset or cause a perceptual problem (such as believing monotremes are no different than other mammals).
Title: Re: Master data models from information flows
Post by: Sunshine on April 10, 2018, 12:58:18 pm: FYI Gartner has something called pace layering which describes three types of system.

Systems of Record — Established packaged applications or legacy homegrown systems that support core transaction processing and manage the organization's critical master data. The rate of change is low, because the processes are well-established and common to most organizations, and often are subject to regulatory requirements.

Systems of Differentiation — Applications that enable unique company processes or industry-specific capabilities. They have a medium life cycle (one to three years), but need to be reconfigured frequently to accommodate changing business practices or customer requirements.

Systems of Innovation — New applications that are built on an ad hoc basis to address new business requirements or opportunities. These are typically short life cycle projects (zero to 12 months) using departmental or outside resources and consumer-grade technologies.

Reference: https://www.gartner.com/newsroom/id/1923014
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 10, 2018, 04:01:03 pm: Quote from: Glassboy on April 10, 2018, 12:56:07 pm
Quote from: Paolo F Cantoni on April 10, 2018, 10:02:44 am
That's an interesting take. My definition of a system of record is somewhat simpler, but the end result, I think, is close to yours.

I believe (without actual proof - but else why coin it?) that the term "System of Record" derives from the epithet "Newspaper of Record" - such as is/was applied to the Washington Post, the Times of London etc. These newspapers are so designated because they are general purpose "and their editorial and news-gathering functions are considered comprehensive, professional and typically authoritative". In addition, should one wish to access information about a past event, one can consult their archives and determine the "facts" at that point in time. That is, they create factual records and retain them for later consultation.

There's a simple test for that proposition and the answer seems to be no :-) https://books.google.com/ngrams/graph?content=system+of+record%2C+newspaper+of+record&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t1%3B%2Csystem%20of%20record%3B%2Cc0%3B.t1%3B%2Cnewspaper%20of%20record%3B%2Cc0
Interesting, but I strongly suspect that the meaning then (the 1800s) is not the meaning now (we need a system of record for the epithet, "System of Record" ;))

Seriously, though, I do suspect a change in meaning over time. Like the word "Naughty", for example.

Quote
Quote
From my point of view, a System of Record needs to be able to hold past data and how that data (or the understanding of that data) has evolved via any appropriate state episodes. So far, this corresponds with your "capturing data about entities or events that relate to something the organisation has to do".

Now where I think I align with your view is that as the facts to be held (one could say the "editorial and news-gathering functions") need to change because the environment or context changes and the system doesn't change accordingly, it can no longer be accorded the epithet "System of Record", since it can no longer record the necessary facts.

How's that sound? I'd like to come to a useful definition because I can then add it to our Ontological Model and use it to educate our modellers, architects and users.

In that it may only contain a subset or cause a perceptual problem (such as believing monotremes are no different than other mammals).
That last sentence isn't clear to me. Can you elaborate?

Paolo
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 10, 2018, 04:07:38 pm: Quote from: Sunshine on April 10, 2018, 12:58:18 pm
FYI Gartner has something called pace layering which describes three types of system.

Systems of Record — Established packaged applications or legacy homegrown systems that support core transaction processing and manage the organization's critical master data. The rate of change is low, because the processes are well-established and common to most organizations, and often are subject to regulatory requirements.

Systems of Differentiation — Applications that enable unique company processes or industry-specific capabilities. They have a medium life cycle (one to three years), but need to be reconfigured frequently to accommodate changing business practices or customer requirements.

Systems of Innovation — New applications that are built on an ad hoc basis to address new business requirements or opportunities. These are typically short life cycle projects (zero to 12 months) using departmental or outside resources and consumer-grade technologies.

Reference: https://www.gartner.com/newsroom/id/1923014
I don't think the definition of System of Record above has much to do with the "of record" part.

From the descriptions (and they ARE descriptions - since they describe some properties, but you can't use the properties to classify) it may be that they were thinking of "Recording Systems vs Differentiating Systems vs Innovating Systems", but just because you are a Recording System, it doesn't (ipso facto) make you a "System of Record". Just as "an Officer of a Statutory Entity" is not necessarily a "Statutory Officer".

Paolo
Title: Re: Master data models from information flows
Post by: Mats Gejnevall on April 10, 2018, 08:53:56 pm: Quote from: Paolo F Cantoni on April 06, 2018, 09:58:22 am
Hi Mats,

If I understand you correctly, there are a couple of problems with what you are attempting to do. The first is conceptual. I hope you agree that the real Master data is the attribute or property, not the object or entity. "Objects" are "Master Data Objects" because they include one or more Master Data Items (attributes/properties/features). If you don't agree, good luck ;).

Each Master Data Item has a lifetime. Different sources can create/modify the value of the item at different points during its lifetime. Indeed, at the same point, multiple sources may change the value.

Normally, information flows document flows between objects/entities not between the attributes. There are ways around it, but you will need to decide what you are tracking.

Normally, you need to track which systems/business processes access the item and in what manner (CRUD). This suggests that an ArchiMate "Accesses" relationship may be more appropriate.

We have implemented a "source of Truth" relationship which is (effectively) derived from the accesses and indicates that the supplier can be considered a SoT for the related attribute.

HTH,
Paolo

Thanks Paolo
Right now we have solved it by having both information flows between applications and relations between applications and information elements (create and use). But it becomes hard to maintain over time. And some information elements are created by multiple applications. We have a set of sensors that send the same type of information to a central strategic application. And the information is sent over the barebone network; no services are called (as a comment to Richard).

Then we created some relatedElement compartment shapescripts so that, for each application, we can easily see which information elements it creates and uses.

But we would like to have the possibility to automatically suggest updates to the relations between the applications and information elements (create, use) to avoid some work.

Setting an attribute on the information element does not work either because then you cannot see what application is the master, just that the information is mastered or used, NOT by whom it is created or used.
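
A minimal Python sketch of what such a suggestion step could look like, assuming both the flows and the maintained create/use relations have been exported from the model. The names `suggest_updates`, `FLOWS` and `RELATIONS` and the sample applications are hypothetical, and the rule that a flow's source creates and its target uses the element is the same simplification as in the original question. It only proposes changes; nothing is written back to the model.

```python
def suggest_updates(flows, relations):
    """Compare relations implied by the flows with the maintained ones."""
    implied = set()
    for element, source, target in flows:
        implied.add((source, element, "create"))  # assumed: source creates
        implied.add((target, element, "use"))     # assumed: target uses
    return {
        "add": sorted(implied - relations),     # implied by flows, not modelled yet
        "review": sorted(relations - implied),  # modelled, but no flow supports it
    }

# Hypothetical data: two sensors sending the same element to a central application.
FLOWS = [("Position", "SensorA", "Central"), ("Position", "SensorB", "Central")]
RELATIONS = {
    ("SensorA", "Position", "create"),
    ("Central", "Position", "use"),
    ("Legacy", "Position", "use"),
}

print(suggest_updates(FLOWS, RELATIONS))
```

Keeping the result as suggestions leaves room for the cases raised earlier in the thread, where the flow direction does not match the real master.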
Title: Re: Master data models from information flows
Post by: Glassboy on April 11, 2018, 07:29:59 am: Quote from: Sunshine on April 10, 2018, 12:58:18 pm
FYI Gartner has something called pace layering which describes three types of system.

I'd partially forgotten about that. It's more a tool for getting management to understand different systems have different life cycle management requirements. Not that I've ever really seen it work.
Title: Re: Master data models from information flows
Post by: Sunshine on April 12, 2018, 05:58:53 pm: Quote from: Paolo F Cantoni on April 10, 2018, 04:07:38 pm
I don't think the definition of System of Record above has much to do with the "of record" part.

From the descriptions (and they ARE descriptions - since they describe some properties, but you can't use the properties to classify) it may be that they were thinking of "Recording Systems vs Differentiating Systems vs Innovating Systems", but just because you are a Recording System, it doesn't (ipso facto) make you a "System of Record". Just as "an Officer of a Statutory Entity" is not necessarily a "Statutory Officer".
Agree with you 100% - was just pointing out how the term "System of Record" can be confused as major consulting firms coin phrases which muddy the water.
Title: Re: Master data models from information flows
Post by: Paolo F Cantoni on April 13, 2018, 09:36:53 am: Quote from: Sunshine on April 12, 2018, 05:58:53 pm
Quote from: Paolo F Cantoni on April 10, 2018, 04:07:38 pm
I don't think the definition of System of Record above has much to do with the "of record" part.

From the descriptions (and they ARE descriptions - since they describe some properties, but you can't use the properties to classify) it may be that they were thinking of "Recording Systems vs Differentiating Systems vs Innovating Systems", but just because you are a Recording System, it doesn't (ipso facto) make you a "System of Record". Just as "an Officer of a Statutory Entity" is not necessarily a "Statutory Officer".
Agree with you 100% - was just pointing out how the term "System of Record" can be confused as major consulting firms coin phrases which muddy the water.
(my emphasis)
Yes, I took that point, but wanted to emphasise it as I wanted to combat "Nobody got fired for buying/following <insert vendor/consultancy of choice>".

I no longer expect better from "the experts".

In this game, "Rigour is your friend".

Paolo