
Author Topic: Inferring an XSD from an XML - aka reverse engineering an XML file  (Read 37460 times)

Modesto Vega

  • EA Practitioner
  • ***
  • Posts: 1183
  • Karma: +30/-8
Perhaps I dreamt it, or have forgotten how to do it. From memory, with previous versions of Sparx EA (i.e., before v16) it was possible to reverse engineer an XML file and get a decent XSD.

With v16 this doesn’t seem to be very straightforward. Importing into the Schema Composer seems to be the only plausible route, but CIM is the only option and a reference package is always needed.

What have I forgotten? Did I dream it?

Geert Bellekens

  • EA Guru
  • *****
  • Posts: 13523
  • Karma: +574/-33
  • Make EA work for YOU!
    • Enterprise Architect Consultant and Value Added Reseller
« Reply #1 on: January 27, 2026, 12:34:33 am »
You can import an XSD, but I don't think I've ever seen the option to import an XML to generate an XSD model.

There are a few other tools (such as XMLSpy) that do that kind of stuff, but it's always going to be guesswork. There are an infinite number of XSDs that match any given XML.

Geert

Modesto Vega

« Reply #2 on: January 27, 2026, 12:49:01 am »
I am relatively certain that I have done it with an earlier version of Sparx EA, many years ago, when features did not change and you knew what you were doing or what you needed to do. Please accept my apologies for the slight rant.

I know it is guesswork, but XML has become guesswork; there aren't many XSDs published.

I also know there are many tools out there, but they tend to cost money, and if they are online there is always a data privacy/sensitivity issue.

Paolo F Cantoni

  • EA Guru
  • *****
  • Posts: 8626
  • Karma: +259/-129
  • Inconsistently correct systems DON'T EXIST!
« Reply #3 on: January 27, 2026, 11:23:47 am »
Quote from: Modesto Vega on January 27, 2026, 12:49:01 am
I am relatively certain that I did with an earlier version of Sparx EA, many years ago, when features did not change, and you knew what you were doing or what you needed to do.  Please accept my apologies for the slight rant.

I know it’s guesswork, but XML has become guesswork; there aren’t many XSDs published.

I also know there are many tools out there, but they tend to cost money, and if they are online, there is always a data privacy/sensitivity issue.

Hi Modesto,
As Geert implies, reverse engineering an XSD from an XML is an almost pointless exercise.  Even tools like XMLSpy (which I love) can only infer one (of many - as Geert says) of the XSDs that will validate that particular XML and some nominal variants.  If the intent (as it should be) is to validate a set of incoming (or outgoing) XMLs, then you’ve (as you suggested) “bought a plug nickel”.  If you know what the validation rules are (or should be, or even would like to be), then you should be creating XSDs and checking the XMLs against them.

I mourn the passing of XSDs (as I suspect you do), but in the era of “post truth” and “Vibe Coding”, it IS about guesswork.  “I’m sending you some information in this XML, but you need to decide if it’s misinformation or disinformation!  That’s not my job!”

Paolo
« Last Edit: January 27, 2026, 11:25:25 am by Paolo F Cantoni »
Inconsistently correct systems DON'T EXIST!
... Therefore, aim for consistency; in the expectation of achieving correctness....
-Semantica-
Helsinki Principle Rules!

Modesto Vega

« Reply #4 on: January 28, 2026, 07:04:27 pm »
What baffles me most about this thread is not having potentially lost some functionality that is achievable with other tools.

What baffles me most is reading that people nowadays have an issue with a key cornerstone of modern science: inference.

Inference also used to be a key cornerstone of all the technical work I used to do, very often starting with requirements.
If the inferences were wrong, they got revised and tested again.

In the age of AI, I’d rather reverse engineer/infer an XSD based on a large and representative XML file than second guess a development team.

Let the data tell me a story, instead of forcing the data to tell me the story I want.

Geert Bellekens

« Reply #5 on: January 29, 2026, 03:19:20 am »
The point I'm trying to make is that the "proper" way of doing things is to first make the XSD, and then use that to make (and validate) the XMLs.
In most cases where I worked with XML files, there was an XSD available (and I didn't have to "infer" an XSD based on a single sample XML).

The reason it might be less than useful is that you are only using a single example XML.
How do you know it's representative of all possible variations?
How do you know all allowed values for your enums?
How do you know whether or not a field is optional?
How do you know the maximum or minimum values of a field?
How do you know the pattern of a text field?
etc.

There are many, many unknowns when "inferring" an XSD from a sample XML, so all you get is an XSD that corresponds to that one sample.
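
To make the unknowns above concrete, here is a minimal sketch of naive inference from one sample, using only Python's standard library. The element names (`order`, `id`, `status`, `total`) and the helper names `guess_type` and `infer_fields` are invented for illustration; this is not how EA or XMLSpy do it.

```python
import xml.etree.ElementTree as ET

def guess_type(text):
    """Guess an XSD simple type from a single observed value."""
    text = (text or "").strip()
    for xsd_type, cast in (("xs:integer", int), ("xs:decimal", float)):
        try:
            cast(text)
            return xsd_type
        except ValueError:
            pass
    return "xs:string"

def infer_fields(xml_text):
    """Map each leaf element directly under the root to a guessed type."""
    root = ET.fromstring(xml_text)
    return {child.tag: guess_type(child.text) for child in root if len(child) == 0}

# Hypothetical one-record sample.
sample = "<order><id>42</id><status>OPEN</status><total>19.99</total></order>"
fields = infer_fields(sample)
print(fields)
```

From this one sample, `status` can only be typed as a plain string: the sample cannot reveal whether it is an enum with other allowed values, whether `total` is optional, or what ranges and patterns apply.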

Geert

Modesto Vega

« Reply #6 on: January 29, 2026, 04:51:30 am »
I don’t disagree with any of that but it depends on the context, like almost everything.

If XML is used for systems to interoperate directly via an API or messaging, I would expect XSDs and wouldn’t expect to have to reverse engineer an API payload or message. Although, I have now seen plenty of contractless APIs without XSDs or with minimal ones.

But XML is used for many other things, including full data extracts, large enough to cover most possible combinations. In this case, not having the functionality to infer/reverse engineer an XSD does not make our jobs easier. And yes, I know other tools can be bought, but that also complicates our jobs.
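
As a rough illustration of why a large extract helps, the sketch below counts how often each child element occurs per record and reports the observed minimum and maximum, which could be read as candidate minOccurs/maxOccurs values. The `customers`/`customer` structure and the function name `infer_occurrences` are assumptions made up for this example, and the result is only as good as the extract is representative.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def infer_occurrences(xml_text, record_tag):
    """Return {child_tag: (min, max)} occurrences observed per record."""
    root = ET.fromstring(xml_text)
    # One Counter of child tags per record element.
    counts = [Counter(child.tag for child in rec) for rec in root.iter(record_tag)]
    tags = set().union(*counts) if counts else set()
    return {
        tag: (min(c[tag] for c in counts), max(c[tag] for c in counts))
        for tag in sorted(tags)
    }

# Hypothetical extract with three records.
extract = """<customers>
  <customer><name>A</name><email>a@x</email></customer>
  <customer><name>B</name></customer>
  <customer><name>C</name><email>c@x</email><email>c@y</email></customer>
</customers>"""
occurrences = infer_occurrences(extract, "customer")
print(occurrences)
```

Here `email` comes out as (0, 2) and `name` as (1, 1), something a single record could not show. For a really large extract, `ET.iterparse` would avoid loading the whole file into memory.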

Paolo F Cantoni

« Reply #7 on: January 29, 2026, 10:42:25 am »
Quote from: Modesto Vega on January 29, 2026, 04:51:30 am
I don’t disagree with any of that, but it depends on the context, like almost everything.

[SNIP]

You’re right, of course, Modesto, but it also depends on the intent (as Geert and I have said).

Allow me a small digression...   Some forty years ago, I was sent from Melbourne (Victoria) to Hobart (Tasmania) to assist a customer of Digital Equipment Corporation (DEC) - some of us will remember them...
While there, I noticed that the LA36 console typewriter would “bing” (Ctrl+G - Bell character) several times a minute.  I asked the operators, and they said it would do this when issuing an error message due to anomalous sensor data.  The cause (I was later told) was a mixture of bad data and a bug.  I had occasion to return about a month later for another reason.  The room was silent!

“I see you’ve fixed the issue!” I said.  “No, we turned off the speaker,” came the reply.

It depends on your intent: to fix the issue or the symptom.   ;)

Cheers,
Paolo



Modesto Vega

« Reply #8 on: February 06, 2026, 10:37:30 pm »
Quote from: Paolo F Cantoni on January 29, 2026, 10:42:25 am
“I see you’ve fixed the issue!” I said.  “No, we turned off the speaker,” came the reply.

It depends on your intent: to fix the issue or the symptom.   ;)

You got me laughing.

It is more about suppressing the symptom, not fixing it. Something that is way too common nowadays.

Let's bring this thread back on track.

The Schema Composer has a Schema Set option that always defaults to Common Information Model (CIM), for both New Schema and New Model Transform. Let's assume that we want to do this properly and create a CIM. Is there any way to specify the package containing the CIM that the Schema Composer should use? I have used this functionality before, but I can no longer find it.

P.S.: I know Geert has a really nice add-in but we may not be able to deploy it to our work laptops.