Book a Demo

Author Topic: SQL - query within a given large hierarchy (or namespace)?  (Read 5578 times)

Svend Erik Nygaard

  • EA User
  • **
  • Posts: 131
  • Karma: +2/-2
  • Business Information Architect
    • View Profile
SQL - query within a given large hierarchy (or namespace)?
« on: December 19, 2019, 01:19:41 am »
How do I SQL-query items within a large hierarchy of nodes (or namespace)?
Of course, I can use joins (recursive in programming or through unions in a single SQL statement) - BUT that is not performance-friendly.
I thought (hoped) that EA would recursively mark all items down through the tree structure in a given namespace - but I haven't found any table column (neither in t_xref) for that.

I still hope that there is a perforamnce-friendly way search with SQL within a large hierarchy of nodes.

Any suggestions?
« Last Edit: December 19, 2019, 01:40:41 am by Svend Erik Nygaard »

Geert Bellekens

  • EA Guru
  • *****
  • Posts: 13523
  • Karma: +574/-33
  • Make EA work for YOU!
    • View Profile
    • Enterprise Architect Consultant and Value Added Reseller
Re: SQL - query within a given large hierarchy (or namespace)?
« Reply #1 on: December 19, 2019, 04:48:24 am »
Hi Svend,

If you are using the SQL Search you can use the #Branch# macro, that is pretty fast.

When I need something similar in a script, I use my own "Branch" function called getPackageTreeIDString(package)
Check https://github.com/GeertBellekens/Enterprise-Architect-VBScript-Library/blob/master/Framework/Utils/Util.vbs for more details.

It uses gets the list of package ID's from a given package, but in a rather efficient way (using an SQL Query for each layer)

It's not ideal, but it seems to be faster than having to do 10 or more joins in a query (and it is not limited to a fixed number of joins)

Geert

Svend Erik Nygaard

  • EA User
  • **
  • Posts: 131
  • Karma: +2/-2
  • Business Information Architect
    • View Profile
Re: SQL - query within a given large hierarchy (or namespace)?
« Reply #2 on: December 19, 2019, 11:01:27 pm »
Thanks, Gert.

It's mainly for SQL statements in:
  • EA's search
    • OK, so I can use #Branch# in this - I actually had not seen that - Thanks :)
  • Our home-build website's Java backend
    • Your getPackageTreeIDString(package) is exactly what I use in some scenarios
I'm considering to implement a kind of mark on every kind of node in a given branch in the model (what I hoped EA's namespace system would do for me).
- perhaps, simply adding a delimetered path string (a path with names and/or a path with IDs or guids)
- probably by db triggers (I'm sure I need to do this asynchronously)
- ... and perhaps pushing it into an indexing system - I tried pushing the EA database into Solr, and it's blazingly fast - and gives me best-match rankings etc.
- ... ... or into a no-sql db (e.g. MongoDB) tree structure ...

« Last Edit: December 19, 2019, 11:15:30 pm by Svend Erik Nygaard »

PeterHeintz

  • EA Practitioner
  • ***
  • Posts: 1001
  • Karma: +59/-18
    • View Profile
Re: SQL - query within a given large hierarchy (or namespace)?
« Reply #3 on: December 20, 2019, 01:37:00 am »
If you are using an SQL DB supporting the SQL “With” statement, you could use something like this below to get your name space path on the fly.
Not “super fast” because it is  recursive, but if you only use is once in a query it works fine.

WITH PathBuilder(Parent_ID, Package_ID, Package_Name, GUIDPath, NamePath, IDPath) AS (SELECT        Parent_ID, Package_ID, Name, CAST(ea_guid AS VARCHAR(1000)) AS GUIDPath, CAST(Name AS VARCHAR(1000)) AS NamePath,
                                                                                                                                                                                                                                    CAST(Package_ID AS VARCHAR(1000)) AS IDPath
                                                                                                                                                                                                          FROM            dbo.t_package AS pkg
                                                                                                                                                                                                          WHERE        (Parent_ID = 0)
                                                                                                                                                                                                          UNION ALL
                                                                                                                                                                                                          SELECT        pkg.Parent_ID, pkg.Package_ID, pkg.Name, CAST(pb.GUIDPath + '.' + pkg.ea_guid AS VARCHAR(1000)) AS GUIDPath,
                                                                                                                                                                                                                                   CAST(pb.NamePath + '.' + pkg.Name AS VARCHAR(1000)) AS NamePath,
                                                                                                                                                                                                                                   CAST(pb.IDPath + '.' + CAST(pkg.Package_ID AS VARCHAR(1000)) AS VARCHAR(1000)) AS IDPath
                                                                                                                                                                                                          FROM            dbo.t_package AS pkg INNER JOIN
                                                                                                                                                                                                                                   PathBuilder AS pb ON pkg.Parent_ID = pb.Package_ID)
    SELECT DISTINCT Parent_ID, Package_ID, Package_Name, GUIDPath, NamePath, IDPath
     FROM            PathBuilder AS pb
Best regards,

Peter Heintz

Ian Mitchell

  • EA User
  • **
  • Posts: 507
  • Karma: +22/-4
  • The eaDocX and Model Expert guy
    • View Profile
Re: SQL - query within a given large hierarchy (or namespace)?
« Reply #4 on: January 07, 2020, 03:51:11 am »
I've had some success from using the SQL 'IN' statement for the package IDs.
1. first read the whole package table and
2. create the whole package structure,
3. when you need to get everything under a package, read t_object with 'where package_Id IN (packageid1,packageid2....).
The only difficulty is the length of the list of package IDs, but I test this before submitting the SQL, and split it up into smaller queries if needed (unusual).
This seems to work consistently fast for in most cases, as it needs 2 x SQL queries - mostly.


...but I'm sure Geert's approach is better, if only I understood it...
Ian Mitchell, Designer, eaDocX


www.eaDocX.com
www.theartfulmodeller.com

Geert Bellekens

  • EA Guru
  • *****
  • Posts: 13523
  • Karma: +574/-33
  • Make EA work for YOU!
    • View Profile
    • Enterprise Architect Consultant and Value Added Reseller
Re: SQL - query within a given large hierarchy (or namespace)?
« Reply #5 on: January 07, 2020, 04:52:05 am »
I've had some success from using the SQL 'IN' statement for the package IDs.
1. first read the whole package table and
2. create the whole package structure,
3. when you need to get everything under a package, read t_object with 'where package_Id IN (packageid1,packageid2....).
The only difficulty is the length of the list of package IDs, but I test this before submitting the SQL, and split it up into smaller queries if needed (unusual).
This seems to work consistently fast for in most cases, as it needs 2 x SQL queries - mostly.


...but I'm sure Geert's approach is better, if only I understood it...
I'm basically doing the same thing, except that I only read the package structure of the packages I really need.
To do so I read the sub-packageID's level by level.
So if my package branch is 5 levels deep I only need 5 SQL queries to get all the packageID's of my branch.

I then join the packageID's with comma's inbetween, and use them in a query with an "IN" statement.

It would be an interesting test however to see if getting a dictionary with all packageID and their parentID, and then building the package tree in memory, would be faster then doing it my way.
Processing data structures in memory can be surprisingly fast, but in a large model the overhead of getting the whole package structure might make it slow down (and heavy on resources)

Geert