Sparx Systems Forum

Enterprise Architect => Automation Interface, Add-Ins and Tools => Topic started by: Ian Mitchell on February 26, 2020, 11:11:24 pm

Title: Get all elements in a package, and their related elements
Post by: Ian Mitchell on February 26, 2020, 11:11:24 pm: I need to create a performance enhancement bit of SQL, but I think I just went beyond my abilities.
The problem is to get
(1) all the elements in a package and
(2) where they have connections to other element, details of the other element as well.
I managed to get 1/2 way with:
Code: [Select]
SELECT * FROM (T_OBJECT LEFT OUTER JOIN T_CONNECTOR ON T_OBJECT.OBJECT_ID = T_CONNECTOR.START_OBJECT_ID) WHERE T_OBJECT.package_ID=1235
(no, I won't be using Select *...)
This is great - it returns all the elements, and details of their connectors. OK, there are some duplicate rows where an element has multiple connectors, but I'm fine with that.
What I can't seem to get to work it to re-join this back to t_object, to get all the details of each related object.
I've tried all kinds of variations with more joins, lots of brackets/no brackets, but I now feel like a monkey writing random SQL and hoping for an answer...
Title: Re: Get all elements in a package, and their related elements
Post by: philchudley on February 26, 2020, 11:53:36 pm: I am sure Geert will reply with a fancy SQL statement using Inner Joins etc, but I don't tend use these, will the following work for you?

SELECT source.Name, source.Object_Type, t_connector.Connector_Type, target.Name, target.Object_Type
FROM t_object source, t_connector, t_package, t_object target
WHERE source.Object_ID = t_connector.Start_Object_ID
AND target.Object_ID = t_connector.End_Object_ID
AND source.Package_ID = t_package.Package_ID
AND t_package.Package_ID = 1235
ORDER By source.Name ASC

This query uses alias as specified in the FROM clause

All the best

Phil
Title: Re: Get all elements in a package, and their related elements
Post by: Ian Mitchell on February 27, 2020, 12:03:49 am: Thanks Phil, but this one doesn't seem to return elements which don't have connectors, and I need them as well.
I guess I could just run a second query without the clever 'join' bits..
Cheers
Title: Re: Get all elements in a package, and their related elements
Post by: Eve on February 27, 2020, 09:12:55 am: Code: [Select]
select * from t_object where Package_ID=1235 or exists (select * from t_connector, t_object other_end where other_end.Package_ID=1235 and t_object.Object_ID=t_connector.Start_Object_ID and other_end.Object_ID=t_connector.End_Object_ID) or exists (select * from t_connector, t_object other_end where other_end.Package_ID=1235 and t_object.Object_ID=t_connector.End_Object_ID and other_end.Object_ID=t_connector.Start_Object_ID)The eap file I used to test could not handle checking both connector ends in the one exists and ended up doing a complete scan over the entire t_object, t_connector space.
Title: Re: Get all elements in a package, and their related elements
Post by: Geert Bellekens on February 27, 2020, 06:12:04 pm: It's really not that hard. ???

You need to to a left join to t_connector and another left join to t_object.
You need left joins because you also want to return those that don't have connectors.

This is the query I would write for SQL Server

Code: [Select]
select * from t_object o left join t_connector c on o.Object_ID in (c.Start_Object_ID, c.End_Object_ID) left join t_object o2 on o2.Object_ID in (c.Start_Object_ID, c.End_Object_ID) and o2.Package_ID <> o.Package_ID where o.Package_ID = 1235
I join with t_connector on both start and end object ID
and then I join with t_object again, but with the extra condition that the packageID has to be different (otherwise we would get duplicates as it would join with itself, or with elements from the same package

Now if you want to make this query work on .eap files as well you will have to dumb it down.
Consider the JET SQL interpreter as a slightly mentally challenged 6 year old

Code: [Select]
select * from ((t_object o left join t_connector c on (o.Object_ID = c.Start_Object_ID or o.Object_ID = c.End_Object_ID)) left join t_object o2 on (o2.Object_ID = c.Start_Object_ID or o2.Object_ID = c.End_Object_ID)) where o.Package_ID = 1235 and ( o2.Package_ID is null or o2.Package_ID <> o.Package_ID)
It's exactly the same thing, just written differently.

Geert
Title: Re: Get all elements in a package, and their related elements
Post by: Ian Mitchell on February 27, 2020, 09:33:15 pm: Thanks Geert - this works perfectly, of course.
The only bit I don't understand is the last line:
Code: [Select]
and ( o2.Package_ID is null or o2.Package_ID <> o.Package_ID)Without this, I get lots more duplicate data, but I don't really understand why. Other than the nice one was done by you, and other by me...

Also, I was very brave, and did another join from 'o2' to any t_xref entries (to get additional stereotype information).
Something like
Code: [Select]
select * from (( (t_object o left join t_connector c on (o.Object_ID = c.Start_Object_ID or o.Object_ID = c.End_Object_ID) ) left join t_object o2 on (o2.Object_ID = c.Start_Object_ID or o2.Object_ID = c.End_Object_ID)) left join t_xref x on (o2.Ea_guid= x.Client) ) where o.package_ID IN (1235,1236) and ( o2.Package_ID is null or o2.Package_ID <> o.Package_ID)
I think I may have discovered
Code: [Select]
Geerts Secret of SQL - add lots of brackets, everywhere, until it works. Then stop.
Title: Re: Get all elements in a package, and their related elements
Post by: Geert Bellekens on February 28, 2020, 08:55:50 pm: Quote from: Ian Mitchell on February 27, 2020, 09:33:15 pm
Thanks Geert - this works perfectly, of course.
The only bit I don't understand is the last line:
Code: [Select]
and ( o2.Package_ID is null or o2.Package_ID <> o.Package_ID)Without this, I get lots more duplicate data, but I don't really understand why. Other than the nice one was done by you, and other by me...
That part is actually the equivalent of the second join constraint in the "regular" format.
It says that element o2 should be from a different package as o. (and it also allows for o2 to be null, otherwise we would have an inner join again)
If you don't include that constraint the query would join at least each element o with itself in the form of o2.

You see, we didn't specify if o or o2 had to be the source or the target. So it could be twice the source and therefore join with itself.

I hope I'm making sense to you... :-\

Geert
Title: Re: Get all elements in a package, and their related elements
Post by: Eve on March 04, 2020, 09:33:04 am: Is the query I suggested any easier to understand?
Title: Re: Get all elements in a package, and their related elements
Post by: Ian Mitchell on March 05, 2020, 04:39:20 am: If I'm honest, no.
But that says more about my SQL knowledge that the quality of your reply.
I'll have a go at yours, when I have some spare brain space.

I finally decided what I really, really wanted.
("Tell me what you ...")
It's a monster query which, for the elements in a given set of packages...
- the elements, and their t_xref stereotype data
- all their connectors, with their clever stereotype data from t_xref
- all the target elements, with their t_xref data as well.
..and all playing nicely when there are no connectors, or no t_xref stereotype data for anyone.

At the moment, it's two queries, joined in code. I tried with one, and my head exploded.
Title: Re: Get all elements in a package, and their related elements
Post by: Eve on March 05, 2020, 02:08:48 pm: My query isn't returning any information about any connectors, but it can easily be extended to get the full stereotype description.

Get all the objects in a package
Code: [Select]
select t_object.*, t_xref.Description from t_object left join t_xref on (t_xref.Client=t_object.ea_guid and t_xref.Name='Stereotypes') where Package_ID=1235
Also include any objects where a connector is linked to that object with the other end in the target package.
Code: [Select]
or exists (select * from t_connector, t_object other_end where other_end.Package_ID=1235 and t_object.Object_ID=t_connector.Start_Object_ID and other_end.Object_ID=t_connector.End_Object_ID) or exists (select * from t_connector, t_object other_end where other_end.Package_ID=1235 and t_object.Object_ID=t_connector.End_Object_ID and other_end.Object_ID=t_connector.Start_Object_ID)Each of those lines is a bit like doing a join on the tables in the inner query, without actually doing the join.
Title: Re: Get all elements in a package, and their related elements
Post by: Ian Mitchell on March 06, 2020, 10:46:35 pm: Just as an FYI for other users, I just replaced some of my own SQL, which used the kind of non-explict join (which Phil also uses above) like WHERE table1.col1 = tale2.col2
with the Geert style, using explicit JOIN statements.

The two sets of SQL were pretty much the same, but the Geert-style was around 60x faster. 2500ms vs 42.

I always believed that DBMS optimizers were smart enough to make the two SQLs run the same way, but it seems not.
BTW - each SQL is running via the EA 'SQLQuery' API.
I'm not saying this is a universal result, but it has encouraged my to be smarter in my SQL.
@Eve - sorry, haven't tried your "OR EXISTS" approach yet, so don't know how it compares.
Title: Re: Get all elements in a package, and their related elements
Post by: Geert Bellekens on March 07, 2020, 12:12:02 am: I can imagine that there is sometimes a difference.

With the JOIN style you already limit the table before joining.

With the WHERE style, it first makes a gigantic cartesian product (joining each row of each table B with each row of table B, with each row of table C,...) and only then you tell it do not include certain results (like 99,99% of the results probably)

I guess the more advanced databases (SQL Server, MySQL) will be better at optimizing stuff like that then the more basic ones (JET == MS Access)

Query optimization is sometimes weird. I've have wildly different query times between SQL Server and MySQL depending on the query style i used
The difference was unions against "or" statements

simplified
Code: [Select]
select * from t_object o where o.Object_Type = 'Class' union select * from t_object o where o.Object_Type = 'Actvity'versus
Code: [Select]
select * from t_object o where o.Object_Type = 'Class' or o.Object_Type = 'Activity'
The union variant was much faster on SQL Server, whereas the "or" variant was much faster on MySQL

And with "much" I mean a difference between 20 seconds and 20 minutes

Geert