Book a Demo

Author Topic: Generating Large Excel Files using API, What Do You Use?  (Read 6666 times)

Rich Anderson

  • EA User
  • **
  • Posts: 142
  • Karma: +8/-0
    • View Profile
    • LinkedIn
Generating Large Excel Files using API, What Do You Use?
« on: October 27, 2018, 08:53:03 am »
Hi There,  I'm writing lots of VB.NET code that reads and generates Excel files.  I've just been using the Excel API directly to exchange data between EA and Excel and this works really well most of the time.  However, there are times when I want to generate very large Excel files (thousands of rows) and that gets pretty slow, even when I turn the calculations off and do other tricks.  Writing to a CSV file to be opened by Excel, would be faster but I really prefer the convenience of using the Excel API to do things such as formatting and navigating between rows and columns as I please.  I see that there are some 3rd party APIs that don't have Excel dependencies to do this, so I'm wondering what people have used for this or whether you have a recommendation that I might try.   Thanks for your help.
« Last Edit: October 27, 2018, 08:55:34 am by Rich Anderson »
Rich Anderson
Urgnt Limited

qwerty

  • EA Guru
  • *****
  • Posts: 13584
  • Karma: +397/-301
  • I'm no guru at all
    • View Profile
Re: Generating Large Excel Files using API, What Do You Use?
« Reply #1 on: October 27, 2018, 09:36:21 am »
What's the reason for having Excel tables with thousands of rows? Nobody can handle them. Things that large need to go to a database instead.

q.

Geert Bellekens

  • EA Guru
  • *****
  • Posts: 13523
  • Karma: +574/-33
  • Make EA work for YOU!
    • View Profile
    • Enterprise Architect Consultant and Value Added Reseller
Re: Generating Large Excel Files using API, What Do You Use?
« Reply #2 on: October 27, 2018, 04:13:27 pm »
Rich,

A major improvement is to use range.Value2.

How that works is that you create a 2 dimensional array with your data.
The select the range you need and set Value2 to it.

That works almost instantly, as opposed to adding field by field.

See https://github.com/GeertBellekens/Enterprise-Architect-VBScript-Library/blob/master/Framework/Utils/ExcelFile.vbs for an utility class to deal with excel files

Geert

Rich Anderson

  • EA User
  • **
  • Posts: 142
  • Karma: +8/-0
    • View Profile
    • LinkedIn
Re: Generating Large Excel Files using API, What Do You Use?
« Reply #3 on: October 28, 2018, 06:03:36 am »
Thanks, for pointing me to these resources.  I also found this library that avoids Interop and writes to the XML files directly, which looks pretty promising as I do pretty much all my scripting from Visual Studio.   https://github.com/ClosedXML/ClosedXML.   

As to querty's question, all I can say is that I have several use cases such as, importing and exporting APQC process libraries (1,800 rows), generating large Excel files to be read in by Tableau (3,000 rows),  generating a mapping table for analysts to map one enterprise chart of accounts to another (2,000 rows), and generating multiple Excel files to ETL into an ERP system for configuration.   Particularly with business analysts when you are generating and capturing large catalogs (a very common occurrence in my world), they don't want things in databases which are strange and foreign objects to most of them. They almost always want Excel.   It would be better if they used EA directly, but this is not always possible.
« Last Edit: October 28, 2018, 06:05:51 am by Rich Anderson »
Rich Anderson
Urgnt Limited

qwerty

  • EA Guru
  • *****
  • Posts: 13584
  • Karma: +397/-301
  • I'm no guru at all
    • View Profile
Re: Generating Large Excel Files using API, What Do You Use?
« Reply #4 on: October 28, 2018, 08:10:03 am »
What I thought: having only hammer as tool, anything is a nail :-/ I usually kick as much butts as I can when people come with such requirements.Of course sometimes you're not in the position to do that.

q.

Geert Bellekens

  • EA Guru
  • *****
  • Posts: 13523
  • Karma: +574/-33
  • Make EA work for YOU!
    • View Profile
    • Enterprise Architect Consultant and Value Added Reseller
Re: Generating Large Excel Files using API, What Do You Use?
« Reply #5 on: October 28, 2018, 03:42:48 pm »
Thanks, for pointing me to these resources.  I also found this library that avoids Interop and writes to the XML files directly, which looks pretty promising as I do pretty much all my scripting from Visual Studio.   https://github.com/ClosedXML/ClosedXML

We did that with a word document generator a few years ago. I remember that avoiding the Office Interop resulted in a much faster generation of documents.
With Excel I haven't needed this yet as the Value2 works almost instantaneous.
Usually it takes a lot longer to get the information out of EA than it takes to get it into Excel.

Geert

EXploringEA

  • EA User
  • **
  • Posts: 172
  • Karma: +8/-0
    • View Profile
Re: Generating Large Excel Files using API, What Do You Use?
« Reply #6 on: October 29, 2018, 08:47:00 pm »
Hi Rich

In my experience there are 2 issues when exporting large amounts of data to Excel:
  • Excel Interop - each call has an overhead and this can soon be significant
  • Extracting data from Sparx using the API
To address the Excel issue my approach is similar to where you got to, i.e. not using Excel at all but using a library that can generate Excel format files.  In my case, I have used EPPlus .  This had the further advantage of having no dependency on Excel, hence can be used on a computer where Office may not be installed.  When developing code for a large export, tests indicated a 100-fold improvement.

Having addressed the Excel issue, and as Geert mentioned, the time taken to extract information from EA can be significant when working with large databases (or those that are in a cloud somewhere).  My approach is  to reduce the number of interactions with EA; ideally at most a few SQL queries.  I then:
  • Either have the data I need directly from the query
  • OR can process the query results in my code to produce the required output data

The results in my case has been that reports from repositories with 10K's elements and excel files with 1000's rows can be produced in seconds rather than in 10's of mins.

And all done using VB.Net.

BR
Adrian




EXploringEA - information, utilities and addins