Using Jasper to generate a large, data-only CSV report

I am running into performance problems using Jasper as an embedded library to generate CSV reports on the backend for later upload. The report currently has over half a million records, each with roughly 120 columns. The call to fillReport ran out of memory after running for more than 10 hours, even with a file virtualizer defined (although I believe setting 'ignorePagination' to true caused that error). As a benchmark, running the same query and exporting the result from SQL Developer takes only 20-30 minutes.

Currently the query is tied to the main dataset, and the report has a single detail section containing one textField definition per column, so it looks something like this:

<detail>
   <band height="42">
      <textField isStretchWithOverflow="true" isBlankWhenNull="true">
         <reportElement x="0" y="0" width="6" height="30" uuid="2493377b-d1a4-4caa-bea5-f66aed6c8620">
            <property name="net.sf.jasperreports.export.csv.column.name" value="ORDER_STATUS"/>
            <printWhenExpression><![CDATA[!$P{CNXS_HIDE}.contains( "AO_STATUS" )]]></printWhenExpression>
         </reportElement>
         <textFieldExpression><![CDATA[$F{AO_STATUS}]]></textFieldExpression>
      </textField>
      <!-- ...one textField like the above for each remaining column... -->
   </band>
</detail>

Is there a better way to define this report in order to get better performance?

Thanks

Bas

basven

To answer your comments: no POJO is created. It is a simple SQL query against an Oracle database whose result needs to be exported to a CSV file. I am trying to find out which report setup is best for this use case, so that Jasper needs as little overhead as possible to render the data. I don't want Jasper to spend a lot of time figuring out whether rows of data fit on one line, one page, in a band, etc. It is really just one continuous report of data.

I reran my test with the query limited to 10,000 records, a DB fetch size of 50 records, and a swap virtualizer. I see the following results:

  • Using SQL Developer to run the query and export: 2 minutes
  • Using the report setup as defined in my original post: 7.5 minutes
  • Using the original report which is based on the provided 'Letter_Landscape_Tablebased' template: out of memory after 5.5 hours.

Using the template report seems to cause some kind of infinite loop while calculating the next page. I was unable to turn on ignorePagination for that report.

The newly defined report is still 3 to 4 times slower than my baseline, and the extra time is spent in the fillReport call. I assume that is because it does all the kinds of calculations I mentioned before, which I am trying to minimize.
Is there a way to minimize that overhead by, for example, setting the pageWidth/height or reportElement width/height to a large number or something else?
I read about setting the text measure property to the SingleLineTextMeasurerFactory class, but I can only find that class in the jasperserver jar.

basven - 4 months 2 weeks ago

I ran the test through a profiler and my gut feeling was right. Of the 433 seconds it took to generate the report, 260 seconds (about 60%) were spent in SimpleTextLineWrapper.measureExactLineBreakIndex.

Any ideas on how to minimize the time spent in that call?

basven - 4 months 2 weeks ago

3 Answers:

Performance depends on the kind of report, the data source, and the available JVM memory.

If your program runs for 10 hours, it's probably thrashing. If you're creating a POJO for each row-column, that may be using up the memory.

hozawa

For others, the magic answer was:

parameters.put("net.sf.jasperreports.measure.simple.text", "false");

Now it exports in 3 minutes. The exported data looks like shite, probably because the width of columns is incorrect. The studio interface is pretty useless when creating rows that have this many columns.

basven

FYI, I usually use OpenCSV, Apache Commons CSV, or Jaspersoft ETL to create CSV files.

JasperReports requires all columns to be aligned vertically to produce a good CSV file. I don't think there's much benefit in using JasperReports just to create CSV files.
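
For anyone who wants to try the no-JasperReports route, here is a minimal sketch of streaming a query result straight to CSV. It uses plain JDBC and hand-rolled quoting instead of OpenCSV or Apache Commons CSV, only to keep the example dependency-free; either library would handle the quoting for you. The class and method names (CsvSketch, escape, formatRow, export) are made up for illustration, and reading every column with getString is an assumption that may not give the formatting you want for dates or numbers.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JasperReports or of any CSV library.
class CsvSketch {

    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled (RFC 4180 style). Null becomes an empty field.
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuotes) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields with commas and terminate the line with CRLF.
    static String formatRow(List<String> fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escape(fields.get(i)));
        }
        return line.append("\r\n").toString();
    }

    // Write a header row from the column labels, then one CSV line per result row.
    // Only the current row is held in memory. Returns the number of data rows written.
    static long export(ResultSet rs, Writer out) throws SQLException, IOException {
        ResultSetMetaData meta = rs.getMetaData();
        int columns = meta.getColumnCount();
        List<String> fields = new ArrayList<>(columns);
        for (int c = 1; c <= columns; c++) {
            fields.add(meta.getColumnLabel(c));
        }
        out.write(formatRow(fields));

        long rows = 0;
        while (rs.next()) {
            fields.clear();
            for (int c = 1; c <= columns; c++) {
                fields.add(rs.getString(c)); // assumption: default string conversion is acceptable
            }
            out.write(formatRow(fields));
            rows++;
        }
        return rows;
    }
}
```

To use it, you would run the report query on a Statement (setting a fetch size with setFetchSize if the driver benefits from it), wrap the target file in a BufferedWriter, and pass both to export. There is no layout or text measuring step at all, which is the part the profiler pointed at in the question.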

hozawa