I am running into performance problems when using Jasper as an embedded library to generate CSV reports on the backend that can be uploaded later. The report currently has over half a million records, and each record has roughly 120 columns. The call to fillReport ran out of memory after running for over 10 hours, even with a file virtualizer defined (although I believe setting 'ignorePagination' = true caused that error). As a benchmark, running the same query and export in SQL Developer takes only 20-30 minutes.
Currently the query is tied to the main dataset, and the report has only one detail section containing a textField definition for each column, so it looks something like this:
<detail>
  <band height="42">
    <textField isStretchWithOverflow="true" isBlankWhenNull="true">
      <reportElement x="0" y="0" width="6" height="30" uuid="2493377b-d1a4-4caa-bea5-f66aed6c8620">
        <property name="net.sf.jasperreports.export.csv.column.name" value="ORDER_STATUS"/>
        <printWhenExpression><![CDATA[!$P{CNXS_HIDE}.contains( "AO_STATUS" )]]></printWhenExpression>
      </reportElement>
      <textFieldExpression><![CDATA[$F{AO_STATUS}]]></textFieldExpression>
    </textField>
    etc.
Is there a better way to define this report in order to get better performance?
Thanks
Bas
3 Answers:
For others. The magic answer was:
parameters.put("net.sf.jasperreports.measure.simple.text", "false");
Now it exports in 3 minutes. The exported data looks like shite, probably because the width of columns is incorrect. The studio interface is pretty useless when creating rows that have this many columns.
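For readers wondering where that line goes: a minimal sketch, assuming `parameters` is the ordinary parameter map handed to the fill call. The class name `FillParameters` and the method `build` are hypothetical, and only the property name and value come from the answer above. The sketch deliberately avoids JasperReports imports so it compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

final class FillParameters {
    // Property name copied verbatim from the answer above.
    static final String MEASURE_SIMPLE_TEXT = "net.sf.jasperreports.measure.simple.text";

    /** Builds the parameter map (assumed to be the one passed to the fill call). */
    static Map<String, Object> build() {
        Map<String, Object> parameters = new HashMap<>();
        // Same key and value as in the answer; nothing else is set here.
        parameters.put(MEASURE_SIMPLE_TEXT, "false");
        return parameters;
    }
}
```

The resulting map would then be passed as the second argument of JasperFillManager.fillReport(jasperReport, parameters, connection), alongside whatever report parameters and virtualizer the report already uses.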
To answer your comments: no POJO is created. It is a simple SQL query against an Oracle database whose result needs to be exported to a CSV file. I am trying to find out what the best report option is for this use case, to minimize the overhead Jasper needs to render the data. I don't want Jasper to spend a lot of time figuring out whether rows of data fit on one line, one page, in a band, etc.; it is really just one continuous report of data.
I reran my test with the query limited to 10,000 records, a DB fetch size of 50 records, and a swap virtualizer. I see the following results:
Obviously using the template report is causing some infinite loop with calculating the next page. I was unable to turn on ignorePagination on that report.
The reason the newly defined report is still 3 to 4 times slower than my baseline is the fillReport call. I assume that is because it does all the kinds of layout calculations I mentioned before, which I am trying to minimize.
Is there a way to minimize that overhead by, for example, setting the pageWidth/height or reportElement width/height to a large number or something else?
I read about setting the text measurer property to the SingleLineTextMeasurerFactory class, but I can only find that class in the jasperserver jar.
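For comparison, here is a sketch of what a plain JDBC baseline for this kind of export could look like. The post does not show its actual baseline, so everything here is an assumption: the class name `CsvBaseline`, the RFC 4180 style quoting, and the header row taken from the column labels. It does no layout work at all, which is the overhead being discussed.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

final class CsvBaseline {

    /** Quotes a value only if it contains a comma, quote, CR or LF; null becomes empty. */
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.indexOf(',') >= 0 || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0 || value.indexOf('\r') >= 0;
        return needsQuotes ? "\"" + value.replace("\"", "\"\"") + "\"" : value;
    }

    /** Joins already-stringified cells into one CSV line ending in a newline. */
    static String formatRow(String[] cells) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escape(cells[i]));
        }
        return line.append('\n').toString();
    }

    /** Streams a header row and every data row to the writer; returns the data row count. */
    static long export(ResultSet rs, Writer out) throws SQLException, IOException {
        ResultSetMetaData meta = rs.getMetaData();
        int columns = meta.getColumnCount();
        String[] cells = new String[columns];
        for (int c = 0; c < columns; c++) {
            cells[c] = meta.getColumnLabel(c + 1); // JDBC columns are 1-based
        }
        out.write(formatRow(cells));
        long rows = 0;
        while (rs.next()) {
            for (int c = 0; c < columns; c++) {
                cells[c] = rs.getString(c + 1);
            }
            out.write(formatRow(cells));
            rows++;
        }
        return rows;
    }
}
```

In use, `export` would be given the ResultSet of the report query (with the fetch size set on the Statement, as in the test above) and a buffered file writer. Only one row is held in memory at a time, so memory use does not grow with the number of records.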
I ran the test through a profiler and my gut feeling was right. Of the 433 seconds it took to generate the report, 260 seconds (about 60%) were spent in SimpleTextLineWrapper.measureExactLineBreakIndex.
Any ideas on how to minimize the time spent in that call?