Can crosstab memory utilization be optimized?


I am new to the Jaspersoft Community, so if this is not the right forum for this question, please direct me to the right place.

There are a number of posts about memory utilization when generating crosstab reports with a large number of rows and many variable columns (column groups). It appears that, the way crosstab generation currently works, the entire dataset must be processed in memory in order to create the first row of output with all the correct column groups in the correct order. Although this makes sense, it uses a large amount of Java heap space and often causes the report to fail with an OutOfMemory error.

Wouldn't it be possible to modify Jasper to allow a separate "header" dataset to be specified that would be responsible for returning all the unique column group values in the correct order? This would allow Jasper to create the necessary columns based on this "header" dataset rather than the main report dataset. It would also substantially reduce the amount of memory required to create the report, by allowing Jasper to process each row in the (presorted?) dataset and build each crosstab row without reading the entire dataset into memory.

Hopefully this suggestion can be considered for a subsequent version of the Jasper Reports libraries.

 

rjensen_1

Looking at the data for a specific example that runs out of memory with 4 GB of Java heap space allocated: the report would have 18,000 rows, 9 column group values (in addition to the 15 fields that make up a row group), and 8 measures per column. That sounds big, but even allowing for each cell to be 100 bytes of data (they are not), that is still less than 0.5 GB to keep all the cells in memory. There must be a huge amount of overhead to account for the other 3.5 GB of heap being used up before it runs out of memory. Perhaps I'm oversimplifying, but the numbers just don't make sense to me. Given these numbers, how would crosstab_bucket_measure_limit be set to allow this report to run?

rjensen_1 - 1 month 3 weeks ago
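
For reference, the arithmetic in the comment above can be written out as a small Java sketch. It assumes the most generous reading, that every measure value (rows x column values x measures) costs the stated 100 bytes, and it ignores totals and any per-object overhead in the engine. The class and method names are made up for illustration and are not part of JasperReports.

```java
// Back-of-the-envelope estimate using the figures quoted in the comment.
// Hypothetical helper, not a JasperReports API.
final class CrosstabEstimate {

    // Number of measure values: rows x column group values x measures.
    static long cellValues(long rows, long columnValues, long measures) {
        return Math.multiplyExact(Math.multiplyExact(rows, columnValues), measures);
    }

    // Raw payload size if every measure value took bytesPerValue bytes.
    static long estimatedBytes(long cellValues, long bytesPerValue) {
        return Math.multiplyExact(cellValues, bytesPerValue);
    }

    public static void main(String[] args) {
        long values = cellValues(18_000, 9, 8);   // 1,296,000 measure values
        long bytes = estimatedBytes(values, 100); // 129,600,000 bytes
        System.out.println(values + " measure values, " + bytes + " bytes");
    }
}
```

Under these assumptions the raw data comes to roughly 130 MB, which is consistent with the "less than 0.5 GB" figure in the comment.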

1 Answer:


For a crosstab, the entire data set is read, but only the cells for calculation are held in memory. I think this is sort of what you meant by a "header data set".

Memory problems come with crosstabs that have large numbers of cells: number of rows x number of columns x number of measures.

http://jasperreports.sourceforge.net/config.reference.html#net.sf.jasperreports.crosstab.bucket.measure.limit can be set to limit the number of cells for a crosstab and to try to avoid out of memory issues.
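
As a rough sketch of how that could apply to the numbers in the question, the Java below reads the property from properties-file text and compares it with rows x columns x measures. Only the property name comes from the configuration reference linked above. The value 1500000 is a made-up example, the class and helper names are hypothetical, and the assumption that the engine checks the limit against exactly this product (without counting totals) should be verified against the reference.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not a JasperReports API.
final class BucketMeasureLimitCheck {

    // Property name taken from the JasperReports configuration reference.
    static final String LIMIT_KEY = "net.sf.jasperreports.crosstab.bucket.measure.limit";

    // Parses properties-file text and returns the configured limit.
    static long readLimit(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(LIMIT_KEY);
        if (value == null) {
            throw new IllegalArgumentException("Missing property " + LIMIT_KEY);
        }
        return Long.parseLong(value.trim());
    }

    // Assumption: the crosstab fits if rows x columns x measures <= limit.
    static boolean fitsWithin(long rows, long columns, long measures, long limit) {
        long cells = Math.multiplyExact(Math.multiplyExact(rows, columns), measures);
        return cells <= limit;
    }

    public static void main(String[] args) {
        // Example line as it might appear in a jasperreports.properties file.
        String text = LIMIT_KEY + "=1500000\n";
        long limit = readLimit(text);
        System.out.println("fits: " + fitsWithin(18_000, 9, 8, limit));
    }
}
```

With the figures from the question (18,000 x 9 x 8 = 1,296,000), any limit at or above that product would pass this simple check. Whether the heap can actually hold that many cells is a separate matter.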

The way around this would be to have some off-heap cache for crosstab contents.

 

 

swood_1