I am new to the Jaspersoft Community, so if this is not the right forum for this question, please direct me to the right place.
There are a number of posts about memory utilization when generating crosstab reports with a large number of rows and many variable columns (column groups). It appears that, the way crosstab generation currently works, the entire dataset must be processed in memory in order to create the first row of output with all the correct column groups in the correct order. Although this makes sense, it uses a large amount of Java heap space and often causes the report to fail with an OutOfMemory error.
Wouldn't it be possible to modify Jasper to allow a separate "header" dataset to be specified, one that is responsible for returning all the unique column group values in the correct order? This would allow Jasper to create the necessary columns from this "header" dataset rather than from the main report dataset. It would also substantially reduce the memory required to create the report, because Jasper could process each row of the main dataset (presorted?) and build each crosstab row without reading the entire dataset into memory.
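To make the idea concrete, here is a rough sketch in plain Java. It is not JasperReports code and does not use any JasperReports API; the class `StreamingCrosstabSketch`, the `Fact` type, and the `stream` method are all hypothetical names made up for illustration. It assumes the column values arrive up front from the "header" dataset, that the main dataset is presorted by row key, and that the only measure is a simple sum. Under those assumptions, only one crosstab row has to be held in memory at a time.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names exist in JasperReports.
final class StreamingCrosstabSketch {

    // One record of the main dataset: a row group value, a column group value, a measure.
    static final class Fact {
        final String rowKey;
        final String columnKey;
        final double value;

        Fact(String rowKey, String columnKey, double value) {
            this.rowKey = rowKey;
            this.columnKey = columnKey;
            this.value = value;
        }
    }

    // headerColumns: the unique column values, in display order (the "header" dataset).
    // sortedFacts:   the main dataset, assumed to be sorted by rowKey.
    // out:           receives one comma-separated line per crosstab row.
    static void stream(List<String> headerColumns, Iterator<Fact> sortedFacts, Consumer<String> out) {
        Map<String, Integer> columnIndex = new HashMap<>();
        for (int i = 0; i < headerColumns.size(); i++) {
            columnIndex.put(headerColumns.get(i), i);
        }
        out.accept("row," + String.join(",", headerColumns));

        String currentRow = null;
        double[] sums = new double[headerColumns.size()]; // cells of the current row only
        while (sortedFacts.hasNext()) {
            Fact fact = sortedFacts.next();
            if (currentRow != null && !currentRow.equals(fact.rowKey)) {
                // Row key changed: the previous row is complete, so emit it and reuse the buffer.
                out.accept(format(currentRow, sums));
                Arrays.fill(sums, 0.0);
            }
            currentRow = fact.rowKey;
            Integer index = columnIndex.get(fact.columnKey);
            if (index == null) {
                throw new IllegalArgumentException("Column not in header dataset: " + fact.columnKey);
            }
            sums[index] += fact.value;
        }
        if (currentRow != null) {
            out.accept(format(currentRow, sums)); // flush the last row
        }
    }

    private static String format(String rowKey, double[] sums) {
        StringBuilder line = new StringBuilder(rowKey);
        for (double sum : sums) {
            line.append(',').append(sum);
        }
        return line.toString();
    }
}
```

This sketch leaves out column totals, grand totals, and multi-level groups, which a real crosstab would also need; it is only meant to show that the row buffer can be reused once the column set is known in advance.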
Hopefully this suggestion can be considered for a subsequent version of the Jasper Reports libraries.
For a crosstab, the entire data set is read, but only the cells for calculation are held in memory. I think this is sort of what you meant by a "header data set".
Memory problems come with crosstabs that have a large number of cells: (number of rows) x (number of columns) x (number of measures).
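As a quick back-of-the-envelope check of that product, here is a small standalone Java snippet. The class and method names are hypothetical and the input numbers are made up purely for illustration; it is not part of JasperReports.

```java
// Hypothetical helper for estimating crosstab size; not a JasperReports class.
final class CrosstabCellEstimate {

    // Cells = distinct row values x distinct column values x measures.
    // multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
    static long estimateCells(long rowValues, long columnValues, long measures) {
        return Math.multiplyExact(Math.multiplyExact(rowValues, columnValues), measures);
    }

    public static void main(String[] args) {
        // Illustrative figures: 50,000 row values, 200 column values, 3 measures.
        long cells = estimateCells(50_000, 200, 3);
        System.out.println(cells); // prints 30000000
    }
}
```

Even a modest number of columns and measures multiplies a large row count into tens of millions of cells, which is where the heap pressure comes from.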
The property described at http://jasperreports.sourceforge.net/config.reference.html#net.sf.jasperreports.crosstab.bucket.measure.limit can be set to limit the number of cells for a crosstab, to try to avoid out-of-memory issues.
The way around the memory problem itself would be to have some off-heap cache for crosstab contents.