export huge PDF


zol777

I urgently need to export a PDF of about 1 GB, but the following code fails with an out-of-memory error:

 

JasperExportManager.exportReportToPdfFile(hugeJasperPrint, "C:\\a.pdf");

 

 

Raising the JVM heap settings as follows did not help; the export still fails. (This is the maximum I am able to use.)

 

 

-Xms256m -Xmx2048m
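
When heap settings seem to have no effect, it can help to confirm that the JVM actually picked them up. JVM options take no space between the flag and its value and are separated by spaces rather than commas (for example -Xms256m -Xmx2048m). The sketch below only uses the standard Runtime API; the class name HeapCheck and the helper toMiB are made up for illustration.

```java
public class HeapCheck {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx (or the JVM default when -Xmx is absent).
        System.out.println("max heap   : " + toMiB(rt.maxMemory()) + " MiB");
        // totalMemory() is what the JVM has reserved so far; it starts near -Xms.
        System.out.println("total heap : " + toMiB(rt.totalMemory()) + " MiB");
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("free heap  : " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Running it with the same options as the export job (java -Xms256m -Xmx2048m HeapCheck) should report a maximum close to 2048 MiB; a much smaller number would suggest the options never reached the JVM.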

 

 

After reading some articles on the web, I tried using a JRVirtualizer such as JRSwapFileVirtualizer and JRGzipVirtualizer. But they seem to help only when filling the report, not when exporting it (am I understanding this right?), so the same problem occurs.

Lastly, I split hugeJasperPrint into two parts, part1 and part2, and exported both to the same file:

 

 

FileOutputStream fos = new FileOutputStream(new File("C:\\a.pdf"), true);

JasperExportManager.exportReportToPdfStream(part1, fos);

JasperExportManager.exportReportToPdfStream(part2, fos);

 

 

But a.pdf shows only part2's content. Has anyone else had a similar problem when exporting (not filling) a huge PDF? If you have any ideas, please let me know.

Thanks



Post Edited by zol777 at 09/04/2009 08:33
Please consider using JRFileVirtualizer while filling your report. You can then pass the resulting JasperPrint object to your exporter.

Here is my code example:

 

Code:



Post Edited by sergeleo at 09/04/2009 01:01 PM



Post Edited by sergeleo at 09/04/2009 02:42 PM

OOM during PDF exports can occur for several reasons.

A case in which the PDF exporter will quickly exhaust the JVM memory is when you have many (different) images in the generated report. Is that your case?

If that's not the case, the PDF export will still run out of memory eventually. That happens because a few objects are created for each PDF page. Some of them are created by JasperReports, and we could change the code not to do so. But others are inherent to the PDF creation, so the export will never be truly streamable (that is, it will not run in constant memory).

Regards,

Lucian

PS: You cannot directly concatenate two PDFs and expect a valid PDF document. If you want to concatenate two reports, use batch exporting (see the /demo/samples/batchexport sample in the JR distribution).
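
Why appending fails can be seen without any PDF library. A PDF file starts with a %PDF- header and ends with a trailer followed by %%EOF, and readers locate that trailer from the end of the file. The sketch below uses toy byte strings (they are not valid PDFs, and the class and method names are made up for illustration) to show that writing two documents into one stream just puts two headers and two trailers back to back. That would be consistent with a viewer showing only the second part.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ConcatDemo {
    /** Builds a toy stand-in for a PDF: header, body text, end-of-file marker. */
    static byte[] fakePdf(String body) {
        String doc = "%PDF-1.4\n" + body + "\n%%EOF\n";
        return doc.getBytes(StandardCharsets.US_ASCII);
    }

    /** Counts non-overlapping occurrences of marker in text. */
    static int count(String text, String marker) {
        int n = 0;
        int i = text.indexOf(marker);
        while (i >= 0) {
            n++;
            i = text.indexOf(marker, i + marker.length());
        }
        return n;
    }

    /** Appends both documents to one stream, like two export calls on one FileOutputStream. */
    static String concatenate(byte[] first, byte[] second) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(first, 0, first.length);
        out.write(second, 0, second.length);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String joined = concatenate(fakePdf("part1"), fakePdf("part2"));
        System.out.println("headers: " + count(joined, "%PDF-"));
        System.out.println("EOF markers: " + count(joined, "%%EOF"));
        // The trailer closest to the end of the file belongs to part2.
        System.out.println("ends with part2 trailer: " + joined.endsWith("part2\n%%EOF\n"));
    }
}
```

A batch export avoids this because the exporter writes all reports into a single document with one header and one trailer.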


Following your recommendation, I was able to export the JasperPrint objects (around 3000 of them) in batch mode into a single PDF file (1.8 GB).

Thanks

One more question: some of my JasperPrint objects contain duplicate images. Is there a way to reduce the size of the output PDF by making those separate references point to the same image?

 

Best regards,

Keung

Image caching (currently) works per report fill process.  So if you fill 20 reports, you'll end up with 20 image objects even if they use the same source.

You can make all reports use the same image object only by instantiating the image renderer in your code and sending it as a parameter to the reports.

Regards,

Lucian

Code:
JRRenderable image = JRImageRenderer.getInstance(..image location..);
..
params1.put("image", image);
JasperPrint print1 = JasperFillManager.fillReport(report1, params1);
..
params2.put("image", image);
JasperPrint print2 = JasperFillManager.fillReport(report2, params2);
..
..export print1, print2 to PDF

My case is a little more complex.

I have four computers generating one report per customer. Each JasperPrint object is serialized before it is exported to PDF (say I have 2000 customers; then there are 2000 individual serialized JasperPrint objects).

After the four computers have finished their jobs, one of them is selected to deserialize the objects and consolidate them into a batch export.

Is your recommendation still applicable in my case? Or should I handle the images during consolidation (which is difficult with my current business logic)?

 

Thanks


I have a few points I would like to clarify:

1. Is a custom image renderer a subclass of JRImageRenderer?

2. Is readResolve() one of the methods declared in JRImageRenderer or its interface? I decompiled JRImageRenderer and couldn't find it.

3. Is readResolve() called during the PDF export in JasperReports, so that it tells the exporter to use the stored image instead of reading it again?

4. If I use my custom image renderer, how do the other JasperReports components know to use it instead of the default one? Is declaring it in the JRXML enough?

5. Can you give me a brief idea or some pseudocode for writing readResolve()? Should it use the "key" attribute or the physical file name to determine whether the incoming image already exists?

 

Thanks

 

1.  Not necessarily, a custom image renderer is an implementation of JRRenderable.

2.  No, it's a standard Java method used on deserialization (see java.io.Serializable).

3.  readResolve() would be called during the deserialization of the JasperPrint objects, see #2.

4.  You'd have to explicitly use it as image source expression:

<imageExpression class="net.sf.jasperreports.engine.JRRenderable">new CachedImageRenderer("..image location..")</imageExpression>

5.  I think you would use the image location/file to determine whether an image renderer for that location is already loaded.  See a sample class below.

Regards,

Lucian

Code:
package jr;

import java.awt.Graphics2D;
import java.awt.geom.Dimension2D;
import java.awt.geom.Rectangle2D;
import java.io.ObjectStreamException;
import java.util.HashMap;
import java.util.Map;

import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.JRImageRenderer;
import net.sf.jasperreports.engine.JRRenderable;
import net.sf.jasperreports.engine.JRRuntimeException;

public class CachedImageRenderer implements JRRenderable
{
	// image location -> renderer, shared by everything deserialized in this JVM
	private static final Map cache = new HashMap();

	synchronized private static CachedImageRenderer getCachedInstance(
			CachedImageRenderer source)
	{
		CachedImageRenderer cached = (CachedImageRenderer) cache.get(
				source.location);
		if (cached == null)
		{
			// first renderer seen for this location: load it and make it the shared one
			source.loadImageRenderer();
			cache.put(source.location, source);
			cached = source;
		}
		return cached;
	}

	// static, so that it locks on the same monitor as getCachedInstance()
	synchronized public static void clearCache()
	{
		cache.clear();
	}

	private String location;
	// not serialized; reloaded from the location after deserialization
	private transient JRRenderable renderer;

	public CachedImageRenderer(String location)
	{
		this.location = location;
		loadImageRenderer();
	}

	private void loadImageRenderer()
	{
		try
		{
			this.renderer = JRImageRenderer.getInstance(location);
		}
		catch (JRException e)
		{
			throw new JRRuntimeException("Error while loading image from "
					+ location, e);
		}
	}

	// called by Java deserialization; the returned object replaces the one just read
	private Object readResolve() throws ObjectStreamException
	{
		return getCachedInstance(this);
	}

	public Dimension2D getDimension() throws JRException
	{
		return renderer.getDimension();
	}

	public String getId()
	{
		return renderer.getId();
	}

	public byte[] getImageData() throws JRException
	{
		return renderer.getImageData();
	}

	public byte getImageType()
	{
		return renderer.getImageType();
	}

	public byte getType()
	{
		return renderer.getType();
	}

	public void render(Graphics2D grx, Rectangle2D rectangle)
			throws JRException
	{
		renderer.render(grx, rectangle);
	}
}
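
The readResolve() hook that the class above relies on can be tried in isolation with plain Java serialization. The sketch below is not JasperReports code: SharedImage, its location field and the roundTrip helper are hypothetical stand-ins for a renderer and for saving and loading a JasperPrint. It serializes two separate objects that point to the same location and shows that both deserialize to one cached instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class SharedImage implements Serializable {
    private static final long serialVersionUID = 1L;

    // One canonical instance per location for everything deserialized in this JVM.
    private static final Map<String, SharedImage> CACHE = new HashMap<String, SharedImage>();

    private final String location;
    private transient byte[] data; // not serialized; reloaded on demand

    public SharedImage(String location) {
        this.location = location;
        load();
    }

    private void load() {
        data = location.getBytes(); // stand-in for reading the image file
    }

    public byte[] getData() {
        return data;
    }

    // Invoked by ObjectInputStream once this object has been read;
    // whatever it returns replaces the freshly deserialized copy.
    private Object readResolve() {
        synchronized (CACHE) {
            SharedImage cached = CACHE.get(location);
            if (cached == null) {
                load();
                CACHE.put(location, this);
                cached = this;
            }
            return cached;
        }
    }

    /** Serializes o to a byte array and reads it back, as saving and loading a file would. */
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(o);
            out.close();
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            Object copy = in.readObject();
            in.close();
            return copy;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object a = roundTrip(new SharedImage("C:\\test.jpg"));
        Object b = roundTrip(new SharedImage("C:\\test.jpg"));
        System.out.println("same instance: " + (a == b));
    }
}
```

Without the readResolve() method, a and b would be two distinct objects, each carrying its own copy of the image data, which is the situation described for separately serialized reports.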

Following your suggestion, I can now produce a large PDF file by using JasperPrint.addPage(JRPrintPage) to add the pages of each individual JasperPrint to one large JasperPrint, page by page.

However, the image caching does not work when exporting in batch mode. Could you suggest a way to use image caching when exporting to PDF in batch mode?

For your investigation, I have attached my test material as testing.zip. It contains:

CacheImageRenderer.java -> updated version of your class; I changed the private fields and methods to protected

TestImageCache.java -> my test program; I have added some comments in the code section of this post

test.jpg, test.jrxml, test.jasper -> test image, test JRXML and compiled report

ok.pdf -> result produced from one large JasperPrint object, with image caching working

notok.pdf -> result produced as one large PDF in batch mode

 

Thanks

Code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.JRExporterParameter;
import net.sf.jasperreports.engine.JRImageRenderer;
import net.sf.jasperreports.engine.JRPrintPage;
import net.sf.jasperreports.engine.JRRenderable;
import net.sf.jasperreports.engine.JasperExportManager;
import net.sf.jasperreports.engine.JasperFillManager;
import net.sf.jasperreports.engine.JasperPrint;
import net.sf.jasperreports.engine.JasperReport;
import net.sf.jasperreports.engine.export.JRPdfExporter;
import net.sf.jasperreports.engine.util.JRLoader;
import net.sf.jasperreports.engine.util.JRSaver;

public class TestImageCache {

	protected JRImageRenderer image1;

	protected JRRenderable initImage1() throws JRException {
		return new CachedImageRenderer("C:\\test.jpg");
	}

	public JasperPrint prepareJasperPrint() throws JRException {
		Map map = new HashMap();
		map.put("Image1", initImage1());
		return JasperFillManager.fillReport(
			(JasperReport) JRLoader.loadObject("C:\\test.jasper"),
			map);
	}

	public static void prepareSerializeObject() throws Exception {
		TestImageCache test = new TestImageCache();
		for (int i = 0; i < 20; i++) {
			JRSaver.saveObject(
				test.prepareJasperPrint(),
				"C:\\" + i + ".jasper");
		}
	}

	public static void main(String[] args) {
		try {
			// prepare 20 serialized JasperPrint objects for testing
			prepareSerializeObject();

			TestImageCache test = new TestImageCache();
			List list = new ArrayList();

			// prepare the large JasperPrint object
			JasperPrint jp = new JasperPrint();
			jp.setPageHeight(842);
			jp.setPageWidth(540);
			jp.setOrientation(Byte.parseByte("1"));

			for (int i = 0; i < 20; i++) {
				JasperPrint currJp =
					(JasperPrint) JRLoader.loadObject("C:\\" + i + ".jasper");
				for (int j = 0; j < currJp.getPages().size(); j++) {
					jp.addPage((JRPrintPage) currJp.getPages().get(j));
				}
				list.add(currJp);
			}

			// the image is reused within this single large JasperPrint
			JasperExportManager.exportReportToPdfFile(jp, "C:\\ok.pdf");

			// however, in batch mode the cached image is not reused
			JRPdfExporter exporter = new JRPdfExporter();
			exporter.setParameter(JRExporterParameter.JASPER_PRINT_LIST, list);
			exporter.setParameter(
				JRExporterParameter.OUTPUT_FILE_NAME,
				"C:\\notok.pdf");
			exporter.exportReport();

			System.out.println("Fin");
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}

You're right, I didn't notice that the PDF exporter resets the image cache for each report in the batch list.

You can override this by extending the PDF exporter as below.

Regards,

Lucian

Code:
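import java.io.IOException;
import java.util.Map;

import com.lowagie.text.DocumentException;

import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.JRPrintImage;
import net.sf.jasperreports.engine.JasperPrint;
import net.sf.jasperreports.engine.export.JRPdfExporter;

public class MyPdfExporter extends JRPdfExporter
{
	// image cache created for the first report in the batch
	private Map firstImageCache;

	protected void setJasperPrint(JasperPrint jasperPrint)
	{
		super.setJasperPrint(jasperPrint);

		// remember the first cache before the exporter replaces it
		if (firstImageCache == null && loadedImagesMap != null)
		{
			firstImageCache = loadedImagesMap;
		}
	}

	protected void exportImage(JRPrintImage printImage)
		throws DocumentException, IOException, JRException
	{
		// keep using the first cache for every report in the batch
		if (firstImageCache != null)
		{
			loadedImagesMap = firstImageCache;
		}

		super.exportImage(printImage);
	}
}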

It works now. My JasperReports version is too old (1.3.3), so I couldn't adopt your suggestion directly: the setJasperPrint() method does not exist in the old JRPdfExporter.

The basic concept is identical to your approach. I extended JRPdfExporter and overrode exportReportToStream() so that, in batch export mode, loadedImagesMap is not cleared as the reports are exported one by one.

One side effect is that my exported PDF cannot contain any JavaScript (fortunately, my PDF does not need it). pdfJavascript is a private field of JRPdfExporter, so my subclass cannot access it.

I attach my code here for reference, in case others run into a similar problem.

Thanks for your kind support!

Code:



Post Edited by zol777 at 09/24/2009 02:02



Post Edited by zol777 at 09/24/2009 02:07

  • 10 months later...

Hi,

Right now I am trying to reduce the memory footprint of my report generation task. Below a certain Xmx level (128m), I get an OutOfMemoryError during the export-to-PDF step, in PdfDocument.close:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at com.lowagie.text.pdf.ByteBuffer.toByteArray(Unknown Source)
    at com.lowagie.text.pdf.PdfContentByte.toPdf(Unknown Source)
    at com.lowagie.text.pdf.PdfFormXObject.<init>(Unknown Source)
    at com.lowagie.text.pdf.PdfTemplate.getFormXObject(Unknown Source)
    at com.lowagie.text.pdf.PdfWriter.addSharedObjectsToBody(Unknown Source)
    at com.lowagie.text.pdf.PdfWriter.close(Unknown Source)
    at com.lowagie.text.pdf.PdfDocument.close(Unknown Source)
    at com.lowagie.text.Document.close(Unknown Source)
    at net.sf.jasperreports.engine.export.JRPdfExporter.exportReportToStream(JRPdfExporter.java:644)
    at net.sf.jasperreports.engine.export.JRPdfExporter.exportReport(JRPdfExporter.java:363)

As far as I can tell, this has nothing to do with virtualizers, as they are active only during the filling step.

Lucian mentioned that lots of different images may be the reason for draining memory quickly. In my report I have about 1000 images generated dynamically via a custom JRRenderable. The resulting PDF, when it can be created, is 300 pages long and about 14 MB in size (compressed).

Now, my questions are:

  • Is this the number of images you consider "many", i.e., is mine the case you wrote about?
  • Is there any solution/parameter/technique to optimize this step? I found that the garbage collector can reclaim very little memory during export. There is no such issue during filling.
  • Why does this happen in the PdfDocument.close method? I am not an iText expert and this bothers me a little.

Thank you for any suggestions,

Marcin



Post Edited by marcin-k at 07/28/2010 07:59



Post Edited by marcin-k at 07/28/2010 08:00