Trying to use unicode font Error

bz3x · January 29, 2008

Hi, this is the first time I'm trying to add i18n to some reports I created with iReport (2.0.4) in a Spring application.

My starting point is that when the report tries to display some Hebrew characters (שלום), the output is "&#1513;&#1500;", which I presume is the Unicode encoding of the first two letters. Eventually I will be adding Russian, Turkish, Portuguese and Arabic as well.

I've looked through the documentation and forums but haven't found any good, clear descriptions of how to address this. So far, I've 1) set the textfield to use the Arial Unicode MS font, 2) set PDF-encoding to Identity-H (Unicode horizontal) and 3) experimented with both options for "PDF Embedded". When I run the report in the application I get the following:

Java heap space

java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Unknown Source)
	at java.io.ByteArrayOutputStream.write(Unknown Source)
	at com.lowagie.text.pdf.RandomAccessFileOrArray.InputStreamToArray(RandomAccessFileOrArray.java:130)
	at com.lowagie.text.pdf.RandomAccessFileOrArray.<init>(RandomAccessFileOrArray.java:97)
	at com.lowagie.text.pdf.TrueTypeFont.process(TrueTypeFont.java:577)
	at com.lowagie.text.pdf.TrueTypeFontUnicode.<init>(TrueTypeFontUnicode.java:95)
	at com.lowagie.text.pdf.BaseFont.createFont(BaseFont.java:394)
	at com.lowagie.text.pdf.BaseFont.createFont(BaseFont.java:342)
	at com.lowagie.text.FontFactory.getFont(FontFactory.java:225)
	at net.sf.jasperreports.engine.export.JRPdfExporter.getFont(JRPdfExporter.java:1701)
	at net.sf.jasperreports.engine.export.JRPdfExporter.writePageAnchor(JRPdfExporter.java:588)
	at net.sf.jasperreports.engine.export.JRPdfExporter.exportReportToStream(JRPdfExporter.java:528)
	at net.sf.jasperreports.engine.export.JRPdfExporter.exportReport(JRPdfExporter.java:322)
	at org.springframework.ui.jasperreports.JasperReportsUtils.render(JasperReportsUtils.java:113)
	at org.springframework.web.servlet.view.jasperreports.AbstractJasperReportsSingleFormatView.renderReport(AbstractJasperReportsSingleFormatView.java:100)
	at org.springframework.web.servlet.view.jasperreports.JasperReportsMultiFormatView.renderReport(JasperReportsMultiFormatView.java:208)
	at org.springframework.web.servlet.view.jasperreports.AbstractJasperReportsView.renderMergedOutputModel(AbstractJasperReportsView.java:578)
	at org.springframework.web.servlet.view.AbstractView.render(AbstractView.java:251)
	at org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1160)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:901)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:809)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:476)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:431)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)

Is this because the font is too large? If I use a Hebrew font I will need to somehow change fonts at runtime depending on the locale for other languages. Or is there something else going on?

I'd appreciate any comments from someone with experience with this.

Thanks!

bz3x · January 30, 2008

I dumped my previous approach of using a Unicode font: the 22 MB Arial Unicode MS font was too much.

So instead I'm testing a string that contains Turkish characters (şlkjpoıu), only two of which aren't found in the normal ASCII set. The field is set to use a Turkish font with PDF-encoding Identity-H (Unicode horizontal), and I am embedding the PDF font. The report compiles and even displays, so I assume the font is actually being embedded.

The result? When the string is taken from a field ($F{foo}), I see "&#351;lkjpo&#305;u", and when the string is taken from a properties file ($R{foo}), I see "lkjpou": the special characters are missing.

Hasn't anyone run into this problem? Anyone with experience, please help!

bz3x

lucianc · January 31, 2008

JasperReports includes a "unicode" sample that illustrates how text in various languages can be displayed in JR reports. Take a look at it, maybe you'll spot the cause of your issue.

Otherwise, I don't think one would be able to tell what's wrong unless you post a self-contained sample that can be used to replicate the behaviour.

Regards,

Lucian

bz3x · February 1, 2008

Thanks for replying!

I can reproduce a similar error by modifying the Unicode sample report, adding two new fields. The first just shows $P{TurkishText} and works correctly just like the Cyrillic, Greek, Arabic and Hebrew examples. The second is supposed to show $R{test}, and I've added a properties file saved in UTF-8. Attached is the report.

I'm inputting the same value for $P{TurkishText} and $R{test}, but the result in the PDF is vastly different: $P{TurkishText} shows "oıxş" (correct) while $R{test} shows "oÄ±xÅ".

Apparently the error lies in how the report gets the text. In my real-world application (based on Spring), the $R{} and the $F{} fields show different errors for the same strings. That leads me to believe that Spring (or rather, my configuration of Spring) passes the report source and the properties file in two different ways, neither of which works.

Thanks for any thoughts!

Attachment: UnicodeReport.jrxml (11418 bytes): http://www.jasperforge.org/components/com_joomlaboard/uploaded/files/UnicodeReport.jrxml

lucianc · February 1, 2008

Note that Java assumes properties files are encoded in ISO 8859-1 (aka Latin 1); see the java.util.Properties Javadoc (http://java.sun.com/javase/6/docs/api/java/util/Properties.html). Unicode characters outside Latin 1 need to be escaped in properties files with \uXXXX sequences.
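A minimal sketch of that behaviour, assuming Java 7 or later for StandardCharsets. The class and method names are made up for illustration, and the key "test" and the Turkish sample string are borrowed from the post above. The same entry is loaded once from raw UTF-8 bytes and once from an escaped, pure-ASCII file:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    /** Loads properties from raw file bytes through the InputStream overload of load(). */
    static String loadValue(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            // This overload always decodes the stream as ISO 8859-1.
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    /** Formats each char as U+XXXX so the output does not depend on the console encoding. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // File saved as UTF-8: every two-byte character is misread as two Latin 1 characters.
        byte[] rawUtf8 = "test=o\u0131x\u015F\n".getBytes(StandardCharsets.UTF_8);
        // Same entry written with escape sequences: plain ASCII, decoded by Properties itself.
        byte[] escaped = "test=o\\u0131x\\u015F\n".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("raw UTF-8: " + codePoints(loadValue(rawUtf8, "test")));
        System.out.println("escaped:   " + codePoints(loadValue(escaped, "test")));
    }
}
```

The raw UTF-8 variant yields six characters instead of four, which matches the kind of garbling reported for $R{test}.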

HTH,

Lucian

bz3x · February 1, 2008

I get it. I ran native2ascii and was able to see the $R{} string correctly in both the unicode example and my real-world application (and both the PDF and Excel views of my report!). The JSPs, which worked even before escaping the characters, work fine with the new properties file too.
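For anyone without native2ascii at hand, the escaping step can be approximated in a few lines. This is only a sketch of the idea, not the tool's actual implementation, and the class and method names are hypothetical:

```java
public class AsciiEscaper {

    /** Replaces every character outside 7-bit ASCII with a backslash-u escape sequence. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                // Supplementary characters come through as two surrogate escapes,
                // which java.util.Properties reassembles when loading.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Turkish sample from this thread, written with escapes to keep the source ASCII.
        System.out.println("test=" + escapeNonAscii("o\u0131x\u015F"));
    }
}
```

The resulting file is pure ASCII, so it no longer matters which encoding the editor or the build uses when saving it.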

This leaves the problem of the $F{} strings. For a JRDataSource I'm using a wrapper around a Collection or a Map (taken from somewhere on these forums, see code below). Apparently either this class or the report doesn't realize that the text is Unicode, since it shows me "o&#305;x&#351;". Any ideas on the best place to attack this? In the class? On the report?

Thanks again for your input!!

Code:

import java.lang.reflect.Method;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import net.sf.jasperreports.engine.JRDataSource;
import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.JRField;

import org.apache.commons.beanutils.PropertyUtils;


public class ReportSource implements JRDataSource {
	protected HashMap fieldsToIdxMap = new HashMap();
	protected Iterator iterator;
	protected Object currentValue;

	public ReportSource(Collection list) {
		this.iterator = list.iterator();
	}

	public ReportSource(Map list) {
		this.iterator = list.values().iterator();
	}

	// Resolves a bean property through its getter. A field name containing "__"
	// is treated as a path: the part before "__" names a nested bean and the
	// rest is resolved recursively on that bean.
	private Object nestedFieldValue(Object object, String field) {
		Object value = null;
		int separatorIdx = field.indexOf("__");
		if (separatorIdx > -1) {
			try {
				Method nestedGetter = PropertyUtils.getReadMethod(
						PropertyUtils.getPropertyDescriptor(object, field.substring(0, separatorIdx)));
				Object nestedObject = nestedGetter.invoke(object);

				value = nestedFieldValue(nestedObject, field.substring(separatorIdx + 2));
			} catch (Exception ex) {
				ex.printStackTrace();
			}
		} else {
			try {
				Method getter = PropertyUtils.getReadMethod(PropertyUtils.getPropertyDescriptor(object, field));
				value = getter.invoke(object);

				// Collections and maps are wrapped so they can feed subreports.
				if (Collection.class.isAssignableFrom(getter.getReturnType())) {
					return new ReportSource((Collection) value);
				}
				if (Map.class.isAssignableFrom(getter.getReturnType())) {
					return new ReportSource((Map) value);
				}
			} catch (Exception ex) {
				ex.printStackTrace();
			}
		}
		return value;
	}

	public Object getCurrentValue() throws JRException {
		return currentValue;
	}

	// Advances to the next record. Note that a null element also ends the iteration.
	public boolean next() throws JRException {
		currentValue = iterator.hasNext() ? iterator.next() : null;
		return currentValue != null;
	}

	public Object getFieldValue(JRField field) throws JRException {
		return nestedFieldValue(currentValue, field.getName());
	}

}

bz3x · February 2, 2008

It's always difficult to locate problems in an application built on multiple frameworks or components. In this case the problem turned out not to be related to reporting at all; it merely showed up only in the reports. Aside from the issue of escaping the Unicode characters in the properties files, the user-input data wasn't stored as Unicode at all, but HTML-encoded. So I simply ran an HTML decoding function on the text in my POJO-JRDataSource wrapper, as follows. Hopefully this will be of help to the next person who runs into this issue.

Code:

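private String convertHtmlToUnicode(String html) {
		// Look for the next decimal numeric character reference, e.g. "&#351;".
		int begU = html.indexOf("&#");
		if (begU==-1) {
			return html;
		}
		int endU=html.indexOf(";", begU);
		if (endU==-1) {
			// No terminating ';': leave the remaining text untouched.
			return html;
		}
		String before = html.substring(0,begU);
		// Skip the "&#" prefix; the digits are assumed to be decimal and below 65536.
		char uni = (char)Integer.parseInt(html.substring(begU+2,endU));
		String after = html.substring(endU+1);
		return before+uni+convertHtmlToUnicode(after);
	}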
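The function above assumes well-formed decimal references. A somewhat more defensive variant is sketched below, assuming the stored text may also contain hexadecimal references or characters beyond the Basic Multilingual Plane. The class and method names are hypothetical, and named entities such as &amp; are deliberately left alone:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumericEntityDecoder {

    // Matches decimal (&#351;) and hexadecimal (&#x15F;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    /** Replaces numeric character references with the characters they denote. */
    static String decode(String html) {
        Matcher m = NUMERIC_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            // Out-of-range values are kept as they were written.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("o&#305;x&#351;");
        // Print code points so the result does not depend on the console encoding.
        for (char c : decoded.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Decoding in the data source wrapper treats the symptom; if the storage layer can be changed, keeping the text as real Unicode in the first place avoids the conversion altogether.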
