Category: Enhancement request
Priority: High
Status: Feedback Requested
Project:
Severity: Feature
Resolution: Open
Component:
Reproducibility: Always
Assigned to:
Hi,
The following code (line 2628 of HtmlExporter.java) hard-codes the font style:
// do not put single quotes around family name here because the value might already contain quotes,
// especially if it is coming from font extension export configuration
writer.write("<span style=\"font-family: ");
writer.write(fontFamily);
writer.write("; ");
Also, lines 2594 - 2602 could be written as:
if (highlightStarted||startedSpan) {
writer.write("</span>");
}
Note that "<span" exists multiple times throughout the code. A more robust design is required. The "<span" and "</span>" should only appear once in the HtmlExporter class. For example, lines 2562 - 2564 show:
writer.write("<span title=\"");
writer.write(JRStringUtil.xmlEncode(tooltip));
writer.write("\">");
This could be written using a decorator:
writer.write(getHtmlDecorator().decorate(tooltip));
Then this line (and the lines that follow it):
String fontFamily = resolveFontFamily(attributes, locale);
Should be:
FontFamily fontFamily = resolveFontFamily(attributes, locale);
Color forecolor = (Color)attributes.get(TextAttribute.FOREGROUND);
Color runBackcolor = (Color)attributes.get(TextAttribute.BACKGROUND);
FontSize fontSize = new FontSize( (Float)attributes.get(TextAttribute.SIZE) );
And those can be simplified using a fluent interface to:
Font font = new Font.Builder()
.withFontFamily( resolveFontFamily(attributes, locale) )
.withForegroundColour( (Color)attributes.get(TextAttribute.FOREGROUND) )
.withBackgroundColour... etc.
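As a minimal sketch of what such a builder might look like: the class below is hypothetical (nothing like it exists in JasperReports as far as this request is concerned), and it is named FontSpec rather than Font only to avoid clashing with java.awt.Font. The withFontSize method and the getters are assumptions added for illustration.

```java
import java.awt.Color;

// Hypothetical immutable value object holding the resolved text attributes.
final class FontSpec {
    private final String family;
    private final Color foreground;
    private final Color background;
    private final float size;

    private FontSpec(Builder b) {
        this.family = b.family;
        this.foreground = b.foreground;
        this.background = b.background;
        this.size = b.size;
    }

    String getFamily() { return family; }
    Color getForeground() { return foreground; }
    Color getBackground() { return background; }
    float getSize() { return size; }

    // Fluent builder: each "with" method records one attribute and returns the builder.
    static final class Builder {
        private String family;
        private Color foreground;
        private Color background;
        private float size;

        Builder withFontFamily(String family) { this.family = family; return this; }
        Builder withForegroundColour(Color colour) { this.foreground = colour; return this; }
        Builder withBackgroundColour(Color colour) { this.background = colour; return this; }
        Builder withFontSize(float size) { this.size = size; return this; }

        FontSpec build() { return new FontSpec(this); }
    }
}
```

Attributes that are never set stay null (or 0 for the size), so a decorator consuming the object can skip them when writing styles.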
Then the span tag can be written simply as:
writer.write(getHtmlDecorator().decorate(font));
That would take care of writing the "<span", the font-family, font-weight, text-decoration, title, text, and so forth. Moreover, it would allow developers to easily extend the HtmlExporter class with their own decorator instance. For example:
HtmlExporter exporter = new HtmlExporter(); // Jaspersoft's
HtmlDecorator decorator = new CustomHtmlDecorator(); // Subclass of Jaspersoft's HtmlDecorator class
exporter.setHtmlDecorator( decorator );
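To make the extension point concrete, here is a rough, self-contained sketch. All three classes are hypothetical stand-ins (SketchExporter is not HtmlExporter, and the two-argument decorate signature is an assumption); the point is only that the markup lives in one overridable place.

```java
// Hypothetical default decorator: the only place that knows how a tooltip span is written.
class HtmlDecorator {
    String decorate(String tooltip, String text) {
        return "<span title=\"" + escape(tooltip) + "\">" + escape(text) + "</span>";
    }

    // Minimal escaping for text and double-quoted attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}

// A developer's subclass that swaps the markup without touching the exporter.
class CustomHtmlDecorator extends HtmlDecorator {
    @Override
    String decorate(String tooltip, String text) {
        return "<abbr title=\"" + escape(tooltip) + "\">" + escape(text) + "</abbr>";
    }
}

// Stand-in for the exporter: it delegates all markup decisions to the decorator.
class SketchExporter {
    private HtmlDecorator decorator = new HtmlDecorator();

    void setHtmlDecorator(HtmlDecorator decorator) { this.decorator = decorator; }

    String exportText(String tooltip, String text) {
        return decorator.decorate(tooltip, text);
    }
}
```

With this shape, replacing the span with another element is a one-class change on the developer's side.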
Next, each object to write would need a corresponding decorator that uses XOM (http://www.xom.nu/apidocs/nu/xom/Element.html):
public class FontDecorator extends HtmlSpanDecorator {
private Font font;
public Element toElement() {
Element element = super.getElement(); // Returns "span" element
element.addAttribute( attr( "font-family", font.getFamily() ) );
return element;
}
}
This satisfies the single responsibility principle whereby if the font attributes need to change, only a single class must be modified.
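A runnable variant of the same idea, under a few stated assumptions: it uses the JDK's built-in org.w3c.dom API instead of XOM so that it needs no extra library, it passes the family and size directly instead of a Font object, and it writes the font properties into the span's style attribute (font-family is a CSS property, not an HTML attribute). The px unit and the class layout are illustrative guesses, not part of the request.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical base class: the single place where a "span" element is created.
class HtmlSpanDecorator {
    protected final Document document;

    HtmlSpanDecorator(Document document) { this.document = document; }

    Element toElement() { return document.createElement("span"); }

    // Convenience factory for an empty DOM document.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical decorator that knows only about font-related styling.
class FontDecorator extends HtmlSpanDecorator {
    private final String family;
    private final int sizePx;

    FontDecorator(Document document, String family, int sizePx) {
        super(document);
        this.family = family;
        this.sizePx = sizePx;
    }

    @Override
    Element toElement() {
        Element element = super.toElement(); // the "span" element
        element.setAttribute("style", "font-family: " + family + "; font-size: " + sizePx + "px");
        return element;
    }
}
```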
There are some missing details, but the general idea is sound:
* Use decorators
* Use DOM
* Avoid hard-coding attributes in ways that make them impossible to override without changing the library code base
6 Comments:
Using DOM elements means:
writer.write(getHtmlDecorator().decorate(font));
Would probably have to use something like:
public class HtmlDecorator {
    public String decorate( HtmlDecorator decorator ) throws TransformerException {
        // Element must be org.w3c.dom.Element here: DOMSource does not accept XOM elements
        Element element = decorator.toElement();
        TransformerFactory transFactory = TransformerFactory.newInstance();
        Transformer transformer = transFactory.newTransformer();
        StringWriter buffer = new StringWriter();
        // Emit only the element markup, without an <?xml ...?> declaration
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.transform(new DOMSource(element),
            new StreamResult(buffer));
        return buffer.toString();
    }
}
But, in theory, once the other decorators are in place, the HTML document could be built up as a DOM in a single go, then transformed once at the end into an XHTML document.
This would open up a world of possibilities for developers because once they have access to the DOM, they can make any changes as desired. This could even mitigate the need for decorators. Build a DOM instead of a String.
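A small sketch of the "build a DOM, transform once" approach, using only JDK classes (javax.xml.parsers, javax.xml.transform, org.w3c.dom). The DomReportSketch class, the "cell" class name, and the flat html/body/span structure are all assumptions made for illustration; a real exporter would build a far richer tree.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomReportSketch {
    // Build the whole document as a DOM; no markup strings are written by hand.
    static Document build(String... cellTexts) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        doc.appendChild(html);
        html.appendChild(body);
        for (String text : cellTexts) {
            Element span = doc.createElement("span");
            span.setAttribute("class", "cell");
            span.setTextContent(text); // escaping is left to the serializer
            body.appendChild(span);
        }
        return doc;
    }

    // Serialize once, at the end, as XML-syntax markup.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter buffer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(buffer));
        return buffer.toString();
    }

    // Wraps the checked exceptions so callers can use a single call.
    static String render(String... cellTexts) {
        try {
            return serialize(build(cellTexts));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Between build and serialize, a developer holding the Document could add, remove, or restyle any element before the single transformation runs.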
Hi,
The current HTML exporter in JR Lib is made in such a way that it produces HTML output that closely resembles the PDF output or the native Graphics2D output of a report.
Can you give us an example in which this would not be desired and people would want their reports created with a WYSIWYG tool like Jaspersoft Studio to look different than when they designed them?
I think if we want to open the HTML export to customizations, we would probably do it by taking styling out of the HTML completely and isolating it in replaceable CSS files.
This is not easy to achieve, but I guess it would better suit web developers, for whom CSS is the only thing they need and who don't want to deal with implementing Java interfaces to customize the JR Lib exporter.
Thanks,
Teodor
> The current HTML exporter in JR Lib is made in such a way that it produces HTML output that closely resembles the PDF output or the native Graphics2D output of a report.
And it does an excellent job of it, too.
> Can you give us an example in which this would not be desired and people would want their reports created with a WYSIWYG tool like Jaspersoft Studio to look different than when they designed them?
The Ministry of Education for the Province of British Columbia currently produces student transcripts using an archaic system. The Ministry would like to upgrade the transcripts (i.e., reports) to a modern engine (JasperReports). They want to offer students the ability to see their transcripts to confirm that they are correct before a hard copy is mailed. There are two modes of presentation for the transcript preview: PDF and HTML. The PDF versions are perfect. The HTML versions cannot be sufficiently tailored for a web/mobile experience to match the government standards (for font families, font sizes, and other criteria). This is mostly due to hard-coded styles in the HTML exporter.
Also, due to the repetition of style attributes in the HTML, the web pages themselves are nearing 100k. This is fine for cities, but in rural Yukon, bandwidth still isn't super-speedy and satellite phones remain expensive. The repetition would be eliminated by using classes rather than inline styles everywhere. (Another option is to eliminate styles altogether and allow the CSS to reference the elements by name and hierarchy with the option of having custom class styles applied.) The HTML transcript pages could be closer to 20k in size (with classes or no styles).
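One way the switch from inline styles to classes could work, as a rough sketch: keep a registry that hands out one class name per distinct style string and emits the stylesheet once. StyleRegistry and the generated names (s0, s1, ...) are hypothetical; nothing here reflects how the exporter is actually structured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry: each distinct inline style is written once, as a CSS class.
class StyleRegistry {
    private final Map<String, String> classByStyle = new LinkedHashMap<>();

    // Returns the class name for a style declaration, creating one on first use.
    String classFor(String style) {
        String name = classByStyle.get(style);
        if (name == null) {
            name = "s" + classByStyle.size();
            classByStyle.put(style, name);
        }
        return name;
    }

    // Emits one rule per distinct style, in first-use order.
    String toCss() {
        StringBuilder css = new StringBuilder();
        for (Map.Entry<String, String> entry : classByStyle.entrySet()) {
            css.append('.').append(entry.getValue())
               .append(" { ").append(entry.getKey()).append(" }\n");
        }
        return css.toString();
    }
}
```

Each element would then carry class="s0" instead of repeating the full declaration, and the output of toCss() could go into a replaceable stylesheet.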
> I think if we want to open the HTML export to customizations, we would probably do it by taking styling out of the HTML completely and isolating it in replaceable CSS files.
That's a good first step. Using a DOM would be ideal.
For example:
writer.write("<span style=\"font-family: ");
writer.write(fontFamily);
writer.write("; ");
Can be written:
Element span = createElement("span"); // delegates to document.createElement...
span.setAttribute( "font-family", fontFamily );
Let the DOM transformer worry about semi-colons, escaping strings, and ensuring that opened elements are properly closed (which currently takes a plethora of confounding if statements). That's its job. Hard-coding how HTML elements are created within the exporter is mixing concerns, rather than separating them. The HTML Exporter shouldn't have to know HTML syntax: it should be concerned with building a document, not building an HTML document.
Using a DOM also means that the separate code for the XML and CSV exporters could be eliminated (in favour of simple XSL transformations). You'd keep the classes as shells that are instances of the DOM Exporter, but could gut the contents completely. In other words:
DOMExporter becomes the superclass for HTMLExporter, XMLExporter, CSVExporter, and any other text-based export format required.
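To illustrate the XSL idea, here is a minimal sketch that turns a small XML tree into CSV with the JDK's built-in XSLT 1.0 processor. The table/row/cell element names and the XslCsvSketch class are invented for the example, the input is passed as a string for brevity (a DOMSource should work the same way), and the stylesheet does no CSV quoting.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslCsvSketch {
    // XSLT 1.0 stylesheet: one output line per row, cells separated by commas.
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/table'>"
        + "<xsl:for-each select='row'>"
        + "<xsl:for-each select='cell'>"
        + "<xsl:value-of select='.'/>"
        + "<xsl:if test='position() != last()'>,</xsl:if>"
        + "</xsl:for-each>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the CSV text.
    static String toCsv(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under this design, a CSV or XML exporter shell would differ from the HTML one mainly in which stylesheet it applies to the shared document.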
> This is not easy to achieve, but I guess it would better suit web developers, for whom CSS is the only thing they need and who don't want to deal with implementing Java interfaces to customize the JR Lib exporter.
Mostly time consuming, since the logic is in place. Migrating the code from a StringBuilder/StringBuffer to a DOM would be arduous. Once finished, though, it'd be trivial to add features such as custom styles, removing styles, etc.
Hi,
Thanks for the detailed explanation.
While the way we produce the HTML is not the cleanest, it has an advantage in terms of memory consumption.
We are concerned that by using DOM we would consume significant amounts of memory when generating big HTML outputs.
We do not change the HTML exporter very often and maintaining the code we already have is not such a burden, although I have to admit we recently had a bug about mismatched closing tags.
But that is beside the point. The complexity of the problem is not in the way we produce the resulting HTML document, but in the way we have to separate two types of markup: HTML tags and CSS styles.
We have a POC that we created a few years ago but had to put aside due to other priorities, in which we tried to anticipate some of the problems introduced by separating content and styles.
That's why I said it is not trivial. We are probably going to go back to that POC soon, if we are to solve this issue once and for all.
As for the other formats that would benefit from DOM-based export, I'm not sure either.
It looks like, as far as CSV, JSON and XML exports are concerned, people are more interested in producing pure data outputs in which only a limited number of values are supposed to appear, usually in the form of data tables. Our initial exporters, which were layout-based, did not serve this purpose well. The HTML exporter, even when created with DOM, would produce a layout-based output, in which the place where the value appears matters most.
But for pure data extraction, it works better to simply ignore layout and export values that have been marked up for export using special marking properties at report design time.
For these, we have the so-called metadata exporters for CSV and JSON, and soon we'll have a similar one for XML too. Reusing the DOM-based technique from the HTML exporter would not be feasible or would not help much, as the export logic is much simpler.
Also, I think the main problem is not whether or not we use DOM to generate the HTML.
The problem is about separating content and style.
We'll have a go at it again soon and let you know.
Thanks,
Teodor