Virtual Data Sources

Virtual data sources have two usage scenarios:

They allow you to combine multiple JDBC and JNDI data sources and make them available to be joined through a Domain. You can combine any number of data sources, including schemas from different databases, databases from different vendors, and different schemas within the same database into a single object.
Virtual data sources also make Cassandra, Hadoop-Hive, and MongoDB data sources available in Domains. In this usage, the virtual data source acts as a wrapper for the big data source, with certain limitations described in Big Data Connectors for Virtual Data Sources.

A virtual data source may wrap a single big data source, or combine any number of JDBC, JNDI, and big data sources.

Once you have created a virtual data source, you create a Domain that joins tables across the data sources to define the relationships between the data. Ad Hoc views and reports based on the Domain can access the combined data transparently. For more information about Domains, see the chapter "Creating Domains" in the JasperReports Server User Guide.

Figure: Virtual Data Source Scenario

When you combine data sources into a virtual data source, you select an alias for each data source you include; this alias is added as a prefix to the tables in the original data source to ensure that table names are unique across the virtual data source.
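The effect of alias prefixes can be sketched in a few lines of Python. This is only an illustration: the function name prefixed_tables, the underscore separator, and the table names are assumptions made for the example, not the exact naming scheme the server applies.

```python
def prefixed_tables(sources, separator="_"):
    """Return a mapping of (alias, table) to the prefixed table name.

    `sources` maps each alias to the table names in that data source.
    The separator is an assumption for illustration only.
    """
    names = {}
    for alias, tables in sources.items():
        # Aliases may not contain spaces.
        if " " in alias:
            raise ValueError(f"alias must not contain spaces: {alias!r}")
        for table in tables:
            names[(alias, table)] = f"{alias}{separator}{table}"
    return names


# Hypothetical table lists; both sources contain a table called "account".
sources = {
    "Foodmart": ["customer", "account"],
    "SugarCRM": ["account"],
}
names = prefixed_tables(sources)
for (alias, table), full_name in names.items():
    print(f"{alias}.{table} -> {full_name}")
```

In this sketch, the two tables named account become distinct names because each one carries the alias of its own data source.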

To create a virtual data source:

1. Log on as an administrator.
2. Click View > Repository, expand the folder tree, and right-click a folder to select Add Resource > Data Source from the context menu. Alternatively, you can select Create > Data Source from the main menu on any page and specify a folder location later. If you have installed the sample data, the suggested folder is Data Sources.

The New Data Source page appears.

3. In the Type field, select Virtual Data Source.
4. Fill in the other required fields, along with any optional information.
5. Locate the data sources you want to use in the Available Data Sources pane. Double-click each desired data source to select it. The data source is shown in the Selected Data Sources pane.
6. Change the aliases by editing them directly in the Alias column, if desired. The alias is used to identify the selected data source within the virtual data source; it is also added as a prefix to the name of each table in that data source. Spaces are not allowed in aliases.
7. Click Save. The data source appears in the repository.

The following figure shows values for creating a virtual data source by combining two of the databases included in the sample data: the Foodmart database and the SugarCRM database.

Figure: Creating a Virtual Data Source

Virtual data sources cannot use the Time Zone field that may be set on individual data sources. If used in a virtual data source, a target data source with a time zone will not return the expected date/time values. Therefore, Jaspersoft recommends that you do not use data sources with time zone settings in a virtual data source.

You can edit a virtual data source to add or remove the data sources it uses. If the virtual data source is used by a Domain, you can add data sources, but you cannot remove them. Removing a data source from a virtual data source modifies only the virtual data source; the data source you removed remains in the repository.

To edit a virtual data source:

1. Log on as an administrator.
2. Click View > Repository and expand the folder tree to locate the folder containing the data source.
3. Right-click the data source and select Edit from the context menu.
4. To add a data source, locate the data source in the Available Data Sources pane and double-click. To remove a data source, select it in the Selected Data Sources pane and click the left arrow.
5. Click Save. If you attempt to remove a data source from a virtual data source that is used by a Domain, you receive a warning and the virtual data source is left unchanged. Otherwise, the virtual data source is updated in the repository.

Performance Considerations

Virtual data sources use the Teiid engine to query multiple data sources and combine their results. How you design the tables in each of your data sources and how you combine them in a Domain join can significantly affect performance.

There are several issues to keep in mind when designing a solution that combines several databases in a virtual data source:

The virtual data source runs in JasperReports Server, where it must allocate memory and use processing resources to handle result sets.
The Teiid engine uses its own cache for data accessed through a virtual data source. You can clear the Teiid cache at the same time as the Ad Hoc cache, as described in Ad Hoc Cache Management.
The virtual data source can push down certain operations (joins, filters, and aggregations) to the databases; others it must perform in memory. The more it can push down, the smaller the datasets that it must handle in memory.
You should design your schema so that as much of the query processing as possible can be pushed down to the databases, and so that the virtual data source handles as little data as possible in a join between data sources.

For example, if you create a duplicate copy of a date dimension table in both of your data sources, the data sources can be joined much more efficiently. If this table existed in only one database, the Teiid engine would need to retrieve all the rows from the other database and perform the join itself. With this table in both databases, the Teiid engine can push down all of the time-dimension joins to the individual databases and perform a final join on much smaller datasets.
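The difference can be illustrated with a small simulation. The sketch below does not use Teiid: it uses two in-memory SQLite databases to stand in for two data sources, and plain Python to stand in for the engine. The table names (time_dim, sales, leads), the row counts, and the helper functions are all assumptions made for the example.

```python
import sqlite3


def make_db(fact_table, fact_rows):
    """Create an in-memory database holding a date dimension and one fact table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE time_dim (date_id INTEGER PRIMARY KEY, year INTEGER)")
    # 300 hypothetical days spread evenly over three years.
    db.executemany("INSERT INTO time_dim VALUES (?, ?)",
                   [(d, 2020 + d // 100) for d in range(300)])
    db.execute(f"CREATE TABLE {fact_table} (date_id INTEGER, amount INTEGER)")
    db.executemany(f"INSERT INTO {fact_table} VALUES (?, ?)", fact_rows)
    return db


# Two independent databases standing in for two data sources.
sales_db = make_db("sales", [(i % 300, 10) for i in range(3000)])
leads_db = make_db("leads", [(i % 300, 1) for i in range(1500)])


def yearly_totals(db, fact_table):
    """Join and aggregate inside the database, returning one row per year."""
    sql = (f"SELECT t.year, SUM(f.amount) FROM {fact_table} f "
           "JOIN time_dim t ON t.date_id = f.date_id "
           "GROUP BY t.year ORDER BY t.year")
    return db.execute(sql).fetchall()


# Case 1: the dimension exists in both databases, so each join is pushed down
# and only the small per-year results reach the "engine" for the final join.
sales_by_year = yearly_totals(sales_db, "sales")
leads_by_year = yearly_totals(leads_db, "leads")
pushed_rows = len(sales_by_year) + len(leads_by_year)
leads_lookup = dict(leads_by_year)
combined = [(year, sales, leads_lookup[year]) for year, sales in sales_by_year]

# Case 2: pretend time_dim exists only in sales_db. The "engine" must fetch
# every leads row plus the dimension rows and perform the join itself.
raw_leads = leads_db.execute("SELECT date_id, amount FROM leads").fetchall()
dim_rows = sales_db.execute("SELECT date_id, year FROM time_dim").fetchall()
unpushed_rows = len(sales_by_year) + len(raw_leads) + len(dim_rows)
year_of = dict(dim_rows)
leads_in_memory = {}
for date_id, amount in raw_leads:
    year = year_of[date_id]
    leads_in_memory[year] = leads_in_memory.get(year, 0) + amount

print("combined result:", combined)
print("rows transferred with push-down:", pushed_rows)
print("rows transferred without push-down:", unpushed_rows)
```

Both cases produce the same yearly totals, but in this toy data the pushed-down variant moves 6 rows to the engine, while the other moves 1,803 rows and also performs the join in memory. Real figures depend on your schema and on what the engine is able to push down.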

For more information about optimizing your data for use in Teiid, see http://www.jboss.org/teiid/.

Logging for Virtual Data Sources

If you have issues with your big data connections through virtual data sources, you can enable logging in the following classes:

com.jaspersoft.jasperserver.api.common.virtualdatasourcequery.VirtualDataSourceQueryService

com.jaspersoft.jasperserver.api.engine.common.virtualdatasourcequery.teiid.TeiidEmbeddedServer

For information about enabling logging, see Configuring System Logs.
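As a rough sketch, if your installation configures logging through a log4j 1.x-style properties file, entries for the two classes above might look like the following. The property syntax and the DEBUG level are assumptions; the actual file name, location, and format depend on your server version, so treat Configuring System Logs as the authoritative reference.

```properties
# Assumed log4j 1.x-style syntax; adapt to the logging configuration format of your installation.
log4j.logger.com.jaspersoft.jasperserver.api.common.virtualdatasourcequery.VirtualDataSourceQueryService=DEBUG
log4j.logger.com.jaspersoft.jasperserver.api.engine.common.virtualdatasourcequery.teiid.TeiidEmbeddedServer=DEBUG
```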
