Virtual data sources have two usage scenarios:
| • | They allow you to combine multiple JDBC and JNDI data sources and make them available to be joined through a Domain. You can combine any number of data sources, including schemas from different databases, databases from different vendors, and different schemas within the same database into a single object. |
| • | They make Cassandra, Hadoop-Hive, and MongoDB data sources available in Domains. In this case, the virtual data source acts as a wrapper for the data source for big data, with certain limitations described in Big Data Connectors for Virtual Data Sources. |
A virtual data source can wrap a single data source for big data, or combine any number of JDBC, JNDI, and big data sources.
Once you have created a virtual data source, you create a Domain that joins tables across the data sources to define the relationships among the data. Ad Hoc views and reports based on the Domain can access the combined data transparently. For more information about Domains, see the chapter "Creating Domains" in the JasperReports Server User Guide.
[Figure: Virtual Data Source Scenario]
When you combine data sources into a virtual data source, you select an alias for each data source you include; this alias is added as a prefix to the tables in the original data source to ensure that table names are unique across the virtual data source.
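The effect of alias prefixing can be sketched as follows. This is only an illustration, not the server's implementation: the aliases `sales` and `crm`, the table names, and the underscore separator are all hypothetical, and the exact form of the prefixed name in your Domain may differ. The check for spaces reflects the rule that aliases cannot contain spaces.

```python
def qualify_tables(sources, separator="_"):
    """Prefix every table name with the alias of its data source.

    `sources` maps an alias to the list of table names in that data source.
    The separator is an assumption made for this sketch.
    """
    qualified = {}
    for alias, tables in sources.items():
        if " " in alias:
            # Aliases may not contain spaces.
            raise ValueError(f"alias must not contain spaces: {alias!r}")
        for table in tables:
            qualified[f"{alias}{separator}{table}"] = (alias, table)
    return qualified


# Both hypothetical data sources contain a table named "account".
sources = {
    "sales": ["account", "orders"],
    "crm": ["account", "contacts"],
}
qualified = qualify_tables(sources)

for name in sorted(qualified):
    print(name)
```

Without the prefix, the two `account` tables would collide; with it, each table keeps a distinct name that also shows which data source it came from.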
To create a virtual data source:
| 1. | Log on as an administrator. |
| 2. | Click View > Repository, expand the folder tree, and right-click a folder to select Add Resource > Data Source from the context menu. Alternatively, you can select Create > Data Source from the main menu on any page and specify a folder location later. If you have installed the sample data, the suggested folder is Data Sources. The New Data Source page appears. |
| 3. | In the Type field, select Virtual Data Source. |
| 4. | Locate the data sources you want to use in the Available Data Sources pane. Double-click to select each chosen data source. The data source is shown in the Selected Data Sources pane. |
| 5. | Change the aliases by editing them directly in the Alias column (optional). The alias identifies the selected data source within the virtual data source; it is also added as a prefix to the name of each table in that data source. Spaces are not allowed in aliases. |
The following figure shows values for creating a virtual data source by combining two of the databases included in the sample data: the Foodmart database and the SugarCRM database.
[Figure: Creating a Virtual Data Source]
| Virtual data sources cannot use the Time Zone field that may be set on individual data sources. If used in a virtual data source, a target data source with a time zone will not return the expected date/time values. Therefore, we recommend that you do not use data sources with time zone settings in a virtual data source. |
| 6. | Click Save. The Save dialog appears. |
[Figure: Saving a Virtual Data Connection]
| 7. | Enter the data source name and, optionally, a description. The Resource ID is generated from the name you enter. |
| 8. | Expand the folder tree and select the location for your data source, then click Save. The data source appears in the repository. |
You can edit a virtual data source to add or remove the data sources it uses. If the virtual data source is used by a Domain, you can add data sources, but you cannot remove them. Removing a data source from a virtual data source modifies only the virtual data source; the data source you removed remains in the repository.
To edit a virtual data source:
| 1. | Log on as an administrator. |
| 2. | Click View > Repository and expand the folder tree to locate the folder containing the data source. |
| 3. | Right-click the data source and select Edit from the context menu. |
| 4. | To add a data source, locate the data source in the Available Data Sources pane and double-click. To remove a data source, select it in the Selected Data Sources pane and click the left arrow. |
| 5. | Click Save. If you try to delete a data source from a virtual data source that is used by a Domain, you'll receive a warning and the data source will remain unchanged. Otherwise, the data source is updated in the repository. |
Performance Considerations
Virtual data sources use the Teiid engine to handle multiple data sources and combine their results. How you design the tables in each of your data sources and how you combine them in a Domain join can significantly affect performance.
Consider these issues when designing a solution that combines several databases in a virtual data source:
| • | When using a virtual data source in a Domain, you must select at least one of the tables from the Tables tab to appear in the Domain before using it to create a derived table. Virtual data sources are optimized to use the smallest set of metadata, but they do not detect Domains that use only derived tables. |
| • | The virtual data source runs in JasperReports Server and must allocate memory and use processing to handle result sets. |
| • | The Teiid engine uses its own cache for data accessed through a virtual data source. You can clear the Teiid cache at the same time as the Ad Hoc cache, as described in Ad Hoc Cache Management. |
| • | The virtual data source can push down certain operations (joins, filters, and aggregations) to the databases; others it must perform in memory. The more it can push down, the smaller the datasets it handles in memory. |
| • | You should design your schema to maximize the processing pushed to the database and minimize data handled in the virtual data source in a join between data sources. |
For example, if you create a duplicate copy of a date dimension table in both of your data sources, the data sources can be joined much more efficiently. If this table existed in only one database, the Teiid engine would need to retrieve all the rows from the other database and perform the join itself. With this table in both databases, the Teiid engine can push down all of the date-dimension joins to the individual databases and perform a final join on much smaller datasets.
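The idea behind the duplicated date dimension can be sketched with two in-memory SQLite databases standing in for the separate data sources. This does not use Teiid and is not how the server is implemented; the table names, columns, and values are hypothetical. Each database joins its own fact table to its own copy of the date dimension and aggregates locally (the pushed-down work), so only one row per year has to be combined in memory.

```python
import sqlite3


def make_db(fact_rows):
    """Create an in-memory database with a fact table and its own copy of the date dimension."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, year INTEGER)")
    db.executemany("INSERT INTO date_dim VALUES (?, ?)", [(1, 2023), (2, 2023), (3, 2024)])
    db.execute("CREATE TABLE fact (date_id INTEGER, amount REAL)")
    db.executemany("INSERT INTO fact VALUES (?, ?)", fact_rows)
    return db


# Join and aggregation that each database can run on its own.
PUSHED_DOWN = """
    SELECT d.year, SUM(f.amount)
    FROM fact f JOIN date_dim d ON d.date_id = f.date_id
    GROUP BY d.year
"""


def yearly_totals(db):
    """Run the pushed-down query and return {year: total}."""
    return dict(db.execute(PUSHED_DOWN).fetchall())


sales = make_db([(1, 100.0), (2, 50.0), (3, 75.0)])
support = make_db([(1, 10.0), (3, 20.0), (3, 5.0)])

sales_by_year = yearly_totals(sales)
support_by_year = yearly_totals(support)

# Final join in memory, on one row per year instead of every fact row.
combined = {
    year: (sales_by_year[year], support_by_year.get(year))
    for year in sorted(sales_by_year)
}
print(combined)
```

If the date dimension existed in only one of the two databases, the per-year aggregation could not run inside the other database, and all of its fact rows would have to be fetched before joining.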
For more information about optimizing your data for use in Teiid, see http://www.jboss.org/teiid/.
Logging for Virtual Data Sources
If you have issues with your big data connections through virtual data sources, you can enable logging in the following classes:
com.jaspersoft.jasperserver.api.common.virtualdatasourcequery.VirtualDataSourceQueryService
com.jaspersoft.jasperserver.api.engine.common.virtualdatasourcequery.teiid.TeiidEmbeddedServer
For information about enabling logging, see Configuring System Logs.
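As an illustration only, if your deployment configures logging through a log4j 1.x-style properties file, entries like the following would set both classes to the DEBUG level. The file format is an assumption; some releases use log4j2 syntax instead, so confirm the file and syntax for your version in Configuring System Logs.

```properties
# Assumed log4j 1.x-style syntax; adjust for your server version.
log4j.logger.com.jaspersoft.jasperserver.api.common.virtualdatasourcequery.VirtualDataSourceQueryService=DEBUG
log4j.logger.com.jaspersoft.jasperserver.api.engine.common.virtualdatasourcequery.teiid.TeiidEmbeddedServer=DEBUG
```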