Wednesday, April 15, 2009

Data Integrator is confused by Oracle

When recently experimenting with the Business Objects ETL product Data Integrator [BoDi] (part of the Data Services suite), my college and I encountered some strange behavior when BoDi talks to an Oracle database.

Within BoDi a database is accessed through a DataStore object. This DataStore object is defined by a 'Connection Name' (e.g. an entry out of TNS Names), a user and a password.

Oracle makes use of schemas and schema owners (and rights of users on - part of - a schema).
So the user used within a DataStore definition must be configured within Oracle to have some rights within the database, and these rights are mostly limited to a subset of all the schemas present or even part of some schema.

Now when BoDi reads from 2 different schema's within the same database (by means of different users), the DataStore 'Connection Name' will be the same (it's the same Oracle database).

Because of this equal 'Connection Name', BoDi mistakenly makes the assumption that any of these 2 DataStore users will have visibility on each other schema content.
This is not necessarilly so! A user might have only visibility on a single schema.

But because of it's assumption BoDi will attempt to perform any join that is defined between tables of these 2 schemas in a single sql query (pushdown) to be performed by the database.
This is very nice performance wise as the join is performed by the database, but if the users do not have visibility on all used tables then this will not work.

There are some workarounds (e.g. forcing BoDi to make 2 select queries and performing the join itself - at a performance cost - by defining Aliasses on the DataStore).
[Although it's not clear to me what this Alias exactly does or is used for :-( ]

The workaround we've chosen is to use a single user that has rights on all schema's within a database so that the pushdown queries always work, gaining us the database performance.

No comments:

Post a Comment