Data mediation

Data mediation aims to access or evaluate queries on heterogeneous data sources. It is a subdomain of data integration with a focus on performing efficient data retrieval.

Mediator optimization

The mediators are the mediation systems.

Query execution plans

They usually take as input a query execution plan and returns its answers. This input plan is defined by a logical plan, which is a tree where the nodes are operators on the data. For example in relational plan, an operator can be a projection, a join or a selection. Performing this plan requires to choose an implementation for each operator of the logical plan. We obtain a physical plan, when the implementation of every operator of the plan are fixed.

Cost of query plan

In order to optimize the execution of a query plan P, the mediator have to find the cheapest physical plan among the space of every physical plan that are equivalent to P. It requires a cost model on the physical plans. Usually, the cost of a logical plan is defined as the minimal cost of its implementations.

The monotony assumption on cost model is the fact that if a plan P' is obtained from the plan P by replacing a subplan P1 of P by a cheaper and equivalent subplan P2, then P' is cheaper than P DBLP:conf/sigmod/FlorescuLMS99.

Binding Patterns

In some cases, a relation can be accessed only if we provide some inputs. The combinations of input and output columns for each relation is called binding patterns. Given a relation symbol R, the binding patterns of R is a set of mappings from the columns of R to {b,f} (bounded and free). If a column is mapped to b, we say that the column is bounded, inversely we say that the column is free. Bounded columns are inputs and free ones are outputs.

Given a conjunctive query on a set of relation with binding patterns, some question arises:

  1. Is there an execution for the conjunctive query, which is valid according to the binding patterns
  2. How to enumerate the set of valid execution plans

If there exists a valid plan, then there exists a valid plan whose each non-atomic subplan is also valid DBLP:conf/sigmod/FlorescuLMS99. A such plan may need to contain some cartesian products.

Valid execution plan

There is a greedy algorithm to check whether a query has a valid execution plan DBLP:conf/pods/Morris88.

Viable subplan

A subplan p of a query q is said called viable if there exists a valid plan of q that extends p.

For a conjunctive query q, a subplan p is viable if and only if there exists a valid plan of q' with q' being the query obtained from q by replacing all the atoms in p by one atom having the same variable and the same access as p DBLP:conf/sigmod/FlorescuLMS99.

This post accepts webmentions. Do you have the URL to your post?

Otherwise, send your comment on my service.

Or interact from the fediverse with your username:

fediverse logo Share on the Fediverse