Lossy Join Decomposition :
The best refinement technique for the schema design is Decomposition. Decomposition refers to decompose or break-down of the relational-schema that has many attributes into several schemas with fewer attributes. We should take care some desirable properties while doing decomposition. If we do the careless decomposition, then it will lead to a bad design again.
Careless Decomposition or Lossy Join Decomposition :
Consider a Schema relation which has many attributes and results into redundancy. Let us apply Decomposition
|
Supplier_Parts :
| S# |
Sname |
City |
P# |
Qty |
| 3 |
Smith |
London |
301 |
20 |
| 5 |
Nick |
NY |
500 |
50 |
| 2 |
Steve |
Boston |
20 |
10 |
| 5 |
Nick |
NY |
400 |
40 |
| 5 |
Nick |
NY |
301 |
10 |
|
| ⇓ |
|
Parts :
| P# |
Qty |
| 301 |
20 |
| 500 |
50 |
| 20 |
10 |
| 400 |
40 |
| 301 |
10 |
|
Supplier :
| S# |
Sname |
City |
Qty |
| 3 |
Smith |
London |
20 |
| 5 |
Nick |
NY |
50 |
| 2 |
Steve |
Boston |
10 |
| 5 |
Nick |
NY |
40 |
| 5 |
Nick |
NY |
10 |
|
The above decomposition is a careless decomposition or Lossy join Decomposition. Because, Let us apply natural join operation on the decomposed relations.
Parts ⋈ Supplier
|
Parts ⋈ Supplier :
| S# |
Sname |
City |
P# |
Qty |
| 3 |
Smith |
London |
301 |
20 |
| 5 |
Nick |
NY |
500 |
50 |
| 5 |
Nick |
NY |
20 |
10 |
| 2 |
Steve |
Boston |
20 |
10 |
| 5 |
Nick |
NY |
400 |
40 |
| 5 |
Nick |
NY |
301 |
10 |
| 2 |
Steve |
Boston |
301 |
10 |
|
≠ Supplier_Parts |
Although, every tuple that appears in the
Supplier_Parts relation appears in
Parts ⋈ Supplier, there are tuples in
Parts ⋈ Supplier that are not in
Supplier_Parts. The spurious tuples or the extra tuples that are not in the relation are :
- (5, Nick, NY, 20, 10)
- (2, Steve, Boston, 301, 10)
A closer look on the relation
Parts ⋈ Supplier, will lead to wrong data and therefore, we have less and misleading information.
Definition of Lossy Join Decomposition :
Let
R be the relational schema with instance
r is decomposed into
R1,R2,….,Rn with instance
r1,r2,…..,rn.
If
r1 ⋈ r2 ⋈ ……. ⋈ rn ⊃ r , then it is called Lossy Join Decomposition.
i.e. if the original relation is the proper subset of natural joins of all the decompositions, then it is said to be Lossy Join Decomposition.
In the above example, we can say that, Supplier_Parts is the subset of natural join of parts and supplier, so we can say that
Parts ⋈ Supplier ⊃ Supplier_Parts and therefore, the decomposition is lossy join decomposition.
Why Lossy Join Decomposition is called Lossy although the relation is getting extra tuples ?
⇒ Because we are loosing original Data.
In short, we design such system such that these undesirable properties do not occur in decomposition. Some desirable properties of decomposition are discussed in next post.]]>