SQL-92 outer join vs T-SQL outer join (6.5 or 7.0) - test script included

SQL-92 outer join vs T-SQL outer join (6.5 or 7.0) - test script included

Post by David Edelma » Wed, 01 May 2002 03:50:11



Take the following scenario:

We have two tables that have somewhat of a parent-child
relationship. We are trying to use a SQL-92 outer join
that returns the same results as a TSQL *= outer join. The
difficulty we are having is that some of the parent
records do not have any corresponding child records, but
we still want to see those parent records with 0 (zero)
for the count. How can we accomplish this with a SQL-92
compliant join (if it is even possible)? In the query
results below, we would like the first set of results.

As a secondary question, if we take the WHERE clause off
of the SQL-92 query below, we get the same results as we
would with the T-SQL query and no filtering on the child
table.  Why does a WHERE clause on a child (inner) table
affect the functionality of the SQL-92 OUTER JOIN?

Thanks in advance for any help.
-David Edelman

Test script below, followed by results
===========================================
create table parent (p_id int NOT NULL)
go
create table child (p_id int NOT NULL, c_type varchar(6)
NULL)
go
insert parent values (1)
insert parent values (2)
insert parent values (3)
insert parent values (4)
insert parent values (5)
insert parent values (6)
insert parent values (7)
insert parent values (8)
insert parent values (9)
insert parent values (10)
go

insert child values (1, 'AAA')
insert child values (1, 'BBB')
insert child values (1, 'CCC')
insert child values (2, 'AAA')
insert child values (4, 'AAA')
insert child values (4, 'DEF')
insert child values (4, 'AAA')
insert child values (4, 'BBB')
insert child values (5, 'AAA')
insert child values (5, 'AAA')
insert child values (6, 'AAA')
insert child values (7, 'AAA')
insert child values (7, 'BBB')
insert child values (7, 'CCC')
insert child values (7, 'DDD')
insert child values (10, 'AAA')
insert child values (10, 'CCC')
go

select p.p_id, count(c.p_id) as num_rows
from parent p, child c
where p.p_id *= c.p_id
and c.c_type in ('AAA', 'BBB')
group by p.p_id

select p.p_id, count(c.p_id) as num_rows
from parent p left outer join child c on p.p_id = c.p_id
where c.c_type in ('AAA', 'BBB')
group by p.p_id

=========================================
Results:
(T-SQL *= outer join)
p_id num_rows
----------- -----------
1 2
2 1
3 0
4 3
5 2
6 1
7 2
8 0
9 0
10 1

(SQL-92 outer join)
Warning: Null value eliminated from aggregate.
p_id num_rows
----------- -----------
1 2
2 1
4 3
5 2
6 1
7 2
10 1

 
 
 

SQL-92 outer join vs T-SQL outer join (6.5 or 7.0) - test script included

Post by Anith Se » Wed, 01 May 2002 04:02:47


select p.p_id, count(c.p_id) as num_rows
from parent p left outer join (select * from child
where c_type in ('AAA', 'BBB')) as c on p.p_id = c.p_id
group by p.p_id

Thanks
Anith


Quote:> Take the following scenario:

> We have two tables that have somewhat of a parent-child
> relationship. We are trying to use a SQL-92 outer join
> that returns the same results as a TSQL *= outer join. The
> difficulty we are having is that some of the parent
> records do not have any corresponding child records, but
> we still want to see those parent records with 0 (zero)
> for the count. How can we accomplish this with a SQL-92
> compliant join (if it is even possible)? In the query
> results below, we would like the first set of results.

> As a secondary question, if we take the WHERE clause off
> of the SQL-92 query below, we get the same results as we
> would with the T-SQL query and no filtering on the child
> table.  Why does a WHERE clause on a child (inner) table
> affect the functionality of the SQL-92 OUTER JOIN?

> Thanks in advance for any help.
> -David Edelman

> Test script below, followed by results
> ===========================================
> create table parent (p_id int NOT NULL)
> go
> create table child (p_id int NOT NULL, c_type varchar(6)
> NULL)
> go
> insert parent values (1)
> insert parent values (2)
> insert parent values (3)
> insert parent values (4)
> insert parent values (5)
> insert parent values (6)
> insert parent values (7)
> insert parent values (8)
> insert parent values (9)
> insert parent values (10)
> go

> insert child values (1, 'AAA')
> insert child values (1, 'BBB')
> insert child values (1, 'CCC')
> insert child values (2, 'AAA')
> insert child values (4, 'AAA')
> insert child values (4, 'DEF')
> insert child values (4, 'AAA')
> insert child values (4, 'BBB')
> insert child values (5, 'AAA')
> insert child values (5, 'AAA')
> insert child values (6, 'AAA')
> insert child values (7, 'AAA')
> insert child values (7, 'BBB')
> insert child values (7, 'CCC')
> insert child values (7, 'DDD')
> insert child values (10, 'AAA')
> insert child values (10, 'CCC')
> go

> select p.p_id, count(c.p_id) as num_rows
> from parent p, child c
> where p.p_id *= c.p_id
> and c.c_type in ('AAA', 'BBB')
> group by p.p_id

> select p.p_id, count(c.p_id) as num_rows
> from parent p left outer join child c on p.p_id = c.p_id
> where c.c_type in ('AAA', 'BBB')
> group by p.p_id

> =========================================
> Results:
> (T-SQL *= outer join)
> p_id num_rows
> ----------- -----------
> 1 2
> 2 1
> 3 0
> 4 3
> 5 2
> 6 1
> 7 2
> 8 0
> 9 0
> 10 1

> (SQL-92 outer join)
> Warning: Null value eliminated from aggregate.
> p_id num_rows
> ----------- -----------
> 1 2
> 2 1
> 4 3
> 5 2
> 6 1
> 7 2
> 10 1


 
 
 

SQL-92 outer join vs T-SQL outer join (6.5 or 7.0) - test script included

Post by Dieter N?t » Wed, 01 May 2002 04:17:58


The WHERE clause is applied _after_ the join, so you compare to NULLed
values, which evaluates to UNKNOWN/FALSE. The result is the same as an INNER
join. Wait for Joe Celko to post an article ;-)

So for your expected result set, just AND the restriction to the join
condition:

select p.p_id, count(c.p_id) as num_rows
from parent p left outer join child c on p.p_id = c.p_id
and c.c_type in ('AAA', 'BBB')
group by p.p_id

Dieter


> Take the following scenario:

> We have two tables that have somewhat of a parent-child
> relationship. We are trying to use a SQL-92 outer join
> that returns the same results as a TSQL *= outer join. The
> difficulty we are having is that some of the parent
> records do not have any corresponding child records, but
> we still want to see those parent records with 0 (zero)
> for the count. How can we accomplish this with a SQL-92
> compliant join (if it is even possible)? In the query
> results below, we would like the first set of results.

> As a secondary question, if we take the WHERE clause off
> of the SQL-92 query below, we get the same results as we
> would with the T-SQL query and no filtering on the child
> table.  Why does a WHERE clause on a child (inner) table
> affect the functionality of the SQL-92 OUTER JOIN?

> Thanks in advance for any help.
> -David Edelman

> Test script below, followed by results
> ===========================================
> create table parent (p_id int NOT NULL)
> go
> create table child (p_id int NOT NULL, c_type varchar(6)
> NULL)
> go
> insert parent values (1)
> insert parent values (2)
> insert parent values (3)
> insert parent values (4)
> insert parent values (5)
> insert parent values (6)
> insert parent values (7)
> insert parent values (8)
> insert parent values (9)
> insert parent values (10)
> go

> insert child values (1, 'AAA')
> insert child values (1, 'BBB')
> insert child values (1, 'CCC')
> insert child values (2, 'AAA')
> insert child values (4, 'AAA')
> insert child values (4, 'DEF')
> insert child values (4, 'AAA')
> insert child values (4, 'BBB')
> insert child values (5, 'AAA')
> insert child values (5, 'AAA')
> insert child values (6, 'AAA')
> insert child values (7, 'AAA')
> insert child values (7, 'BBB')
> insert child values (7, 'CCC')
> insert child values (7, 'DDD')
> insert child values (10, 'AAA')
> insert child values (10, 'CCC')
> go

> select p.p_id, count(c.p_id) as num_rows
> from parent p, child c
> where p.p_id *= c.p_id
> and c.c_type in ('AAA', 'BBB')
> group by p.p_id

> select p.p_id, count(c.p_id) as num_rows
> from parent p left outer join child c on p.p_id = c.p_id
> where c.c_type in ('AAA', 'BBB')
> group by p.p_id

> =========================================
> Results:
> (T-SQL *= outer join)
> p_id num_rows
> ----------- -----------
> 1 2
> 2 1
> 3 0
> 4 3
> 5 2
> 6 1
> 7 2
> 8 0
> 9 0
> 10 1

> (SQL-92 outer join)
> Warning: Null value eliminated from aggregate.
> p_id num_rows
> ----------- -----------
> 1 2
> 2 1
> 4 3
> 5 2
> 6 1
> 7 2
> 10 1

 
 
 

SQL-92 outer join vs T-SQL outer join (6.5 or 7.0) - test script included

Post by Joe Celk » Wed, 01 May 2002 05:15:22


Here is how OUTER JOINs work in SQL-92.  Assume you are given:

Table1       Table2
 a   b        a   c
 ======       ======
 1   w        1   r
 2   x        2   s
 3   y        3   t
 4   z

and the outer join expression:

 Table1
 LEFT OUTER JOIN
 Table2
 ON Table1.a = Table2.a      <== join condition
    AND Table2.c = 't';      <== single table condition

We call Table1 the "preserved table" and Table2 the "unpreserved table"
in the query.  What I am going to give you is a little different, but
equivalent to the ANSI/ISO standards.

1)  We build the CROSS JOIN of the two tables.  Scan each row in the
result set.  

2) If the predicate tests TRUE for that row, then you keep it. You also
remove all rows derived from it from the CROSS JOIN

3) If the predicate tests FALSE or UNKNOWN for that row, then keep the
columns from the preserved table, convert all the columns from the
unpreserved table to NULLs and remove the duplicates.

So let us execute this by hand:


 Let * = passed the second predicate

 Table1 CROSS JOIN Table2
 a   b        a   c
 =========================

 1   w       2   s
 1   w       3   t *
 2   x       1   r

 2   x       3   t *
 3   y       1   r
 3   y       2   s

 4   z       1   r
 4   z       2   s
 4   z       3   t *

 Table1 LEFT OUTER JOIN Table2
 a   b        a   c
 =========================
 3   y     3      t      <= only TRUE row
 -----------------------
 1   w     NULL   NULL   Sets of duplicates
 1   w     NULL   NULL
 1   w     NULL   NULL
 -----------------------
 2   x     NULL   NULL
 2   x     NULL   NULL
 2   x     NULL   NULL
 3   y     NULL   NULL  <== derived from the TRUE set - Remove  
 3   y     NULL   NULL
 -----------------------
 4   z     NULL   NULL
 4   z     NULL   NULL
 4   z     NULL   NULL

the final results:

 Table1 LEFT OUTER JOIN Table2
 a   b        a   c
 =========================
 1   w     NULL   NULL
 2   x     NULL   NULL
 3   y     3      t
 4   z     NULL   NULL

The basic rule is that every row in the preserved table is represented
in the results in at least one result row.

There are limitations and very serious problems with the extended
equality version of an outer join used in some diseased mutant products.
Consider the two Chris Date tables

 Suppliers        SupParts
 supno             supno partno qty
 =========        ==============
 S1               S1   P1    100
 S2               S1   P2    250
 S3               S2   P1    100
                  S2   P2    250

and let's do an extended equality outer join like this:

 SELECT *
  FROM Supplier, SupParts
 WHERE Supplier.supno *= SupParts.supno
   AND qty < 200;

If I do the outer first, I get:

 Suppliers LOJ SupParts
 supno supno partno qty
 =======================
 S1     S1   P1    100
 S1     S1   P2    250
 S2     S2   P1    100
 S2     S2   P2    250
 S3   NULL  NULL   NULL

Then I apply the (qty < 200) predicate and get

 Suppliers LOJ SupParts
 supno supno partno qty
 ===================
 S1   S1   P1    100
 S2   S2   P1    100

Doing it in the opposite order

 Suppliers LOJ SupParts
 supno supno partno qty
 ===================
 S1   S1   P1    100
 S2   S2   P1    100
 S3   NULL NULL  NULL

Sybase does it one way, Oracle does it the other and Centura (nee Gupta)
lets you pick which one -- the worst of both non-standard worlds!  In
SQL-92, you have a choice and can force the order of execution. Either
do the predicates after the join ...

 SELECT *
   FROM Supplier
        LEFT OUTER JOIN
        SupParts
        ON Supplier.supno = SupParts.supno
 WHERE qty < 200;

 ... or do it in the joining:

 SELECT *
  FROM Supplier
       LEFT OUTER JOIN
       SupParts
       ON Supplier.supno = SupParts.supno
          AND qty < 200;

Another problem is that you cannot show the same table as preserved and
unpreserved in the extended equality version, but it is easy in SQL-92.
For example to find the students who have taken Math 101 and might have
taken Math 102:

 SELECT C1.student, C1.math, C2.math
  FROM (SELECT * FROM Courses WHERE math = 101) AS C1
       LEFT OUTER JOIN
       (SELECT * FROM Courses WHERE math = 102) AS C2
       ON C1.student = C2.student;

--CELKO--
 ===========================
 Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in your
schema are.

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

 
 
 

1. MS SQL Outer Joins VS Sybase Outer Joins

I'm working with a development team in converting a SQL Server application
to work with Sybase.  Beyond the typical syntax differences, the major
difficulty we have encountered so far is Sybase's apparent dislike of tables
joined to the first table of an outer join operation:

E.G.

SELECT * FROM TABLE1 T1, TABLE2 T2, TABLE3 T3
WHERE
T1 *= T2
AND T2 = T3   --- OR event T2 *= T3

Additionally, we have a few queries which actually filter out the results of
an inner join based on the values in the right table

E.G.

SELECT * FROM TABLE1 T1, TABLE2 T2
WHERE
T1 *= T2
AND T2.LAST_NAME LIKE 'SMITH%'

Finally, we might have a correlated sub-query that also accesses the value
of a field in a left-joined table:

E.G.

SELECT *, SUPPLIER_NO = (SELECT SUPPLIER_NO FROM TBLSUPPLIER WHERE
SUPPLIER_NO = T2.SUPPLIER_NO AND ACTIVE_FLG = 1)
FROM TABLE1 T1, TABLE2 T2
WHERE
T1 *= T2

All of these queries cause errors in Sybase, seemingly because extra tables
can't be joined to the first table in an outer join clause, or you can't
refer to a field in a left-joined table.  We are currently working around
this problem by creating views to simulate the multi-table left join (i.e.
left joining to a view which is itself a series of inner joins) or by using
a subquery in the WHERE clause to test for a value in the right table.

Are there any other ways around this?  Have any of you guys run into this
problem?  All of these queries will run correctly in SQL Server 7.  We are
using version Sybase 11.0.3, and will be upgrading shortly to 11.9.

Thanks,

Mike

2. Upgrade to ASE 12.5 Causing a Performance Hit

3. Outer Join SQL*PLUS-SQL/92

4. BCP out cobol compatible data types

5. Outer joins in Oracle vs outer joins in Sybase

6. OEO: How to recognize TIME dimension?

7. SQL outer joins -> Oracle (+) vs outer(Informix)

8. PdoxWin 4.5 Form & Query

9. Outer Joins... Which One is ANSI SQL-92

10. SQL-92 OUTER JOIN CONFORMANCE

11. SQL-92 Multiple outer joins

12. SQL-92 Outer Join Support (lack)