Urgent Help needed in a query - Self Join

Urgent Help needed in a query - Self Join

Post by Shafee » Tue, 29 Jul 2003 17:57:06



Hi all, I want to find out the ID of the oldest person in each family,
please go through the steps that I tried,

Address Table
===========
id
name
age
sex
date_of_birth
addressline1
addressline2
city
state
zip

ID is unique.
I want to select the oldest person from each family. I used the query
(1) SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE

I got 21,000 rows, But I don't have the ID with me, now, Inorder to get the
ID, I tried the following query

(2) SELECT B.ID,B.ADDRESS_LINE_1, B.ADDRESS_LINE_2, B.city, B.state,
B.ZIP_CODE  FROM

(SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE) A,

 ADDRESS B
    WHERE A.ADDRESS_LINE_1=B.ADDRESS_LINE_1
       AND A.ADDRESS_LINE_2 = B.ADDRESS_LINE_2
       AND A.city = B.city
       AND A.state = B.state
       AND A.ZIP_CODE =B.ZIP_CODE
       AND A.DATE_OF_BIRTH = B.DATE_OF_BIRTH

But this query is returning 8000 values, where there is 21,000 different
familes are there returned by the first query, Please help

Thanks
Shafeek Khalidh

 
 
 

Urgent Help needed in a query - Self Join

Post by Bob Barrow » Tue, 29 Jul 2003 18:08:53


This will be a guess, since you did not provide DDL and sample data (can't
more than one family live at the same address?)

This view returns the age of the oldest person at each address:

CREATE VIEW MaxAgePerAddress
Select addressline1,addressline2, city, state, ZIP_CODE,
max(age) as oldest
from address
group by addressline1,addressline2, city, state, ZIP_CODE

Now, just use the above as a virtual table to get the id of the person (or
persons: there may be ties, correct?):

Select a.ADDRESS_LINE_1, a.ADDRESS_LINE_2, a.city,
a.state, a.ZIP_CODE, id from
address a inner join MaxAgePerAddress v
on a.ADDRESS_LINE_1 = v.ADDRESS_LINE_1 AND
a.ADDRESS_LINE_2 = v.ADDRESS_LINE_2 AND
a.city = v.city AND a.state = v.state AND
a.ZIP_CODE = v.ZIP_CODE

HTH,
Bob Barrows


> Hi all, I want to find out the ID of the oldest person in each family,
> please go through the steps that I tried,

> Address Table
> ===========
> id
> name
> age
> sex
> date_of_birth
> addressline1
> addressline2
> city
> state
> zip

> ID is unique.
> I want to select the oldest person from each family. I used the query
> (1) SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
> MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
> ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE

> I got 21,000 rows, But I don't have the ID with me, now, Inorder to
> get the ID, I tried the following query

> (2) SELECT B.ID,B.ADDRESS_LINE_1, B.ADDRESS_LINE_2, B.city, B.state,
> B.ZIP_CODE  FROM

> (SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
> MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
> ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE) A,

>  ADDRESS B
>     WHERE A.ADDRESS_LINE_1=B.ADDRESS_LINE_1
>        AND A.ADDRESS_LINE_2 = B.ADDRESS_LINE_2
>        AND A.city = B.city
>        AND A.state = B.state
>        AND A.ZIP_CODE =B.ZIP_CODE
>        AND A.DATE_OF_BIRTH = B.DATE_OF_BIRTH

> But this query is returning 8000 values, where there is 21,000
> different familes are there returned by the first query, Please help

> Thanks
> Shafeek Khalidh


 
 
 

Urgent Help needed in a query - Self Join

Post by Steve Kas » Tue, 29 Jul 2003 18:56:13


Bob,

  Your view contains one row for every address,
and your query selects id and address for every
row in the view, hence selects everything.

  Something is fishy, since you could leave out the MAX(age)
column from the view without troubling the query...

This is the week for these kinds of queries I think.  One
solution (guessing, without DDL) to get the ids is

select id from address A1
where not exists (
  select * from address A2
  where A1.addressline1 = A2.addressline1
  and A1.addressline2 = A2.addressline2
  and ... (same for city, state, zip)
  and A1.date_of_birth < A2.date_of_birth
)
-- if addressline2 or other address column can be null, needs adjustment

Another, using a tag-along sort, is

select
  right(min(convert(char(8),date_of_birth,112) +
convert(char(20),id)),20) as id
from address
group by addressline1, addressline2, city, state, zip
-- will only return the lowest id number if there are ties in a household

-- Steve Kass
-- Drew University
-- Ref: 6A04FEEB-F509-461B-A91D-5134333230AD


>This will be a guess, since you did not provide DDL and sample data (can't
>more than one family live at the same address?)

>This view returns the age of the oldest person at each address:

>CREATE VIEW MaxAgePerAddress
>Select addressline1,addressline2, city, state, ZIP_CODE,
>max(age) as oldest
>from address
>group by addressline1,addressline2, city, state, ZIP_CODE

>Now, just use the above as a virtual table to get the id of the person (or
>persons: there may be ties, correct?):

>Select a.ADDRESS_LINE_1, a.ADDRESS_LINE_2, a.city,
>a.state, a.ZIP_CODE, id from
>address a inner join MaxAgePerAddress v
>on a.ADDRESS_LINE_1 = v.ADDRESS_LINE_1 AND
>a.ADDRESS_LINE_2 = v.ADDRESS_LINE_2 AND
>a.city = v.city AND a.state = v.state AND
>a.ZIP_CODE = v.ZIP_CODE

>HTH,
>Bob Barrows


>>Hi all, I want to find out the ID of the oldest person in each family,
>>please go through the steps that I tried,

>>Address Table
>>===========
>>id
>>name
>>age
>>sex
>>date_of_birth
>>addressline1
>>addressline2
>>city
>>state
>>zip

>>ID is unique.
>>I want to select the oldest person from each family. I used the query
>>(1) SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
>>MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
>>ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE

>>I got 21,000 rows, But I don't have the ID with me, now, Inorder to
>>get the ID, I tried the following query

>>(2) SELECT B.ID,B.ADDRESS_LINE_1, B.ADDRESS_LINE_2, B.city, B.state,
>>B.ZIP_CODE  FROM

>>(SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
>>MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
>>ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE) A,

>> ADDRESS B
>>    WHERE A.ADDRESS_LINE_1=B.ADDRESS_LINE_1
>>       AND A.ADDRESS_LINE_2 = B.ADDRESS_LINE_2
>>       AND A.city = B.city
>>       AND A.state = B.state
>>       AND A.ZIP_CODE =B.ZIP_CODE
>>       AND A.DATE_OF_BIRTH = B.DATE_OF_BIRTH

>>But this query is returning 8000 values, where there is 21,000
>>different familes are there returned by the first query, Please help

>>Thanks
>>Shafeek Khalidh

 
 
 

Urgent Help needed in a query - Self Join

Post by Shafeek Khalid » Tue, 29 Jul 2003 19:13:03


Thanks Bob,
I tried this and same result, 13,087 records are missing, But i found that
these fields have either
ADDRESS_LINE_1 or ADDRESS_LINE_2 or.city or state or ZIP_CODE values as
NULL. That may be the resoason, they are not appearing when we tried to
compare the records,
Is there any way to compare these records with null

Thanks for your immediate help
Shafeek Khalidh

 
 
 

Urgent Help needed in a query - Self Join

Post by Shafeek Khalid » Tue, 29 Jul 2003 19:26:08


My idea is now to use NVL2 and substitute a string for NULL values and then
compare, Is it the correct way?
 
 
 

Urgent Help needed in a query - Self Join

Post by Bob Barrow » Tue, 29 Jul 2003 22:56:13


Oops! my bad, I meant to do this:
Select a.ADDRESS_LINE_1, a.ADDRESS_LINE_2, a.city,
a.state, a.ZIP_CODE, id from
address a inner join MaxAgePerAddress v
on a.ADDRESS_LINE_1 = v.ADDRESS_LINE_1 AND
a.ADDRESS_LINE_2 = v.ADDRESS_LINE_2 AND
a.city = v.city AND a.state = v.state AND
a.ZIP_CODE = v.ZIP_CODE
AND a.age = v.age

But you are right about the possible Nulls ...

Bob


> Bob,

>   Your view contains one row for every address,
> and your query selects id and address for every
> row in the view, hence selects everything.

>   Something is fishy, since you could leave out the MAX(age)
> column from the view without troubling the query...

> This is the week for these kinds of queries I think.  One
> solution (guessing, without DDL) to get the ids is

> select id from address A1
> where not exists (
>   select * from address A2
>   where A1.addressline1 = A2.addressline1
>   and A1.addressline2 = A2.addressline2
>   and ... (same for city, state, zip)
>   and A1.date_of_birth < A2.date_of_birth
> )
> -- if addressline2 or other address column can be null, needs
> adjustment

> Another, using a tag-along sort, is

> select
>   right(min(convert(char(8),date_of_birth,112) +
> convert(char(20),id)),20) as id
> from address
> group by addressline1, addressline2, city, state, zip
> -- will only return the lowest id number if there are ties in a
> household

> -- Steve Kass
> -- Drew University
> -- Ref: 6A04FEEB-F509-461B-A91D-5134333230AD


>> This will be a guess, since you did not provide DDL and sample data
>> (can't more than one family live at the same address?)

>> This view returns the age of the oldest person at each address:

>> CREATE VIEW MaxAgePerAddress
>> Select addressline1,addressline2, city, state, ZIP_CODE,
>> max(age) as oldest
>> from address
>> group by addressline1,addressline2, city, state, ZIP_CODE

>> Now, just use the above as a virtual table to get the id of the
>> person (or persons: there may be ties, correct?):

>> Select a.ADDRESS_LINE_1, a.ADDRESS_LINE_2, a.city,
>> a.state, a.ZIP_CODE, id from
>> address a inner join MaxAgePerAddress v
>> on a.ADDRESS_LINE_1 = v.ADDRESS_LINE_1 AND
>> a.ADDRESS_LINE_2 = v.ADDRESS_LINE_2 AND
>> a.city = v.city AND a.state = v.state AND
>> a.ZIP_CODE = v.ZIP_CODE

>> HTH,
>> Bob Barrows


>>> Hi all, I want to find out the ID of the oldest person in each
>>> family, please go through the steps that I tried,

>>> Address Table
>>> ===========
>>> id
>>> name
>>> age
>>> sex
>>> date_of_birth
>>> addressline1
>>> addressline2
>>> city
>>> state
>>> zip

>>> ID is unique.
>>> I want to select the oldest person from each family. I used the
>>> query (1) SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state,
>>> ZIP_CODE, MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM
>>> ADDRESS GROUP BY ADDRESS_LINE_1, ADDRESS_LINE_2, city, state,
>>> ZIP_CODE

>>> I got 21,000 rows, But I don't have the ID with me, now, Inorder to
>>> get the ID, I tried the following query

>>> (2) SELECT B.ID,B.ADDRESS_LINE_1, B.ADDRESS_LINE_2, B.city, B.state,
>>> B.ZIP_CODE  FROM

>>> (SELECT ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE,
>>> MIN(DATE_OF_BIRTH) AS DATE_OF_BIRTH, COUNT(*) FROM  ADDRESS GROUP BY
>>> ADDRESS_LINE_1, ADDRESS_LINE_2, city, state, ZIP_CODE) A,

>>> ADDRESS B
>>>    WHERE A.ADDRESS_LINE_1=B.ADDRESS_LINE_1
>>>       AND A.ADDRESS_LINE_2 = B.ADDRESS_LINE_2
>>>       AND A.city = B.city
>>>       AND A.state = B.state
>>>       AND A.ZIP_CODE =B.ZIP_CODE
>>>       AND A.DATE_OF_BIRTH = B.DATE_OF_BIRTH

>>> But this query is returning 8000 values, where there is 21,000
>>> different familes are there returned by the first query, Please help

>>> Thanks
>>> Shafeek Khalidh

 
 
 

Urgent Help needed in a query - Self Join

Post by Bob Barrow » Tue, 29 Jul 2003 23:07:38



> Thanks Bob,
> I tried this and same result, 13,087 records are missing, But i found
> that these fields have either
> ADDRESS_LINE_1 or ADDRESS_LINE_2 or.city or state or ZIP_CODE values
> as NULL. That may be the resoason, they are not appearing when we
> tried to compare the records,
> Is there any way to compare these records with null

> Thanks for your immediate help
> Shafeek Khalidh

I think the best thing to do is run an Update query that replaces the Nulls
with 'NA' (short for Not Available) or something. If you can't do that for
some reason, then you're going to need to do something like this:

CREATE VIEW MaxAgePerAddress
Select coalesce(addressline1,'NA') addressline1,
coalesce(addressline2,'NA') addressline2,
coalesce(city,'NA') city,
coalesce(state,'NA') state,
coalesce(ZIP_CODE,'NA') ZIP_CODE,
max(age) as oldest
from address
group by Select coalesce(addressline1,'NA') ,
coalesce(addressline2,'NA') ,
coalesce(city,'NA') ,
coalesce(state,'NA') ,
coalesce(ZIP_CODE,'NA')

Select a.ADDRESS_LINE_1, a.ADDRESS_LINE_2, a.city,
a.state, a.ZIP_CODE, id from
address a inner join MaxAgePerAddress v
on coalesce(a.addressline1,'NA')  = v.ADDRESS_LINE_1 AND
coalesce(a.addressline2,'NA') = v.ADDRESS_LINE_2 AND
coalesce(a.city,'NA')= v.city AND
coalesce(a.state,'NA') = v.state AND
a.ZIP_CODE = v.ZIP_CODE
AND a.age = v.age

This will not perform well with if you have a lot of rows. Further
normalization of this design would not hurt.

Bob Barrows

 
 
 

1. Need Help with self join query

Hi, I am having trouble writing a query to flatten a table.  The problem I
am having is returning all rows including those that don't match - so to
speak.

Please SQL only, i'm not interested in RAC or any other add on products.

create table orders(
CompanyName varchar(50),
product varchar(10),
charged decimal(10,2),
Paid decimal(10,2))

insert into orders values('SPC','Beans',400,200)
insert into orders values('SPC','Rice',200,100)
insert into orders values('ADX','Rice',300,50)

Anticipate results are
SPC Beans 400.00 200.00 Rice 200.00 100.00
ADX                             Rice    300.00   50.00

Thanks,
Lance

2. Help with Store Procedure

3. Need help with self join query

4. HELP: Approaches to storing perl objects in RDBMS

5. self join - urgent - pl. help

6. 4682-CT-Stamford-ORACLE Financials-ORACLE Manufacturing-DBA Skills-ORACLE-Database Administrator

7. Self Referencing Table - Urgent Help Needed !

8. Cursor (fetch status never re-sets)

9. Need help with self join

10. Please help with self join Query

11. Inner Join / Self Joins help!

12. Need help turning OUTER JOIN query into UNION query

13. HELP PLEASE: URGENT: Need help forming a multi table query