SQL dilemma and gen'l Sas-L question

SQL dilemma and gen'l Sas-L question

Post by Lee Medo » Wed, 29 Nov 2000 04:00:00



Perhaps someone could help me with the following.

I have two files which I'm trying to join, using a full join in Proc Sql.
Unfortunately, some of the data gets lost in the process (obviously not a
good thing).

The structure of the files is as follows:

Data set 1 has variables a,b,...,g,h
Data set 2 has variables a,b,i,...,l,m

The key on which I'm joining the two files, element 'a', is not always
repeated on both files.  In other words, the overlap is not complete on the
key.  Where it isn't, I still want to bring in the data, but this doesn't
happen, as of now.  Any advice?

Separately, how can I direct Sas-L communications to a separate folder
within my e-mail inbox.  As it now stands I'm getting buried alive with
messages.

 
 
 

SQL dilemma and gen'l Sas-L question

Post by Howard Schrei » Wed, 29 Nov 2000 04:00:00


1. It's hard to answer since you did not show either representative data or
the code which produces the unacceptable results. But

   data one; input a b c; cards;
   1 1 3
   2 1 4
   ;
   data two; input a b d; cards;
   1 2 5
   3 2 6
   ;
   proc sql;
   select coalesce(one.a,two.a) as a,
          coalesce(one.b,two.b) as b,
          c,
          d
    from one full join two on one.a=two.a;
   quit;

produces

   A         B         C         D

   1         1         3         5
   2         1         4         .
   3         2         .         6

If you want the B values from TWO to override those from ONE, reverse the
arguments to COALESCE.

2. The e-mail management functionality you are looking would come from your
mailreader client. If you can't find the answer locally, there is probably a
newsgroup or list devoted to usage of that product.



Quote:>Perhaps someone could help me with the following.

>I have two files which I'm trying to join, using a full join in Proc Sql.
>Unfortunately, some of the data gets lost in the process (obviously not a
>good thing).

>The structure of the files is as follows:

>Data set 1 has variables a,b,...,g,h
>Data set 2 has variables a,b,i,...,l,m

>The key on which I'm joining the two files, element 'a', is not always
>repeated on both files.  In other words, the overlap is not complete on the
>key.  Where it isn't, I still want to bring in the data, but this doesn't
>happen, as of now.  Any advice?

>Separately, how can I direct Sas-L communications to a separate folder
>within my e-mail inbox.  As it now stands I'm getting buried alive with
>messages.


 
 
 

SQL dilemma and gen'l Sas-L question

Post by HERMAN » Thu, 30 Nov 2000 04:00:00


I might add that one can also rename column variables not included in the ON
condition, as in One.b as b1,Two.b as b2.  But I have a more compelling reason
for tagging a note on Howard's answer.  I tried FULL JOIN under V8.1 under
Windows '98 and found that I had to switch to OUTER JOIN to get the expected
results.  Under V8.0 under Windows '95 I had to switch back to FULL JOIN.  Am I
becoming more delusional than usual?  I don't have V8.1 close at hand just now.
Those that do could run Howard's test program to see if it works as expected.
V8.0 produces a syntax error when fed OUTER JOIN.  Thanks.  Sig
-----Original Message-----

Sent: Tuesday, November 28, 2000 4:52 PM

Subject: RE: SQL dilemma and gen'l Sas-L question

1. It's hard to answer since you did not show either representative data or
the code which produces the unacceptable results. But

   data one; input a b c; cards;
   1 1 3
   2 1 4
   ;
   data two; input a b d; cards;
   1 2 5
   3 2 6
   ;
   proc sql;
   select coalesce(one.a,two.a) as a,
          coalesce(one.b,two.b) as b,
          c,
          d
    from one full join two on one.a=two.a;
   quit;

produces

   A         B         C         D

   1         1         3         5
   2         1         4         .
   3         2         .         6

If you want the B values from TWO to override those from ONE, reverse the
arguments to COALESCE.

2. The e-mail management functionality you are looking would come from your
mailreader client. If you can't find the answer locally, there is probably a
newsgroup or list devoted to usage of that product.



>Perhaps someone could help me with the following.

>I have two files which I'm trying to join, using a full join in Proc Sql.
>Unfortunately, some of the data gets lost in the process (obviously not a
>good thing).

>The structure of the files is as follows:

>Data set 1 has variables a,b,...,g,h
>Data set 2 has variables a,b,i,...,l,m

>The key on which I'm joining the two files, element 'a', is not always
>repeated on both files.  In other words, the overlap is not complete on the
>key.  Where it isn't, I still want to bring in the data, but this doesn't
>happen, as of now.  Any advice?

>Separately, how can I direct Sas-L communications to a separate folder
>within my e-mail inbox.  As it now stands I'm getting buried alive with
>messages.

 
 
 

SQL dilemma and gen'l Sas-L question

Post by William W. Viergev » Thu, 30 Nov 2000 04:00:00


Hi Sig:

Yup - no error msgs under 8.1:
+++++++++++++++++++++++++++++++++++++++++++

NOTE: Copyright (c) 1999-2000 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software Release 8.1 (TS1M0)
       Licensed to VIERGEVER & ASSOCIATES, Site 0019446002.
NOTE: This session is executing on the WIN_PRO  platform.

NOTE: SAS initialization used:
       real time           4.01 seconds
       cpu time            0.54 seconds

1    data one; input a b c; cards;

NOTE: The data set WORK.ONE has 2 observations and 3 variables.
NOTE: DATA statement used:
       real time           0.27 seconds
       cpu time            0.03 seconds

4    ;
5    data two; input a b d; cards;

NOTE: The data set WORK.TWO has 2 observations and 3 variables.
NOTE: DATA statement used:
       real time           0.00 seconds
       cpu time            0.00 seconds

8    ;
9    proc sql;
10      select coalesce(one.a,two.a) as a,
11             coalesce(one.b,two.b) as b,
12             c,
13             d
14      from one full join two on one.a=two.a;
15   quit

                         The SAS System   09:42 Wednesday, November 29,
2000   1

                                     a         b         c         d
                              --------------------------------------
                                     1         1         3         5
                                     2         1         4         .
                                     3         2         .         6

+++++++++++++++++++++++++++++++++++++++++++++

>I might add that one can also rename column variables not included in the ON
>condition, as in One.b as b1,Two.b as b2.  But I have a more compelling reason
>for tagging a note on Howard's answer.  I tried FULL JOIN under V8.1 under
>Windows '98 and found that I had to switch to OUTER JOIN to get the expected
>results.  Under V8.0 under Windows '95 I had to switch back to FULL
>JOIN.  Am I
>becoming more delusional than usual?  I don't have V8.1 close at hand just
>now.
>Those that do could run Howard's test program to see if it works as expected.
>V8.0 produces a syntax error when fed OUTER JOIN.  Thanks.  Sig

>-----Original Message-----

>Sent: Tuesday, November 28, 2000 4:52 PM

>Subject: RE: SQL dilemma and gen'l Sas-L question

>1. It's hard to answer since you did not show either representative data or
>the code which produces the unacceptable results. But

>    data one; input a b c; cards;
>    1 1 3
>    2 1 4
>    ;
>    data two; input a b d; cards;
>    1 2 5
>    3 2 6
>    ;
>    proc sql;
>    select coalesce(one.a,two.a) as a,
>           coalesce(one.b,two.b) as b,
>           c,
>           d
>     from one full join two on one.a=two.a;
>    quit;

>produces

>    A         B         C         D

>    1         1         3         5
>    2         1         4         .
>    3         2         .         6

>If you want the B values from TWO to override those from ONE, reverse the
>arguments to COALESCE.

>2. The e-mail management functionality you are looking would come from your
>mailreader client. If you can't find the answer locally, there is probably a
>newsgroup or list devoted to usage of that product.



> >Perhaps someone could help me with the following.

> >I have two files which I'm trying to join, using a full join in Proc Sql.
> >Unfortunately, some of the data gets lost in the process (obviously not a
> >good thing).

> >The structure of the files is as follows:

> >Data set 1 has variables a,b,...,g,h
> >Data set 2 has variables a,b,i,...,l,m

> >The key on which I'm joining the two files, element 'a', is not always
> >repeated on both files.  In other words, the overlap is not complete on the
> >key.  Where it isn't, I still want to bring in the data, but this doesn't
> >happen, as of now.  Any advice?

> >Separately, how can I direct Sas-L communications to a separate folder
> >within my e-mail inbox.  As it now stands I'm getting buried alive with
> >messages.

----------------------------------------------------------------------------
William W. Viergever                      Voice  : (916) 483-8398
Viergever & Associates                    Fax    : (916) 486-1488

----------------------------------------------------------------------------
 
 
 

SQL dilemma and gen'l Sas-L question

Post by Grace La Torr » Thu, 30 Nov 2000 04:00:00



> Perhaps someone could help me with the following.

> I have two files which I'm trying to join, using a full join in Proc Sql.
> Unfortunately, some of the data gets lost in the process (obviously not a
> good thing).

> The structure of the files is as follows:

> Data set 1 has variables a,b,...,g,h
> Data set 2 has variables a,b,i,...,l,m

> The key on which I'm joining the two files, element 'a', is not always
> repeated on both files.  In other words, the overlap is not complete on the
> key.  Where it isn't, I still want to bring in the data, but this doesn't
> happen, as of now.  Any advice?

Proc sql;

     create table tablenew as
     select coalesce(data1.a, data2.a) as a, data1.*, data2.*
        from data1 full join data2
        on data1.a = data2.a;

quit;

There, I think that does it!

Hope this helps,
Grace La Torra
Systems Analyst
New Mexico State University

 
 
 

1. SQL dilemma

Suppose I've got two DB2 tables, which I will simulate here using SAS
datasets for the moment (the actual tables contain millions of records):

data test1;
input ID;
cards;
1234
5678
9876
5432
;
run;

data test2;
input ID;
cards;
1234
9876
7777
;
run;

I need to find the count of unique IDs across both tables; that is to say, I
need to count every distinct value of ID only once, without double (or
triple, quadruple, etc) counting any id that appears in both tables.

So, I'd expect the resulting count from the simulated tables above to be 5,
or, if I were to get the actual IDs instead of the count:

1234
5678
9876
5432
7777

So far the best thing I've hit on (below) seems to produce the list of IDs I
expect, but I think it is very inefficient as it involves two Cartesian
joins.  Also, I haven't quite worked out how to get the count of the IDs
rather than the IDs themselves from this query.

proc sql;

create table test3 as

select a.id from test1 a,
                       test2 b
where a.id ne b.id

union

select b.id from test1 a,
                       test2 b
where b.id ne a.id;

quit;

The query will really be done in a pass-through SQL to DB2, so whatever I do
has to be valid DB2 SQL and I need to handle it on the DB2 side rather than
pull both multi-million row tables into SAS to do it in a data step.

Can anyone suggest a better approach?

Thanks,

Sterling Price

**********************************************************************
Notice:  The area code for the Wal-Mart General Office has
changed from 501 to 479.  Please make sure that you are
dialing 479 when making calls to any General Office location.

**********************************************************************
This email and any files transmitted with it are confidential
and intended solely for the individual or entity to
whom they are addressed.  If you have received this email
in error destroy it immediately.
**********************************************************************

2. Can't section dwg in SWX99

3. SQL Dilemma

4. VMS Millisecond Timer

5. Cut'n'past on the SAS display manager command line, Windows/SAS 8

6. Command-line equivalent to Device Manager's "Scan for hardware changes"

7. Next Gen PC's ?

8. Cursor positioning in consol app.

9. SAS/SQL vs MS Access SQL

10. OT: The developer's dilemma

11. need gen'l c/s info

12. How do mib2c-gen'ed files relate to snmpd?

13. Please Help - HP Scanjet 4c ands NT