select distinct on

select distinct on

Post by Dave A » Wed, 22 Nov 2000 07:16:36



Hello,

I am trying to use SELECT DISTINCT ON.  This works as intended:
        SELECT DISTINCT ON (f1) * from table ORDER BY f1

 f1 | f2 | f3
----+----+----
  1 |  2 |  3
  2 |  2 |  2
  3 |  1 |  2

However, I am trying to order the results by non-distinct fields f2, f3
to get:

 f1 | f2 | f3
----+----+----
  3 |  1 |  2
  2 |  2 |  2
  1 |  2 |  3

Clearly this does not work:
        SELECT DISTINCT ON (f1) * from table ORDER BY f2, f3, f1

So is there some other way without SELECT INTO a temporary table?
Thanks in advance.
Dave
--

When you were born, you cried and the world rejoiced.  Try to live your life
so that when you die, you will rejoice and the world will cry.  -1/2 jj^2

 
 
 

select distinct on

Post by Tom La » Wed, 22 Nov 2000 08:10:13



> Clearly this does not work:
>    SELECT DISTINCT ON (f1) * from table ORDER BY f2, f3, f1

No; the implementation of DISTINCT requires the DISTINCT fields to be
the major sort keys, so you have to ORDER BY f1,f2,f3 instead.

If you want the final output sorted differently, in 7.0 you don't really
have much choice but to do the SELECT DISTINCT into a temporary table.
In 7.1 it'll be possible to do it via subselect-in-FROM:

        SELECT * FROM
          (SELECT DISTINCT ON (f1) * FROM table ORDER BY f1, f2, f3) tmp
        ORDER BY f2, f3, f1;

but in 7.0 there's no way to cause the system to do multiple sorting
passes in one query.

                        regards, tom lane

 
 
 

1. select distinct much slower than non-distinct

(I don't have the data, I'm posting this for a friend)

He has a big, fat query that left joins a dozen tables and has a dozen
where clause terms.  It runs against a million rows, comes back in
three seconds with about three hundred rows.  So far so good.  The
problem is there are a dozen duplicate rows.

So, add the "distinct" keyword, right?  He does so and it takes over
thirty seconds.  I've suggested he store the 300 rows into a temp
table, then select distinct against it, but I haven't heard back yet
about how that works.  I can't imagine a problem.

This *sounds* to me like a bug in the SQLServer execution planning.  I
don't have any showplan stuff, I also suggested he look at that.  Any
other suggestions?  Is this a known bug in SQL2K?

Thanks.

Joshua Stern

2. DB2: Stored procedures vs functions

3. SELECT DISTINCT not distinct enough

4. Load SQL table from Recordset

5. SELECT DISTINCT + ORDER BY gives ERROR 145: ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

6. Help Please, What Shift Am I On?

7. ORDER BY items must appear in the select list if SELECT DISTINCT is specified

8. Select Top works some of the time?

9. SELECT works, but SELECT DISTINCT fails

10. SELECT DISTINCT yields more than SELECT

11. SELECT DISTINCT blocks; SELECT ALL doesn't

12. Select Distinct Date by Weekday and Date Range

13. SELECT DISTINCT - Access