Slow query when joining to un-analyzed temp table...

Slow query when joining to un-analyzed temp table...

Post by Glen Parke » Sun, 08 Sep 2002 07:31:21



This is a multi-part message in MIME format.

------=_NextPart_000_00AF_01C255BA.796782A0
Content-Type: text/plain;
        charset="US-ASCII"
Content-Transfer-Encoding: 7bit

Sorry for the bulky transcript here, but I don't see a better way to
convey what's going on.  I hope someone will take a look anyway.  It's
attached as well to preserve line breaks, or the lack thereof.

Before I begin, yes the DB is well vacummed/analyzed :-)

I have a query that appears to plan out correctly, in terms of index
use, and produces only 13 rows, but is still slow (over a second for a
query that should take ~0.0 time).  The index it uses *should* also
produce 13 rows (and in other tests it does produce 13 rows).

The funky queries are commented below...

TIA,
Glen Parker

----------------------------------------------------

oms=# select version();
                           version                          
-------------------------------------------------------------
 PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

oms=# create temp table __item_id as select item_id from items where
branch='10' and date_modified >= '09/01/2002';
SELECT

oms=# select count(*) from __item_id;
 count
-------
     4
(1 row)

oms=# select count(*) from itemindex;
 count
-------
 66072
(1 row)

/*
  HERE IS WHERE THE TROUBLE STARTS...
  Why does the index scan claim to have found 57387 rows?  It *should*
find 13, and this step should
  naturally then be quite fast, but it appears that this step is what
takes all the time.
*/
oms=# explain analyze select i.* from itemindex i, __item_id l where
i.item_id = l.item_id order by i.item_id,i.seq;
NOTICE:  QUERY PLAN:

Sort  (cost=4117.20..4117.20 rows=2107 width=54) (actual
time=1060.14..1060.16 rows=13 loops=1)
  ->  Merge Join  (cost=69.83..4000.90 rows=2107 width=54) (actual
time=0.73..1059.79 rows=13 loops=1)
        ->  Index Scan using idx_itemindex_itemid on itemindex i
(cost=0.00..3737.06 rows=66072 width=50) (actual time=0.37..915.49
rows=57387 loops=1)
        ->  Sort  (cost=69.83..69.83 rows=1000 width=4) (actual
time=0.12..0.14 rows=9 loops=1)
              ->  Seq Scan on __item_id l  (cost=0.00..20.00 rows=1000
width=4) (actual time=0.01..0.03 rows=4 loops=1)
Total runtime: 1060.46 msec

EXPLAIN

oms=# select i.* from itemindex i, __item_id l where i.item_id =
l.item_id order by i.item_id,i.seq;
<DATA SNIPPED>
(13 rows)

oms=# analyze __item_id;
ANALYZE

/*
  HUH?
  Analyzing a 4 row table makes a real difference??  The query does a
seq-scan on the temp table
  whether its been analyzed or not.  This is a temp table; trying to
write portable
  SQL dictates that I can't use an analyze here.
*/
oms=# explain analyze select i.* from itemindex i, __item_id l where
i.item_id = l.item_id order by i.item_id,i.seq;
NOTICE:  QUERY PLAN:

Sort  (cost=40.30..40.30 rows=8 width=54) (actual time=1.02..1.03
rows=13 loops=1)
  ->  Nested Loop  (cost=0.00..40.17 rows=8 width=54) (actual
time=0.11..0.82 rows=13 loops=1)
        ->  Seq Scan on __item_id l  (cost=0.00..1.04 rows=4 width=4)
(actual time=0.02..0.04 rows=4 loops=1)
        ->  Index Scan using idx_itemindex_itemid on itemindex i
(cost=0.00..9.76 rows=2 width=50) (actual time=0.05..0.15 rows=3
loops=4)
Total runtime: 1.25 msec

EXPLAIN

------=_NextPart_000_00AF_01C255BA.796782A0
Content-Type: text/plain;
        name="sql_transcript.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
        filename="sql_transcript.txt"

oms=3D# select version();
                           version=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20
-------------------------------------------------------------
 PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

oms=3D# create temp table __item_id as select item_id from items where bran=
ch=3D'10' and date_modified >=3D '09/01/2002';
SELECT

oms=3D# select count(*) from __item_id;
 count=20
-------
     4
(1 row)

oms=3D# select count(*) from itemindex;
 count=20
-------
 66072
(1 row)

/*
  HERE IS WHERE THE TROUBLE STARTS...
  Why does the index scan claim to have found 57387 rows?  It *should* find=
 13, and this step should
  naturally then be quite fast, but it appears that this step is what takes=
 all the time.
*/
oms=3D# explain analyze select i.* from itemindex i, __item_id l where i.it=
em_id =3D l.item_id order by i.item_id,i.seq;
NOTICE:  QUERY PLAN:

Sort  (cost=3D4117.20..4117.20 rows=3D2107 width=3D54) (actual time=3D1060.=
14..1060.16 rows=3D13 loops=3D1)
  ->  Merge Join  (cost=3D69.83..4000.90 rows=3D2107 width=3D54) (actual ti=
me=3D0.73..1059.79 rows=3D13 loops=3D1)
        ->  Index Scan using idx_itemindex_itemid on itemindex i  (cost=3D0=
.00..3737.06 rows=3D66072 width=3D50) (actual time=3D0.37..915.49 rows=3D57=
387 loops=3D1)
        ->  Sort  (cost=3D69.83..69.83 rows=3D1000 width=3D4) (actual time=
=3D0.12..0.14 rows=3D9 loops=3D1)
              ->  Seq Scan on __item_id l  (cost=3D0.00..20.00 rows=3D1000 =
width=3D4) (actual time=3D0.01..0.03 rows=3D4 loops=3D1)
Total runtime: 1060.46 msec

EXPLAIN

oms=3D# select i.* from itemindex i, __item_id l where i.item_id =3D l.item=
_id order by i.item_id,i.seq;
<DATA SNIPPED>
(13 rows)

oms=3D# analyze __item_id;
ANALYZE

/*
  HUH?
  Analyzing a 4 row table makes a real difference??  The query does a seq-s=
can on the temp table
  whether its been analyzed or not.  This is a temp table; trying to write =
portable
  SQL dictates that I can't use an analyze here.
*/
oms=3D# explain analyze select i.* from itemindex i, __item_id l where i.it=
em_id =3D l.item_id order by i.item_id,i.seq;
NOTICE:  QUERY PLAN:

Sort  (cost=3D40.30..40.30 rows=3D8 width=3D54) (actual time=3D1.02..1.03 r=
ows=3D13 loops=3D1)
  ->  Nested Loop  (cost=3D0.00..40.17 rows=3D8 width=3D54) (actual time=3D=
0.11..0.82 rows=3D13 loops=3D1)
        ->  Seq Scan on __item_id l  (cost=3D0.00..1.04 rows=3D4 width=3D4)=
 (actual time=3D0.02..0.04 rows=3D4 loops=3D1)
        ->  Index Scan using idx_itemindex_itemid on itemindex i  (cost=3D0=
.00..9.76 rows=3D2 width=3D50) (actual time=3D0.05..0.15 rows=3D3 loops=3D4)
Total runtime: 1.25 msec

EXPLAIN

------=_NextPart_000_00AF_01C255BA.796782A0
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

------=_NextPart_000_00AF_01C255BA.796782A0--

 
 
 

Slow query when joining to un-analyzed temp table...

Post by Stephan Sza » Sun, 08 Sep 2002 07:50:38



> oms=# create temp table __item_id as select item_id from items where
> branch='10' and date_modified >= '09/01/2002';
> SELECT

> oms=# select count(*) from __item_id;
>  count
> -------
>      4
> (1 row)

> oms=# select count(*) from itemindex;
>  count
> -------
>  66072
> (1 row)

> /*
>   HERE IS WHERE THE TROUBLE STARTS...
>   Why does the index scan claim to have found 57387 rows?  It *should*
> find 13, and this step should
>   naturally then be quite fast, but it appears that this step is what
> takes all the time.
> */

Well, this query plan thinks there are 1000 output rows from the
__item_id scan, and the lower one thinks there are 4.  In the first
it's attempting a merge join between the two results which possibly would
be better if there really were 1000 rows in __item_id.  I don't
really know, but I'd guess the 57387 is how many rows it got through
before running off the last row in __item_id in the merge.

- Show quoted text -

Quote:> oms=# explain analyze select i.* from itemindex i, __item_id l where
> i.item_id = l.item_id order by i.item_id,i.seq;
> NOTICE:  QUERY PLAN:

> Sort  (cost=4117.20..4117.20 rows=2107 width=54) (actual
> time=1060.14..1060.16 rows=13 loops=1)
>   ->  Merge Join  (cost=69.83..4000.90 rows=2107 width=54) (actual
> time=0.73..1059.79 rows=13 loops=1)
>         ->  Index Scan using idx_itemindex_itemid on itemindex i
> (cost=0.00..3737.06 rows=66072 width=50) (actual time=0.37..915.49
> rows=57387 loops=1)
>         ->  Sort  (cost=69.83..69.83 rows=1000 width=4) (actual
> time=0.12..0.14 rows=9 loops=1)
>               ->  Seq Scan on __item_id l  (cost=0.00..20.00 rows=1000
> width=4) (actual time=0.01..0.03 rows=4 loops=1)
> Total runtime: 1060.46 msec

> EXPLAIN

> oms=# select i.* from itemindex i, __item_id l where i.item_id =
> l.item_id order by i.item_id,i.seq;
> <DATA SNIPPED>
> (13 rows)

> oms=# analyze __item_id;
> ANALYZE

> /*
>   HUH?
>   Analyzing a 4 row table makes a real difference??  The query does a
> seq-scan on the temp table
>   whether its been analyzed or not.  This is a temp table; trying to
> write portable
>   SQL dictates that I can't use an analyze here.
> */
> oms=# explain analyze select i.* from itemindex i, __item_id l where
> i.item_id = l.item_id order by i.item_id,i.seq;
> NOTICE:  QUERY PLAN:

> Sort  (cost=40.30..40.30 rows=8 width=54) (actual time=1.02..1.03
> rows=13 loops=1)
>   ->  Nested Loop  (cost=0.00..40.17 rows=8 width=54) (actual
> time=0.11..0.82 rows=13 loops=1)
>         ->  Seq Scan on __item_id l  (cost=0.00..1.04 rows=4 width=4)
> (actual time=0.02..0.04 rows=4 loops=1)
>         ->  Index Scan using idx_itemindex_itemid on itemindex i
> (cost=0.00..9.76 rows=2 width=50) (actual time=0.05..0.15 rows=3
> loops=4)
> Total runtime: 1.25 msec

> EXPLAIN

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command


 
 
 

1. Query using UDF joined to a sub query vs Temp Table to sub Query

I have a simple query question.  Currently I have a query that joins a UDF that returns a table to a sub query of a number of secondary tables.  This query seems to run forever, but if I dump the results of the udf into a temp table and use joined to the sub query the results are as expected.  Can anyone shed some light onto why?  Also is it possible to use the UDF in this manner with possible join hints to get the query return time I'm looking for?

2. Permission Denied on Scheduled DTS

3. Slow join with temp table

4. SELECT returns wrong computing result !

5. How do you un-analyze tables?

6. Query Speed

7. Slow-running query with temp table on IDS 9.20

8. Charts in reports

9. Formula to calculate size of temp segment when analyzing tables/schemas

10. need advice on analyzing slow queries

11. Temp tables vs table joins

12. Joining 5 tables vs. using temp table

13. How do I create a temp table to join with another table