Cost estimates consistently too high - does it matter?

Post by Matt Clark » Sat, 09 Aug 2003 20:47:20



Hi,

I've noticed that the cost estimates for a lot of my queries are
consistently far too high.  Sometimes it's because the row estimates are
wrong, like this:

explain analyze select logtime from loginlog
where uid='Ymogen::YM_User::3e2c0869c2fdd26d8a74d218d5a6ff585d490560'
and result = 'Success' order by logtime desc limit 3;
NOTICE:  QUERY PLAN:
Limit  (cost=0.00..221.85 rows=3 width=8) (actual time=0.21..2.39 rows=3 loops=1)
  ->  Index Scan Backward using loginlog_logtime_idx on loginlog  (cost=0.00..12846.69 rows=174 width=8) (actual time=0.20..2.37 rows=4 loops=1)
Total runtime: 2.48 msec

The row estimate here is off by a factor of about 50 (174 estimated vs. 4
actual rows), but the cost estimate is off by a factor of 5000.
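
If the stats on uid are just too coarse, I suppose raising the per-column
statistics target and re-analyzing might tighten that row estimate.  A quick
sketch (the 100 below is an arbitrary trial value, not a recommendation):

-- Widen the statistics sample for uid, then re-gather stats
-- so the planner sees the new distribution.
ALTER TABLE loginlog ALTER COLUMN uid SET STATISTICS 100;
ANALYZE loginlog;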

Sometimes the row estimates are good, but the costs are still too high:

explain analyze select u.email from ym_user u join mobilepm m on (m.ownerid = u._id)
where m.status = 'Validated' and m.network = 'TMOBILEUK';
NOTICE:  QUERY PLAN:
Nested Loop  (cost=0.00..2569.13 rows=441 width=145) (actual time=1.93..248.57 rows=553 loops=1)
  ->  Seq Scan on mobilepm m  (cost=0.00..795.11 rows=441 width=58) (actual time=1.69..132.83 rows=553 loops=1)
  ->  Index Scan using ym_user_id_idx on ym_user u  (cost=0.00..4.01 rows=1 width=87) (actual time=0.19..0.20 rows=1 loops=553)
Total runtime: 249.47 msec

loginlog has 180000 rows, mobilepm has 12000, ym_user has 50000, and they've
all been analyzed prior to running the query.
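
As a sanity check on the join plan, I gather one can disable the nested loop
for a session and see what the planner's second choice would cost (just an
experiment; enable_nestloop is a session-level planner toggle):

SET enable_nestloop TO off;  -- discourage nested-loop joins for this session
EXPLAIN ANALYZE SELECT u.email FROM ym_user u JOIN mobilepm m ON (m.ownerid = u._id)
WHERE m.status = 'Validated' AND m.network = 'TMOBILEUK';
SET enable_nestloop TO on;   -- restore the default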

The server is a Quad PIII 700 Xeon/1MB cache, 3GB RAM, hardware RAID 10 on
two SCSI channels with 128MB write-back cache.

I've lowered random_page_cost to 2 to reflect the decent disk I/O, but I
suppose the fact that the DB and indexes are essentially all cached in RAM
might also be affecting the results, even though effective_cache_size is set
to a realistic 262144 (2GB).  Those planner params in full:

#effective_cache_size = 1000  # default in 8k pages
#random_page_cost = 4
#cpu_tuple_cost = 0.01
#cpu_index_tuple_cost = 0.001
#cpu_operator_cost = 0.0025
effective_cache_size = 262144 # 2GB of FS cache
random_page_cost = 2
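
These can also be set per-session, so a candidate value can be tried against
a real query before committing it to postgresql.conf (a sketch; the 1.5 is an
arbitrary trial value):

SET random_page_cost = 1.5;  -- arbitrary trial value, this session only
EXPLAIN ANALYZE SELECT logtime FROM loginlog
WHERE result = 'Success' ORDER BY logtime DESC LIMIT 3;
RESET random_page_cost;      -- back to the postgresql.conf setting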

For now the planner seems to be making the right choices, but my concern is
that at some point the planner might start making some bad decisions,
especially on more complex queries.  Should I bother tweaking the planner
costs more, and if so which ones?  Am I fretting over nothing?

Cheers

Matt
Matt Clark
Ymogen Ltd

corp.ymogen.net

Cost estimates consistently too high - does it matter?

Post by Tom Lane » Sun, 10 Aug 2003 00:14:36



> I've noticed that the cost estimates for a lot of my queries are
> consistently far too high.

You seem to be under a misapprehension.  The cost estimates are not
in units of milliseconds, they are on an arbitrary scale with 1.0
defined as one disk fetch.
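
To a first approximation, a seqscan is costed at one unit per page read plus
cpu_tuple_cost per tuple examined, so you can roughly reconstruct the
mobilepm seqscan estimate from pg_class (this sketch ignores the
cpu_operator_cost term for evaluating the WHERE quals):

SELECT relpages + reltuples * 0.01 AS approx_seqscan_cost
FROM pg_class WHERE relname = 'mobilepm';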

LIMIT throws another monkey wrench into the mix: the estimates for the
plan nodes underneath the limit are done as if the plan were to be
executed to completion, which of course it won't be.

                        regards, tom lane


Cost estimates consistently too high - does it matter?

Post by Matt Clark » Sun, 10 Aug 2003 00:38:55


Well, I usually am under a misapprehension!  Thanks for the explanation about LIMIT too.

In that case then, I shall stop worrying and learn to love the planner.

M

> -----Original Message-----

> Sent: 08 August 2003 16:15
> To: Matt Clark

> Subject: Re: [ADMIN] Cost estimates consistently too high - does it
> matter?


> > I've noticed that the cost estimates for a lot of my queries are
> > consistently far too high.

> You seem to be under a misapprehension.  The cost estimates are not
> in units of milliseconds, they are on an arbitrary scale with 1.0
> defined as one disk fetch.

> LIMIT throws another monkey wrench into the mix: the estimates for the
> plan nodes underneath the limit are done as if the plan were to be
> executed to completion, which of course it won't be.

>                    regards, tom lane

