SET EXPLAIN output

SET EXPLAIN output

Post by Jay » Thu, 04 Apr 2002 01:51:51



Hi All,

I'm trying to gather statistics on queries performed against a none
relational database and one of a relational model.  When I execute a query
it seem the non-relational model performs faster, and I cannot decipher the
set explain output.  What is meant by Estimated cost?  And why does the
faster query (non-relational) have a higher cost?  What is estimated # of
rows?  The number of rows that are returned by both queries is 9062.  But
the relation model estimates 87 and the non-relational estimates 1.  I would
appreciate any clarification on this to help me optimize the query path
better.

Thanks
Jay

 
 
 

SET EXPLAIN output

Post by Jay » Thu, 04 Apr 2002 01:55:29


I thought I might as well attach both queries and set explain output for
those that might interested

# of rows in table a = 6010, b= 523774

example Query 1 -relational
select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
from e03_plan1 a, e06_dept1 b
where  a.id = b.id
and a.sin matches '*'
and a.corp matches 'G01682'
and a.plan matches '*'
and a.seq = 1;

Estimated Cost: 979
Estimated # of Rows Returned: 1

1) informix.a: SEQUENTIAL SCAN

    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
 1 ) ) )

2) informix.b: INDEX PATH

    (1) Index Keys: id
        Lower Index Filter: informix.b.id = informix.a.id
Time =  real    0m9.15s
             user    0m2.05s
             sys     0m0.69s

Query2 -non-relational
# of rows in table b= 6010, a= 523774

select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
from e06_dept a, e03_plan b
where b.sin = a.sina
nd b.corp = a.corp
and b.plan = a.plana
nd b.seq = a.seq
and b.sin matches '*'
and b.corp matches 'G01682'
and b.plan matches '*'
and b.seq = 1

Estimated Cost: 561
Estimated # of Rows Returned: 1

1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq   (Key-Only)2)
informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
to_acct ref_no        Lower Index Filter: (informix.a.seq = informix.b.seq
AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

Time = real    0m33.61s
            user    0m0.28s
            sys     0m0.14s

Thanks Again
Jay


Quote:> Hi All,

> I'm trying to gather statistics on queries performed against a none
> relational database and one of a relational model.  When I execute a query
> it seem the non-relational model performs faster, and I cannot decipher
the
> set explain output.  What is meant by Estimated cost?  And why does the
> faster query (non-relational) have a higher cost?  What is estimated # of
> rows?  The number of rows that are returned by both queries is 9062.  But
> the relation model estimates 87 and the non-relational estimates 1.  I
would
> appreciate any clarification on this to help me optimize the query path
> better.

> Thanks
> Jay


 
 
 

SET EXPLAIN output

Post by Obnoxio The Clo » Thu, 04 Apr 2002 02:21:43




Quote:>I'm trying to gather statistics on queries performed against a none
>relational database and one of a relational model.  When I execute a query
>it seem the non-relational model performs faster, and I cannot decipher the
>set explain output.  What is meant by Estimated cost?  And why does the
>faster query (non-relational) have a higher cost?  What is estimated # of
>rows?  The number of rows that are returned by both queries is 9062.  But
>the relation model estimates 87 and the non-relational estimates 1.  I would
>appreciate any clarification on this to help me optimize the query path
>better.

Estimated cost is (expected CPU cycles required + (some fudge factor *
number of disk accesses required)). It is a totally meaningless
number, and in general, all you can say is that something with a cost
of 17 gazillion will _probably_ be slower that something with a cost
of 17. I doubt if the difference between 1 and 87 means anything.

Have you done the same update stats on both?

 
 
 

SET EXPLAIN output

Post by Obnoxio The Clo » Thu, 04 Apr 2002 02:25:28




Quote:>I thought I might as well attach both queries and set explain output for
>those that might interested

I'm confused: you said that the non-relational query was faster?
According to this, it was much, much faster? And what does
non-relational mean, anyway?
># of rows in table a = 6010, b= 523774

>example Query 1 -relational
>select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
>from e03_plan1 a, e06_dept1 b
>where  a.id = b.id
>and a.sin matches '*'
>and a.corp matches 'G01682'
>and a.plan matches '*'
>and a.seq = 1;

>Estimated Cost: 979
>Estimated # of Rows Returned: 1

>1) informix.a: SEQUENTIAL SCAN

>    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
>'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> 1 ) ) )

>2) informix.b: INDEX PATH

>    (1) Index Keys: id
>        Lower Index Filter: informix.b.id = informix.a.id
>Time =  real    0m9.15s
>             user    0m2.05s
>             sys     0m0.69s

>Query2 -non-relational
># of rows in table b= 6010, a= 523774

>select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
>from e06_dept a, e03_plan b
>where b.sin = a.sina
>nd b.corp = a.corp
>and b.plan = a.plana
>nd b.seq = a.seq
>and b.sin matches '*'
>and b.corp matches 'G01682'
>and b.plan matches '*'
>and b.seq = 1

>Estimated Cost: 561
>Estimated # of Rows Returned: 1

>1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
>(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
>informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq   (Key-Only)2)
>informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
>to_acct ref_no        Lower Index Filter: (informix.a.seq = informix.b.seq
>AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
>informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

>Time = real    0m33.61s
>            user    0m0.28s
>            sys     0m0.14s

>Thanks Again
>Jay



>> Hi All,

>> I'm trying to gather statistics on queries performed against a none
>> relational database and one of a relational model.  When I execute a query
>> it seem the non-relational model performs faster, and I cannot decipher
>the
>> set explain output.  What is meant by Estimated cost?  And why does the
>> faster query (non-relational) have a higher cost?  What is estimated # of
>> rows?  The number of rows that are returned by both queries is 9062.  But
>> the relation model estimates 87 and the non-relational estimates 1.  I
>would
>> appreciate any clarification on this to help me optimize the query path
>> better.

 
 
 

SET EXPLAIN output

Post by Jay » Thu, 04 Apr 2002 02:27:04


I'm sorry the time is inverted, the relational model is
                 real    0m33.61s
                 user    0m0.28s
                 sys     0m0.14s
and the non-relational model is
                  real    0m9.15s
                  user    0m2.05s
                  sys     0m0.69s

The relational model uses primary and foreign keys while the non-relational
has no keys and is joined by redundant data held in the 2 tables (a.sin
=b.sin, a.corp=b.corp..ect).  I have updated statistics on all tables.





> >I thought I might as well attach both queries and set explain output for
> >those that might interested

> I'm confused: you said that the non-relational query was faster?
> According to this, it was much, much faster? And what does
> non-relational mean, anyway?

> ># of rows in table a = 6010, b= 523774

> >example Query 1 -relational
> >select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
> >from e03_plan1 a, e06_dept1 b
> >where  a.id = b.id
> >and a.sin matches '*'
> >and a.corp matches 'G01682'
> >and a.plan matches '*'
> >and a.seq = 1;

> >Estimated Cost: 979
> >Estimated # of Rows Returned: 1

> >1) informix.a: SEQUENTIAL SCAN

> >    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
> >'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> > 1 ) ) )

> >2) informix.b: INDEX PATH

> >    (1) Index Keys: id
> >        Lower Index Filter: informix.b.id = informix.a.id
> >Time =  real    0m9.15s
> >             user    0m2.05s
> >             sys     0m0.69s

> >Query2 -non-relational
> ># of rows in table b= 6010, a= 523774

> >select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
> >from e06_dept a, e03_plan b
> >where b.sin = a.sina
> >nd b.corp = a.corp
> >and b.plan = a.plana
> >nd b.seq = a.seq
> >and b.sin matches '*'
> >and b.corp matches 'G01682'
> >and b.plan matches '*'
> >and b.seq = 1

> >Estimated Cost: 561
> >Estimated # of Rows Returned: 1

> >1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
> >(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
> >informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
(Key-Only)2)
> >informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
> >to_acct ref_no        Lower Index Filter: (informix.a.seq =
informix.b.seq
> >AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
> >informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

> >Time = real    0m33.61s
> >            user    0m0.28s
> >            sys     0m0.14s

> >Thanks Again
> >Jay



> >> Hi All,

> >> I'm trying to gather statistics on queries performed against a none
> >> relational database and one of a relational model.  When I execute a
query
> >> it seem the non-relational model performs faster, and I cannot decipher
> >the
> >> set explain output.  What is meant by Estimated cost?  And why does the
> >> faster query (non-relational) have a higher cost?  What is estimated #
of
> >> rows?  The number of rows that are returned by both queries is 9062.
But
> >> the relation model estimates 87 and the non-relational estimates 1.  I
> >would
> >> appreciate any clarification on this to help me optimize the query path
> >> better.

 
 
 

SET EXPLAIN output

Post by Obnoxio The Clo » Thu, 04 Apr 2002 02:51:08




Quote:>I'm sorry the time is inverted, the relational model is
>                 real    0m33.61s
>                 user    0m0.28s
>                 sys     0m0.14s
>and the non-relational model is
>                  real    0m9.15s
>                  user    0m2.05s
>                  sys     0m0.69s

>The relational model uses primary and foreign keys while the non-relational
>has no keys and is joined by redundant data held in the 2 tables (a.sin
>=b.sin, a.corp=b.corp..ect).  I have updated statistics on all tables.

How did you update statistics on both tables?

Also, why do you have the following in your where clause:

Quote:>> >and a.sin matches '*'
>> >and a.plan matches '*'

?




>> >I thought I might as well attach both queries and set explain output for
>> >those that might interested

>> I'm confused: you said that the non-relational query was faster?
>> According to this, it was much, much faster? And what does
>> non-relational mean, anyway?

>> ># of rows in table a = 6010, b= 523774

>> >example Query 1 -relational
>> >select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
>> >from e03_plan1 a, e06_dept1 b
>> >where  a.id = b.id
>> >and a.sin matches '*'
>> >and a.corp matches 'G01682'
>> >and a.plan matches '*'
>> >and a.seq = 1;

>> >Estimated Cost: 979
>> >Estimated # of Rows Returned: 1

>> >1) informix.a: SEQUENTIAL SCAN

>> >    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
>> >'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
>> > 1 ) ) )

>> >2) informix.b: INDEX PATH

>> >    (1) Index Keys: id
>> >        Lower Index Filter: informix.b.id = informix.a.id
>> >Time =  real    0m9.15s
>> >             user    0m2.05s
>> >             sys     0m0.69s

>> >Query2 -non-relational
>> ># of rows in table b= 6010, a= 523774

>> >select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
>> >from e06_dept a, e03_plan b
>> >where b.sin = a.sina
>> >nd b.corp = a.corp
>> >and b.plan = a.plana
>> >nd b.seq = a.seq
>> >and b.sin matches '*'
>> >and b.corp matches 'G01682'
>> >and b.plan matches '*'
>> >and b.seq = 1

>> >Estimated Cost: 561
>> >Estimated # of Rows Returned: 1

>> >1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
>> >(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
>> >informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
>(Key-Only)2)
>> >informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
>> >to_acct ref_no        Lower Index Filter: (informix.a.seq =
>informix.b.seq
>> >AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
>> >informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

>> >Time = real    0m33.61s
>> >            user    0m0.28s
>> >            sys     0m0.14s

>> >Thanks Again
>> >Jay



>> >> Hi All,

>> >> I'm trying to gather statistics on queries performed against a none
>> >> relational database and one of a relational model.  When I execute a
>query
>> >> it seem the non-relational model performs faster, and I cannot decipher
>> >the
>> >> set explain output.  What is meant by Estimated cost?  And why does the
>> >> faster query (non-relational) have a higher cost?  What is estimated #
>of
>> >> rows?  The number of rows that are returned by both queries is 9062.
>But
>> >> the relation model estimates 87 and the non-relational estimates 1.  I
>> >would
>> >> appreciate any clarification on this to help me optimize the query path
>> >> better.

 
 
 

SET EXPLAIN output

Post by Obnoxio The Clo » Thu, 04 Apr 2002 02:58:56




Quote:>I thought I might as well attach both queries and set explain output for
>those that might interested

Well, as far as I can see, the optimiser is behaving perfectly
rationally: it has obviously decided that in the non-relational
version, the extra selectivity on table B makes it more sensible to
drive the query from that than from table A. Then joining the result
set is quicker. I think it's the "corp" field in table B in the
non-relational query that makes the difference, not the join strategy.
For some reason, A.corp is not as selective as B.corp.
># of rows in table a = 6010, b= 523774

>example Query 1 -relational
>select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
>from e03_plan1 a, e06_dept1 b
>where  a.id = b.id
>and a.sin matches '*'
>and a.corp matches 'G01682'
>and a.plan matches '*'
>and a.seq = 1;

>Estimated Cost: 979
>Estimated # of Rows Returned: 1

>1) informix.a: SEQUENTIAL SCAN

>    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
>'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> 1 ) ) )

>2) informix.b: INDEX PATH

>    (1) Index Keys: id
>        Lower Index Filter: informix.b.id = informix.a.id
>Time =  real    0m9.15s
>             user    0m2.05s
>             sys     0m0.69s

>Query2 -non-relational
># of rows in table b= 6010, a= 523774

>select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
>from e06_dept a, e03_plan b
>where b.sin = a.sina
>nd b.corp = a.corp
>and b.plan = a.plana
>nd b.seq = a.seq
>and b.sin matches '*'
>and b.corp matches 'G01682'
>and b.plan matches '*'
>and b.seq = 1

>Estimated Cost: 561
>Estimated # of Rows Returned: 1

>1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
>(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
>informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq   (Key-Only)2)
>informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
>to_acct ref_no        Lower Index Filter: (informix.a.seq = informix.b.seq
>AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
>informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

>Time = real    0m33.61s
>            user    0m0.28s
>            sys     0m0.14s

>Thanks Again
>Jay



>> Hi All,

>> I'm trying to gather statistics on queries performed against a none
>> relational database and one of a relational model.  When I execute a query
>> it seem the non-relational model performs faster, and I cannot decipher
>the
>> set explain output.  What is meant by Estimated cost?  And why does the
>> faster query (non-relational) have a higher cost?  What is estimated # of
>> rows?  The number of rows that are returned by both queries is 9062.  But
>> the relation model estimates 87 and the non-relational estimates 1.  I
>would
>> appreciate any clarification on this to help me optimize the query path
>> better.

>> Thanks
>> Jay

 
 
 

SET EXPLAIN output

Post by Jay » Thu, 04 Apr 2002 04:57:08


So joining tables via primary and foreign keys are slower than joining
multiple columns?





> >I thought I might as well attach both queries and set explain output for
> >those that might interested

> Well, as far as I can see, the optimiser is behaving perfectly
> rationally: it has obviously decided that in the non-relational
> version, the extra selectivity on table B makes it more sensible to
> drive the query from that than from table A. Then joining the result
> set is quicker. I think it's the "corp" field in table B in the
> non-relational query that makes the difference, not the join strategy.
> For some reason, A.corp is not as selective as B.corp.

> ># of rows in table a = 6010, b= 523774

> >example Query 1 -relational
> >select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
> >from e03_plan1 a, e06_dept1 b
> >where  a.id = b.id
> >and a.sin matches '*'
> >and a.corp matches 'G01682'
> >and a.plan matches '*'
> >and a.seq = 1;

> >Estimated Cost: 979
> >Estimated # of Rows Returned: 1

> >1) informix.a: SEQUENTIAL SCAN

> >    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
> >'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> > 1 ) ) )

> >2) informix.b: INDEX PATH

> >    (1) Index Keys: id
> >        Lower Index Filter: informix.b.id = informix.a.id
> >Time =  real    0m9.15s
> >             user    0m2.05s
> >             sys     0m0.69s

> >Query2 -non-relational
> ># of rows in table b= 6010, a= 523774

> >select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
> >from e06_dept a, e03_plan b
> >where b.sin = a.sina
> >nd b.corp = a.corp
> >and b.plan = a.plana
> >nd b.seq = a.seq
> >and b.sin matches '*'
> >and b.corp matches 'G01682'
> >and b.plan matches '*'
> >and b.seq = 1

> >Estimated Cost: 561
> >Estimated # of Rows Returned: 1

> >1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
> >(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
> >informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
(Key-Only)2)
> >informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
> >to_acct ref_no        Lower Index Filter: (informix.a.seq =
informix.b.seq
> >AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
> >informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

> >Time = real    0m33.61s
> >            user    0m0.28s
> >            sys     0m0.14s

> >Thanks Again
> >Jay



> >> Hi All,

> >> I'm trying to gather statistics on queries performed against a none
> >> relational database and one of a relational model.  When I execute a
query
> >> it seem the non-relational model performs faster, and I cannot decipher
> >the
> >> set explain output.  What is meant by Estimated cost?  And why does the
> >> faster query (non-relational) have a higher cost?  What is estimated #
of
> >> rows?  The number of rows that are returned by both queries is 9062.
But
> >> the relation model estimates 87 and the non-relational estimates 1.  I
> >would
> >> appreciate any clarification on this to help me optimize the query path
> >> better.

> >> Thanks
> >> Jay

 
 
 

SET EXPLAIN output

Post by Obnoxio The Clo » Thu, 04 Apr 2002 05:56:27




Quote:>So joining tables via primary and foreign keys are slower than joining
>multiple columns?

<sigh> "I think it's the "corp" field in table B in the non-relational
query that makes the difference, not the join strategy."

The two queries are completely different. Non-relational starts with
table B, relational starts with table A. I think the performance
difference can be attributed to the fact that the corp column in table
B severely restricts the number of rows returned from table B, making
for a more speedy query. As an experiment, try adding the "corp"
column to B in the relational query and try

select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt,
b.deposit_dt
from e03_plan1 a, e06_dept1 b
where  a.id = b.id
and b.corp matches 'G01682'
and a.seq = 1;

I still want to know why you have these in your WHERE clause:
"and a.sin matches '*'
and a.plan matches '*'"





>> >I thought I might as well attach both queries and set explain output for
>> >those that might interested

>> Well, as far as I can see, the optimiser is behaving perfectly
>> rationally: it has obviously decided that in the non-relational
>> version, the extra selectivity on table B makes it more sensible to
>> drive the query from that than from table A. Then joining the result
>> set is quicker. I think it's the "corp" field in table B in the
>> non-relational query that makes the difference, not the join strategy.
>> For some reason, A.corp is not as selective as B.corp.

>> ># of rows in table a = 6010, b= 523774

>> >example Query 1 -relational
>> >select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
>> >from e03_plan1 a, e06_dept1 b
>> >where  a.id = b.id
>> >and a.sin matches '*'
>> >and a.corp matches 'G01682'
>> >and a.plan matches '*'
>> >and a.seq = 1;

>> >Estimated Cost: 979
>> >Estimated # of Rows Returned: 1

>> >1) informix.a: SEQUENTIAL SCAN

>> >    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
>> >'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
>> > 1 ) ) )

>> >2) informix.b: INDEX PATH

>> >    (1) Index Keys: id
>> >        Lower Index Filter: informix.b.id = informix.a.id
>> >Time =  real    0m9.15s
>> >             user    0m2.05s
>> >             sys     0m0.69s

>> >Query2 -non-relational
>> ># of rows in table b= 6010, a= 523774

>> >select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
>> >from e06_dept a, e03_plan b
>> >where b.sin = a.sina
>> >nd b.corp = a.corp
>> >and b.plan = a.plana
>> >nd b.seq = a.seq
>> >and b.sin matches '*'
>> >and b.corp matches 'G01682'
>> >and b.plan matches '*'
>> >and b.seq = 1

>> >Estimated Cost: 561
>> >Estimated # of Rows Returned: 1

>> >1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
>> >(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
>> >informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
>(Key-Only)2)
>> >informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
>> >to_acct ref_no        Lower Index Filter: (informix.a.seq =
>informix.b.seq
>> >AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
>> >informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

>> >Time = real    0m33.61s
>> >            user    0m0.28s
>> >            sys     0m0.14s

>> >Thanks Again
>> >Jay



>> >> Hi All,

>> >> I'm trying to gather statistics on queries performed against a none
>> >> relational database and one of a relational model.  When I execute a
>query
>> >> it seem the non-relational model performs faster, and I cannot decipher
>> >the
>> >> set explain output.  What is meant by Estimated cost?  And why does the
>> >> faster query (non-relational) have a higher cost?  What is estimated #
>of
>> >> rows?  The number of rows that are returned by both queries is 9062.
>But
>> >> the relation model estimates 87 and the non-relational estimates 1.  I
>> >would
>> >> appreciate any clarification on this to help me optimize the query path
>> >> better.

>> >> Thanks
>> >> Jay

 
 
 

SET EXPLAIN output

Post by Rob_Burba/Austin/Sector » Thu, 04 Apr 2002 05:29:52


Other differences....

Query-1 is a select DISTINCT and is performing a SEQUENTIAL SCAN of table "a
(plan)" then joining to "b (dept)"
Query-2 is just a select and is performing a "KEY ONLY" scan of table "b
(plan)" before doing the indexed join to table "a (dept)"

Since both tables have an index where the first four parts (sin corp plan
seq), why isn't this doing a KEY ONLY scan of table a in Query-1?  Is it
because the index on table a has three additional fields (deposit_dt to_acct
ref_no)?

Another idea...  In query-2 only key fields are needed from table the "dept"
table whereas, in query-1 the key fields from the "dept" table and the id
field are needed to complete the join.

-----Original Message-----


Sent: Tuesday, April 02, 2002 11:59 AM

Subject: Re: SET EXPLAIN output



>I thought I might as well attach both queries and set explain output for
>those that might interested

Well, as far as I can see, the optimiser is behaving perfectly
rationally: it has obviously decided that in the non-relational
version, the extra selectivity on table B makes it more sensible to
drive the query from that than from table A. Then joining the result
set is quicker. I think it's the "corp" field in table B in the
non-relational query that makes the difference, not the join strategy.
For some reason, A.corp is not as selective as B.corp.

># of rows in table a = 6010, b= 523774

>example Query 1 -relational
>select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
>from e03_plan1 a, e06_dept1 b
>where  a.id = b.id
>and a.sin matches '*'
>and a.corp matches 'G01682'
>and a.plan matches '*'
>and a.seq = 1;

>Estimated Cost: 979
>Estimated # of Rows Returned: 1

>1) informix.a: SEQUENTIAL SCAN

>    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
>'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> 1 ) ) )

>2) informix.b: INDEX PATH

>    (1) Index Keys: id
>        Lower Index Filter: informix.b.id = informix.a.id
>Time =  real    0m9.15s
>             user    0m2.05s
>             sys     0m0.69s

>Query2 -non-relational
># of rows in table b= 6010, a= 523774

>select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
>from e06_dept a, e03_plan b
>where b.sin = a.sina
>nd b.corp = a.corp
>and b.plan = a.plana
>nd b.seq = a.seq
>and b.sin matches '*'
>and b.corp matches 'G01682'
>and b.plan matches '*'
>and b.seq = 1

>Estimated Cost: 561
>Estimated # of Rows Returned: 1

>1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
>(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
>informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
(Key-Only)2)
>informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
>to_acct ref_no        Lower Index Filter: (informix.a.seq = informix.b.seq
>AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
>informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

>Time = real    0m33.61s
>            user    0m0.28s
>            sys     0m0.14s

>Thanks Again
>Jay



>> Hi All,

>> I'm trying to gather statistics on queries performed against a none
>> relational database and one of a relational model.  When I execute a
query
>> it seem the non-relational model performs faster, and I cannot decipher
>the
>> set explain output.  What is meant by Estimated cost?  And why does the
>> faster query (non-relational) have a higher cost?  What is estimated # of
>> rows?  The number of rows that are returned by both queries is 9062.  But
>> the relation model estimates 87 and the non-relational estimates 1.  I
>would
>> appreciate any clarification on this to help me optimize the query path
>> better.

>> Thanks
>> Jay

 
 
 

SET EXPLAIN output

Post by Jay » Thu, 04 Apr 2002 06:29:42


Thanks, and sorry for simple questions, for all this is pretty new to me.

The where clause contains the "and a.sin matches '*' and a.plan matches '*'
because values are inserted into the quotes dynamically.  My job is to show
that by changing our existing system (which is a whole lot of a.sin = b.sin,
a.corp=b.corp, ...) into a relational model that contains the data once and
only once and references other tables via primary and foreign keys will not
affect performance (since it already crawls along).  Unfortunately placing
the column 'corp' into table B would be breaking 1nf.





> >So joining tables via primary and foreign keys are slower than joining
> >multiple columns?

> <sigh> "I think it's the "corp" field in table B in the non-relational
> query that makes the difference, not the join strategy."

> The two queries are completely different. Non-relational starts with
> table B, relational starts with table A. I think the performance
> difference can be attributed to the fact that the corp column in table
> B severely restricts the number of rows returned from table B, making
> for a more speedy query. As an experiment, try adding the "corp"
> column to B in the relational query and try

> select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt,
> b.deposit_dt
> from e03_plan1 a, e06_dept1 b
> where  a.id = b.id
> and b.corp matches 'G01682'
> and a.seq = 1;

> I still want to know why you have these in your WHERE clause:
> "and a.sin matches '*'
> and a.plan matches '*'"





> >> >I thought I might as well attach both queries and set explain output
for
> >> >those that might interested

> >> Well, as far as I can see, the optimiser is behaving perfectly
> >> rationally: it has obviously decided that in the non-relational
> >> version, the extra selectivity on table B makes it more sensible to
> >> drive the query from that than from table A. Then joining the result
> >> set is quicker. I think it's the "corp" field in table B in the
> >> non-relational query that makes the difference, not the join strategy.
> >> For some reason, A.corp is not as selective as B.corp.

> >> ># of rows in table a = 6010, b= 523774

> >> >example Query 1 -relational
> >> >select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt,
b.deposit_dt
> >> >from e03_plan1 a, e06_dept1 b
> >> >where  a.id = b.id
> >> >and a.sin matches '*'
> >> >and a.corp matches 'G01682'
> >> >and a.plan matches '*'
> >> >and a.seq = 1;

> >> >Estimated Cost: 979
> >> >Estimated # of Rows Returned: 1

> >> >1) informix.a: SEQUENTIAL SCAN

> >> >    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
> >> >'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> >> > 1 ) ) )

> >> >2) informix.b: INDEX PATH

> >> >    (1) Index Keys: id
> >> >        Lower Index Filter: informix.b.id = informix.a.id
> >> >Time =  real    0m9.15s
> >> >             user    0m2.05s
> >> >             sys     0m0.69s

> >> >Query2 -non-relational
> >> ># of rows in table b= 6010, a= 523774

> >> >select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
> >> >from e06_dept a, e03_plan b
> >> >where b.sin = a.sina
> >> >nd b.corp = a.corp
> >> >and b.plan = a.plana
> >> >nd b.seq = a.seq
> >> >and b.sin matches '*'
> >> >and b.corp matches 'G01682'
> >> >and b.plan matches '*'
> >> >and b.seq = 1

> >> >Estimated Cost: 561
> >> >Estimated # of Rows Returned: 1

> >> >1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
> >> >(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
> >> >informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
> >(Key-Only)2)
> >> >informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
> >> >to_acct ref_no        Lower Index Filter: (informix.a.seq =
> >informix.b.seq
> >> >AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
> >> >informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

> >> >Time = real    0m33.61s
> >> >            user    0m0.28s
> >> >            sys     0m0.14s

> >> >Thanks Again
> >> >Jay



> >> >> Hi All,

> >> >> I'm trying to gather statistics on queries performed against a none
> >> >> relational database and one of a relational model.  When I execute a
> >query
> >> >> it seem the non-relational model performs faster, and I cannot
decipher
> >> >the
> >> >> set explain output.  What is meant by Estimated cost?  And why does
the
> >> >> faster query (non-relational) have a higher cost?  What is estimated
#
> >of
> >> >> rows?  The number of rows that are returned by both queries is 9062.
> >But
> >> >> the relation model estimates 87 and the non-relational estimates 1.
I
> >> >would
> >> >> appreciate any clarification on this to help me optimize the query
path
> >> >> better.

> >> >> Thanks
> >> >> Jay

 
 
 

SET EXPLAIN output

Post by Karl & Betty Schende » Thu, 04 Apr 2002 07:12:42




> ... What is meant by Estimated cost?  And why does the
> faster query (non-relational) have a higher cost? ...

Any cost-based optimizer has to estimate a cost for any given query plan.
The optimizer generates possible plans, costs them, and picks the one
with the smallest cost.

Obviously this cost is the optimizer's estimate.  It might be close to
reality, or it might be pure hemp smoke.  In fact, the accuracy of the
optimizer's cost estimates are totally irrelevant as long as it picks
the right (i.e. fastest in actual execution) query plan.

 
 
 

SET EXPLAIN output

Post by Murray Woo » Thu, 04 Apr 2002 08:32:20


NO   Based upon the number of rows in those tables and the distribution
information available to the optimiser for that query, that is the best way
to obtain the requested result.

I dont understand how you could come to any generalisation based upon one
query.

-----Original Message-----


Sent: Wednesday, 3 April 2002 7:57 a.m.

Subject: Re: SET EXPLAIN output

So joining tables via primary and foreign keys are slower than joining
multiple columns?





> >I thought I might as well attach both queries and set explain output for
> >those that might interested

> Well, as far as I can see, the optimiser is behaving perfectly
> rationally: it has obviously decided that in the non-relational
> version, the extra selectivity on table B makes it more sensible to
> drive the query from that than from table A. Then joining the result
> set is quicker. I think it's the "corp" field in table B in the
> non-relational query that makes the difference, not the join strategy.
> For some reason, A.corp is not as selective as B.corp.

> ># of rows in table a = 6010, b= 523774

> >example Query 1 -relational
> >select distinct a.sin, a.corp, a.plan, a.seq, b.effect_dt, b.deposit_dt
> >from e03_plan1 a, e06_dept1 b
> >where  a.id = b.id
> >and a.sin matches '*'
> >and a.corp matches 'G01682'
> >and a.plan matches '*'
> >and a.seq = 1;

> >Estimated Cost: 979
> >Estimated # of Rows Returned: 1

> >1) informix.a: SEQUENTIAL SCAN

> >    Filters: (informix.a.sin MATCHES '*' AND (informix.a.corp MATCHES
> >'G01682' AND (informix.a.plan MATCHES '*' AND informix.a.seq =
> > 1 ) ) )

> >2) informix.b: INDEX PATH

> >    (1) Index Keys: id
> >        Lower Index Filter: informix.b.id = informix.a.id
> >Time =  real    0m9.15s
> >             user    0m2.05s
> >             sys     0m0.69s

> >Query2 -non-relational
> ># of rows in table b= 6010, a= 523774

> >select b.sin, b.corp, b.plan, b.seq, a.effect_dt, a.deposit_dt
> >from e06_dept a, e03_plan b
> >where b.sin = a.sina
> >nd b.corp = a.corp
> >and b.plan = a.plana
> >nd b.seq = a.seq
> >and b.sin matches '*'
> >and b.corp matches 'G01682'
> >and b.plan matches '*'
> >and b.seq = 1

> >Estimated Cost: 561
> >Estimated # of Rows Returned: 1

> >1) informix.b: INDEX PATH    Filters: (informix.b.sin MATCHES '*' AND
> >(informix.b.corp MATCHES 'G01682' AND (informix.b.plan MATCHES '*' AND
> >informix.b.seq = 1 ) ) )    (1) Index Keys: sin corp plan seq
(Key-Only)2)
> >informix.a: INDEX PATH    (1) Index Keys: sin corp plan seq deposit_dt
> >to_acct ref_no        Lower Index Filter: (informix.a.seq =
informix.b.seq
> >AND (informix.a.plan = informix.b.plan AND (informix.a.corp =
> >informix.b.corp AND informix.a.sin = informix.b.sin ) ) )

> >Time = real    0m33.61s
> >            user    0m0.28s
> >            sys     0m0.14s

> >Thanks Again
> >Jay



> >> Hi All,

> >> I'm trying to gather statistics on queries performed against a none
> >> relational database and one of a relational model.  When I execute a
query
> >> it seem the non-relational model performs faster, and I cannot decipher
> >the
> >> set explain output.  What is meant by Estimated cost?  And why does the
> >> faster query (non-relational) have a higher cost?  What is estimated #
of
> >> rows?  The number of rows that are returned by both queries is 9062.
But
> >> the relation model estimates 87 and the non-relational estimates 1.  I
> >would
> >> appreciate any clarification on this to help me optimize the query path
> >> better.

> >> Thanks
> >> Jay

 
 
 

SET EXPLAIN output

Post by David McPau » Thu, 04 Apr 2002 09:59:10


Just thought I would a few things.

        Estimated Cost is only usefull when compared to another Estimated
Cost.  It is the result of a lot of different calculations (No of rows, no
of IO's, selectivity of indexes, sort times, etc).  By itself it has no
meaning ie.  It is NOT no of seconds or no of io's required.  It is just a
calculated number based on a formula that only the informix engine knows.

        Generally the lower the estimated cost the faster the query.
However, this is only generally true it is entirely possible for the
optimiser to get it wrong but this should hopefully be rare.  Where it
happens you can try a number of things to help out.

                1.  Change indexes
                2.  Update Stats
                3.  Use Optimiser hints
                4.  Change the SQL

        Not neccessarily in that order though.

        Estimated number of rows is how many rows the optimiser thinks will
be returned.  This number is rarely correct except in bleedingly obvious SQL
cases.

Cheers
David

Everything above is how I think things work, how things actually work may be
very different.  The wonderfull thing about the internet is that as soon as
you say how things work there will be plenty of people lineing up to tell
you that you are wrong as well as one person who will tell you that your
spelling sucks.

-----Original Message-----

Sent: Wednesday, April 03, 2002 2:52 AM

Subject: SET EXPLAIN output

Hi All,

I'm trying to gather statistics on queries performed against a none
relational database and one of a relational model.  When I execute a query
it seem the non-relational model performs faster, and I cannot decipher the
set explain output.  What is meant by Estimated cost?  And why does the
faster query (non-relational) have a higher cost?  What is estimated # of
rows?  The number of rows that are returned by both queries is 9062.  But
the relation model estimates 87 and the non-relational estimates 1.  I would
appreciate any clarification on this to help me optimize the query path
better.

Thanks
Jay



 
 
 

SET EXPLAIN output

Post by Max Yak » Thu, 04 Apr 2002 15:27:10



> Just thought I would a few things.

Techno-snip

Quote:> Cheers
> David

> Everything above is how I think things work, how things actually work may be
> very different.  The wonderfull thing about the internet is that as soon as
> you say how things work there will be plenty of people lineing up to tell
> you that you are wrong as well as one person who will tell you that your
> spelling sucks.

More snips

Da!  Spelling sucks!
Also, grammar sucks (Just thought I would a few things).
Also, hook to start *ic sub-thread - key word 'Cheers'.
Also, hook to start * sub-thread - key word 'sucks'.
Also, you may be correct... maybe only one person 'lineing[sic] up
to tell you that you are wrong' - me.

But seem like good tips in techno-text must say.
Is what counts most.  :-)

HTF

Max V. Yakov

 
 
 

1. Set explain output

All

Does anybody know if it is possible to construct a report similar to
that produced by set explain using the sysmaster tables in V7?

I know that some of the information is available there, but not sure
about the info on what indexes and filters etc have been chosen.

I am trying to automate the process of detecting sequential scans
and determining the processing path. The purpose of this is to avoid
rerunning the query on a production machine.

Mark


2. INGRES Work

3. SET EXPLAIN output file for OWS 7.22 on NT?

4. Novice question - Please Help

5. output of set explain ?

6. Add field to table

7. Set Explain -- Please explain

8. Replication within workgroup

9. Explain this query output

10. Explain Output made Colourful

11. Explain Output made Colorful

12. Sending Explain output to a perl client

13. Help interpreting the output of EXPLAIN