Performance question

Performance question

Post by Guillermo Labatt » Sun, 31 Dec 1899 09:00:00



Hi,
I have been told that

select * from table1 where col between value1 and value2

is faster than

select * from table1 where col>=value1 and col<=value2

Is that true for Online 5.07?

How about

select * from table1 where col in (value1,value2,value3,value4, ...)

vs

select * from table1 where col=value1 or col=value2 or col=value3 or ...

?

Thanks

 
 
 

Performance question

Post by Rui Domingo » Sun, 31 Dec 1899 09:00:00


I can help you for this two:

Quote:

>select * from table1 where col in (value1,value2,value3,value4, ...)

>vs

>select * from table1 where col=value1 or col=value2 or col=value3 or ...

The fastest way is neither one, but:

select * from table1 where col=value1
UNION
select * from table1 where col=value2
UNION
select * from table1 where col=value3

This is true because Informix doesn't use indexes in the presence of an or
operation.
So if table1 is big and you have an index on col the UNION example will work
much faster, because now you have 3 selects but no ORs.

Rui Domingos


 
 
 

Performance question

Post by Paul Rober » Sun, 31 Dec 1899 09:00:00




>This is true because Informix doesn't use indexes in the presence of an or
>operation.
>So if table1 is big and you have an index on col the UNION example will work
>much faster, because now you have 3 selects but no ORs.

>Rui Domingos


You know, I wonder if this is really true. Everyone says it, and yet my
query

select rev_ref_num
  from asap_rev_extract
 where ( ext_id = 3840000 or ext_id = 383000 )

comes back instantly. 3.5M rows, index on ext_id...

I'm using OnLine 7.3

Paul  (not a spokesman)

 
 
 

Performance question

Post by Rui Domingo » Sun, 31 Dec 1899 09:00:00


Yes, ok, but have you tried with the UNION?
And if is more than one OR?

Rui Domingos

 
 
 

Performance question

Post by David William » Sun, 31 Dec 1899 09:00:00




Quote:

>Hi,
>I have been told that

>select * from table1 where col between value1 and value2

>is faster than

>select * from table1 where col>=value1 and col<=value2

  Proably little difference. I just prefer the first syntax as it is
  clearer what you mean..

Quote:>Is that true for Online 5.07?

>How about

>select * from table1 where col in (value1,value2,value3,value4, ...)

  That will do a sequential scan of the table

Quote:>vs

>select * from table1 where col=value1 or col=value2 or col=value3 or ...

  and so will that

  instead do

   select * from table1 where col1=value1
   union
   select * from table1 where col1=value2
   union
   select * from table1 where col1=value3
   union
   select * from table1 where col1=value4

   Or course with PDQ enabled the way this can be parallelized is more
   obvious to the engine...

   Can anyone confirm that with PDQPRIORITY=1 i.e. parallel scans
   that unions are scanned in parallel?

Quote:>?

>Thanks

--
David Williams

Maintainer of the Informix FAQ
 Primary site (Beta Version)  http://www.smooth1.demon.co.uk
 Official site                http://www.iiug.org/techinfo/faq/faq_top.html

I see you standin', Standin' on your own, It's such a lonely place for you, For
you to be If you need a shoulder, Or if you need a friend, I'll be here
standing, Until the bitter end...
So don't chastise me Or think I, I mean you harm...
All I ever wanted Was for you To know that I care

 
 
 

Performance question

Post by Art S. Kage » Sun, 31 Dec 1899 09:00:00



> Hi,
> I have been told that

> select * from table1 where col between value1 and value2

> is faster than

> select * from table1 where col>=value1 and col<=value2

> Is that true for Online 5.07?

> How about

> select * from table1 where col in (value1,value2,value3,value4, ...)

> vs

> select * from table1 where col=value1 or col=value2 or col=value3 or

In 5.0x this was always true.  In 7.2x which has a more sophisticated
optimizer, which feels free to rewrite your queries, the optimizer may
replace between with ..<=..>= or relace ..<=..>= with between and
replace an IN clause with a list of OR'd equalities or replace a list
of ORd equalities with an IN clause or it may leave things alone,
depending on data distributions, fragmentation, PDQPRIORITY, etc.  So
in 7.2x all bets are off.  In general, even in 7.2x, IN is better than
OR and BETWEEN better than a range.

Art S. Kagel

 
 
 

Performance question

Post by Leffler, Jonatha » Sun, 31 Dec 1899 09:00:00


Quote:>----------


>Sent:  Thursday, February 26, 1998 10:43 AM


>>This is true because Informix doesn't use indexes in the presence of an OR operation.
>>So if table1 is big and you have an index on col the UNION example will work
>>much faster, because now you have 3 selects but no ORs.
>You know, I wonder if this is really true. Everyone says it, and yet my query
>select rev_ref_num
>  from asap_rev_extract
> where ( ext_id = 3840000 or ext_id = 383000 )

>comes back instantly. 3.5M rows, index on ext_id...
>I'm using OnLine 7.3

There are two factors which affect whether the OR operation works with
indexes or not, and they're the usual culprits - the version of the engine
and the SQL statement.

Some versions of the engine didn't handle OR conditions very well; I don't
have the details available.  You can find out whether your version does by
looking at the SET EXPLAIN output.

Some OR conditions are not readily handlable with indexes and others are.
Paul's example is one that can be dealt with because both clauses in the OR
are on the same column.  The condition could also be written as:

        ext_id IN (3840000,383000)

Other OR conditions cannot easily be handled with indexes: for example, a
OR condition referring to two different columns on either side is not so easily
handled:

        (ext_id = 3840000) OR (other_column BETWEEN 37 and 92)

Even if there is an index on ext_id and a separate index on other_column,
it isn't clear how the optimizer will, or should, handle the query.  If there isn't
much overlap between the sets of rows identified by the two conditions, then
it may be quicker to do a UNION because there won't be much to do in the
duplicate elimination phase of UNION.  On the other hand, if there is a lot of
overlap (so many rows satisfy both conditions), then the single scan may be
quicker than the UNION with two separate indexed SELECT operations.  It
also depends on the size of the table, of course.

Isn't life fun!  The answer is 'it varies, and it depends'...

Yours,

 
 
 

Performance question

Post by Sam Brow » Sun, 31 Dec 1899 09:00:00


Everyone can argue about the best way, but this depends on context. You
can find the answer to your question empirically by using Technovations
Dbsizr. A free eval copy is on our web site if you are interested...
_____________________________________________________________
Sam Brown, Technovations, Inc.
Concerned about Performance/Capacity/Reliability issues ?
Check out our Sizing products at http://www.technovations.com


> Hi,
> I have been told that

> select * from table1 where col between value1 and value2

> is faster than

> select * from table1 where col>=value1 and col<=value2

> Is that true for Online 5.07?

> How about

> select * from table1 where col in (value1,value2,value3,value4, ...)

> vs

> select * from table1 where col=value1 or col=value2 or col=value3 or ...

> ?

> Thanks

 
 
 

Performance question

Post by David William » Sun, 31 Dec 1899 09:00:00




Quote:

>There are two factors which affect whether the OR operation works with
>indexes or not, and they're the usual culprits - the version of the engine
>and the SQL statement.

>Some versions of the engine didn't handle OR conditions very well; I don't
>have the details available.  You can find out whether your version does by
>looking at the SET EXPLAIN output.

  I thought I saw something in the 7.1 Performance Tuning Guide about
  replacing OR's with UNIONS.

Quote:>Some OR conditions are not readily handlable with indexes and others are.
>Paul's example is one that can be dealt with because both clauses in the OR
>are on the same column.  The condition could also be written as:

>       ext_id IN (3840000,383000)

>Other OR conditions cannot easily be handled with indexes: for example, a
>OR condition referring to two different columns on either side is not so easily
>handled:

>       (ext_id = 3840000) OR (other_column BETWEEN 37 and 92)

>Even if there is an index on ext_id and a separate index on other_column,
>it isn't clear how the optimizer will, or should, handle the query.  If there
>isn't
>much overlap between the sets of rows identified by the two conditions, then
>it may be quicker to do a UNION because there won't be much to do in the
>duplicate elimination phase of UNION.  On the other hand, if there is a lot of
>overlap (so many rows satisfy both conditions), then the single scan may be
>quicker than the UNION with two separate indexed SELECT operations.  It
>also depends on the size of the table, of course.

  True, but can't the duplicate elimination be done in parallel with the
  two scans? Also duplicate elimination should be CPU intensive and
  hence faster than the extra disk I/O required by a table scan..

  Surely then it really comes down to index vs sequential scan?

>Isn't life fun!  The answer is 'it varies, and it depends'...

>Yours,


--
David Williams

Maintainer of the Informix FAQ
 Primary site (Beta Version)  http://www.smooth1.demon.co.uk
 Official site                http://www.iiug.org/techinfo/faq/faq_top.html

I see you standin', Standin' on your own, It's such a lonely place for you, For
you to be If you need a shoulder, Or if you need a friend, I'll be here
standing, Until the bitter end...
So don't chastise me Or think I, I mean you harm...
All I ever wanted Was for you To know that I care

 
 
 

1. SQL performance question

I would be grateful for any comments on SQL performance issues on the
following.

1. Is there any difference between:
 a)   term IN ('term1', 'term2')
 b)   term='term1' OR term='term2'

2. I have a large number of terms that I have to search against a
user's input string in order to find phrases thus
 'cherry blossom days' LIKE '*'&term&'*'
   Most of the terms are single words with just a few multi-word
phrases. Is there any performance gain by ANDing the above with
  name LIKE '* *'
   in order to reduce non-indexed string searching. Or would it be
better to include a boolean field indicating where terms are
multi-word phrases.

2. Accessing Exchange Store in SQL Server w/VB client

3. two newbie performance questions

4. Query-Tools review's?

5. Select statement performance question - DDL included

6. problem on using DataEnvironment, recordset and seek

7. Newbie Performance Questions

8. moving sql to another server

9. Performance question

10. Performance questions (long)

11. MS SQL server performance question

12. Performance Question