Determining which index to create

Determining which index to create

Post by Eric Chol » Thu, 22 Nov 2001 20:35:34



I have the following table :

 Attribute |           Type           | Modifier
-----------+--------------------------+----------
 motid     | integer                  | not null
 objid     | integer                  | not null
 date      | timestamp with time zone | not null

...with 140 million rows. For each distinct value of
motid there are many rows (with different objid/dates).
I would like to optimize the following query:

=> select * from dico_frs where motid=4742 order by date desc limit 10;

Creating an index on 'date' makes the query use that index:

Limit  (cost=0.00..17591.91 rows=10 width=16)
  ->  Index Scan Backward using dico_frs_date on dico_frs
(cost=0.00..20023641.63 rows=11382 width=16)

But it's still quite slow. I'm thinking an index on (motid, date desc)
would be best
but that doesn't seem to be possible. How can I optimize this query?

--
Eric Cholet

---------------------------(end of broadcast)---------------------------

 
 
 

Determining which index to create

Post by Martijn van Oosterho » Thu, 22 Nov 2001 20:57:09



> I would like to optimize the following query:

> => select * from dico_frs where motid=4742 order by date desc limit 10;

> But it's still quite slow. I'm thinking an index on (motid, date desc)
> would be best
> but that doesn't seem to be possible. How can I optimize this query?

Indexes (at least btree ones) can be scanned in either forward or backward
directions. So an index on (motid,date) should be fine.
--

http://svana.org/kleptog/

Quote:> Magnetism, electricity and motion are like a three-for-two special offer:
> if you have two of them, the third one comes free.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

Determining which index to create

Post by Eric Chol » Thu, 22 Nov 2001 21:05:44


--On mercredi 21 novembre 2001 22:48 +1100 Martijn van Oosterhout



>> I would like to optimize the following query:

>> => select * from dico_frs where motid=4742 order by date desc limit 10;

>> But it's still quite slow. I'm thinking an index on (motid, date desc)
>> would be best
>> but that doesn't seem to be possible. How can I optimize this query?

> Indexes (at least btree ones) can be scanned in either forward or backward
> directions. So an index on (motid,date) should be fine.

I should have mentionned I tried that, but it isn't being used:

=> \d dico_frs_motid_date
     Index "dico_frs_motid_date"
 Attribute |           Type
-----------+--------------------------
 motid     | integer
 date      | timestamp with time zone
btree

=> explain select * from dico_frs where motid=4742 order by date desc limit
10;
NOTICE:  QUERY PLAN:

Limit  (cost=0.00..17591.91 rows=10 width=16)
  ->  Index Scan Backward using dico_frs_date on dico_frs
(cost=0.00..20023641.63 rows=11382 width=16)

--
Eric Cholet

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

Determining which index to create

Post by Martijn van Oosterho » Thu, 22 Nov 2001 23:35:42



> I should have mentionned I tried that, but it isn't being used:

> => \d dico_frs_motid_date
>      Index "dico_frs_motid_date"
>  Attribute |           Type
> -----------+--------------------------
>  motid     | integer
>  date      | timestamp with time zone
> btree

> => explain select * from dico_frs where motid=4742 order by date desc limit
> 10;
> NOTICE:  QUERY PLAN:

> Limit  (cost=0.00..17591.91 rows=10 width=16)
>   ->  Index Scan Backward using dico_frs_date on dico_frs
> (cost=0.00..20023641.63 rows=11382 width=16)

Well, it is doing the scan backwards, which is good. But it's not using the
index. If you drop dico_frs_date index, does it do it then?

Oh, and what version of postgres was this again?
--

http://svana.org/kleptog/

Quote:> Magnetism, electricity and motion are like a three-for-two special offer:
> if you have two of them, the third one comes free.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

Determining which index to create

Post by Eric Chol » Fri, 23 Nov 2001 00:25:56


--On jeudi 22 novembre 2001 01:04 +1100 Martijn van Oosterhout



>> I should have mentionned I tried that, but it isn't being used:

>> => \d dico_frs_motid_date
>>      Index "dico_frs_motid_date"
>>  Attribute |           Type
>> -----------+--------------------------
>>  motid     | integer
>>  date      | timestamp with time zone
>> btree

>> => explain select * from dico_frs where motid=4742 order by date desc
>> limit  10;
>> NOTICE:  QUERY PLAN:

>> Limit  (cost=0.00..17591.91 rows=10 width=16)
>>   ->  Index Scan Backward using dico_frs_date on dico_frs
>> (cost=0.00..20023641.63 rows=11382 width=16)

> Well, it is doing the scan backwards, which is good. But it's not using
> the index. If you drop dico_frs_date index, does it do it then?

=> explain select * from dico_frs where motid=4742 order by date desc limit
10;
NOTICE:  QUERY PLAN:

Limit  (cost=46172.25..46172.25 rows=10 width=16)
  ->  Sort  (cost=46172.25..46172.25 rows=11382 width=16)
        ->  Index Scan using dico_frs_motid_date on dico_frs
(cost=0.00..45405.39 rows=11382 width=16)

It's a bit better but still quite long, depending on how many rows for a
particular motid.
Dropping the "desc" in the "order by date" clause makes things much faster,
but I need the
results in reverse chronological order!

Quote:> Oh, and what version of postgres was this again?

=> select version();
                               version
---------------------------------------------------------------------
 PostgreSQL 7.1.3 on i386-unknown-freebsd4.4, compiled by GCC 2.95.3
(1 row)

Thanks for your help,
--
Eric Cholet

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

 
 
 

Determining which index to create

Post by Stephan Sza » Fri, 23 Nov 2001 02:05:55



> => explain select * from dico_frs where motid=4742 order by date desc limit
> 10;
> NOTICE:  QUERY PLAN:

> Limit  (cost=46172.25..46172.25 rows=10 width=16)
>   ->  Sort  (cost=46172.25..46172.25 rows=11382 width=16)
>         ->  Index Scan using dico_frs_motid_date on dico_frs
> (cost=0.00..45405.39 rows=11382 width=16)

> It's a bit better but still quite long, depending on how many rows for a
> particular motid.
> Dropping the "desc" in the "order by date" clause makes things much faster,
> but I need the
> results in reverse chronological order!

Hmm, it looks like the sort is the expensive bit even though it's
estimating something low for it (relative to the index scan).
Have you tried setting sort_mem higher than the defaults (which are
really low) to see if it's just going out to disk for the sort.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

 
 
 

Determining which index to create

Post by Eric Chol » Fri, 23 Nov 2001 23:27:00






>> > => explain select * from dico_frs where motid=4742 order by date desc
>> > limit 10;
>> > NOTICE:  QUERY PLAN:

>> > Limit  (cost=46172.25..46172.25 rows=10 width=16)
>> >   ->  Sort  (cost=46172.25..46172.25 rows=11382 width=16)
>> >         ->  Index Scan using dico_frs_motid_date on dico_frs
>> > (cost=0.00..45405.39 rows=11382 width=16)

>> That's wrong. It doesn't seem to realise that a reverse scan on the index
>> would give the right answer. Note that that's only true because you're
>> selecting only a single motid. If there were multiple, a reverse scan
>> would definitly not be appropriate.

> Please try
>   select * from dico_frs where motid=4742 order by motid desc,
>   date desc limit 10;

Wow, I am speechless. Sub-second response time, whether the result set
is large or very small. Very impressive. I have resisted pressure to
use Oracle for this application, trusting open source software would do
an equivalent or better job (this table has 140 million records).

Thank you very much.

--
Eric Cholet

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

 
 
 

1. Indexes - determining what to create

Hi all,
        can anyone spare some time to make some comments on the scenario below.

Assume I have a table (call it boxes, say) whose columns are A,B,C,D,E,F,G,H,I,J.

A is an identity column, and is the primary key
B is a foreign key
C and D make up an alternate key - C has by far the greater number of repeat values
E is a single char
F is a varchar 75
G is an integer
H is an image
I is a text
J is a timestamp

The following are queries on the table:

Get rows between two A values
Get rows between two A values and where E is a value
Get rows for a B
Get a row from a G, E and B
Get an A from an C and D
Get F's where E is a value
Updates use A only
Deletes are by A,B or J

Can anyone suggest what indexes I should create on the table, and give an insight as to why?

Thanks in advance
Wayne
E-mail appreciated and preferred :)

2. Advanced text search packages available.

3. (create index) vs (create index ... online)

4. rdoRegisterDataSource with SQL Server 7.0

5. ORA-01031 with Create Index for a Function Based Index

6. Database backup and restore

7. reindex vs. drop index , create index

8. ADO Recordset Delete

9. System Index vs. User Created Index

10. Drop Index, BCP Import and Create Index (Ver 6.5)

11. Create Index UNIQUE contraint or index?

12. Primary key using index alt. creating index

13. Index dees not match the table, delete the index file and re-create