index on a box

index on a box

Post by Dustin Sallin » Fri, 22 Jun 2001 03:05:35



        I've got a site with a ton of geometric data and I'm using a
little of postgresql's geometrical types.  I've got very large polygons,
up to 12kilopoints or so, in individual rows with floats for my x and y
values.  I'm calculating a box that contains all of my points and am using

        I was wondering, however, if there's a way I can use an index to
avoid table scanning for this.  The relevant parts of my sample table look
like this:

create table tmp (
        id integer,
        name text,
        b box
)

and I added the following index:

create index tmp_bybox on tmp using rtree(b);

I've got 33,507 rows currently (still loading data).

Here are the problems I'm having:


Sort  (cost=2428.02..2428.02 rows=16754 width=44)
  ->  Seq Scan on tmp  (cost=0.00..969.84 rows=16754 width=44)

        Any ideas that might help me speed things up?

--
SPY                      My girlfriend asked me which one I like better.

|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

 
 
 

index on a box

Post by Tom La » Sat, 23 Jun 2001 00:41:37



>    I was wondering, however, if there's a way I can use an index to
> avoid table scanning for this.

If you say "SET enable_seqscan TO off", and repeat the EXPLAIN, do you
get an indexscan plan?

I don't recommend doing such a SET for production purposes, but if this
works then the problem is just inaccurate selectivity/cost estimation.
I see that the on_pb operator has no selectivity estimator defined at
all :-( ... as a quick hack, try setting its oprrest and oprjoin to be
areasel and areajoinsel.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

 
 
 

index on a box

Post by Dustin Sallin » Sat, 23 Jun 2001 06:04:02


Around 11:08 on Jun 21, 2001, Tom Lane said:

# If you say "SET enable_seqscan TO off", and repeat the EXPLAIN, do you
# get an indexscan plan?

Seq Scan on tmp  (cost=100000000.00..100002500.84 rows=30434 width=92)

# I don't recommend doing such a SET for production purposes, but if
# this works then the problem is just inaccurate selectivity/cost
# estimation. I see that the on_pb operator has no selectivity estimator
# defined at all :-( ... as a quick hack, try setting its oprrest and
# oprjoin to be areasel and areajoinsel.

        I'm not sure what you just said.  :)

--
SPY                      My girlfriend asked me which one I like better.

|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command

 
 
 

index on a box

Post by Tom La » Sat, 23 Jun 2001 06:29:23



> Around 11:08 on Jun 21, 2001, Tom Lane said:
> # If you say "SET enable_seqscan TO off", and repeat the EXPLAIN, do you
> # get an indexscan plan?
> Seq Scan on tmp  (cost=100000000.00..100002500.84 rows=30434 width=92)

Drat.

Quote:>    I'm not sure what you just said.  :)

Never mind, it wouldn't work anyway.



box with all corners at the point of interest.  Crufty, but unless you
want to go in and teach rtree about a new operator type...

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

 
 
 

index on a box

Post by Dustin Sallin » Sat, 23 Jun 2001 07:18:34


Around 17:21 on Jun 21, 2001, Tom Lane said:

# Never mind, it wouldn't work anyway.
#


# zero-area box with all corners at the point of interest.  Crufty, but
# unless you want to go in and teach rtree about a new operator type...


Index Scan using tmp_bybox on tmp  (cost=0.00..238.59 rows=61 width=92)



this one particular task I'm doing right now.  :)  It seems that rtree
already knows how to do what I'm trying to do if all I've got to do is
make a box containing the point twice to get the lookup to be fast.

        If anyone's interested in what I'm doing with this, you can see it
in action here:

http://bleu.west.spy.net/~dustin/geo/pointinfoform.jsp

        I've loaded about 60k polygons (consisting of a total of about
seven million points) describing the shape of various geographical areas
in the United States.  I've got a table with the descriptions of the
polygons and box boundaries, then another table with the actual ordered
polygon data.  I do a box match on the first table to get a list of
candidates, then examine them all in more detail with a point-in-polygon
algorithm in my application.  It's currently pretty slow because I'm don't
actually have a box column on my first table, just the boundaries, which I

looks like, when the index works, it'll be as fast as it was when I had
very little data in the tables again.  :)

        Thanks for the help!

--
SPY                      My girlfriend asked me which one I like better.

|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

index on a box

Post by Tom La » Sat, 23 Jun 2001 07:50:28



> create function box(point) returns box as
>     'select box($1, $1)'
>     language 'sql';

> NOTICE:  QUERY PLAN:
> Seq Scan on tmp  (cost=0.00..2653.01 rows=61 width=92)

You'd better declare the function as 'iscachable'.  As is, the planner
doesn't trust it to return a constant.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

 
 
 

index on a box

Post by Dustin Sallin » Sat, 23 Jun 2001 08:03:10


Around 15:14 on Jun 21, 2001, Dustin Sallings said:


#
# Index Scan using tmp_bybox on tmp  (cost=0.00..238.59 rows=61 width=92)

        Ugh, any idea here?

create function box(point) returns box as
    'select box($1, $1)'
    language 'sql';


NOTICE:  QUERY PLAN:

Seq Scan on tmp  (cost=0.00..2653.01 rows=61 width=92)

misc=# explain select * from tmp

NOTICE:  QUERY PLAN:

Index Scan using tmp_bybox on tmp  (cost=0.00..238.59 rows=61 width=92)

misc=# select box(point(-121, 37));
         box
---------------------
 (-121,37),(-121,37)

misc=# select box(point(-121,37),point(-121,37));
         box
---------------------
 (-121,37),(-121,37)

misc=# select 1 where box(point(-121,37),point(-121,37))=box(point(-121,
37));
 ?column?
----------
        1

--
SPY                      My girlfriend asked me which one I like better.

|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate

message can get through to the mailing list cleanly

 
 
 

index on a box

Post by Dustin Sallin » Sat, 23 Jun 2001 08:14:17


Around 18:26 on Jun 21, 2001, Tom Lane said:

# You'd better declare the function as 'iscachable'.  As is, the planner
# doesn't trust it to return a constant.

        Got it!  Tried a few variations on the create until I got this:

create function box(point) returns box as
    'select box($1, $1)'
    language 'sql'
    with (iscachable);


NOTICE:  QUERY PLAN:

Index Scan using tmp_bybox on tmp  (cost=0.00..238.59 rows=61 width=92)

        Thanks!  :)

        (oh, and would it be reasonable to list at least some of the
possible attributes in psql?)

--
SPY                      My girlfriend asked me which one I like better.

|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

index on a box

Post by Thalis A. Kalfigopoulo » Thu, 28 Jun 2001 03:25:23



> Tom Lane?
>    Does anyone who knows a user's information is storeed in what pgSQL's system table?  Thank all!

pg_user
(rather obvious i guess)

t.

Quote:

> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?

> http://www.postgresql.org/users-lounge/docs/faq.html

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate

message can get through to the mailing list cleanly
 
 
 

1. Indexing for Combo Boxes

A couple of questions about indexes with a SQL Server back end and an Access
mdb front end.

1) If a combo box has two columns, one for the primary key of the list
table, and another for another field, such as Description, is a two-field
index needed, or is an index on the primary key enough?

2) Similarly, if a combo box has three columns, one for primary key, along
with two additional ones, and the primary key column is hidden, resulting in
the user entering a value based on the second column, what sort of index
would be useful -- an index based on the second column alone, second and
third, other?

Thanks!

Neil

2. IP Address and SQL Server

3. index on a box

4. use of Ececute commande with PSQL

5. Problem with Indexed Table in a List Box

6. Patches (Solaris 2.5.1) required for Sybase 10.0.2

7. Combo box Index problem

8. Column restriction

9. Getting Index from a List View check box

10. List Index is out of Bounds - Combo box problems

11. COMBO BOX INDEX PROPERTY

12. Move DB from one box to other box

13. Copy table from SQL box to other SQL box