Update statistics from "generation program" question

Update statistics from "generation program" question

Post by Doug McAlliste » Sun, 31 Dec 1899 09:00:00



Hi all..

Given the following table definition:

create table acct_nbr_srch_keys
  (
    t_acct_num char(10),
    fund_num char(3),
    acct_num char(13),
    doc_id integer not null ,
    fldr_id integer not null
  );

create unique index acctnbr_s_keys_pk on acct_nbr_srch_keys
(acct_num,fund_num,t_acct_num,doc_id);

create index acctnbr_s_keys_ix1 on acct_nbr_srch_keys
(t_acct_num,acct_num,fund_num,fldr_id,doc_id);
create index acctnbr_sk_doc_fk on acct_nbr_srch_keys (doc_id);
create index acctnbr_s_keys_ix2 on acct_nbr_srch_keys (fldr_id);

I am curious to understand why the update statistics statements are
generated as follows:

UPDATE STATISTICS MEDIUM FOR TABLE acct_nbr_srch_keys  DISTRIBUTIONS
ONLY;
UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (t_acct_num)
DISTRIBUTIONS ONLY;
UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
acct_num, fund_num, fldr_id, doc_id);
UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (acct_num)
DISTRIBUTIONS ONLY;
UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num, fund_num,
t_acct_num, doc_id);
UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (doc_id);
UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (fldr_id);

Specifically, why are there multiple updates for the same columns (look
at the two LOW statements).
I understand the HIGH's.

This table contains 31,000,000 rows and this update statistics job takes
15 hours to complete on a six-way Sun machine. I would like to shave any
time off the execution that I can.

thanks as always.....

  doug.mcallister.vcf
< 1K Download
 
 
 

Update statistics from "generation program" question

Post by mars1.. » Sun, 31 Dec 1899 09:00:00


What is generating the update statistics lines?  And what version of
Informix are you using?



> This is a multi-part message in MIME format.
> --------------AD06AC18049333F604EE0DF3
> Content-Type: text/plain; charset=us-ascii
> Content-Transfer-Encoding: 7bit

> Hi all..

> Given the following table definition:

> create table acct_nbr_srch_keys
> (
> t_acct_num char(10),
> fund_num char(3),
> acct_num char(13),
> doc_id integer not null ,
> fldr_id integer not null
> );

> create unique index acctnbr_s_keys_pk on acct_nbr_srch_keys
> (acct_num,fund_num,t_acct_num,doc_id);

> create index acctnbr_s_keys_ix1 on acct_nbr_srch_keys
> (t_acct_num,acct_num,fund_num,fldr_id,doc_id);
> create index acctnbr_sk_doc_fk on acct_nbr_srch_keys (doc_id);
> create index acctnbr_s_keys_ix2 on acct_nbr_srch_keys (fldr_id);

> I am curious to understand why the update statistics statements are
> generated as follows:

> UPDATE STATISTICS MEDIUM FOR TABLE acct_nbr_srch_keys DISTRIBUTIONS
> ONLY;
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (t_acct_num)
> DISTRIBUTIONS ONLY;
> UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
> acct_num, fund_num, fldr_id, doc_id);
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (acct_num)
> DISTRIBUTIONS ONLY;
> UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num,
fund_num,
> t_acct_num, doc_id);
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (doc_id);
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (fldr_id);

> Specifically, why are there multiple updates for the same columns
(look
> at the two LOW statements).
> I understand the HIGH's.

> This table contains 31,000,000 rows and this update statistics job
takes
> 15 hours to complete on a six-way Sun machine. I would like to shave
any
> time off the execution that I can.

> thanks as always.....

> --------------AD06AC18049333F604EE0DF3
> Content-Type: text/x-vcard; charset=us-ascii;
> name="doug.mcallister.vcf"
> Content-Transfer-Encoding: 7bit
> Content-Description: Card for Doug McAllister
> Content-Disposition: attachment;
> filename="doug.mcallister.vcf"

> begin:vcard
> n:McAllister;Doug
> tel;work:(603) 791-5688
> x-mozilla-html:TRUE
> url:www.fidelity.com
> org:Fidelity Investments
> version:2.1

> title:Consulting DBA
> adr;quoted-printable:;;2 Contra Way=0D=0AT1Q;Merrimack;NH;03054;USA
> fn:Doug McAllister
> end:vcard

> --------------AD06AC18049333F604EE0DF3--

--
# unrm /
ksh: unrm: not found
# man cpio

Sent via Deja.com http://www.deja.com/
Before you buy.

 
 
 

Update statistics from "generation program" question

Post by jim lowr » Sun, 31 Dec 1899 09:00:00


First off you will find that
using HIGH will give you VERY
LITTLE...  let me repeat...
"very little" over update med.

You may want to remove the low's
and do med instead of high, and
run high with distributions only
for all columns that head an index
or not at all.

In addition, you may even want to
alter the sampling rate by adjusting
the confidence and resolution
parameters.

Good luck.




>Subject: Update statistics from "generation program" question
>Date: Thu, 30 Mar 2000 09:59:12 -0500

>This is a multi-part message in MIME format.
>--------------AD06AC18049333F604EE0DF3
>Content-Type: text/plain; charset=us-ascii
>Content-Transfer-Encoding: 7bit

>Hi all..

>Given the following table definition:

>create table acct_nbr_srch_keys
>   (
>     t_acct_num char(10),
>     fund_num char(3),
>     acct_num char(13),
>     doc_id integer not null ,
>     fldr_id integer not null
>   );

>create unique index acctnbr_s_keys_pk on acct_nbr_srch_keys
>(acct_num,fund_num,t_acct_num,doc_id);

>create index acctnbr_s_keys_ix1 on acct_nbr_srch_keys
>(t_acct_num,acct_num,fund_num,fldr_id,doc_id);
>create index acctnbr_sk_doc_fk on acct_nbr_srch_keys (doc_id);
>create index acctnbr_s_keys_ix2 on acct_nbr_srch_keys (fldr_id);

>I am curious to understand why the update statistics statements are
>generated as follows:

>UPDATE STATISTICS MEDIUM FOR TABLE acct_nbr_srch_keys  DISTRIBUTIONS
>ONLY;
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (t_acct_num)
>DISTRIBUTIONS ONLY;
>UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
>acct_num, fund_num, fldr_id, doc_id);
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (acct_num)
>DISTRIBUTIONS ONLY;
>UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num, fund_num,
>t_acct_num, doc_id);
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (doc_id);
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (fldr_id);

>Specifically, why are there multiple updates for the same columns (look
>at the two LOW statements).
>I understand the HIGH's.

>This table contains 31,000,000 rows and this update statistics job takes
>15 hours to complete on a six-way Sun machine. I would like to shave any
>time off the execution that I can.

>thanks as always.....

>--------------AD06AC18049333F604EE0DF3
>Content-Type: text/x-vcard; charset=us-ascii;
>  name="doug.mcallister.vcf"
>Content-Transfer-Encoding: 7bit
>Content-Description: Card for Doug McAllister
>Content-Disposition: attachment;
>  filename="doug.mcallister.vcf"

>begin:vcard
>n:McAllister;Doug
>tel;work:(603) 791-5688
>x-mozilla-html:TRUE
>url:www.fidelity.com
>org:Fidelity Investments
>version:2.1

>title:Consulting DBA
>adr;quoted-printable:;;2 Contra Way=0D=0AT1Q;Merrimack;NH;03054;USA
>fn:Doug McAllister
>end:vcard

>--------------AD06AC18049333F604EE0DF3--

______________________________________________________
Get Your Private, Free Email at http://www.hotmail.com
 
 
 

Update statistics from "generation program" question

Post by Stefan Weidenede » Sun, 31 Dec 1899 09:00:00


Hi all,

isn't this schema of update statistics the recommandation
for SAP/R3 tables, where we have to perform an UPDATE
STATISTICS HIGH on the first indexed column and an
UPDATE STATISTICS MEDIUM for every other indexed column.

I have another question. What system tables will be
updated if I run the following part of your output ?

UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num, fund_num,
t_acct_num, doc_id);
UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
acct_num, fund_num, fldr_id, doc_id);

Best regards,

--
Stefan Weideneder

Phone: +49 89/3565478-2 ---------------
--- Fax: +49 89/3565478-3 -------------

-------- http://www.weideneder.de -----

 
 
 

Update statistics from "generation program" question

Post by Obnoxio The Clow » Sun, 31 Dec 1899 09:00:00



Quote:

>Given the following table definition:

>create table acct_nbr_srch_keys
>   (
>     t_acct_num char(10),
>     fund_num char(3),
>     acct_num char(13),
>     doc_id integer not null ,
>     fldr_id integer not null
>   );

>create unique index acctnbr_s_keys_pk on acct_nbr_srch_keys
>(acct_num,fund_num,t_acct_num,doc_id);

>create index acctnbr_s_keys_ix1 on acct_nbr_srch_keys
>(t_acct_num,acct_num,fund_num,fldr_id,doc_id);
>create index acctnbr_sk_doc_fk on acct_nbr_srch_keys (doc_id);
>create index acctnbr_s_keys_ix2 on acct_nbr_srch_keys (fldr_id);

>I am curious to understand why the update statistics statements are
>generated as follows:

>UPDATE STATISTICS MEDIUM FOR TABLE acct_nbr_srch_keys  DISTRIBUTIONS
>ONLY;
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (t_acct_num)
>DISTRIBUTIONS ONLY;
>UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
>acct_num, fund_num, fldr_id, doc_id);
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (acct_num)
>DISTRIBUTIONS ONLY;
>UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num, fund_num,
>t_acct_num, doc_id);
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (doc_id);
>UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (fldr_id);

>Specifically, why are there multiple updates for the same columns (look
>at the two LOW statements).
>I understand the HIGH's.

There's an interestling little buglet in UPDATE STATISTICS HIGH. You would
think it does an implicit UPDATE STATISTICS LOW, wouldn't you? Well, it
doesn't under certain circumstances (I think it's if you're US'ing a single
column, but I don't really remember).

Quote:>This table contains 31,000,000 rows and this update statistics job takes
>15 hours to complete on a six-way Sun machine. I would like to shave any
>time off the execution that I can.

Maybe if you did a multi-column US instead, you might not need the extra US
LOW. But I'm not sure.
______________________________________________________
Get Your Private, Free Email at http://www.hotmail.com
 
 
 

Update statistics from "generation program" question

Post by Doug McAlliste » Sun, 31 Dec 1899 09:00:00


These are generated by Art K's dostats program on a 7.30.uc3 system


> What is generating the update statistics lines?  And what version of
> Informix are you using?



> > This is a multi-part message in MIME format.
> > --------------AD06AC18049333F604EE0DF3
> > Content-Type: text/plain; charset=us-ascii
> > Content-Transfer-Encoding: 7bit

> > Hi all..

> > Given the following table definition:

> > create table acct_nbr_srch_keys
> > (
> > t_acct_num char(10),
> > fund_num char(3),
> > acct_num char(13),
> > doc_id integer not null ,
> > fldr_id integer not null
> > );

> > create unique index acctnbr_s_keys_pk on acct_nbr_srch_keys
> > (acct_num,fund_num,t_acct_num,doc_id);

> > create index acctnbr_s_keys_ix1 on acct_nbr_srch_keys
> > (t_acct_num,acct_num,fund_num,fldr_id,doc_id);
> > create index acctnbr_sk_doc_fk on acct_nbr_srch_keys (doc_id);
> > create index acctnbr_s_keys_ix2 on acct_nbr_srch_keys (fldr_id);

> > I am curious to understand why the update statistics statements are
> > generated as follows:

> > UPDATE STATISTICS MEDIUM FOR TABLE acct_nbr_srch_keys DISTRIBUTIONS
> > ONLY;
> > UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (t_acct_num)
> > DISTRIBUTIONS ONLY;
> > UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
> > acct_num, fund_num, fldr_id, doc_id);
> > UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (acct_num)
> > DISTRIBUTIONS ONLY;
> > UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num,
> fund_num,
> > t_acct_num, doc_id);
> > UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (doc_id);
> > UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (fldr_id);

> > Specifically, why are there multiple updates for the same columns
> (look
> > at the two LOW statements).
> > I understand the HIGH's.

> > This table contains 31,000,000 rows and this update statistics job
> takes
> > 15 hours to complete on a six-way Sun machine. I would like to shave
> any
> > time off the execution that I can.

> > thanks as always.....

> > --------------AD06AC18049333F604EE0DF3
> > Content-Type: text/x-vcard; charset=us-ascii;
> > name="doug.mcallister.vcf"
> > Content-Transfer-Encoding: 7bit
> > Content-Description: Card for Doug McAllister
> > Content-Disposition: attachment;
> > filename="doug.mcallister.vcf"

> > begin:vcard
> > n:McAllister;Doug
> > tel;work:(603) 791-5688
> > x-mozilla-html:TRUE
> > url:www.fidelity.com
> > org:Fidelity Investments
> > version:2.1

> > title:Consulting DBA
> > adr;quoted-printable:;;2 Contra Way=0D=0AT1Q;Merrimack;NH;03054;USA
> > fn:Doug McAllister
> > end:vcard

> > --------------AD06AC18049333F604EE0DF3--

> --
> # unrm /
> ksh: unrm: not found
> # man cpio

> Sent via Deja.com http://www.deja.com/
> Before you buy.

  doug.mcallister.vcf
< 1K Download
 
 
 

Update statistics from "generation program" question

Post by PC Use » Sun, 31 Dec 1899 09:00:00



> Hi all,

> isn't this schema of update statistics the recommandation
> for SAP/R3 tables, where we have to perform an UPDATE
> STATISTICS HIGH on the first indexed column and an
> UPDATE STATISTICS MEDIUM for every other indexed column.

> I have another question. What system tables will be
> updated if I run the following part of your output ?

> UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num, fund_num,
> t_acct_num, doc_id);
> UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
> acct_num, fund_num, fldr_id, doc_id);

UPDATE STATISTICS LOW updates the row and page counts in systables, the
levels, leaves, and nunique columns  in sysindexes, and the colmin and
colmax figures
for the named columns in syscolumns.

Art S. Kagel

 
 
 

Update statistics from "generation program" question

Post by PC Use » Sun, 31 Dec 1899 09:00:00



> Hi all..

> Given the following table definition:

> create table acct_nbr_srch_keys
>   (
>     t_acct_num char(10),
>     fund_num char(3),
>     acct_num char(13),
>     doc_id integer not null ,
>     fldr_id integer not null
>   );

> create unique index acctnbr_s_keys_pk on acct_nbr_srch_keys
> (acct_num,fund_num,t_acct_num,doc_id);

> create index acctnbr_s_keys_ix1 on acct_nbr_srch_keys
> (t_acct_num,acct_num,fund_num,fldr_id,doc_id);
> create index acctnbr_sk_doc_fk on acct_nbr_srch_keys (doc_id);
> create index acctnbr_s_keys_ix2 on acct_nbr_srch_keys (fldr_id);

> I am curious to understand why the update statistics statements are
> generated as follows:

> UPDATE STATISTICS MEDIUM FOR TABLE acct_nbr_srch_keys  DISTRIBUTIONS ONLY;

> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (t_acct_num)
> DISTRIBUTIONS ONLY;
> UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (t_acct_num,
> acct_num, fund_num, fldr_id, doc_id);
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (acct_num)
> DISTRIBUTIONS ONLY;
> UPDATE STATISTICS LOW FOR TABLE acct_nbr_srch_keys (acct_num, fund_num,
> t_acct_num, doc_id);
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (doc_id);
> UPDATE STATISTICS HIGH FOR TABLE acct_nbr_srch_keys (fldr_id);

Because dostats implements the recommendations for best efficient stats
generation
as described in the 7.21 release notes which are also now documented in the
Performance Guide manual and these are the recommended statements.

Quote:

> Specifically, why are there multiple updates for the same columns (look
> at the two LOW statements).

The recommendation requires that a LOW be executed for the ENTIRE KEY for
each index.  This is in order to update the levels, leaves, and nunique
columns in
sysindexes.  The only redundancy is the actual update of the syscolumns
records for
the columns repeated in multiple indexes.  The sorting and counting are the
same
regardless of whether the column were named but by leaving it out the
sysindexes
record could not be updated.

Quote:> I understand the HIGH's.

> This table contains 31,000,000 rows and this update statistics job takes
> 15 hours to complete on a six-way Sun machine. I would like to shave any
> time off the execution that I can.

Recommendations to speed dostats operation:

1) PDQPRIORITY=100 (at least 40 anyway of there is much other activity).
2) PSORT_NPROCS=40 (Yes I know the manual says 10 is the maximum effective
   value, try it, I did.)
3) PSORT_DBTEMP=[list of AT LEAST 3 and prefereably 4 different OS
   filesystems preferable on different disk farms and controllers than the
database)

UPDATE STATISTICS is a real dog without the PSORT parameters set properly.

Run multiple copies of dostats one for each table.  You should be able to
productively run many copies.  I have run as many as 100 on a 28 CPU VP
server
running on a 32-way M88100 (50MHZ) system with no noticeable slowdown in
server performance.

Art S. Kagel

 
 
 

Update statistics from "generation program" question

Post by PC Use » Sun, 31 Dec 1899 09:00:00


[SNIP]

Quote:> There's an interestling little buglet in UPDATE STATISTICS HIGH. You would
> think it does an implicit UPDATE STATISTICS LOW, wouldn't you? Well, it
> doesn't under certain circumstances (I think it's if you're US'ing a single
> column, but I don't really remember).

If the column list is a superset of an index then that index's sysindex record
will be
updated and all of the listed columns will have their syscolumns stats (colmax
& colmin) updated.

Quote:

> >This table contains 31,000,000 rows and this update statistics job takes
> >15 hours to complete on a six-way Sun machine. I would like to shave any
> >time off the execution that I can.

> Maybe if you did a multi-column US instead, you might not need the extra US
> LOW. But I'm not sure.

This is the reason that for a single column index, if the columns has not
already been
updated LOW as part of some other index, dostats does ONLY the HIGH leaving
off the DISTRIBUTIONS ONLY clause since the HIGH will properly update the
LOW stats.  The DISTRIBUTIONS ONLY clause is included otherwise to avoid
performing all of the LOW counting that cannot be used to update sysindexes
anyway.

Art S. Kagel

 
 
 

1. SQL-DMO Bulkcopy AND "Update Statistics"

Help!

Our VB app uses the SQLDMO.Bulkcopy object on a fairly constant basis
(2-20 times an hour) on a certain table.

Since we recently upgraded to SQL 7.0, we are getting DEADLOCKS
because at the end of the SQLDMO.Bulkcopy, SQL automatically does an
"UPDATE STATISTICS" on the table.

 We have turned off "AUTO CREATE STATS" and AUTO UPDATE STATS" for
that database. We continually try to keep "sp_autostats" turned off
for that table and its one index.  I realize, after reading MSDN, that
DEADLOCKS *will* occur during UPDATE STATISTICS. I just want someone
to tell me that "Yes, even though you are not *specifying* an UPDATE
STATISTICS after the SQLDMO.Bulkcopy, SQL is automatically doing one
for you. And, as a side note, in the SQL 7 manual, it states that an
UPDATE STATISTICS can only be run by the OWNER of the table, yet the
id I am using in the code to do the SQLDMO.Bulkcopy is *NOT* the table
owner. DBO is the owner and this userid is only in the "public" role.
Does that give it enough rights to do the UPDATE STATISTICS? Because
it sure is able to do it.

Is there any other way to turn off the UPDATE STATISTICS and stop my
users from getting deadlocks?

Thanks for your time,
Ed R.

2. Obscure error --- scratching my head

3. Auto-updating of "Named Statistics"

4. Requery method for VB3.0 Dynasets ???

5. "UPDATE STATISTICS" command

6. Oracle Sertification

7. Update: "Abnormal Program termination" - FYI

8. P.SQL 7 SP6 - When?

9. max of ("...","...","..")

10. Two programming questions: Cursors and "Records"

11. Need "Report Generation" Utility

12. Triggers, "IF UPDATE" question

13. DTS: "update" question