Migrating from Dynamic Server 7.2 to IDS 9 (2000)

Post by William Ric » Tue, 18 Sep 2001 21:49:56



<SNIP>

Quote:>>***Jack
>> >You don't include an onstat -d - but from the -D I can see some layout
>> >issues.  Your dbspaces are deep - probably designed around tables?  They
>> >are not balanced to the number of CPUs.  A table fragmented over eight or
>> >twelve dbspaces might perform better than one built on a single seven-chunk
>> >dbspace.  I'm not really sure what your schema is or your application - so
>> >YMMV.  However, you can see that the load on the disks is not even.  Your
>> >disks are getting hammered.  Your number of temp spaces should be a multiple
>> >of the number of CPUs.  I would pull the whole thing apart and rebuild it
>> >just to get the disks to group on the onstat -d output.

>>***Dirk
>> I am starting to see disk bottlenecks yes. The previous weekend I
>> unloaded and fragmented one of our bigger tables that gets accessed
>> a lot.
>> Fragmented by round robin and  the disks seem much better this week.
>> (This specific table  gets a lot of reads and not so many writes).

>> Could you perhaps elaborate a bit more on the issue of balancing dbspaces to
>> the number of CPUs?  Are you just referring to fragmentation of tables so
>> that multithreading can take place?

>>***Jack
>>If you have a table fragmented over 4 dbspaces and 4 CPUs, you will get 4
>>scan threads.  Odds are that they will run together and complete in about
>>the same amount of time.  If you have 5 dbspaces, then you will get 5 scan
>>threads - one of those will have to bounce around from CPU to CPU.  8
>>threads and you would get two per CPU - no bouncing.  Mind you, this is best
>>when your database has to do scans of all fragments.

>>Fragmented indices tend to also be shallower - hence not as many reads when
>>looking for data.  If the index is big........

>>Round Robin is a lovely thing when it comes time to load, but it means no
>>fragment elimination when it comes time to read.  Read on.

>***Dirk
>This makes perfect sense, thank you.  The only thing that I still wonder about
>is the round robin fragmentation.  I would guess from what you are saying
>that round robin fragmentation doesn't help reads, because the records are
>not contiguous - i.e. if I have to do a sequential scan on a range of
>records, the time to access them, because they are now scattered across
>fragments, will actually slow me down?

***Will
Actually, just the opposite.  If you do a sequential scan, it will have one
scan thread for each fragment, reading your data in parallel (it is sequential
for each fragment; it doesn't have to search one fragment and then another
...)  The only case I can think of where things might behave differently is if
you had a table clustered by an index (I have never fragmented a table and
clustered it by an index, so I don't know if it's possible).

For random access, if your index is fragmented with your table(round robin)
index lookups will incur more page reads because the database will have to
look at every fragment of the fragmented index to look up one record...

Unless you detach your index and put it in one dbspace, but that brings up
index contention for updates, inserts, and deletes...

Fragmentation does have a lot of benefits, but in some cases it has penalties
too.
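For contrast, here is roughly what the two index placements above look like in DDL.  This is a sketch only - the table, column, and dbspace names are all made up for illustration:

```sql
-- Hypothetical names throughout; a sketch, not a tuned layout.
CREATE TABLE scans (
    scan_id   SERIAL,
    barcode   CHAR(20)
) FRAGMENT BY ROUND ROBIN IN dbs1, dbs2, dbs3, dbs4;

-- Attached (the default): the index inherits the table's round-robin
-- fragmentation, so a single-key lookup may probe every fragment.
CREATE INDEX ix_scans_attached ON scans (barcode);

-- Detached alternative: one B-tree in its own dbspace - one probe
-- per lookup, but all index I/O (and the update/insert/delete
-- contention Will mentions) lands on dbs_idx.  You would pick one
-- placement or the other, not both.
-- CREATE INDEX ix_scans_detached ON scans (barcode) IN dbs_idx;
```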

<SNIP>

Hope this helps,
Will

+++
"I am not laughing at you mom, I am laughing with Will" -Celena Cox
+++

Migrating from Dynamic Server 7.2 to IDS 9 (2000)

Post by William Ric » Wed, 19 Sep 2001 00:55:33



>> Unless you detach your index and put it in one dbspace, but that brings up
>> index contention for updates inserts, and deletes...

>Beware the 32GB/dbspace index limitation.

>cheers
>j.

I guess that could be a pretty big issue on some tables ...

Migrating from Dynamic Server 7.2 to IDS 9 (2000)

Post by Jack Parke » Tue, 18 Sep 2001 21:51:23


Might consider taking this offline - nobody else seems to be jumping all
over it.

Some minor deletia of concluded discussion.

Quote:
<SNIP - quoted text as above>

> ***Dirk
> This makes perfect sense, thank you.  The only thing that I still wonder
> about is the round robin fragmentation.  I would guess from what you are
> saying that round robin fragmentation doesn't help reads, because the
> records are not contiguous - i.e. if I have to do a sequential scan on a
> range of records, the time to access them, because they are now scattered
> across fragments, will actually slow me down?

Round robin means that as each row is inserted it goes into the next
dbspace.  If you have four dbspaces, then row one goes to dbspace 1, 2->2,
3->3, 4->4, 5->1, 6->2, .........  Easiest fragmentation scheme for the engine
to figure out while loading.  However, there is no real intelligence to such
a fragmentation scheme.  If you have fragmented by customer name (for
example): A-F->dbspace1, G-M->dbspace2, N-Q->dbspace3....  Then when the user
does a query by last name the optimizer is smart enough not to bother with
fragments where the data can't be - it goes straight to the proper fragment.
Kind of a moot point, EXCEPT that if you have attached indices, they use the
same fragmentation scheme as the data.  In fact XPS will complain if you try
to build an index on a round robin table without using a fragmentation
scheme.  Does it really matter if you are building a detached index?
Probably not, unless you are going to query by something that is not in an
index and might have been in your fragmentation scheme.
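The expression-based scheme Jack describes can be sketched like this - all names are hypothetical, and the ranges would of course be tuned to the real data distribution:

```sql
-- Sketch of fragmentation by expression, which lets the optimizer
-- eliminate fragments.  Table/dbspace names are made up.
CREATE TABLE customer (
    cust_id   SERIAL,
    last_name CHAR(30)
) FRAGMENT BY EXPRESSION
    last_name <  'G'                     IN dbspace1,
    last_name >= 'G' AND last_name < 'N' IN dbspace2,
    last_name >= 'N'                     IN dbspace3;

-- A query like this only needs to touch dbspace2's fragment:
SELECT * FROM customer WHERE last_name = 'Moolman';
```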

With XPS there is another side to this.  You have multiple nodes (coservers)
across which the database is laid.  If you fragment two tables across these
coservers using a hash on the same column (fragment by hash(column)) AND you
join these two tables on that column (and you've laid out your disks
properly) you get a co-located join - each coserver does its own join and
then sends the result set to the requesting coserver.  Also, the gating
factor when loading data is not generally your fragmentation scheme (ok -
half-truth there, you can slow things down with a * scheme) but disk
i/o.  Hence I am not a big fan of Round Robin.
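A minimal sketch of the co-located join setup, assuming hypothetical table names and an existing dbslice (XPS syntax from memory - check your version's documentation):

```sql
-- Fragment both tables by hash on the join column so each coserver
-- can join its own slice locally.
CREATE TABLE orders (
    cust_id INTEGER,
    amount  MONEY
) FRAGMENT BY HASH (cust_id) IN dbslice_a;

CREATE TABLE customer (
    cust_id INTEGER,
    name    CHAR(30)
) FRAGMENT BY HASH (cust_id) IN dbslice_a;

-- Joining on the hash column gives a co-located join - no data
-- shipping between coservers beyond the final result set.
SELECT c.name, SUM(o.amount)
  FROM customer c, orders o
 WHERE c.cust_id = o.cust_id
 GROUP BY c.name;
```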


Quote:
> >***Jack
> > >4.5 million compresses?  Someone is doing a lot of deletes to cause this.
> > >Find them, blame them, make them pay.  Look into why they are deleting -
> > >perhaps truncates, or table rewrite would be more effective.

> > In our environment we do a lot of deletes, yes.  Our data gets old quickly
> > and then gets moved to "history" tables.  The data is still online and
> > reports / queries still take place on the older data, especially the
> > very recent records.
> > By the way, what exactly are compresses - is this index rebuilds?

> >When you 'delete' a row that space is not automatically re-available.  There
> >is a 'next space' pointer in the page header which always points to the end
> >of the data - new data is always placed at the 'end' of the page.  Over time
> >the engine will recognize that a page is getting 'holey' and will re-write
> >the page, freeing up space at the end - this is a compress.  Note that there
> >is stuff at the end of the physical page - but let's not go there now -
> >hence the tic marks around the word 'end'.

> >In a delete situation like this consider fragmenting by the date of the data
> >(provided it is not updated - that would be a disaster).  Then when you need
> >to delete you can detach the old data fragment and re-attach it as a new
> >fragment for the new data.  This may imply a lot of fragments - depending on
> >how low a granularity you need and how long you keep the data.  It may also
> >mean unbalancing your fragments against the number of CPUs.  But it's way
> >fast.  I've done 33 fragments on a 16-CPU box for this sort of thing (40GB of
> >data).  The table screamed.
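The detach/re-attach cycle Jack describes looks roughly like this.  Names and dates are hypothetical; a sketch, not a production script:

```sql
-- Rolling window by month via expression fragmentation, so old data
-- is retired by detaching a fragment instead of row-by-row deletes.
CREATE TABLE scan_history (
    barcode   CHAR(20),
    scan_date DATE
) FRAGMENT BY EXPRESSION
    scan_date <  '07/01/2001'                              IN dbs_jun,
    scan_date >= '07/01/2001' AND scan_date < '08/01/2001' IN dbs_jul,
    scan_date >= '08/01/2001' AND scan_date < '09/01/2001' IN dbs_aug;

-- Retire June in one fast operation (it becomes its own table) ...
ALTER FRAGMENT ON TABLE scan_history DETACH dbs_jun june_archive;

-- ... and attach a fresh fragment for September's data.
ALTER FRAGMENT ON TABLE scan_history
    ADD scan_date >= '09/01/2001' AND scan_date < '10/01/2001' IN dbs_sep;
```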

> ***Dirk
> I will have to take another look at fragmentation.  I just don't know if it
> will help with our compression rate.  It is highly likely that the programs
> do deletes on individual records as statuses change, and that I will not
> have much control over this.

Do the rows have to be actually deleted?  Can you build a table of rows to
be deleted and then remove them in batch every evening?  At that point you
could do a table rewrite.  (What the h*ll is he suggesting???  Not do
deletes realtime?  (yes))
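One way the batch-delete idea might look in SQL - table and column names are invented for the sketch:

```sql
-- Mark rows for deletion during the day, then purge in one batch
-- off-hours instead of deleting in realtime.
CREATE TABLE pending_deletes (scan_id INTEGER);

-- Daytime: applications insert keys here instead of deleting.
INSERT INTO pending_deletes VALUES (12345);

-- Evening batch job:
DELETE FROM scans
 WHERE scan_id IN (SELECT scan_id FROM pending_deletes);
DELETE FROM pending_deletes;
```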

Quote:> > >Are you really supporting 2600 connections?

(because I want to remember to look that up)

cheers
j.

Migrating from Dynamic Server 7.2 to IDS 9 (2000)

Post by Jack Parke » Wed, 19 Sep 2001 00:16:47


Quote:

> Unless you detach your index and put it in one dbspace, but that brings up
> index contention for updates inserts, and deletes...

Beware the 32GB/dbspace index limitation.

cheers
j.

Migrating from Dynamic Server 7.2 to IDS 9 (2000)

Post by Dirk Moolma » Tue, 18 Sep 2001 18:32:15


>***Jack
> >You don't indicate how long this box has been up.  How long did it take it
> >to get to 12 billion isam instructions?  That would be kind of interesting
> >to know.  1200 checkpoint waits - so 1000 hours?  Or are you getting
> >checkpoints more frequently than CKPTINTVL - my guess is yes, so maybe 2-300
> >hours?
>***Dirk
> Yes, sorry, the system was up for approximately 4 days by the time I sent
> the mail.
<SNIP - quoted exchange as above>


>***Jack

> >Nice write cache hit rate.  Read cache hit - ok.

<SNIP - compress discussion quoted above>

>***Jack

> >2.7 million seq scans?  Do you have a lot of small tables without indices?
> >Or without stats updated?  If sequential scans are desirable then you might
> >want to look into light scans and bumping up your DS parameters to support
> >those.

> Yes, we have some small tables and do a lot of sequential scans.

>Check into those and check the size of the tables involved - if they are 8
>pages (or whatever your disk read is) then that means one disk access to get
>to your data.  If they are 32 pages (e.g.) that means on average two disk
>accesses.  An indexed read is always one index read + 1 data read.  For each
>level your index descends, finding your data may cost one more read.  Hence
>sequential scans CAN be more effective in some situations.  Generally in
>OLTP they are undesirable.

>A quick check might be select tabname from systables where npused > 8 and
>nindexes = 0;

***Dirk
Thanks, I will have a look at these.


>***Dirk

> Update stats runs every night and is up to date. I double checked this
> recently by running the query -
> select tabname, max(constructed) from
> systables,sysdistrib
> where systables.tabid = sysdistrib.tabid
> group by 1
> order by 2,1

> I have a problem with the DS (PDQ) parameters. When I give a value  to
> MAX_PDQPRIORITY, all the users  start using PDQ, even without PDQPRIORITY
> being set in their environments. I had to disable MAX_PDQPRIORITY to get
> around this problem.
> I did the same for DS_TOTAL_MEMORY, gave it the lowest value I could.

>***Jack
>Yes, it is an issue.  XPS has some stored procedures which will be run
>before a user session starts where you can change that, but I see your
>problem.  You can also set the environment variable PDQPRIORITY=1 to achieve
>the same effect.

> >While we're there, your RA_PAGES and THRESHOLD are way out of whack.  They
> >should be closer to one another.  With a setting of 128, you are doing a lot
> >of reading ahead - which goes into your buffers and implies management of
> >same.  This may be desirable, in which case your threshold should increase
> >some to meet it.  120 for example.  If you want light scans then 128/120 is
> >an ideal setting.  However if you are doing OLTP work - which the rest of
> >this engine looks like, then you want to decrease these dependent on the
> >size of your tables.  Again, I don't know your application or how the data
> >is being read/used.  You may have a buffer management issue here: 13 million
> >buffer waits, 42 million latch waits - depending on how long the engine has
> >been up.  You may have a problem here where you are flooding your buffers
> >with read ahead for a random access set of applications.

> I will play around with these values thank you.

> >Impressively low lockwaits - especially considering 14 billion lock
> >requests.  Why so many?  Is there really a need for all of those locks, or
> >has some developer gone amok and decided that he will always lock everything
> >(s)he reads - even when running a report?  Find them....

> I do not know where the lock requests come from - any way I can trace this?
> We do a lot of inserts on this system.  To give you a better picture, this is
> a transport company scanning thousands of barcodes into and out of vehicles.
> Isn't this perhaps where the lock requests come from?

>You can look at onstat -u and see who has a lot of lock requests and then
>see what they're doing.  For something like that I generally do up a script
>which scans every 'n' seconds; when it finds something big in the locks
>column it calls a function to onstat -g sql and onstat -g ses that session.
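The same hunt can be done from SQL via the sysmaster database instead of parsing onstat output.  Table and column names below are from memory and may differ between versions - check your sysmaster schema before relying on this:

```sql
-- Sessions ranked by cumulative lock requests (sketch).
SELECT s.sid, s.username, p.lockreqs
  FROM sysmaster:syssessions s, sysmaster:syssesprof p
 WHERE s.sid = p.sid
 ORDER BY p.lockreqs DESC;
```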

> >Are you really supporting 2600 connections?

>***Dirk
> My NETTYPE was originally set to 4,300   ,which I thought were 1200
> connections. Tech Support (and also in a recent discussion on this list)
> told me that it is in fact only 300 connections, and they requested me
> to change it to 4,600    ,which according to them is now only 600
> connections.
> Apparently the number of poll threads does not affect the number of
> connections like I originally thought.

>***Jack
>Surprises me too.  Of course it's one of those 'set and forget' things.
>I'll have to go read Manuella for that one.
>***Dirk
> The change to 4,600 was due to the error
> -25580  System error occurred in network function.

>***Jack
> >First thing I would do is decrease the read ahead.
> >Next thing I would do is plan a weekend of pulling apart the dbspaces and
> >rebuilding them.

> >Then I would re-visit and see what effect that had.

> >Of course that's my .02 worth.  I'm sure others will gently correct me or
> >provide other insight.

>***Dirk
> Thank you for all the input.  We are moving from the current L2000 to
> an N4000 server next weekend (at long last!) and from there on I think
> most of my effort will go into disk reorgs.

***Jack
>"Know thy data and know thy queries".  Once you can figure out what the
>users are doing against what data you can plan more effectively on how to
>deal with it.

>Let me know if the RA helps things.

***Dirk
Thanks, will let you know, and thank you for all the info.


Migrating from Dynamic Server 7.2 to IDS 9 (2000)

Post by Jack Parke » Wed, 19 Sep 2001 01:04:44


Fellow I worked with ran into it last year.  Took a while to figure out what
was happening.  I was a little curious as to why he had a single monolithic
index like that, but the application was rather unique.

cheers
j.

----- Original Message -----
Sent: Monday, September 17, 2001 11:55 AM
Subject: RE: Migrating from Dynamic Server 7.2 to IDS 9 (2000)


=====

> >> Unless you detach your index and put it in one dbspace, but that brings up
> >> index contention for updates, inserts, and deletes...

> >Beware the 32GB/dbspace index limitation.

> >cheers
> >j.

> I guess that could be a pretty big issue on some tables ...