data conversion for a very large table

data conversion for a very large table

Post by Sih Jenn » Sun, 31 Dec 1899 09:00:00

Hi! Folks,

Can someone provide me with some good ideas how to run this conversion.

I want to convert a integer column to serial and create a unique index on

table has 40 million rows
table has 3 dbspaces, each with 2 chunks and each chunk is 2 Gb, therefore
12 Gb in total
currently table also has 2 indexes, each in separate dbspaces and each is 2
Gb large

this table has duplicates in it at the moment therefore in order to create a
unique on it I need to clean up the duplicates first.

What I have tried was -

(1) find the duplicates
(2) create a dummy table and load the duplicates in
(3) deleted duplicates using delete * from table where duplicate column in
(select duplicat column from dummy table)
(4) when I ran "alter table" statement it ran out of temp dbspaces (I have

I think it requires the size of the temp dbspaces to be 12 Gb for sorting,

Anyway, this exercise was extremely slow.  I am looking for a smart

Thanks in advance for your help.


Jenny Sih
Database Administrator
Siemens Limited
Level 15  9 Hunter Street
Sydney 2000 Australia
Direct Line: (612) 9230 1039

CAUTION - This message may contain privileged and confidential information
intended only for the use of the addressee named above.  If you are not the
intended recipient of this message you are hereby notified that any use,
distribution or reproduction of this message is prohibited.  If you have
received this message in error please notify Siemens Ltd., ACN 004 347 880,
or Siemens (NZ) Limited immediately.


data conversion for a very large table

Post by Robert Taylo » Sun, 31 Dec 1899 09:00:00

My idea ( while not smart ) might work.
Sort your duplicates out as planned but , instead of ALTERing the table
simply unload it to a text file.
Drop the table and re-create it with the serial field instead of the
integer, load up using DBload and finally create the index when the load is
Don't forget that when creating the tables you should mention EXTENT SIZE
and NEXT SIZE and also specify another other info required.
Best to make these parameters being passed in to your program or environment
variables at the very least. This will mean its flexible. ( Assuming your
using 4GL , if not it can still all be done using SQL and scripts ).
Mail me if you need further help.