You query references a table named zip. Should this be zip9?
You may want to display the execution plan in Query Analyzer to see if the
plan provides some insight into the issue. Also, it could be that the
update is being blocked which is more likely when more rows are updated.
An unrelated issue is that the rows will be updated even if they already
contain the desired value. If the Client5M table may already contain the
values, you may want to add the following to the WHERE clause so that rows
are only updated if needed:
(zhen.dbo.Client5M.msa <> data.dbo.zip.msa OR
zhen.dbo.Client5M.blockgroup <> data.dbo.zip.blockgroup OR
zhen.dbo.Client5M.longitude <> data.dbo.zip.longitude OR
zhen.dbo.Client5M.latitude <> data.dbo.zip.latitude)
If you still need help, please post the table and index create DDL along
with some sample data.
Hope this helps.
SQL FAQ links (courtesy Neil Pike):
(faqxxx.zip in lib 7)
I am using the following query to update a 5-millon-row table (Client5M)
information from a table indexed by zip. In both cases there is a clustered
on the 9-digit zip code.
What I find is that as I manipulate the WHERE clause to vary the number of
updated, performance degrades more than linearly. For example, updating
74K rows takes 4 minutes,
385K rows takes 8 minutes, and
896K rows takes 54 minutes, and
5000K rows takes over 16 hours.
My question is this: Since both tables are indexed the same way an in the
order, which does the time required increase more than linearly with the
of rows affected, and what can I do about it? (So far, what I do is run
smaller updates, which is fine, but I would like to understand the reason.)
Here is the query:
SET msa = data.dbo.zip.msa,
blockgroup = data.dbo.zip.blockgroup,
longitude = data.dbo.zip.longitude,
latitude = data.dbo.zip.latitude
WHERE (zhen.dbo.Client5M.zip9 < '2') AND
(zhen.dbo.Client5M.Zip9 = data.dbo.zip.zip9)
Thanks in advance.