>Exchange 5.5 SP2,
Ummm, why not SP3 -- or even better, SP4? SP2 is years old.
>It appears my Exchange 5.5 SP2 database is corrupt. It is up and running at
>the moment, but a client is getting an error when trying to archive
>messages. I have had a look at the event log and there are a heap of 116
>event IDs with occasional 117, and 200. They all refer to -1018 errors in
The -1018 is telling you that when a page is read from disk, the
checksum recomputed from its data doesn't match the checksum stored
with the page. IOW, the page is not as it was when it was written.
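To make the idea concrete, here's a rough Python sketch of a per-page checksum test. The 4 KB page size, the checksum sitting in the first four bytes of the page, and the XOR-based toy_checksum are all assumptions for illustration; this is NOT the real ESE checksum algorithm or page layout.

```python
import struct

PAGE_SIZE = 4096  # assumed page size for this sketch


def toy_checksum(payload: bytes) -> int:
    """XOR of all 32-bit little-endian words (a stand-in, not the ESE algorithm)."""
    total = 0
    for (word,) in struct.iter_unpack("<I", payload):
        total ^= word
    return total


def make_page(data: bytes) -> bytes:
    """Build a page: 4-byte stored checksum followed by zero-padded data."""
    if len(data) > PAGE_SIZE - 4:
        raise ValueError("data does not fit in one page")
    payload = data.ljust(PAGE_SIZE - 4, b"\0")
    return struct.pack("<I", toy_checksum(payload)) + payload


def page_is_intact(page: bytes) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    (stored,) = struct.unpack_from("<I", page, 0)
    return stored == toy_checksum(page[4:])
```

A page that comes back from the disk with even one byte changed fails the comparison, and that mismatch is what gets reported as -1018.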
>For the 116 errors Technet advises to restore from backup
>after determining the source of the problem.
That's the correct course of action.
>The source is a faulty disk
>that I will run checkdisk on (or possibly add a new disk and move the
>database to it).
It's unlikely that will result in the database being rendered error
free, but it /may/ relocate the damaged sector -- if the problem is
the disk and not the controller or cables (or memory).
>However, the last backup worked OK and therefore removed
>the transaction log files, but the Event ID errors started prior to this
Are you making on-line backups? An off-line backup will not detect
this damage because it doesn't verify that the checksum for each page
is correct. It allows you to make backups of data that is unusable.
>So if I restore this backup, I'll restore the corruption - Yes?
If you were making off-line backups, yes. If you were making ON-LINE
backups and you didn't get any -1018 errors then the backup is okay.
However, if that is the case (that you made on-line backups and didn't
log -1018 errors) then I'd try making another backup and see if you
are successful. If you are, then it's probably NOT the disk, but the
disk controller firmware that's the problem. There's a tool (ESEFILE)
that appeared with SP3 (IIRC) that will read the entire database and
verify the checksums. Since it does so with the Information Store
service stopped it eliminates the load on the controller and simply
reads the file sequentially. SP3 introduced another beneficial feature
-- a retry on -1018 errors that almost always succeeds if the cause is
firmware that delivers disk sectors in the incorrect order.
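The sequential scan and the retry idea can be sketched together in Python. This is not ESEFILE and not what the store does internally; the page size, the checksum in the first four bytes, the XOR toy_checksum, the retry count, and the function names are all assumptions made up for the example, and it ignores the real file header.

```python
import io
import struct

PAGE_SIZE = 4096  # assumed page size for this sketch


def toy_checksum(payload: bytes) -> int:
    """XOR of all 32-bit little-endian words (a stand-in, not the ESE algorithm)."""
    total = 0
    for (word,) in struct.iter_unpack("<I", payload):
        total ^= word
    return total


def page_ok(page: bytes) -> bool:
    """A page is good if it is full-sized and its stored checksum matches."""
    if len(page) != PAGE_SIZE:
        return False
    (stored,) = struct.unpack_from("<I", page, 0)
    return stored == toy_checksum(page[4:])


def scan_pages(f, retries: int = 3) -> list:
    """Read the file front to back; return the page numbers that stay bad."""
    bad = []
    page_no = 0
    while True:
        offset = page_no * PAGE_SIZE
        f.seek(offset)
        page = f.read(PAGE_SIZE)
        if not page:
            break  # end of file
        attempts = 0
        # Re-read a failing page: a transient misread may succeed on retry.
        while not page_ok(page) and attempts < retries:
            f.seek(offset)
            page = f.read(PAGE_SIZE)
            attempts += 1
        if not page_ok(page):
            bad.append(page_no)
        page_no += 1
    return bad
```

If the same page fails on every re-read, the damage is on the disk itself; if a re-read succeeds, the problem was in how the data got delivered, which is the firmware case.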
>I restore from a backup prior to this I'll lose all information after that
Not if the log files are on another disk spindle and that disk is
okay. When you restore the previous ON-LINE backup, the log files will
be replayed when the IS restarts.
>What is the best course of action to limit the amount of data loss.
Make an on-line backup and check the logs for success.
>I just run CheckDisk,
I'd try the backup first. If the ON-line backup fails, then try to get
an OFF-line backup. That way you'll at least have something to restore
if your earlier backup proves to be unusable.
>and provided it fixes the problem, then restore from
>the backup after the errors started appearing then run eseutil /r /is ?
After restoring from an on-line backup, the IS performs a recovery,
which is what that ESEUTIL would do with those parameters.
But, after restoring from the ON-LINE backup, you MUST restart the IS
before you do /anything/ else. Do NOT run ESEUTIL or ISINTEG until
AFTER the IS has had a chance to apply the .PAT file and replay the
log files.
>forget the backup and use eseutil.
ESEUTIL is the /last/ thing to try, not the first.
>Or will I just have to suffer the loss
>and restore from an earlier backup?
MCSE+I, Exchange MVP
MS Exchange FAQ at http://www.swinc.com/resource/exch_faq.htm