sys-unconfig hosed system and now it will not come up!!!

Post by reg.. » Mon, 11 Aug 1997 04:00:00



Hi,

We just moved two machines to a new location, and hence to new IP
addresses, so I decided to use sys-unconfig.  On the first machine, a
3000 running 2.5.1, I executed the command, the system halted, and when
it came back up sysidtool etc. prompted me.  In short, everything
worked fine.

Then on the second system, a 5000 running 2.5.1, we brought the machine
down to single-user mode (from multiuser mode) and then executed the
sys-unconfig command.  The command did its stuff, but once it had
halted the system, the system would not come up _at all_!!  It gave
the prompt to press ^D to continue with normal startup or enter the
root password for maintenance mode, but either way it would give the
error:

unable to write to /var/adm/utmpx.

The first time it also gave the error "zs2: ring buffer overflow", which
I was later told was a trivial error (I am still not so sure).

And yes, /var is a mounted file system as is /usr.

What I was able to do was boot from CD-ROM and see that vfstab had the
contents of the mnttab file, mnttab had the contents of nsswitch.conf
(!!!???) and nsswitch.conf was empty.  Other files may have been
affected, but an ls -tal in /etc didn't turn any up.  At one point I was
able to reconstruct the vfstab file with just the root filesystems (not
the logical volumes; we use Veritas and have an SSA and an RSM on this
machine).  I was also able to mount all the filesystems, including
/var.  /var/adm/utmpx etc. all look OK, and none of the fscks of /,
/usr, /var, etc. gave _any_ errors, not even wrong superblock counts.
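
Since nsswitch.conf ended up empty, one rough way to look for other
truncated files from a CD-ROM boot is to list the zero-length regular
files under the mounted /etc.  This is only a sketch: the function name
list_empty_files and the /a mount point are assumptions, and some files
are legitimately empty, so the output is a list of candidates to inspect
rather than proof of corruption.

```shell
#!/bin/sh
# List zero-length regular files anywhere under a directory tree.
# Hypothetical usage: list_empty_files /a/etc
# (/a is an assumed mount point for the root slice after a CD-ROM boot)
list_empty_files() {
    dir=$1
    # -size 0 matches files whose size is zero 512-byte blocks,
    # i.e. completely empty files
    find "$dir" -type f -size 0 -print | sort
}
```

Comparing the result against the same listing from a healthy machine
running the same release would help separate truncated files from ones
that are normally empty.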

But the system still will not come up.  I am leaning towards
reinstalling the OS because I have tried just about everything I can
think of.  Does anyone have any input on:
1. Did sys-unconfig cause the file corruption (possibly because we
launched it in single-user mode)?
2. Did the ring buffer overflow cause the file corruption (just a
coincidence of bad timing)?
3. Are there other files out there that have been corrupted (and how
can I find that out), since the system still refuses to boot?  The
error is always "unable to write to /var/adm/utmpx".

any help or ideas would be greatly appreciated!

thanks,

Adam

 
 
 

Post by Rachel Polansk » Mon, 11 Aug 1997 04:00:00




> Hi,

> We just moved two machines to a new location, and hence to new IP
> addresses, so I decided to use sys-unconfig.  On the first machine, a
> 3000 running 2.5.1, I executed the command, the system halted, and when
> it came back up sysidtool etc. prompted me.  In short, everything
> worked fine.

> What I was able to do was boot from CD-ROM and see that vfstab had the
> contents of the mnttab file, mnttab had the contents of nsswitch.conf
> (!!!???) and nsswitch.conf was empty.  Other files may have been
> affected, but an ls -tal in /etc didn't turn any up.  At one point I was
> able to reconstruct the vfstab file with just the root filesystems (not
> the logical volumes; we use Veritas and have an SSA and an RSM on this
> machine).  I was also able to mount all the filesystems, including
> /var.  /var/adm/utmpx etc. all look OK, and none of the fscks of /,
> /usr, /var, etc. gave _any_ errors, not even wrong superblock counts.

Hello,

I had exactly the same problem when I installed 2.5.1 4/97 on my Ultra
1/200.

When I ran sys-unconfig, the vfstab file was truncated and contained only
the contents of nsswitch.conf.

nsswitch.conf itself was empty!

I had no other file corruption.

I created a new copy of the vfstab, after coming up in single user mode,
and then rebooted.
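
Before rebooting on a hand-built vfstab, a quick sanity check may be
worth running.  The sketch below assumes the usual seven-field vfstab
layout (device to mount, device to fsck, mount point, FS type, fsck
pass, mount at boot, mount options) and reports any non-comment line
with a different field count.  The name check_vfstab is made up for
illustration, and the check only catches wrong field counts, not wrong
device names.

```shell
#!/bin/sh
# Report non-comment, non-blank lines that do not have exactly seven
# whitespace-separated fields.  Prints "line N: M fields" for each bad
# line and returns non-zero if any were found.
check_vfstab() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }   # skip blank lines and comments
        NF != 7 { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

A vfstab that has been overwritten with nsswitch.conf entries such as
"passwd: files" would be flagged straight away, since those lines have
far fewer than seven fields.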

I ran sys-unconfig a second time, and it worked just fine.

I am wondering why this happened to me and not to the other 2 Ultras
installed at our site....

rachel

--
Rachel Polanskis                 Kingswood, Greater Western Sydney, Australia


 "Yow!  Am I having fun yet?!" - John Howard^H^H^H^H^H^H^H^H Zippy the Pinhead

 
 
 

Post by Richard B. Joh » Tue, 12 Aug 1997 04:00:00



<some text removed>

> But the system still will not come up.  I am leaning towards
> reinstalling the OS because I have tried just about everything I can
> think of.  Does anyone have any input on:
> 1. Did sys-unconfig cause the file corruption (possibly because we
> launched it in single-user mode)?
> 2. Did the ring buffer overflow cause the file corruption (just a
> coincidence of bad timing)?
> 3. Are there other files out there that have been corrupted (and how
> can I find that out), since the system still refuses to boot?  The
> error is always "unable to write to /var/adm/utmpx".

> any help or ideas would be greatly appreciated!

> thanks,

> Adam

Howdy.

I have come across the exact same thing several times, and never really
investigated why it happened.

If you can boot off of the CD-ROM, mount whatever partition holds /etc
and cd to the etc directory under the mount point.  Look for a hidden
file named ".sysIDtool.state".  Its contents are similar to the answers
you would normally give during a manual installation (or after a
successful 'sys-unconfig').

I have had a measure of success manually editing this file and setting the
values all back to "0" and doing an 'init 6' to reboot the system.  I was
then prompted again for the hostname, IP address, etc. just as if the
system was completely unconfigured.  This usually worked if I had made no
hardware or other changes that would affect the OS.
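
As a rough sketch of that edit (not an official procedure): the function
below keeps a backup copy and rewrites a leading "1" on each line to
"0".  It assumes each flag is a single 0 or 1 at the start of its own
line, either alone or followed by whitespace, so look at the file on
your own system before trusting it.  The name reset_sysid_state and the
/a mount point in the usage line are hypothetical.

```shell
#!/bin/sh
# Back up a sysidtool state file and set every leading "1" flag to "0".
# ASSUMPTION: each flag is a lone 0 or 1 at the start of its own line.
# Hypothetical usage: reset_sysid_state /a/etc/.sysIDtool.state
reset_sysid_state() {
    state=$1
    # Keep the original next to the edited file as <name>.orig
    cp -p "$state" "$state.orig" || return 1
    # Change only lines that start with a lone "1"; every other line
    # is copied through unchanged.
    sed -e 's/^1$/0/' \
        -e 's/^1\([[:space:]]\)/0\1/' "$state.orig" > "$state"
}
```

If the reboot does not prompt as expected, the untouched copy is still
there as .sysIDtool.state.orig.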

The ring buffer overflow is usually due to a different mouse or keyboard
being used than the one that was attached when the system last did a
reconfiguration reboot.  Next chance you get, a 'touch /reconfigure;
init 6' should clear it up.

I've also seen your #3 problem before.  If you haven't yet, you may need
to check the EEPROM setting for "boot-device" and make sure that you
haven't inadvertently changed it so that the system is trying to boot off
of a different device.  Do a 'probe-scsi-all' at the ok prompt to make
sure that all your SCSI devices are connected properly and that there's
no problem with the bus.

Lastly, I had this problem when I had backed up a system to tape and
restored it to a different drive at a different SCSI address.  The system
had been booting off of a disk at SCSI address 3, and I now wanted it to
boot off of a disk at SCSI address 0.  The problem was that when I made
the backup there had been no drive at SCSI address 0, so the OS had no
device driver already in the system for it and didn't recognize a disk
at address 0 as being valid.  It took a week for me and Sun to figure
that one out...

Anyway, sorry for being so verbose.  I hope that I was of some help.

Good luck.

Regards,

R Johns