crash help, pgsql 7.2.1 on RH7.3

Post by Tom La » Fri, 22 Nov 2002 12:53:37




> running pgsql 7.2.1 on redhat7.3 SMP. installed a hacked glibc to fix the
> mktime() timezone problem for dates < 1970
> (http://rpms.arvin.dk/glibc/rh73/i686/)
> three times now the backend process has unexpectedly quit. what happens is
> the postmaster process and the stats processes disappear and only the client
> connection processes remain.

Really!?  That would seem to indicate a postmaster crash.  (The stats
processes are designed to quit automatically when the parent postmaster
exits, so it's no surprise they'd exit too.)  This is highly unusual,
and worth looking into more closely.

> i don't see a core file.

Check that you are starting the postmaster with "ulimit -c unlimited";
this is not the default on most Linuxen, so you may have to add that to
the start script.  Also note that the postmaster never does a chdir,
so if it drops core it will be in the same directory the start script
was running in.
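
A minimal sketch of what that start-script change might look like. The helper name `run_with_cores` is made up for illustration and is not part of any shipped init script; it also assumes a shell whose `ulimit` builtin accepts `-c` (bash and most Linux shells do, but POSIX does not guarantee it).

```shell
#!/bin/sh
# run_with_cores DIR COMMAND [ARGS...]
# Hypothetical helper: run COMMAND from DIR with the core-size soft limit
# raised, so that a crash can leave a core file in DIR.
run_with_cores() {
    dir=$1
    shift
    (
        # The postmaster never does a chdir, so the directory we start in
        # is where its core file would land.
        cd "$dir" || exit 1
        # Raising the soft limit fails if the hard limit is lower; warn
        # rather than abort so the server still starts.
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise core limit" >&2
        exec "$@"
    )
}
```

A start script could then call something like `run_with_cores /var/lib/pgsql postmaster -D /var/lib/pgsql/data`, where both paths are assumptions about the local layout. Doing the work in a subshell keeps the directory change and the limit from leaking into the rest of the script.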

> now that 7.2.3 is out and fixes the mktime() problem i should probably
> upgrade to that and revert to stock redhat glibc stuff.

Probably.  But I do not think the postmaster ever calls mktime(), so the
odds are that your glibc hack is unrelated.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command

 
 
 

Post by Tom La » Fri, 22 Nov 2002 13:32:56


I said:


>> i don't see a core file.
> Check that you are starting the postmaster with "ulimit -c unlimited";
> this is not the default on most Linuxen, so you may have to add that to
> the start script.  Also note that the postmaster never does a chdir,
> so if it drops core it will be in the same directory the start script
> was running in.

Drat, I forgot to mention an important corollary: make sure the
postmaster is started in a directory that's writable by the postgres
user, else you'll get no corefile.

(For completeness I'll mention here that when individual backends dump
core, it's in the $PGDATA/base/nnn/ directory of the database they're
connected to.  So you can easily distinguish a postmaster core from
a backend core, just by where it was dropped.)
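
Both points can be sketched as small shell checks. The function names are hypothetical, and the classification is deliberately crude: anything not under `$PGDATA/base/nnn/` is simply reported as a postmaster core.

```shell
#!/bin/sh
# can_drop_core DIR
# Succeed if DIR exists and is writable by the *current* user. Run it as
# the postgres user to check the directory the postmaster is started in.
can_drop_core() {
    [ -d "$1" ] && [ -w "$1" ]
}

# classify_core COREFILE PGDATA
# Print which kind of process most likely produced COREFILE, judging only
# by the directory it was dropped in.
classify_core() {
    case $1 in
        "$2"/base/*/*) echo backend ;;
        *)             echo postmaster ;;
    esac
}
```

For example, `classify_core /var/lib/pgsql/data/base/16556/core /var/lib/pgsql/data` prints `backend`; the paths and the database OID here are invented for the example.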

                        regards, tom lane

Post by Tim Lync » Sun, 24 Nov 2002 14:35:54


okay, argh. after messing around with /etc/security/limits.conf, it would
have been nice to know that limits.conf doesn't change the default ulimit,
only the bounds within which a user can change it. that is, pam_limits.so and
limits.conf leave the default alone and just set the ceiling, so the user can
then run ulimit -c unlimited. i'd expect a regular user never to be able to
increase their own ulimits - call me old fashioned... what's next, regular user
negative renice?!? anyways...
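
The distinction here looks like the usual soft versus hard limit split: the soft limit is what is in effect, the hard limit is the ceiling an unprivileged process may raise the soft limit up to. A small sketch for inspecting and raising the core limit follows; the function names are made up, and it assumes a shell whose `ulimit` builtin supports `-S`, `-H` and `-c`.

```shell
#!/bin/sh
# Show the soft (currently effective) and hard (upper bound) core limits.
show_core_limits() {
    echo "soft=$(ulimit -S -c) hard=$(ulimit -H -c)"
}

# Raise the soft core limit as far as the hard limit allows. An
# unprivileged process may always do this; only raising the hard limit
# needs privilege.
raise_core_soft_limit() {
    ulimit -S -c "$(ulimit -H -c)"
}
```

Calling `raise_core_soft_limit` in the start script before launching the postmaster has the same effect as `ulimit -c unlimited` whenever the hard limit is unlimited, and still does the best it can when it is not.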

but, uh, what am i going to do with a core file? i would need a non-stripped
postgres binary first, right?

i checked out the cwd in /proc, it is /var/lib/pgsql (actually i symlinked it
into another fs) which is postgres:postgres mode 700.

Post by Lamar Ow » Sun, 24 Nov 2002 15:31:48



> increase their ulimits - call me old fashioned... what's next, regular user
> negative renice?!? anyways...

Actually.... yes.

> but, uh, what am i going to do with a core file? i would need a
> non-stripped postgres binary first, right?

If you have the RPM, you have no debugging symbols.  You can rebuild it with
debugging -- the PGDG RPM set can have debugging symbols enabled with a
simple macro define close to the top of the spec file.

> i checked out the cwd in /proc, it is /var/lib/pgsql (actually i symlinked
> it into another fs) which is postgres:postgres mode 700.

That's the standard place for PGDATA in Red Hat.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11


Post by Tom La » Mon, 25 Nov 2002 02:10:27



> but, uh, what am i going to do with a core file? i would need a non-stripped
> postgres binary first, right?

Yup, you would.  I'd recommend building from source so that you can add
both --enable-debug and --enable-cassert to the configure flags.  (It
may actually be possible to do that with the SRPM distro, but I don't
know how...)
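
As a rough sketch of that workflow: configure the source tree with both flags, and once a core file turns up, ask gdb for a stack trace. The two function names are hypothetical, and any paths you pass in depend on where you unpack and install things.

```shell
#!/bin/sh
# configure_debug SRCDIR [EXTRA_CONFIGURE_ARGS...]
# Hypothetical helper: run the configure script in SRCDIR with debugging
# symbols and assertion checks enabled.
configure_debug() {
    srcdir=$1
    shift
    [ -x "$srcdir/configure" ] || {
        echo "no configure script in $srcdir" >&2
        return 1
    }
    ( cd "$srcdir" && ./configure --enable-debug --enable-cassert "$@" )
}

# core_backtrace BINARY COREFILE
# Print a stack trace from COREFILE using gdb in batch mode. BINARY must
# be the same unstripped executable that produced the core.
core_backtrace() {
    [ -f "$1" ] && [ -f "$2" ] || {
        echo "usage: core_backtrace BINARY COREFILE" >&2
        return 1
    }
    gdb -batch -ex bt "$1" "$2"
}
```

After `configure_debug`, the usual `make` and `make install` steps apply. The backtrace from `core_backtrace` is generally the most useful thing to include when reporting a crash to the list.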

                        regards, tom lane


Post by Lamar Ow » Mon, 25 Nov 2002 05:10:48




> > but, uh, what am i going to do with a core file? i would need a
> > non-stripped postgres binary first, right?
> Yup, you would.  I'd recommend building from source so that you can add
> both --enable-debug and --enable-cassert to the configure flags.  (It
> may actually be possible to do that with the SRPM distro, but I don't
> know how...)

Install the source RPM (.src.rpm), then edit
/usr/src/redhat/SPECS/postgresql.spec, changing the line near the top that
says:
%define beta 0
to
%define beta 1

Save and exit, then 'rpmbuild -ba postgresql.spec', then install the rpms to
be found in /usr/src/redhat/RPMS/arch (i386 probably).

beta=1 defines both --enable-debug and --enable-cassert and allows the full
debugging, AFAIK.  If it doesn't, then we need to look a little closer at
it...
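
The spec-file edit can also be scripted. This is only a sketch: `enable_beta` is a made-up helper name, and it assumes the line reads exactly `%define beta 0` apart from whitespace.

```shell
#!/bin/sh
# enable_beta SPECFILE
# Flip "%define beta 0" to "%define beta 1" in an RPM spec file, leaving
# every other line untouched. Writes via a temporary file so that a
# failed sed run does not truncate the original.
enable_beta() {
    spec=$1
    tmp="$spec.tmp.$$"
    sed 's/^%define[[:space:]][[:space:]]*beta[[:space:]][[:space:]]*0[[:space:]]*$/%define beta 1/' \
        "$spec" > "$tmp" && mv "$tmp" "$spec"
}
```

Usage would be `enable_beta /usr/src/redhat/SPECS/postgresql.spec`, followed by the `rpmbuild -ba postgresql.spec` step described above.
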
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11


Post by Tom La » Mon, 25 Nov 2002 05:28:44




>> Yup, you would.  I'd recommend building from source so that you can add
>> both --enable-debug and --enable-cassert to the configure flags.  (It
>> may actually be possible to do that with the SRPM distro, but I don't
>> know how...)
> Install the source RPM (.src.rpm), then edit
> /usr/src/redhat/SPECS/postgresql.spec, changing the line near the top that
> says:
> %define beta 0
> to
> %define beta 1

Cool.  Thanks for the tip.

                        regards, tom lane


 
 
 

1. crash help, pgsql 7.2.1 on RH7.3

running pgsql 7.2.1 on redhat7.3 SMP. installed a hacked glibc to fix the
mktime() timezone problem for dates < 1970
(http://rpms.arvin.dk/glibc/rh73/i686/)

three times now the backend process has unexpectedly quit. what happens is
the postmaster process and the stats processes disappear and only the client
connection processes remain.

i don't see a core file. nothing interesting is mentioned in the logs except
for the usual redo post-mortem on startup, "database system was interrupted
at ...". i'm new to postgres admin, so i'm hoping folks can give some
direction as to where to begin solving this problem. it's happened about 3
times in two months.

now that 7.2.3 is out and fixes the mktime() problem i should probably
upgrade to that and revert to stock redhat glibc stuff. my concern is that i
have no idea what caused these events and i don't know what to do to ensure
that when it happens again i'll be able to determine the cause. what type of
logging and monitoring is recommended to report the health of a running
postgres?
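
One very basic health check for the symptom described above (postmaster gone, client processes still around) is to test whether the PID recorded in the data directory still refers to a live process. This is a sketch only: `postmaster_alive` is a hypothetical name, and it assumes that the first line of `$PGDATA/postmaster.pid` holds the postmaster's PID.

```shell
#!/bin/sh
# postmaster_alive PGDATA
# Succeed if PGDATA/postmaster.pid names a process that still exists.
# Run as the postgres user or root: kill -0 fails on other users'
# processes even when they are alive.
postmaster_alive() {
    pidfile=$1/postmaster.pid
    [ -r "$pidfile" ] || return 1
    # Assumption: the first line of postmaster.pid is the postmaster's PID.
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
    esac
    kill -0 "$pid" 2>/dev/null
}
```

Run from cron as something like `postmaster_alive /var/lib/pgsql/data || echo "postmaster is down"` (the path is an assumption), this would at least record when the postmaster disappears, even though it says nothing about why.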
