Post by Emmanuele Bass » Sat, 01 Dec 2001 09:04:15

Hi everyone,

I've recently compiled and tested each kernel since 2.4.13-pre6[0], and
I've noticed a recurrent (and reproducible[1]) deadlock on my system
when I try to play an mp3[2].

It occurs at random, i.e. not after a fixed amount of playing time, but
each and every time I try to play an mp3 file my box suddenly
``freezes'': no signs of life at all (SysRq keys, network, even via a
serial terminal), no Oops, no trace in the logs. The box simply `dies'.

I've tried hundreds of combinations, trying to understand where the
problem lies, and I've come up with... er... nothing...

o       it's not ext3: even vanilla kernels lock up;
o       it's not a hardware problem: I've tested my RAM and compiled
        kernel after kernel with (and without) optimization;
o       kernels <= 2.4.13-pre6 work properly;
o       it's not the fault of the player or its libraries: I've tried
        many players, built on different libraries; besides, a
        user-level program shouldn't cause such deadlocks;
o       every other operation on kernels > 2.4.13-pre6 works quite well
        (this new VM is *great*), *except* when I try to listen to an
        mp3[3]: that always leads to disaster.

So far, I've excluded everything but a bug in the OSS sound drivers,
but, according to the ChangeLogs, they did not change from 2.4.13-pre6
(the last working kernel) to 2.4.13.


[0] Mainly because it was the first kernel with both the new VM and the
ext3 patch available, excluding 2.4.10.

[1] At least, on my box.

[2] I use a SoundBlaster AWE64 (ISA), recognized perfectly by both isapnp
and the 2.4.x kernels, using OSS modules. Yes, I've also tried without
modules. No, I did not try ALSA. Yes, the card works perfectly.

[3] Any other format, except .MOD files, works perfectly. And that's why
I suspect the sequencer code.


Emmanuele Bassi (Zefram)
GnuPG Key fingerprint = 4DD0 C90D 4070 F071 5738  08BD 8ECC DB8F A432 0FF4


1. Deadlock on kernels > 2.4.13-pre6

I had this kind of deadlock on an MSI-6215 (i815) running in console mode (no X).
It always happened during screen blanking while there was interrupt load
(networking via ISDN). APM-based screen blanking didn't work, so I suspected APM,
but that is at most half true. The system runs fine with APM enabled but without
APM screen blanking if you disable console blanking completely by running

echo -n -e "\033[9;0]\033[10;0]\033[11;0]\033[14;0]"

during the boot sequence, i.e. with the output sent to /dev/console (this also
silences the beeps, but I believe that can be ignored). By now I suspect the
console blanking code, not the APM code, to be the trigger of the lockup.
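
For anyone who wants to try the same workaround, here is a minimal sketch of that
echo command split into its four escape sequences, with the meaning of each one
as I understand it from the Linux console documentation. The function name is
hypothetical (it is not from the original post), and printf is used only because
it is more portable than "echo -n -e"; the bytes sent are the same.

```shell
#!/bin/sh
# Emit the Linux console escape sequences from the post above.
# disable_console_blanking is a hypothetical name chosen for this sketch.
disable_console_blanking() {
    printf '\033[9;0]'     # screen blank timeout: 0 = never blank
    printf '\033[10;0]'    # bell frequency: 0 Hz (silences the beep)
    printf '\033[11;0]'    # bell duration: 0 ms
    printf '\033[14;0]'    # VESA powerdown interval: 0 = disabled
}

# Usage from a boot script (needs permission to write to /dev/console):
# disable_console_blanking > /dev/console
```

The sequences only have an effect when they reach a Linux virtual console, which
is why the output has to go to /dev/console rather than to an xterm or a serial
line.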

Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH
