aic7xxx driver panics under heavy swap.

Post by Bulent Abal » Thu, 21 Jun 2001 00:50:09



Justin,
When free memory is low, I get a series of aic7xxx messages followed by a
panic.  It appears to be a race condition in the code.  Should the driver
panic here?  I tried the following patch to avoid the panic, but I am not
sure it is functionally correct.
Bulent

scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
scsi0: Temporary Resource Shortage
Kernel panic: running device on run list

--- aic7xxx_linux.c.save Mon Jun 18 20:25:35 2001

           * Get an scb to use.
           */
          if ((scb = ahc_get_scb(ahc)) == NULL) {
+              ahc->flags |= AHC_RESOURCE_SHORTAGE;
               if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-                   panic("running device on run list");
+                   return;
+                   // panic("running device on run list");
               LIST_INSERT_HEAD(&ahc->platform_data->device_runq,
                          dev, links);
               dev->flags |= AHC_DEV_ON_RUN_LIST;
-              ahc->flags |= AHC_RESOURCE_SHORTAGE;
+              // ahc->flags |= AHC_RESOURCE_SHORTAGE;
               printf("%s: Temporary Resource Shortage\n",
                      ahc_name(ahc));
               return;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

Post by Justin T. Gibb » Thu, 21 Jun 2001 02:10:05


Quote:

>Justin,
>When free memory is low, I get a series of aic7xxx messages followed by
>panic.  It appears to be a race condition in the code.

It's actually a logic error, not a race condition.  You should never
enter ahc_linux_run_device_queue() while the device is still on the
run queue.  The real issue is that ahc_linux_queue() bypasses the
round-robin device scheduler by calling ahc_linux_run_device_queue()
directly.  The code should look like this (the LIST macro calls
were switched to TAILQ calls a while ago to ensure round-robin, but
that change came just after 6.1.13).  I haven't tested this yet...

Thanks for the bug report.  If you can verify that this works under
memory pressure, the printf can go away.

--
Justin

==== //depot/src/linux/drivers/scsi/aic7xxx/aic7xxx_linux.c#67 - /usr/src/linux/drivers/scsi/aic7xxx/aic7xxx_linux.c ====
--- /tmp/tmp.3288.0     Tue Jun 19 11:07:32 2001

        }
        cmd->result = CAM_REQ_INPROG << 16;
        TAILQ_INSERT_TAIL(&dev->busyq, (struct ahc_cmd *)cmd, acmd_links.tqe);
-       ahc_linux_run_device_queue(ahc, dev);
+       if ((dev->flags & AHC_DEV_ON_RUN_LIST) == 0) {
+               TAILQ_INSERT_TAIL(&ahc->platform_data->device_runq, dev, links);
+               dev->flags |= AHC_DEV_ON_RUN_LIST;
+               ahc_linux_run_device_queues(ahc);
+       }
        ahc_unlock(ahc, &flags);
        return (0);

        struct   ahc_tmode_tstate *tstate;
        uint16_t mask;

+       if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
+               panic("running device on run list");
+
        while ((acmd = TAILQ_FIRST(&dev->busyq)) != NULL
            && dev->openings > 0 && dev->qfrozen == 0) {

                 * running is because the whole controller Q is frozen.
                 */
                if (ahc->platform_data->qfrozen != 0) {
-                       if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-                               return;

                        TAILQ_INSERT_TAIL(&ahc->platform_data->device_runq,

                 * Get an scb to use.
                 */
                if ((scb = ahc_get_scb(ahc)) == NULL) {
-                       if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-                               panic("running device on run list");
                        TAILQ_INSERT_TAIL(&ahc->platform_data->device_runq,
                                         dev, links);
                        dev->flags |= AHC_DEV_ON_RUN_LIST;
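
Editorial sketch (not part of the original thread): the invariant Justin describes can be modeled in a few lines of user-space C with the <sys/queue.h> TAILQ macros. This is a toy under stated assumptions, not the driver's code. The names struct dev, struct ctlr, free_scbs, queue_cmd() and complete_cmd() are hypothetical stand-ins, and the "stop scheduling when no SCBs are free" loop condition is an assumption made so the toy terminates. It only illustrates that a device is taken off the run queue (and its flag cleared) before it is run, and that the submission path goes through the scheduler instead of running the device directly.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define DEV_ON_RUN_LIST 0x1     /* hypothetical stand-in for AHC_DEV_ON_RUN_LIST */

struct dev {
    int flags;
    int pending;                /* commands waiting on this device's busy queue */
    TAILQ_ENTRY(dev) links;
};

TAILQ_HEAD(dev_runq, dev);

struct ctlr {
    struct dev_runq runq;       /* round-robin run queue of devices */
    int free_scbs;              /* controller-wide resource pool (assumed) */
    int issued;                 /* commands handed to the "hardware" */
};

static void ctlr_init(struct ctlr *c, int free_scbs)
{
    TAILQ_INIT(&c->runq);
    c->free_scbs = free_scbs;
    c->issued = 0;
}

/* Append a device to the run queue unless it is already queued. */
static void schedule_dev(struct ctlr *c, struct dev *d)
{
    if ((d->flags & DEV_ON_RUN_LIST) == 0) {
        TAILQ_INSERT_TAIL(&c->runq, d, links);
        d->flags |= DEV_ON_RUN_LIST;
    }
}

/* Issue commands for one device; on a resource shortage, requeue at the tail. */
static void run_device_queue(struct ctlr *c, struct dev *d)
{
    /* Models the panic(): a device must not be run while still queued. */
    assert((d->flags & DEV_ON_RUN_LIST) == 0);

    while (d->pending > 0) {
        if (c->free_scbs == 0) {
            schedule_dev(c, d);         /* temporary resource shortage */
            return;
        }
        c->free_scbs--;
        d->pending--;
        c->issued++;
    }
}

/* Scheduler: dequeue a device and clear its flag before running it. */
static void run_device_queues(struct ctlr *c)
{
    struct dev *d;

    while (c->free_scbs > 0 && (d = TAILQ_FIRST(&c->runq)) != NULL) {
        TAILQ_REMOVE(&c->runq, d, links);
        d->flags &= ~DEV_ON_RUN_LIST;
        run_device_queue(c, d);
    }
}

/* Submission path: go through the scheduler, never run the device directly. */
static void queue_cmd(struct ctlr *c, struct dev *d)
{
    d->pending++;
    if ((d->flags & DEV_ON_RUN_LIST) == 0) {
        schedule_dev(c, d);
        run_device_queues(c);
    }
}

/* Completion path: a resource is returned, so let the scheduler retry. */
static void complete_cmd(struct ctlr *c)
{
    c->free_scbs++;
    run_device_queues(c);
}
```

With a single free SCB, queueing several commands on two devices and then calling complete_cmd() repeatedly alternates between the devices without ever tripping the assertion, because a device that hits a shortage is only ever re-entered via run_device_queues().
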

 
 
 

Post by Bulent Abal » Thu, 21 Jun 2001 23:10:06


Justin,
Your patch works for me.  The "Temporary Resource Shortage" printk
has to go, or maybe you can make it a debug option.

Here is the cleaned up patch for 2.4.5-ac15 with TAILQ
macros replaced with LIST macros.  Thanks for the help.
Bulent

--- aic7xxx_linux.c.save Mon Jun 18 20:25:35 2001

     }
     cmd->result = CAM_REQ_INPROG << 16;
     TAILQ_INSERT_TAIL(&dev->busyq, (struct ahc_cmd *)cmd, acmd_links.tqe);
-    ahc_linux_run_device_queue(ahc, dev);
+    if ((dev->flags & AHC_DEV_ON_RUN_LIST) == 0) {
+         LIST_INSERT_HEAD(&ahc->platform_data->device_runq, dev, links);
+         dev->flags |= AHC_DEV_ON_RUN_LIST;
+         ahc_linux_run_device_queues(ahc);
+    }
     ahc_unlock(ahc, &flags);
     return (0);

     struct     ahc_tmode_tstate *tstate;
     uint16_t mask;

+    if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
+         panic("running device on run list");
+
     while ((acmd = TAILQ_FIRST(&dev->busyq)) != NULL
         && dev->openings > 0 && dev->qfrozen == 0) {

           * running is because the whole controller Q is frozen.
           */
          if (ahc->platform_data->qfrozen != 0) {
-              if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-                   return;

               LIST_INSERT_HEAD(&ahc->platform_data->device_runq,

           * Get an scb to use.
           */
          if ((scb = ahc_get_scb(ahc)) == NULL) {
-              if ((dev->flags & AHC_DEV_ON_RUN_LIST) != 0)
-                   panic("running device on run list");
               LIST_INSERT_HEAD(&ahc->platform_data->device_runq,
                          dev, links);
               dev->flags |= AHC_DEV_ON_RUN_LIST;


 
 
 

1. Kernel panic: Loop 1 (aic7xxx driver)

I had not compiled a kernel since version 2.4.9 until recently. With
2.4.17 I always get "Kernel panic: Loop 1" from the aic7xxx driver. When I
remove this driver the kernel starts correctly. My research on the net showed
that Gregoire Favre ("Kernel panic: Loop 1 (aic7xxx under 2.4.13-ac[246])") has
the same problem, and his report is almost identical to mine. I won't post
mine here since I don't know how to save it to a file; everything is lost after
the hard reset (does anybody know how to save/retrieve kernel messages after a panic?).

However, I can give a hint about where to look, since the problem disappears when I
compile the aic7xxx driver into the kernel. The problem occurs if (with make
menuconfig) both "SCSI support" and the low-level driver "Adaptec aic7xxx support"
are set to "M". It occurs only if _"SCSI support"_ itself is set to "M".
I guess only very few people use this option.

O. Wyss

--
Author of "Debian partial mirror synch script"
   ("http://dpartialmirror.sourceforge.net/")
