RS6000 network performance problem

Post by Wilfried Philip » Fri, 01 Nov 1996 04:00:00



Hello,

When testing a network switch, we encountered a very strange
performance problem involving two RS6000s (a model 365 and a model
320, respectively), the switch and other network traffic.

From the description below, you will see that the problem is not the
switch itself, but rather the RS6000, or the network, or some other
machine on the network.

The many experiments that we have performed (not all of which are
described here) suggest that under some circumstances the RS6000 reacts
badly to collisions, which results in an "avalanche" of collisions
(possibly due to the small extra delay introduced by the switch?).
NOTE: the collisions are NOT late collisions!

Does anybody have any idea what causes this problem? Or better
yet, does anybody have a solution?

problem description
--------------------
When the RS6000/320 and the RS6000/365 communicate through the switch,
we observe very low throughput with rcp, NFS and ftp (on the order of
100 kbyte/s). Note that the switch is not very busy: it only isolates
the RS6000/320 from the network (see the diagram of our network below,
case A, connection indicated with ;;;;).

When we connect the 320 directly to the network (case B, connection
indicated with *****), the problem disappears and we achieve
throughput on the order of 800 kbyte/s.
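
For scale, the two figures can be set against the nominal rate of the
link. The sketch below is only back-of-the-envelope arithmetic: it
assumes the 10 Mbit/s rate shown for the switch in diagram 1 and
1 kbyte = 1000 bytes, and it ignores framing and protocol overhead. The
function name is illustrative.

```python
# Nominal link rate, assumed from the "10Mb switch" label in diagram 1.
LINK_BITS_PER_SECOND = 10_000_000


def utilization(kbytes_per_second: float) -> float:
    """Fraction of the nominal link rate used by a payload rate in kbyte/s.

    Assumes 1 kbyte = 1000 bytes and ignores all protocol overhead.
    """
    return kbytes_per_second * 1000 * 8 / LINK_BITS_PER_SECOND


for label, rate in [("case A (via switch)", 100), ("case B (direct)", 800)]:
    print(f"{label}: {rate} kbyte/s = {utilization(rate):.0%} of 10 Mbit/s")
```

Under these assumptions case A uses roughly 8% of the link and case B
roughly 64%.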

Also, when the RS6000/365 is replaced with an HP-UX workstation or a Sun
workstation, the problem remains in case A (and again does not occur in
case B).

Both IBMs are running AIX 3.2.5. The 365 has a Standard Ethernet
Adapter and the 320 an "Ethernet High-Performance LAN Adapter".

Related problems
---------------------
A similar problem occurs between an RS6000/365 (or an RS6000/320, or an
HP workstation) and an AXIS CD-ROM server when these systems are
connected directly to the same stretch of the ethernet, i.e., when
there is no switch involved. In this case the CD-ROM server acts as an
NFS server, and the NFS throughput is below 100 kbyte/s on the RS6000
clients. With a packet tracer we noticed a "collision multiplication"
in this case: there is an excessive number of collisions when data is
sent to the RS6000 clients, and the collisions usually come in bursts.

On the other hand, when the NFS client is a PowerPC P20, or a Sun
workstation, or even an RS6000 with a different version of AIX (i.e.,
AIX 3.2), the problem disappears (and throughput of more than 600
kbyte/s is achieved). In this case all systems are again connected
directly to the ethernet and there is no switch involved.

Finally, when the CD-ROM server and the RS6000/320 are connected
directly to the switch, which is in turn connected to the ethernet (see
diagram 2), the problem remains. However, when the ethernet is
disconnected from the switch, the problem disappears.

diagram 1: our network
-----------------------

                       | internet                        | other lab        
                       |                                 |            
            |------------------|                  |------------------|
            |multiport repeater|                  |      bridge      |
            |------------------|                  |------------------|      
                       |                                 |
                       |        ethernet                 |
           ____________|_________________________________|  
          |                         |            
          |                         |            
   |--------------------|   |--------------------|
   |multiport repeater 1|   |multiport repeater 2|
   |--------------------|   |--------------------|
     |   |  | (8 stretches      |   |  | (8 stretches  
     |   |  | many machines     |   |  |  many machines      
     |   |  | on each stretch)           on each stretch)      
     |   |                                          
     |   |___________________***********************************
     |                      |  ethernet                        *
     |                      |                                  *  
     |              |-------------|                 case B-->  *
     |              |media changer|                            *
     |              |-------------|                            *
     |                      | utp                              *
     |              |-------------|  utp  |-------------|      *
     |              | 10Mb switch |-------|media changer|      *
     |              |-------------|       |-------------|      *
     |                                           ;             *  
     |                              case A -->   ;    **********
     |                                           ;    *
 |--------------|                          |--------------|
 |IBM RS6000/365|                          |IBM RS6000/320|
 |--------------|                          |--------------|

diagram 2: test configuration with cdrom-server
----------
__________________________________________ ethernet stretch (many machines)
           |
     |-------------|  
     |media changer|  
     |-------------|  
           |utp
           |
     |-------------|  utp  |-------------|    
     | 10Mb switch |-------|media changer|      
     |-------------|       |-------------|      
          |                       |  
          |utp                    | ethernet
          |                       |
     |-----------------|   |--------------|
     |AXIS CDROM SERVER|   |IBM RS6000/320|
     |-----------------|   |--------------|

 
 
 

Post by Rick Jon » Sat, 02 Nov 1996 04:00:00


Sounds like it could be the capture effect. One way to verify would be
to run something like ttcp or netperf with different socket buffer
sizes - say 4096, 8192, 16384, 32768 and 57344.

One of the symptoms of the capture effect is that larger windows get
lower throughput.
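
The buffer-size sweep described above can be sketched in Python for
readers without ttcp or netperf at hand. This is a minimal illustration,
not a replacement for those tools: the function name run_once, the
1,000,000-byte transfer size and the use of loopback are all
assumptions. Loopback has no collisions, so it cannot show the capture
effect; for a real test the sender and the receiver would have to run on
the two machines in question.

```python
import socket
import threading
import time

BUFFER_SIZES = [4096, 8192, 16384, 32768, 57344]  # sizes suggested in the post


def run_once(bufsize, total_bytes=1_000_000, host="127.0.0.1"):
    """Push total_bytes through one TCP connection; return kbyte/s."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set before listen() so that accepted connections normally inherit it.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bufsize)
    srv.bind((host, 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def sink():
        conn, _ = srv.accept()
        with conn:
            while conn.recv(65536):  # read and discard until the sender closes
                pass

    receiver = threading.Thread(target=sink, daemon=True)
    receiver.start()

    payload = b"x" * 8192
    sent = 0
    start = time.perf_counter()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bufsize)
        s.connect((host, port))
        while sent < total_bytes:
            s.sendall(payload)
            sent += len(payload)
    receiver.join(timeout=30)  # wait until the receiver has drained the data
    elapsed = time.perf_counter() - start
    srv.close()
    return sent / 1000 / elapsed


if __name__ == "__main__":
    for size in BUFFER_SIZES:
        print(f"buffer {size:6d} bytes: {run_once(size):10.0f} kbyte/s")
```

If throughput on the real network drops as the buffer size grows, that
would be consistent with the symptom described above.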

rick jones
http://www.cup.hp.com/netperf/NetperfPage.html

PS - netperf also has a latency test that you could use to see how
much latency is added by the switch. I doubt it adds much, but it
still might be interesting to check.

 
 
 

Post by Colin Ri » Sun, 03 Nov 1996 04:00:00


I haven't got time to look properly, but as a quick suggestion, if you
are conducting testing when not connected to the rest of your network
and are using DNS on the test hosts then everything will grind to a
halt. Try removing your /etc/resolv.conf....

Colin Rice


 
 
 

Post by Wilfried Philip » Tue, 05 Nov 1996 04:00:00



> I haven't got time to look properly, but as a quick suggestion, if you
> are conducting testing when not connected to the rest of your network
> and are using DNS on the test hosts then everything will grind to a
> halt. Try removing your /etc/resolv.conf....

Well, this is certainly not the problem. During the test, we did
exactly what you suggest.
Thanks anyway.

----
Wilfried Philips                                    
Vakgroep ELIS (Electronica en Informatiesystemen)  
RUG                                                  
St.-Pietersnieuwstraat 41                          
B9000 Gent                                        

Tel: 32-9-264.33.85
Fax: 32-9-264.35.94

----

 
 
 

Post by Visj » Tue, 05 Nov 1996 04:00:00


> Well, this is certainly not the problem. During the test, we did
> exactly what you suggest.
> Thanks anyway.

Maybe it has something to do with the configuration you can see with
the command 'no -a'. An important part of that is tcp_sendspace and
tcp_recvspace (in your case maybe also udp_xxx). It's important that
the computers that communicate with each other have roughly the same
values. (I don't know why, but it solved our problem :)
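
One way to compare these values between two machines is to save the
'no -a' output from each and diff the relevant options. The sketch below
assumes the output consists of "name = value" lines; the helper names
are illustrative, the sample values are made up, and reading "udp_xxx"
as udp_sendspace and udp_recvspace is also an assumption.

```python
def parse_no_output(text):
    """Parse 'name = value' lines (the assumed layout of `no -a`) into a dict."""
    options = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            options[name.strip()] = value.strip()
    return options


def differing(a, b, names=("tcp_sendspace", "tcp_recvspace",
                           "udp_sendspace", "udp_recvspace")):
    """Return {name: (value_a, value_b)} for listed options whose values differ."""
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Made-up sample values, for illustration only.
host_a = parse_no_output("tcp_sendspace = 16384\ntcp_recvspace = 16384\n")
host_b = parse_no_output("tcp_sendspace = 16384\ntcp_recvspace = 4096\n")
print(differing(host_a, host_b))
```

With these sample inputs only tcp_recvspace is reported as differing.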

Maybe another hint: how about the 3-4-5 rule for ethernet? I had some
difficulty understanding your diagram, but maybe...
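
For reference, the rule mentioned here is usually called the 5-4-3
rule: within one shared (repeated) ethernet collision domain, the path
between any two stations should cross at most 5 segments and 4
repeaters, and at most 3 of those segments should have stations
attached. A minimal sketch of such a check follows; the function name
and the example counts are hypothetical and are not derived from the
diagrams in this thread.

```python
def check_5_4_3(segments, repeaters, populated_segments):
    """Return a list of 5-4-3 rule violations for one station-to-station path."""
    problems = []
    if segments > 5:
        problems.append(f"{segments} segments in the path (limit 5)")
    if repeaters > 4:
        problems.append(f"{repeaters} repeaters in the path (limit 4)")
    if populated_segments > 3:
        problems.append(f"{populated_segments} populated segments (limit 3)")
    return problems


# Hypothetical path counts, NOT taken from the diagrams in this thread.
print(check_5_4_3(segments=5, repeaters=4, populated_segments=3))  # within the rule
print(check_5_4_3(segments=6, repeaters=5, populated_segments=3))  # two violations
```

Bridges and switches separate collision domains, so the counts apply to
each repeated section on its own.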

Regards, Klaas Visser.

--

=-----------------------------------------------------------------=
 K.R. Visser    |      You want a T3 *and* correct routing?            

 blah blah      |      What is the world coming to these days.
=-----------------------------------------------------------------=

 
 
 

Post by Wilfried Philip » Wed, 06 Nov 1996 04:00:00



> > Well, this is certainly not the problem. During the test, we did
> > exactly what you suggest.
> > Thanks anyway.

> Maybe it has something to do with the configuration you can see with
> the command 'no -a'. An important part of that is tcp_sendspace and
> tcp_recvspace (in your case maybe also udp_xxx). It's important that
> the computers that communicate with each other have roughly the same
> values. (I don't know why, but it solved our problem :)

Well, we modified all network parameters that could be
modified. Anyway, the defaults should work, shouldn't they?

> Maybe another hint: how about the 3-4-5 rule for ethernet? I had some
> difficulty understanding your diagram, but maybe...

What is that? (I am not a network specialist and I did not design our
network.)

----
Wilfried Philips                                    
Vakgroep ELIS (Electronica en Informatiesystemen)  
RUG                                                  
St.-Pietersnieuwstraat 41                          
B9000 Gent                                        

Tel: 32-9-264.33.85
Fax: 32-9-264.35.94

----

 
 
 

1. Performance Problem with RS6000/350 Network I/F card

We have an RS6000 model 350 that we have connected to our
ethernet. It has a Type 2-8 ethernet interface which seems
to be a newer version of the rs6000 ethernet interface card.

We see a tremendous performance hit when we ftp files
back and forth. We isolated the two systems by having
a short piece of wire connecting the two machines
with no other traffic and see the same result.

We are running AIX 3.2.3 on both systems. Also,
it was quite a surprise to discover that in order
to switch between thick and thin wire on the card
we had to move a jumper on the card instead of
being able to do it via SMIT.

Has anyone else run into this problem?

        -Randy Marchany
        VA Tech Computing Center
        Blacksburg, VA 24060

