Athlon cache-line fix

Post by Akira Tsukamot » Sun, 03 Nov 2002 09:10:04



This is a fix for Athlon cache-line.

For Athlon CPU, CONFIG_X86_MK7,
the X86_L1_CACHE_SHIFT is set to 6, 128 Bytes, and this value is used
for L1 cache alignment.

But AMD's documentation clearly states that the cache line for
Athlon is 64 bytes.
When I set X86_L1_CACHE_SHIFT = 5, performance increased
significantly, by about 30%.

These are measurements from Taka's simple socket benchmark program:
http://www.suna-asobi.com/~akira-t/linux/netio-bench/netio2.c

This is the result for X86_L1_CACHE_SHIFT = 6:
(off:100, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.117 seconds at 341.6 Mbytes/sec
(off:104, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.116 seconds at 343.9 Mbytes/sec
(off:108, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.116 seconds at 345.4 Mbytes/sec
(off:112, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.115 seconds at 348.7 Mbytes/sec
(off:116, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.114 seconds at 352.4 Mbytes/sec
(The entire log is here:
http://www.suna-asobi.com/~akira-t/linux/cache-align-fix/K7_cache_shi...)

This is the result for X86_L1_CACHE_SHIFT = 5:
(off:100, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.086 seconds at 462.4 Mbytes/sec
(off:104, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.087 seconds at 458.5 Mbytes/sec
(off:108, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.087 seconds at 461.8 Mbytes/sec
(off:112, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.088 seconds at 453.9 Mbytes/sec
(off:116, size:0x800000)
send/recv: copied 40.0 Mbytes in 0.088 seconds at 456.7 Mbytes/sec
(The entire log is here:
http://www.suna-asobi.com/~akira-t/linux/cache-align-fix/K7_cache_shi...)
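
As a rough check of the "about 30%" figure, the five samples quoted above for each setting can be averaged and compared. The following is a minimal sketch in C; the function and array names are illustrative and do not come from netio2.c, and the arrays hold only the five excerpted offsets, not the full logs.

```c
#include <stddef.h>

/* Throughput in Mbytes/sec for offsets 100..116, copied from the excerpts above. */
const double shift6_mbps[] = { 341.6, 343.9, 345.4, 348.7, 352.4 };
const double shift5_mbps[] = { 462.4, 458.5, 461.8, 453.9, 456.7 };

/* Arithmetic mean of n samples; returns 0.0 for an empty array. */
double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return n ? sum / (double)n : 0.0;
}

/* Relative gain of `test` over `base`, in percent, comparing the means. */
double speedup_percent(const double *base, const double *test, size_t n)
{
    return (mean(test, n) / mean(base, n) - 1.0) * 100.0;
}
```

Calling speedup_percent(shift6_mbps, shift5_mbps, 5) on these samples gives a gain of a little over 32%, which is in line with the rough 30% quoted for this benchmark.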

I have attached a patch to fix this, but I am a bit worried that somebody
might revert this change because the Athlon has a 128bytes L1.
(Athlon L1: data 64bytes + instruction 64bytes = total 128bytes)

(I found this problem by accident while I was writing a faster
user_to/from_copy function, inspired by Taka's faster_intel_copy,
which went into 2.5.45.)

--- linux-2.5.45/arch/i386/Kconfig      Thu Oct 31 22:40:01 2002

 config X86_L1_CACHE_SHIFT
        int
-       default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MPENTIUMIII || M686 || M586MMX || M586TSC || M586
+       default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MK7|| MPENTIUMIII || M686 || M586MMX || M586TSC || M586
        default "4" if MELAN || M486 || M386
-       default "6" if MK7
        default "7" if MPENTIUM4

 config RWSEM_GENERIC_SPINLOCK

--

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

Athlon cache-line fix

Post by steve roeme » Sun, 03 Nov 2002 21:50:10


it speeds mine up too.

-steve

-----Original Message-----


Sent: Saturday, November 02, 2002 12:04 AM

Cc: Hirokazu Takahashi; Andrew Morton
Subject: [PATCH] Athlon cache-line fix


  netio_test_shift_6.txt
35K Download

  netio_test_shift_5.txt
35K Download

 
 
 

Athlon cache-line fix

Post by Andrew Kanabe » Mon, 04 Nov 2002 01:20:09



> For Athlon CPU, CONFIG_X86_MK7,
> the X86_L1_CACHE_SHIFT is set to 6, 128 Bytes

Eh? L1_CACHE_BYTES is defined as (1 << L1_CACHE_SHIFT) in
include/asm-i386/cache.h, which makes for a cache line size of 64 bytes
which is right. Perhaps you were assuming the cache line size was
2 << L1_CACHE_SHIFT ?
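
To make the arithmetic concrete: with L1_CACHE_BYTES defined as (1 << L1_CACHE_SHIFT), a shift of 6 gives 64 bytes and a shift of 5 gives 32 bytes, whereas 2 << 6 would give 128. The sketch below is a standalone illustration, not kernel code; cache_line_bytes and cache_line_align are hypothetical helper names, and the round-up is the usual power-of-two alignment idiom rather than a copy of any kernel macro.

```c
#include <stddef.h>

/* Line size implied by a shift value, mirroring (1 << L1_CACHE_SHIFT). */
static inline size_t cache_line_bytes(unsigned int shift)
{
    return (size_t)1 << shift;
}

/*
 * Round x up to the next multiple of the line size.
 * Works because the line size is always a power of two.
 */
static inline size_t cache_line_align(size_t x, unsigned int shift)
{
    size_t line = cache_line_bytes(shift);
    return (x + line - 1) & ~(line - 1);
}
```

For example, a 65-byte object is padded to 128 bytes with a shift of 6 but only to 96 bytes with a shift of 5, so the shift value changes how much padding aligned structures carry as well as where they start.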

>  config X86_L1_CACHE_SHIFT
>         int
> -       default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MPENTIUMIII || M686 || M586MMX || M586TSC || M586
> +       default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MK7|| MPENTIUMIII || M686 || M586MMX || M586TSC || M586
>         default "4" if MELAN || M486 || M386
> -       default "6" if MK7
>         default "7" if MPENTIUM4

Regardless of the above this patch can't be right: the PIII's cache line
size is 32 bytes and the P4's is 128 bytes. Interesting that it increases
performance (on at least one benchmark) though.

Andrew

 
 
 

Athlon cache-line fix

Post by Akira Tsukamot » Mon, 04 Nov 2002 06:10:06


Thank you for trying it.

Akira

On Sat, 02 Nov 2002 13:40:39 -0600

> it speeds mine up too.

> -steve

> -----Original Message-----
> Subject: [PATCH] Athlon cache-line fix

> This is a fix for Athlon cache-line.

 
 
 

Athlon cache-line fix

Post by Akira Tsukamot » Mon, 04 Nov 2002 06:30:09


On Sat, 2 Nov 2002 23:09:45 +0000


> > For Athlon CPU, CONFIG_X86_MK7,
> > the X86_L1_CACHE_SHIFT is set to 6, 128 Bytes

> Eh? L1_CACHE_BYTES is defined as (1 << L1_CACHE_SHIFT) in
> include/asm-i386/cache.h, which makes for a cache line size of 64 bytes
> which is right. Perhaps you were assuming the cache line size was
> 2 << L1_CACHE_SHIFT ?

Yes, it is 32 bytes. :)
I think I had not slept enough.

> Interesting that it increases
> performance (on at least one benchmark) though.

I also tried many times, and it increases performance.

--


 
 
 
