Newbie cache alignment question

Post by John Edward » Wed, 30 Jul 2003 07:11:25



 Forgive the newbie-type question, but I have a structure that
I'm trying to make 32 byte aligned for cache efficiency.

  I've consulted various sources on this but can't seem to
come up with a viable solution. My guess is extra padding?

The entire structure definition:

#ifdef _MY_ARRAY_ACCESS_STRUCT

    struct MYArray
    {
        MYInt8 *yArrayBuffer;
        MYSize  itemCount;
        MYSize  itemSize;
        MYSize  capacity;
    };

#endif

typedef struct MYArray MYArray;

      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated.    First time posters: Do this! ]

 
 
 

Newbie cache alignment question

Post by Ben Hutching » Thu, 31 Jul 2003 07:19:40




>  Forgive the newbie-type question, but I have a structure that
> I'm trying to make 32 byte aligned for cache efficiency.

>   I've consulted various sources on this but can't seem to
> come up with a viable solution. My guess is extra padding?

There isn't a truly portable way to do this.

> The entire structure definition:

> #ifdef _MY_ARRAY_ACCESS_STRUCT

Names beginning with underscore and a capital letter are reserved
for the implementation, so you should change this.  This is not
just a theoretical problem.

>     struct MYArray
>     {
>         MYInt8 *yArrayBuffer;
>         MYSize  itemCount;
>         MYSize  itemSize;
>         MYSize  capacity;
>     };

> #endif

> typedef struct MYArray MYArray;

This typedef is redundant in C++.

Now, to get an instance of this that's 32-byte aligned, you will
need to do something along these lines:

    enum { ALIGNMENT = 32, MASK = ALIGNMENT - 1 };
    char buf[sizeof(MYArray) + MASK];
    MYArray * array =
        new ((void *)(((size_t)buf + MASK) & ~MASK)) MYArray;

Note that casting between integer and pointer types is not truly
portable, though it works on most implementations.  Also note
that this assumes that 32-byte-alignment is sufficient for the
structure.

If you need an array of MYArray structs, each 32-byte-aligned,
then this should work:

    enum { ALIGNMENT = 32, MASK = ALIGNMENT - 1 };

    union MYArrayPadded
    {
        MYArray array;
        char padding[(sizeof(MYArray) + MASK) & ~MASK];
    };

    char buf[sizeof(MYArrayPadded) * n + MASK];  // n must be a compile-time constant
    MYArrayPadded * arrays =
        new ((void *)(((size_t)buf + MASK) & ~MASK))
            MYArrayPadded[n];


 
 
 

Newbie cache alignment question

Post by Stefan Heinzma » Thu, 31 Jul 2003 07:44:25



> Forgive the newbie-type question, but I have a structure that
> I'm trying to make 32 byte aligned for cache efficiency.

>   I've consulted various sources on this but can't seem to
> come up with a viable solution. My guess is extra padding?

> The entire structure definition:

> #ifdef _MY_ARRAY_ACCESS_STRUCT

>     struct MYArray
>     {
>         MYInt8 *yArrayBuffer;
>         MYSize  itemCount;
>         MYSize  itemSize;
>         MYSize  capacity;
>     };

> #endif

I know of no portable solution. However, since you're assuming that a
cache line  holds 32 bytes, you're probably not interested much in
portability. So it would be a good idea to peruse the documentation of
your compiler. Some allow you to control alignment explicitly.

You may also experiment with placement new. The idea is to allocate a
buffer that is somewhat larger than your struct size and construct
your struct within it at the correct address (that has the lower 5
bits set to zero).
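
The placement-new idea above can be sketched roughly as follows. This is only an illustration, not a portable guarantee: the member types of MYArray are guesses (MYInt8 and MYSize were never defined in the thread), the names kAlignment, align_up and construct_aligned are made up for this example, and std::uintptr_t assumes a C++11 or later compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Stand-in for the poster's struct; the member types are assumptions.
struct MYArray {
    signed char* yArrayBuffer;
    std::size_t  itemCount;
    std::size_t  itemSize;
    std::size_t  capacity;
};

const std::size_t kAlignment = 32;  // assumed cache-line size, must be a power of two

// Round an address up to the next multiple of kAlignment.
// The result has its lower 5 bits cleared when kAlignment is 32.
inline std::uintptr_t align_up(std::uintptr_t address) {
    const std::uintptr_t mask = kAlignment - 1;
    return (address + mask) & ~mask;
}

// Construct a MYArray at the first 32-byte boundary inside `buffer`.
// `buffer` must hold at least sizeof(MYArray) + kAlignment - 1 bytes.
// The integer/pointer round trip is implementation-defined, as noted
// elsewhere in the thread, but behaves as expected on common platforms.
inline MYArray* construct_aligned(unsigned char* buffer) {
    std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(buffer);
    void* where = reinterpret_cast<void*>(align_up(raw));
    return new (where) MYArray();  // value-initialised, so members start at zero
}
```

A caller would declare something like `unsigned char buffer[sizeof(MYArray) + kAlignment - 1];` and pass it to construct_aligned. The extra 31 bytes are the worst-case slack needed to reach the next boundary.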

> typedef struct MYArray MYArray;

I thought you were talking C++, not C, so what's the use of this
typedef?!?

Cheers
Stefan


 
 
 

Newbie cache alignment question

Post by llewell » Thu, 31 Jul 2003 07:58:00



>  Forgive the newbie-type question, but I have a structure that
> I'm trying to make 32 byte aligned for cache efficiency.

The standard says nothing about how to specify alignment, and
    alignment varies quite a bit depending on compiler, compiler
    flags, architecture, etc - so this isn't a newbie-type question,
    IMO. (Nor is the general answer simple.)

Several implementations have extensions with which to specify
    alignment. These are essentially unportable, but sometimes
    appropriate.

>   I've consulted various sources on this but can't seem to
> come up with a viable solution. My guess is extra padding?

Adding padding is another way, but IMO it's not much more portable
    than the aforementioned compiler-specific extensions. In both
    cases you must either decide your code will only support a
    particular implementation (and often, a particular set of compiler
    flags as well), or implement build-time logic which detects
    implementation characteristics and selects alignment specification
    extensions or padding amounts appropriately.
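
One possible shape for such build-time logic is a macro that picks whichever alignment extension the compiler offers. The sketch below is an assumption-laden illustration: MY_ALIGN and MYArrayAligned are invented names, the member types are guesses, and the final branch relies on the standard alignas specifier, which only exists in C++11 and later (well after this thread was written).

```cpp
#include <cstddef>

// MY_ALIGN is a hypothetical macro introduced for this sketch.
#if defined(_MSC_VER)
#  define MY_ALIGN(n) __declspec(align(n))         // Visual C++ extension
#elif defined(__GNUC__)
#  define MY_ALIGN(n) __attribute__((aligned(n)))  // GCC-compatible extension
#else
#  define MY_ALIGN(n) alignas(n)                   // standard C++11 and later
#endif

// The specifier sits between `struct` and the name, a position that
// all three spellings accept.
struct MY_ALIGN(32) MYArrayAligned {
    signed char* yArrayBuffer;  // member types are assumptions
    std::size_t  itemCount;
    std::size_t  itemSize;
    std::size_t  capacity;
};
```

With any of these, the compiler also rounds sizeof(MYArrayAligned) up to a multiple of 32, so elements of a plain array stay aligned. Note that this controls static and automatic storage; whether heap allocations honour the alignment depends on the allocator.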

> The entire structure definition:

> #ifdef _MY_ARRAY_ACCESS_STRUCT

>     struct MYArray
>     {
>         MYInt8 *yArrayBuffer;
>         MYSize  itemCount;
>         MYSize  itemSize;
>         MYSize  capacity;
>     };

If MYSize is 4 bytes wide, and MYInt8* is also 4 bytes wide, the
    struct is 16 bytes, so 16-byte alignment will give better caching
    than 32-byte alignment, because two structs will precisely fill
    32 bytes.
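
The arithmetic behind that claim can be checked with a small helper that counts how many cache lines an object touches. lines_touched is a name made up for this sketch, and the 32-byte line size is the original poster's assumption rather than a property of any particular CPU.

```cpp
#include <cstddef>

// Number of cache lines touched by an object of `size` bytes (size > 0)
// that starts `offset` bytes from a line boundary.
inline std::size_t lines_touched(std::size_t offset,
                                 std::size_t size,
                                 std::size_t line_size) {
    std::size_t first = offset / line_size;
    std::size_t last  = (offset + size - 1) / line_size;
    return last - first + 1;
}
```

For a 16-byte struct and 32-byte lines, offsets 0 and 16 each stay within one line, so two 16-byte-aligned structs share a line with no waste. An offset such as 24 makes the struct straddle two lines, which is the case alignment is meant to avoid.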

> #endif

> typedef struct MYArray MYArray;

[snip]


 
 
 

Newbie cache alignment question

Post by Dhru » Thu, 31 Jul 2003 08:11:01



>  Forgive the newbie-type question, but I have a structure that
> I'm trying to make 32 byte aligned for cache efficiency.

>   I've consulted various sources on this but can't seem to
> come up with a viable solution. My guess is extra padding?

Yes, extra padding, but that is system dependent. You could try compiler
switches, which do all of that for you, and make it more portable (if all
compilers that you use support some sort of padding arguments).

> The entire structure definition:

> #ifdef _MY_ARRAY_ACCESS_STRUCT

>     struct MYArray
>     {
>         MYInt8 *yArrayBuffer;
>         MYSize  itemCount;
>         MYSize  itemSize;
>         MYSize  capacity;
>     };

> #endif

What is MYSize, and MYInt8?

> typedef struct MYArray MYArray;

Not needed in C++.


 
 
 

Newbie cache alignment question

Post by Jon Hein » Thu, 31 Jul 2003 17:16:40


 > Forgive the newbie-type question, but I have a structure that
 > I'm trying to make 32 byte aligned for cache efficiency.

no worries. this is a good question. generally, if you're worried
about alignment, then you are likely not worried about portability. in
that case, you should work with your compiler to ensure alignment.

 >   I've consulted various sources on this but can't seem to
 > come up with a viable solution. My guess is extra padding?

don't do the padding yourself. let the compiler worry about the
padding, rather than you trying to count bytes. and padding does not
really guarantee alignment! the VC++ compiler, for instance, supports
the following:

__declspec( align( 16 ) )  // align to 16 byte boundary
struct MyStruct
{

};

further, you may be interested in checking the documentation about
packing  as well as disabling warnings in your compiler so that it
does not spew about padded structures. something like the following
pragma.

#pragma warning( disable: 4324 ) // structure was padded
                                  // due to __declspec( align())

there are several other suggestions, but it depends heavily on your
compiler, and specifically what you are trying to do. good luck!

regards,
jon heiner


 
 
 

Newbie cache alignment question

Post by Matthew Towle » Fri, 01 Aug 2003 21:38:26




 > Forgive the newbie-type question, but I have a structure that
 >I'm trying to make 32 byte aligned for cache efficiency.
 >
 >  I've consulted various sources on this but can't seem to
 >come up with a viable solution. My guess is extra padding?
 >

Usually this sort of optimisation is only worth doing on embedded
systems where every cycle counts.  On such systems you can usually
customise how operator new allocates its memory, and just make it
always allocate on 32 byte boundaries.  Often the heap is already set up
to allocate on 8 byte boundaries as this is a good general tradeoff
between good alignment and wasting memory.

This will obviously waste a lot of memory for smaller items, so you
could instead overload the class's operator new and delete
to do this for you.
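
A class-specific operator new and delete along those lines might look like the sketch below. It is one common technique (over-allocate, round up, stash the original pointer just below the aligned address), not something taken from the posts above. AlignedNode and its payload member are hypothetical, and std::uintptr_t and nullptr assume C++11 or later.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

struct AlignedNode {  // hypothetical class used only for illustration
    int payload[4];

    static void* operator new(std::size_t size) {
        const std::size_t alignment = 32;  // assumed cache-line size
        // Room for the object, worst-case slack, and one saved pointer.
        void* raw = ::operator new(size + alignment - 1 + sizeof(void*));
        std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t aligned =
            (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
        // Remember the original block in the slot just below the object.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<void*>(aligned);
    }

    static void operator delete(void* p) {
        if (p != nullptr) {
            // Recover and release the block that operator new obtained.
            ::operator delete(static_cast<void**>(p)[-1]);
        }
    }
};
```

Only objects of this class pay the extra 31 bytes plus one pointer per allocation, so small unrelated allocations are unaffected. Array forms (operator new[] and operator delete[]) are not covered here and would need the same treatment.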

I do not think it is worth worrying about this for stack based data,
as the stack will be in cache anyway and the best thing you can do is
use as little space as possible (so packing will probably hinder
things).

As with all optimisation, get everything working first, then try
changing something and measure the effect.  Often these sorts of
optimisations can go the opposite way to what you expect.

Matt


 
 
 

1. Cache alignment and dib-sections

I use CreateDibSection to create a bitmap.
This function returns a pointer to the memory-location of this bitmap.

This is all working fine.

But now I want to make sure this bitmap-data (memory location) is cache
aligned.
I can not supply this function with any alignment data, and also not with a
memory-location I can allocate and align myself.

According to the Win32 API documentation I must never use CreateBitmap(This
function accepts a self allocated memory-location) for performance reasons.

Is this done through the use of a FileMapping object, and if so, how can I
use the CreateFileMapping api to allocate the number of bytes for my bitmap,
which are also cache aligned(cache line size is 32 or 64 bytes)?

Many thanks in advance......

2. SANE, anyone?

3. Cache alignment and dibsections

4. R2 defaults

5. Pb with cache group constants (IE4 Cache management)

6. A3000-Problems

7. (newbie) makefile question (newbie)

8. Palm apps come back

9. Question on required 4-byte parameter alignment in Win32

10. fread byte alignment (NOT a GDI question)

11. structs, sizes and alignment - question

12. An alignment-related question

13. Question on FSD and file system cache