(Please forgive if this has been posted twice. The first posting
didn't show up on our own system.)
There has been some discussion of the compatibility of
the Motorola 68851 (PMMU) and the 68030 on-chip MMU.
Neither the PMMU nor the 68030 MMU constitutes a proper superset of the other.
The PMMU has instructions and registers not found in the 68030 MMU,
while the latter has registers not on the PMMU. However,
in a typical Unix implementation little work would be needed to port
PMMU-specific code to the 68030.
The important differences between the PMMU and the 68030 MMU in a Unix
environment, as I see it, are:
(1) The PMMU has a 64 entry cache with 8 process id tags.
The 68030 MMU has a 22 entry cache with no process id tags.
(2) The 68030 has two transparent translation registers (TTRs) that
pass the address through untranslated. Effectively, they are
cache hits not requiring cache entries.
As mentioned, there are no process ids attached to the 68030 cache
entries, so the cache must be cleared and reloaded on every context switch.
That, together with limiting the cache to 22 entries, seems likely to
result in performance degradation. However, with only 22 entries,
leaving out the tags may not be the wrong approach.
My beliefs on the effects of the cache reduction on performance are
intuitive. Does anyone have actual measurements of the effect of cache
size, with and without process id tags, in a Unix environment or in
other environments?
If kernel text and data are mapped transparently, then use of the transparent
translation registers is in effect always a cache hit for the kernel, whether
or not the kernel exhibits good locality of reference. Further, no cache
entries are required for addresses mapped by the TTRs.
We will use the TTRs to map RAM and our other board addresses. RAM runs from
0 up to 64 MB, but the other board space begins at 0x20000000. So we will
use one TTR for kernel RAM, and one for the hardware addresses.
However, the TTRs are programmed with a base address and mask,
like the 68451 descriptors, as opposed to a more sensible base
address and limit, so the memory map must be laid out to fit what
a base and mask can describe. The minimum area mapped by a TTR is 16 MB.
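
To make the base and mask scheme concrete, here is a rough sketch in C of
how I understand the match to work. It models only the address base and
mask fields (the upper 8 address bits, which is where the 16 MB minimum
comes from) and leaves out the enable, function code and read/write fields.
The struct, the function name and the two settings are my own illustration,
and the 16 MB size for the board space is an assumption, not something
taken from our actual map.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified model of a TTR: address base and mask fields only.
 * The real register has further control fields, omitted here. */
struct ttr {
    uint8_t base;  /* compared against address bits 31-24 */
    uint8_t mask;  /* a 1 bit means "ignore this address bit" */
};

/* True if addr lies in the region this TTR passes through untranslated. */
static bool ttr_match(struct ttr t, uint32_t addr)
{
    uint8_t top = (uint8_t)(addr >> 24);
    return ((top ^ t.base) & (uint8_t)~t.mask) == 0;
}

/* Hypothetical settings for the memory map described above. */
static const struct ttr ram_ttr = { 0x00, 0x03 };  /* 0x00000000-0x03FFFFFF, 64 MB */
static const struct ttr io_ttr  = { 0x20, 0x00 };  /* 0x20000000-0x20FFFFFF, assumed 16 MB */
```

With these values ttr_match(ram_ttr, 0x03FFFFFF) is true and
ttr_match(ram_ttr, 0x04000000) is false. A region such as 48 MB of RAM
cannot be described exactly this way, which is the practical cost of
having a mask rather than a limit.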
With the PMMU a "shared globally" bit can be set in a long (8 byte)
descriptor. The effect is that the cache entry is valid for all
process ids. However, each such page still requires a cache
entry, so non-locality of the kernel could be an issue if
kernel pages take up too many cache entries.
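
As a rough illustration of the tag matching just described, here is a
sketch in C of a lookup in a tagged cache with a "shared globally" flag.
It is a software model only: the real cache compares its entries
associatively in hardware, and the struct layout, field names and
function name here are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one translation cache entry. */
struct atc_entry {
    bool     valid;
    bool     shared_globally; /* matches regardless of process id tag */
    uint8_t  task_alias;      /* process id tag, 0..7 on the PMMU */
    uint32_t logical_page;
    uint32_t physical_page;
};

/* Return the matching entry, or NULL on a miss (which would force a
 * table walk). A linear scan stands in for the associative compare. */
static const struct atc_entry *
atc_lookup(const struct atc_entry *atc, size_t n,
           uint8_t current_alias, uint32_t logical_page)
{
    for (size_t i = 0; i < n; i++) {
        const struct atc_entry *e = &atc[i];
        if (!e->valid || e->logical_page != logical_page)
            continue;
        if (e->shared_globally || e->task_alias == current_alias)
            return e;
    }
    return NULL;
}
```

A kernel page entered with shared_globally set hits for every process,
but it still occupies one entry, which is the point made above.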
(I seem to remember some past discussion on kernel non-locality,
but don't remember if the effects on overall cache hit rate
within a Unix environment were reported.)