CPU Cache Overview
To improve performance, some CPUs of the 68k family are able to cache
memory accesses.
Caches are always addressed using logical addresses, including the
function code of the access. This means that accesses in User Mode and
Supervisor Mode create different cache entries (please consult the
Motorola documentation for more information).
The following is an overview of the caching capabilities of the 68k CPUs:
- 68000
  nothing
- 68010
  - Instruction Prefetch
    two words prefetch, one word decoding register
  - Loop Mode
    entered if a one-word instruction is followed by a DBcc that loops
    back to that instruction; no further instruction fetches occur until
    the loop ends
- 68020
  - Instruction Prefetch
    one long word
  - Instruction Cache
    64 entries of 4 bytes = 256 bytes
    can be enabled or frozen via the CACR
- 68030
  - Instruction Prefetch
    one long word
  - Instruction Cache
    16 lines of 16 bytes = 256 bytes
    can be enabled or frozen via the CACR
    Burst mode forces whole cache lines to be read at once if the
    hardware supports it
  - Data Cache
    16 lines of 16 bytes = 256 bytes
    can be enabled or frozen via the CACR
    always WriteThrough
    selectable Write Allocation mode, which makes write operations
    invalidate entries belonging to the other mode (user/supervisor)
    Burst mode forces whole cache lines to be read at once if the
    hardware supports it
- 68040
  - Instruction Prefetch
    one long word
  - Instruction Cache
    256 lines of 16 bytes = 4096 bytes
    can be enabled via the CACR
  - Data Cache
    256 lines of 16 bytes = 4096 bytes
    can be enabled via the CACR
    selectable modes CopyBack/WriteThrough via the MMU
- 68060
  - Instruction Prefetch
    one long word
  - Instruction Cache
    512 lines of 16 bytes = 8192 bytes
    can be enabled, frozen and reduced to half size via the CACR
  - Branch Cache
    can be enabled via the CACR
    not affected by the MMU setup!
  - Superscalar Dispatch
    can be enabled via the CACR
  - Data Cache
    512 lines of 16 bytes = 8192 bytes
    can be enabled, frozen and reduced to half size via the CACR
    selectable modes CopyBack/WriteThrough via the MMU
  - Push Buffer
    can be disabled via the PCR
  - Store Buffer
    can be enabled via the CACR
    pages must not be NonCacheable Serialized (Precise)
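The cache sizes in the overview follow from entries times bytes per entry. The short Python sketch below only redoes that arithmetic; the table and function names are made up for illustration. The 68020 is entered as 64 long-word entries, which is how Motorola's manual organizes its 256-byte cache; the total is the same as in the overview.

```python
# (entries, bytes per entry) for each on-chip cache.
# Hypothetical table for illustration; not part of WHDLoad.
CACHES = {
    ("68020", "instruction"): (64, 4),
    ("68030", "instruction"): (16, 16),
    ("68030", "data"): (16, 16),
    ("68040", "instruction"): (256, 16),
    ("68040", "data"): (256, 16),
    ("68060", "instruction"): (512, 16),
    ("68060", "data"): (512, 16),
}


def cache_size(cpu, kind):
    """Return the total cache size in bytes."""
    entries, entry_bytes = CACHES[(cpu, kind)]
    return entries * entry_bytes


for (cpu, kind) in CACHES:
    print(cpu, kind, cache_size(cpu, kind), "bytes")
```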
The most important thing to understand is that the caches on the
68030..68060 are controlled by both the Cache Control Register (CACR) and
the MMU! The CACR enables or disables the caches globally. The MMU marks
for each individual page (4 KiB with WHDLoad) how it may be cached.
On the 68030 a memory page can be Cacheable or NonCacheable. On a
68040/68060 it can be cacheable WriteThrough, cacheable CopyBack,
NonCacheable (Imprecise) or NonCacheable Serialized (Precise).
If WHDLoad does not use the MMU, it controls only the CACR.
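The interplay of CACR and MMU can be pictured as a two-level decision: the CACR acts as a global switch, and the page mode decides per page. The following Python sketch is a simplified model for the 68040/68060 page modes only. All names in it are hypothetical, and it ignores external hardware that may veto caching.

```python
from enum import Enum


class PageMode(Enum):
    """MMU page modes on the 68040/68060, as named in the text."""
    WRITE_THROUGH = "cacheable WriteThrough"
    COPY_BACK = "cacheable CopyBack"
    NON_CACHEABLE = "NonCacheable (Imprecise)"
    NON_CACHEABLE_SERIALIZED = "NonCacheable Serialized (Precise)"


CACHEABLE_MODES = {PageMode.WRITE_THROUGH, PageMode.COPY_BACK}


def access_is_cached(cacr_cache_enabled, page_mode, mmu_used=True):
    """Model: an access is cached only if CACR and MMU both allow it."""
    if not cacr_cache_enabled:
        return False          # globally disabled in the CACR
    if not mmu_used:
        return True           # without the MMU only the CACR decides
    return page_mode in CACHEABLE_MODES


print(access_is_cached(True, PageMode.COPY_BACK))
print(access_is_cached(True, PageMode.NON_CACHEABLE))
print(access_is_cached(False, PageMode.COPY_BACK))
```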
Default Cache Setup
By default the areas of WHDLoad, the Slave and ExpMem are marked as
cacheable CopyBack. The BaseMem area is marked as NonCacheable, and the
data and instruction caches are enabled in the CACR. So the program located
in the BaseMem area runs without caches, while WHDLoad, the Slave and
ExpMem use the caches for best performance. If WHDLoad does not use the
MMU, this setup results in both caches being disabled: without the MMU,
different memory regions cannot be given different setups, so as soon as
any region is marked NonCacheable all caches have to be disabled.
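The default setup and its fallback without the MMU can be written down as a small decision function. This is a Python sketch of the rule described above, not of WHDLoad's actual code; the function name and the dictionary layout are assumptions made for illustration.

```python
def default_cache_setup(mmu_used):
    """Return the default setup as described in the text (simplified)."""
    regions = {
        "WHDLoad": "CopyBack",
        "Slave": "CopyBack",
        "ExpMem": "CopyBack",
        "BaseMem": "NonCacheable",
    }
    if mmu_used:
        # Per-page modes are possible, so both caches stay enabled.
        return {"cacr": {"instruction": True, "data": True},
                "pages": regions}
    # Without the MMU one NonCacheable region forces all caches off.
    enabled = not any(mode == "NonCacheable" for mode in regions.values())
    return {"cacr": {"instruction": enabled, "data": enabled},
            "pages": None}


print(default_cache_setup(mmu_used=True)["cacr"])
print(default_cache_setup(mmu_used=False)["cacr"])
```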
Programmers Cache Control
There are two resload functions to control the caches: resload_SetCACR
and resload_SetCPU. resload_SetCACR is the historically older routine and
can be fully replaced by resload_SetCPU (WHDLoad internally maps the
arguments of resload_SetCACR and calls resload_SetCPU). Nevertheless,
resload_SetCACR is recommended for anyone who does not know everything
about caches and their behavior in an Amiga system. With resload_SetCACR
the instruction and data caches can be enabled or disabled separately.
resload_SetCACR affects only the cacheability of the BaseMem area.
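A Slave calls resload_SetCACR from 68k assembly, so the Python sketch below only models how enabling the two caches separately could work. It assumes a "new bits plus mask" argument convention like that of exec.library's CacheControl, and it uses the CACRF_EnableI and CACRF_EnableD bit values from the AmigaOS exec includes. The function set_cacr is hypothetical; please check the WHDLoad autodocs for the real calling convention.

```python
# Bit values as defined in the AmigaOS exec includes.
CACRF_EnableI = 1 << 0   # instruction cache enable
CACRF_EnableD = 1 << 8   # data cache enable


def set_cacr(current, new, mask):
    """Hypothetical model: change only the bits selected by mask.

    Returns (old setup, updated setup).
    """
    updated = (current & ~mask) | (new & mask)
    return current, updated


# Enable the instruction cache and switch the data cache off.
old, now = set_cacr(CACRF_EnableD,
                    new=CACRF_EnableI,
                    mask=CACRF_EnableI | CACRF_EnableD)
print(hex(old), hex(now))
```

Bits that are not part of the mask keep their state, which is what allows one cache to be changed without touching the other.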
User Cache Control
If the programmer has done a clean job, the user has nothing to do
regarding the caches because all required setup is already done by the
Slave.
Nevertheless there may be two reasons for manually changing the cache
setup: first, to make an install work that has problems because it runs
too fast (e.g. creating graphic errors), and second, to make an installed
program run faster.
To make a crashing program work, the option NoCache can be used. This
option disables all caches and marks all memory as NonCacheable Serialized
(Precise). If the machine has 32-bit Chip-Memory it will still be faster
than an original A500.
To make an installed program faster, some options can be set to enable
caches. These override the setup made by the Slave. On the 68020 the option
Cache can be set. On the 68030 the option DCache can also be used, which
includes the option Cache. On the 68060 there are some more options:
BranchCache, StoreBuffer and SuperScalar. The option ChipNoCache/S can
improve the performance on the 68040 and 68060, see below.
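The speed-related options can be summarized per CPU. The Python sketch below is only a lookup built from the paragraph above; the function name is hypothetical, and it assumes that Cache and DCache remain available on the 68040 and 68060, which the text implies but does not state outright.

```python
def speed_options(cpu):
    """Return the options that may speed up an install on a given CPU.

    cpu is the model number as an integer, e.g. 68030.
    """
    options = []
    if cpu >= 68020:
        options.append("Cache")
    if cpu >= 68030:
        options.append("DCache")        # includes option Cache
    if cpu >= 68040:
        options.append("ChipNoCache")   # helps only on some hardware
    if cpu >= 68060:
        options += ["BranchCache", "StoreBuffer", "SuperScalar"]
    return options


for cpu in (68000, 68020, 68030, 68040, 68060):
    print(cpu, speed_options(cpu))
```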
Cacheability of Chip-Memory
Cacheability is determined not only by the CPU itself (CACR) and the MMU
setup but also by external hardware. The CPU signals on the bus whether it
tries to cache an access. External hardware can signal the CPU (after an
address has been put on the address bus during a memory access) that an
access must not be cached.
This mechanism, by which hardware tells the CPU whether memory is
cacheable, is used on all (AFAIK) Amigas and CPU boards containing
CPUs >= 68030, because these have a data cache. Affected are the whole
Chip-Memory and the IO-Space (CIA/Custom/RTC), which must not be cached
by the data cache. This is necessary to avoid cache inconsistencies
caused by DMA activity or by hardware registers being volatile.
The reaction of the CPU to a hardware reject of a cacheable access varies
between the different CPUs. On the 68030 there is no impact on the
performance of the access; the data will simply not be cached. On the 68040
read accesses are performed at full speed, but write accesses (CopyBack)
are canceled and restarted without cacheability, which makes them around
5 times (depending on hardware and CPU speed) slower than a noncacheable
access. On the 68060 both read and write accesses are canceled and
restarted. Read accesses are around 3 times slower and write accesses
around 5 times.
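To get a feeling for what these factors mean for a routine that touches Chip-Memory through pages marked cacheable, the rough figures above can be put into a small table. This Python sketch uses only the approximate factors from the text (1 means no penalty); the names are hypothetical, and real numbers depend on hardware and CPU speed.

```python
# Approximate slowdown of a rejected cacheable access relative to a
# plain noncacheable access, taken from the text (1 = no penalty).
REJECT_SLOWDOWN = {
    "68030": {"read": 1, "write": 1},
    "68040": {"read": 1, "write": 5},   # write penalty applies to CopyBack
    "68060": {"read": 3, "write": 5},
}


def relative_cost(cpu, reads, writes):
    """Cost in units of one noncacheable access."""
    factors = REJECT_SLOWDOWN[cpu]
    return reads * factors["read"] + writes * factors["write"]


for cpu in REJECT_SLOWDOWN:
    print(cpu, relative_cost(cpu, reads=10, writes=10))
```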
The issues mentioned apply to data accesses. Instruction accesses are
usually not affected and remain cacheable inside the Chip-Memory. There is
some (maybe broken) hardware which does not allow instructions to be cached
in Chip-Memory. On such hardware the option ChipNoCache/S should be used to
avoid a major slowdown in execution speed, because otherwise instruction
accesses will be around 2 times slower.
This behavior can be checked by executing the Speed.Slave contained
in the src/memory-speed directory of the developer archive.
Burst Mode
The Burst mode on the 68030 tells the CPU to always read a full cache line
(16 bytes) when a cache miss occurs, instead of only the long word which
was requested. The Burst mode must be supported by the hardware; if it is
not, no burst happens and there is no time penalty. The Burst mode can be
enabled separately for the instruction and the data cache. Because a burst
access takes longer than a single access, the Burst mode gives a
performance advantage only if most of the entries in the cache line are
also used before the cache line gets flushed. For the instruction cache the
Burst mode usually improves the performance. For the data cache it often
does so only in scenarios where consecutive memory reads occur. WHDLoad
enables the instruction Burst together with the instruction cache starting
with version 18.0. The data Burst will not be enabled by WHDLoad.
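The trade-off can be expressed as a simple break-even check: a burst fill pays off when it costs less than fetching the long words that are actually used one by one. The Python sketch below uses made-up cycle counts (4 cycles for a single long word, 7 cycles for a burst of four); these are hypothetical and depend entirely on the hardware.

```python
def burst_pays_off(longs_used, single_cost, burst_cost):
    """True if one burst fill is cheaper than single fetches.

    longs_used: long words of the 16-byte line that get used before
                the line is flushed (1..4)
    single_cost, burst_cost: access times in cycles (hypothetical)
    """
    if not 1 <= longs_used <= 4:
        raise ValueError("a 16-byte line holds 1 to 4 long words")
    return burst_cost < longs_used * single_cost


# Hypothetical timings: 4 cycles per single access, 7 per burst.
for used in range(1, 5):
    print(used, burst_pays_off(used, single_cost=4, burst_cost=7))
```

With these assumed timings the burst already wins when two of the four long words are used, which fits the observation that sequentially executed code usually benefits.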
Write Allocation
Write Allocation controls the cache handling on the 68030 when a cache miss
occurs on a write operation. Write Allocation must be enabled when parts of
the installed program run in User Mode. If the installed program runs only
in Supervisor Mode, Write Allocation can be disabled, which may give a
minimal performance advantage.
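Why User Mode requires Write Allocation can be seen in a toy model. Because cache entries are tagged with the access mode, a supervisor write to an address that was cached by a user read is a miss; without Write Allocation the user entry stays in the cache and becomes stale. The Python sketch below is a strongly simplified model (one value per line, no valid bits), not a description of the real 68030 logic, and all names are hypothetical.

```python
USER, SUPER = "user", "supervisor"


class TinyDataCache:
    """Simplified direct-mapped, write-through cache tagged by mode."""

    def __init__(self, write_allocate):
        self.write_allocate = write_allocate
        self.lines = {}   # index -> (mode, address, value)

    def _index(self, address):
        return (address >> 4) & 0xF   # 16 lines of 16 bytes

    def _hit(self, mode, address):
        entry = self.lines.get(self._index(address))
        return entry is not None and entry[:2] == (mode, address)

    def read(self, memory, mode, address):
        if self._hit(mode, address):
            return self.lines[self._index(address)][2]
        value = memory[address]       # miss: fetch and fill the line
        self.lines[self._index(address)] = (mode, address, value)
        return value

    def write(self, memory, mode, address, value):
        memory[address] = value       # always write-through
        if self._hit(mode, address) or self.write_allocate:
            # On a miss only Write Allocation replaces the old entry.
            self.lines[self._index(address)] = (mode, address, value)


def demo(write_allocate):
    """Value a user-mode read sees after a supervisor-mode write."""
    memory = {0x100: 1}
    cache = TinyDataCache(write_allocate)
    cache.read(memory, USER, 0x100)       # user entry gets cached
    cache.write(memory, SUPER, 0x100, 2)  # supervisor changes memory
    return cache.read(memory, USER, 0x100)


print("without Write Allocation:", demo(False))
print("with Write Allocation:   ", demo(True))
```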
Branch Cache
The Branch Cache is only available in the 68060. It is a kind of
instruction cache for branch instructions. But unlike the instruction
cache it is not affected by the MMU setup! That means even when the
corresponding memory page is marked as NonCacheable, branch instructions
will be cached if the Branch Cache is enabled.
Read the Motorola Microprocessors User Manuals for further information.
If you have corrections or additions to this page please contact me.