Emu68 Control Registers
Register name | Number | RW | Size | Description |
---|---|---|---|---|
CNTFRQ | 0xe0 | RO | LONG | Frequency in Hz of the free running counter |
CNTVALLO | 0xe1 | RO | LONG | Free running counter value, lower 32 bits |
CNTVALHI | 0xe2 | RO | LONG | Free running counter value, higher 32 bits |
INSNCNTLO | 0xe3 | RO | LONG | Number of executed M68k instructions, lower 32 bits |
INSNCNTHI | 0xe4 | RO | LONG | Number of executed M68k instructions, higher 32 bits |
ARMCNTLO | 0xe5 | RO | LONG | Number of executed ARM instructions, lower 32 bits |
ARMCNTHI | 0xe6 | RO | LONG | Number of executed ARM instructions, higher 32 bits |
JITSIZE | 0xe7 | RO | LONG | Total cache size in bytes |
JITFREE | 0xe8 | RO | LONG | Number of bytes free in the JIT cache |
JITCOUNT | 0xe9 | RO | LONG | Number of JIT units in the cache |
JITSCFTHRESH | 0xea | RW | LONG | JIT threshold for soft cache flushes |
JITCTRL | 0xeb | RW | LONG | JIT control register |
JITCMISS | 0xec | RO | LONG | Number of JIT cache misses |
DBGCTRL | 0xed | RW | LONG | Debug control register |
DBGADDRLO | 0xee | RW | LONG | Lowest debug address |
DBGADDRHI | 0xef | RW | LONG | Highest debug address |
JITCTRL2 | 0x1e0 | RW | LONG | JIT control register 2 |
CNTFRQ - Counter frequency
AArch64 features a free running 64-bit counter which can be used for timing purposes. This counter is exposed to the M68k and can be freely used by the software. The frequency of the counter is available through this register.
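As an illustration of how the frequency is typically used, the following C sketch converts a difference between two counter readings into microseconds. The function name `ticks_to_us` is hypothetical and not part of Emu68; the only input taken from this document is that CNTFRQ reports the counter frequency in Hz.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a tick delta of the free-running counter into microseconds.
 * freq_hz is the value read from CNTFRQ (hypothetical helper, not an
 * Emu68 API). The delta is split into whole seconds and a remainder so
 * that the multiplication by 1000000 stays small for the remainder part. */
uint64_t ticks_to_us(uint64_t ticks, uint32_t freq_hz)
{
    uint64_t secs, rem;

    assert(freq_hz != 0);
    secs = ticks / freq_hz;
    rem  = ticks % freq_hz;
    return secs * 1000000ULL + (rem * 1000000ULL) / freq_hz;
}
```

The result is truncated towards zero, so deltas shorter than one microsecond are reported as 0.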
CNTVALLO, CNTVALHI - Free running counter
The value of the free-running counter is available through two registers. CNTVALLO contains the lower 32 bits of the counter, whereas CNTVALHI contains the upper 32 bits. To make sure that the counter is read consistently, i.e. that the lower 32 bits did not wrap between the reads of the lower and the higher longword, it is advisable to read CNTVALHI twice, once before and once after CNTVALLO. If the value has changed on the second read, the lower 32 bits have wrapped and the read procedure should be repeated.
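# Read the 64-bit CNTVAL counter into the d0:d1 pair
# (d0 = upper 32 bits, d1 = lower 32 bits).
ReadCNT:
        move.l  d2, -(a7)
# Read CNTVALHI, then CNTVALLO, then CNTVALHI again.
1:      movec.l #0xe2, d2
        movec.l #0xe1, d1
        movec.l #0xe2, d0
# If the upper half changed, the lower half wrapped: retry.
        cmp.l   d0, d2
        bne.b   1b
        move.l  (a7)+, d2
        rts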
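The same high-low-high read sequence can be sketched in C. This is a minimal illustration rather than Emu68 code: the register accessors are passed in as callbacks (the `reg_read_fn` type and the `sim_*` names are assumptions made for this example), so the retry logic can be exercised on any host with a simulated counter that wraps on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor type: returns the current value of one 32-bit
 * register. On real hardware it would wrap a movec read. */
typedef uint32_t (*reg_read_fn)(void);

/* Read a 64-bit counter exposed as two 32-bit halves. The high half is
 * read before and after the low half; if it changed, the low half wrapped
 * in between and the whole sequence is repeated. */
uint64_t read_counter64(reg_read_fn read_hi, reg_read_fn read_lo)
{
    uint32_t hi_before, lo, hi_after;

    assert(read_hi != 0 && read_lo != 0);
    do {
        hi_before = read_hi();
        lo        = read_lo();
        hi_after  = read_hi();
    } while (hi_before != hi_after);

    return ((uint64_t)hi_after << 32) | lo;
}

/* Simulated counter for host-side experiments: it advances by sim_step
 * after every register access. */
uint64_t sim_counter;
uint64_t sim_step;

uint32_t sim_read_hi(void)
{
    uint32_t v = (uint32_t)(sim_counter >> 32);
    sim_counter += sim_step;
    return v;
}

uint32_t sim_read_lo(void)
{
    uint32_t v = (uint32_t)sim_counter;
    sim_counter += sim_step;
    return v;
}
```

Starting the simulated counter at 0x00000001FFFFFFFF with a step of 1 forces one retry, because the lower half wraps right after the first read of the upper half.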
INSNCNTLO, INSNCNTHI - M68k instruction counter
Emu68 provides a real time counter of executed M68k instructions. The value of this 64 bit counter, stored in two read only control registers, allows one to learn about current performance of Emu68.
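One possible way to turn this counter into a performance figure is to sample it together with the free-running counter and divide the instruction delta by the elapsed time. The sketch below assumes both 64-bit values have already been assembled from their LO/HI register pairs; `struct perf_sample` and `m68k_insn_per_second` are illustrative names, not part of Emu68.

```c
#include <assert.h>
#include <stdint.h>

/* One snapshot of the two 64-bit counters (hypothetical helper type). */
struct perf_sample {
    uint64_t insn_count;  /* INSNCNTHI:INSNCNTLO */
    uint64_t ticks;       /* CNTVALHI:CNTVALLO   */
};

/* Average number of M68k instructions per second between two samples.
 * freq_hz is the value of CNTFRQ. Returns 0 if no time has elapsed. */
uint64_t m68k_insn_per_second(struct perf_sample start,
                              struct perf_sample end,
                              uint32_t freq_hz)
{
    uint64_t d_insn, d_ticks;

    assert(freq_hz != 0);
    d_insn  = end.insn_count - start.insn_count;
    d_ticks = end.ticks - start.ticks;
    if (d_ticks == 0)
        return 0;

    /* Floating point keeps the sketch short; its precision is ample for
     * a rate estimate. */
    return (uint64_t)((double)d_insn * (double)freq_hz / (double)d_ticks);
}
```

Dividing the result by 1000000 gives the familiar MIPS figure.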
ARMCNTLO, ARMCNTHI - ARM instruction counter
Current count of executed ARM instructions, including translated JIT code as well as exceptions, translator and main JIT loop.
JITSIZE - JIT cache size
Total size of JIT cache in bytes. This value is defined once during compilation.
JITFREE - JIT cache free
Number of free bytes in JIT cache.
JITCOUNT - JIT unit count
This register contains the number of JIT units currently held in the cache.
JITSCFTHRESH - Soft flush threshold
The soft flush of the JIT cache, controlled by the JCC_SOFT bit of the JITCTRL register, is time consuming, since the entire cache has to be walked through. If the JIT cache contains fewer units than the threshold value, a soft flush will be applied when JCC_SOFT is set. If the number of units exceeds the threshold, a regular cache flush will be applied regardless of the JCC_SOFT bit value.
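The rule above can be written down as a small decision function. This is only a sketch of the documented behaviour, not the Emu68 implementation: the type and function names are hypothetical, and the case where the unit count equals the threshold is not specified in this document, so treating it as a regular flush is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum flush_kind { FLUSH_REGULAR = 0, FLUSH_SOFT = 1 };

#define JCC_SOFT_BIT 0x1u  /* JCC_SOFT, bit 0 of JITCTRL */

/* Snapshot of the three registers involved (hypothetical helper type). */
struct jit_flush_state {
    uint32_t jitctrl;    /* JITCTRL      */
    uint32_t jit_count;  /* JITCOUNT     */
    uint32_t threshold;  /* JITSCFTHRESH */
};

/* Which kind of flush the documented rule selects. */
enum flush_kind choose_flush(const struct jit_flush_state *s)
{
    assert(s != NULL);

    /* Without JCC_SOFT only the regular flush is available. */
    if ((s->jitctrl & JCC_SOFT_BIT) == 0)
        return FLUSH_REGULAR;

    /* Assumption: count == threshold is handled like "exceeds". */
    return (s->jit_count < s->threshold) ? FLUSH_SOFT : FLUSH_REGULAR;
}
```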
JITCTRL - JIT control register
Configures behaviour of JIT translator.
Name | Offset | Field size | Description |
---|---|---|---|
JCC_SOFT | 0 | 1 | Use “soft flush” of JIT cache. |
JCC_LOOP_COUNT | 4 | 4 | Inline loop count |
JCC_INLINE_RANGE | 8 | 16 | Maximal distance for inline |
JCC_INSN_DEPTH | 24 | 8 | Maximal JIT unit size |
JCC_SOFT
If this bit is set, an instruction cache flush does not remove units from the JIT cache. Instead, they are marked as not verified. On the next execution of the code, the CRC32 checksum of the unit is verified and, if it is unchanged, the unit is marked as valid again, skipping the compilation phase.
JCC_LOOP_COUNT
If the JIT translator finds a way to unroll a loop in the code, it will attempt to fit up to JCC_LOOP_COUNT iterations of the loop, provided there is enough room to fit the resulting number of m68k instructions into the cache.
JCC_INLINE_RANGE
When the JIT translator finds a branch (conditional or unconditional) whose target address can be computed at compilation time, the branch is inlined into the current JIT translation unit if the branch distance, in bytes, is within the range given by JCC_INLINE_RANGE. A value of 0 disables branch inlining.
JCC_INSN_DEPTH
The translator will put no more than JCC_INSN_DEPTH m68k instructions into a single JIT compilation unit. A value of 0 sets the maximal number of instructions to 256. Note that a JIT unit can contain fewer m68k instructions than the value set here, since every branch whose target cannot be computed during the compilation phase, as well as many context-synchronising instructions, will break the translation.
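The field layout from the JITCTRL table can be expressed as pack and unpack helpers. The sketch below uses only the offsets and sizes listed above; the struct and function names are illustrative and are not taken from the Emu68 sources.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of JITCTRL (hypothetical helper type). */
struct jitctrl_fields {
    uint8_t  soft;          /* JCC_SOFT,         bit 0,      1 bit   */
    uint8_t  loop_count;    /* JCC_LOOP_COUNT,   bits 4-7,   4 bits  */
    uint16_t inline_range;  /* JCC_INLINE_RANGE, bits 8-23,  16 bits */
    uint8_t  insn_depth;    /* JCC_INSN_DEPTH,   bits 24-31, 8 bits  */
};

/* Build the 32-bit register value from the individual fields. */
uint32_t jitctrl_pack(struct jitctrl_fields f)
{
    assert(f.soft <= 1 && f.loop_count <= 0xF);
    return  (uint32_t)f.soft
          | ((uint32_t)f.loop_count   << 4)
          | ((uint32_t)f.inline_range << 8)
          | ((uint32_t)f.insn_depth   << 24);
}

/* Split a register value back into its fields. */
struct jitctrl_fields jitctrl_unpack(uint32_t reg)
{
    struct jitctrl_fields f;

    f.soft         = (uint8_t)(reg & 0x1u);
    f.loop_count   = (uint8_t)((reg >> 4) & 0xFu);
    f.inline_range = (uint16_t)((reg >> 8) & 0xFFFFu);
    f.insn_depth   = (uint8_t)((reg >> 24) & 0xFFu);
    return f;
}

/* Effective instruction limit per JIT unit: a field value of 0 means 256. */
unsigned jitctrl_effective_depth(uint32_t reg)
{
    unsigned depth = (reg >> 24) & 0xFFu;
    return depth ? depth : 256u;
}
```

For example, soft flush enabled, a loop count of 8, an inline range of 8192 bytes and an instruction depth field of 0 pack to 0x00200081.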
JITCMISS - Cache miss counter
The value of this 32-bit counter is incremented every time a JIT cache miss occurs and the JIT compiler is started.
DBGCTRL - Debug control register
Configures the behaviour of debug messages. It can be changed on the fly to adjust the verbosity of debug messages as well as to switch disassembly of translated code on or off. The change affects only newly compiled units; it is therefore advisable to flush the entire code cache after applying any changes here.
Name | Offset | Field size | Description |
---|---|---|---|
DC_VERBOSE | 0 | 2 | Set verbosity level of debug |
DC_DISASM | 2 | 1 | Enable/disable disassembler |
DBGADDRLO, DBGADDRHI - Debug range
Debug information about JIT units is normally shown for all blocks of memory going into the translator. Since such output can be extremely large (above 200 megabytes on a regular system boot), the range in which the verbosity of JIT units is elevated through the DBGCTRL register may be limited. If an M68k address is not within the range between DBGADDRLO and DBGADDRHI, no information about that JIT unit will be written to the console.
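A short sketch of how the debug registers fit together: one helper composes a DBGCTRL value from the two fields in the table, the other checks whether an M68k address falls into the configured range. Both function names are hypothetical, and treating DBGADDRLO and DBGADDRHI as inclusive bounds is an assumption, since this document does not say whether the end points belong to the range.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a DBGCTRL value: DC_VERBOSE occupies bits 0-1, DC_DISASM bit 2. */
uint32_t dbgctrl_pack(unsigned verbose, int disasm)
{
    assert(verbose <= 3);
    return (uint32_t)verbose | ((disasm ? 1u : 0u) << 2);
}

/* Nonzero if a JIT unit at m68k_addr lies within the debug range.
 * Assumption: both bounds are inclusive. */
int dbg_addr_in_range(uint32_t m68k_addr, uint32_t addr_lo, uint32_t addr_hi)
{
    return m68k_addr >= addr_lo && m68k_addr <= addr_hi;
}
```

With this layout, verbosity level 2 with the disassembler enabled corresponds to a DBGCTRL value of 6.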
JITCTRL2 - Second JIT control register
Second control register influencing the behaviour of Emu68.
Name | Offset | Field size | Description |
---|---|---|---|
JC2_CHIP_SLOWDOWN | 0 | 1 | Slow down code executing from CHIP memory |
JC2_DBF_SLOWDOWN | 1 | 1 | Slow down special case of DBF busy loops |
JC2_CCR_SCAN_DEPTH | 3 | 5 | Controls forward scan depth of CCR optimizer |
JC2_CHIP_SLOWDOWN_RATIO | 8 | 3 | Controls amount of slowdown running from CHIP memory |
JC2_BLITWAIT | 11 | 1 | Automatically wait for blitter to finish |
JC2_CHIP_SLOWDOWN
If this bit is set, Emu68 will insert a word read from the current PC location before every translated m68k instruction. This setting makes code executed from CHIP memory significantly slower. It may be useful for some ancient software designed for much slower CPUs.
JC2_DBF_SLOWDOWN
This bit slows down a special case of the DBF instruction which is often used as a busy-loop delay, e.g. in old MOD replayers:
        move.w  #xxx, Dn
loop:   dbf     Dn, loop
Due to the nature of Emu68, such busy loops run much faster than expected. When this bit is set, each DBF executed from CHIP memory and branching to itself will take the same amount of time as three subsequent byte reads from CHIP.
JC2_CCR_SCAN_DEPTH
When Emu68 translates m68k code to AArch64 code, it performs a forward scan of the following m68k instructions to determine whether, and which, bits of the CCR need to be updated. This greatly reduces the amount of generated AArch64 code, but might be prone to errors, e.g. in the case of self-modifying code. By adjusting the JC2_CCR_SCAN_DEPTH field it is possible to instruct Emu68 how many opcodes shall be scanned in advance. Valid values range from 0 (CCR optimization completely disabled) up to 31. The default value on startup of Emu68 is 20.
JC2_CHIP_SLOWDOWN_RATIO
When JC2_CHIP_SLOWDOWN is enabled, this field controls the ratio of instructions that are slowed down.
JC2_BLITWAIT
If this bit is set, Emu68 monitors writes by the CPU to blitter registers and ensures that the blitter is not active before proceeding. This fixes issues caused by missing blitter waits in software that was written to expect A500 speed when executing code from CHIP or SLOW memory. Blitter-heavy code will be slowed down a bit by this setting.
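Because JITCTRL2 combines several independent settings, changing one field is best done as a read-modify-write that leaves the other bits untouched, including bit 2, which the table above does not assign. The following sketch shows the masking for the two multi-bit fields. The flag macros reuse the field names from the table, while the `_SHIFT` and `_MASK` macros and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Single-bit fields of JITCTRL2, at the offsets from the table. */
#define JC2_CHIP_SLOWDOWN             (1u << 0)
#define JC2_DBF_SLOWDOWN              (1u << 1)
#define JC2_BLITWAIT                  (1u << 11)

/* Multi-bit fields (illustrative helper macros). */
#define JC2_CCR_SCAN_DEPTH_SHIFT      3
#define JC2_CCR_SCAN_DEPTH_MASK       (0x1Fu << JC2_CCR_SCAN_DEPTH_SHIFT)
#define JC2_CHIP_SLOWDOWN_RATIO_SHIFT 8
#define JC2_CHIP_SLOWDOWN_RATIO_MASK  (0x7u << JC2_CHIP_SLOWDOWN_RATIO_SHIFT)

/* Return reg with the CCR scan depth replaced; valid depths are 0..31. */
uint32_t jc2_set_ccr_scan_depth(uint32_t reg, unsigned depth)
{
    assert(depth <= 31);
    return (reg & ~JC2_CCR_SCAN_DEPTH_MASK)
         | ((uint32_t)depth << JC2_CCR_SCAN_DEPTH_SHIFT);
}

/* Extract the CCR scan depth from a register value. */
unsigned jc2_get_ccr_scan_depth(uint32_t reg)
{
    return (reg & JC2_CCR_SCAN_DEPTH_MASK) >> JC2_CCR_SCAN_DEPTH_SHIFT;
}

/* Return reg with the 3-bit CHIP slowdown ratio replaced (0..7). */
uint32_t jc2_set_chip_slowdown_ratio(uint32_t reg, unsigned ratio)
{
    assert(ratio <= 7);
    return (reg & ~JC2_CHIP_SLOWDOWN_RATIO_MASK)
         | ((uint32_t)ratio << JC2_CHIP_SLOWDOWN_RATIO_SHIFT);
}
```

With these definitions the documented startup scan depth of 20 occupies the register as 0xA0, and setting the depth to 0 turns the CCR optimization off without touching any slowdown bits.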