Hello,
I am currently studying how the O3 CPU works. I have enabled the O3CPUAll, BTB, and Branch debug flags. The disassembly of my code is:
Disassembly of section .text:
0000000000010000 <.text>:
10000: f3 0f 1e fa endbr64
10004: 55 push rbp
10005: 48 89 e5 mov rbp,rsp
10008: c7 45 fc 00 00 00 00 mov DWORD PTR [rbp-0x4],0x0
1000f: 8b 05 8b 00 00 00 mov eax,DWORD PTR [rip+0x8b] # 0x100a0
10015: 89 45 fc mov DWORD PTR [rbp-0x4],eax
10018: 8b 45 fc mov eax,DWORD PTR [rbp-0x4]
1001b: 0f b6 c0 movzx eax,al
1001e: 48 98 cdqe
10020: 48 8d 14 85 00 00 00 lea rdx,[rax*4+0x0]
10027: 00
10028: 48 8d 05 71 00 02 00 lea rax,[rip+0x20071] # 0x300a0
1002f: c7 04 02 01 00 00 00 mov DWORD PTR [rdx+rax*1],0x1
10036: 90 nop
10037: 5d pop rbp
10038: c3 ret
10039: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
10040: bc 00 00 01 00 mov esp,0x10000
10045: e8 b6 ff ff ff call 0x10000
1004a: eb fe jmp 0x1004a
Disassembly of section .eh_frame:
0000000000010050 <.eh_frame>:
10050: 14 00 adc al,0x0
10052: 00 00 add BYTE PTR [rax],al
10054: 00 00 add BYTE PTR [rax],al
10056: 00 00 add BYTE PTR [rax],al
10058: 01 7a 52 add DWORD PTR [rdx+0x52],edi
1005b: 00 01 add BYTE PTR [rcx],al
1005d: 78 10 js 0x1006f
In the code above, the program entry point is 0x10040, so the jmp at 0x1004a should execute repeatedly and the instructions after it should never execute architecturally. Transient execution is not ruled out, but I would expect at most a few instructions after 0x1004a to be executed speculatively, and only once.
However, the puzzling part is that, although I have been running the simulation for a long time, the jmp at the infinite-loop location has been predicted only 7 times. The conditional branch instructions after 0x1004a, on the other hand, have been predicted many times, and their count appears to grow without bound as long as the simulation keeps running.
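For reference, the branch targets in the listing can be checked by hand: an x86 relative branch lands at the address of the next instruction plus the sign-extended displacement. The following is a minimal sketch of that arithmetic using only the bytes shown in the disassembly; the helper name `rel_target` is made up for illustration and is not part of gem5.

```python
def rel_target(pc: int, length: int, disp: int, disp_bytes: int) -> int:
    """Target of an x86 relative branch: address of the next instruction
    plus the sign-extended displacement."""
    sign_bit = 1 << (8 * disp_bytes - 1)
    signed = (disp ^ sign_bit) - sign_bit  # sign-extend the raw displacement
    return pc + length + signed


# (address, instruction length, raw displacement, displacement size in bytes)
branches = {
    "call": (0x10045, 5, 0xFFFFFFB6, 4),  # e8 b6 ff ff ff
    "jmp": (0x1004A, 2, 0xFE, 1),         # eb fe
    "js": (0x1005D, 2, 0x10, 1),          # 78 10 (.eh_frame bytes decoded as code)
}

targets = {name: rel_target(*fields) for name, fields in branches.items()}
for name, target in targets.items():
    print(f"{name:4s} -> {target:#x}")
```

The jmp at 0x1004a has displacement -2 and is 2 bytes long, so it targets its own address, while the js at 0x1005d only exists because objdump decodes the .eh_frame data as instructions.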
I'm certain that my configuration uses the default cache size, which is more than enough to hold these instructions. Nevertheless, I noticed that there is still an ICache miss after the "ret" instruction at 0x10038.
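To see which addresses share an instruction cache line, the mapping can be worked out directly. This is a minimal sketch that assumes 64-byte cache lines (an assumption on my side, not something taken from the debug output); `LINE_SIZE` and `line_base` are illustrative names.

```python
LINE_SIZE = 64  # assumption: 64-byte cache lines


def line_base(addr: int, line_size: int = LINE_SIZE) -> int:
    """Base address of the cache line that contains addr."""
    return addr & ~(line_size - 1)


# Addresses taken from the disassembly above.
addresses = {
    "function start": 0x10000,
    "ret": 0x10038,
    "entry point": 0x10040,
    "jmp loop": 0x1004A,
    ".eh_frame start": 0x10050,
    "js": 0x1005D,
}

lines = {name: line_base(addr) for name, addr in addresses.items()}
for name, base in lines.items():
    print(f"{name:16s} {addresses[name]:#x} -> line {base:#x}")
```

Under that assumption the function body up to the ret sits in the line at 0x10000, while the entry point, the jmp loop, and the .eh_frame bytes all sit in the line at 0x10040. This only shows the layout; it does not by itself explain the miss I observed.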
I'm using the configuration "x86-ubuntu-run.py" from gem5_library, with only the workload modified. I've been debugging gem5 for several days, but I haven't been able to locate the error. Is there anything I need to configure to make it run as expected? Thank you very much for your suggestions.
I'm using the configuration based on X86-ubuntu-run-with-kvm.py from the gem5_library. Here is my configuration.
from gem5.coherence_protocol import CoherenceProtocol
from gem5.components.boards.x86_board import X86Board
from gem5.components.memory.single_channel import SingleChannelDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.processors.simple_processor import (
    SimpleProcessor,
)
from gem5.isas import ISA
from gem5.resources.resource import (
    DiskImageResource,
    KernelResource,
)
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
from gem5.components.boards.kernel_disk_workload import KernelDiskWorkload

# This runs a check to ensure the gem5 binary is compiled to X86 and to the
# MESI Two Level coherence protocol.
requires(
    isa_required=ISA.X86,
    coherence_protocol_required=CoherenceProtocol.MESI_TWO_LEVEL,
)

from gem5.components.cachehierarchies.ruby.mesi_two_level_cache_hierarchy import (
    MESITwoLevelCacheHierarchy,
)

cache_hierarchy = MESITwoLevelCacheHierarchy(
    l1d_size="16kB",
    l1d_assoc=8,
    l1i_size="16kB",
    l1i_assoc=8,
    l2_size="256kB",
    l2_assoc=16,
    num_l2_banks=1,
)

# Setup the system memory.
memory = SingleChannelDDR3_1600(size="3GB")

processor = SimpleProcessor(
    cpu_type=CPUTypes.O3,
    isa=ISA.X86,
    num_cores=1,
)

board = X86Board(
    clk_freq="3GHz",
    processor=processor,
    memory=memory,
    cache_hierarchy=cache_hierarchy,
)

board.set_kernel_disk_workload(
    kernel=KernelResource(local_path="/workspace/example"),
    disk_image=DiskImageResource(local_path="/workspace/rootfs.ext2"),
)

simulator = Simulator(
    board=board,
    on_exit_event={
        # Here we want override the default behavior for the first m5 exit
        # exit event. Instead of exiting the simulator, we just want to
        # switch the processor. The 2nd m5 exit after will revert to using
        # default behavior where the simulator run will exit.
        # ExitEvent.EXIT: (func() for func in [processor.switch])
    },
)
simulator.run()
Hi Peter,
I am not sure, but could the issue be this one:
https://www.mail-archive.com/gem5-dev@gem5.org/msg16924.html
At least as far as I know, the problem that the X86-O3-CPU cannot handle
jumps to the same location still exists.
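One way to test this would be to replace the self-targeting jump with a loop whose branch target differs from the branch's own address, for example a nop followed by a jmp back to the nop. Below is a minimal sketch of the encoding arithmetic only, assuming the loop stays at 0x1004a; the helper `encode_jmp_rel8` is a made-up name, and I have not tried whether this changes the gem5 behavior.

```python
def encode_jmp_rel8(branch_pc: int, target: int) -> bytes:
    """Encode a 2-byte short jmp (opcode 0xEB) located at branch_pc."""
    disp = target - (branch_pc + 2)  # displacement is relative to the next instruction
    if not -128 <= disp <= 127:
        raise ValueError("target is out of rel8 range")
    return bytes([0xEB, disp & 0xFF])


# Current loop: the jmp at 0x1004a targets itself.
self_jump = encode_jmp_rel8(0x1004A, 0x1004A)

# Hypothetical alternative: nop at 0x1004a, then a jmp at 0x1004b back to the nop.
padded_loop = b"\x90" + encode_jmp_rel8(0x1004B, 0x1004A)

print(self_jump.hex(" "))
print(padded_loop.hex(" "))
```

The first result matches the "eb fe" in the listing; the second is a 3-byte loop in which the branch and its target are at different addresses.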
Best regards,
Nils
On 3/25/24 14:29, Peter Wang via gem5-dev wrote:
> I'm using the configuration based on X86-ubuntu-run-with-kvm.py from
> the gem5_library. Here is my configuration.
> [...]
gem5-dev mailing list -- gem5-dev@gem5.org