gem5-dev@gem5.org

The gem5 Developer List


Question about O3 CPU

despotxjjrr@gmail.com
Mon, Mar 25, 2024 1:17 PM

Hello,

I am currently studying the mechanism of O3 CPU. I have enabled the O3CPUAll, BTB, and Branch debugging flags.
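For readers who want to reproduce the trace, here is a minimal sketch of how a gem5 invocation with these debug flags could be assembled. `--debug-flags` and `--debug-file` are gem5 command-line options; the binary path, the script name, and the trace file name below are placeholders (assumptions), not values taken from this thread.

```python
import shlex


def gem5_command(gem5_binary, config_script, debug_flags, debug_file=None):
    """Build the argument list for a gem5 run with debug flags enabled.

    gem5's own options must come before the configuration script.
    """
    cmd = [gem5_binary, "--debug-flags=" + ",".join(debug_flags)]
    if debug_file is not None:
        # Redirect the debug trace to a file inside gem5's output directory.
        cmd.append("--debug-file=" + debug_file)
    cmd.append(config_script)
    return cmd


# Placeholder paths: adjust to the local build and configuration script.
cmd = gem5_command(
    "build/X86/gem5.opt",
    "my_config.py",
    ["O3CPUAll", "BTB", "Branch"],
    debug_file="o3_trace.out",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` directly. Writing the trace to a file is advisable because O3CPUAll produces a very large amount of output.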

Disassembly of section .text:

0000000000010000 <.text>:
   10000:	f3 0f 1e fa          	endbr64 
   10004:	55                   	push   rbp
   10005:	48 89 e5             	mov    rbp,rsp
   10008:	c7 45 fc 00 00 00 00 	mov    DWORD PTR [rbp-0x4],0x0
   1000f:	8b 05 8b 00 00 00    	mov    eax,DWORD PTR [rip+0x8b]        # 0x100a0
   10015:	89 45 fc             	mov    DWORD PTR [rbp-0x4],eax
   10018:	8b 45 fc             	mov    eax,DWORD PTR [rbp-0x4]
   1001b:	0f b6 c0             	movzx  eax,al
   1001e:	48 98                	cdqe   
   10020:	48 8d 14 85 00 00 00 	lea    rdx,[rax*4+0x0]
   10027:	00 
   10028:	48 8d 05 71 00 02 00 	lea    rax,[rip+0x20071]        # 0x300a0
   1002f:	c7 04 02 01 00 00 00 	mov    DWORD PTR [rdx+rax*1],0x1
   10036:	90                   	nop
   10037:	5d                   	pop    rbp
   10038:	c3                   	ret    
   10039:	0f 1f 80 00 00 00 00 	nop    DWORD PTR [rax+0x0]
   10040:	bc 00 00 01 00       	mov    esp,0x10000
   10045:	e8 b6 ff ff ff       	call   0x10000
   1004a:	eb fe                	jmp    0x1004a

Disassembly of section .eh_frame:

0000000000010050 <.eh_frame>:
   10050:	14 00                	adc    al,0x0
   10052:	00 00                	add    BYTE PTR [rax],al
   10054:	00 00                	add    BYTE PTR [rax],al
   10056:	00 00                	add    BYTE PTR [rax],al
   10058:	01 7a 52             	add    DWORD PTR [rdx+0x52],edi
   1005b:	00 01                	add    BYTE PTR [rcx],al
   1005d:	78 10                	js     0x1006f

In the code above, the program entry point is 0x10040, so the instruction at 0x1004a (a jump to itself) should execute repeatedly, without ever executing the instructions after it. I do not rule out transient execution, but I would expect only a few instructions after 0x1004a to be executed, and only once.
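As a cross-check of the disassembly, the targets of the relative branches can be recomputed from their encodings: the target is the address of the next instruction plus the sign-extended displacement. The sketch below does this for the `call`, the `jmp`, and the `js` shown in the listing. The helper name is made up for illustration. Note that the `js` at 0x1005d lies in `.eh_frame`, so it is exception-frame data that the disassembler has decoded as instructions.

```python
def rel_branch_target(addr, length, displacement, disp_bits):
    """Return the target of an x86 relative branch.

    addr:         address of the branch instruction
    length:       total instruction length in bytes
    displacement: raw (unsigned) displacement field
    disp_bits:    width of the displacement field (8 or 32)
    """
    sign_bit = 1 << (disp_bits - 1)
    signed = (displacement ^ sign_bit) - sign_bit  # sign-extend
    return (addr + length + signed) & 0xFFFFFFFFFFFFFFFF


# (mnemonic, address, length, raw displacement, displacement width)
branches = [
    ("call", 0x10045, 5, 0xFFFFFFB6, 32),  # e8 b6 ff ff ff
    ("jmp", 0x1004A, 2, 0xFE, 8),          # eb fe
    ("js", 0x1005D, 2, 0x10, 8),           # 78 10 (inside .eh_frame)
]

targets = {}
for name, addr, length, disp, bits in branches:
    targets[name] = rel_branch_target(addr, length, disp, bits)
    print(f"{name} at {addr:#x} -> {targets[name]:#x}")
# call at 0x10045 -> 0x10000
# jmp at 0x1004a -> 0x1004a
# js at 0x1005d -> 0x1006f
```

The `jmp` has a displacement of -2 and a length of 2, so it targets its own address, which is the self-loop described above.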

What puzzles me is that, although I have been running the simulation for a long time, the instruction at the infinite-loop location has been predicted only 7 times. The conditional branches after 0x1004a, on the other hand, have been predicted many times, and if the simulation ran indefinitely I would expect that count to keep growing without bound.
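One way to obtain per-address counts like these is to tally the addresses that appear in the debug trace. The sketch below is generic: it counts every hexadecimal token on lines containing a chosen keyword. The sample lines are synthetic and do not reproduce gem5's actual trace format, so the keyword and the pattern are assumptions that would need adjusting to the real output.

```python
import re
from collections import Counter

HEX_TOKEN = re.compile(r"0x[0-9a-fA-F]+")


def count_addresses(lines, keyword):
    """Count hex addresses on lines that contain `keyword`.

    Every hex token on a matching line is counted, so a line that
    prints both a PC and a target contributes to both addresses.
    """
    counts = Counter()
    for line in lines:
        if keyword not in line:
            continue
        for token in HEX_TOKEN.findall(line):
            counts[int(token, 16)] += 1
    return counts


# Synthetic lines for illustration only (NOT real gem5 output).
sample = [
    "100: example: lookup PC 0x1004a",
    "200: example: lookup PC 0x1005d",
    "300: example: lookup PC 0x1005d",
    "400: example: other PC 0x1004a",
]

counts = count_addresses(sample, "lookup")
for addr, n in sorted(counts.items()):
    print(f"{addr:#x}: {n}")
```

For a real trace, `lines` can be an open file object, which keeps memory use flat even for multi-gigabyte logs.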

My configuration uses the default cache sizes, which are more than enough to hold these instructions. Even so, I see an ICache miss after the "ret" instruction at 0x10038.
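To put the capacity claim in numbers, the sketch below maps the addresses in the listing to cache lines, assuming a 64-byte line size (an assumption; the line size is not stated in this thread). It only shows which lines the code touches. It does not by itself explain the miss, since the first access to any line misses regardless of cache capacity, and wrong-path fetches may touch further lines.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


def line_base(addr, line_size=LINE_SIZE):
    """Return the base address of the cache line containing `addr`."""
    return addr & ~(line_size - 1)


# Bytes shown in the listing: .text starts at 0x10000 and the last
# decoded .eh_frame bytes end at 0x1005e.
first_byte, last_byte = 0x10000, 0x1005E
lines = sorted({line_base(a) for a in range(first_byte, last_byte + 1)})

print([hex(base) for base in lines])
print("ret   ", hex(line_base(0x10038)))
print("entry ", hex(line_base(0x10040)))
print("loop  ", hex(line_base(0x1004A)))

# Number of lines a 16 kB L1I (as in the configuration) can hold.
l1i_lines = (16 * 1024) // LINE_SIZE
```

Under this assumption the listing occupies two lines, and the self-loop at 0x1004a shares a line with the entry point at 0x10040.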

I'm using the "x86-ubuntu-run.py" configuration from gem5_library, with only the workload modified. I have been debugging this in gem5 for several days but have not been able to locate the error. Is there anything I need to configure to make it run as expected? Thank you very much for any suggestions.

despotxjjrr@gmail.com
Mon, Mar 25, 2024 1:29 PM
I'm using the configuration based on X86-ubuntu-run-with-kvm.py from the gem5_library. Here is my configuration.

from gem5.coherence_protocol import CoherenceProtocol
from gem5.components.boards.x86_board import X86Board
from gem5.components.memory.single_channel import SingleChannelDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.processors.simple_processor import (
    SimpleProcessor,
)
from gem5.isas import ISA
from gem5.resources.resource import (
    DiskImageResource,
    KernelResource,
)
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
from gem5.components.boards.kernel_disk_workload import KernelDiskWorkload

# This runs a check to ensure the gem5 binary is compiled to X86 and to the
# MESI Two Level coherence protocol.
requires(
    isa_required=ISA.X86,
    coherence_protocol_required=CoherenceProtocol.MESI_TWO_LEVEL,
)

from gem5.components.cachehierarchies.ruby.mesi_two_level_cache_hierarchy import (
    MESITwoLevelCacheHierarchy,
)


cache_hierarchy = MESITwoLevelCacheHierarchy(
    l1d_size="16kB",
    l1d_assoc=8,
    l1i_size="16kB",
    l1i_assoc=8,
    l2_size="256kB",
    l2_assoc=16,
    num_l2_banks=1,
)

# Setup the system memory.
memory = SingleChannelDDR3_1600(size="3GB")

processor = SimpleProcessor(
    cpu_type=CPUTypes.O3,
    isa=ISA.X86,
    num_cores=1,
)

board = X86Board(
    clk_freq="3GHz",
    processor=processor,
    memory=memory,
    cache_hierarchy=cache_hierarchy,
)


board.set_kernel_disk_workload(
    kernel=KernelResource(local_path="/workspace/example"),
    disk_image=DiskImageResource(local_path="/workspace/rootfs.ext2"),
)

simulator = Simulator(
    board=board,
    on_exit_event={
        # Here we want to override the default behavior for the first m5 exit
        # event. Instead of exiting the simulator, we just want to
        # switch the processor. The 2nd m5 exit after will revert to using
        # default behavior where the simulator run will exit.
        # ExitEvent.EXIT: (func() for func in [processor.switch])
    },
)
simulator.run()
Nils Asmussen
Tue, Mar 26, 2024 4:56 AM

Hi Peter,

I am not sure, but could the issue be this one:

https://www.mail-archive.com/gem5-dev@gem5.org/msg16924.html

As far as I know, the problem that the X86 O3 CPU cannot handle
jumps to the same location still exists.

Best regards,
Nils

On 3/25/24 14:29, Peter Wang via gem5-dev wrote:

> I'm using the configuration based on X86-ubuntu-run-with-kvm.py from
> the gem5_library. Here is my configuration.
> [...]


gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-leave@gem5.org
