gem5-users@gem5.org

The gem5 Users mailing list

Spec2017 GCC benchmark crashes in SE mode

muke101
Sat, Jan 13, 2024 8:40 PM

Hi, I'm trying to checkpoint SPEC2017 with NonCachingSimpleCPU according to SimPoints I generated from native AArch64 binaries. On an unmodified gem5, a page fault occurs during the simulation (error message pasted at the bottom).

I've read that GCC can run out of stack space when simulated, so I edited arch/arm/process.cc to use a max stack size of 16MB instead of the default 8MB, but this hasn't helped. Can someone confirm this is the right way to increase the max stack size? The total memory available to the simulation is 50GB (set with --mem-size on the command line).

Another possibility is that it's because I compiled the benchmark without '-fno-strict-aliasing', which the SPEC documentation recommends, but I imagine that if this were the problem the program wouldn't have been able to run natively, which it does.

Has anyone run into this problem before and been able to solve it?

Thanks.

build/ARM/sim/simulate.cc:194: info: Entering event queue @ 188431954680500. Starting simulation...
build/ARM/sim/faults.cc:104: panic: panic condition !handled && !tc->getSystemPtr()->trapToGdb(SIGSEGV, tc->contextId()) occurred: Page table fault when accessing virtual address 0x660000195c
Memory Usage: 52735052 KBytes
Program aborted at tick 191641021589000
--- BEGIN LIBC BACKTRACE ---
/sim_home/luke/PND-Loads/gem5/build/ARM/gem5.fast(+0x97d088)[0xaaaaca97c088]
/sim_home/luke/PND-Loads/gem5/build/ARM/gem5.fast(+0x9ef1bc)[0xaaaaca9ee1bc]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x4000161c2688]
/lib/aarch64-linux-gnu/libc.so.6(raise+0xb0)[0x4000164c2598]
--- END LIBC BACKTRACE ---
Aborted (core dumped)

sun2k23
Mon, Jan 15, 2024 1:18 AM

Hi,

I have run into several similar issues, although not with SPEC2017. I think the cause is that this address, 0x660000195c, is accessed before a page has been allocated for it. You can enable the MMU debug flag to check whether the page containing this address is allocated once the image download is done.
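
As a rough sketch of that check: if you capture the trace (for example by running gem5 with --debug-flags=MMU, assuming your build has a flag of that name), a small script can tell you whether any allocated range covers the faulting address. The line format below is an assumption, and the sample lines are hypothetical rather than real gem5 output, so adapt the pattern to what your build actually prints.

```python
import re

# Assumed trace format: an allocation line contains a hex range such as
# "0x400000-0x401000". This pattern is a guess; adapt it to the real output.
RANGE_RE = re.compile(r"(0x[0-9a-fA-F]+)\s*-\s*(0x[0-9a-fA-F]+)")


def allocated_ranges(trace_lines):
    """Yield (start, end) pairs for every hex range found in the trace."""
    for line in trace_lines:
        match = RANGE_RE.search(line)
        if match:
            yield int(match.group(1), 16), int(match.group(2), 16)


def is_allocated(addr, trace_lines):
    """Return True if addr falls inside any allocated [start, end) range."""
    return any(start <= addr < end
               for start, end in allocated_ranges(trace_lines))


# Hypothetical sample lines, not real gem5 output.
sample_trace = [
    "0: system.cpu.workload: Allocating Page: 0x400000-0x401000",
    "0: system.cpu.workload: Allocating Page: 0x7fffff0000-0x7fffff1000",
]

fault_addr = 0x660000195C  # address from the panic message
print(is_allocated(fault_addr, sample_trace))
```

In practice you would pass the lines of your own trace file instead of sample_trace.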

S2k

muke101
Mon, Jan 15, 2024 1:25 AM

Hi, thanks for the reply.

Assuming that the page isn't being allocated when it should be, what could I do with this information? I'm not familiar with this part of the gem5 codebase.

Also, just to be clear, what do you mean by 'when the image download is done'?

Thanks.

sun2k23
Tue, Jan 16, 2024 6:07 AM

Hi,

Usually in SE mode, pages are pre-allocated when the ELF file is loaded into memory (this is what I called the image download earlier). You can check which pages are allocated by enabling the MMU debug flag.

I suggest checking the faulting address to confirm whether its page is allocated. It probably is not, in which case you should check whether the address falls inside the stack region or somewhere else. If it is outside the stack and at an invalid location, you may need to enlarge the stack further. Besides process.maxStackSize, I think you may also need to modify process.stackBase.
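
A minimal sketch of the stack check, assuming the stack grows downward from the stack base. The base address here is a placeholder I picked for illustration; read the real value from your arch/arm/process.cc or your configuration script.

```python
def stack_range(stack_base, max_stack_size):
    """Return (lowest, highest) addresses of a downward-growing stack."""
    return stack_base - max_stack_size, stack_base


def in_stack(addr, stack_base, max_stack_size):
    """Return True if addr lies within the stack's maximum extent."""
    low, high = stack_range(stack_base, max_stack_size)
    return low <= addr < high


# Assumed values for illustration only; substitute the ones your build uses.
STACK_BASE = 0x7FFFFF0000          # hypothetical stack base
MAX_STACK_SIZE = 16 * 1024 * 1024  # the 16MB tried in this thread

fault_addr = 0x660000195C  # address from the panic message
low, high = stack_range(STACK_BASE, MAX_STACK_SIZE)
print(f"stack spans {low:#x}..{high:#x}")
print("fault inside stack:", in_stack(fault_addr, STACK_BASE, MAX_STACK_SIZE))
```

If the faulting address turns out to be far below the lowest stack address, a few extra megabytes of stack will not reach it, and the access probably belongs to some other region.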

Please let me know if you fix this issue.

Regards,
S2K.

muke101
Tue, Jan 16, 2024 5:22 PM

I see, thanks for the explanation.

Could you elaborate on how I should change the stack base address to increase the available size? I had thought setting a higher address would, but GCC crashes immediately if I do this. If I set a lower address it works, but does this provide more space?

Also, is the mmap_end variable relevant at all?

sun2k23
Wed, Jan 17, 2024 2:50 AM

You probably have Python code like this in your *.py configuration file:
process = Process(...)
process.executable = ...

If you want to change the stack base, add configuration like this:
process.stackBase = 0x....  # new stack base value
process.maxStackSize = 0x....  # stack size
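
To see what a given pair of values covers, here is a small sketch. It assumes the stack grows downward from the base and uses a 4KB page size; both base addresses are hypothetical and chosen only for illustration.

```python
PAGE_SIZE = 4096  # assumed page size


def stack_window(stack_base, max_stack_size):
    """Return (low, high) bounds of a stack that grows down from stack_base."""
    return stack_base - max_stack_size, stack_base


def describe(stack_base, max_stack_size):
    """Format the window and its page count for a base/size pair."""
    low, high = stack_window(stack_base, max_stack_size)
    pages = max_stack_size // PAGE_SIZE
    return f"base={high:#x} low={low:#x} pages={pages}"


SIZE = 16 * 1024 * 1024  # 16MB, as tried in this thread

# Two hypothetical base addresses, chosen only for illustration.
for base in (0x7FFFFF0000, 0x7FF0000000):
    print(describe(base, SIZE))
```

Under these assumptions, moving the base only shifts the window up or down; the number of pages available depends on the size alone.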

Give it a try; I hope it helps.

S2K

At 2024-01-17 01:22:30, "muke101 via gem5-users" gem5-users@gem5.org wrote:

I see, thanks for the explanation.

Could you elaborate for me how I should change the stack base address to increase available size? I had thought setting a higher address would but GCC immediately crashes if I do this. If I set a lower address it works but does this provide more space?

Also, is the mmap_end variable relevant at all?

Sent from Proton Mail mobile

-------- Original Message --------
On 16 Jan 2024, 06:07, sun2k23 wrote:

Hi,

Usually in SE mode, the page would be pre-allocated when the ELF file(what's also called image download by me previously) is initiated/downloaded to the memory.  The allocated page can be checked when enabling MMU debug-flag.

I would suggest you to check the error address to confirm whether the related page is allocated, and probably not, then you would check whether the error address is located inside stack location or somewhere else.  if it's outside stack and is an invalid location, maybe you should further enlarge the stack space.  except for the process.StackSize, i think maybe you should also modify the process.stackBase.

Please let me know if you fix this issue.

Regards,
S2K.

At 2024-01-15 09:25:02, "muke101 via gem5-users" gem5-users@gem5.org wrote:

Hi, thanks for the reply.

Assuming that the page isn't being allocated when it should be, what could I do with this information? I'm not familiar with this part of the Gem5 codebase.

Also, just to be clear, what do you mean by 'when the image download is done'?

Thanks.

Sent from Proton Mail mobile

-------- Original Message --------
On 15 Jan 2024, 01:18, sun2k23 via gem5-users wrote:

Hi,

I have ever met several similar issues but i'm not running Spec2017. I think that's because you use this address 0x660000195c before allocating page firstly. You can enable the debug-flag of MMU to check whether the page related to this address is allocated when the image download is done.

S2k

At 2024-01-14 04:40:13, "muke101 via gem5-users" gem5-users@gem5.org wrote:

Hi, I'm trying to checkpoint spec2017 with NonCachingSimpleCPU according to simpoints I generated from native AArch64 binaries. On an unmodified Gem5, a page fault occurs in the simulation (pasted the error message at the bottom).

I've read that GCC can run out of stack space when simulated, so I edited arch/arm/process.cc to have a max stack size of 16MB instead of the default 8MB, but this hasn't helped. Can someone confirm this is the right way to increase the max stack size? The total memory allowed to the simulation if 50GB (set if --mem-size on the command line).

Another possibility is that it's due to me compiling the benchmark without '--no-strict-aliasing', which the spec documentation suggests to, but I imagine if this were the problem then the program wouldn't had been able to run natively, which it does.

Has anyone run into this problem before and been able to solve it?

Thanks.

build/ARM/sim/simulate.cc:194: info: Entering event queue @ 188431954680500.  Starting simulation...
build/ARM/sim/faults.cc:104: panic: panic condition !handled && !tc->getSystemPtr()->trapToGdb(SIGSEGV, tc->contextId()) occurred: Page table fault when accessing virtual address 0x660000195c
Memory Usage: 52735052 KBytes
Program aborted at tick 191641021589000
--- BEGIN LIBC BACKTRACE ---
/sim_home/luke/PND-Loads/gem5/build/ARM/gem5.fast(+0x97d088)[0xaaaaca97c088]
/sim_home/luke/PND-Loads/gem5/build/ARM/gem5.fast(+0x9ef1bc)[0xaaaaca9ee1bc]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x4000161c2688]
/lib/aarch64-linux-gnu/libc.so.6(raise+0xb0)[0x4000164c2598]
--- END LIBC BACKTRACE ---
Aborted (core dumped)

Sent with Proton Mail secure email.

M
muke101
Wed, Jan 17, 2024 6:53 PM

Ah, I was trying to do this from arch/arm/process.cc, but I've found sim/Process.py which seems to actually control what the value is (at least when using the se.py config).

It turns out Process.py already sets a 64MB stack by default. This suggests the GCC crash isn't due to running out of stack after all, since 64MB is likely already enough. To be certain, I've spun up another run with a stack size of 1GB, but I still expect it to crash.

Is there anything else I can look into in the meantime to try and fix this? I've also confirmed that compiling with -fno-strict-aliasing doesn't fix the problem.


On Wednesday, 17 January 2024 at 02:50, sun2k23 via gem5-users gem5-users@gem5.org wrote:

You may have Python code like the following in your *.py configuration file:
process = Process(...)
process.executable = ...

If you want to update the stack base, just add configuration like this:
process.stackBase = 0x....  # new stack base value
process.maxStackSize = 0x....  # stack size

Give it a try; I hope it helps.

S2K

At 2024-01-17 01:22:30, "muke101 via gem5-users" gem5-users@gem5.org wrote:

I see, thanks for the explanation.

Could you elaborate on how I should change the stack base address to increase the available size? I had thought setting a higher address would, but GCC immediately crashes if I do this. If I set a lower address it works, but does this provide more space?

Also, is the mmap_end variable relevant at all?

Sent from Proton Mail mobile

-------- Original Message --------
On 16 Jan 2024, 06:07, sun2k23 wrote:

Hi,

Usually in SE mode, pages are pre-allocated when the ELF file is loaded into memory (this is what I previously called the image download). You can check which pages are allocated by enabling the MMU debug flag.

I would suggest checking the faulting address to confirm whether its page is allocated. It probably is not, in which case you should check whether the address lies inside the stack region or somewhere else. If it is outside the stack and at an invalid location, you may need to enlarge the stack space further. Besides process.StackSize, I think you may also need to modify process.stackBase.
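
As a rough illustration of the check described above, here is a minimal Python sketch that rounds the faulting address down to its page and tests whether it lies inside a downward-growing stack region. The 4 KiB page size and the STACK_BASE value are assumptions made for the example, not values confirmed anywhere in this thread; substitute whatever your gem5 build and configuration actually use. The helper names are hypothetical and are not part of gem5.

```python
# Sketch only: page size and stack base are assumptions, not gem5 facts.
MIB = 1024 * 1024
GIB = 1024 * MIB
PAGE_SIZE = 4096  # assumed 4 KiB pages


def page_base(addr, page_size=PAGE_SIZE):
    """Round an address down to the start of the page that contains it."""
    return addr & ~(page_size - 1)


def in_stack(addr, stack_base, max_stack_size):
    """Return True if addr lies in [stack_base - max_stack_size, stack_base)."""
    return stack_base - max_stack_size <= addr < stack_base


FAULT_ADDR = 0x660000195C  # address reported in the panic message
STACK_BASE = 0x7FFFFF0000  # hypothetical value; replace with your own

print("faulting page starts at", hex(page_base(FAULT_ADDR)))
for size in (64 * MIB, GIB):
    # Report whether the fault falls inside a stack of this maximum size.
    print(size // MIB, "MiB stack contains fault:",
          in_stack(FAULT_ADDR, STACK_BASE, size))
```

If the address falls outside the stack region for every stack size you try, enlarging the stack is unlikely to help, and the fault is more likely in some other region such as the heap or an mmap area.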

Please let me know if you fix this issue.

Regards,
S2K.

At 2024-01-15 09:25:02, "muke101 via gem5-users" gem5-users@gem5.org wrote:

Hi, thanks for the reply.

Assuming that the page isn't being allocated when it should be, what could I do with this information? I'm not familiar with this part of the Gem5 codebase.

Also, just to be clear, what do you mean by 'when the image download is done'?

Thanks.

Sent from Proton Mail mobile

-------- Original Message --------
On 15 Jan 2024, 01:18, sun2k23 via gem5-users wrote:

Hi,

I have run into several similar issues, though I'm not running Spec2017. I think this happens because the address 0x660000195c is used before its page has been allocated. You can enable the MMU debug flag to check whether the page containing this address is allocated once the image download is done.

S2k

At 2024-01-14 04:40:13, "muke101 via gem5-users" gem5-users@gem5.org wrote:

Hi, I'm trying to checkpoint spec2017 with NonCachingSimpleCPU according to simpoints I generated from native AArch64 binaries. On an unmodified Gem5, a page fault occurs in the simulation (pasted the error message at the bottom).

I've read that GCC can run out of stack space when simulated, so I edited arch/arm/process.cc to have a max stack size of 16MB instead of the default 8MB, but this hasn't helped. Can someone confirm this is the right way to increase the max stack size? The total memory allowed to the simulation is 50GB (set with --mem-size on the command line).

Another possibility is that it's due to me compiling the benchmark without '-fno-strict-aliasing', which the spec documentation suggests using, but I imagine if this were the problem then the program wouldn't have been able to run natively, which it does.

Has anyone run into this problem before and been able to solve it?

Thanks.

build/ARM/sim/simulate.cc:194: info: Entering event queue @ 188431954680500. Starting simulation...
build/ARM/sim/faults.cc:104: panic: panic condition !handled && !tc->getSystemPtr()->trapToGdb(SIGSEGV, tc->contextId()) occurred: Page table fault when accessing virtual address 0x660000195c
Memory Usage: 52735052 KBytes
Program aborted at tick 191641021589000
--- BEGIN LIBC BACKTRACE ---
/sim_home/luke/PND-Loads/gem5/build/ARM/gem5.fast(+0x97d088)[0xaaaaca97c088]
/sim_home/luke/PND-Loads/gem5/build/ARM/gem5.fast(+0x9ef1bc)[0xaaaaca9ee1bc]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x4000161c2688]
/lib/aarch64-linux-gnu/libc.so.6(raise+0xb0)[0x4000164c2598]
--- END LIBC BACKTRACE ---
Aborted (core dumped)

Sent with Proton Mail secure email.
