Hello all,
I'm looking for some help regarding failing simulations. My host system is running Ubuntu 20.04 on x86, and the gem5 source version is v21.2.1.1.
I have started running long simulations and have run into a problem with the O3CPU model. I have no issues running up to around 5 billion instructions, and some binaries reach 10 billion, but not a single binary reaches 20 billion, because the O3CPU model terminates the gem5 simulation with the following error:
build/ARM/cpu/o3/lsq_unit.cc:1018: panic: Is stalled should have been cleared by stalling load!
This happens in both full-system and syscall-emulation mode. I am using AArch64. It does not happen with the atomic or timing CPU models.
I am trying to run a selection of SPEC CPU 2017 benchmarks, namely:
507.cactuBSSN_r, 607.cactuBSSN_s, 638.imagick_s, 500.perlbench_r_param3, 502.gcc_r_param4, 641.leela_s
All of the above benchmarks run successfully up to 5B instructions, imagick (638.imagick_s) is the only one that runs up to 10B, and none of them reach 20B. This is the case for both FS and SE simulations.
The O3 model code in question is in src/cpu/o3/lsq_unit.cc, but I don't know enough to try and debug this issue:
1014        // I don't think this can happen. It should have been cleared
1015        // by the stalling load.
1016        if (isStalled() &&
1017            storeQueue.back().instruction()->seqNum == stallingStoreIsn) {
1018            panic("Is stalled should have been cleared by stalling load!\n");
1019            stalled = false;
1020            stallingStoreIsn = 0;
1021        }
Below are the details of some example simulation runs that fail, alongside the full stderr output.
AArch64 full system (restoring from a checkpoint made with the atomic CPU):
Config: configs/example/fs.py
System args:
--mem-size="8000MB" \
--cpu-type="O3CPU" \
--restore-with-cpu="O3CPU" \
--caches \
--l2cache \
-I="10000000000" \
Error:
build/ARM/cpu/o3/lsq_unit.cc:1018: panic: Is stalled should have been cleared by stalling load!
Memory Usage: 8979848 KBytes
Program aborted at tick 5929446216500
--- BEGIN LIBC BACKTRACE ---
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1a94570)[0x55a0cd625570]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1aa8f4e)[0x55a0cd639f4e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x143c0)[0x7f70b6d913c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f70b5f3703b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f70b5f16859]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x576a35)[0x55a0cc107a35]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb2c153)[0x55a0cc6bd153]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb0dd36)[0x55a0cc69ed36]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb0eaa3)[0x55a0cc69faa3]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb0ffbe)[0x55a0cc6a0fbe]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xaedc7f)[0x55a0cc67ec7f]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1a9ca68)[0x55a0cd62da68]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1abf564)[0x55a0cd650564]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1ac022e)[0x55a0cd65122e]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb92c22)[0x55a0cc723c22]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xa6a1cd)[0x55a0cc5fb1cd]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x2a8738)[0x7f70b7048738]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x8dd8)[0x7f70b6e1df48]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7f70b6f6ae3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7f70b7048114]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7f70b6e14d6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7d86)[0x7f70b6e1cef6]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x8006b)[0x7f70b6e2006b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7f70b6e14d6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x12fd)[0x7f70b6e1646d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7f70b6f6ae3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7f70b7048114]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7f70b6e14d6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7d86)[0x7f70b6e1cef6]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7f70b6f6ae3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCodeEx+0x42)[0x7f70b6f6b1c2]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCode+0x1f)[0x7f70b6f6b5af]
--- END LIBC BACKTRACE ---
AArch64 syscall emulation:
Config: configs/example/se.py
System args:
--mem-size="6000MB" \
--cpu-type="O3CPU" \
--caches \
--l2cache \
-I="10000000000" \
--cmd="cpugcc_r_base.default" \
--options="ref32.c -O5 -o ref32.opts-O5.s" \
Output:
build/ARM/sim/syscall_emul.hh:1313: warn: mremapping to new vaddr 0x3fff082000-0x3fff183000, adding 524288
build/ARM/sim/syscall_emul.hh:1332: warn: returning 0x3fff082000 as start
build/ARM/sim/syscall_emul.hh:1313: warn: mremapping to new vaddr 0x3ffef81000-0x3fff082000, adding 524288
build/ARM/sim/syscall_emul.hh:1332: warn: returning 0x3ffef81000 as start
build/ARM/cpu/o3/lsq_unit.cc:1018: panic: Is stalled should have been cleared by stalling load!
Memory Usage: 6447992 KBytes
Program aborted at tick 5330662028500
--- BEGIN LIBC BACKTRACE ---
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1a94570)[0x562435997570]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1aa8f4e)[0x5624359abf4e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x143c0)[0x7fccfddbd3c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fccfcf6303b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fccfcf42859]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x576a35)[0x562434479a35]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb2c153)[0x562434a2f153]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb0dd36)[0x562434a10d36]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb0eaa3)[0x562434a11aa3]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb0ffbe)[0x562434a12fbe]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xaedc7f)[0x5624349f0c7f]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1a9ca68)[0x56243599fa68]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1abf564)[0x5624359c2564]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0x1ac022e)[0x5624359c322e]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xb92c22)[0x562434a95c22]
/work/a00558011/gem5-internal/build/ARM/gem5.fast(+0xa6a1cd)[0x56243496d1cd]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x2a8738)[0x7fccfe074738]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x8dd8)[0x7fccfde49f48]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7fccfdf96e3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7fccfe074114]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7fccfde40d6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7d86)[0x7fccfde48ef6]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x8006b)[0x7fccfde4c06b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7fccfde40d6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x12fd)[0x7fccfde4246d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7fccfdf96e3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7fccfe074114]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x74d6d)[0x7fccfde40d6d]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x7d86)[0x7fccfde48ef6]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7fccfdf96e3b]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCodeEx+0x42)[0x7fccfdf971c2]
/lib/x86_64-linux-gnu/libpython3.8.so.1.0(PyEval_EvalCode+0x1f)[0x7fccfdf975af]
--- END LIBC BACKTRACE ---
Any help with this issue would be sincerely appreciated. Thanks in advance,
Norbert K
Hi Norbert,
This is going to be a tough bug to track down! I would suggest enabling the
Exec debug flag (and maybe some others for the O3CPU) and using
--debug-start and starting the debug dumping a billion ticks or so before
the error happens. Hopefully, you can then trace back what is causing the
error. My bet is that there is some specific sequence of instructions which
is causing this.
This isn't an error I've seen before, so it's going to take some digging :).
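As a rough sketch of that setup, the snippet below subtracts a billion ticks from the abort tick in the log and assembles a traced re-run command. The helper names are made up for illustration, the LSQUnit flag is only a guess at a useful O3 flag to add next to Exec, and the trace file name is arbitrary. It also assumes a gem5.opt build: the backtraces above come from gem5.fast, which as far as I know is compiled without trace output.

```python
# Sketch only: derive a --debug-start tick from the reported abort tick and
# assemble a gem5 command line with tracing enabled. Nothing is executed.

TICKS_BEFORE_PANIC = 1_000_000_000  # "a billion ticks or so"; tune as needed


def debug_start_tick(abort_tick, window=TICKS_BEFORE_PANIC):
    """Tick at which tracing should begin, clamped so it never goes below 0."""
    return max(0, abort_tick - window)


def gem5_debug_command(binary, config, abort_tick,
                       flags=("Exec", "LSQUnit"),  # LSQUnit is an assumption
                       trace_file="trace.out",      # arbitrary file name
                       extra_args=()):
    """Return the argv list for a traced re-run of the failing simulation."""
    return [
        binary,
        "--debug-flags=" + ",".join(flags),
        "--debug-start=" + str(debug_start_tick(abort_tick)),
        "--debug-file=" + trace_file,
        config,
        *extra_args,
    ]


if __name__ == "__main__":
    # Abort tick taken from the SE-mode log in the original message.
    cmd = gem5_debug_command(
        "build/ARM/gem5.opt",
        "configs/example/se.py",
        5330662028500,
        extra_args=["--cpu-type=O3CPU", "--caches", "--l2cache"],
    )
    print(" ".join(cmd))
```

The gem5 options go before the config script and the script's own arguments after it. If the trace turns out too large or too short, adjusting the window is the only change needed.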
Cheers,
Jason
On Thu, Jun 23, 2022 at 5:52 AM Norbertas Kremeris <
norbertas.kremeris@huawei.com> wrote:
gem5-users mailing list -- gem5-users@gem5.org