gem5-users@gem5.org

The gem5 Users mailing list

Debugging Gem5 Segmentation Fault in x86 Decoder

Saileshwar, Gururaj
Mon, Jun 8, 2020 11:02 PM

Hi All,

I am trying to run SPEC-2017 benchmarks (compiled with Clang-3.5) on Gem5 in SE mode (unmodified stable version v20).

However, I am getting segmentation faults in the x86 decoder of Gem5 (tcmalloc reports an attempt to free an invalid pointer).
The segmentation fault arises from a line in an auto-generated file for the x86 instruction decoder.

The exact line where I get the error is decode-method.cc.inc:20696:
return new X86Macroop::PSLLW_XMM_M(machInst, EmulEnv((MODRM_REG | (REX_R << 3)),
0,
OPSIZE,
ADDRSIZE,
STACKSIZE));

I am confused whether the compiler is generating an unsupported instruction or whether there is a bug in the Gem5 x86 Decoder. I was hoping for some help to debug this. Please see the stack-trace from GDB after Gem5 crashes, below.

Thanks,
Gururaj

ERROR: src/tcmalloc.cc:283 Attempt to free invalid pointer 0x5555564c83f4

------- STACK TRACE ---------
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff5e5b801 in __GI_abort () at abort.c:79
#2  0x00007ffff6b6afb9 in tcmalloc::Log(tcmalloc::LogMode, char const*, int, tcmalloc::LogItem, tcmalloc::LogItem, tcmalloc::LogItem, tcmalloc::LogItem) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#3  0x00007ffff6b5ed49 in ?? () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#4  0x00007ffff6b7e6b9 in tc_free () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#5  0x000055555602fead in X86ISA::Decoder::decodeInst (this=this@entry=0x555557d1ac40, machInst=...) at build/X86/arch/x86/generated/decode-method.cc.inc:20696
#6  0x0000555555f98d04 in X86ISA::Decoder::decode (this=this@entry=0x555557d1ac40, mach_inst=..., addr=<optimized out>) at build/X86/arch/x86/decoder.cc:686
#7  0x0000555555f98ffb in X86ISA::Decoder::decode (this=0x555557d1ac40, nextPC=...) at build/X86/arch/x86/decoder.cc:731
#8  0x0000555555b79eff in DefaultFetch<O3CPUImpl>::fetch (this=this@entry=0x55555888b428, status_change=@0x7fffffffcbe5: true) at build/X86/cpu/o3/fetch_impl.hh:1297
#9  0x0000555555b7b28b in DefaultFetch<O3CPUImpl>::tick (this=this@entry=0x55555888b428) at build/X86/cpu/o3/fetch_impl.hh:930
#10 0x0000555555b52aab in FullO3CPU<O3CPUImpl>::tick (this=0x55555888b000) at build/X86/cpu/o3/cpu.cc:531

decode-method.cc.inc:20696:
return new X86Macroop::PSLLW_XMM_M(machInst, EmulEnv((MODRM_REG | (REX_R << 3)),
0,
OPSIZE,
ADDRSIZE,
STACKSIZE));

This line is called from build/X86/arch/x86/decoder.cc:696 :
StaticInstPtr
Decoder::decode(ExtMachInst mach_inst, Addr addr)
{
auto iter = instMap->find(mach_inst);
if (iter != instMap->end())
return iter->second;

=>    StaticInstPtr si = decodeInst(mach_inst);

Some additional details:
-- Error only occurs when my benchmarks are compiled with Clang-3.5 (have tried multiple other versions of clang including 5,8,10; same error). Error does not occur when benchmarks are compiled with GCC 6.
-- Error only occurs with gem5.opt. gem5.debug runs without the error.
-- I am running Ubuntu 18.04, with Gem5 compiled with GCC 4.8 (error occurs when Gem5 is compiled with GCC 6 as well)

Saileshwar, Gururaj
Tue, Jun 9, 2020 11:08 PM

Hello,

It seems the Gem5 segmentation fault I am getting is due to an SSE2 instruction.

So I tried disabling SSE2 by compiling my benchmark with the -mno-sse2 compiler flag, but with this flag my benchmark refused to compile. It looks like my shared libraries, such as libc and libm, use SSE registers (e.g. for floating-point operations). I am also seeing this segmentation fault for benchmarks compiled with GCC 4.8, so I guess this is not specific to compiling benchmarks with Clang, as I thought earlier.

I am confused about how others have not run into this before, since I believe SSE2 is pretty old and is assumed as a given in the x86-64 ISA.

The only workaround I can think of is compiling my glibc without SIMD instructions (I am not quite sure how I would do that). But is this something that the Gem5 experts usually do or require?

I would appreciate thoughts and comments from the experts!

Thanks,
Gururaj

Gabe Black
Tue, Jun 9, 2020 11:28 PM

SSE2 should work with no problems. It's very unlikely this has anything to
do with the compiler you're using for your benchmark, other than that it
might coincidentally set up a scenario which exposes a bug. If your program
has a bad instruction in it somehow, the decoder should handle that just
fine and return an instruction which will produce an undefined instruction
fault if actually executed and committed. The function and line you pointed
out is allocating a new instruction object, so it's not clear why anything
would be being freed there to begin with, although that is what that
backtrace suggests is happening. The difference between opt and debug is
likely just because they put things in slightly different places and have
slightly different code which might make a memory error not cause any
visible damage, or crash the program, depending on what got corrupted.

There are a few things you can do to try to find out where the memory error
is, assuming that there is one (which I think is likely). One would be to
use GDB to try to catch the corruption as it happens with watchpoints, etc.
This might be tricky, but might be the only way to catch the error in some
circumstances.

The other slower but more automatic way would be to use a tool called
valgrind. I would suggest trying it with both the opt and debug versions of
the binary since debug will give you better, more readable output, but opt
is the one that's crashing. Be aware that valgrind runs the program
slightly differently, so it may no longer crash. To get good results, you
need to turn off tcmalloc with the --without-tcmalloc scons argument.
valgrind works in part by replacing/instrumenting certain heap related
functions, and if you've replaced them with tcmalloc it doesn't work any
more. Also, you'll want to pass valgrind the suppressions
in util/valgrind-suppressions. This tells valgrind to ignore some
well-known false positives in the Python interpreter which produce a lot of
noise in the output and can drown out actual, useful messages.

So, gdb with watchpoints, breakpoints, etc, or valgrind, built with
--without-tcmalloc, run with util/valgrind-suppressions, try both opt and
debug.
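
As a rough illustration of the checklist above, the following Python sketch only assembles and prints the build and valgrind command lines for both binaries; it does not run them. The --without-tcmalloc argument, the util/valgrind-suppressions file, and the build/X86/gem5.opt and gem5.debug targets come from this thread, and --suppressions= is valgrind's standard option for a suppression file. The contents of GEM5_ARGS are hypothetical placeholders to be replaced with your usual config script and options.

```python
import shlex

# Hypothetical placeholders: replace with your own config script and options.
GEM5_ARGS = ["<config script>", "<script options>"]

SUPPRESSIONS = "util/valgrind-suppressions"  # path named in the thread


def build_command(variant):
    """scons command for one gem5 variant, built with tcmalloc turned off."""
    return ["scons", f"build/X86/gem5.{variant}", "--without-tcmalloc"]


def valgrind_command(variant):
    """valgrind command that runs one gem5 variant with the suppression file."""
    return [
        "valgrind",
        f"--suppressions={SUPPRESSIONS}",
        f"build/X86/gem5.{variant}",
        *GEM5_ARGS,
    ]


def all_commands():
    """Build and run commands for opt (the one that crashes) and debug."""
    commands = []
    for variant in ("opt", "debug"):
        commands.append(build_command(variant))
        commands.append(valgrind_command(variant))
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed and pasted into a shell.
    for command in all_commands():
        print(shlex.join(command))
```

Printing rather than executing keeps the sketch harmless; a valgrind run of gem5 can take a long time, so it is probably best started by hand once the command line looks right.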

That should give you a little more information to work with. With memory
issues, it's quite possible you're seeing a symptom crop up in a place that
has little or even nothing to do with where the actual problem is. You're
going to need to dig a bit more to find out the real cause. It can be hard
to help with these problems too, since they might magically go away when
somebody else tries the same thing with a slightly different setup.

Gabe

Saileshwar, Gururaj
Wed, Jun 10, 2020 10:52 PM

Hi Gabe,
Your pointers on debugging with Valgrind were indeed helpful! As you rightly said, the seg-fault in the decoder was not caused by an error in the decoder.

The seg-fault was happening because the PC was pointing to a stack address (e.g. 0x7fff....), so garbage instructions were being fetched and decoded. A previous ret instruction is causing the PC to be changed to the garbage value. I think the return address on the stack that this ret uses is getting corrupted at some earlier stage. Please see the debug trace below for more details.

Any suggestions on how to identify at what point in Gem5 execution the return address on the stack is getting corrupted? Is there a way I can monitor writes to this particular stack location in the virtual address space of the program using GDB, as the program executes inside Gem5 in SE mode? I don't think there is any issue with the program itself, since it is a SPEC benchmark and it runs natively without issue. I think the stack is getting corrupted only when the program is being run with Gem5.

Thanks,
Gururaj


Here is the execution trace, with the IEW debug flag turned on, from a few instructions before the segmentation fault, where the PC is redirected to a stack address.

5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Execute: Branch mispredict detected.
5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Predicted target was PC: (0x4f6cf2=>0x4f6cfa).(0=>1)
5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Execute: Redirecting fetch to PC: (0x4f6d01=>0x7fffffffe700).(2=>3)
5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Squashing from a specific instruction, PC: (0x4f6d01=>0x7fffffffe700).(2=>3)
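
For longer runs, redirects like the one above can be picked out of the IEW output automatically. The sketch below is only an illustration and assumes the log lines have exactly the layout shown in this message. STACK_FLOOR is a guessed lower bound for "looks like a stack address" (the 0x7fff.... range mentioned above) and suspicious_redirects is a made-up helper name; neither comes from gem5.

```python
import re

# Matches the "Redirecting fetch to PC" lines in the IEW output shown above.
REDIRECT = re.compile(
    r"^(?P<tick>\d+): .*\[sn:(?P<sn>\d+)\] Execute: Redirecting fetch to PC: "
    r"\((?P<pc>0x[0-9a-fA-F]+)=>(?P<target>0x[0-9a-fA-F]+)\)"
)

# Assumption: addresses at or above this value are treated as stack addresses.
STACK_FLOOR = 0x7FFF00000000


def suspicious_redirects(lines, floor=STACK_FLOOR):
    """Return (tick, sn, pc, target) for redirects whose target is >= floor."""
    hits = []
    for line in lines:
        match = REDIRECT.match(line.strip())
        if match and int(match.group("target"), 16) >= floor:
            hits.append(
                (
                    int(match.group("tick")),
                    int(match.group("sn")),
                    match.group("pc"),
                    match.group("target"),
                )
            )
    return hits


# The four IEW lines quoted in this message.
SAMPLE = [
    "5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Execute: Branch mispredict detected.",
    "5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Predicted target was PC: (0x4f6cf2=>0x4f6cfa).(0=>1)",
    "5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Execute: Redirecting fetch to PC: (0x4f6d01=>0x7fffffffe700).(2=>3)",
    "5335010724: system.switch_cpus.iew: [tid:0] [sn:3974660] Squashing from a specific instruction, PC: (0x4f6d01=>0x7fffffffe700).(2=>3)",
]

if __name__ == "__main__":
    for tick, sn, pc, target in suspicious_redirects(SAMPLE):
        print(f"tick {tick} sn {sn}: {pc} redirects fetch to {target}")
```

The sequence number of each hit can then be used to look up the responsible instruction in the Exec output.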

Using the Exec debug flag, I was able to identify that a ret instruction is the cause of the branch at [sn:3974660]:

5334246198: system.switch_cpus: T0 : 0x4f6d01    : ret
5334246198: system.switch_cpus: T0 : 0x4f6d01. 0 :  RET_NEAR : ld  t1, SS:[rsp] : MemRead :  D=0x000000000040e201 A=0x7fffffffe708
5334246198: system.switch_cpus: T0 : 0x4f6d01. 1 :  RET_NEAR : addi  rsp, rsp, 0x8 : IntAlu :  D=0x00007fffffffe710
5334246198: system.switch_cpus: T0 : 0x4f6d01. 2 :  RET_NEAR : wripi  , t1, 0 : IntAlu :
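
One way to see what touches the return-address slot, short of a GDB watchpoint, might be to filter the Exec output for accesses to that address. The sketch below is hedged accordingly: it assumes store micro-ops are printed as MemWrite with the same "D=... A=..." layout as the MemRead line above, and accesses_to is a made-up helper name. It only matches the exact address, so a wider or unaligned store that merely overlaps the slot would be missed.

```python
import re

# Matches Exec lines that end in "MemRead : D=<data> A=<address>" (or MemWrite).
MEM_ACCESS = re.compile(
    r"^(?P<tick>\d+): .*\b(?P<kind>MemRead|MemWrite) :\s+"
    r"D=(?P<data>0x[0-9a-fA-F]+) A=(?P<addr>0x[0-9a-fA-F]+)"
)


def accesses_to(lines, address):
    """Return (tick, kind, data) for every traced access to exactly `address`."""
    hits = []
    for line in lines:
        match = MEM_ACCESS.match(line.strip())
        if match and int(match.group("addr"), 16) == address:
            hits.append(
                (
                    int(match.group("tick")),
                    match.group("kind"),
                    int(match.group("data"), 16),
                )
            )
    return hits


# The Exec lines quoted in this message.
SAMPLE = [
    "5334246198: system.switch_cpus: T0 : 0x4f6d01    : ret",
    "5334246198: system.switch_cpus: T0 : 0x4f6d01. 0 :  RET_NEAR : ld  t1, SS:[rsp] : MemRead :  D=0x000000000040e201 A=0x7fffffffe708",
    "5334246198: system.switch_cpus: T0 : 0x4f6d01. 1 :  RET_NEAR : addi  rsp, rsp, 0x8 : IntAlu :  D=0x00007fffffffe710",
    "5334246198: system.switch_cpus: T0 : 0x4f6d01. 2 :  RET_NEAR : wripi  , t1, 0 : IntAlu :",
]

if __name__ == "__main__":
    for tick, kind, data in accesses_to(SAMPLE, 0x7FFFFFFFE708):
        print(f"tick {tick}: {kind} data={data:#x}")
```

Run over a full Exec log, the last MemWrite hit before the faulting ret would be a natural place to start looking for the corruption.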


From: Gabe Black <gabeblack(a)google.com>
Sent: Tuesday, June 9, 2020 7:28 PM
To: gem5 users mailing list <gem5-users(a)gem5.org>
Cc: Saileshwar, Gururaj <gururaj.s(a)gatech.edu>
Subject: Re: [gem5-users] Re: Debugging Gem5 Segmentation Fault in x86 Decoder

SSE2 should work with no problems. It's very unlikely this has anything to do with the compiler you're using for your benchmark, other than that it might coincidentally set up a scenario which exposes a bug. If your program has a bad instruction in it somehow, the decoder should handle that just fine and return an instruction which will produce an undefined instruction fault if actually executed and committed. The function and line you pointed out is allocating a new instruction object, so it's not clear why anything would be being freed there to begin with, although that is what that backtrace suggests is happening. The difference between opt and debug is likely just because they put things in slightly different places and have slightly different code which might make a memory error not cause any visible damage, or crash the program, depending on what got corrupted.

There are a few things you can do to try to find out where the memory error is, assuming that there is one (which I think is likely). One would be to use GDB to try to catch the corruption as it happens with watchpoints, etc. This might be tricky, but might be the only way to catch the error in some circumstances.

The other, slower but more automatic, way would be to use a tool called valgrind. I would suggest trying it with both the opt and debug versions of the binary, since debug will give you better, more readable output, but opt is the one that's crashing. Be aware that valgrind runs the program slightly differently, so it may no longer crash. To get good results, you need to turn off tcmalloc with the --without-tcmalloc scons argument. valgrind works in part by replacing/instrumenting certain heap-related functions, and if you've replaced them with tcmalloc it doesn't work any more. Also, you'll want to pass valgrind the suppressions in util/valgrind-suppressions. This tells valgrind to ignore some well-known false positives in the python interpreter which produce a lot of noise in the output and can drown out actual, useful messages.

So, gdb with watchpoints, breakpoints, etc, or valgrind, built with --without-tcmalloc, run with util/valgrind-suppressions, try both opt and debug.
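
The valgrind recipe above can be sketched as a small helper that assembles the two command lines. This is only an illustration under stated assumptions: the build targets build/X86/gem5.opt and build/X86/gem5.debug follow the paths in the backtrace, configs/example/se.py stands in for whatever config script and arguments are actually used, and the helper names are hypothetical. The only valgrind option used is --suppressions.

```python
import shlex


def scons_command(target="build/X86/gem5.debug"):
    """Build command with tcmalloc disabled, as suggested above."""
    return ["scons", "--without-tcmalloc", target]


def valgrind_command(gem5_binary, gem5_args,
                     suppressions="util/valgrind-suppressions"):
    """Run gem5 under valgrind with the gem5 suppressions file."""
    return ["valgrind", f"--suppressions={suppressions}",
            gem5_binary, *gem5_args]


def render(cmd):
    """Quote a command list so it can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # Placeholder config script; substitute the real script and its options.
    for binary in ("build/X86/gem5.debug", "build/X86/gem5.opt"):
        print(render(scons_command(binary)))
        print(render(valgrind_command(binary, ["configs/example/se.py"])))
```

Printing the commands rather than executing them keeps the sketch harmless; a full valgrind run of a SPEC benchmark is slow, so it is worth checking the command line before starting one.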

That should give you a little more information to work with. With memory issues, it's quite possible you're seeing a symptom crop up in a place that has little or even nothing to do with where the actual problem is. You're going to need to dig a bit more to find out the real cause. It can be hard to help with these problems too, since they might magically go away when somebody else tries the same thing with a slightly different setup.

Gabe

On Tue, Jun 9, 2020 at 4:09 PM Saileshwar, Gururaj via gem5-users <gem5-users(a)gem5.org> wrote:
Hello,

It seems like the Gem5 segmentation fault I am getting is due to an SSE2 instruction.

So I tried disabling SSE2 by compiling my benchmark with the -mno-sse2 compiler flag, but with this flag my benchmark refused to compile. It looks like my shared libraries, such as libc and libm, use SSE registers (e.g. for floating-point operations). I am also seeing this segmentation fault for benchmarks compiled with GCC 4.8, so I guess this is not specific to compiling benchmarks with Clang, as I thought earlier.

I am confused about how others have not run into this before, since I believe SSE2 is pretty old and is assumed as a given in the x86-64 ISA.

The only workaround I can think of is compiling my glibc without SIMD instructions (I am not quite sure how I would do that). But is this something that the Gem5 experts usually do or require?

I would appreciate thoughts and comments from the experts!

Thanks,
Gururaj



From: Saileshwar, Gururaj via gem5-users <gem5-users(a)gem5.org>
Sent: Monday, June 8, 2020 7:02:18 PM
To: gem5 users mailing list <gem5-users(a)gem5.org>
Cc: Saileshwar, Gururaj <gururaj.s(a)gatech.edu>
Subject: [gem5-users] Debugging Gem5 Segmentation Fault in x86 Decoder

Hi All,

I am trying to run SPEC-2017 benchmarks (compiled with Clang-3.5), on Gem5 in SE mode (unmodified stable version v20).

However, I am getting segmentation faults in the x86 Decoder of Gem5, (free of uninitialized pointer by tcmalloc).
The segmentation fault arises from a line in an auto-generated file for the x86 instruction decoder.

The exact line where I get the error is decode-method.cc.inc:20696:
return new X86Macroop::PSLLW_XMM_M(machInst, EmulEnv((MODRM_REG | (REX_R << 3)),
                                                     0,
                                                     OPSIZE,
                                                     ADDRSIZE,
                                                     STACKSIZE));

I am confused whether the compiler is generating an unsupported instruction or whether there is a bug in the Gem5 x86 Decoder. I was hoping for some help to debug this. Please see the stack-trace from GDB after Gem5 crashes, below.

Thanks,
Gururaj

ERROR: src/tcmalloc.cc:283 Attempt to free invalid pointer 0x5555564c83f4

------- STACK TRACE ---------
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff5e5b801 in __GI_abort () at abort.c:79
#2  0x00007ffff6b6afb9 in tcmalloc::Log(tcmalloc::LogMode, char const*, int, tcmalloc::LogItem, tcmalloc::LogItem, tcmalloc::LogItem, tcmalloc::LogItem) () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#3  0x00007ffff6b5ed49 in ?? () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#4  0x00007ffff6b7e6b9 in tc_free () from /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
#5  0x000055555602fead in X86ISA::Decoder::decodeInst (this=this@entry=0x555557d1ac40, machInst=...) at build/X86/arch/x86/generated/decode-method.cc.inc:20696
#6  0x0000555555f98d04 in X86ISA::Decoder::decode (this=this@entry=0x555557d1ac40, mach_inst=..., addr=<optimized out>) at build/X86/arch/x86/decoder.cc:686
#7  0x0000555555f98ffb in X86ISA::Decoder::decode (this=0x555557d1ac40, nextPC=...) at build/X86/arch/x86/decoder.cc:731
#8  0x0000555555b79eff in DefaultFetch<O3CPUImpl>::fetch (this=this@entry=0x55555888b428, status_change=@0x7fffffffcbe5: true) at build/X86/cpu/o3/fetch_impl.hh:1297
#9  0x0000555555b7b28b in DefaultFetch<O3CPUImpl>::tick (this=this@entry=0x55555888b428) at build/X86/cpu/o3/fetch_impl.hh:930
#10 0x0000555555b52aab in FullO3CPU<O3CPUImpl>::tick (this=0x55555888b000) at build/X86/cpu/o3/cpu.cc:531

decode-method.cc.inc:20696:
return new X86Macroop::PSLLW_XMM_M(machInst, EmulEnv((MODRM_REG | (REX_R << 3)),
                                                     0,
                                                     OPSIZE,
                                                     ADDRSIZE,
                                                     STACKSIZE));

This line is called from build/X86/arch/x86/decoder.cc:696 :
StaticInstPtr
Decoder::decode(ExtMachInst mach_inst, Addr addr)
{
    auto iter = instMap->find(mach_inst);
    if (iter != instMap->end())
        return iter->second;

=>  StaticInstPtr si = decodeInst(mach_inst);

Some additional details:
-- Error only occurs when my benchmarks are compiled with Clang-3.5 (have tried multiple other versions of clang including 5,8,10; same error). Error does not occur when benchmarks are compiled with GCC 6.
-- Error only occurs with gem5.opt. gem5.debug runs without the error.
-- I am running Ubuntu 18.04, with Gem5 compiled with GCC 4.8 (error occurs when Gem5 is compiled with GCC 6 as well)


