gem5-users@gem5.org

The gem5 Users mailing list

Alpha ROB, IQ, LSQ overflow, or something related to `cpu->instcount <= 1500' failed

Chao Chen
Tue, May 7, 2013 10:10 PM

Hi,

I am using the up-to-date gem5 version (repo.gem5.org/gem5) and running
the ALPHA arch in FS mode.
The system has many stalls due to a lack of free ROB, IQ, or LSQ entries,
although the simulation still proceeds slowly.
I increased the ROB and IQ to 4048 entries each, but the stalls didn't
disappear. I also increased the Load Queue to 4048 entries (I know this
may not be reasonable), but I cannot increase the Store Queue beyond 128
entries; if I do, the simulation halts.

I tried debugging with gem5.debug.
The simulation aborts with "void BaseDynInst<
<template-parameter-1-1> >::initVars() [with Impl = O3CPUImpl]:
Assertion `cpu->instcount <= 1500' failed."
This was ignored when I used gem5.fast earlier.

I see from other posts that this means the number of in-flight
instructions has grown beyond a reasonable limit.

There are a few things I cannot understand.

  1. Is it reasonable for the number of in-flight instructions to increase
    when I use large ROB and IQ sizes? I was expecting that the in-flight
    instruction count would not increase and that the stalls caused by the
    lack of buffer entries would disappear.
  2. I tried to debug it myself and found that some instructions are not
    destroyed after committing. It seems they are never destroyed and are
    kept in some queue (I don't know which one).
    For most (regular) instructions, the instruction is destroyed when
    "instList.erase(removeList.front())" is called in o3/cpu.cc.
    Some instructions, however, are not destroyed after
    instList.erase is called. While debugging, I found that the reference
    count of the instruction object does not reach zero, and the system
    destroys an instruction object only when nothing references (uses)
    it any more. As I understand it, something in the pipeline must not
    be dropping its reference correctly, or some special operation (like
    a squash) does not call the functions that remove the reference.
  3. I am not sure how this would cause the ROB and IQ to overflow (no
    matter how large I make these buffers).
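
The behaviour described in item 2 can be reproduced in isolation. The following is only a minimal sketch, not gem5 code: it assumes std::shared_ptr as a stand-in for gem5's own reference-counted instruction pointer, and ToyPipeline, strayQueue, and liveAfter are hypothetical names invented for illustration. It shows that erasing an instruction from the master list does not destroy it while any other structure still holds a reference, so the live-object count keeps growing.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <vector>

// Stand-in for a dynamic instruction. liveCount plays the role of a
// per-CPU counter of instruction objects that currently exist.
struct DynInst {
    static int liveCount;
    int seqNum;
    explicit DynInst(int n) : seqNum(n) { ++liveCount; }
    ~DynInst() { --liveCount; }
};
int DynInst::liveCount = 0;

// Assumption: shared_ptr replaces the simulator's own smart pointer.
using DynInstPtr = std::shared_ptr<DynInst>;

// Hypothetical pipeline: instList is the master list of instructions,
// strayQueue is some other structure that also keeps a reference.
struct ToyPipeline {
    std::list<DynInstPtr> instList;
    std::vector<DynInstPtr> strayQueue;

    void fetch(int seqNum, bool alsoQueued) {
        auto inst = std::make_shared<DynInst>(seqNum);
        instList.push_back(inst);
        if (alsoQueued) {
            strayQueue.push_back(inst);  // second owner, never released
        }
    }

    // Retire the oldest instruction: drops only the instList reference.
    void retireOldest() { instList.pop_front(); }
};

// Fetch and immediately retire n instructions; every leakEvery-th one is
// also placed in strayQueue. Returns how many objects are still alive
// after the last retire (leakEvery == 0 means no leaks).
int liveAfter(int n, int leakEvery) {
    ToyPipeline p;
    for (int i = 1; i <= n; ++i) {
        p.fetch(i, leakEvery > 0 && i % leakEvery == 0);
        p.retireOldest();
    }
    return DynInst::liveCount;
}
```

In this toy, liveAfter(100, 0) leaves no live objects, while liveAfter(100, 10) leaves ten, one for each instruction that strayQueue still references. In the real simulator, the equivalent of strayQueue would be whichever pipeline structure fails to drop its pointer.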

Has anyone seen this kind of issue before? Is it really a bug, or
did I do something wrong? If it is a bug, can you give me a hint for
finding which part fails to decrement the reference count of the
instruction object? Any suspects?

Thanks,
Chao

Ali Saidi
Fri, May 24, 2013 3:40 AM

On May 7, 2013, at 5:10 PM, Chao Chen <chen9810(a)gmail.com> wrote:

> Hi,
>
> I am using the up-to-date gem5 version (repo.gem5.org/gem5) and running
> the ALPHA arch in FS mode.
> The system has many stalls due to a lack of free ROB, IQ, or LSQ entries,
> although the simulation still proceeds slowly.
> I increased the ROB and IQ to 4048 entries each, but the stalls didn't
> disappear. I also increased the Load Queue to 4048 entries (I know this
> may not be reasonable), but I cannot increase the Store Queue beyond 128
> entries; if I do, the simulation halts.
>
> I tried debugging with gem5.debug.
> The simulation aborts with "void BaseDynInst<
> <template-parameter-1-1> >::initVars() [with Impl = O3CPUImpl]:
> Assertion `cpu->instcount <= 1500' failed."
> This was ignored when I used gem5.fast earlier.

Since you've increased the size of the ROB to 4K, you'll need to increase this limit too.
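
As a rough illustration of why a fixed cap and a 4K-entry ROB conflict, here is a minimal sketch. It is not gem5 code: it assumes the counter roughly tracks instructions occupying the ROB and ignores the front-end buffers, and peakInFlight and exceedsLimit are hypothetical helpers. Only the value 1500 comes from the assertion message.

```cpp
#include <algorithm>
#include <cassert>

// Value taken from the failing assertion `cpu->instcount <= 1500'.
constexpr int kInstCountLimit = 1500;

// Toy model: commit is stalled, so fetch keeps allocating fetchWidth
// instructions per cycle until the ROB is full. Returns the number of
// instructions in flight after the given number of cycles.
int peakInFlight(int robEntries, int cycles, int fetchWidth) {
    int inFlight = 0;
    for (int c = 0; c < cycles; ++c) {
        inFlight = std::min(robEntries, inFlight + fetchWidth);
    }
    return inFlight;
}

// True if this toy workload would push the count past the fixed cap.
bool exceedsLimit(int robEntries, int cycles, int fetchWidth) {
    return peakInFlight(robEntries, cycles, fetchWidth) > kInstCountLimit;
}
```

With 192 ROB entries the toy never gets near the cap, whereas with 4048 entries a long enough stall fills the ROB past 1500, so the check fires even when nothing is leaking.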

> I see from other posts that this means the number of in-flight
> instructions has grown beyond a reasonable limit.
>
> There are a few things I cannot understand.
>
>   1. Is it reasonable for the number of in-flight instructions to increase
>     when I use large ROB and IQ sizes? I was expecting that the in-flight
>     instruction count would not increase and that the stalls caused by the
>     lack of buffer entries would disappear.
>   2. I tried to debug it myself and found that some instructions are not
>     destroyed after committing. It seems they are never destroyed and are
>     kept in some queue (I don't know which one).
>     For most (regular) instructions, the instruction is destroyed when
>     "instList.erase(removeList.front())" is called in o3/cpu.cc.

I don't know what ISA you're using, but rather recently a bug was fixed in ARM and x86 that might be the problem.

>     Some instructions, however, are not destroyed after
>     instList.erase is called. While debugging, I found that the reference
>     count of the instruction object does not reach zero, and the system
>     destroys an instruction object only when nothing references (uses)
>     it any more. As I understand it, something in the pipeline must not
>     be dropping its reference correctly, or some special operation (like
>     a squash) does not call the functions that remove the reference.
>   3. I am not sure how this would cause the ROB and IQ to overflow (no
>     matter how large I make these buffers).

> Has anyone seen this kind of issue before? Is it really a bug, or
> did I do something wrong? If it is a bug, can you give me a hint for
> finding which part fails to decrement the reference count of the
> instruction object? Any suspects?
>
> Thanks,
> Chao
> _______________________________________________
> gem5-users mailing list
> gem5-users(a)gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
