gem5-dev@gem5.org

The gem5 Developer List

Possible MemCtrl bug. DRAM interface writeQueueSize not being drained.

Olumide
Mon, Jul 28, 2025 11:50 PM

Hi-

I'm repeatedly encountering situations where the MemCtrl stops
scheduling requests even though mem_intr->writeQueueSize is not zero.

Unfortunately I can reproduce it neither in an MWE nor by capturing and
playing back a packet sequence. (As a last resort I may turn to
time-travel debugging with the rr utility.)

The problem:
In my simulations the path of execution often ends up on line 964 of
mem_ctrl.cc
https://github.com/gem5/gem5/blob/stable/src/mem/mem_ctrl.cc#L964

where no more events are scheduled even though

(gdb) p mem_intr->writeQueueSize
$1 = 15

And

(gdb) call drainState()
$2 = gem5::DrainState::Running

(gdb) p writeLowThreshold
$3 = 32

(gdb) p mem_intr->writeQueueSize > writeLowThreshold
$4 = false

(gdb) p mem_intr->readQueueSize
$5 = 0

(gdb) p mem_intr->busState
$6 = gem5::memory::qos::MemCtrl::READ
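
The values above can be put into a small toy model to make the suspected stall condition explicit. This is a minimal sketch, not gem5 code: the type and function names (ToyDrainState, ToyQueues, switchesToWrites, isStalled) are hypothetical, and the read-side rule is only an assumption inferred from the expressions inspected in gdb.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration only; these are NOT gem5's types.
enum class ToyDrainState { Running, Draining };

struct ToyQueues {
    std::uint32_t readQueueSize;
    std::uint32_t writeQueueSize;
    std::uint32_t writeLowThreshold;
    ToyDrainState drain;
};

// Assumed read-side rule: with no reads pending, the controller turns the
// bus around to writes only if it is draining or the write queue is above
// the low threshold.
bool switchesToWrites(const ToyQueues& q)
{
    assert(q.readQueueSize == 0 && "sketch only models an empty read queue");
    return q.writeQueueSize != 0 &&
           (q.drain == ToyDrainState::Draining ||
            q.writeQueueSize > q.writeLowThreshold);
}

// True when writes are pending but the rule above schedules nothing.
bool isStalled(const ToyQueues& q)
{
    return q.writeQueueSize != 0 && !switchesToWrites(q);
}
```

Under this assumed rule, the observed state (no reads, 15 writes, threshold 32, Running) counts as stalled, while the same queue would be serviced if the controller were draining or held more than 32 writes.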

I suspect the culprit is line 1127 of mem_ctrl.cc
https://github.com/gem5/gem5/blob/stable/src/mem/mem_ctrl.cc#L1127

The test:

(below_threshold && drainState() != DrainState::Draining)

returns true and switches the next bus state to READ even though
mem_intr->readQueueSize == 0.

When I explicitly force

mem_intr->busStateNext = MemCtrl::WRITE

MemCtrl processes one write message but sets the next bus state to
MemCtrl::READ. As such I've had to repeatedly force the next state to
MemCtrl::WRITE until mem_intr->writeQueueSize == 0.
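
The forced-write workaround described above can be mimicked with a toy model. Again this is only a sketch under stated assumptions, not gem5's implementation: ToyCtrl, serviceOneWrite and forceDrain are hypothetical names, and below_threshold is assumed to mean that the write queue is under writeLowThreshold.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration only; these are NOT gem5's types.
enum class ToyBusState { READ, WRITE };
enum class ToyDrainState { Running, Draining };

struct ToyCtrl {
    std::uint32_t writeQueueSize;
    std::uint32_t writeLowThreshold;
    ToyDrainState drain;
    ToyBusState busStateNext;
};

// Issue one write, then apply the turnaround test quoted in the message.
void serviceOneWrite(ToyCtrl& c)
{
    assert(c.writeQueueSize > 0);
    --c.writeQueueSize;
    // Assumption: below_threshold compares the queue to the low threshold.
    const bool below_threshold = c.writeQueueSize < c.writeLowThreshold;
    if (below_threshold && c.drain != ToyDrainState::Draining) {
        // Turn the bus back to reads, regardless of pending reads.
        c.busStateNext = ToyBusState::READ;
    }
}

// Hypothetical stand-in for repeatedly forcing busStateNext by hand.
// Returns how many times WRITE had to be forced to empty the queue.
std::uint32_t forceDrain(ToyCtrl& c)
{
    std::uint32_t forced = 0;
    while (c.writeQueueSize > 0) {
        c.busStateNext = ToyBusState::WRITE;  // manual override
        ++forced;
        serviceOneWrite(c);
    }
    return forced;
}
```

Starting from 15 queued writes with a threshold of 32 in the Running state, this model needs one forced override per write, which matches the behaviour reported above. In the Draining state the same test leaves busStateNext at WRITE.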

Please advise.

Regards,

- Olumide
Jason Lowe-Power
Tue, Jul 29, 2025 3:20 PM

Hi Olumide,

This sounds like a potential bug, though I'm a bit surprised this hasn't
been found before. I assume this is happening when you're trying to take a
checkpoint since it's occurring during the drain state.

Could you open an issue on the gem5 GitHub issues page?
https://github.com/gem5/gem5/issues

Thanks!
Jason

gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-leave@gem5.org

Olumide
Tue, Jul 29, 2025 4:32 PM

Hi Jason,

Thanks for your reply.

I've succeeded in creating a packet sequence that replicates the problem
or something similar.

Basically the MemCtrl is stuck in a loop ignoring WRITEs and waiting for
READs that will never arrive.

In an hour or two, I'll isolate the sim object illustrating the problem
and post it as a GitHub issue.

Regards,

- Olumide

Olumide
Tue, Jul 29, 2025 9:45 PM

Done
https://github.com/gem5/gem5/issues/2493

