Hi
I am using SimPoint to create checkpoints for SPEC2006 applications. Following some guidance I found online, my steps are as follows:
1. Run the benchmark on gem5 in SE mode with the options `--simpoint-profile --simpoint-interval 100000` to generate the simpoint.bb.gz file.
2. Run SimPoint with the following options to generate test.simpoint and test.weight: `-loadFVFile simpoint.bb.gz -maxK 30 -saveSimpoints test.simpoint -saveSimpointWeights test.weight -inputVectorsGzipped`
3. Rerun gem5 with the option `--take-simpoint-checkpoint=./445.gobmk-simpoint/test.simpoint,./445.gobmk-simpoint/test.weight,1000000,0` to generate checkpoints such as `cpt.simpoint_00_inst_1000000_weight_0.193939_interval_1000000_warmup_0`.
4. Restore from a checkpoint generated in step 3 with the options `--restore-simpoint-checkpoint -r 2 --checkpoint-dir ./445.gobmk-simpoint \`
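For reference, the checkpoint directory name from step 3 encodes the SimPoint index, start instruction, weight, interval and warmup. A minimal Python sketch for decoding such names follows. The regular expression is inferred only from the example name above, and `SimpointCheckpoint` and `parse_checkpoint_name` are hypothetical helper names, not part of gem5.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the example directory name in step 3.
# ASSUMPTION: this mirrors the naming scheme; it is not copied from gem5.
CPT_NAME = re.compile(
    r"cpt\.simpoint_(\d+)_inst_(\d+)_weight_([\d.eE+-]+)"
    r"_interval_(\d+)_warmup_(\d+)$"
)


class SimpointCheckpoint(NamedTuple):
    index: int        # SimPoint number
    start_inst: int   # instruction count at which the checkpoint was taken
    weight: float     # SimPoint weight
    interval: int     # interval length in instructions
    warmup: int       # warmup length in instructions


def parse_checkpoint_name(name: str) -> Optional[SimpointCheckpoint]:
    """Return the decoded fields, or None if the name does not match."""
    match = CPT_NAME.match(name)
    if match is None:
        return None
    index, start_inst, weight, interval, warmup = match.groups()
    return SimpointCheckpoint(
        int(index), int(start_inst), float(weight), int(interval), int(warmup)
    )


if __name__ == "__main__":
    example = (
        "cpt.simpoint_00_inst_1000000_weight_0.193939"
        "_interval_1000000_warmup_0"
    )
    print(parse_checkpoint_name(example))
```

Decoding the names this way makes it easy to confirm which SimPoint a given checkpoint directory corresponds to before restoring from it.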
My problem is that steps 1-3 run successfully and the benchmark's checkpoints are generated.
However, when I run step 4 with any of the checkpoints, the simulation terminates after executing ONLY 1 INSTRUCTION. Here is the output:
Resuming from ./445.gobmk-simpoint/cpt.simpoint_04_inst_90000000_weight_0.018182_interval_1000000_warmup_0
Resuming from SimPoint #4, start_inst:90000000, weight:0.018182, interval:1000000, warmup:0
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
src/arch/riscv/isa.cc:275: info: RVV enabled, VLEN = 256 bits, ELEN = 64 bits
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7003
src/sim/process.cc:396: warn: Checkpoints for pipes, device drivers and sockets do not work.
Switch at curTick count:10000
src/sim/simulate.cc:199: info: Entering event queue @ 58886401000. Starting simulation...
Switched CPUS @ tick 58886401500
switching cpus
src/sim/simulate.cc:199: info: Entering event queue @ 58886401500. Starting simulation...
src/sim/power_state.cc:105: warn: PowerState: Already in the requested power state, request ignored
src/sim/simulate.cc:199: info: Entering event queue @ 58886402000. Starting simulation...
Warmed up! Dumping and resetting stats!
src/sim/simulate.cc:199: info: Entering event queue @ 58886402500. Starting simulation...
Exiting @ tick 58886411000 because simulate() limit reached
As shown above, the simulation hits an exit event almost immediately after entering the event queue, and stats.txt shows that only 1 INSTRUCTION was executed.
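One quick way to see how far the simulation got is to compare the exit tick with the ticks at which the event queue was entered. The sketch below does this for the relevant lines of the log above. The regular expressions and the `summarize` helper are hypothetical, written only for this log format.

```python
import re

# Relevant lines copied from the log above.
LOG = """\
src/sim/simulate.cc:199: info: Entering event queue @ 58886401000. Starting simulation...
src/sim/simulate.cc:199: info: Entering event queue @ 58886401500. Starting simulation...
src/sim/simulate.cc:199: info: Entering event queue @ 58886402000. Starting simulation...
src/sim/simulate.cc:199: info: Entering event queue @ 58886402500. Starting simulation...
Exiting @ tick 58886411000 because simulate() limit reached
"""

ENTER = re.compile(r"Entering event queue @ (\d+)\.")
EXIT = re.compile(r"Exiting @ tick (\d+) because (.+)")


def summarize(log: str) -> dict:
    """Extract entry/exit ticks and the distance between them."""
    entries = [int(tick) for tick in ENTER.findall(log)]
    exit_match = EXIT.search(log)
    if not entries or exit_match is None:
        raise ValueError("log does not contain both entry and exit lines")
    exit_tick = int(exit_match.group(1))
    return {
        "first_entry": entries[0],
        "last_entry": entries[-1],
        "exit_tick": exit_tick,
        "cause": exit_match.group(2).strip(),
        "ticks_since_first_entry": exit_tick - entries[0],
        "ticks_since_last_entry": exit_tick - entries[-1],
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value}")
```

For this log, the exit lands exactly 10000 ticks after the first event-queue entry, the same number printed in the "Switch at curTick count:10000" line.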
My question is: is there something wrong with my procedure, or did I make a mistake somewhere?
I am running gem5 v23.1 and SimPoint 3.2, targeting the RISC-V architecture.
I did some debugging with pdb on configs/common/Simulation.py and found:
1. restoreSimpointCheckpoint() is executed.
2. I can run m5.simulate() in pdb to let the simulation continue past this exit:
src/sim/simulate.cc:199: info: Entering event queue @ 58886402500. Starting simulation...
Exiting @ tick 58886411000 because simulate() limit reached
Running m5.simulate() in pdb:
(Pdb) m5.simulate()
src/sim/simulate.cc:199: info: Entering event queue @ 58886449000. Starting simulation...
<_m5.event.GlobalSimLoopExitEvent object at 0x7fba289f8af0>
(Pdb) m5.simulate()
src/sim/simulate.cc:199: info: Entering event queue @ 64395564000. Starting simulation...
I don't know whether this is the correct way to resume from a checkpoint, because I expected the simulation to continue from tick 58886449000 rather than jump to 64395564000. Also, no stats are generated, even when I call m5.stats.dump() or m5.stats.reset() in pdb.
Any help would be appreciated!
Thanks.
Can you share your full command used in Step-4?
On Tue, Apr 30, 2024 at 3:07 AM shuhao ling via gem5-users <
gem5-users@gem5.org> wrote:
Hi
Sorry for the late reply. My command line is shown below. I tried restoring with MinorCPU with `--caches` and with AtomicSimpleCPU without `--caches`; both resulted in the same situation.
build/RISCV/gem5.fast \
    --outdir=/home/Documents/gem5/spec/445.gobmk-simpoint \
    configs/deprecated/example/se.py --num-cpus=1 --cpu-type=MinorCPU --caches --mem-size=512MB \
    --restore-simpoint-checkpoint -r 2 --checkpoint-dir ./spec/445.gobmk-simpoint \
    --cmd=/home/Documents/gem5/spec/run/445.gobmk/riscv-445.gobmk-input1-240423-162455/run/gobmk_base.gcc43-64bit '--options=--quiet --mode gtp' \
    --output=/home/Documents/gem5/spec/run/445.gobmk/riscv-445.gobmk-input1-240423-162455/run/13x13.out \
    --input=/home/Documents/gem5/spec/run/445.gobmk/riscv-445.gobmk-input1-240423-162455/run/13x13.tst \
    --errout=/home/Documents/gem5/spec/run/445.gobmk/riscv-445.gobmk-input1-240423-162455/run/13x13.err
I also did some debugging on the event queue and found that the CPU was actually executing TickEvents one by one before the exit event was triggered. The exit event fired when its scheduled tick was reached.
Thanks for your help
Regards
Ling
Hi
I think I found where this exit event is inserted, at configs/common/Simulation.py:751:
    else:
        print(f"Switch at curTick count:{str(10000)}")
        exit_event = m5.simulate(10000)
This line returns an exit event 10000 ticks after the checkpoint is restored. In my case the exit event fires at tick 13073282500, which is 10000 ticks after my checkpoint's curTick of 13073272500. I changed the call to m5.simulate(), and the exit event is now scheduled much further out (at MaxTick, I think). With that change, the checkpoint restores successfully.
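To put the 10000-tick limit in perspective, here is a small Python sketch of the arithmetic. The tick rate is the one gem5 printed in the earlier log ("Global frequency set at 1000000000000 ticks per second"). The 500 ticks per CPU cycle is only my assumption, inferred from the 500-tick spacing between consecutive "Entering event queue" lines in that log. All function names are hypothetical.

```python
# Tick rate reported by gem5 in the log: 10^12 ticks per second.
TICKS_PER_SECOND = 1_000_000_000_000

# ASSUMPTION: 500 ticks per CPU cycle, inferred from the spacing between
# consecutive "Entering event queue" lines; the log does not state this.
ASSUMED_TICKS_PER_CYCLE = 500


def limit_exit_tick(restore_tick: int, limit_ticks: int) -> int:
    """Tick at which a limit of `limit_ticks` past `restore_tick` expires."""
    return restore_tick + limit_ticks


def ticks_to_ns(ticks: int) -> float:
    """Convert ticks to nanoseconds of simulated time."""
    return ticks * 1e9 / TICKS_PER_SECOND


def ticks_to_cycles(ticks: int,
                    ticks_per_cycle: int = ASSUMED_TICKS_PER_CYCLE) -> int:
    """Whole CPU cycles that fit in `ticks` under the assumed cycle length."""
    return ticks // ticks_per_cycle


if __name__ == "__main__":
    print(limit_exit_tick(13073272500, 10000))  # exit tick for my checkpoint
    print(ticks_to_ns(10000))                   # simulated time in ns
    print(ticks_to_cycles(10000))               # cycles, under the assumption
```

Under these assumptions the limit covers about 10 ns of simulated time, or roughly 20 cycles, which would be consistent with only about one instruction completing before the exit.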
Why is m5.simulate(10000) used by default? Am I restoring the checkpoint correctly?
Any help would be appreciated!
Regards
Ling