Stream stats #71
code changes for adding per-stream status.
Added optional arg to cache_stats::print_stats, cache_stats::print_fail_stats and their upstream functions. When streamID is specified, print stats from that stream only. When it is not specified, print all stats. NOTE: the current implementation depends on streamID never being equal to -1.
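The optional-argument behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `StreamCacheStats`, `kAllStreams`, and the single access counter are hypothetical stand-ins, and the function returns a string instead of printing so it is easy to check. It assumes -1 is the value reserved to mean "no stream specified", which is why a real stream ID must never equal -1.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical sentinel meaning "no stream selected".
// Real stream IDs must never equal this value.
constexpr long long kAllStreams = -1;

// Hypothetical, simplified stand-in for a per-stream cache stats table.
class StreamCacheStats {
 public:
  // Adds n accesses to the counter of the given stream.
  void inc_stats(long long stream_id, std::uint64_t n = 1) {
    assert(stream_id != kAllStreams && "stream ID collides with the sentinel");
    m_accesses[stream_id] += n;
  }

  // Formats the stats of one stream, or of every stream when
  // stream_id is left at the sentinel value.
  std::string print_stats(long long stream_id = kAllStreams) const {
    std::ostringstream out;
    for (const auto& entry : m_accesses) {
      if (stream_id != kAllStreams && entry.first != stream_id) continue;
      out << "stream " << entry.first << ": accesses = " << entry.second
          << "\n";
    }
    return out.str();
  }

 private:
  std::map<long long, std::uint64_t> m_accesses;  // ordered by stream ID
};
```

Calling `print_stats(7)` would report only stream 7, while `print_stats()` reports every stream seen so far. A later commit in this PR removes the default argument values, so the merged signatures may differ from this sketch.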
Per-stream stats tracking feature
Multi-Stream stats
@JRPan let me know if you need something from UW here.
@JRPan just checking again if you need anything from us?
Sorry for the delay, this will be included in the next major release. accel-sim/accel-sim-framework#317
@FJShen Let's try to get this merged ASAP.
I need some more description of this PR. E.g., what feature does it add? What bug does it fix? Does "stream" refer to CUDA stream?
@FJShen : @JRPan took the original commit my students submitted to Accel-Sim and packaged it into this. Yes, "stream" means "CUDA stream". We have a full tech report that describes the support we added: https://arxiv.org/abs/2304.11136 Ultimately, these commits add support to Accel-Sim/GPGPU-Sim for tracking statistics at a per-CUDA-stream level.
Thank you, Matt! @FJShen For each kernel, a stream ID is always given (0 by default). The stats container is usually a This mainly affects cache stats (L1 and L2) and cycles. Other stats such as occupancy, DRAM, and IPC are not changed; it does not make sense to separate those by stream. I correlated GPU ubench and Rodinia. Functionally, this is correct. I assigned you as the reviewer to check more of the C++ side of this :) I did not correlate this with any workload that runs with concurrency enabled, though. Maybe we should write a ubench.
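As a rough illustration of what "separating cache stats by stream" means, the sketch below keys a small set of counters by stream ID and supports merging two tables. All names here (`PerStreamStats`, `Counters`, `Outcome`, `record`) are hypothetical and chosen for the example; the container type is an assumption based on the "Changed vector to map" commit, not a description of the actual GPGPU-Sim classes.

```cpp
#include <cstdint>
#include <map>

// Hypothetical access outcomes; a real simulator tracks many more.
enum class Outcome { Hit, Miss };

struct Counters {
  std::uint64_t hits = 0;
  std::uint64_t misses = 0;
};

// Hypothetical per-stream stats table keyed by stream ID.
class PerStreamStats {
 public:
  // Records one cache access for the stream that issued it.
  void record(unsigned long long stream_id, Outcome outcome) {
    Counters& c = m_by_stream[stream_id];  // zero-initialized on first use
    if (outcome == Outcome::Hit) {
      ++c.hits;
    } else {
      ++c.misses;
    }
  }

  // Returns the counters of one stream (all zeros if it was never seen).
  Counters get(unsigned long long stream_id) const {
    auto it = m_by_stream.find(stream_id);
    return it == m_by_stream.end() ? Counters{} : it->second;
  }

  // Sums the counters over all streams, giving the stream-agnostic totals.
  Counters total() const {
    Counters sum;
    for (const auto& entry : m_by_stream) {
      sum.hits += entry.second.hits;
      sum.misses += entry.second.misses;
    }
    return sum;
  }

  // Merges two tables stream by stream, e.g. to combine several caches.
  PerStreamStats operator+(const PerStreamStats& other) const {
    PerStreamStats result = *this;
    for (const auto& entry : other.m_by_stream) {
      Counters& c = result.m_by_stream[entry.first];
      c.hits += entry.second.hits;
      c.misses += entry.second.misses;
    }
    return result;
  }

 private:
  std::map<unsigned long long, Counters> m_by_stream;
};
```

With a single stream (ID 0 for every kernel), `total()` and `get(0)` agree, which matches the expectation that single-stream workloads report the same numbers as before.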
I believe we did correlate with concurrent streams in the report I linked, FWIW.
I found some possibly erroneous code (and sub-optimal coding practice). Please kindly address the reviews above. |
I only see one comment related to the config file. Is this correct? |
@JRPan I forgot to "submit" the reviews. I hope they are now visible. |
Looks good to me.
* Temp commit for Justin and Cassie to sync on code changes for adding per-stream status.
* Resolved compile errors.
* Removed redundant parameter
* Passed cuda_stream_id from accelsim to gpgpusim
* Cleaned up unused changes
* Changed vector to map, having operator problems.
* StreamID defaults to zero
* Implemented streams to inc_stats and so on
* Fixed TOTAL_ACCESS counts
* Implemented GLOBAL_TIMER.
* Fixed m_shader->get_kernel SEGFAULT issue in shader.cc.
* Use warp_init to track streamID instead of issue_warp
* Removed temp debug print
* Modified cache_stats to only print data from latest finished stream Added optional arg to cache_stats::print_stats, cache_stats::print_fail_stats and their upstream functions. When streamID is specified, print stats from that stream. When not specified, print all stats. NOTE: current implementation depending on streamid never equals -1
* Removed default arg values of streamID
* modified constructor of mem_fetch to pass in streamID
* changed get_streamid to get_streamID
* Added TODO to gpgpusim_entrypoint.cc and power_stat.cc
* Only collect power stats when enabled
* print last finished stream in PTX mode using last_streamID
* take out additional printf
* Add a field to baseline cache to indicate cache level
* save gpu object in cache
* Print stream ID only once per kernel
* rm test print
* use -1 for default stream id
* cleanup debug prints
* remove GLOABL_TIMER
* Automated clang-format
* Should be correct to print everything in power model
* addressing concerns & errors
* Automated clang-format
* add m_stats_pw in operator+
* Automated Format
---------
Co-authored-by: Justin Qiao <[email protected]>
Co-authored-by: Justin Qiao <[email protected]>
Co-authored-by: Tim Rogers <[email protected]>
Co-authored-by: JRPan <[email protected]>
Co-authored-by: purdue-jenkins <[email protected]>
Collect stats based on streams.
99% Ready. Creating a PR to trigger GitHub Actions to generate correlations.
Consult @JRPan before merging.