Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a test to ensure the report_stats hook is actually called internally #18

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

doctorlai-msrc
Copy link
Collaborator

The report_stats hook is internally called at some interval at jbpf_perf.c This PR adds a test to make sure it actually calls it.

sem_post(&sem);
} else {
// we don't get any output from the codelet, meaning the report_stats hook has not been called
assert(false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this test fail due to timing issues? What if the output thread is called right before the maintenance thread is called? Wouldn't io_channel_check_output() fail in this case although everything might be working fine?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use sem_post(lcm_ipc_sem) to make sure the codelet is loaded only after maintenance thread is created. By default, a dummy output_handler is registered, before io_channel_check_output is registered.

I've ran the test 500 times, all passing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how can we guarantee that the maintenance thread is called? What if it is delayed for a long time (e.g., 1s), because it runs in a very resource constrained environment? Can we somehow make this test deterministic and avoid relying on the timer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a problem. As the test use the semphore to ensure the maintenance thread is called see line 47 and 109.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semaphore of lines 47 and 109 ensures that the test will only terminate when the data is received. But what if the maintenance thread is delayed for a long time as I described above? Doesn't that mean that when the io_channel_check_output() function is called, execution might go to line 50? This would lead to a test failure, not because there is any issue, but just because the maintenance thread was late to be called. Or am I missing something?

Copy link
Collaborator Author

@doctorlai-msrc doctorlai-msrc Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that won't happen, as shown here and here the maintenance thread will be ready when _jbpf_handle_out_bufs is called. Correct me if I am wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread will be ready, but there is no guarantee that it will be called before _jbpf_handle_out_bufs(). For example if both threads are running on the same CPU core and there is high contention, it could be that maintenance thread runs later than the io thread. As such, there could be scenarios where this test could fail simply due to high contention.

I would suggest to revisit this test, to somehow not rely on the synchronization order of the two threads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants