Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logger: allow starting before the driver is loaded #5027

Merged
merged 2 commits into from
Aug 2, 2022

Conversation

marc-hb
Copy link
Collaborator

@marc-hb marc-hb commented Nov 25, 2021

See commit messages.

Fixes a small part of thesofproject/linux#3275

Draft until it's been tested more. I used this for months

Copy link
Contributor

@ujfalusi ujfalusi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marc-hb, this is a great idea, but I think the implementation can be made a bit more readable

@@ -78,7 +79,7 @@ static void usage(void)

static int snapshot(const char *name)
{
const char *path = "/sys/kernel/debug/sof";
const char *path = SYS_DEBUG "/sof";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does not help readability, it was much cleaner when the whole path was humanly readable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed but sizeof("/sys/kernel/debug"/) seemed ugly. I guess I can use a local variable there.

config.version_file, strerror(ret));
goto out;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, but the commit message is somehow hard to digest:
"So when the driver is not loaded, we get the same (missing trace) error
trace message whether we use the -n option or not."

When the driver is not loaded then the entire /sys/kernel/debug/sof directory is non existent, so it does not matter which file you try to read, none of them will be present.

However, if the dtrace is disabled in kernel then the trace and filter files are going to be missing and they will only appear if you reload the modules with options snd_sof sof_debug=1 at minimum.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the driver is not loaded then the entire /sys/kernel/debug/sof directory is non existent, so it does not matter which file you try to read, none of them will be present.

It matters a lot for the next commit but I tried to avoid a reference to the future.

However, if the dtrace is disabled in kernel then the trace and filter files are going to be missing and they will only appear if you reload the modules with options snd_sof sof_debug=1 at minimum.

and? Not sure I see your point here sorry, is this a better commit message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marc-hb, what about printing something like "driver is not loaded" when the driver is not loaded (the sof debugfs directory is not present along with the fw_version file)?

If the version file is optional then why we fail when it is not present?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the version file is optional then why we fail when it is not present?

We don't, this code is not run when config.version_fw is false.

I moved the version check after waiting for the trace file. I left the error handling because it does not hurt, opening could fail for some other reason (like a race)

There is a "waiting for /sys/..." message above.

const int dwatch = inotify_add_watch(iqueue, watched_dir, IN_CREATE);
FILE *fstream;

char * const fpath = malloc(strlen(watched_dir) + 1 + strlen(expected_file) + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have the declarations first?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why, do we need to support any legacy, pre-C99 compiler here? Leaving variables uninitialized makes the code more verbose and less safe: not possible to use const and more risk of errors.
This is not the Linux kernel here, no need to support a baroque bad habit that never even existed in any other programming language.

https://stackoverflow.com/a/4105334/317623

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just that Linux coding standard is the native language for so many of us here. The Linux coding standard does not forbid stack variable initializations, just mixing declarations (with or without initialization) and code, so it is possible satisfy both the security and the dinosaurs with more code blocks. Just saying, this is fine by me as it is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Linux standard was stuck to C89 which forbid C99 mixing. It was recently upgraded to C11.

Leaving both security and dinosaurs aside for a minute, how would you rewrite this without mixing and without losing a few const? { } blocks can often help but here these constants and variables are used in the entire function. So I'm afraid some constants would have to be downgraded to variables, no?

Copy link
Contributor

@jsarha jsarha Aug 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see the problem. Am I missing something?

`
static void open_file(const char *watched_dir, const char *expected_file)
{
const int iqueue = inotify_init();
const int dwatch = inotify_add_watch(iqueue, watched_dir, IN_CREATE);
struct stat expected_stat;
void *ret_stream = NULL;
char * const fpath = malloc(strlen(watched_dir) + 1 + strlen(expected_file) + 1);

/* ... */
return ret_stream;
}

static void *wait_open(const char *watched_dir, const char *expected_file)
{
if (access(watched_dir, R_OK)) {
fprintf(stderr, "error: %s() cannot read %s\n", func, watched_dir);
return NULL;
}

return open_file(watched_dir, expected_file);

}
`

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, you can always add extra functions until mixing is avoided, artificially creating as many blocks as you need. However having a tiny "public" function that merely checks whether an argument exists and then delegates all the rest to another function that actually does all the work is an awkward solution for a C89 problem that does not exist anymore.

BTW many people don't even understand this old C89 rule, in the past I had review comments asking me not to do this:

f(...)
{
   statement1;
    { int i;

    }
}

Don't get me wrong: giant functions are bad too but that's off topic.


const int iqueue = inotify_init();
const int dwatch = inotify_add_watch(iqueue, watched_dir, IN_CREATE);
FILE *fstream;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you initialize it to NULL then you don't need to do it in all error paths?

config.in_fd = fopen(config.in_file, "rb");
else
config.in_fd = wait_fopen(SYS_DEBUG,
config.in_file + strlen(SYS_DEBUG) + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be tempted to move this whole logic down into a custom logger_fopen() or something.
That way you would not need to re-construct the config.in_file into a new malloced buffer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future possible improvement? Keep in mind we want to discard this logger entirely in the longer term.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @marc-hb ! A few minor comments, but the main logic seems solid.

@@ -78,7 +79,7 @@ static void usage(void)

static int snapshot(const char *name)
{
const char *path = "/sys/kernel/debug/sof";
const char *path = SYS_DEBUG "/sof";
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed but sizeof("/sys/kernel/debug"/) seemed ugly. I guess I can use a local variable there.

const int dwatch = inotify_add_watch(iqueue, watched_dir, IN_CREATE);
FILE *fstream;

char * const fpath = malloc(strlen(watched_dir) + 1 + strlen(expected_file) + 1);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why, do we need to support any legacy, pre-C99 compiler here? Leaving variables uninitialized makes the code more verbose and less safe: not possible to use const and more risk of errors.
This is not the Linux kernel here, no need to support a baroque bad habit that never even existed in any other programming language.

https://stackoverflow.com/a/4105334/317623

config.in_fd = fopen(config.in_file, "rb");
else
config.in_fd = wait_fopen(SYS_DEBUG,
config.in_file + strlen(SYS_DEBUG) + 1);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future possible improvement? Keep in mind we want to discard this logger entirely in the longer term.

config.version_file, strerror(ret));
goto out;
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the driver is not loaded then the entire /sys/kernel/debug/sof directory is non existent, so it does not matter which file you try to read, none of them will be present.

It matters a lot for the next commit but I tried to avoid a reference to the future.

However, if the dtrace is disabled in kernel then the trace and filter files are going to be missing and they will only appear if you reload the modules with options snd_sof sof_debug=1 at minimum.

and? Not sure I see your point here sorry, is this a better commit message?

* used by some distros but broken in gcc:
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425
*/
if (read(iqueue, evlist, sizeof(evlist)))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (read(iqueue, evlist, sizeof(evlist)))
(void) !read(iqueue, evlist, sizeof(evlist)))

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets get this in v2.0

fflush(stdout);

while (1) {
char evlist[1 * sizeof(struct inotify_event)];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmh, how about struct inotify_event event;?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below.

while (1) {
char evlist[1 * sizeof(struct inotify_event)];
const int timeout_min = 5;
struct pollfd pfd[1] = {{ .fd = iqueue, .events = POLLIN, .revents = 0 }};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct pollfd pfd would do too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to stick to the exact API signatures and not confuse arrays with their first element. It's also more future proof.

@gkbldcig
Copy link
Collaborator

gkbldcig commented Dec 4, 2021

Can one of the admins verify this patch?

Copy link
Collaborator Author

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been testing this and it has been working fine for me but I noticed a logical error. Right now it's using inotify to watch /sys/kernel/debug until /sys/kernel/debug/sof/[e]trace exists. However inotify is not recursive. So, it gets notified when /sys/kernel/debug/sof gets created, but if /sys/kernel/debug/sof/[e]trace gets created long enough after /sys/kernel/debug/sof then the logger could hang forever. Any /sys idea why it seems to be working fine so far anyway? A race condition maybe?

config.version_file, strerror(ret));
goto out;
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the version file is optional then why we fail when it is not present?

We don't, this code is not run when config.version_fw is false.

I moved the version check after waiting for the trace file. I left the error handling because it does not hurt, opening could fail for some other reason (like a race)

There is a "waiting for /sys/..." message above.

while (1) {
char evlist[1 * sizeof(struct inotify_event)];
const int timeout_min = 5;
struct pollfd pfd[1] = {{ .fd = iqueue, .events = POLLIN, .revents = 0 }};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to stick to the exact API signatures and not confuse arrays with their first element. It's also more future proof.

fflush(stdout);

while (1) {
char evlist[1 * sizeof(struct inotify_event)];
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below.

@marc-hb marc-hb requested a review from aiChaoSONG December 7, 2021 07:30
@lgirdwood
Copy link
Member

I've been testing this and it has been working fine for me but I noticed a logical error. Right now it's using inotify to watch /sys/kernel/debug until /sys/kernel/debug/sof/[e]trace exists. However inotify is not recursive. So, it gets notified when /sys/kernel/debug/sof gets created, but if /sys/kernel/debug/sof/[e]trace gets created long enough after /sys/kernel/debug/sof then the logger could hang forever. Any /sys idea why it seems to be working fine so far anyway? A race condition maybe?

Does it make sense to simplify and monitor only 1 file per process i.e. logger can be invoked with either etrace or trace arguments. This would simpligy, any output can always be stitched together later on with a script.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Dec 23, 2021

logger can be invoked with either etrace or trace arguments

This is already the case, I think you didn't get my concern.

My concern is this PR works thanks to a race.

  • /sys/kernel/debug/sof/ gets created, sof-logger gets notified and looks for /sys/kernel/debug/sof/trace, does not find it YET.
  • A LONG enough time later, /sys/kernel/debug/sof/trace gets created but that's one level too far down so NO /sys/kernel/debug/ notification. sof-logger is stuck forever.

I will experiment a bit more.

marc-hb added 2 commits July 23, 2022 03:48
Open /sys/kernel/debug/sof/fw_version _after_
/sys/kernel/debug/sof/[e]trace because reading the former is optional
and the latter is not.

So when the driver is not loaded, we get the same (missing trace) error
trace message whether we use the -n option or not.

Signed-off-by: Marc Herbert <[email protected]>
Don't fail immediately when the driver is not loaded. Use inotify
instead to wait for /sys/kernel/debug/sof/[e]trace to appear.

This makes it possible to start before the driver is loaded which
reduces considerably the chances of missing early logs.

Fixes a small part of thesofproject/linux#3275

Signed-off-by: Marc Herbert <[email protected]>
@marc-hb marc-hb force-pushed the early-logger-start branch from 064d4f6 to dce72f3 Compare July 23, 2022 03:49
@marc-hb marc-hb marked this pull request as ready for review July 23, 2022 05:02
@marc-hb marc-hb requested a review from ujfalusi July 23, 2022 05:03
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marc-hb can you give us an update about what has changed since last push (as some time has passes and many changes have been made to align the logging with Zephyr). What is the plan here ?

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 30, 2022

My concern is this PR works thanks to a race.

I fixed that some time ago but never shared, now I did. I also addressed some minor review comments.

I've been testing this PR in my workspace for months, it's very solid.

@lgirdwood lgirdwood merged commit dcf0577 into thesofproject:main Aug 2, 2022
@marc-hb marc-hb deleted the early-logger-start branch August 3, 2022 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants