Restore phandles from binary representations #151

Open: wants to merge 8 commits into base: main

ukleinek
Contributor

Device trees generated with the options -@ and -L record the label names that were used originally and which values are phandles. This information can be reused when compiling back to dts format to make the result much more human-readable.

dtdiff is also adapted to make use of this, producing smaller diffs without hunks caused by, e.g., a different phandle allocation.

This supersedes pull request #93.

@ukleinek
Contributor Author

A possible follow-up improvement would be to restore phandles from __fixups__.

@ukleinek
Contributor Author

Just a quick heads-up: I found a dtbo file where my code crashes with -@ -L. So please don't merge yet.

@ukleinek
Contributor Author

OK, I think that's unrelated to my changes and just presents a situation to the decompiler that didn't happen before. Here is a reproducer:

$ git diff
diff --git a/treesource.c b/treesource.c
index 067647b60a17..fa1a48579489 100644
--- a/treesource.c
+++ b/treesource.c
@@ -104,6 +104,12 @@ static void write_propval_string(FILE *f, const char *s, size_t len)
 static void write_propval_int(FILE *f, const char *p, size_t len, size_t width)
 {
        const char *end = p + len;
+
+       if (len % width) {
+               fprintf(stderr, "Huh, len = %zu, width = %zu\n", len, width);
+               exit(1);
+       }
+
        assert(len % width == 0);
 
        for (; p < end; p += width) {

uwe@taurus:~/dts-decompile-labels$ cat test3.dts
/dts-v1/;

/ {
	tralala = <&somelabel>, "text making the next label unaligned",
		<&somelabel>, "some more text";

	somelabel: somenode {

	};
};
$ dtc -@ -L -I dts -O dtb test3.dts > test3.dtb
$ dtc -@ -L -I dtb -O dts test3.dtb 
/dts-v1/;

/ {
Huh, len = 37, width = 4
	tralala = <&somelabel 

So I think some heuristic classifies the property "tralala" as an array of 32-bit ints, but it doesn't account for the chunk ending at offset 41, which is not 4-byte aligned.

Owner

@dgibson dgibson left a comment

I think this is a great idea. The implementation looks mostly sound but there are a few minor issues that need addressing.

dtc.c Outdated
generate_label_tree(dti, "__symbols__", true);
generate_labels_from_tree(dti, "__symbols__");
Owner

I'm not immediately sure if it makes sense to only do this when generate_symbols is set.

Contributor Author

Maybe do it unconditionally and remove the __symbols__ node if generate_symbols is set?

Owner

Yes, I think that makes sense.

livetree.c Outdated
if (labeled_node)
add_label(&labeled_node->labels, p->name);
else if (quiet < 1)
fprintf(stderr, "Warning: Path %s referenced in %s missing",
Owner

I'd suggest an explicit "referenced in symbol %s" to make it clearer what's going on without context.

Contributor Author

I guess you expect that p->name is passed then for your %s? I think mentioning name is still a good idea, so I suggest making this:

fprintf(stderr, "Warning: Path %s referenced in property %s/%s missing",
        p->val.val, name, p->name);

dtc.c Outdated
@@ -344,6 +344,7 @@ int main(int argc, char *argv[])
if (generate_fixups) {
generate_fixups_tree(dti, "__fixups__");
generate_local_fixups_tree(dti, "__local_fixups__");
local_fixup_phandles(dti, "__local_fixups__");
Owner

Again, not sure if it only makes sense to do this with -L.

Owner

As with __symbols__, I think it probably makes sense to do this unconditionally, then strip it out if -L is not given.

livetree.c Outdated
@@ -3,6 +3,8 @@
* (C) Copyright David Gibson <[email protected]>, IBM Corporation. 2005.
*/

#include <libfdt/libfdt.h>

Owner

Oof. We don't currently use libfdt within dtc, which is sort-of deliberate (they provide a cross-check to each other). There's a case to be made to drop that policy and instead use libfdt for much of the flat tree logic within dtc. However, I'd really prefer to do that as an overall change, using it for as much as we can, rather than adding small pieces of libfdt usage ad-hoc.

..also, I can't actually see what you're using this for.

Contributor Author

Maybe this is only a development relic; I will try without it.

Contributor Author

FTR: Just dropping the #include doesn't work. Have to take a look at a more appropriate time.

treesource.c Outdated
m->ref = refn->fullpath;
} else if (quiet < 1) {
fprintf(stderr, "Warning: node referenced by phandle 0x%x in property %s not found\n",
phandle, prop->name);
Owner

I think you do need to consider __fixups__ as well. Without that, this error will be triggered if you're decompiling a dtbo with any external phandle references.

@dgibson
Owner

dgibson commented Oct 28, 2024

A possible follow-up improvement would be to restore phandles from __fixups__.

I think you do need to implement this. For one thing, processing __local_fixups__ but not __fixups__ would just be surprising behaviour to a user, I think. Secondly, it could cause bogus errors in some cases (see detailed comments).
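
To make the `__fixups__` requirement concrete: each property in that node is named after an unresolved label and holds strings of the form `path:property:offset`, as in `nonexisting = "/somenode:property:0"`. The following is only a minimal sketch of parsing one such string; `struct fixup_ref` and `parse_fixup()` are hypothetical names for illustration and are not taken from this PR or from dtc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for one decoded __fixups__ entry. */
struct fixup_ref {
	char path[256];		/* node containing the reference */
	char prop[64];		/* property containing the reference */
	unsigned long offset;	/* byte offset of the phandle cell */
};

/*
 * Parse one "path:property:offset" string.
 * Returns 0 on success and -1 on malformed or oversized input.
 */
static int parse_fixup(const char *s, struct fixup_ref *out)
{
	const char *sep1, *sep2;
	size_t path_len, prop_len;
	char *end;

	assert(s && out);
	sep1 = strchr(s, ':');
	sep2 = sep1 ? strchr(sep1 + 1, ':') : NULL;
	if (!sep2)
		return -1;

	path_len = (size_t)(sep1 - s);
	prop_len = (size_t)(sep2 - sep1 - 1);
	if (path_len == 0 || path_len >= sizeof(out->path) ||
	    prop_len == 0 || prop_len >= sizeof(out->prop))
		return -1;

	/* The offset must be a plain decimal number up to the end. */
	if (sep2[1] < '0' || sep2[1] > '9')
		return -1;
	out->offset = strtoul(sep2 + 1, &end, 10);
	if (*end != '\0')
		return -1;

	memcpy(out->path, s, path_len);
	out->path[path_len] = '\0';
	memcpy(out->prop, sep1 + 1, prop_len);
	out->prop[prop_len] = '\0';
	return 0;
}
```

A decompiler could then place a phandle marker for the label at `offset` in the named property, so the cell is printed as `<&nonexisting>` instead of `<0xffffffff>`.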

@ukleinek
Contributor Author

Do you have an idea for the unaligned phandle problem?

@dgibson
Owner

dgibson commented Oct 28, 2024

Do you have an idea for the unaligned phandle problem?

Sorry, I'm not sure exactly what problem you mean.

@ukleinek
Contributor Author

Do you have an idea for the unaligned phandle problem?

Sorry, I'm not sure exactly what problem you mean.

I mean the crash reported in #151 (comment) with the follow up in the next comment.

@dgibson
Owner

dgibson commented Oct 30, 2024

Do you have an idea for the unaligned phandle problem?

Sorry, I'm not sure exactly what problem you mean.

I mean the crash reported in #151 (comment) with the follow up in the next comment.

Ah, right, sorry.

I think the basic problem is that previously the only case where we'd have phandle markers during decompile is with -I dts -O dts, in which case we'd also have type markers helping us figure out how to print each part of each property. The new code is putting in phandle markers, but without type markers. We should handle that, of course, but it looks like there are some edge cases.

More specifically, it looks like write_propval() calls guess_value_type() to figure out a missing type only once for the whole property. Having a phandle marker does tell us that those specific 4 bytes should be in integer < ... > context. But we'll need to separately guess the type for each chunk between phandles. I suspect that might be irritatingly fiddly, but I'm afraid it's going to be a prerequisite for implementing this.
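
The per-chunk guessing described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `enum chunk_type`, `struct chunk`, `guess_chunk_type()` and `split_at_phandles()` are hypothetical names, and the string heuristic is a simplified stand-in for what guess_value_type() does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical chunk types, loosely modelled on dtc's TYPE_* markers. */
enum chunk_type { CHUNK_BYTES, CHUNK_CELLS, CHUNK_STRING };

struct chunk {
	size_t offset;
	size_t len;
	enum chunk_type type;
};

/*
 * Guess how to print one marker-free chunk: as string(s) if it ends in
 * NUL and holds only printable bytes and NULs (with no more NULs than
 * printable bytes), as 32-bit cells if its length is a multiple of 4,
 * and as raw bytes otherwise.
 */
static enum chunk_type guess_chunk_type(const unsigned char *p, size_t len)
{
	size_t i, nnul = 0, nother = 0;

	assert(len > 0);
	for (i = 0; i < len; i++) {
		if (p[i] == '\0')
			nnul++;
		else if (!isprint(p[i]))
			nother++;
	}
	if (p[len - 1] == '\0' && nother == 0 && nnul <= len - nnul)
		return CHUNK_STRING;
	if (len % 4 == 0)
		return CHUNK_CELLS;
	return CHUNK_BYTES;
}

/*
 * Split a property value at the given phandle offsets (ascending, each
 * covering 4 bytes) and classify every piece separately. The caller
 * provides room for 2 * nph + 1 chunks; the number used is returned.
 */
static size_t split_at_phandles(const unsigned char *val, size_t len,
				const size_t *ph, size_t nph,
				struct chunk *out)
{
	size_t pos = 0, n = 0, i;

	for (i = 0; i < nph; i++) {
		assert(ph[i] >= pos && ph[i] + 4 <= len);
		if (ph[i] > pos) {
			out[n].offset = pos;
			out[n].len = ph[i] - pos;
			out[n].type = guess_chunk_type(val + pos, ph[i] - pos);
			n++;
		}
		/* The phandle itself is always printed in <...> context. */
		out[n].offset = ph[i];
		out[n].len = 4;
		out[n].type = CHUNK_CELLS;
		n++;
		pos = ph[i] + 4;
	}
	if (pos < len) {
		out[n].offset = pos;
		out[n].len = len - pos;
		out[n].type = guess_chunk_type(val + pos, len - pos);
		n++;
	}
	return n;
}
```

Applied to the "tralala" property from the reproducer with phandles at offsets 0 and 41, this yields four chunks: a phandle cell, a 37-byte string, a second phandle cell and a 15-byte string, so the unaligned second phandle no longer forces the whole property into one integer array.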

@ukleinek
Contributor Author

This addresses the crash mentioned in #151 (comment) and a few suggestions by @dgibson (which I marked as resolved). Still missing is restoring of phandles from __fixups__.

treesource.c Outdated
@@ -178,7 +177,9 @@ static enum markertype guess_value_type(struct property *prop)
}

for_each_marker_of_type(m, LABEL) {
if ((m->offset > 0) && (prop->val.val[m->offset - 1] != '\0'))
if (m->offset < offset || m->offset >= offset + len)
continue;
Owner

I think there's another thing to consider here too. IIUC this is now attempting to guess the type of just the sub-piece of the property given by offset and len.

The tests below are classifying things as "not a string" or "not cells" based on labels which aren't plausibly aligned for a string or label type. However, the presence of the label affects the classification of the data before the label, not after the label. So because you've already excluded those labels, they'll no longer have the effect they need to.

Contributor Author

Either I'm not following you, or you're wrong. If there is a label in the middle of a data chunk, I consider that a boundary and guess the types of the two chunks separately. IMHO that makes sense because a label (or a phandle) is very likely a boundary for a type change.

@ukleinek
Contributor Author

OK, this got bigger now than I anticipated and also contains a few fixes, but I'm happy that I did it now and I can get it out of my head and happily use it.

Quick demo:

$ cat test.dts
/dts-v1/;
/plugin/;

/ {
	tralal = <&somelabel>, "text making the",
		<&somelabel>, "some more text";

	tralala = <&somelabel>, "text making the next label unaligned",
		<&somelabel>, "some more text";

	somelabel: somenode {
		property = <&nonexisting>;
	};
};
$ dtc -@ test.dts > test.dtb

Decompiling this dtb with dtc 1.7.2 gives you:

$ /usr/bin/dtc test.dtb
/dts-v1/;

/ {
	tralal = [00 00 00 01 74 65 78 74 20 6d 61 6b 69 6e 67 20 74 68 65 00 00 00 00 01 73 6f 6d 65 20 6d 6f 72 65 20 74 65 78 74 00];
	tralala = <0x01 0x74657874 0x206d616b 0x696e6720 0x74686520 0x6e657874 0x206c6162 0x656c2075 0x6e616c69 0x676e6564 0x00 0x1736f6d 0x65206d6f 0x72652074 0x65787400>;

	somenode {
		property = <0xffffffff>;
		phandle = <0x01>;
	};

	__symbols__ {
		somelabel = "/somenode";
	};

	__fixups__ {
		nonexisting = "/somenode:property:0";
	};

	__local_fixups__ {
		tralal = <0x00 0x14>;
		tralala = <0x00 0x29>;
	};
};

With the changes from this PR applied, the output becomes:

$ dtc test.dtb
/dts-v1/;
/plugin/;

/ {
	tralal = <&somelabel>, "text making the", <&somelabel>, "some more text";
	tralala = <&somelabel>, "text making the next label unaligned", <&somelabel>, "some more text";

	somelabel: somenode {
		property = <&nonexisting>;
		phandle = <0x01>;
	};

	__symbols__ {
		somelabel = "/somenode";
	};

	__fixups__ {
		nonexisting = "/somenode:property:0";
	};

	__local_fixups__ {
		tralal = <0x00 0x14>;
		tralala = <0x00 0x29>;
	};
};

and if test.dtb was compiled without -@ it uses &{/somenode} instead of &somelabel.
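
For reference, the `__local_fixups__` values in the demo (`tralal = <0x00 0x14>`, `tralala = <0x00 0x29>`) are lists of byte offsets into the property of the same name at which a local phandle is stored. A minimal sketch of decoding such a value is shown below; `read_be32()` and `decode_local_fixups()` are hypothetical helper names for illustration, not functions from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit cell, the byte order used inside a dtb. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode a __local_fixups__ property value into byte offsets.
 * Returns the number of offsets stored in out, or -1 if the value is not
 * a whole number of cells, does not fit into out, or an offset leaves no
 * room for a 4-byte phandle inside a target property of target_len bytes.
 */
static int decode_local_fixups(const unsigned char *val, size_t val_len,
			       size_t target_len,
			       uint32_t *out, size_t max_out)
{
	size_t i, n = val_len / 4;

	assert(val && out);
	if (val_len % 4 != 0 || n > max_out)
		return -1;
	for (i = 0; i < n; i++) {
		uint32_t off = read_be32(val + 4 * i);

		if (target_len < 4 || off > target_len - 4)
			return -1;
		out[i] = off;
	}
	return (int)n;
}
```

For "tralala" (60 bytes long) this gives the offsets 0 and 41, which is exactly where the two `<&somelabel>` references sit in the restored output.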

@ukleinek
Contributor Author

BTW, I didn't implement stripping out the __fixups__, __local_fixups__ and __symbols__ nodes. Mostly because it doesn't seem to be right for -I dts -O dts compilations. Instead the two fixup nodes are removed before they are generated to make sure that the information isn't duplicated.

Owner

@dgibson dgibson left a comment

I've applied the first three patches, because they're sensible fixes independent of the rest of the series. I've made some comments on the next few, but I haven't completed a full review yet.

fprintf(f, "/dts-v1/;\n");
if (any_fixup_tree(dti, dti->dt))
fprintf(f, "/plugin/;\n");
fprintf(f, "\n");
Owner

I think this is correct, but it seems like a very indirect way of doing it. We're taking the __fixups__ node, distributing it across the tree as a bunch of fixup markers, then scanning the entire tree for fixup markers to mark it as /plugin/.

I think a clearer way to do this would be to set the DTSF_PLUGIN flag on dtb input if there is a (non-empty) __fixups__ or __local_fixups__ node, then on dts output emit the /plugin/ tag based on that flag.
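
A rough sketch of that flag-based approach, under simplifying assumptions: `struct toplevel_node`, `dtsflags_from_input()` and `format_dts_header()` are hypothetical stand-ins for illustration and do not reflect dtc's actual data structures; only the flag names DTSF_V1 and DTSF_PLUGIN are borrowed from dtc, and the values used here are defined locally for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Flag names borrowed from dtc; values defined locally for this sketch. */
#define DTSF_V1		0x0001
#define DTSF_PLUGIN	0x0002

/* Hypothetical, simplified view of a top-level node read from a dtb. */
struct toplevel_node {
	const char *name;
	size_t nprops;
	size_t nchildren;
};

/* Set DTSF_PLUGIN if a non-empty __fixups__ or __local_fixups__ exists. */
static unsigned int dtsflags_from_input(const struct toplevel_node *nodes,
					size_t n)
{
	unsigned int flags = DTSF_V1;
	size_t i;

	for (i = 0; i < n; i++) {
		int is_fixup = !strcmp(nodes[i].name, "__fixups__") ||
			       !strcmp(nodes[i].name, "__local_fixups__");

		if (is_fixup && (nodes[i].nprops || nodes[i].nchildren))
			flags |= DTSF_PLUGIN;
	}
	return flags;
}

/* Format the dts header from the flags alone; returns snprintf()'s result. */
static int format_dts_header(char *buf, size_t size, unsigned int flags)
{
	assert(buf && size > 0);
	return snprintf(buf, size, "/dts-v1/;\n%s\n",
			(flags & DTSF_PLUGIN) ? "/plugin/;\n" : "");
}
```

The point of the design is that the output side only looks at one flag and never has to walk the tree for markers.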

@@ -0,0 +1,23 @@
/dts-v1/;
Owner

This commit definitely needs a proper commit message. What's the rationale for this test?

treesource.c Outdated
@@ -139,26 +139,49 @@ static const char *delim_end[] = {
[TYPE_STRING] = "",
};

/*
* The invariants in the marker list are:
* - offsets are monotonically rising
Owner

Maybe "non-strictly monotonically increasing", since adjacent offsets can be equal.

treesource.c Outdated
* The invariants in the marker list are:
* - offsets are monotonically rising
* - for a single offset there is at most one type marker
* - for a single offset there is at most one non-type marker
Owner

This one isn't true.

prop = &somenode, <&othernode>;

Will have the REF_PATH and REF_PHANDLE markers at the same offset, until the REF_PATH marker is resolved (before that it's essentially marking a region of zero length).
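
The two invariants that remain after these comments (offsets non-strictly increasing, at most one type marker per offset) could be checked along these lines. This is a sketch over a reduced, hypothetical marker model (`enum mtype`, `struct mk`, `markers_ok()`); dtc's real struct marker is a linked list with more fields, and whether the type-marker rule holds everywhere in dtc is taken from the comment under review, not verified here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced marker model for illustration only. */
enum mtype { M_TYPE, M_REF_PHANDLE, M_REF_PATH, M_LABEL };

struct mk {
	enum mtype type;
	size_t offset;
};

/*
 * Return 1 if offsets never decrease and no offset carries more than one
 * type marker, 0 otherwise. Several non-type markers may share an offset.
 */
static int markers_ok(const struct mk *m, size_t n)
{
	size_t i, ntype = 0;

	assert(m || n == 0);
	for (i = 0; i < n; i++) {
		if (i > 0 && m[i].offset < m[i - 1].offset)
			return 0;
		/* A new offset starts a new count of type markers. */
		if (i > 0 && m[i].offset != m[i - 1].offset)
			ntype = 0;
		if (m[i].type == M_TYPE && ++ntype > 1)
			return 0;
	}
	return 1;
}
```

With this check, a REF_PATH marker and a REF_PHANDLE marker at the same offset are accepted, which matches the `prop = &somenode, <&othernode>;` case above.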

Uwe Kleine-König added 2 commits January 24, 2025 22:33
With the next few commits phandles are restored when compiling to dts.
Depending on the source these phandles might be undefined. To allow
recompiling the generated source, mark the dts as "plugin" in the
presence of undefined phandles. Otherwise compilation fails with:

    Reference to non-existent node or label "nonexisting"

Signed-off-by: Uwe Kleine-König <[email protected]>
A dts file that is over-determined, in the sense that it already contains
a __local_fixups__ and/or a __fixups__ node as well as phandles that
result in entries in these nodes, should not be compiled to a device tree
that has duplicate entries. This was a problem before commit
915daad ("Start with empty __local_fixups__ and __fixups__ nodes").

Add a test that ensures this issue isn't reintroduced later.

Signed-off-by: Uwe Kleine-König <[email protected]>
@ukleinek
Contributor Author

I've applied the first three patches, because they're sensible fixes independent of the rest of the series. I've made some comments on the next few, but I haven't completed a full review yet.

The topmost commit is also simple and orthogonal to the rest of the series. That one might be eligible for fast tracking, too.
I'll look into your other feedback later.

Uwe Kleine-König added 6 commits January 25, 2025 16:45
The add_marker() function is used to create a new marker and add it at
the right spot to the relevant marker list. Use it in the
add_string_markers() helper (which becomes slightly faster as a result).

Signed-off-by: Uwe Kleine-König <[email protected]>
In the presence of (non-type) markers guess the type of each chunk
between markers individually instead of only once for the whole
property.

Note that this only becomes relevant with the next few commits that restore
labels and phandles. Note further that this rework is necessary with
these further changes, because phandle markers are currently not
considered for type guessing and so a phandle at an offset that isn't a
multiple of 4 triggers an assertion if the property was guessed to have
type TYPE_UINT32.

Now that guess_value_type() is only called for data chunks without
markers, the function can be simplified a bit.

Signed-off-by: Uwe Kleine-König <[email protected]>
If the input has a __symbols__ node, restore the named labels for the
respective nodes.

Signed-off-by: Uwe Kleine-König <[email protected]>
The __local_fixups__ node contains information about phandles. Parse it
to improve the result when decompiling a device tree blob.

Signed-off-by: Uwe Kleine-König <[email protected]>
The __fixups__ node contains information about labels. Parse its
properties to create phandle markers which improve the resulting dts
when decompiling a device tree blob.

Signed-off-by: Uwe Kleine-König <[email protected]>
The file ending .dtbo is typically used for device tree overlays. These
are in the dtb input format, too. So assume this input format for *.dtbo
as is already done for *.dtb.

Signed-off-by: Uwe Kleine-König <[email protected]>
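
The suffix rule from the last commit message can be sketched as follows. This is a simplified illustration, not the function from the commit: `guess_format_by_suffix()` is a hypothetical name, it only knows the suffixes mentioned in this thread, and it compares case-sensitively.

```c
#include <assert.h>
#include <string.h>

/*
 * Guess the input format from the file name suffix, returning fallback
 * when the suffix is missing or unknown.
 */
static const char *guess_format_by_suffix(const char *fname,
					  const char *fallback)
{
	const char *dot;

	assert(fname && fallback);
	dot = strrchr(fname, '.');
	if (!dot)
		return fallback;
	if (!strcmp(dot, ".dts"))
		return "dts";
	/* Overlays (*.dtbo) use the same binary format as *.dtb. */
	if (!strcmp(dot, ".dtb") || !strcmp(dot, ".dtbo"))
		return "dtb";
	return fallback;
}
```

With such a rule, an overlay file can be decompiled without an explicit -I dtb.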