Modular match fields for tables. #1284
@jonathan-dilorenzo I wrote some notes about my perspective on maintaining similar-but-not-identical collections of P4 programs here: https://github.com/jafingerhut/p4-guide/blob/master/program-variants/README.md As a bonus, I figured out how to configure VScode to do "inactive region highlighting" for C/C++ programs based upon arbitrary settings of C preprocessor symbols that you specify in a config file. If you rename your P4 programs to have a file name suffix of |
My take on these options:
Since tables are not really self-contained (their keys read the control's scope, and the actions they call read and write variables from the control's scope), it probably does not make sense to allow table declarations to stand separate from a control. That is a little unfortunate, as otherwise some way of inheriting or composing tables might make sense for this. As is, if we wanted to go in this direction [kind of brainstorming], we would either need to define some way of inheriting controls and then extending tables in them, or to make key sets separate entities that can be inherited/extended and that can be a control's parameters (used in the |
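To make the scoping point concrete, here is a minimal sketch of a table that cannot be lifted out of its control without also carrying the control's scope along. The names `hdr_t`, `Ingress`, `scratch`, and `set_b` are made up for illustration and do not come from the issue.

```p4
#include <core.p4>

header hdr_t {
    bit<8> a;
    bit<8> b;
}

control Ingress(inout hdr_t hdr) {
    // Control-local variable that both the table key and the action refer to.
    bit<8> scratch = 0;

    action set_b(bit<8> v) {
        // The action writes a control parameter and a control-local variable.
        hdr.b = v;
        scratch = v;
    }

    table t {
        key = {
            hdr.a   : exact;    // reads the control's parameter
            scratch : ternary;  // reads the control-local variable
        }
        actions = { set_b; NoAction; }
        default_action = NoAction();
    }

    apply {
        scratch = hdr.b;
        t.apply();
    }
}
```

Both the key expressions and the action body resolve names in `Ingress`, which is why a free-standing table declaration would need some extra mechanism (parameters, inheritance, or similar) to say where those names come from.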
Adding another possible syntax into the discussion, this one using a compile-time conditional "statement" which I've proposed naming
Yes, syntactically this is not much different than We would probably let There could also be a |
Sorry for being late, but where can I find a clear problem statement? So far, all the discussions about "dynamic keys" that I've seen were easily resolvable by explicitly declaring two or more standard tables and letting the backend place them into some shared resources. How is this specific case different? |
Jonathan can clarify in case I am misinterpreting anything here, but I think the briefest clear statement is that there are sometimes scenarios when you want to write multiple different P4 programs, but they have about 90% of their code the same, and differ in about 10%, including sometimes exactly what set of keys a particular named table has, or which actions a table has, or sometimes part of the control flow. The stuff that most developers would use the C preprocessor and #ifdef for today. Again, if I have not misinterpreted Jonathan, this article might give a little more background, but I suspect it is familiar to you from what I have said already: https://github.com/jafingerhut/p4-guide/blob/master/program-variants/README.md |
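As a rough illustration of the "90% the same, 10% different" situation, the sketch below shows one table whose key set depends on a preprocessor symbol. The macro name `MIDDLEBLOCK`, the header fields, and the control name are assumptions chosen for the example; they are not taken from the linked code.

```p4
#include <core.p4>

header hdr_t {
    bit<32> dst_addr;
    bit<16> vrf_id;
}

control Routing(inout hdr_t hdr) {
    action drop() {}
    action forward() {}

    table route {
        key = {
            hdr.dst_addr : lpm;
#ifdef MIDDLEBLOCK
            // Extra key that exists only in the MIDDLEBLOCK variant
            // (hypothetical variant name).
            hdr.vrf_id   : exact;
#endif
        }
        actions = { forward; drop; }
        default_action = drop();
    }

    apply {
        route.apply();
    }
}
```

Compiling once with and once without `-DMIDDLEBLOCK` yields two programs that share everything except the key list of `route`, which is the pattern most developers express with `#ifdef` today.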
@jafingerhut, I think the So I believe that what you propose would go in the direction of adding templates to P4 (especially if the content of the If we want to have such an expressive/powerful capability, it is worth investigating whether sanitary macros could actually be a better solution -- with the huge caveat that I have no experience with them, so I could be missing important downsides (I have a lot of experience with C++ templates, and they are very powerful, but I think there was quite an agreement that we don't want to go in that direction for P4). (feeling rebellious: Maybe if we go in this direction, we should start thinking about a P4_26 that we design from the ground up with all these things in mind...) |
My understanding was that this goes in the different direction -- completely independent tables that have similar structure. |
@vlstill In my example, the conditional expression in the |
@jafingerhut -- thank you for the clarification. I am definitely familiar with the need and I am familiar with the doc you wrote. In fact, this is precisely what 99% of P4 developers do -- they use C preprocessor. Besides being a very familiar tool, it also has a very important additional benefit -- it allows the critical configuration files (all those I do certainly understand the limitations of the C preprocessor and I thought we had several discussions with respect to improving the preprocessing abilities of P4. Is that where we are going? If so, I think it would still be nice to better understand the additional requirements/limitations before adding all these features directly into the language, such as:
|
@jafingerhut In that case, I don't really see how that would be significantly better than using the preprocessor. You would need to duplicate this declaration multiple times, right? Possibly by preprocessor, but then it still uses the preprocessor and is not much better than just having |
That is a very good point in my opinion. If we design a solution that would be totally disconnected from CPP, we would need another way to share configuration between P4 and C/C++. Although if the configuration can be expressed just in simple defines, then we are probably fine keeping that in P4. In my opinion though, if we want to add modularity/parametrization to P4, it should be significantly better (cleaner, easier to work with) than using CPP. Otherwise the advantages @vgurevich had pointed out for CPP may be more important. As for the points @vgurevich raised, my opinion is:
|
@vlstill I am not claiming it is significantly better than using the preprocessor. I am trying to provide alternatives for thought here. Me, I don't mind the C preprocessor. Others seem to dislike it for various reasons. Some differences between my
|
I would agree with you, but it is definitely not quite the case today and CPP provides several advantages over Let's assume we have a table definition today, e.g:
Now, let's assume that the programmer's job is to fit the program (which is so far one of the most common reasons why people do not hardcode the size in the first place). Which definition would you prefer for
If you ask me, most of the time I use (3) or (4), because in this case I can control the value of the Also, for example, if you have the code such as:
the only way to specify the value of the The point is not that I am advocating against |
Let's compile the list of those reasons, so that we can see whether a particular solution is better or worse.
Let's compile the list of these reasons too. I am not against the idea, but I want to make sure we understand exactly what we are trying to achieve in the first place. |
Since the possibility of approaching this problem with a first-class macro system came up in a recent LDWG discussion, I thought I would provide some of my thoughts in that direction. First-class macro systems often execute after the AST has been generated. This means macros are a part of the language syntax; they operate directly on the AST of a program and can only produce valid AST fragments. A second property of some first-class macro systems is that macros have an explicit and bounded scope with respect to the AST (sometimes referred to as hygienic macros), and it's clear when program source changes, what macros need to be executed and what part of the AST will potentially be transformed. Contrast this to preprocessor systems that execute before any sort of compilation takes place, have a scope that is defined by textual inclusion, and can change symbol definitions independent of any syntactic scoping rules via directives such as Some of the practical impacts of these differences are:
On the topic of having macro-like features of the language being syntactically distinct versus fully integrated, my (slight) preference would be for the former, as I think it makes reading and understanding code a bit easier. With the distinct syntax approach, it's obvious what's happening at compile time versus runtime. |
@rcgoodfellow, I think these are great ideas, and I especially want to talk about language bindings (and API definitions in general). I totally agree that common CPP definitions are a very small portion of what needs to happen. Having said that, your proposal calls for first-class macros that somehow can process the P4 AST, but then output something non-P4, since (at least today) P4 does not include any facilities for API definition and thus there is no AST to generate. Thus, realistically, we can explore a different approach, to which you alluded in the third bullet, and that is a formal definition of a tool that can explore the AST and output anything in response to it. So, for example, an API generator will consume the definition of the table, plus all the underlying stuff (actions, types, etc.) and, using some templates, create a definition in some language (e.g. P4info, although that would not be my preference), substituting the relevant portions of the template with the name of the table, the names of the key fields, etc. This is, by the way, how the API generation was performed by the original compiler (the AST was basically a set of well-defined Python objects). But this takes us pretty far away from the original problem, which is having some common source out of which multiple different (but somehow related) data plane programs can be generated. I know you might be speaking from Rust experience, but P4 is a much less powerful language with a lot of special cases. The main problem is probably that there are declarations and code, and declarations often use very specialized syntax. Most of the things there cannot be expressions (at least today), and things like if() statements are not expressions in P4. Also, declarations and their use are separated, meaning that you often need to generate code in 2 or more places. 
I can see how the attribute-based conditional compilation approach that is available in Rust could be introduced in P4, by defining something annotation-like that would work the same way as the conditional compilation |
I was not suggesting that the macros generate non-P4 code. On the contrary, I think it's important that macros only take P4 AST elements as inputs and produce P4 AST elements as outputs. The point I was making about code generation was that if the compile-time programming facilities are an intrinsic part of the language rather than a preprocessor phase, then code generation frameworks would benefit from that as they could treat the AST as the interface to P4 program representation rather than source code directly. Would love to have a conversation about language binding generation and API more broadly, as I think having standardized API definitions for P4 at a lower level than the P4 Runtime (thinking hardware/software interface) could have a lot of potential. But as you said, @vgurevich, that's starting to get far afield from the focus of this issue. I'll work toward putting together a concrete example of what a macro-based approach might look like for the code @jonathan-dilorenzo linked above. |
I asked the question below live during the 2024-Aug language design work group meeting to Jonathan, and his reply was (a), i.e. that he was intending the source code to behave as K similar, but different, P4 programs at compile time, with K different "binaries" produced, and each network switch would load exactly one of those K binaries.
@jonathan-dilorenzo One short question on the code snippets in your original issue: Do you think of the conditions "[some_metadata_saying_that_this_is_a_middleblock] == true" as involving: (a) compile-time configuration: only values that are compile-time known, and restricted to be compile-time known?
If (a), then it seems you are fundamentally describing K similar, but different, P4 programs, which you would execute a P4 compiler on K times to get K different "binaries". Each network switch is expected to load exactly one of those binaries. This approach requires some way to invoke the compiler in K different ways, otherwise you do not get K different binaries. This is exactly what the C preprocessor and #ifdef's can be used to implement, although of course one can invent other ways (perhaps exactly the topic of this issue).
If (b), then perhaps you are thinking that you want to execute the P4 compiler exactly once on the source code, producing exactly one "binary", and all switches load this binary. In this scenario, each switch must be configured at run time to indicate which of the K types of behavior it should implement while processing packets.
I think both approaches (a) and (b) are on the table for consideration, and perhaps there are other possibilities besides (a) and (b) that you or others have in mind. A program could mix both approaches (a) and (b), for example, but if it does, then it needs to produce multiple binaries, and some or all of those binaries require configuration to choose among the run-time configuration options that a single binary implements.
I think in this discussion it is worth making it clear whether one is thinking of approaches to solve problem (a), or problem (b), or to define the restrictions and results of some alternate idea they are considering. For myself, I jumped to the assumption that the scenario we are talking about is (a), but I could be wrong that this is what you were looking for. |
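For contrast with the compile-time reading (a), here is a minimal sketch of what the single-binary, run-time-configured reading (b) might look like with today's P4. Everything here is an assumption made for illustration: the `meta_t` struct, the keyless `role` table whose default action the control plane would set, and the two explicitly declared route tables.

```p4
#include <core.p4>

header hdr_t {
    bit<32> dst_addr;
    bit<16> vrf_id;
}

struct meta_t {
    bit<1> is_middleblock;  // hypothetical role flag, chosen at run time
}

control Routing(inout hdr_t hdr, inout meta_t meta) {
    action drop() {}
    action forward() {}
    action set_role(bit<1> is_middleblock) {
        meta.is_middleblock = is_middleblock;
    }

    // Keyless table: the control plane picks the role by changing
    // the default action's argument.
    table role {
        actions = { set_role; }
        default_action = set_role(0);
    }

    table route_plain {
        key = { hdr.dst_addr : lpm; }
        actions = { forward; drop; }
        default_action = drop();
    }

    table route_vrf {
        key = {
            hdr.dst_addr : lpm;
            hdr.vrf_id   : exact;
        }
        actions = { forward; drop; }
        default_action = drop();
    }

    apply {
        role.apply();
        if (meta.is_middleblock == 1) {
            route_vrf.apply();
        } else {
            route_plain.apply();
        }
    }
}
```

Under (b) both tables exist in the one binary and consume resources unless the backend can share them, whereas under (a) each of the K binaries contains only the table shape it needs.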
I am thinking about one relatively lightweight extension to the current state of P4. It could allow both parametrization of single table use case (i.e. there is a P4 program that can be compile-time configured for different use cases that require different keys in table) as well as parametrization of reuse (i.e. multiple places in the same P4 code use similar table definitions). The key point is that we could introduce a new match kind. Example:

```p4
#include <core.p4>

#ifdef FOO
const bool foo_enabled = FOO;
#else
const bool foo_enabled = false;
#endif

match_kind { none } // TODO: name? TODO: add to spec

header hdr_t {
    bit<8> a;
    bit<8> b;
    bit<8> none;
}

control SubCtrl(inout hdr_t hdr)(match_kind key1_kind) {
    action a0() {}
    action a1() { hdr.none = 4; }

    table t0 {
        key = {
            hdr.a : exact;
            hdr.b : (foo_enabled ? ternary : none); // TODO: no expressions allowed
        }
        actions = { a0; a1; }
    }

    table t1 {
        key = {
            hdr.a : lpm;
            hdr.b : key1_kind; // TODO: decl not found
        }
        actions = { a0; a1; }
    }

    apply {
        t0.apply();
        t1.apply();
    }
}

control cntr(inout hdr_t hdr) {
    SubCtrl(lpm) sub;

    apply {
        sub.apply(hdr);
    }
}
```
This does not compile currently. The grammar does not permit expressions in place of match kind and P4C does not do lookups for the names there (or probably it looks it up only within the The nice part is that inlining, constant folding and constant propagation are not needed to type check A midend pass would later remove the One important thing to figure out would be the name of the new match kind. I like Of course, there are many other problems that would be solved by a macro system and not solved by this (e.g. removing some actions from the action list). But in my opinion this would be rather clean and would go a significant way. It would also not hurt a macro-based extension in the future; it would just give us more freedom in how tables can be defined. |
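One possible reading of the intended end result, written in P4 that is valid today: assuming the proposed pass simply drops every key whose match kind resolves to `none`, table `t0` from the example with `foo_enabled == false` would end up equivalent to the sketch below. The control name `SubCtrl_foo_disabled` is made up, and the exact behavior of the pass is an assumption, since the proposal does not spell it out here.

```p4
#include <core.p4>

header hdr_t {
    bit<8> a;
    bit<8> b;
    bit<8> none;
}

// Hypothetical post-midend shape of SubCtrl's t0 when foo_enabled is false.
control SubCtrl_foo_disabled(inout hdr_t hdr) {
    action a0() {}
    action a1() { hdr.none = 4; }

    table t0 {
        key = {
            hdr.a : exact;
            // hdr.b is gone: its match kind resolved to the proposed `none`,
            // so (under this sketch's assumption) the key is removed.
        }
        actions = { a0; a1; }
    }

    apply {
        t0.apply();
    }
}
```

With `foo_enabled == true` the same pass would leave `hdr.b : ternary;` in place, so a backend would only ever see standard match kinds.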
Given the question and answer in this comment: #1284 (comment) it leads to this minor (but important) follow-up question: Since you want to produce K different binaries from the compiler, presumably you want to invoke the compiler K times, each producing a different binary. Thus the command line options for each of those K compiler invocations should be different, because ideally two invocations of the compiler with the same command line options and the same input files would produce the same binary. (option 1) I could imagine introducing NEW command line options that let one assign values to P4 named constants, and then using the values of those constants in compile-time macro invocations in the P4 source code. (option 2) If one of the goals is to completely avoid the use of the C preprocessor (that might not be your goal), then (option 2) does not seem to me to be an option available to you, because you cannot use #include. (If it is NOT one of the goals to completely avoid the use of the C preprocessor, then I wonder idly aloud: You are OK with using #include, but you really, really want to avoid #ifdef?) (option 3) |
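For comparison, the closest thing available today to "assigning a P4 constant from the command line" goes through the preprocessor. The sketch below assumes a hypothetical macro `VARIANT` passed with the `-D` preprocessor option; the macro name, the meaning of its values, and the control are illustrative only.

```p4
#include <core.p4>

// Each of the K builds passes a different value, for example:
//   -DVARIANT=0   for the first binary
//   -DVARIANT=1   for the second binary
// VARIANT is a hypothetical macro name; default to 0 when it is not given.
#ifndef VARIANT
#define VARIANT 0
#endif

// From here on the choice is an ordinary compile-time known P4 constant.
const bit<8> variant = VARIANT;

header hdr_t {
    bit<8> a;
    bit<8> b;
}

control Ingress(inout hdr_t hdr) {
    apply {
        // The condition is compile-time known, so a compiler is free to
        // fold it and keep only the branch that applies to this build.
        if (variant == 1) {
            hdr.a = hdr.b;
        }
    }
}
```

Option 1 in the comment above would replace the `#ifndef`/`#define` lines with a dedicated compiler option, but the rest of the program could stay the same, since it only refers to the named constant.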
For P4 LDWG, I promised I'd produce a toy example to illustrate this need, but I found a real example of ours from a year or so ago was open-sourced (though it looks much worse now).
Some possible thoughts on how I might like it to look:
bit<0>
to effectively act as a non-match key in this world:bit<0>
behavior as above. I'm actually not sure where this would fail today: