Commit 6710ade

Merge pull request #1299 from vlstill/string-concat
Allow concatenation of string literals at compile time
2 parents 75aca49 + 2352751 commit 6710ade

1 file changed: +64 -17 lines changed

p4-16/spec/P4-16-spec.adoc

Lines changed: 64 additions & 17 deletions
@@ -1225,11 +1225,18 @@ number of backslash characters (ASCII code 92). P4 does not make any
 validity checks on strings (i.e., it does not check that strings
 represent legal UTF-8 encodings).
 
-Since P4 does not provide any operations on strings,
-string literals are generally passed unchanged through the P4 compiler to
-other third-party tools or compiler-backends, including the
-terminating quotes. These tools can define their own handling of
-escape sequences (e.g., how to specify Unicode characters, or handle
+Since P4 does not allow strings to exist at runtime, string literals
+are generally passed unchanged through the P4 compiler to
+other third-party tools or compiler-backends. The compiler can, however,
+perform compile-time concatenation (constant-folding) of concatenation
+expressions into single literal. When such concatenation is performed,
+the binary representation of the string literals (excluding the quotes)
+is concatenated in the order they appears in the source code. There are
+no escape sequences that would be treated specially when strings are
+concatenated.
+
+The backends and other tools can define their own handling of escape
+sequences (e.g., how to specify Unicode characters, or handle
 unprintable ASCII characters).
 
 Here are 3 examples of string literals:
@@ -1242,6 +1249,17 @@ Here are 3 examples of string literals:
 line terminator"
 ----
 
+Here is an example of concatenation expression and an equivalent string
+literal:
+
+[source,p4]
+----
+"one string \" with a quote inside;" ++ (" " ++ "another string")
+// can be constant folded to
+"one string \" with a quote inside; another string"
+----
+
+
 [#sec-trailing-commas]
 ==== Optional trailing commas
 
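The folding rule added in the two hunks above (strip the delimiting quotes, join the literal bodies in source order, leave escape sequences untouched) can be illustrated with a minimal Python sketch. It is not taken from the P4 compiler: the function name `fold_string_literals` and the token-list input are assumptions made for illustration, and a real compiler would work on the byte representation of an AST rather than on Python strings.

```python
def fold_string_literals(literals):
    """Fold quoted string-literal tokens, given in source order, into one token.

    Hypothetical helper, not part of any P4 tool. Each token includes its
    delimiting double quotes. Escape sequences such as \\" are copied through
    unchanged, since no escape is treated specially during concatenation.
    """
    bodies = []
    for token in literals:
        if len(token) < 2 or token[0] != '"' or token[-1] != '"':
            raise ValueError(f"not a quoted string literal: {token!r}")
        bodies.append(token[1:-1])  # drop the surrounding quotes only
    return '"' + "".join(bodies) + '"'


# Operands of the spec example, flattened left to right; the parentheses in
# the source expression do not change the order of the operands.
operands = [r'"one string \" with a quote inside;"', r'" "', r'"another string"']
folded = fold_string_literals(operands)
print(folded)
```

The printed token keeps the `\"` escape exactly as written, which matches the "equivalent string literal" shown in the spec example.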
@@ -1867,15 +1885,20 @@ Operations on values of type `match_kind` are described in
 ==== The Boolean type
 
 The Boolean type `bool` contains just two values, `false` and `true`.
-Boolean values are not integers or bit-strings.
+Boolean values are not integers or bit-strings. Operations that can
+be performed on booleans are described in Section <<sec-bool-exprs>>.
 
 [#sec-string-type]
 ==== Strings
 
-The type `string` represents strings. There are no operations on
-string values; one cannot declare variables with a `string` type.
-Parameters with type `string` can be only directionless (see
-<<sec-calling-convention>>).
+The type `string` represents strings. The values of type `string` are
+either string literals, or concatenations of multiple `string`-typed
+expression. Operations that can be performed on strings are described in
+Section <<sec-string-ops>>.
+
+One cannot declare variables with a `string` type. Parameters with
+type `string` can be only directionless (see Section
+<<#sec-calling-convention>>).
 P4 does not support string manipulation
 in the dataplane; the `string` type is only allowed for describing
 compile-time known values (i.e., string literals, as discussed in
@@ -3487,6 +3510,25 @@ itself can be evaluated at compilation time. This restriction is
 designed to ensure that the width of the result of the conditional
 expression can be inferred statically at compile time.
 
+[#sec-string-ops]
+=== Operations on strings
+
+The only operation allowed on strings is concatenation, denoted by
+`++`. For string concatenation, both operands must be strings and
+the result is also a string. String concatenation can only be
+performed at compile time.
+
+[source,p4]
+----
+extern void log(string message);
+
+void foo(int<8> v) {
+    // ...
+    log("my log message " ++
+        "continuation of the log message");
+}
+----
+
 [#sec-bit-ops]
 === Operations on fixed-width bit types (unsigned integers)
 
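The typing rule in the new "Operations on strings" section (both operands of a string `++` must be strings, and the result is a string known at compile time) can be sketched as a small constant-folding pass in Python. The node classes `StringLit`, `BitLit` and `Concat` and the function `fold` are hypothetical names chosen for this sketch; they do not correspond to any actual compiler data structure.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class StringLit:
    body: str  # literal contents without the surrounding quotes


@dataclass(frozen=True)
class BitLit:
    width: int  # stand-in for a non-string operand such as a bit<W> value
    value: int


@dataclass(frozen=True)
class Concat:
    left: "Expr"
    right: "Expr"


Expr = Union[StringLit, BitLit, Concat]


def fold(expr):
    """Constant-fold string concatenations and reject mixed operand types."""
    if isinstance(expr, Concat):
        left, right = fold(expr.left), fold(expr.right)
        if isinstance(left, StringLit) and isinstance(right, StringLit):
            return StringLit(left.body + right.body)
        if isinstance(left, StringLit) or isinstance(right, StringLit):
            raise TypeError("both operands of string '++' must be strings")
        return Concat(left, right)  # non-string concatenation is left untouched here
    return expr


# The argument of log(...) from the spec example, as a tiny expression tree.
message = Concat(StringLit("my log message "),
                 StringLit("continuation of the log message"))
folded = fold(message)
print(folded.body)
```

Folding bottom-up means a nested expression such as `a ++ (b ++ c)` collapses into a single `StringLit`, while an expression that mixes a string with a non-string operand is rejected rather than folded.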
@@ -8903,9 +8945,10 @@ table t {
 
 The `@name` annotation directs the compiler to use a different
 local name when generating the external APIs used to manipulate a
-language element from the control plane. This annotation takes a string literal
-body. In the
-following example, the fully-qualified name of the table is `c_inst.t1`.
+language element from the control plane. This annotation takes a local
+compile-time known value of type `string` (typically a string literal).
+In the following example, the fully-qualified name of the table is
+`c_inst.t1`.
 
 [source,p4]
 ----
@@ -9036,12 +9079,14 @@ absence), allowing architecture-independent analysis of P4 programs.
 
 The `deprecated` annotation has a required string argument that is a
 message that will be printed by a compiler when a program is using the
-deprecated construct. This is mostly useful for annotating library
-constructs, such as externs.
+deprecated construct. This is mostly useful for annotating library
+constructs, such as externs. The parameter must be a local
+compile-time known value of type `string`.
 
 [source,p4]
 ----
-@deprecated("Please use the 'check' function instead")
+#define DEPR_V1_2_2 "Deprecated in v1.2.2"
+@deprecated("Please use the 'check' function instead." ++ DEPR_V1_2_2)
 extern Checker {
 /* body omitted */
 }
@@ -9053,7 +9098,8 @@ extern Checker {
 The `noWarn` annotation has a required string argument that indicates
 a compiler warning that will be inhibited. For example
 `@noWarn("unused")` on a declaration will prevent a compiler warning
-if that declaration is not used.
+if that declaration is not used. The parameter must be a local
+compile-time known value of type `string`.
 
 === Target-specific annotations
 
@@ -9078,6 +9124,7 @@ The P4 compiler should provide:
 
 * Clarified that numeric priorities cannot be assigned to entries of a table that has `const entries` (<<sec-entries>>).
 * Clarified that `switch` statements are allowed in action and function bodies, and that `switch` statements with `action_run` expressions are only allowed in control `apply` blocks (<<sec-stmts>> and <<sec-switch-stmt>>).
+* Added support for compile-time string concatenation using `++` operator (<<sec-string-type>> and <<sec-string-ops>>).
 
 === Summary of changes made in version 1.2.5, released October 11, 2024
 