Commit 0ce07ae

Define Linux Network Devices
The proposed "netdevices" field provides a declarative way to specify which host network devices should be moved into a container's network namespace.

This approach is similar to the existing "devices" field used for block devices, but uses a dictionary keyed by the interface name instead.

The proposed scheme is based on the existing representation of network devices by the `struct net_device`: https://docs.kernel.org/networking/netdevices.html.

This proposal focuses solely on moving existing network devices into the container namespace. It does not cover the complexities of network configuration or network interface creation, emphasizing the separation of device management and network configuration.

Signed-off-by: Antonio Ojea <[email protected]>
1 parent d61dee6 commit 0ce07ae

10 files changed, +121 -0 lines changed

config-linux.md

+38
Original file line numberDiff line numberDiff line change
@@ -189,6 +189,42 @@ In addition to any devices configured with this setting, the runtime MUST also s
189189
* [`/dev/ptmx`][pts.4].
190190
A [bind-mount or symlink of the container's `/dev/pts/ptmx`][devpts].
191191

192+
## <a name="configLinuxNetworkDevices" />Network Devices
193+
194+
Linux network devices are entities that send and receive data packets.
195+
Unlike block devices, they are not represented as files in the `/dev` directory; instead, they are represented by the [`net_device`][net_device] data structure in the Linux kernel.
196+
A network device can belong to only one network namespace at a time and uses a set of operations distinct from regular file operations. Examples of network devices include Ethernet cards, loopback devices, and virtual devices like bridges, VLANs, and MACVLANs.
197+
198+
This schema focuses solely on moving existing network devices identified by name from the host network namespace into the container network namespace. It does not cover the complexities of network device creation or network configuration, such as IP address assignment, routing, and DNS setup.
199+
200+
**`netDevices`** (object, OPTIONAL) set of network devices that MUST be made available in the container. The runtime is responsible for providing these devices; the underlying mechanism is implementation-defined.
201+
202+
The runtime MUST check that it is possible to move the network interface to the container namespace and MUST [generate an error](runtime.md#errors) if the check fails.
203+
204+
The runtime MUST set the network device state to "up" after moving it to the network namespace to allow the container to send and receive network traffic through that device.
205+
206+
For proper container termination, the runtime must first set the device's state to "down" and then move it out of the namespace before the namespace is deleted. This ensures the device is inactive and avoids conflicts. If the container terminates abnormally and the runtime does not participate in the termination process, these steps might be skipped and the kernel handles the cleanup, as described in [network_namespaces(7)][net_namespaces.7]: "When a network namespace is freed (i.e., when the last process in the namespace terminates), its physical network devices are moved back to the initial network namespace". Note that after a network namespace is deleted, all of its migratable network devices are moved to the initial network namespace, whereas virtual devices (veth, macvlan, ...) are destroyed.
207+
208+
The name of the network device is the entry key.
209+
Entry values are objects with the following properties:
210+
211+
* **`name`** *(string, OPTIONAL)* - the name of the network device inside the container namespace. If not specified, the host name is used. Network device names are unique within a network namespace; if a network device with the same name already exists, the rename operation will fail. The runtime MAY check that the name is unique before the rename operation.
212+
When participating in container termination, the runtime must restore the original name to guarantee that the operations are idempotent, so that a container that moves an interface and renames it can be created and destroyed multiple times with the same result.
213+
214+
### Example
215+
216+
#### Moving a device and renaming it inside the container
217+
218+
```json
219+
"netDevices": {
220+
"eth0" : {
221+
"name": "container_eth0"
222+
}
223+
}
224+
```
225+
226+
This configuration will move the device named "eth0" from the host into the container's network namespace. Inside the container, the device will be named "container_eth0".
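
As a rough illustration of the `name` semantics above, the following Go sketch decodes the value of a `netDevices` object and computes the name each device would have inside the container. It is not taken from any runtime: `netDevice` and `containerNames` are hypothetical names, and the duplicate check only compares the configured entries with each other. It does not look at devices that already exist in the container's namespace.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// netDevice mirrors the single "name" property of a netDevices entry.
// (Hypothetical type for this sketch.)
type netDevice struct {
	Name string `json:"name,omitempty"`
}

// containerNames decodes the value of a "netDevices" object and returns,
// for each host interface, the name it would have inside the container.
// An entry without "name" keeps its host name. Two entries that resolve to
// the same container name are rejected, because device names are unique
// within a network namespace.
func containerNames(raw []byte) (map[string]string, error) {
	var devs map[string]netDevice
	if err := json.Unmarshal(raw, &devs); err != nil {
		return nil, err
	}

	hosts := make([]string, 0, len(devs))
	for host := range devs {
		hosts = append(hosts, host)
	}
	sort.Strings(hosts) // deterministic order for error messages

	names := make(map[string]string, len(devs))
	owner := make(map[string]string, len(devs))
	for _, host := range hosts {
		name := devs[host].Name
		if name == "" {
			name = host // default: keep the host name
		}
		if prev, ok := owner[name]; ok {
			return nil, fmt.Errorf("devices %q and %q both map to %q", prev, host, name)
		}
		owner[name] = host
		names[host] = name
	}
	return names, nil
}
```

Calling `containerNames` on `{"eth0": {"name": "container_eth0"}}` maps `eth0` to `container_eth0`, matching the example above. To run the sketch standalone, call it from a `main` function.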
227+
192228
## <a name="configLinuxControlGroups" />Control groups
193229

194230
Also known as cgroups, they are used to restrict resource usage for a container and handle device access.
@@ -982,6 +1018,8 @@ subset of the available options.
9821018
[mknod.1]: https://man7.org/linux/man-pages/man1/mknod.1.html
9831019
[mknod.2]: https://man7.org/linux/man-pages/man2/mknod.2.html
9841020
[namespaces.7_2]: https://man7.org/linux/man-pages/man7/namespaces.7.html
1021+
[net_device]: https://docs.kernel.org/networking/netdevices.html
1022+
[net_namespaces.7]: https://man7.org/linux/man-pages/man7/network_namespaces.7.html
9851023
[null.4]: https://man7.org/linux/man-pages/man4/null.4.html
9861024
[personality.2]: https://man7.org/linux/man-pages/man2/personality.2.html
9871025
[pts.4]: https://man7.org/linux/man-pages/man4/pts.4.html
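
The setup and teardown ordering described in the Network Devices text of `config-linux.md` can be sketched in Go as follows. This is only an illustration under stated assumptions: `linkOps` is a hypothetical interface standing in for whatever netlink calls a runtime actually issues, `recorder` is a fake that just logs calls, and the position of the rename step (after the move on setup, before the move on teardown) is a choice made for this sketch, not something the text specifies.

```go
package main

import "fmt"

// linkOps is a hypothetical abstraction over the operations a runtime would
// perform on a network device. It is not part of the spec or of any library.
type linkOps interface {
	CanMove(host string) error
	MoveToContainer(host string) error
	MoveToHost(name string) error
	Rename(from, to string) error
	SetUp(name string) error
	SetDown(name string) error
}

// attach follows the setup order in the text: check, move, then set "up".
// Renaming between the move and "up" is an assumption of this sketch.
func attach(ops linkOps, host, name string) error {
	if name == "" {
		name = host // default: keep the host name
	}
	if err := ops.CanMove(host); err != nil {
		return fmt.Errorf("cannot move %q: %w", host, err)
	}
	if err := ops.MoveToContainer(host); err != nil {
		return err
	}
	if name != host {
		if err := ops.Rename(host, name); err != nil {
			return err
		}
	}
	return ops.SetUp(name)
}

// detach follows the teardown order in the text: set "down", restore the
// original name, then move the device out before the namespace is deleted.
func detach(ops linkOps, host, name string) error {
	if name == "" {
		name = host
	}
	if err := ops.SetDown(name); err != nil {
		return err
	}
	if name != host {
		if err := ops.Rename(name, host); err != nil {
			return err
		}
	}
	return ops.MoveToHost(host)
}

// recorder is a fake linkOps that only logs calls, so the ordering can be
// inspected without privileges or real interfaces.
type recorder struct{ calls []string }

func (r *recorder) record(call string) error {
	r.calls = append(r.calls, call)
	return nil
}
func (r *recorder) CanMove(host string) error         { return r.record("check " + host) }
func (r *recorder) MoveToContainer(host string) error { return r.record("move-in " + host) }
func (r *recorder) MoveToHost(name string) error      { return r.record("move-out " + name) }
func (r *recorder) Rename(from, to string) error      { return r.record("rename " + from + "->" + to) }
func (r *recorder) SetUp(name string) error           { return r.record("up " + name) }
func (r *recorder) SetDown(name string) error         { return r.record("down " + name) }
```

Running `attach` and then `detach` against a `recorder` leaves the device under its original host name, which is the idempotence property the `name` description asks for.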

features-linux.md

+14
Original file line numberDiff line numberDiff line change
@@ -228,3 +228,17 @@ Irrelevant to the availability of Intel RDT on the host operating system.
228228
}
229229
}
230230
```
231+
232+
## <a name="linuxFeaturesNetDevices" />NetDevices
233+
234+
**`netDevices`** (object, OPTIONAL) represents the runtime's implementation status of Linux network devices.
235+
236+
* **`enabled`** (bool, OPTIONAL) indicates whether the runtime supports moving Linux network devices into the container's network namespace.
237+
238+
### Example
239+
240+
```json
241+
"netDevices": {
242+
"enabled": true
243+
}
244+
```

schema/config-linux.json

+6
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,12 @@
99
"$ref": "defs-linux.json#/definitions/Device"
1010
}
1111
},
12+
"netDevices": {
13+
"type": "object",
14+
"additionalProperties": {
15+
"$ref": "defs-linux.json#/definitions/NetDevice"
16+
}
17+
},
1218
"uidMappings": {
1319
"type": "array",
1420
"items": {

schema/defs-linux.json

+8
Original file line numberDiff line numberDiff line change
@@ -189,6 +189,14 @@
189189
}
190190
}
191191
},
192+
"NetDevice": {
193+
"type": "object",
194+
"properties": {
195+
"name": {
196+
"type": "string"
197+
}
198+
}
199+
},
192200
"weight": {
193201
"$ref": "defs.json#/definitions/uint16"
194202
},

schema/features-linux.json

+8
Original file line numberDiff line numberDiff line change
@@ -110,6 +110,14 @@
110110
}
111111
}
112112
}
113+
},
114+
"netDevices": {
115+
"type": "object",
116+
"properties": {
117+
"enabled": {
118+
"type": "boolean"
119+
}
120+
}
113121
}
114122
}
115123
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
{
2+
"ociVersion": "1.0.0",
3+
"root": {
4+
"path": "rootfs"
5+
},
6+
"linux": {
7+
"netDevices": {
8+
"eth0": {
9+
"name": 23
10+
}
11+
}
12+
}
13+
}
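
The config above sets `name` to the number 23, which the `NetDevice` schema definition (`"type": "string"`) does not allow, so it is presumably a negative test case (its file name is not shown here). The Go sketch below shows the analogous failure when decoding with `encoding/json` into a string field. This is Go decoding, not JSON Schema validation, and `config`, `linuxNetDevice`, and `hasTypeMismatch` are hypothetical names trimmed down for this sketch.

```go
package main

import (
	"encoding/json"
	"errors"
)

// linuxNetDevice mirrors the "name" property of a netDevices entry.
type linuxNetDevice struct {
	Name string `json:"name,omitempty"`
}

// config is a trimmed-down stand-in for the full runtime config; fields
// that are not declared here are ignored by encoding/json.
type config struct {
	Linux struct {
		NetDevices map[string]linuxNetDevice `json:"netDevices,omitempty"`
	} `json:"linux"`
}

// hasTypeMismatch reports whether decoding raw fails because some JSON
// value has the wrong type for the Go field it maps to.
func hasTypeMismatch(raw []byte) bool {
	var c config
	err := json.Unmarshal(raw, &c)
	var typeErr *json.UnmarshalTypeError
	return errors.As(err, &typeErr)
}
```

With the config above as input, `hasTypeMismatch` returns `true`; with `"name": "container_eth0"` it returns `false`.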
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
{
2+
"ociVersion": "1.0.0",
3+
"root": {
4+
"path": "rootfs"
5+
},
6+
"linux": {
7+
"netDevices": {
8+
"eth0": {
9+
"name": "container_eth0"
10+
},
11+
"ens4": {},
12+
"ens5": {}
13+
}
14+
}
15+
}

schema/test/features/good/runc.json

+3
Original file line numberDiff line numberDiff line change
@@ -182,6 +182,9 @@
182182
},
183183
"selinux": {
184184
"enabled": true
185+
},
186+
"netDevices": {
187+
"enabled": true
185188
}
186189
},
187190
"annotations": {

specs-go/config.go

+8
Original file line numberDiff line numberDiff line change
@@ -236,6 +236,8 @@ type Linux struct {
236236
Namespaces []LinuxNamespace `json:"namespaces,omitempty"`
237237
// Devices are a list of device nodes that are created for the container
238238
Devices []LinuxDevice `json:"devices,omitempty"`
239+
// NetDevices maps host network device names to the devices that are moved into the container's network namespace.
240+
NetDevices map[string]LinuxNetDevice `json:"netDevices,omitempty"`
239241
// Seccomp specifies the seccomp security settings for the container.
240242
Seccomp *LinuxSeccomp `json:"seccomp,omitempty"`
241243
// RootfsPropagation is the rootfs mount propagation mode for the container.
@@ -491,6 +493,12 @@ type LinuxDevice struct {
491493
GID *uint32 `json:"gid,omitempty"`
492494
}
493495

496+
// LinuxNetDevice represents a single network device to be added to the container's network namespace
497+
type LinuxNetDevice struct {
498+
// Name of the device in the container namespace
499+
Name string `json:"name,omitempty"`
500+
}
501+
494502
// LinuxDeviceCgroup represents a device rule for the devices specified to
495503
// the device controller
496504
type LinuxDeviceCgroup struct {
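
To show how the `omitempty` tags in this hunk affect the serialized form, here is a minimal Go sketch. `LinuxNetDevice` is copied from the diff; the `Linux` struct is cut down to the one new field, and `encode` is a hypothetical helper that exists only for this example.

```go
package main

import "encoding/json"

// LinuxNetDevice is copied from the diff above.
type LinuxNetDevice struct {
	// Name of the device in the container namespace
	Name string `json:"name,omitempty"`
}

// Linux is reduced to the single field added by this commit; the real
// struct has many more fields.
type Linux struct {
	NetDevices map[string]LinuxNetDevice `json:"netDevices,omitempty"`
}

// encode marshals l to JSON. encoding/json writes map keys in sorted order,
// drops an empty Name, and drops a nil or empty NetDevices map entirely.
func encode(l Linux) (string, error) {
	b, err := json.Marshal(l)
	return string(b), err
}
```

An entry with no `Name` therefore serializes as `{}`, which matches the `"ens4": {}` form used in the passing test config.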

specs-go/features/features.go

+8
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,7 @@ type Linux struct {
4848
Selinux *Selinux `json:"selinux,omitempty"`
4949
IntelRdt *IntelRdt `json:"intelRdt,omitempty"`
5050
MountExtensions *MountExtensions `json:"mountExtensions,omitempty"`
51+
NetDevices *NetDevices `json:"netDevices,omitempty"`
5152
}
5253

5354
// Cgroup represents the "cgroup" field.
@@ -143,3 +144,10 @@ type IDMap struct {
143144
// Nil value means "unknown", not "false".
144145
Enabled *bool `json:"enabled,omitempty"`
145146
}
147+
148+
// NetDevices represents the "netDevices" field.
149+
type NetDevices struct {
150+
// Enabled is true if network device support is compiled in.
151+
// Nil value means "unknown", not "false".
152+
Enabled *bool `json:"enabled,omitempty"`
153+
}
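
Because `Enabled` is a `*bool` whose nil value means "unknown", a consumer of the features document has three cases to handle. The Go sketch below shows one way to do that. `NetDevices` is copied from the diff, while `linuxFeatures` and `netDevicesSupport` are hypothetical names, and treating a missing `netDevices` object as "unknown" is an assumption of this sketch.

```go
package main

import "encoding/json"

// NetDevices is copied from the diff above.
type NetDevices struct {
	// Nil value means "unknown", not "false".
	Enabled *bool `json:"enabled,omitempty"`
}

// linuxFeatures is reduced to the single field relevant here.
type linuxFeatures struct {
	NetDevices *NetDevices `json:"netDevices,omitempty"`
}

// netDevicesSupport decodes the "linux" features object and returns
// "supported", "unsupported", or "unknown".
func netDevicesSupport(raw []byte) (string, error) {
	var f linuxFeatures
	if err := json.Unmarshal(raw, &f); err != nil {
		return "", err
	}
	switch {
	case f.NetDevices == nil || f.NetDevices.Enabled == nil:
		return "unknown", nil // absent is not the same as false
	case *f.NetDevices.Enabled:
		return "supported", nil
	default:
		return "unsupported", nil
	}
}
```

A caller that needs the feature would likely treat "unknown" differently from "unsupported", for example by probing the runtime rather than failing immediately.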
