@@ -16,14 +16,15 @@ Enter `Services`.
A Kubernetes `Service` is an abstraction which defines a logical set of `Pods`
and a policy by which to access them - sometimes called a micro-service. The
- set of `Pods` targeted by a `Service` is determined by a [`Label
- Selector`](labels.md).
+ set of `Pods` targeted by a `Service` is (usually) determined by a [`Label
+ Selector`](labels.md) (see below for why you might want a `Service` without a
+ selector).

As an example, consider an image-processing backend which is running with 3
replicas. Those replicas are fungible - frontends do not care which backend
they use. While the actual `Pods` that compose the backend set may change, the
- frontend clients should not need to manage that themselves. The `Service`
- abstraction enables this decoupling.
+ frontend clients should not need to be aware of that or keep track of the list
+ of backends themselves. The `Service` abstraction enables this decoupling.

For Kubernetes-native applications, Kubernetes offers a simple `Endpoints` API
that is updated whenever the set of `Pods` in a `Service` changes. For
@@ -37,16 +38,12 @@ REST objects, a `Service` definition can be POSTed to the apiserver to create a
new instance. For example, suppose you have a set of `Pods` that each expose
port 9376 and carry a label "app=MyApp".

-
```json
{
    "kind": "Service",
    "apiVersion": "v1beta3",
    "metadata": {
-         "name": "my-service",
-         "labels": {
-             "environment": "testing"
-         }
+         "name": "my-service"
    },
    "spec": {
        "selector": {
@@ -64,22 +61,34 @@ port 9376 and carry a label "app=MyApp".
```

This specification will create a new `Service` object named "my-service" which
- targets TCP port 9376 on any `Pod` with the "app=MyApp" label. Every `Service`
- is also assigned a virtual IP address (called the "portal IP"), which is used by
- the service proxies (see below). The `Service`'s selector will be evaluated
- continuously and the results will be posted in an `Endpoints` object also named
- "my-service".
+ targets TCP port 9376 on any `Pod` with the "app=MyApp" label. This `Service`
+ will also be assigned an IP address (sometimes called the "portal IP"), which
+ is used by the service proxies (see below). The `Service`'s selector will be
+ evaluated continuously and the results will be posted in an `Endpoints` object
+ also named "my-service".
+
+ Note that a `Service` can map an incoming port to any `targetPort`. By default
+ the `targetPort` is the same as the `port` field. Perhaps more interesting is
+ that `targetPort` can be a string, referring to the name of a port in the
+ backend `Pod`s. The actual port number assigned to that name can be different
+ in each backend `Pod`. This offers a lot of flexibility for deploying and
+ evolving your `Service`s. For example, you can change the port number that
+ pods expose in the next version of your backend software, without breaking
+ clients.
+
+ Kubernetes `Service`s support `TCP` and `UDP` for protocols. The default
+ is `TCP`.
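+
+ For example, the following `Service` routes its port 80 to whichever port each
+ backend `Pod` has named "http" (a sketch - it assumes the `Pod`s in question
+ declare a container port with that name):
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": "http"
+             }
+         ]
+     }
+ }
+ ```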

### Services without selectors

- Services, in addition to providing abstractions to access `Pods`, can also
- abstract any kind of backend. For example:
+ Services generally abstract access to Kubernetes `Pods`, but they can also
+ abstract other kinds of backends. For example:
- you want to have an external database cluster in production, but in test
-   you use your own databases.
+   you use your own databases
- you want to point your service to a service in another
-   [`Namespace`](namespaces.md) or on another cluster.
+   [`Namespace`](namespaces.md) or on another cluster
- you are migrating your workload to Kubernetes and some of your backends run
-   outside of Kubernetes.
+   outside of Kubernetes

In any of these scenarios you can define a service without a selector:

@@ -102,7 +111,8 @@ In any of these scenarios you can define a service without a selector:
}
```

- Then you can manually map the service to a specific endpoint(s):
+ Because this has no selector, the corresponding `Endpoints` object will not be
+ created. You can manually map the service to your own specific endpoints:

```json
{
@@ -135,8 +145,8 @@ watches the Kubernetes master for the addition and removal of `Service`
and `Endpoints` objects. For each `Service` it opens a port (random) on the
local node. Any connections made to that port will be proxied to one of the
corresponding backend `Pods`. Which backend to use is decided based on the
- AffinityPolicy of the `Service`. Lastly, it installs iptables rules which
- capture traffic to the `Service`'s `Port` on the `Service`'s portal IP (which
+ `SessionAffinity` of the `Service`. Lastly, it installs iptables rules which
+ capture traffic to the `Service`'s `Port` on the `Service`'s cluster IP (which
is entirely virtual) and redirects that traffic to the previously described
port.

@@ -146,12 +156,59 @@ appropriate backend without the clients knowing anything about Kubernetes or

![Services overview diagram](services_overview.png)

- By default, the choice of backend is random. Client-IP-based session affinity
- can be selected by setting `service.spec.sessionAffinity` to `"ClientIP"`.
+ By default, the choice of backend is random. Client-IP based session affinity
+ can be selected by setting `service.spec.sessionAffinity` to `"ClientIP"` (the
+ default is `"None"`).
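+
+ A sketch of a `Service` with client-IP affinity enabled (all other fields as
+ in the earlier examples) might look like:
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": 9376
+             }
+         ],
+         "sessionAffinity": "ClientIP"
+     }
+ }
+ ```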

As of Kubernetes 1.0, `Service`s are a "layer 3" (TCP/UDP over IP) construct. We do not
yet have a concept of "layer 7" (HTTP) services.

+ ## Multi-Port Services
+
+ Many `Service`s need to expose more than one port. For this case, Kubernetes
+ supports multiple port definitions on a `Service` object. When using multiple
+ ports you must give all of your ports names, so that endpoints can be
+ disambiguated. For example:
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "name": "http",
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": 9376
+             },
+             {
+                 "name": "https",
+                 "protocol": "TCP",
+                 "port": 443,
+                 "targetPort": 9377
+             }
+         ]
+     }
+ }
+ ```
+
+ ## Choosing your own PortalIP address
+
+ A user can specify their own `PortalIP` address as part of a `Service` creation
+ request - for example, if they already have an existing DNS entry that they
+ wish to reuse, or legacy systems that are configured for a specific IP
+ address and are difficult to re-configure. The `PortalIP` address that a user
+ chooses must be a valid IP address and within the portal_net CIDR range that is
+ specified by flag to the API server. If the `PortalIP` value is invalid, the
+ apiserver returns a 422 HTTP status code to indicate that the value is invalid.
+
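+ For example (a sketch - the chosen address must fall inside the cluster's
+ portal_net range, and the value shown here is purely illustrative):
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "portalIP": "10.0.171.239",
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": 9376
+             }
+         ]
+     }
+ }
+ ```
+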
### Why not use round-robin DNS?

A question that pops up every now and then is why we do all this stuff with
@@ -208,66 +265,104 @@ DNS records for each. If DNS has been enabled throughout the cluster then all
For example, if you have a `Service` called "my-service" in Kubernetes
`Namespace` "my-ns" a DNS record for "my-service.my-ns" is created. `Pods`
which exist in the "my-ns" `Namespace` should be able to find it by simply doing
- a name lookup for "my-service". `Pods` which exist in other `Namespaces` must
+ a name lookup for "my-service". `Pods` which exist in other `Namespace`s must
qualify the name as "my-service.my-ns". The result of these name lookups is the
- virtual portal IP.
+ cluster IP.
+
+ We will soon add DNS support for multi-port `Service`s in the form of SRV
+ records.

## Headless Services

- Sometimes you don't need or want a single virtual IP. In this case, you can
- create "headless" services by specifying "None" for the PortalIP. For such
- services, a virtual IP is not allocated, DNS is not configured (this will be
- fixed), and service-specific environment variables for pods are not created.
- Additionally, the kube proxy does not handle these services and there is no
- load balancing or proxying done by the platform for them. The endpoints
- controller will still create endpoint records in the API for such services.
- These services also take advantage of any UI, readiness probes, etc. that are
- applicable for services in general.
-
- The tradeoff for a developer would be whether to couple to the Kubernetes API
- or to a particular discovery system. Applications can still use a
- self-registration pattern and adapters for other discovery systems could be
- built upon this API, as well.
+ Sometimes you don't need or want a single service IP. In this case, you can
+ create "headless" services by specifying `"None"` for the `PortalIP`. For such
+ `Service`s, a cluster IP is not allocated and service-specific environment
+ variables for `Pod`s are not created. DNS is configured to return multiple A
+ records (addresses) for the `Service` name, which point directly to the `Pod`s
+ backing the `Service`. Additionally, the kube proxy does not handle these
+ services and there is no load balancing or proxying done by the platform for
+ them. The endpoints controller will still create `Endpoints` records in the
+ API.
+
+ This option allows developers to reduce coupling to the Kubernetes system, if
+ they desire, but leaves them freedom to do discovery in their own way.
+ Applications can still use a self-registration pattern and adapters for other
+ discovery systems could easily be built upon this API.
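+
+ As an illustration, a headless version of the earlier `Service` would simply
+ set `"None"` as the `portalIP` (a sketch):
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "portalIP": "None",
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": 9376
+             }
+         ]
+     }
+ }
+ ```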

## External Services

For some parts of your application (e.g. frontends) you may want to expose a
Service onto an external (outside of your cluster, maybe public internet) IP
- address.
-
- On cloud providers which support external load balancers, this should be as
- simple as setting the `createExternalLoadBalancer` flag of the `Service` spec
- to `true`. This sets up a cloud-specific load balancer and populates the
- `publicIPs` field of the spec (see below). Traffic from the external load
- balancer will be directed at the backend `Pods`, though exactly how that works
- depends on the cloud provider.
-
- For cloud providers which do not support external load balancers, there is
- another approach that is a bit more "do-it-yourself" - the `publicIPs` field.
- Any address you put into the `publicIPs` array will be handled the same as the
- portal IP - the kube-proxy will install iptables rules which proxy traffic
- through to the backends. You are then responsible for ensuring that traffic to
- those IPs gets sent to one or more Kubernetes `Nodes`. As long as the traffic
- arrives at a Node, it will be subject to the iptables rules.
-
- A common situation is when a `Node` has both internal and an external network
- interfaces. If you put that `Node`'s external IP in `publicIPs`, you can
- then aim traffic at the `Service` port on that `Node` and it will be proxied to
- the backends. If you set all `Node`s' external IPs as `publicIPs` you can then
- reach a `Service` through any `Node`, which means you can build your own
- load-balancer or even just use DNS round-robin. The downside to this approach
- is that all such `Service`s share a port space - only one of them can have port
- 80, for example.
+ address. Kubernetes supports two ways of doing this: `NodePort`s and
+ `LoadBalancer`s.

- ## Choosing your own PortalIP address
+ Every `Service` has a `Type` field which defines how the `Service` can be
+ accessed. Valid values for this field are:
+ - ClusterIP: use a cluster-internal IP (portal) only - this is the default
+ - NodePort: use a cluster IP, but also expose the service on a port on each
+   node of the cluster (the same port on each)
+ - LoadBalancer: use a ClusterIP and a NodePort, but also ask the cloud
+   provider for a load balancer which forwards to the `Service`

- A user can specify their own `PortalIP` address as part of a service creation
- request. For example, if they already have an existing DNS entry that they
- wish to replace, or legacy systems that are configured for a specific IP
- address and difficult to re-configure. The `PortalIP` address that a user
- chooses must be a valid IP address and within the portal net CIDR range that is
- specified by flag to the API server. If the PortalIP value is invalid, the
- apiserver returns a 422 HTTP status code to indicate that the value is invalid.
+ Note that while `NodePort`s can be TCP or UDP, `LoadBalancer`s only support TCP
+ as of Kubernetes 1.0.
+
+ ### Type = NodePort
+
+ If you set the `type` field to `"NodePort"`, the Kubernetes master will
+ allocate you a port (from a flag-configured range) on each node for each port
+ exposed by your `Service`. That port will be reported in your `Service`'s
+ `spec.ports[*].nodePort` field. If you specify a value in that field, the
+ system will allocate you that port or else will fail the API transaction.
+
+ This gives developers the freedom to set up their own load balancers, to
+ configure cloud environments that are not fully supported by Kubernetes, or
+ even to just expose one or more nodes' IPs directly.
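+
+ A sketch of a `NodePort` `Service` (the `nodePort` value here is illustrative
+ and must fall within the flag-configured range):
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "type": "NodePort",
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": 9376,
+                 "nodePort": 30061
+             }
+         ]
+     }
+ }
+ ```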
+
+ ### Type = LoadBalancer
+
+ On cloud providers which support external load balancers, setting the `type`
+ field to `"LoadBalancer"` will provision a load balancer for your `Service`.
+ The actual creation of the load balancer happens asynchronously, and
+ information about the provisioned balancer will be published in the `Service`'s
+ `status.loadBalancer` field. For example:
+
+ ```json
+ {
+     "kind": "Service",
+     "apiVersion": "v1beta3",
+     "metadata": {
+         "name": "my-service"
+     },
+     "spec": {
+         "selector": {
+             "app": "MyApp"
+         },
+         "ports": [
+             {
+                 "protocol": "TCP",
+                 "port": 80,
+                 "targetPort": 9376,
+                 "nodePort": 30061
+             }
+         ],
+         "portalIP": "10.0.171.239",
+         "type": "LoadBalancer"
+     },
+     "status": {
+         "loadBalancer": {
+             "ingress": [
+                 {
+                     "ip": "146.148.47.155"
+                 }
+             ]
+         }
+     }
+ }
+ ```
+
+ Traffic from the external load balancer will be directed at the backend `Pods`,
+ though exactly how that works depends on the cloud provider.

## Shortcomings

@@ -280,6 +375,13 @@ details.
Using the kube-proxy obscures the source-IP of a packet accessing a `Service`.
This makes some kinds of firewalling impossible.

+ LoadBalancers only support TCP, not UDP.
+
+ The `Type` field is designed as nested functionality - each level adds to the
+ previous. This is not strictly required on all cloud providers (e.g. GCE does
+ not need to allocate a `NodePort` to make `LoadBalancer` work, but AWS does)
+ but the current API requires it.
+
## Future work

In the future we envision that the proxy policy can become more nuanced than
@@ -293,11 +395,11 @@ eliminate userspace proxying in favor of doing it all in iptables. This should
perform better and fix the source-IP obfuscation, though is less flexible than
arbitrary userspace code.

- We hope to make the situation around external load balancers and public IPs
- simpler and easier to comprehend.
-
We intend to have first-class support for L7 (HTTP) `Service`s.

+ We intend to have more flexible ingress modes for `Service`s which encompass
+ the current `ClusterIP`, `NodePort`, and `LoadBalancer` modes and more.
+
## The gory details of portals

The previous information should be sufficient for many people who just want to
@@ -348,9 +450,9 @@ When a client connects to the portal the iptables rule kicks in, and redirects
the packets to the `Service proxy`'s own port. The `Service proxy` chooses a
backend, and starts proxying traffic from the client to the backend.

- This means that `Service` owners can choose any `Service` port they want without
- risk of collision. Clients can simply connect to an IP and port, without
- being aware of which `Pods` they are actually accessing.
+ This means that `Service` owners can choose any port they want without risk of
+ collision. Clients can simply connect to an IP and port, without being aware
+ of which `Pod`s they are actually accessing.

![Services detailed diagram](services_detail.png)
