Alan DeKok edited this page Dec 27, 2024 · 9 revisions

Proxying

Proxy implementations differ in how they handle retransmission. There was much discussion on this topic over the years in RADEXT, and the consensus (RFC 3539) was that proxies should be "pass through" for retransmissions, i.e. they should retransmit only when clients retransmit, and should not have their own retransmission timers.

Some proxies ignore all retransmissions from the client, and instead have their own timers for retransmission. This process is discouraged. See RFC 6613 Section 2.5, and RFC 3539 Section 2.8.

Other implementations do not have their own timers, and instead retransmit when the NAS retransmits. However, the danger here is that the retransmissions are then not always controlled by the NAS, but can sometimes be triggered by badly written supplicants! i.e. a supplicant which does not receive a response could "spam" the NAS with EAPoL packets, and a naive NAS would send a duplicate Access-Request for each one, which the proxies would then blindly retransmit.

Such behavior should be discouraged. RFC5080 Section 2.2.1 defines retransmissions, but does not give a lower bound. It also does not address the issues of proxies retransmitting due to client retransmits, or proxies implementing their own timers.

If a proxy relies on client retransmits to trigger outbound retransmits, then the proxy MUST limit those retransmits to no more than one per second. e.g. see the patch in FreeRADIUS
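The one-per-second limit could be sketched roughly as follows. This is an illustration only, not the FreeRADIUS patch: the `RetransmitLimiter` class, its method names, and the choice of request key (client address, port, packet ID, Request Authenticator) are all assumptions made for the example.

```python
import time


class RetransmitLimiter:
    """Allow at most one forwarded copy of a request per interval."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between forwarded copies
        self.clock = clock                # injectable so it can be tested
        self.last_forward = {}            # request key -> time of last forward

    def should_forward(self, key):
        """Return True if a packet with this key may be forwarded now."""
        now = self.clock()
        last = self.last_forward.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # duplicate arrived too soon: do not retransmit
        self.last_forward[key] = now
        return True

    def forget(self, key):
        """Drop state once the request is answered or has timed out."""
        self.last_forward.pop(key, None)
```

A proxy would call `should_forward()` for every packet received from a client, and call `forget()` when the reply arrives, so that the table does not grow without bound.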

Discuss pros and cons for client retransmits or proxy retransmits

  • proxy retransmits
    • con: retransmissions are decoupled from the NAS. This has ?? bad behavior?
    • pro: retransmissions are decoupled from the NAS. Most NAS retransmissions ignore RFC5080, and just do fixed intervals. Proxies are smarter, and can do exponential backoff, which lowers the impact of bad networks.
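For reference, an exponential backoff schedule of the kind a proxy could run might look like the sketch below. It assumes the IRT / MRT / MRC variables and the +/- 10% randomization described in RFC 5080 Section 2.2.1, with what are believed to be that section's default values. Those should be checked against the RFC. The function name `backoff_schedule` is hypothetical, and the MRD (maximum duration) limit is left out for brevity.

```python
import random


def backoff_schedule(irt=2.0, mrt=16.0, mrc=5, rand=random.uniform):
    """Yield successive retransmission timeouts, in seconds.

    irt: initial retransmission time
    mrt: cap on any single timeout
    mrc: number of timeouts to produce
    """
    rt = irt + rand(-0.1, 0.1) * irt
    for _ in range(mrc):
        yield rt
        # Double the previous timeout, with a little jitter.
        rt = 2 * rt + rand(-0.1, 0.1) * rt
        if rt > mrt:
            # Once the cap is reached, stay near it.
            rt = mrt + rand(-0.1, 0.1) * mrt
```

The jitter keeps many clients which lost packets at the same moment from all retransmitting at the same moment.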

Proxies either:

  • act as "pass through" forwarders for client packets (subject to rate limiting) and MUST NOT have their own retransmission timers

or

  • have their own retransmission timers, and therefore MUST NOT forward retransmissions from clients.

Implementations which implement both "pass through" forwarding and their own retransmission timers contribute to packet storms and congestive failures. These implementations are pathological, and should be fixed.

Proxy-State

The only real utility of Proxy-State is loop detection. It is possible to implement a RADIUS server / proxy which does nothing more than forward Proxy-State in requests and copy it into replies, as per https://datatracker.ietf.org/doc/html/rfc2865#section-5.33. A proxy doesn't in fact need to add a Proxy-State attribute in order to work in most situations.

However, some implementations filter / remove Proxy-State when forwarding. This behavior is pathological, and is forbidden. Implementations MUST NOT automatically filter / remove Proxy-State when proxying. Implementations SHOULD NOT permit administrators to create configurations or policy rules which filter / remove Proxy-State.

As an aside, the original Merit RADIUS server would "copy" Proxy-State attributes by using sscanf("%d") to turn the Proxy-State value into an integer, and then using printf("%d") on the integer to create a "new" Proxy-State for the proxied packet. It is difficult to understand the rationale behind such behavior. A simple "copy attribute" API would be less code, and would be more correct.

Loops are partially detected by simply counting the number of Proxy-State attributes, and then rejecting packets which contain "too many" Proxy-State attributes. (reject for auth, NAK for CoA, discard for Acct?)
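A minimal sketch of the counting check, assuming a decoded packet is represented as a list of `(type, value)` tuples. That representation, the function name, and the limit of 8 are all hypothetical. What counts as "too many" is a site policy decision.

```python
PROXY_STATE = 33      # attribute type, RFC 2865 Section 5.33
MAX_PROXY_STATE = 8   # hypothetical limit on the number of hops


def too_many_hops(attributes, limit=MAX_PROXY_STATE):
    """Return True if the packet carries more Proxy-State than allowed."""
    count = sum(1 for attr_type, _ in attributes if attr_type == PROXY_STATE)
    return count > limit
```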

Proxy implementations SHOULD create Proxy-State using a site-local 64-bit value, taken from a secure random number generator. This value can then be used to identify a particular proxy server, or set of servers. This value should be the same for all packets proxied by the server. This same value should be used by all "equivalent" proxies, such as ones used in a load-balance configuration. Using 64 bits means that, by the birthday bound, a collision between independently chosen values only becomes likely once there are on the order of 2^32 of them, which should be sufficient. i.e. there are significantly fewer than 2^32 RADIUS servers on the planet.

Any one of the equivalent proxy servers can then examine Proxy-State, and look for their unique token. If the token exists, then a loop has been detected.
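The token scheme could be sketched as below, again treating a packet as a list of `(type, value)` tuples. The function names are hypothetical. In practice the token would be generated once and stored in configuration shared by all equivalent proxies, rather than regenerated on each start.

```python
import secrets

PROXY_STATE = 33  # attribute type, RFC 2865 Section 5.33


def new_site_token():
    """Generate a site-local 64-bit token from a secure RNG."""
    return secrets.token_bytes(8)


def is_loop(attributes, site_token):
    """Return True if our own token is already present in the request."""
    return any(t == PROXY_STATE and v == site_token for t, v in attributes)


def add_proxy_state(attributes, site_token):
    """Return a copy of the attributes with our Proxy-State appended."""
    return list(attributes) + [(PROXY_STATE, site_token)]
```

A proxy would call `is_loop()` on each incoming request before forwarding, and only call `add_proxy_state()` if no loop was found.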

This recommendation isn't perfect. Other servers can still delete the Proxy-State, or copy the values that they see. But such corner cases are best addressed by administrator intervention.

This recommendation allows administrators and implementations to better detect inadvertent proxy loops, rather than deliberate misconfigurations.

Proxies should also NOT copy the Proxy-State attributes from the proxied reply to their reply. The downstream server may have deleted or modified the attributes. Instead a proxy should copy the Proxy-State from the original request to the reply.
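As a sketch of that rule, assuming the same hypothetical `(type, value)` list representation, and where `original_request` means the request as it was received from the client:

```python
PROXY_STATE = 33  # attribute type, RFC 2865 Section 5.33


def reply_attributes(original_request, upstream_reply):
    """Build reply attributes for the client.

    Any Proxy-State in the upstream reply is ignored. The Proxy-State
    attributes are taken, in order, from the original request instead.
    """
    reply = [(t, v) for t, v in upstream_reply if t != PROXY_STATE]
    reply += [(t, v) for t, v in original_request if t == PROXY_STATE]
    return reply
```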

Rate Limiting

It is possible for a malicious client to send enormous amounts of packets to proxies in a roaming consortium. There are a few different cases:

  • users immediately retrying after a reject. Home servers SHOULD implement a reject delay. Proxies MUST NOT implement a reject delay.

  • clients overloading upstream proxies with thousands of packets. e.g. when a power failure occurs, there may be thousands or tens of thousands of devices which are all trying to authenticate in a short period of time. Such behavior can result in congestive collapse, and overload of home servers. Clients (and proxies) should implement some kind of rate limiting.
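One common way to implement "some kind of rate limiting" is a token bucket. The sketch below is an illustration under that assumption. The text above does not mandate any particular algorithm, and the class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow `rate` packets per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock            # injectable so it can be tested
        self.tokens = float(burst)    # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Return True if one more packet may be sent now."""
        now = self.clock()
        # Refill in proportion to elapsed time, up to the burst size.
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A proxy might keep one bucket per client or per home server, and either delay or discard packets when `allow()` returns False.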

Load Balancing

This is partially for normal clients, but is more often appropriate for proxies.

Load balancing can be done a number of different ways:

  • round-robin. This works for PAP / accounting, but doesn't work for EAP. It should only be used for accounting.

  • some kind of keyed load balancing

RFC3539 has a lot of text on this, which has mostly been ignored.

Systems SHOULD do load balancing based on a hash of Calling-Station-Id. The alternative is to do "packet-by-packet" load balancing, which can break authentication methods which require multiple rounds. i.e. EAP.

Proxies which do not wish to examine packet contents MAY do load-balancing based on a hash of (client source IP + client source port). That also gets good distribution. But it assumes that all packets for one session (e.g. EAP) are originating from one client port.
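Both keying choices can be sketched with one helper. The function names are hypothetical, and the use of SHA-256 is an arbitrary choice for the example. Any hash with good distribution would do.

```python
import hashlib
import ipaddress


def pick_server(key, servers):
    """Map a key (bytes) to one server, always the same one for that key."""
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def key_from_source(ip, port):
    """Build a key from the client source IP address and source port."""
    return ipaddress.ip_address(ip).packed + port.to_bytes(2, "big")
```

For the first option the key is simply the raw value of Calling-Station-Id, e.g. `pick_server(b"00-11-22-33-44-55", servers)`. For the second it is `pick_server(key_from_source("192.0.2.1", 1812), servers)`. Note that a plain modulo remaps most keys when the list of servers changes, so sessions which are in progress at that time may be sent to a different server.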

Changing Transports

RFC 6613 does not discuss UDP -> TCP -> UDP proxying issues, and where retransmissions come from. Since there are no application-layer retransmissions over TCP, the server which proxies from TCP to UDP must instead run the retransmission timers itself.

https://datatracker.ietf.org/doc/html/rfc6613#section-2.5 says this about proxies:

   Intrinsically, proxy systems operate with multiple control loops
   instead of one end-to-end loop, and so they are less stable.  This is
   true even for TCP-TCP proxies.  As discussed in [RFC3539], the only
   way to achieve stability equivalent to a single TCP connection is to
   mimic the end-to-end behavior of a single TCP connection.  This
   typically is not achievable with an application-layer RADIUS
   implementation, regardless of transport.

So there isn't a lot which can be done here. We can give recommendations on what to do, but there is no "fix" which would work everywhere and be correct. RFC 3539 Section 2.8 has more text on this subject.