fix(TCP): only apply options to SYN packets#845
fix(TCP): only apply options to SYN packets#845p-l- wants to merge 1 commit intorobertdavidgraham:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Updates TCP packet generation so SYN-only TCP options from the shared template are only emitted on segments with the SYN flag set, aligning generated traffic more closely with relevant RFC requirements and reducing non-SYN fingerprinting artifacts.
Changes:
- Emit a bare 20-byte TCP header (no template options) for non-SYN segments in both IPv4 and IPv6 paths.
- Rewrite the TCP data-offset nibble to
5for non-SYN packets after dropping options. - Add a selftest to assert options are present on SYN and absent on common non-SYN flags (RST/ACK/PSH|ACK).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /* Per RFC 9293 (MUST-65) and the option-specific RFCs, the SYN-only | ||
| * options carried by the shared TCP template (MSS, Window Scale, | ||
| * SACK-Permitted, Timestamp during negotiation) MUST NOT appear in | ||
| * non-SYN segments. The template is a SYN-with-options; for non-SYN | ||
| * packets we emit a bare 20-byte TCP header instead. */ |
There was a problem hiding this comment.
The RFC 9293 MUST-65 constraint here applies to MSS (and similar handshake-only options like Window Scale and SACK-Permitted), but the comment also lists “Timestamp during negotiation” as SYN-only and says it MUST NOT appear on non-SYN segments. TCP timestamps (RFC 7323) are commonly sent on non-SYN segments after they’ve been negotiated, so this wording is misleading even if the implementation intentionally strips all options. Please reword to avoid claiming the timestamp option is SYN-only/MUST-NOT on non-SYN, and clarify which options are being restricted by RFC vs which are being omitted for scanning/fingerprinting reasons.
| /* Per RFC 9293 (MUST-65) and the option-specific RFCs, the SYN-only | |
| * options carried by the shared TCP template (MSS, Window Scale, | |
| * SACK-Permitted, Timestamp during negotiation) MUST NOT appear in | |
| * non-SYN segments. The template is a SYN-with-options; for non-SYN | |
| * packets we emit a bare 20-byte TCP header instead. */ | |
| /* The shared TCP template is a SYN-with-options. For non-SYN segments, | |
| * we emit a bare 20-byte TCP header instead. | |
| * | |
| * This matches RFC 9293 (MUST-65) and the option-specific RFCs for | |
| * handshake-only options such as MSS, Window Scale, and | |
| * SACK-Permitted, which are not sent on non-SYN segments. TCP | |
| * timestamps are different: after they have been negotiated in the | |
| * handshake, RFC 7323 allows them on later non-SYN segments. We still | |
| * omit them here, along with any other TCP options, as a template/ | |
| * scanning choice rather than because non-SYN timestamps are | |
| * forbidden by RFC. */ |
| unsigned i, found_mss = 0; | ||
| for (i = 20; i + 1 < hdr_len; ) { | ||
| unsigned kind = buf[offset_tcp+i]; | ||
| if (kind == 0) break; | ||
| if (kind == 1) { i++; continue; } | ||
| if (kind == 2) { found_mss = 1; break; } | ||
| if (i + 1 >= hdr_len) break; | ||
| i += buf[offset_tcp+i+1]; | ||
| } | ||
| if (!found_mss) { | ||
| fprintf(stderr, "[-] tcp_create_packet: SYN missing MSS option\n"); | ||
| failures++; |
There was a problem hiding this comment.
The TCP option scan loop can hang if it encounters a malformed option with a length byte of 0 or 1 (i += buf[offset_tcp+i+1] won’t advance). Since this is a selftest intended to catch regressions, it should defensively validate option lengths (>=2) and bounds (i + opt_len <= hdr_len), and break/fail the test on invalid encodings instead of risking an infinite loop.
| unsigned i, found_mss = 0; | |
| for (i = 20; i + 1 < hdr_len; ) { | |
| unsigned kind = buf[offset_tcp+i]; | |
| if (kind == 0) break; | |
| if (kind == 1) { i++; continue; } | |
| if (kind == 2) { found_mss = 1; break; } | |
| if (i + 1 >= hdr_len) break; | |
| i += buf[offset_tcp+i+1]; | |
| } | |
| if (!found_mss) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing MSS option\n"); | |
| failures++; | |
| unsigned i, found_mss = 0, invalid_opts = 0; | |
| if (hdr_len < 20) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN has invalid TCP header length\n"); | |
| failures++; | |
| } else { | |
| for (i = 20; i + 1 < hdr_len; ) { | |
| unsigned kind = buf[offset_tcp+i]; | |
| if (kind == 0) | |
| break; | |
| if (kind == 1) { | |
| i++; | |
| continue; | |
| } | |
| if (kind == 2) { | |
| found_mss = 1; | |
| break; | |
| } | |
| { | |
| unsigned opt_len = buf[offset_tcp+i+1]; | |
| if (opt_len < 2 || i + opt_len > hdr_len) { | |
| invalid_opts = 1; | |
| break; | |
| } | |
| i += opt_len; | |
| } | |
| } | |
| if (invalid_opts) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN has malformed TCP options\n"); | |
| failures++; | |
| } else if (!found_mss) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing MSS option\n"); | |
| failures++; | |
| } |
That's what is supposed to happen per RFCs. I discovered that randomly by looking at a capture file and seeing options highlighted by Wireshark as weird in a non-SYN packet. Test on random hosts on the Internet show 0.5% more banners collected. That's not strong but at least this PR does not seem to break banner collection. I was not expecting an increase, this PR should limit the ability to fingerprint Masscan non-SYN packets.
5c904f2 to
5f1aa08
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (len == 0 || ((buf[offset_tcp+12] & 0xF0) >> 4) <= 5) { | ||
| fprintf(stderr, "[-] tcp_create_packet: SYN missing options\n"); | ||
| failures++; | ||
| } else { | ||
| unsigned hdr_len = ((buf[offset_tcp+12] & 0xF0) >> 4) * 4; | ||
| unsigned i, found_mss = 0, bad_opt = 0; | ||
| for (i = 20; i + 1 < hdr_len; ) { | ||
| unsigned kind = buf[offset_tcp+i]; | ||
| unsigned opt_len; | ||
| if (kind == 0) break; | ||
| if (kind == 1) { i++; continue; } | ||
| if (kind == 2) { found_mss = 1; break; } | ||
| opt_len = buf[offset_tcp+i+1]; | ||
| /* Defensive: a valid multi-byte option has length >= 2 | ||
| * (kind + length bytes) and must fit within the header. | ||
| * Without this check, a length byte of 0 would cause an | ||
| * infinite loop, masking the very regression this test | ||
| * exists to detect. */ | ||
| if (opt_len < 2 || i + opt_len > hdr_len) { | ||
| bad_opt = 1; | ||
| break; | ||
| } | ||
| i += opt_len; | ||
| } | ||
| if (bad_opt) { | ||
| fprintf(stderr, "[-] tcp_create_packet: SYN has malformed TCP option\n"); | ||
| failures++; | ||
| } else if (!found_mss) { | ||
| fprintf(stderr, "[-] tcp_create_packet: SYN missing MSS option\n"); | ||
| failures++; |
There was a problem hiding this comment.
In this SYN-options parsing logic, hdr_len is derived from the TCP data-offset nibble but isn't validated against the actual returned packet length (len). If a regression produces an inconsistent data-offset, this loop can read past the end of buf and potentially crash the selftest. Please add a bound check like offset_tcp + hdr_len <= len (and that hdr_len >= 20) before iterating options, and treat failures as a test failure rather than risking OOB reads.
| if (len == 0 || ((buf[offset_tcp+12] & 0xF0) >> 4) <= 5) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing options\n"); | |
| failures++; | |
| } else { | |
| unsigned hdr_len = ((buf[offset_tcp+12] & 0xF0) >> 4) * 4; | |
| unsigned i, found_mss = 0, bad_opt = 0; | |
| for (i = 20; i + 1 < hdr_len; ) { | |
| unsigned kind = buf[offset_tcp+i]; | |
| unsigned opt_len; | |
| if (kind == 0) break; | |
| if (kind == 1) { i++; continue; } | |
| if (kind == 2) { found_mss = 1; break; } | |
| opt_len = buf[offset_tcp+i+1]; | |
| /* Defensive: a valid multi-byte option has length >= 2 | |
| * (kind + length bytes) and must fit within the header. | |
| * Without this check, a length byte of 0 would cause an | |
| * infinite loop, masking the very regression this test | |
| * exists to detect. */ | |
| if (opt_len < 2 || i + opt_len > hdr_len) { | |
| bad_opt = 1; | |
| break; | |
| } | |
| i += opt_len; | |
| } | |
| if (bad_opt) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN has malformed TCP option\n"); | |
| failures++; | |
| } else if (!found_mss) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing MSS option\n"); | |
| failures++; | |
| if (len == 0 || offset_tcp + 12 >= len) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing options\n"); | |
| failures++; | |
| } else { | |
| unsigned hdr_len = ((buf[offset_tcp+12] & 0xF0) >> 4) * 4; | |
| if (hdr_len <= 20) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing options\n"); | |
| failures++; | |
| } else if (hdr_len < 20 || offset_tcp + hdr_len > len) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN has inconsistent TCP header length\n"); | |
| failures++; | |
| } else { | |
| unsigned i, found_mss = 0, bad_opt = 0; | |
| for (i = 20; i + 1 < hdr_len; ) { | |
| unsigned kind = buf[offset_tcp+i]; | |
| unsigned opt_len; | |
| if (kind == 0) break; | |
| if (kind == 1) { i++; continue; } | |
| if (kind == 2) { found_mss = 1; break; } | |
| opt_len = buf[offset_tcp+i+1]; | |
| /* Defensive: a valid multi-byte option has length >= 2 | |
| * (kind + length bytes) and must fit within the header. | |
| * Without this check, a length byte of 0 would cause an | |
| * infinite loop, masking the very regression this test | |
| * exists to detect. */ | |
| if (opt_len < 2 || i + opt_len > hdr_len) { | |
| bad_opt = 1; | |
| break; | |
| } | |
| i += opt_len; | |
| } | |
| if (bad_opt) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN has malformed TCP option\n"); | |
| failures++; | |
| } else if (!found_mss) { | |
| fprintf(stderr, "[-] tcp_create_packet: SYN missing MSS option\n"); | |
| failures++; | |
| } |
|
|
||
| /* | ||
| * Verify that tcp_create_packet() carries the template's MSS option | ||
| * on SYN segments and strips it on RST/ACK/FIN/PSH-ACK, per |
There was a problem hiding this comment.
This comment says the test verifies stripping on "RST/ACK/FIN/PSH-ACK", but the block below doesn't currently include a FIN case. Either add a FIN (and/or FIN|ACK) assertion or update the comment so it accurately reflects what the selftest covers.
| * on SYN segments and strips it on RST/ACK/FIN/PSH-ACK, per | |
| * on SYN segments and strips it on RST/ACK/PSH-ACK, per |
| if (!is_syn) | ||
| px[offset_tcp+12] = (px[offset_tcp+12] & 0x0F) | 0x50; | ||
|
|
There was a problem hiding this comment.
The SYN-only TCP option stripping logic is implemented for both IPv4 and IPv6 here, but the added selftest block below only exercises the IPv4 path. Please add an analogous IPv6 test (using ip_them.version = 6/ip_me.version = 6 and tmplset->pkts[Proto_TCP].ipv6.offset_tcp) to ensure the IPv6 branch also rewrites the data-offset and payload placement correctly for non-SYN segments.
| if (!is_syn) | |
| px[offset_tcp+12] = (px[offset_tcp+12] & 0x0F) | 0x50; | |
| if (!is_syn) { | |
| px[offset_tcp+12] = (px[offset_tcp+12] & 0x0F) | 0x50; | |
| /* Keep the IPv6 non-SYN path checked for the same invariants as the | |
| * IPv4 path: a stripped TCP header must be 20 bytes long and any | |
| * application payload must begin immediately after that base header. */ | |
| assert((((unsigned)px[offset_tcp+12] & 0xF0) >> 2) == 20); | |
| assert(offset_app == offset_tcp + 20); | |
| if (payload_length) | |
| assert(memcmp(px + offset_app, payload, payload_length) == 0); | |
| } |
That's what is supposed to happen per RFCs. I discovered that randomly by looking at a capture file and seeing options highlighted by Wireshark as weird in a non-SYN packet.
Test on random hosts on the Internet show 0.5% more banners collected. That's not strong but at least this PR does not seem to break banner collection. I was not expecting an increase, this PR should limit the ability to fingerprint Masscan non-SYN packets.