Commit b47be44

Remove sniffing of HTML
No user agent today appears to identify a text/html resource starting with <rss as XML, so remove those rules from the standard. At the same time, make it clearer that XML (and now HTML) resources are never sniffed; that clarification is a non-normative change.

Tests: web-platform-tests/wpt#47002.

Closes #173.
1 parent 9fd7a60 commit b47be44
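
The step this commit adds at the top of the MIME type sniffing algorithm can be sketched in Python as below. This is only an illustrative sketch: the helper names (`essence`, `is_xml_mime_type`, `is_html_mime_type`, `early_exit`) are hypothetical, and the parameter stripping is a simplification rather than the standard's full MIME type parser.

```python
from typing import Optional


def essence(mime_type: str) -> str:
    """Return the lowercased type/subtype, dropping any parameters (simplified)."""
    return mime_type.split(";", 1)[0].strip().lower()


def is_xml_mime_type(mime_type: str) -> bool:
    """XML MIME type: subtype ends in +xml, or essence is text/xml or application/xml."""
    e = essence(mime_type)
    return e in ("text/xml", "application/xml") or e.endswith("+xml")


def is_html_mime_type(mime_type: str) -> bool:
    """HTML MIME type: essence is text/html."""
    return essence(mime_type) == "text/html"


def early_exit(supplied: Optional[str]) -> Optional[str]:
    """Return the computed MIME type if the new first step applies.

    Returns None when the step does not apply and the rest of the
    sniffing algorithm (not modelled here) would continue.
    """
    if supplied is not None and (
        is_xml_mime_type(supplied) or is_html_mime_type(supplied)
    ):
        return supplied  # XML and HTML are never sniffed
    return None
```

With this sketch, `early_exit("text/html")` returns the supplied type unchanged regardless of the resource's bytes, while `early_exit("text/plain")` returns `None` to signal that the later steps still run.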

1 file changed: +6 -287 lines changed

mimesniff.bs
@@ -1806,6 +1806,12 @@ algorithm</dfn>:
 user agents must use the following <dfn>MIME type sniffing algorithm</dfn>:
 
 <ol>
+<li>
+If the <a>supplied MIME type</a> is an <a>XML MIME type</a> or <a>HTML MIME type</a>, the
+<a>computed MIME type</a> is the <a>supplied MIME type</a>.
+
+Abort these steps.
+
 <li>
 If the <a>supplied MIME type</a> is undefined or if the
 <a>supplied MIME type</a>'s <a for="MIME type">essence</a> is
@@ -1826,17 +1832,6 @@ algorithm</dfn>:
 <a>rules for distinguishing if a resource is text or binary</a> and
 abort these steps.
 
-<li>
-If the <a>supplied MIME type</a> is an <a>XML MIME type</a>, the
-<a>computed MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If the <a>supplied MIME type</a>'s <a for="MIME type">essence</a> is "<code>text/html</code>",
-execute the <a>rules for distinguishing if a resource is a feed or HTML</a> and
-abort these steps.
-
 <li>
 If the <a>supplied MIME type</a> is an <a>image MIME type</a>
 <a>supported by the user agent</a>, let <var>matched-type</var> be
@@ -2264,9 +2259,6 @@ type</dfn>:
 
 </table>
 
-<p class=XXX>
-What about feeds?
-
 <li>
 <p>Execute the following steps for each row <var>row</var> in the following table:
 
@@ -2466,279 +2458,6 @@ type</dfn>:
 
 
 
-<h3 id=sniffing-a-mislabeled-feed>Sniffing a mislabeled feed</h3>
-
-<p>
-To determine whether a feed has been mislabeled as HTML, execute the
-following <dfn>rules for distinguishing if a resource is a feed or
-HTML</dfn>:
-
-<ol>
-<li>
-Let <var>sequence</var> be the <a>resource header</a>, where
-<var>sequence</var>[<var>s</var>] is <a>byte</a> <var>s</var> in
-<var>sequence</var> and <var>sequence</var>[0] is the first
-<a>byte</a> in <var>sequence</var>.
-
-<li>
-Let <var>length</var> be the number of <a>bytes</a> in
-<var>sequence</var>.
-
-<li>
-Initialize <var>s</var> to 0.
-
-<li>
-If <var>length</var> is greater than or equal to 3 and the three
-<a>bytes</a> from <var>sequence</var>[0] to
-<var>sequence</var>[2] are equal to 0xEF 0xBB 0xBF (UTF-8 BOM), increment
-<var>s</var> by 3.
-
-<li>
-While <var>s</var> is less than <var>length</var>, continuously loop
-through these steps:
-
-<ol>
-<li>
-Enter loop <var>L</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the <a>computed
-MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If <var>sequence</var>[<var>s</var>] is equal to 0x3C
-("<code>&lt;</code>"), increment <var>s</var> by 1 and exit loop
-<var>L</var>.
-
-<li>
-If <var>sequence</var>[<var>s</var>] is not a <a>whitespace
-byte</a>, the <a>computed MIME type</a> is the <a>supplied
-MIME type</a>.
-
-Abort these steps.
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-Enter loop <var>L</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the <a>computed
-MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 3 and
-the three <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 2] are equal to 0x21 0x2D 0x2D
-("<code>!--</code>"), increment <var>s</var> by 3 and enter loop
-<var>M</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the <a>computed
-MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 3 and
-the three <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 2] are equal to 0x2D 0x2D 0x3E
-("<code>--></code>"), increment <var>s</var> by 3 and exit
-loops <var>M</var> and <var>L</var>.
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 1 and
-<var>sequence</var>[<var>s</var>] is equal to 0x21
-("<code>!</code>"), increment <var>s</var> by 1 and enter loop
-<var>M</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the <a>computed
-MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 1 and
-<var>sequence</var>[<var>s</var>] is equal to 0x3E
-("<code>></code>"), increment <var>s</var> by 1 and exit loops
-<var>M</var> and <var>L</var>.
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 1 and
-<var>sequence</var>[<var>s</var>] is equal to 0x3F
-("<code>?</code>"), increment <var>s</var> by 1 and enter loop
-<var>M</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the <a>computed
-MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 2 and
-the two <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 1] are equal to 0x3F 0x3E
-("<code>?></code>"), increment <var>s</var> by 2 and exit loops
-<var>M</var> and <var>L</var>.
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 3 and
-the three <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 2] are equal to 0x72 0x73 0x73
-("<code>rss</code>"), the <a>computed MIME type</a> is
-"<code>application/rss+xml</code>".
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 4 and
-the four <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 3] are equal to 0x66 0x65 0x65 0x64
-("<code>feed</code>"), the <a>computed MIME type</a> is
-"<code>application/atom+xml</code>".
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 7 and
-the seven <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 6] are equal to 0x72 0x64 0x66 0x3A
-0x52 0x44 0x46 ("<code>rdf:RDF</code>"), increment <var>s</var>
-by 7 and enter loop <var>M</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the <a>computed
-MIME type</a> is the <a>supplied MIME type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 24
-and the twenty-four <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 23] are equal to 0x68 0x74 0x74
-0x70 0x3A 0x2F 0x2F 0x70 0x75 0x72 0x6C 0x2E 0x6F 0x72 0x67 0x2F 0x72
-0x73 0x73 0x2F 0x31 0x2E 0x30 0x2F
-("<code>http://purl.org/rss/1.0/</code>"), increment
-<var>s</var> by 24 and enter loop <var>N</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the
-<a>computed MIME type</a> is the <a>supplied MIME
-type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 43
-and the forty-three <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 42] are equal to 0x68 0x74 0x74
-0x70 0x3A 0x2F 0x2F 0x77 0x77 0x77 0x2E 0x77 0x33 0x2E 0x6F 0x72
-0x67 0x2F 0x31 0x39 0x39 0x39 0x2F 0x30 0x32 0x2F 0x32 0x32 0x2D
-0x72 0x64 0x66 0x2D 0x73 0x79 0x6E 0x74 0x61 0x78 0x2D 0x6E 0x73
-0x23
-("<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#</code>"),
-the <a>computed MIME type</a> is
-"<code>application/rss+xml</code>".
-
-Abort these steps.
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 24
-and the twenty-four <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 23] are equal to 0x68 0x74 0x74
-0x70 0x3A 0x2F 0x2F 0x77 0x77 0x77 0x2E 0x77 0x33 0x2E 0x6F 0x72 0x67
-0x2F 0x31 0x39 0x39 0x39 0x2F 0x30 0x32 0x2F 0x32 0x32 0x2D 0x72 0x64
-0x66 0x2D 0x73 0x79 0x6E 0x74 0x61 0x78 0x2D 0x6E 0x73 0x23
-("<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#</code>"),
-increment <var>s</var> by 24 and enter loop <var>N</var>:
-
-<ol>
-<li>
-If <var>sequence</var>[<var>s</var>] is undefined, the
-<a>computed MIME type</a> is the <a>supplied MIME
-type</a>.
-
-Abort these steps.
-
-<li>
-If <var>length</var> is greater than or equal to <var>s</var> + 43
-and the forty-three <a>bytes</a> from
-<var>sequence</var>[<var>s</var>] to
-<var>sequence</var>[<var>s</var> + 42] are equal to 0x68 0x74 0x74
-0x70 0x3A 0x2F 0x2F 0x70 0x75 0x72 0x6C 0x2E 0x6F 0x72 0x67 0x2F
-0x72 0x73 0x73 0x2F 0x31 0x2E 0x30 0x2F
-("<code>http://purl.org/rss/1.0/</code>"), the <a>computed
-MIME type</a> is "<code>application/rss+xml</code>".
-
-Abort these steps.
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-Increment <var>s</var> by 1.
-</ol>
-
-<li>
-The <a>computed MIME type</a> is the <a>supplied MIME
-type</a>.
-
-Abort these steps.
-</ol>
-</ol>
-
-<li>
-The <a>computed MIME type</a> is the <a>supplied MIME type</a>.
-</ol>
-
-<p class=note>
-It might be more efficient for the user agent to implement the <a>rules
-for distinguishing if a resource is a feed or HTML</a> in parallel with
-its algorithm for detecting the character encoding of an HTML document.
-
-
-
 <h2 id=context-specific-sniffing>Context-specific sniffing</h2>
 
<p class=XXX>
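
For readers who want a feel for what the large deleted hunk did, the removed "rules for distinguishing if a resource is a feed or HTML" can be approximated by the Python sketch below. It is a condensed illustration, not the deleted text verbatim: the function name `sniff_feed_or_html` is hypothetical, the nested loops are collapsed into `bytes.find` calls, and the `rdf:RDF` branch is simplified to "both namespace URLs occur somewhere after the tag name" rather than reproducing the original byte-by-byte scan.

```python
WHITESPACE = b"\t\n\x0c\r "  # whitespace bytes: HT, LF, FF, CR, SP
RSS_NS = b"http://purl.org/rss/1.0/"
RDF_NS = b"http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def sniff_feed_or_html(header: bytes, supplied: str = "text/html") -> str:
    """Approximate the removed feed-vs-HTML rules on a resource header."""
    # Skip a leading UTF-8 BOM, if present.
    s = 3 if header.startswith(b"\xef\xbb\xbf") else 0
    n = len(header)
    while s < n:
        # Skip whitespace; the next byte must be "<" or sniffing gives up.
        while s < n and header[s] in WHITESPACE:
            s += 1
        if s >= n or header[s] != 0x3C:
            return supplied
        s += 1
        if header.startswith(b"!--", s):  # comment: skip past "-->"
            end = header.find(b"-->", s + 3)
            if end == -1:
                return supplied
            s = end + 3
        elif header.startswith(b"!", s):  # markup declaration: skip past ">"
            end = header.find(b">", s + 1)
            if end == -1:
                return supplied
            s = end + 1
        elif header.startswith(b"?", s):  # processing instruction: skip past "?>"
            end = header.find(b"?>", s + 1)
            if end == -1:
                return supplied
            s = end + 2
        elif header.startswith(b"rss", s):
            return "application/rss+xml"
        elif header.startswith(b"feed", s):
            return "application/atom+xml"
        elif header.startswith(b"rdf:RDF", s):
            # Simplification: require both namespace URLs after the tag name.
            tail = header[s + 7:]
            if RSS_NS in tail and RDF_NS in tail:
                return "application/rss+xml"
            return supplied
        else:
            return supplied  # any other tag: keep the supplied type
    return supplied
```

After this commit none of this runs for `text/html` responses: a resource labeled `text/html` keeps that type even if its first tag is `<rss` or `<feed`.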
