Expose Content-Encoding to ResourceTiming #381
Comments
Similar to Content-Type, this probably needs to be restricted to content that is CORS-enabled or same-origin. That's not a limitation for the dictionary compression encodings (as they have similar restrictions). This might be a limitation for non-dictionary zstd/brotli. |
Discussed on the Feb 29, 2024 W3C WebPerf call: Summary:
|
Filed chromium side bug. https://crbug.com/327941462 +CC: @Jxck who is interested in this. |
Thanks @horo-t, I'm happy to work on it! |
add `contentEncoding` to Resource Timing. closed w3c#381.
Updated explainer can be found here: |
I plan to filter the contentEncoding value before exposing it in Resource Timing (but keep the unfiltered value in the response header for other uses). How exactly should the value be filtered? Continuing the discussion in whatwg/fetch#1742. Below are the possible values that are registered. I am putting them in the following categories:
-- definitely allowed (confirmed that Chromium supports in WPT tests)
-- maybe allowed?
-- What about these? Maybe not allow? Why?
exi
-- The below are deprecated. Since we are building a new feature, should we just disallow them (forcing the app to change to the up-to-date name)? Or should we be forgiving and transfer them to compress and gzip?
x-compress Deprecated (alias for compress)
-- What about these? They are not registered contentEncoding values, should we disallow them?
++++++++++++++++ |
I'd limit it to formats a given browser supports natively (not sure if
"exi" makes that list for Chrome) and "unknown" for anything else, even if
it is registered. That way the browser can be responsible for updating
their enum as formats change.
br-d and zstd-d were replaced by dcb and dcz and should never be reported.
…On Fri, Dec 6, 2024 at 4:37 PM Guohui Deng ***@***.***> wrote:
I plan to filter the contentEncoding value before exposing it in resource
timing. (But keep the unfiltered value in the response header for other
uses), but how exactly the value should be filtered?
Continuing the discussion in whatwg/fetch#1742.
Below are the possible values that are registered. I am putting in the
following categories:
-- definitely allowed
br
compress
dcb
dcz
deflate
exi
gzip
zstd
-- What about these? Maybe not allow? why?
identity
pack200-gzip
aes128gcm
-- The below are deprecated. Since we are building a new feature, should we
just disallow them (forcing the app to change to the up-to-date name)? Or
should we be forgiving and transfer them to compress and gzip?
x-compress Deprecated (alias for compress)
x-gzip Deprecated (alias for gzip)
-- What about these? They are not registered contentEncoding values,
should we disallow them?
br-d
zstd-d
++++++++++++++++
A disallowed value will be exposed as unknown
++++++++++++++++
|
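The suggestion above (report only formats the browser supports natively, and "unknown" for everything else) can be sketched roughly as follows. This is a minimal illustration, not the filter any browser ships: the `SUPPORTED_ENCODINGS` set and the `reportSingleCoding` name are hypothetical, and the allowlist contents are an assumption, since the final list is not settled in this part of the thread.

```typescript
// Hypothetical allowlist; an assumption for illustration only.
// "br-d" and "zstd-d" are deliberately absent (replaced by "dcb" and "dcz").
const SUPPORTED_ENCODINGS: ReadonlySet<string> = new Set([
  "br",
  "dcb",
  "dcz",
  "deflate",
  "gzip",
  "zstd",
]);

// Map one content-coding token to the value that would be reported.
// Content codings are case-insensitive, so the token is lowercased first.
function reportSingleCoding(token: string): string {
  const coding = token.trim().toLowerCase();
  return SUPPORTED_ENCODINGS.has(coding) ? coding : "unknown";
}
```

With this shape, updating the set is the only change needed when a browser adds or drops a format, which matches the point that the browser owns its enum.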
Based on what Patrick said above and some digging, I think the following could be filtered out:
And the following will be supported: (last updated 2024/12/11) |
I think we must also specify what would be the value in case there is no content encoding (i.e. uncompressed payload). Seems easiest to just not set the value in that case so that when the property is read it will be JS My suggestion is for a specific string value in that case: Anyone has a preference or other suggestions? |
"identity" seems appropriate IMO. |
Thanks for pointing out the meaning of the "identity" value, Yoav. I missed it. :) And I think I could mention the "identity" value too in the |
Do we need to decipher between "identity" and "no content-encoding header" (an empty string in |
Noam pointed out this spec: "Note that the coding named "identity" is reserved for its special role in Accept-Encoding and thus SHOULD NOT be included." If we cannot report So it appears to me that it's desirable that:
==> |
What does "The server didn't update to the new fetch standard" mean? Fetch is a client-side standard that works on top of HTTP. If the server didn't send |
I'm working on other things related to HTTP content-encoding. The feedback I've received is to NOT use RFC 9110 reflects the reality of use in the wild, that implementations SHOULD NOT include "identity" in Content-Encoding but they might. Therefore, my suggestion would be to use a label like |
I believe empty-string is consistent with other no-value cases in the resource timing spec (e.g. |
Thanks for the input! So, the reality is that most of the servers just send empty string, instead of a literal string Let me list the possible situation here and could you guys please verify:
{gzip, apple, identity, pear } will be filtered to or we filter it to |
We don't want this to become a little side-channel where servers can encode information by sending multiple encodings over a 1-byte So I think it should be:
Examples: |
Adding to what Noam said above:
""
Below is the conversion from the raw string sent by the server to the string reported in resourceTiming: ", gzip" ==> "gzip" 3 What if the server sends duplicated values? "gzip, Gzip" => "gzip"? |
To be more concrete, these are the values used in this header in November 2024 in more than 100 responses (HTTP archive, mobile, out of ~1.5M):
Seems like using multiple values is very rare in practice. Perhaps supplying just |
I can't think of a valid use case for multiple values. I actually now wonder what the browser does with e.g. Ideally, we'd report the value that was actually respected by the browser. Is that the first? |
Doesn't it mean - compress with |
You're right! TIL I still don't think it's a valid use case. A single value of "multiple" is sufficient. |
That would be much easier to define for now. On the other hand, I could speculate that in the future "two compressions" could become more common: a very specific compression used together with a general compression technique (maybe some dictionary compression followed by gzip or br) could be beneficial. |
I think even that would be an extreme edge case. For example, the current dictionary compressions are built directly into brotli and zstd (using I'd prefer to keep it simple and just use |
Thanks, I think I got the principle. And I think this "multiple" rule can take the highest precedence? The examples below are all to be converted to "multiple" (because this is the simplest, and the very unlikely cases don't matter much). |
Agreed. The comma means there are multiple passes at encoding, even if all of the passes are using the same encoding. |
If in some unforeseeable future we'd find ourselves with a genuine case where it makes sense to apply multiple encodings, we can always decide then to add a value that represents those multiple encodings. I don't think that deciding to go with "multiple" now would have negative future-compat implications. |
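Putting the points agreed in this thread together, the filtering could look roughly like the sketch below. It is a hedged illustration under stated assumptions, not the Chromium implementation: `KNOWN_CODINGS` and `reportedContentEncoding` are hypothetical names, the allowlist contents are assumed, and treating every comma (including a stray leading one such as ", gzip") as "multiple" is one possible reading of the "highest precedence" rule. "identity" is simply left off the allowlist here, so it falls through to "unknown"; the thread excerpt does not settle that mapping.

```typescript
// Hypothetical allowlist; an assumption for illustration only.
const KNOWN_CODINGS: ReadonlySet<string> = new Set([
  "br",
  "dcb",
  "dcz",
  "deflate",
  "gzip",
  "zstd",
]);

// rawHeader is the Content-Encoding field value, or null when the header is absent.
function reportedContentEncoding(rawHeader: string | null): string {
  // No header (or only whitespace): report the empty string.
  if (rawHeader === null || rawHeader.trim() === "") {
    return "";
  }
  // Assumption: any comma means more than one encoding pass, even when the
  // passes repeat the same coding, so this rule is checked before the allowlist.
  if (rawHeader.includes(",")) {
    return "multiple";
  }
  // Single coding: normalize case and whitespace, then apply the allowlist.
  const coding = rawHeader.trim().toLowerCase();
  return KNOWN_CODINGS.has(coding) ? coding : "unknown";
}
```

Because the output is always one of a small fixed set of strings, a server cannot use unusual coding combinations to pass extra information through this property, which is the side-channel concern raised earlier in the thread.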
The value from the http header can be of multiple codings, or not properly formatted. Per discussion on w3c/resource-timing#381, multiple codings should be transformed to "multiple"; "identity" is not allowed in response header; and the coding value should be formatted if it's not in http header. Bug: 327941462 Change-Id: I9048423c5ad562d8001562324cb35f72ef8ac5da
The value from the http header can be of multiple codings, or not properly formatted. Per discussion on w3c/resource-timing#381, multiple codings should be transformed to "multiple"; "identity" is not allowed in response header; and the coding value should be formatted if it's not in http header. Bug: 327941462 Change-Id: I9048423c5ad562d8001562324cb35f72ef8ac5da Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6215331 Reviewed-by: Noam Rosenthal <[email protected]> Reviewed-by: Yoav Weiss (@Shopify) <[email protected]> Commit-Queue: Guohui Deng <[email protected]> Cr-Commit-Position: refs/heads/main@{#1415037}
… ResourceTiming, a=testonly Automatic update from web-platform-tests Revise filtering for Content-Encoding in ResourceTiming The value from the http header can be of multiple codings, or not properly formatted. Per discussion on w3c/resource-timing#381, multiple codings should be transformed to "multiple"; "identity" is not allowed in response header; and the coding value should be formatted if it's not in http header. Bug: 327941462 Change-Id: I9048423c5ad562d8001562324cb35f72ef8ac5da Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6215331 Reviewed-by: Noam Rosenthal <[email protected]> Reviewed-by: Yoav Weiss (@Shopify) <[email protected]> Commit-Queue: Guohui Deng <[email protected]> Cr-Commit-Position: refs/heads/main@{#1415037} -- wpt-commits: 1691f567df269bca146cb8d25c43d644b2b42c63 wpt-pr: 50456
Hi,

Similar to request #203 to expose `Content-Type`, I would like to request exposing the `Content-Encoding` of each resource to ResourceTiming.

As we're starting to see experimentation and deployments of new content encodings such as Zstandard (`zstd`) and compression dictionary transports (`zstd-d` and `br-d`), we are moving toward content being delivered from a large set of possible encodings, even to the same client on different page loads or (sub)requests to the same domain.

When the content encoded was a small set: (none), gzip and brotli, one could often infer the encoding depending on the encoded/decoded body sizes, though that generally only works if one "owns" the content (has visibility into what the size would be for each encoding type).

Having an explicit `.contentEncoding` would help with some use-cases I can think of:

CC @pmeenan @horo-t
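As a rough illustration of how an explicit `.contentEncoding` could be consumed, the sketch below tallies resource entries by their reported encoding. The `ResourceEntryLike` interface and the `countByEncoding` function are hypothetical names for this example; the property is treated as optional on the assumption that not every browser exposes it.

```typescript
// Minimal shape of the entries this sketch needs (hypothetical helper type).
interface ResourceEntryLike {
  name: string;
  contentEncoding?: string;
}

// Count how many resources were delivered with each reported encoding.
function countByEncoding(
  entries: readonly ResourceEntryLike[],
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    // Entries without the property are grouped under a placeholder label.
    const key = entry.contentEncoding ?? "(not exposed)";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

In a page, the input would typically come from `performance.getEntriesByType("resource")`; the function is kept free of browser globals so it can also be run on recorded data.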