mdaniel 2 hours ago

> The Netlog looks scary because it not only contains the traffic while I reproduced the bug but also 1) all traffic from the Chrome plugins and 2) many websites that I have browsed before but haven’t visited during the recording

Isn't that a fantastic use for a 2nd Chrome Profile or even just downloading a Chromium build[1] and using that, showing the behavior in a bleeding edge build?

1: https://download-chromium.appspot.com/

  • nightpool 2 hours ago

    I was also pretty surprised when the OP said "the Chromium team refused to use my server to reproduce the bug", when the actual comments of the ticket were "clone this repo and run my giant node app" and the tester's response was "It seems a bit difficult to set up an build environment to run the static server, could you provide a more minimal repro case?". OP's description of the tester's reasonable concerns seems very unfair.

    Even just having a web-accessible endpoint that reproduced the issue would have made the process a lot smoother, I think. Apparently, in response to the tester's request for an easier test case, OP asked for GCP cloud credits(?) to host their server. You probably used more bandwidth & CPU loading the new Chromium issue tracker page than you would have used just setting up a simple VPS to reproduce the issue.

Something1234 3 hours ago

This is super weird and needs a bit of editing but it seems like an actual bug. Shouldn’t a 403 invalidate whatever was cached?

As in it should bubble the error up to the user.

  • YetAnotherNick 2 hours ago

    I think merging two requests opens up a whole can of worms. 200+403 merged translates to 206? There is also Content-Length merging. I wonder what the rest of the headers would translate to. If I respond with a header saying that the stream is EOF in the second call, would that be preserved?

  • ajross 3 hours ago

    Should it? You can return a partial result for the request, there's no reason it couldn't be a subset of a previous partial request. Why is the browser required to make a network request at all when it can serve a valid (but incomplete) response out of the cache? There's space for argument for what the "best" way to handle this is, but I have a hard time seeing a valid response as "incorrect" or a "bug".

    Honestly, this genre of "big tech company refused to fix my very obscure edge case and that confirms all my priors about them" post is getting a little tiresome. There are like three of them coming through the front page every day.

    • mananaysiempre 3 hours ago

      Whether it should or not depends on whether you understand a 403 as a refusal to let you do the given method against the given resource at all, or as a refusal to do this one specific request. The HTTP spec (as I’ve just learned) does support the narrower interpretation if the server wishes it: the description for 403 is just that “[t]he server understood the request, but is refusing to fulfill it”, with no implications regarding other requests for this resource.

      • ajross 2 hours ago

        Again, it's a range request though. What if the browser simply didn't send a network request at all and just synchronously returned the partial result from the cache. You agree that would be correct (if arguably not very useful), right? The point is that the 403 isn't required to be seen, at all. You can't require the browser return a value that the browser doesn't know about.

        It's a cache consistency bug at its root. The value was there, and now it's not. The reporter says "the browser is responsible for cache coherency" (call this the "MESI camp"). The Chrome folks say "the app is responsible for cache coherency" (the "unsnooped incoherent" gang). Neither is wrong. And the problem remains obscure regardless.

        • aoli-al 2 hours ago

          I'm the author of the post.

          I'm not sure Chrome's current caching behavior is helpful because the second response does not indicate which part of the data is returned. So, the application has no choice but to discard the data.

          But thank you for your comments. This helped me to crystalize why I think this is a bug.

          • mananaysiempre 2 hours ago

            Yeah, if there's no way to tell from the response which range has actually been returned, that seems like a deal-breaker. The spec’s allowance for a partial response is explicitly motivated by the response being self-describing, and if after Chrome’s creative reinterpretation it is not, then it’s not clear what the client could even do.

            • ajross an hour ago

              There's no clear way to define "correct" in this case regardless. The whole premise behind a range request is that the data is immutable (because otherwise it wouldn't make sense to be able to fetch it piecewise), and it's mutating here by disappearing! What are you supposed to do, really? The answer is always going to be app-dependent, the browser can't get it right because the server is being obtuse and confusing.

              When we handle this in the hardware world it's via algorithms that know about the mutability of the cached data and operate on top of primitives like "flush" and "invalidate" that can restore the inconsistent memory system to a known state. HTTP didn't spec that stuff, but the closest analog is "fetch it again", which is exactly what the suggested workaround is in the bug.

    • ForTheKidz an hour ago

      > Honestly, this genre of "big tech company refused to fix my very obscure edge case and that confirms all my priors about them" post is getting a little tiresome.

      Ahh, let's just wait for the startup to fix it then.

nemothekid 3 hours ago

I'm assuming that the OP is using a signed request and the fact that Chrome rewrites the request is what is causing the 403.

I'm interested in what kind of application depends on this behavior - if an application gets partial data from the server, especially one that doesn't match the content-length header, that should always be an error to me.
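
A minimal sketch of the check I have in mind, in Python for brevity. The names `TruncatedBodyError` and `check_body_length` are made up for illustration, and the headers are assumed to arrive as a plain dict of strings rather than through any particular HTTP library:

```python
from typing import Mapping


class TruncatedBodyError(Exception):
    """Raised when the body is not the size the headers promised."""


def check_body_length(headers: Mapping[str, str], body: bytes) -> None:
    """Raise if a declared Content-Length disagrees with the bytes received."""
    # Header names are case-insensitive, so normalise before looking one up.
    lowered = {name.lower(): value for name, value in headers.items()}
    declared = lowered.get("content-length")
    if declared is None:
        return  # Nothing was declared, so there is nothing to compare against.
    if int(declared) != len(body):
        raise TruncatedBodyError(
            f"declared {declared} bytes, received {len(body)}"
        )
```

Calling `check_body_length({"Content-Length": "10"}, b"abc")` raises, which is the behavior I would expect an application to want by default.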

  • gwd 22 minutes ago

    Some authentication schemes have a short-lived "authorization token", that is valid for like 5 minutes, and a longer-lived "refresh token", which is valid for like a week or two; and return a 403 when the authorization token expires to prompt the client to refresh the token. (See say, supertokens.com .) If your auth token expired in the middle of a multi-part download, it could cause this situation that triggered this bug.

mnot an hour ago

Chrome's cache is indeed acting correctly. Effectively, it is acting as an intermediary here - your application made a partial content request, and it can satisfy it (partially), so it sends you a 206.

HTTP partial content responses need to be evaluated (like any other response) according to their metadata: servers are not required to send you exactly the ranges you request, so you need to pay attention to Content-Range and process accordingly (potentially issuing more requests).

See: https://httpwg.org/specs/rfc9110.html#status.206
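
To make "process accordingly" concrete, here is a rough sketch in Python of reading Content-Range and working out what is still missing. The helper names `parse_content_range` and `next_range` are hypothetical, and this only handles the single-range `bytes first-last/length` form, not multipart responses or the `*/length` form used with 416:

```python
import re
from typing import Optional, Tuple

# Matches values such as "bytes 0-99/200" or "bytes 0-99/*".
CONTENT_RANGE_RE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def parse_content_range(value: str) -> Tuple[int, int, Optional[int]]:
    """Return (first, last, complete_length); the length is None for '*'."""
    match = CONTENT_RANGE_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError(f"invalid Content-Range: {value!r}")
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


def next_range(requested_last: int, content_range: str) -> Optional[str]:
    """Range header value for the bytes still missing, or None if complete."""
    _, last, _ = parse_content_range(content_range)
    if last >= requested_last:
        return None
    return f"bytes={last + 1}-{requested_last}"
```

With a request that ended at byte 1943507 and a response that honestly declared `bytes 4-138724/*`, `next_range` would produce `bytes=138725-1943507` as the follow-up request. This only works if the Content-Range value describes the bytes actually delivered.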

  • nightpool an hour ago

    But the Content-Range header and the Content-Length header both indicated the "expected" number of bytes, i.e. the number of bytes that would have been returned if the server had given a 206 or a 200, not the truncated number of bytes that the response actually contained. Is that expected?

    The latest response from the Chromium team (https://issues.chromium.org/issues/390229583#comment20) seems to take a different approach from your comment, and says that you should think of it as a streaming response where the connection failed partway through, which feels reasonable to me, except for the fact that `await`ing the response doesn't seem to trigger any errors: https://issues.chromium.org/issues/390229583#comment21

  • Ajedi32 an hour ago

    Shouldn't the response header returned by Chrome say "4-138724" then though, and not "4-1943507"? The synthesized response body doesn't include bytes "138725-1943507".

    • mnot an hour ago

      Ah - I need to remember to coffee before posting in the AM.

      Yes, the mismatch between the response headers and the content is a problem. Unfortunately, IME browsers often do "fix ups" of headers that make them less than reliable, this might be one of them -- it's effectively rewriting the response but failing to update all of the metadata.

      The bug summary says "Chrome returns wrong status code while using range header with caches." That's indeed not a bug. I think the most concerning thing here is that the Content-Range header is obviously incorrect, so Chrome should either be updating it or producing a clear error to alert you -- which it looks like the Chrome dev acknowledges when they say "it is probably a bug that there is no AbortError exception on the read".

      I might try to add some tests for this to https://cache-tests.fyi/#partial

omoikane 3 hours ago

[flagged]

  • mananaysiempre 3 hours ago

    The actual issue is more involved: Chrome wrongly returns 206 Partial Content upon a range request when it already had part of the range in its cache (due to a previous request that correctly returned a 206) and Chrome's attempt to get the remaining part yielded a 403. The Chrome team suggests, citing unspecified compatibility concerns, for the client code to notice that (from its point of view) either Chrome or the web server violated the range request contract by returning less than what was requested and to then rerequest the remaining part as a separate range, which will, finally, return a 403. That makes no sense to me, but what do I know.

    ETA: I’m wrong here—turns out the range request contract explicitly allows this: “a server might want to send only a subset of the data requested for reasons of its own” <https://www.rfc-editor.org/rfc/rfc9110.html#name-206-partial...>.

    • ayende 3 hours ago

      Chrome gives you what data it has, and you are expected to issue the next request to get the rest of the data.

      Consider a read() call in Linux: if you ask to read 16 KB and the cache has a 4 KB page ready, it may give you just that.

      You'll need another call to get the rest, and if there is a bad disk sector, that first read() may not notice it.

      • mananaysiempre 3 hours ago

        The spec for HTTP GET is of course in no way similar to the spec for read(). On the other hand, I have to concede that (as I’ve just learned) an HTTP server is actually within its rights[1] to return only part of the requested range(s) and expect the client to redo the request if it needs the rest:

        > A server that supports range requests (Section 14) will usually attempt to satisfy all of the requested ranges, since sending less data will likely result in another client request for the remainder. However, a server might want to send only a subset of the data requested for reasons of its own, such as temporary unavailability, cache efficiency, load balancing, etc. Since a 206 response is self-descriptive, the client can still understand a response that only partially satisfies its range request.

        [1] https://www.rfc-editor.org/rfc/rfc9110.html#name-206-partial...

        • lelandbatey 2 hours ago

          The only thing I'd note is that the spec seems to be pretty clear about the content-length response header needing to match how many bytes are actually in the response, and the 206 from Chrome is not returning a number of bytes matching the content-length header. Spec:

          > A Content-Length header field present in a 206 response indicates the number of octets in the content of this message, which is usually not the complete length of the selected representation.

          In the article (and in the mailing group discussion), however, Chrome appears to respond with a `content-length` of 1943504 while the body of the response only contains 138721 octets. Unless there's some even more obscure part of the spec, that definitely seems like a bug, as it makes detecting the need to re-request more annoying.

      • mlhpdx 3 hours ago

        I’m not sure I follow. The second request is for an overlapping range and can’t be satisfied because of the 403. There’s just no way around that.

    • mlhpdx 3 hours ago

      Yep, reproduced. Caching partial content is a wonderful feature and I’m glad they worked on it. But they clearly shipped some bugs.

      Prototypical “bad” technical debt, perhaps.

      • dgoldstein0 2 hours ago

        Having taken a minute to understand this, I agree with the Chrome team - it's not a bug, but rather an unfortunate complexity of interacting with caching and a server. It's definitely a choice on their part to issue a request different than the one you asked for by partially satisfying it from cache and asking the server for the rest. But it's not an unreasonable optimization. At which point, when their shorter range request fails, they have no way to message the 403 back to you and also give back the partial content. So they chose to pass back the partial content.

        • black3r 2 hours ago

          I think it's a bit more complicated. Returning partial content seems to be a valid response according to the spec. But the Content-Range and Content-Length headers shown in Chrome DevTools are clearly wrong, they should indicate what is actually returned. So I still think there's a small bug in how Chrome handles this, but solving this bug wouldn't help the author, as it's still the responsibility of OpenDAL to fix handling partial content on their side.

      • Scaevolus 3 hours ago

        This seems like a better design than returning no data and a 403, and they wouldn't be the first people to run into it and have to properly handle a partial range response.

  • Etheryte 3 hours ago

    If you'll take the moment to read the article past the headline you'll see there's more to it than that. Titles can't fit the whole article.