-
Notifications
You must be signed in to change notification settings - Fork 16
Optionally enable caching of any request method #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is useful for example when using a HTTPlug based SoapClient. php-http/client-common#26
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my first reaction was "wtf, why?", then i read soap and think thats a reasonable usecase. and after some more thinking, i actually feel we should allow arbitrary methods to be cached if we allow to configure that at all.
src/CachePlugin.php
Outdated
$resolver->setAllowedValues('hash_algo', hash_algos()); | ||
$resolver->setAllowedValues('methods', function ($value) { | ||
return 0 === count(array_diff($value, ['GET', 'HEAD', 'POST'])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think once we allow to configure, we should allow any methods. we need a big warning sign for this anyways, as people should know what they do before caching post and other potentially modifying requests... for example, it could make sense to cache a PROPFIND request (webdav, its a non-modifying request)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Question is since RFC2616 says valid request method can be pretty much any string (not sure if A..Z or alphanumeric) should any request method be allowed or just the common methods defined in RFC2616-sec9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd not do any validation apart from requiring an array of strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found the spec. RFC2616 says request method should be a token and RFC7230 defines token to be any visible USASCII
character except delimiters "(),/:;<=>?@[\]{}
.
src/CachePlugin.php
Outdated
@@ -225,6 +224,10 @@ private function getCacheControlDirective(ResponseInterface $response, $name) | |||
*/ | |||
private function createCacheKey(RequestInterface $request) | |||
{ | |||
if ('POST' === $request->getMethod()) { | |||
return hash($this->config['hash_algo'], $request->getMethod().' '.$request->getUri().' '.$request->getBody()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't like that we specifically do this for POST (as i want to support configuring all request methods as cacheable). i think using the body in hash should not be an issue - its usually just empty. technically, GET requests can have a body as well, for example with elasticsearch. so its actually an improvement to also use the body for the hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree on this one too. However some might consider changing the way how keys are generated a BC break. See discussion in #12. Although if I read it correctly the consensus in the end was this would not be considered a BC break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would not consider that a BC break. chances are you replace the cache on a new deployment anyways, and even if not, its just a cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what's the verdict? Do we change this to allow any methods?
btw, caching proxies like varnish usually do a GET when seeing a HEAD request they have no cache for. then they store the response and return just the headers to honor the HEAD request. but i don't think we want to do that here. if people will do a GET anyways, they should not do a HEAD request first. they might be implementing their own cache, but then they should probably not use the cache-plugin. |
spec/CachePluginSpec.php
Outdated
@@ -97,6 +97,49 @@ function it_doesnt_store_post_requests(CacheItemPoolInterface $pool, CacheItemIn | |||
$this->handleRequest($request, $next, function () {}); | |||
} | |||
|
|||
function it_stores_post_requests_when_allowed(CacheItemPoolInterface $pool, CacheItemInterface $item, RequestInterface $request, ResponseInterface $response, StreamFactory $streamFactory, StreamInterface $stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please format this according to PSR-1/2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some other PSR-2 related problems too so I made a separate PR #25.
spec/CachePluginSpec.php
Outdated
|
||
$response->getStatusCode()->willReturn(200); | ||
$response->getBody()->willReturn($stream); | ||
$response->getHeader('Cache-Control')->willReturn(array())->shouldBeCalled(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use short array syntax (StyleCI is turned off for spec files because of the snake_case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was following the CS of the rest of the file. Long array()
syntax is used elsewhere the spec file.
spec/CachePluginSpec.php
Outdated
$response->getHeader('Expires')->willReturn(array())->shouldBeCalled(); | ||
$response->getHeader('ETag')->willReturn(array())->shouldBeCalled(); | ||
|
||
$pool->getItem('e4311a9af932c603b400a54efab21b6d7dea7a90')->shouldBeCalled()->willReturn($item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please split this too
src/CachePlugin.php
Outdated
@@ -225,6 +224,10 @@ private function getCacheControlDirective(ResponseInterface $response, $name) | |||
*/ | |||
private function createCacheKey(RequestInterface $request) | |||
{ | |||
if ('POST' === $request->getMethod()) { | |||
return hash($this->config['hash_algo'], $request->getMethod().' '.$request->getUri().' '.$request->getBody()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what's the verdict? Do we change this to allow any methods?
Yeah, this one was a shocker for me at first too. 😄 For the record: we should probably rethink our BDD(-misuse) strategy and rather use PHPUnit or something. Tests are almost unreadable with all the PSR-7 mocks. |
Request method can contain any visible USASCII character except delimiters "(),/:;<=>?@[\]{} https://tools.ietf.org/html/rfc7230#section-3.1.1 https://tools.ietf.org/html/rfc7230#section-3.2.6
Updated the PR. It is now possible to enable caching for any arbitrary request method which conforms RFC7230. Also request body is always used when creating the cache key. |
src/CachePlugin.php
Outdated
$resolver->setAllowedValues('hash_algo', hash_algos()); | ||
$resolver->setAllowedValues('methods', function ($value) { | ||
$matches = preg_grep('/[^[:alnum:]!#$%&\'*\/+\-.^_`|~]/', $value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please add a comment here with the RFC number that defines this regular expression? just for future reference and to make it clear we did not come up with a random expression here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am horrible with regular expressions. is this forcing upper case? we should either force that or uppercase ourselves when processing the options. PSR-7 forces method to upper case. when you specify something lower case, it would never match.
the doc should explain that if you specify methods, you also need to specify GET if you still want those cached. and that you should do specific clients with different setup if you do soap and other stuff, to be sure only the right requests get cached...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not force uppercase. [:alnum:]
is alias for [0-9A-Za-z]
. RFC2616 says method is case-insensitive so technically both get
and GET
should be valid but just different methods.
I have no strong feelings whether we should force uppercase or not. I think most of the time people probably use uppercase methods.
The regexp basically says is request method has any other characters than these the method is invalid.
cool, i like this. commits need to be squashed and we should have a documentation pull request (no big thing, just mention the feature in the right place). then its ready to merge imo |
Documentation added. GitHub now has squash before merge feature which I think is better than doing squash + force push myself. |
PSR-7 says:
but for example guzzle psr-7 says:
i am surprised by this, but feel like then we should not force upper case but just make sure all our examples use all caps and maybe mention that the option is case sensitive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please change the CachePlugin constructor phpdoc to mention the new option for method
(and say that its case sensitive)
src/CachePlugin.php
Outdated
@@ -45,6 +45,7 @@ | |||
* @var int $cache_lifetime (seconds) To support serving a previous stale response when the server answers 304 | |||
* we have to store the cache for a longer time than the server originally says it is valid for. | |||
* We store a cache item for $cache_lifetime + max age of the response. | |||
* @var array $methods case insensitive list of request methods which can be cached. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its case sensitive, not insensitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes of course. Brain malfunction...
looks good to me. what do the others think? can we merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR. Thanks. I had some questions.
src/CachePlugin.php
Outdated
@@ -225,7 +225,7 @@ private function getCacheControlDirective(ResponseInterface $response, $name) | |||
*/ | |||
private function createCacheKey(RequestInterface $request) | |||
{ | |||
return hash($this->config['hash_algo'], $request->getMethod().' '.$request->getUri()); | |||
return hash($this->config['hash_algo'], $request->getMethod().' '.$request->getUri().$request->getBody()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the body is super large? Do we care?
Also, I suggest a space (or any other non-common delimiter) between uri and body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without space this is backwards compatible with previous versions if body is empty. Adding a space there will invalidate old caches, but it is of course ok with me if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. I did not think about that. How about doing something like:
$body = (string) $request->getBody()
if (!empty($body)) {
$body = ' ' . $body;
}
return hash($this->config['hash_algo'], $request->getMethod().' '.$request->getUri().$body);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we do not need to worry about keeping caches valid over deployments. but nyholms idea is good. as GET usually has empty body, it happens to keep old caches.
regarding big body: this method is only executed on methods that should be cached, right? so body will be usually empty, except for things like soap which hopefully won't be big file uploads. you would not want to cache a file upload, anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GET and HEAD has (EDIT: generally) no bodies. If you change the default behavior, then it is on you if you get performance issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add it. Why is the space needed BTW?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GET can have body, for example when talking to elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make a difference between these two requests
POST /foo
Host: www.example.com
bar
POST /foob
Host: www.example.com
ar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we could unconditionally add the space. It is only cache so it is not that big deal if old cache gets invalidated when plugin is updated. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, sorry for the late response.
I would prefer not invalidating cache when we can.. Im using Google Translate paid services and I know I will have to pay 100USD if my cache gets invalidated.
Technically, this is okey. But we could be more nice to our users. =)
src/CachePlugin.php
Outdated
@@ -45,6 +45,7 @@ | |||
* @var int $cache_lifetime (seconds) To support serving a previous stale response when the server answers 304 | |||
* we have to store the cache for a longer time than the server originally says it is valid for. | |||
* We store a cache item for $cache_lifetime + max age of the response. | |||
* @var array $methods case sensitive list of request methods which can be cached. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm.. Yes, they are case sensitive but lower case will always fail..
@var array $methods list of request methods which can be cached. Methods should always be upper case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when you use the guzzle psr implementation, the method is forced to uppercase. but i pasted the fragment from psr-7 somewhere here that says that the getMethod of the request SHOULD preserve case. so depending on implementations, you could need lower case headers. i think its correct to allow lower case and not transform case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think its correct to allow lower case and not transform case.
But that would lead to unexpected behavior. I think we should do a case insensitive comparison or force input to be uppercase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache plugin also converts the request method to uppercase. While this is not technically correct it is probably ok for 99% of the use cases. I think the question is should we follow RFC or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, missed that. ok, that would be a BC break if we stop doing this...
whatever, lets uppercase the method and force configuration to use uppercase. (thats better than magically uppercasing and confusing people that expected some specific behaviour). i think this will be correct in 99.999999% of the cases. would be surprised if anybody ever wants GET
to be cached but not get
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
src/CachePlugin.php
Outdated
$resolver->setAllowedValues('hash_algo', hash_algos()); | ||
$resolver->setAllowedValues('methods', function ($value) { | ||
/* Any VCHAR, except delimiters. RFC7230 sections 3.1.1 and 3.2.6 */ | ||
$matches = preg_grep('/[^[:alnum:]!#$%&\'*\/+\-.^_`|~]/', $value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really complicated. Why not use [A-Z]+
. That will also make sure you use upper values. Or am I too specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I have misunderstood the specs request method can also contain lowercase letters. Valid request methods are described in RFC7230-3.1.1 which states it is a token. Token is defined in RFC7230-3.2.6 which says any visible USASCII
character except delimiters "(),/:;<=>?@[\]{}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, yes you are correct. (Reading the code and RFC more carefully...). I was wrong assuming that !#$%&
etc should not be a part of the HTTP method.
HTTP methods are generally uppercase but PSR-7 specifies that case should not change. I think we are correct by running strtoupper()
. This implementation will, however, always fail when using a lower case method in the configuration. I suggest changing the regex to fail directly if a lower-case method are used.
$matches = preg_grep('/[^[A-Z0-9]!#$%&\'*\/+\-.^_`|~]/', $value);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
were do we strtoupper? i think we should leave the method unchanged, given how psr-7 is specified. as mentioned above, guzzle psr-7 request does not follow the SHOULD of the spec.
See #24 (comment) and the method doc block. I think we should do a case insensitive comparison. |
fine for me
|
Technically RFC7230 allows lowercase request methods but consensus was they might be too confusing for users.
Without delimiter the following requests would get the same cache key: POST /foo bar POST /foob ar
ok, i guess that is a good argument in favor of not concating ' '.$body if the body is empty (which it usually is for GET). sorry, can you please adjust that @tuupola ? i think then we are really ready. @sagikazarmark or do you see any issues that need to be adressed? |
Done. I will update the docs when I know this is frozen. |
thanks a lot! lets wrap up the doc PR as well. work for more flexibility will continue in #3 |
Thank you @tuupola |
What's in this PR?
Adds new
methods
setting which allows you to change which request methods are allowed to be cached.Why?
Sometimes it is desirable to cache
POST
requests. For example when using HTTPlug based SOAP client. I am currently using this in live system for SOAP requests.Example Usage
To enable caching of
GET
andPOST
requests you can do the following.Checklist
To Do