Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for infering the result type of queries in EntityManager::createQuery() #232

Merged
merged 3 commits into from
Jan 21, 2022

Conversation

arnaud-lb
Copy link
Contributor

@arnaud-lb arnaud-lb commented Nov 16, 2021

Adds a type parameter TResult on the Doctrine\ORM\Query and Doctrine\ORM\AbstractQuery stubs. This parameter represents the type of the results returned by the getResult() and execute() family of methods, when in HYDRATE_OBJECT mode.

Also adds a DynamicReturnTypeExtension on EntityManager::createQuery() so that TResult is infered from the query string.

As the hydration mode influences the return type of the getResult() and execute() family of methods, we can not just declare that these methods return TResult yet. This will require a DynamicReturnTypeExtension (next PR).

(see issue #221 and the fully functional dev branch with the DynamicReturnTypeExtension)

@arnaud-lb arnaud-lb marked this pull request as ready for review November 16, 2021 18:36
Comment on lines 72 to 161
// It makes no sense to speak about a result type for UPDATE or DELETE
// queries, so we return VoidType here.
if (!$this->selectQuery) {
return new VoidType();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really true? Don't these queries return the number of affected rows? I'm not quite sure if it's the case for queries that are executed in this way through Doctrine - maybe the value is lost somewhere - but generally for SQL that's the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is that the return type of getResult() or getSingleResult() is inconsistent between SELECT and UPDATE/DELETE queries.

For SELECT queries, the return types are:

getResult(): array<TResult>
getSingleResult(): TResult

For UPDATE/DELETE queries, the return types are:

getResult(): int
getSingleResult(): int

Because of this we can not use TResult directly in @return.

Instead, we will need a return type extension. However, the return type extension needs a way to find whether the Query instance is a SELECT or UPDATE/DELETE query. Setting TResult to void is one way to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One alternative would have been to have multiple type params on Query, so that we could do this:

/** 
 * @template TResult
 * @template TSingleResult
 */
class Query {
    getResult(): TResult
    getSingleResult(): TSingleResult
}

However I think that it would be tedious to declare the two types in annotations, like @return Query<array<array{foo: int, bar: string}>, array{foo: int, bar: string}>, especially because of the redundancy.

Also, we still need a return type extension so that we revert to mixed if the hydration mode may not be HYDRATE_OBJECT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't like this dual template type approach either.

class ManyId
{
/** @var string */
public $id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public $id;
public $id;


assert(is_string($typeName));

$nullable = $this->isQueryComponentNullable($dqlAlias) || $class->isNullable($fieldName) || $this->isAggregated;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last part of this condition basically seems to say "if there is an aggregate function, the type is nullable". However what I'm often doing is this:

SELECT COALESCE(SUM(column), 0)

Now this is likely already taken care of by the walkCoalesceExpression method but I noticed this case isn't covered by a test. Could you add a test case for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I've added a test for this

@ondrejmirtes
Copy link
Member

Hi Arnaud, I really appreciate it! I'm just busy at this moment (130 unread emails all related to PHPStan) so it might take me a while to review this PR. Thanks for understanding.

@allan-simon
Copy link

@arnaud-lb thanks for this PR, at my company we have an heavy use of createQuery, and we've been bitten several by the fact that we currently need to manually annotate the return type (which cause phpstan to blindly trust us poor human that makes copy/paste mistakes) so this PR would permits us to catch a whole class of bugs

tell me if there's any way we can help (testing the branch on a real code base etc. )

@arnaud-lb
Copy link
Contributor Author

Thank you @allan-simon. If you want to test, this commit contains the full feature (the equivalent of this PR + a future PR I will open once this one is merged)

@arnaud-lb arnaud-lb force-pushed the generic-query branch 2 times, most recently from cb70704 to 7e9e87e Compare January 4, 2022 15:14
@allan-simon
Copy link

thanks for the rebase, I try that tomorrow !

@arnaud-lb
Copy link
Contributor Author

@ondrejmirtes I've just rebased on top of 1.1

@ondrejmirtes
Copy link
Member

I love the idea, but it's so much code to go through! Some exploratory questions about the implementation:

  • Does it include inferring things from createQueryBuilder? I don't think it's too hard to include that, we're getting DQL from the QueryBuilder calls in this codebase already.
  • Does it handle various result types that you can get on the Query like getArrayResult, getScalarResult, getResult, getSingleColumnResult, getOneOrNullResult, getSingleResult etc.? This always put me away from trying to tackle this.

@arnaud-lb
Copy link
Contributor Author

arnaud-lb commented Jan 18, 2022

Sorry about the code size! I wanted to test the QueryResultTypeWalker and the CreateQueryDynamicReturnTypeExtension as a single unit, so I couldn't split the code mode than this.

Does it include inferring things from createQueryBuilder? I don't think it's too hard to include that, we're getting DQL from the QueryBuilder calls in this codebase already.

Yes, this is something that can be added easily in a separate PR

Does it handle various result types that you can get on the Query like getArrayResult, getScalarResult, getResult, getSingleColumnResult, getOneOrNullResult, getSingleResult etc.? This always put me away from trying to tackle this.

ArrayResult

It may be difficult to add support for getArrayResult(), HYDRATE_ARRAY, or hydration modes other than HYDRATE_OBJECT.

The reason for this is that it may require to add additional type parameters on Query to represent the type of results in other hydration modes, which would be inconvenient for plugin users.

This series of PRs will fully support scalar values in HYDRATE_OBJECT mode, though. For example

SELECT e.id, a.value + 42 AS value_plus_42 FROM Entity e

is fully supported, and Query::getResult() / Query::getResult(Query::HYDRATE_OBJECT) are inferred to return a array<array{id: int, value_plus_42: int}>.

This PR works by adding a type parameter on Query to represent the result type in HYDRATE_OBJECT mode. We use this type parameter in the return type of getResult() and variants.

The type parameter is also convenient outside of the extension, and can be used to implement generic query handlers, like a paginator:

/**
 * @template T
 * 
 * @param Query<ResultType> $query
 * 
 * @return Paginator<ResultType>
 */
function paginate(Query $query): Paginator;

Adding support for HYDRATE_ARRAY mode may require a second type parameter, which would be inconvenient because every reference to the Query type would be required to specify the two parameters:

/**
 * @return Query<TypeInObjectMode, TypeInArrayMode>
 */
function example(): Query;

This is the big caveat of this PR series.

I can think of a few solutions, though:

  • Abusing the type parameter so that we can use it to optionally represent the result type in one or more hyrdation modes, like Query<array{hydrate_object: SomeEntity, hydrate_array: array{id: int, ...}>. This way, it would be possible for the user to specify only one result type
  • Creating convenient aliases of the Query type like ObjectQuery<ResultType> and ArrayQuery<ResultType> that would map to Query<ResultType, mixed> and Query<mixed, ResultType>
  • It may be possible to infer the type in HYDRATE_ARRAY mode from the type in HYDRATE_OBJECT mode. This would be ideal, as it would require only one type parameter, but I'm not sure whether it's possible or not (Edit: it's not possible because the object result type doesn't convey all information necessary to infer the array result type, at least when fetch-joins are involved.)

(Edit: none of these solutions are really satisfying)

getScalarResult

getScalarResult is not supported currently because its return type is inconsistent: https://github.com/doctrine/orm/blob/v2.7.3/lib/Doctrine/ORM/Internal/Hydration/AbstractHydrator.php#L353-L360

It may be possible to support this, though

getResult()

getResult() is supported when called with 0 parameters, or when the first parameter is Query::HYDRATE_OBJECT

getOneOrNullResult(), getSingleResult()

getOneOrNullResult() and getSingleResult() are supported, but only when explicitly specifying the hydration mode. This is because the default hydration mode of these functions depends on the Query state (Query::$_hydrationMode).

@enumag
Copy link
Contributor

enumag commented Jan 18, 2022

getScalarResult is not supported currently because its return type is inconsistent: https://github.com/doctrine/orm/blob/v2.7.3/lib/Doctrine/ORM/Internal/Hydration/AbstractHydrator.php#L353-L360

Whoa from that comment in Doctrine I can't even tell what behavior is the desired one and what is the errorneous one. I'm also surprised this isn't fixed in the 3.0.x branch.

EDIT: Ah... based on the original comment I found through git blame apparently in 3.0 the if should be removed since the code inside should be used even for scalars.

@ondrejmirtes
Copy link
Member

@arnaud-lb Can you please add the QueryBuilder support now? I have a big project at my disposal that has more than 1 million LoC and heavily uses Doctrine with QueryBuilder, so I could test this thouroughly. Unfortunately with pure createQuery support I'd be flying blind merging this.

Getting the DQL from the QueryBuilder object should be as easy as $scope->getType(new MethodCall($queryBuilderVar), 'getDQL').

Here are the various extensions that make it happen: https://github.com/phpstan/phpstan-doctrine/tree/master/src/Type/Doctrine/QueryBuilder

The two main ones are https://github.com/phpstan/phpstan-doctrine/blob/master/src/Type/Doctrine/QueryBuilder/QueryBuilderGetQueryDynamicReturnTypeExtension.php and https://github.com/phpstan/phpstan-doctrine/blob/master/src/Type/Doctrine/QueryBuilder/QueryBuilderGetDqlDynamicReturnTypeExtension.php.

@arnaud-lb arnaud-lb force-pushed the generic-query branch 3 times, most recently from c8da2d8 to 3ada067 Compare January 19, 2022 12:26
Comment on lines 10 to +20
/** @api */
class QueryType extends ObjectType
class QueryType extends GenericObjectType
{

/** @var string */
private $dql;

public function __construct(string $dql)
public function __construct(string $dql, ?Type $resultType = null)
{
parent::__construct('Doctrine\ORM\Query');
$resultType = $resultType ?? new MixedType();
parent::__construct('Doctrine\ORM\Query', [$resultType]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking the $dql as a string here forces us to create a separate QueryType instance for each alternative query. This becomes a problem with the generic parameter, because instead of typing $em->createQuery('select e from ' . (rand(0,1) ? 'Foo' : 'Bar')) as Query<Foo|Bar>, we are forced to type it as Query<Foo>|Query<Bar>. The former is more useful in some cases: https://phpstan.org/r/7ce3e2f6-f544-4637-87bb-57a96e7e8ba2

Changing string $dql to Type|string $dql would work, but it would break the compatibility of the getDql() method. Alternatively, we could introduce a new type: GenericQueryType, and use this type instead of QueryType. Would this be acceptable with the backward compatibility promise ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not very critical though, since createQuery() rarely receives a union of constant strings in practice)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we can disregard these cases for now to make it simple :)

@arnaud-lb
Copy link
Contributor Author

@ondrejmirtes I've updated this PR with support for QueryBuilder. I've also pushed this branch based on this PR, with the DynamicTypeExtension for Query::getResult() and variants.

@ondrejmirtes
Copy link
Member

Alright, just tested in on a real project and was able to make it crash :)

Uncaught AssertionError: assert(is_string($typeName)) in /Users/ondrej/Development/slevomat/vendor/phpstan/phpstan-doctrine/src/Type/Doctrine/Query/QueryResultTypeWalker.php:219

That's because $typeName is null. The offending field is defined like this:

	#[ORM\ManyToOne]
	private ?PartnerProfile $partnerProfile = null;

I think the problem is that you're looking for the type in fieldMappings, but this is an association. This is how the problematic query that crashes PHPStan looks like:

	/**
	 * @return string[][]
	 */
	public function getPartnerProfilesForElasticsearchExport(): array
	{
		return $this->entityManager->createQueryBuilder()
			->select('DISTINCT(p.partnerProfile) AS partnerProfileId')
			->from(Product::class, 'p')
			->join(PartnerProfile::class, 'pp', 'p.partnerProfile = pp.id')
			->andWhere('p.status = ?', ProductStatus::VALID)
			->andWhere('BIT_AND(p.visibility, ?) > 0', ProductVisibility::getGenerallyVisible()->getValue())
			->andWhere('pp.visibility IN (?)', [
				PartnerProfileVisibility::VISIBLE,
				PartnerProfileVisibility::HIDDEN_FOR_CRAWLERS,
			])
			->addOrderBy('partnerProfileId', 'DESC')
			->getQuery()
			->getArrayResult();
	}

@arnaud-lb arnaud-lb force-pushed the generic-query branch 4 times, most recently from 76b4a27 to 760810f Compare January 20, 2022 12:19
@arnaud-lb
Copy link
Contributor Author

Oh right, this was not handled. This is now fixed in this PR and in the generic-query-full branch 👌

@staabm
Copy link
Contributor

staabm commented Jan 20, 2022

I understand it might be hard but narrowing down the result type based on WHERE and also maybe HAVING would be awesome!

these are things which you get for free when using the result set metadata approach I am using with https://github.com/staabm/phpstan-dba

I did not spent time in reviewing the changes of this PR, but in case you are interessted:
when we would have a dynamic return type extension for the query builder like I have done in a proof of concept for zend framework 1 in combination with phpstan-dba we could resolve every query you can think of, without additional type inference logic on top required to handle cases like #232 (comment)

I wrote a bit on how this could work in staabm/phpstan-dba#94 (comment)

@ondrejmirtes
Copy link
Member

@staabm Your suggestion doesn't really work for DQL - it's a different language built to generate SQL with its own specifics.

Also are you sure that phpstan-dba correctly resolves a nullable column without null if you post a query with WHERE x IS NOT NULL?

@staabm
Copy link
Contributor

staabm commented Jan 20, 2022

Your suggestion doesn't really work for DQL - it's a different language built to generate SQL with its own specifics.

i guess doctrine is already able to build a sql query based on DQL.
If the sql could be generated at analysis time, this could easily work with phpstan-dba

Also are you sure that phpstan-dba correctly resolves a nullable column without null if you post a query with WHERE x IS NOT NULL?

will try tomorrow and report back

@staabm
Copy link
Contributor

staabm commented Jan 20, 2022

It seems I was wrong. The where-condition dependent type is not correclty inferred from the result-set metadata atm

staabm/phpstan-dba#172

@ondrejmirtes
Copy link
Member

That's why I want to work the other way around - do everything a similar way that's done in this PR, relying on database engine has its downsides.

@arnaud-lb
Copy link
Contributor Author

@ondrejmirtes

I think I've found a problem, hopefully solvable:
->select(sprintf('NEW %s(p.id, pt.nameShort)', MailingCampaignOffersList

I've found a behavior I didn't anticipated while fixing this. I should be able to submit a fix tomorrow

Thank you for all the tests :)

I understand it might be hard but narrowing down the result type based on WHERE and also maybe HAVING would be awesome!

Agreed it would be awesome. I think that it's totally feasible, too (but maybe not in this PR).

@ondrejmirtes
Copy link
Member

Alright, I’ll merge it once NEW is fixed, and we’ll be looking forward to more progress 😊

@ondrejmirtes
Copy link
Member

I plan to release this PR as 1.2.0 if there are no objections.

@arnaud-lb arnaud-lb force-pushed the generic-query branch 4 times, most recently from 4751ea8 to dac5f17 Compare January 21, 2022 16:22
@arnaud-lb
Copy link
Contributor Author

I've just pushed a fix for NEW expressions, and a few other non trivial changes, so I would recommend to test again.

One of these changes is related to how Doctrine shapes the results. There are some cases where I got it wrong initially, so I fixed them. I've documented Doctrine behavior here.

An other change is related to the type of expressions (functions, arithmetic expressions, etc). In PHP < 8.1 the result of these expressions was always returned as a string (this PR used numeric-string where possible), except for COUNT() and LENGTH() which are explicitly casted as int since Doctrine 2.8. Since PHP 8.1, some expression results may be returned as an int or float. I made adjustments so that these expressions are typed as int|numeric-string or equivalent. It maybe be possible to refine that later, but for now the types should be correct and safe. (The type of selected entities and entity fields is always well known, and not affected by this.)

I've also added rudimentary end to end tests with SQLite to double check our expectations.

(I think that I've force-pushed over your commits, sorry 😬)

…ateQuery()

Adds a type parameter "TResult" on the Doctrine\ORM\Query and
Doctrine\ORM\AbstractQuery stubs. This parameter represents the type of the
results returned by the getResult() and execute() family of methods, when in
HYDRATE_OBJECT mode.

Also adds a DynamicReturnTypeExtension on EntityManager::createQuery() so that
TResult is infered from the query string.

Caveat: As the hydration mode influences the return type of the getResult() and
execute() family of methods, we can not just declare that these methods return
TResult yet. This will require a DynamicReturnTypeExtension.
@ondrejmirtes
Copy link
Member

I really love this! Tested it and apart from the one WHERE u.firstName IS NOT NULL still being nullable, it's green!

I rebase this on top of master, I'll wait for the green build and will release it :)

Along with #253 and this PR phpstan-doctrine 1.2.0 will be an awesome release :)

@ondrejmirtes ondrejmirtes merged commit 8061e5f into phpstan:master Jan 21, 2022
@ondrejmirtes
Copy link
Member

Thank you!

@ondrejmirtes
Copy link
Member

Alright, this is in 1.2.1 mainly because I screwed up and forgot to fetch before I tagged it locally :)

@arnaud-lb
Copy link
Contributor Author

Awesome 🎉 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants