Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for discounted properties in MDPs and DTMCs #621

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

AlexBork
Copy link
Contributor

This PR adds support for checking cumulative and total reward properties with a given discount factor on MDPs and DTMCs.

  • Extends the parser to allow discounted properties, proposed syntax is R=? [Cdiscount=_factor_]
  • Introduces helper classes for checking discounted properties
  • Extends the respective PCTL model checking classes with support for discounted cumulative and total reward properties, implemented analogous to the existing undiscounted implementations

@AlexBork AlexBork added this to the 1.10 milestone Sep 10, 2024
@sjunges
Copy link
Contributor

sjunges commented Sep 10, 2024

This is absolutely great.

I have two questions:

  • Does this code support convergence detection based on the Bellman residual error?
  • Do we throw relevant error messages when doing, e.g., weak bismulation?

Best,
Sebastian

@AlexBork
Copy link
Contributor Author

Yes, convergence is detected using the Bellman residual, see src/storm/solver/helper/DiscountedValueIterationHelper.cpp. I have also tested convergence using only the number of iterations as the criterion, i.e. letting the iteration run until it is guaranteed that we are precise enough. That was however highly inefficient as the number of iterations for guaranteed convergence is usually far larger than the actual number needed to reach the precision.

The error messages are not implemented, I will take care of it! I'm happy about any pointers where such problems might occur.

@AlexBork
Copy link
Contributor Author

Thanks @sjunges for the review! I incorporated your comments now.

@sjunges
Copy link
Contributor

sjunges commented Oct 29, 2024

Thanks! @tquatmann do you want to have a look before merging?

@sjunges sjunges requested a review from tquatmann November 28, 2024 11:21
Copy link
Member

@tquatmann tquatmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work!

  • Do we (have to) enforce somewhere that the discount factor is strictly between 0 and 1? For both formula types (total and cumulative formulas)? I might have missed this in my review
  • I'd suggest to add more tests, in particular in FormulaParserTest, DtmcPrctlModelCheckerTest, and SchedulerGenerationMdpPrctlModelCheckerTest. The existing test also doesn't consider DiscountedCumulativeRewardFormulas


namespace storm {
namespace logic {
class DiscountedCumulativeRewardFormula : public CumulativeRewardFormula {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that .isCumulativeRewardFormula() is also true forDiscountedCumulativeRewardFormulas.
Does this have any unintended effects? (e.g. places like multi-objective model checking, where the discounted formula would be treaded as if it were undiscounted)? Maybe it's safer to also override .isCumulativeRewardFormula() to false here, so things will rather fail instead of silently dropping the discount factor.

Also, this probably needs to override gatherUsedVariables to make sure that variables in the discount factor are catched.

Same for TotalRewardFormulas

optionalRewardAccumulation = f.getRewardAccumulation();
}
return std::static_pointer_cast<Formula>(
std::make_shared<DiscountedCumulativeRewardFormula>(f.getDiscountFactor(), bounds, timeBoundReferences, optionalRewardAccumulation));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to apply substitution function on the discount factor

@@ -24,6 +24,7 @@ class ExpressionSubstitutionVisitor : public CloneVisitor {
virtual boost::any visit(RewardOperatorFormula const& f, boost::any const& data) const override;
virtual boost::any visit(BoundedUntilFormula const& f, boost::any const& data) const override;
virtual boost::any visit(CumulativeRewardFormula const& f, boost::any const& data) const override;
virtual boost::any visit(DiscountedCumulativeRewardFormula const& f, boost::any const& data) const override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also needs a case for the DiscountedTotalRewardFormula (to apply substitution on the discount factor)

Comment on lines +201 to +212
FormulaInformation result;
result.setContainsCumulativeRewardFormula(true);
for (unsigned i = 0; i < f.getDimension(); ++i) {
if (f.getTimeBoundReference(i).isRewardBound()) {
result.setContainsRewardBoundedFormula(true);
}
}
return result;
}

boost::any FormulaInformationVisitor::visit(DiscountedTotalRewardFormula const&, boost::any const&) const {
return FormulaInformation();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't these need a result.setContainsDiscountFormula(true)?

DiscountingHelper(storm::storage::SparseMatrix<ValueType> const& A, ValueType discountFactor);
DiscountingHelper(storm::storage::SparseMatrix<ValueType> const& A, ValueType discountFactor, bool trackScheduler);

void setUpViOperator() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be simpler to makesetUpViOperator private and call it within solveWithDiscountedValueIteration?

// Initialize result to the zero vector.
std::vector<ValueType> result(transitionMatrix.getRowGroupCount(), storm::utility::zero<ValueType>());

auto multiplier = storm::solver::MultiplierFactory<ValueType>().create(env, transitionMatrix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not use the discountFactor, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants