C#: mass enable diff-informed data flow #19661

Open. d10c wants to merge 1 commit into main.

@d10c d10c commented Jun 3, 2025

An auto-generated patch that enables diff-informed data flow in the obvious cases.

Builds on #18344 and https://github.com/github/codeql-patch/pull/88

@github-actions github-actions bot added the C# label Jun 3, 2025
@d10c d10c marked this pull request as ready for review June 4, 2025 11:32
@Copilot Copilot AI review requested due to automatic review settings June 4, 2025 11:32
@d10c d10c requested a review from a team as a code owner June 4, 2025 11:32
@Copilot Copilot AI left a comment

Pull Request Overview

This PR auto-generates patches to enable diff-informed data flow by adding a default observeDiffInformedIncrementalMode predicate in numerous data-flow configuration modules.

  • Added predicate observeDiffInformedIncrementalMode() { any() } to all relevant DataFlow::ConfigSig modules.
  • Covers security, cryptography, and likely-bug query modules for incremental diff analysis.

Reviewed Changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| csharp/ql/src/Security Features/CWE-114/AssemblyPathInjection.ql | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/src/Security Features/CWE-091/XMLInjection.ql | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/src/Likely Bugs/LeapYear/UnsafeYearConstruction.ql | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ZipSlipQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/XPathInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/UrlRedirectQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/TaintedPathQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/SqlInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ResourceInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/RegexInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ReDoSQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/MissingXMLValidationQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/LogForgingQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/LDAPInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ExposureOfPrivateInformationQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/CommandInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/CodeInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/CleartextStorageQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/cryptography/HardcodedSymmetricEncryptionKey.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/cryptography/EncryptionKeyDataFlowQuery.qll | Added observeDiffInformedIncrementalMode predicate |
Comments suppressed due to low confidence (2)

csharp/ql/src/Security Features/CWE-114/AssemblyPathInjection.ql:45

  • [nitpick] Add a brief comment above this predicate to explain its role in diff-informed incremental analysis, improving clarity for future maintainers.
predicate observeDiffInformedIncrementalMode() { any() }

csharp/ql/src/Security Features/CWE-114/AssemblyPathInjection.ql:45

  • There are no existing tests exercising the incremental diff mode; consider adding test cases to validate behavior when this predicate is active.
predicate observeDiffInformedIncrementalMode() { any() }

@d10c d10c marked this pull request as draft June 5, 2025 15:59

d10c commented Jun 5, 2025

It turns out that some of the generated changes in the PRs were not correct, e.g. because they should have also generated a getASelected{Source,Sink}Location() override but didn't (see Chuan-kai's comment here). So for now I'm putting them back in Draft until I make sure (via the patch script) that we are correctly handling all 3 documented query patterns, starting with the simplest one (both source and sink are used as location sources). If you have already started reviewing the PRs, thank you (also for your patience) and stay tuned for an update as to what has changed in the meantime!

@michaelnebel

@d10c: Great!
Thank you for doing this. A couple of questions:

  • Do you expect there will be any configurations that will not be diff-informed?
  • How does it impact tests (I remember hearing something along the lines of that tests already worked with "diff-informed", if the tests are .qlref files)?

d10c commented Jun 10, 2025

Update: no changes since last time I opened the PR. It turns out that it's sound (but not optimally performant) to leave getASelected{Source,Sink}Location() un-overridden, specifically in case of a select clause containing only one of source or sink but not both. The patch script currently does not differentiate between that case and the one in which both source and sink are present in the select clause. So I will re-open these PRs as they are, and generate an appropriate getASelected{Source,Sink}Location() override in a follow-up round of PRs.
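
For illustration, a follow-up override for a query whose select clause reports only the sink might look roughly like the sketch below. `SinkOnlyExampleConfig` and its placeholder source and sink definitions are hypothetical and not taken from this PR; the predicate signatures are assumed to match the shared `DataFlow::ConfigSig`.

```ql
import csharp

// Hypothetical configuration for illustration; not one of the files in this PR.
module SinkOnlyExampleConfig implements DataFlow::ConfigSig {
  // Placeholder source: any string literal.
  predicate isSource(DataFlow::Node source) { source.asExpr() instanceof StringLiteral }

  // Placeholder sink: the first argument of a call to a method named "Execute".
  predicate isSink(DataFlow::Node sink) {
    exists(MethodCall mc |
      mc.getTarget().hasName("Execute") and
      sink.asExpr() = mc.getArgument(0)
    )
  }

  // Opt in to diff-informed mode, as this PR does for the real configurations.
  predicate observeDiffInformedIncrementalMode() { any() }

  // Assumption: the select clause never reports the source, so no source
  // location has to be matched against the diff range.
  Location getASelectedSourceLocation(DataFlow::Node source) { none() }
}
```

Leaving `getASelectedSourceLocation` un-overridden, as the current patch does, corresponds to the sound but not optimally performant case described above.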

d10c commented Jun 10, 2025

  • Do you expect there will be any configurations that will not be diff-informed?

In this initial PR, we are skipping over the cases where a node other than source or sink is used as a location source in the select clause. To enable these cases, a non-empty override of getASelected{Source,Sink}Location() will be needed. Also, queries with more than one dataflow config are probably another case to be considered separately.
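
A non-empty override for such a case could be sketched as follows, assuming a hypothetical query that reports the enclosing call rather than the sink argument itself. `RelatedNodeExampleConfig` and the `Execute` method name are made up for illustration and do not correspond to any query in this PR.

```ql
import csharp

// Hypothetical configuration for illustration only.
module RelatedNodeExampleConfig implements DataFlow::ConfigSig {
  // Placeholder source: any string literal.
  predicate isSource(DataFlow::Node source) { source.asExpr() instanceof StringLiteral }

  // Placeholder sink: the first argument of a call to a method named "Execute".
  predicate isSink(DataFlow::Node sink) {
    exists(MethodCall mc |
      mc.getTarget().hasName("Execute") and
      sink.asExpr() = mc.getArgument(0)
    )
  }

  predicate observeDiffInformedIncrementalMode() { any() }

  // Assumed scenario: the alert is placed on the enclosing call, so that
  // call's location is the one to compare against the diff range.
  Location getASelectedSinkLocation(DataFlow::Node sink) {
    exists(MethodCall mc |
      sink.asExpr() = mc.getArgument(0) and
      result = mc.getLocation()
    )
  }
}
```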

  • How does it impact tests (I remember hearing something along the lines of that tests already worked with "diff-informed", if the tests are .qlref files)?

It's true that in order to enable diff-informed testing of queries, the tests have to be .qlref files. I'm planning on doing that kind of rewrite (similar to this PR) in a later phase.

@d10c d10c marked this pull request as ready for review June 10, 2025 15:11
@michaelnebel michaelnebel left a comment

Should we trigger a DCA run before merging? Is there any other special testing needed before merge?

---
category: minorAnalysis
---
* A number of built-in C# queries can now run in diff-informed mode.

@michaelnebel michaelnebel Jun 11, 2025

Has the documentation on how to run queries in diff-informed mode been released? (Basically, my question is whether we should have a release note.)

Contributor Author

No, it hasn't been made public. So it probably makes sense to not have a release note on any of these PRs.

@@ -43,6 +43,8 @@ module SqlInjectionConfig implements DataFlow::ConfigSig {
    * `node` from the data flow graph.
    */
   predicate isBarrier(DataFlow::Node node) { node instanceof Sanitizer }
+
+  predicate observeDiffInformedIncrementalMode() { any() }

Contributor

The related location for this query uses a PathNode instead of Node. Will that cause an issue?

@michaelnebel michaelnebel Jun 11, 2025

As far as I can tell, I don't think it matters - as this predicate only relates to filtering of source and sink nodes based on an extensible predicate that is pre-populated with a diff range (in which a source and sink should be located).

Contributor Author

Since (AFAIK) PathNode and Node have the same location, it shouldn't make a difference.

@@ -26,6 +26,8 @@ module UnsafeYearCreationFromArithmeticConfig implements DataFlow::ConfigSig {
       oc.getObjectType().getABaseType*().hasFullyQualifiedName("System", "DateTime")
     )
   }
+
+  predicate observeDiffInformedIncrementalMode() { any() }

Contributor

Both the primary and related locations are of type PathNode. Is that a problem?

@@ -41,6 +41,8 @@ module AssemblyPathInjectionConfig implements DataFlow::ConfigSig {
       name = "UnsafeLoadFrom" and arg = 0
     )
   }
+
+  predicate observeDiffInformedIncrementalMode() { any() }

Contributor

Related location is PathNode.

@michaelnebel

It's true that in order to enable diff-informed testing of queries, the tests have to be .qlref files. I'm planning on doing that kind of rewrite (similar to this PR) in a later phase.

Fantastic! We are also rewriting the test cases whenever we touch the queries (to use inline expectations) - so that would be great! 😄

d10c commented Jun 11, 2025

As for DCA, it looks like we don't have sources set up with diffs, so we can't use DCA to test performance on diffs yet. A normal DCA run would then test before-and-after-this-PR without diff-informed mode, but those two runs should have the same behaviour. So I say we leave DCA testing out for now until I get the diff-informed sources in place.

@d10c d10c added the no-change-note-required label Jun 11, 2025
@d10c d10c force-pushed the d10c/csharp/diff-informed branch from 9c48c73 to f2085c2 Compare June 11, 2025 16:58

michaelnebel commented Jun 12, 2025

As for DCA, it looks like we don't have sources set up with diffs, so we can't use DCA to test performance on diffs yet. A normal DCA run would then test before-and-after-this-PR without diff-informed mode, but those two runs should have the same behaviour. So I say we leave DCA testing out for now until I get the diff-informed sources in place.

Yes, I just wanted to make sure that the introduction of the override didn't somehow affect performance negatively. I don't know whether introducing this override might, for some obscure reason, change e.g. some join ordering in a way that affects performance.

@michaelnebel michaelnebel left a comment

LGTM!
