JohT · Apr 26, 2025
diff --git a/.github/workflows/public-analyze-code-graph.yml b/.github/workflows/public-analyze-code-graph.yml
@@ -37,10 +37,10 @@ on:
       analysis-arguments:
         description: >
           The arguments to pass to the analysis script.
-          Default: '--profile Neo4jv5-low-memory'
+          Default: '--profile Neo4j-latest-low-memory'
         required: false
         type: string
-        default: '--profile Neo4jv5-low-memory'
+        default: '--profile Neo4j-latest-low-memory'
       typescript-scan-heap-memory:
         description: >
           The heap memory size in MB to use for the TypeScript code scans (default=4096).
@@ -71,7 +71,7 @@ jobs:
       matrix:
         include:
         - os: ubuntu-22.04
-          java: 17
+          java: 21
           python: 3.12
           miniforge: 24.9.0-0
     steps:

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,91 @@
 
 This document describes the changes to the Code Graph Analysis Pipeline. The changes are grouped by version and date. The latest version is at the top.
 
+## v2.1.4
+
+### 🛠 Fix
+
+* [Remove debug prints](https://github.com/JohT/code-graph-analysis-pipeline/commit/4d0a419dc4344e1008ad9d08f8a572421758b191)
+
+## v2.1.3
+
+### 🚀 Feature
+
+* Improve git history rendering by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/371
+* Add git history csv reports by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/372
+  * Add git history CSV reports ([7ea6c28](https://github.com/JohT/code-graph-analysis-pipeline/pull/372/commits/7ea6c2823bdf0bda4012e13a629d5f29fd8a86c3))
+  * Use PREPARE_CONDA_ENVIRONMENT to fully skip conda ([2d0b800](https://github.com/JohT/code-graph-analysis-pipeline/pull/372/commits/2d0b800c48beb80164dd9a5c8f5d145d6923b991))
+
+### 🛠 Fix
+
+* Fix missing pairwise changed dependencies by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/368
+  * Calculate p-values only if there are enough samples ([71d3519](https://github.com/JohT/code-graph-analysis-pipeline/pull/368/commits/71d3519d50c7336e083841aaada0f3d8619fd0ec))
+* Fix git commitCount to only contain unique hashes ([14dceef](https://github.com/JohT/code-graph-analysis-pipeline/pull/372/commits/14dceef6c7eb38a376606a068b484f917cf8551b)) by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/372
+
+### 📦 Dependency Updates
+
+* Update jQAssistant TypeScript Plugin to v1.4.0-M2 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/364
+* Update dependency com.buschmais.jqassistant.cli:jqassistant-commandline-neo4jv5 to v2.7.0-RC1 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/366
+* Update actions/setup-java digest to c5195ef by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/365
+
+## v2.1.2
+
+### 🚀 Feature
+
+[Compare pairwise changed files with their dependency weights](https://github.com/JohT/code-graph-analysis-pipeline/pull/362/commits/7e5886904bcfe503a73dfba654aa972418f064b0) by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/362:   
+The [GitHistoryGeneral.ipynb](https://github.com/JohT/code-graph-analysis-pipeline/blob/cb47f814332f517807b9e144df352f68146cddfe/jupyter/GitHistoryGeneral.ipynb) notebook now includes a section that analyzes pairwise file changes alongside their code dependencies (e.g., imports). It calculates correlations, p-values, and visualizes the results using a scatter plot.
+
+### 🛠 Fix
+
+[Fix missing git changes due to not reliably present label](https://github.com/JohT/code-graph-analysis-pipeline/pull/362/commits/5242804ad517b82b928e7ebd87c9d64b1d2f8a0e) by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/362
+
+### 📦 Dependency Updates
+
+* Update Node.js to v23.11.0 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/355
+* Update dependency com.buschmais.jqassistant.cli:jqassistant-commandline-neo4jv5 to v2.7.0-M1 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/359
+* Update Neo4j and APOC to 5.26.5 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/360
+* Update dependency JohT/open-graph-data-science-packaging to v2.13.4 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/361
+
+## v2.1.1
+
+### 🚀 Features
+
+* Auto update Conda Environment by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/353
+  * [Update conda environment if its outdated compared to the `environment.yml`](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/7f5b2811963b94631d7bb4ef4da57bad98a8f0d4): Previously, Jupyter notebooks failed to import libraries that had been added recently. An already existing Conda environment "codegraph" was considered sufficient, even if it was outdated. Now, it is updated automatically if necessary so that there are no more import errors.
+  * [Add PREPARE_CONDA_ENVIRONMENT to skip Conda environment setup](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/f13df113a691b55168dbc02cf5b94d5d838b688e): Previously, Conda environment activation was skipped when the `codegraph` environment was already active. Now, `PREPARE_CONDA_ENVIRONMENT="false"` needs to be set additionally to explicitly skip that part. This is needed in GitHub Action pipelines because `conda init` doesn't work as expected but is taken care of by [setup-miniconda](https://github.com/marketplace/actions/setup-miniconda#important).
+  * [Introduce script testing](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/7b735b0abfa037e090580ca5e5cfc80835b80e16): The first (for now framework-free) script test is implemented in the pipeline 🎉.
+
+* Improve git history treemap visualizations and uncover pairwise changed files by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/352
+  * [Add CHANGED_TOGETHER_WITH edge for git file nodes](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/10e202e45e5d4ba602b6023277f63ddf60c97f2e): With this change, there is now the new relationship `CHANGED_TOGETHER_WITH` between `File` nodes (git as well as code) including a property `commitCount` on how often they were changed together. This adds an additional way of uncovering dependencies of files, besides code dependencies via imports.
+
+### 📈 Reports
+
+* Improve git history treemap visualizations and uncover pairwise changed files by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/352
+  * [Add plot highlighting directories with very few authors](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/46290acd5451ce7e4628f118cd67846bc47e535d)
+  * [Add treemap plot that shows commit counts of pairwise changed files](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/30349a77160acd1cde612c199c74b3c67f4cafdb): Now you can additionally see which areas in the code base were changed in conjunction with at least one other file.
+
+### ⚙️ Optimizations
+
+* Auto update Conda Environment by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/353
+  * [Improve change file detection](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/41260dfb02dc6ed7f4f3ff88ff463330b809efd3): 
+    * Log output is now colored (red = error, dark grey = info)
+    * Given `--paths` are now validated
+    * File statistics are now correctly extracted for MacOS and Linux
+
+* Improve git history treemap visualizations and uncover pairwise changed files by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/352
+  * [Change default svg rendering size to 1080x1080](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/b898b1611717497e2176b368db8ab5bd27a017e8)
+
+### 🛠 Fixes
+
+* Auto update Conda Environment by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/353
+  * [Defer download URL check for offline mode](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/a1d141fee5398fd32c57a453c39aa78f40b63d2c): Previously, it was not possible to get an artifact from the download script in offline mode, even if it had already been downloaded and was ready to use in the cache. This is now resolved by deferring the check of the URL until right before the actual download, since it needs an internet connection.
+  * [Fix wrong variable for Jupyter notebook directory](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/59571c42f9b7c2f9bbb67554a08c1643deb56bb4): Conda environment creation still used an old variable from another file, which kept working only because these files are called consecutively. This could break easily and is now resolved.
+
+### 📦 Dependency Updates
+
+* Update actions/download-artifact digest to 95815c3 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/351
+* Update actions/cache digest to 5a3ec84 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/350
+
 ## v2.1.0 (2025-03-22) Public GitHub Actions Workflow, GraphViz Visualization and Git History Treemaps
 
 For all details see: https://github.com/JohT/code-graph-analysis-pipeline/releases/tag/v2.1.0

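Note on `commitCount` (v2.1.1 and v2.1.3 entries above): the changelog describes `CHANGED_TOGETHER_WITH` as carrying a `commitCount` of how often two files were changed together, and a later fix restricts it to unique hashes. The following is a minimal shell sketch of that counting idea only. It is not the pipeline's implementation, which this diff does not show. The input format (`<hash> <fileA> <fileB>` per line, file names without spaces), the function name, and the sample hashes and paths are assumptions made for illustration.

```shell
# Hypothetical helper for illustration only; not part of the repository.
# Reads lines of the form "<commit-hash> <fileA> <fileB>" from stdin and prints
# "<fileA> <fileB> <count>", counting each commit hash once per file pair.
count_unique_commits_per_pair() {
  sort -u |                 # drop repeated (hash, pair) lines so a hash counts once
    cut -d' ' -f2,3 |       # keep only the file pair
    sort |                  # group identical pairs together for uniq
    uniq -c |               # prefix each pair with its number of occurrences
    awk '{print $2, $3, $1}'
}

# Made-up sample input: commit a1 is listed twice for the same pair.
printf '%s\n' \
  'a1 src/A.ts src/B.ts' \
  'a1 src/A.ts src/B.ts' \
  'b2 src/A.ts src/B.ts' \
  'c3 src/A.ts src/C.ts' |
  count_unique_commits_per_pair
```

With the sample input, the pair `src/A.ts src/B.ts` is counted twice (hashes `a1` and `b2`) rather than three times, which is the effect the "unique hashes" fix describes.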
diff --git a/COMMANDS.md b/COMMANDS.md
@@ -67,18 +67,19 @@ The [analyze.sh](./scripts/analysis/analyze.sh) command comes with these command
 
 - `--report Csv` only generates CSV reports. This speeds up the report generation and doesn't depend on Python, Jupyter Notebook or any other related dependencies. The default value os `All` to generate all reports. `Jupiter` will only generate Jupyter Notebook reports. `DatabaseCsvExport` exports the whole graph database as a CSV file (performance intense, check if there are security concerns first).
 
-- `--profile Neo4jv4` uses the older long term support (june 2023) version v4.4.x of Neo4j and suitable compatible versions of plugins and JQAssistant. `Neo4jv5` will explicitly select the newest (june 2023) version 5.x of Neo4j. Without setting
-a profile, the newest versions will be used. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
+- `--profile Neo4jv4` uses the older long-term support (June 2023) version v4.4.x of Neo4j and compatible versions of plugins and jQAssistant. Without specifying a profile, the newest versions will be used. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
 
-- `--profile Neo4jv5-continue-on-scan-errors` is based on the default profile (`Neo4jv5`) but uses the jQAssistant configuration template [template-neo4jv5-jqassistant-continue-on-error.yaml](./scripts/configuration/template-neo4jv5-jqassistant-continue-on-error.yaml) to continue on scan error instead of failing fast. This is temporarily useful when there is a known error that needs to be ignored. It is still recommended to use the default profile and fail fast if there is something wrong. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
+- `--profile Neo4jv5` uses the older long-term support (March 2025) version v5.26.x of Neo4j and compatible versions of plugins and jQAssistant. Without specifying a profile, the newest versions will be used. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
 
-- `--profile Neo4jv5-low-memory` is based on the default profile (`Neo4jv5`) but uses only half of the memory (RAM) as configured in [template-neo4j-low-memory.conf](./scripts/configuration/template-neo4j-low-memory.conf). This is useful for the analysis of smaller codebases with less resources. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
+- `--profile Neo4j-latest-continue-on-scan-errors` is based on the default profile (`Neo4j-latest`) but uses the jQAssistant configuration template [template-neo4j-remote-jqassistant-continue-on-error.yaml](./scripts/configuration/template-neo4j-remote-jqassistant-continue-on-error.yaml) to continue on scan error instead of failing fast. This is temporarily useful when there is a known error that needs to be ignored. It is still recommended to use the default profile and fail fast if there is something wrong. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
+
+- `--profile Neo4j-latest-low-memory` is based on the default profile (`Neo4j-latest`) but uses only half of the memory (RAM) as configured in [template-neo4j-low-memory.conf](./scripts/configuration/template-neo4j-low-memory.conf). This is useful for the analysis of smaller codebases with fewer resources. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
 
 - `--explore` activates the "explore" mode where no reports are generated. Furthermore, Neo4j won't be stopped at the end of the script and will therefore continue running.  This makes it easy to just set everything up but then use the running Neo4j server to explore the data manually.
 
 ### Notes
 
-- Be sure to use Java 17 for Neo4j v5 and Java 11 for Neo4j v4
+- Be sure to use Java 21 for Neo4j v2025, Java 17 for v5 and Java 11 for v4. For details, see [Neo4j System Requirements / Java](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java).
 - Use your own initial Neo4j password
 - For more details have a look at the script [analyze.sh](./scripts/analysis/analyze.sh)
 

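Note on the Java requirement in the COMMANDS.md notes: the profile-to-Java pairing (Java 21 for Neo4j v2025, 17 for v5, 11 for v4) could be checked before an analysis run. The sketch below is one way to express that mapping in shell. Both function names are hypothetical and not part of the repository, and it assumes that `Neo4j-latest` (and no profile at all) refers to Neo4j 2025.x, as the README news entry in this change suggests.

```shell
# Hypothetical helpers for illustration only; not part of the repository.

# Prints the Java major version named in the COMMANDS.md notes for a profile.
# Assumption: "Neo4j-latest*" and an empty profile both mean Neo4j 2025.x.
required_java_for_profile() {
  case "$1" in
    Neo4jv4*) echo 11 ;;
    Neo4jv5*) echo 17 ;;
    Neo4j-latest*|"") echo 21 ;;
    *) return 1 ;;  # unknown profile: let the caller decide what to do
  esac
}

# Prints the major version of a Java version string such as "21.0.2",
# "17-ea" or the legacy form "1.8.0_392".
java_major_version() {
  local version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;  # legacy scheme: "1.8.0_392" -> "8.0_392"
  esac
  echo "${version%%[._-]*}"          # cut at the first ".", "_" or "-"
}

required_java_for_profile "Neo4j-latest-low-memory"   # prints 21
java_major_version "1.8.0_392"                        # prints 8
```

Comparing the two results before starting Neo4j would turn a wrong Java version into an early, readable error instead of a failed database start.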
diff --git a/INTEGRATION.md b/INTEGRATION.md
@@ -36,7 +36,7 @@ The workflow parameters are as follows:
 - **sources-upload-name**: The name of the sources uploaded with [actions/upload-artifact](https://github.com/actions/upload-artifact/tree/65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08) containing the content of the 'source' directory for the analysis. It also supports sub-folders for multiple source code bases. This parameter is optional and defaults to an empty string.
 Please use 'include-hidden-files: true' if you also want to upload the git history.
 - **ref**: The branch, tag, or SHA of the code-graph-analysis-pipeline to checkout. This parameter is optional and defaults to "main".
-- **analysis-arguments**: The arguments to pass to the analysis script. This parameter is optional and defaults to '--profile Neo4jv5-low-memory'. You can find all available options in section [Command Line Options of COMMANDS.md/](./COMMANDS.md#command-line-options).
+- **analysis-arguments**: The arguments to pass to the analysis script. This parameter is optional and defaults to '--profile Neo4j-latest-low-memory'. You can find all available options in section [Command Line Options of COMMANDS.md/](./COMMANDS.md#command-line-options).
 - **typescript-scan-heap-memory**: The heap memory size in MB to use for the TypeScript code scans. This value is only used for the TypeScript code scans and is ignored for other scans. This parameter is optional and defaults to '4096'. It will set the environment variable `TYPESCRIPT_SCAN_HEAP_MEMORY` which leads to `NODE_OPTIONS` set to `--max-old-space-size=4096` for TypeScript scans. See [Questions and Answers of README.md](./README.md#thinking-questions--answers) for more information.
 
 The workflow also provides an output parameter:

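Note on **typescript-scan-heap-memory**: INTEGRATION.md states that `TYPESCRIPT_SCAN_HEAP_MEMORY` leads to `NODE_OPTIONS` being set to `--max-old-space-size=4096` for TypeScript scans. A minimal sketch of that derivation, assuming the value is simply interpolated and falls back to the documented default of 4096; the function name is hypothetical and the pipeline's actual script is not shown in this diff:

```shell
# Hypothetical helper for illustration only; not part of the repository.
# Prints the Node.js option derived from TYPESCRIPT_SCAN_HEAP_MEMORY (in MB).
typescript_scan_node_options() {
  # Fall back to the documented default of 4096 MB when unset or empty.
  echo "--max-old-space-size=${TYPESCRIPT_SCAN_HEAP_MEMORY:-4096}"
}

# Print the option that would be used with the current environment.
typescript_scan_node_options
```

Setting `NODE_OPTIONS` to this value only for the scan command, rather than exporting it globally, would keep the larger heap limit from affecting other Node.js processes in the same job.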
diff --git a/README.md b/README.md
@@ -25,6 +25,10 @@ Contained within this repository is a comprehensive and automated code graph ana
 - Example analysis for [AxonFramework](https://github.com/AxonFramework/AxonFramework)
 - Example analysis for [react-router](https://github.com/remix-run/react-router)
 
+### :newspaper: News
+
+- May 2025: Migrated to [Neo4j 2025.x](https://neo4j.com/docs/upgrade-migration-guide/current/version-2025/upgrade) and Java 21.
+
 ### :notebook: Jupyter Notebook Reports
 
 Here is an overview of [Jupyter Notebooks](https://jupyter.org) reports from [code-graph-analysis-examples](https://github.com/JohT/code-graph-analysis-examples). For a complete list, see the [Jupyter Notebook Report Reference](#page_with_curl-jupyter-notebook-report-reference).
@@ -66,7 +70,8 @@ Here are some fully automated graph visualizations utilizing [GraphViz](https://
 
 ## :hammer_and_wrench: Prerequisites
 
-- Java 17 is [required for Neo4j](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-software) (Neo4j 5.x requirement).
+- Java 21 is [required since Neo4j 2025.01](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java). See also [Changes from Neo4j 5 to 2025.x](https://neo4j.com/docs/upgrade-migration-guide/current/version-2025/upgrade).
+- Java 17 is [required for Neo4j 5](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java).
 - On Windows it is recommended to use the git bash provided by [git for windows](https://github.com/git-guides/install-git#install-git-on-windows).
 - [jq](https://github.com/jqlang/jq) the "lightweight and flexible command-line JSON processor" needs to be installed. Latest releases: https://github.com/jqlang/jq/releases/latest. Check using `jq --version`.
 - Set environment variable `NEO4J_INITIAL_PASSWORD` to a password of your choice. For example:
@@ -254,17 +259,17 @@ The [Code Structure Analysis Pipeline](./.github/workflows/internal-java-code-an
   ```
 
 - How can i continue on errors when scanning Typescript projects instead of cancelling the whole analysis?  
-  👉 Use the profile `Neo4jv5-continue-on-scan-errors` (default = `Neo4jv5`):
+  👉 Use the profile `Neo4j-latest-continue-on-scan-errors` (default = `Neo4j-latest`):
 
   ```shell
-  ./../../scripts/analysis/analyze.sh --profile Neo4jv5-continue-on-scan-errors
+  ./../../scripts/analysis/analyze.sh --profile Neo4j-latest-continue-on-scan-errors
   ```
 
 - How can i reduce the memory (RAM) consumption?  
-  👉 Use the profile `Neo4jv5-low-memory` (default = `Neo4jv5`):
+  👉 Use the profile `Neo4j-latest-low-memory` (default = `Neo4j-latest`):
 
   ```shell
-  ./../../scripts/analysis/analyze.sh --profile Neo4jv5-low-memory
+  ./../../scripts/analysis/analyze.sh --profile Neo4j-latest-low-memory
   ```
 
 ## 🕸 Web References

diff --git a/cypher/Centrality/Centrality_10d_Bridges_Stream.cypher b/cypher/Centrality/Centrality_10d_Bridges_Stream.cypher
@@ -1,7 +1,8 @@
 // Centrality 10d Bridges Stream
 
 CALL gds.bridges.stream($dependencies_projection + '-cleaned')
- YIELD from, to
+// The field "remainingSizes" is only needed until https://github.com/neo4j/graph-data-science/issues/354 is resolved.
+ YIELD from, to, remainingSizes
   WITH gds.util.asNode(from) AS fromMember
       ,gds.util.asNode(to)   AS toMember
   WITH *, coalesce(fromMember.declaringType + ': ', '')  +

diff --git a/cypher/Centrality/Centrality_10e_Bridges_Write.cypher b/cypher/Centrality/Centrality_10e_Bridges_Write.cypher
@@ -1,7 +1,8 @@
 // Centrality 10e Bridges Stream - Write Relationship Property "isBridge"
 
 CALL gds.bridges.stream($dependencies_projection + '-cleaned')
- YIELD from, to
+// The field "remainingSizes" is only needed until https://github.com/neo4j/graph-data-science/issues/354 is resolved.
+ YIELD from, to, remainingSizes
   WITH gds.util.asNode(from) AS fromMember
       ,gds.util.asNode(to)   AS toMember
  MATCH (fromMember)-[dependency:DEPENDS_ON]-(toMember)