From d13bbdef2ff22231b2f9c8c10a5c183aa8e29cf2 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:20:44 +0100 Subject: [PATCH 01/17] Create docs --- docs | 1 + 1 file changed, 1 insertion(+) create mode 100644 docs diff --git a/docs b/docs new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/docs @@ -0,0 +1 @@ + From 7fd9746df4fda88c740a3428826b1701a0df05c1 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:20:59 +0100 Subject: [PATCH 02/17] Delete docs --- docs | 1 - 1 file changed, 1 deletion(-) delete mode 100644 docs diff --git a/docs b/docs deleted file mode 100644 index 8b137891..00000000 --- a/docs +++ /dev/null @@ -1 +0,0 @@ - From d2d1b4aca0b40a7beefc2d42905d94bceb60aa50 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:24:09 +0100 Subject: [PATCH 03/17] add launch.m flow diagram --- docs/flow_diagrams/ustar_cp/launch | 51 ++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/launch diff --git a/docs/flow_diagrams/ustar_cp/launch b/docs/flow_diagrams/ustar_cp/launch new file mode 100644 index 00000000..12df4b4f --- /dev/null +++ b/docs/flow_diagrams/ustar_cp/launch @@ -0,0 +1,51 @@ +# Functional Flow Diagram for `launch` Function + +## Start +| +|-- **Initialize Function** +| - Set `exitcode` to 0 +| - Turn off warnings +| +|-- **Check and Normalize Input Folder Path** +| - If `input_folder` not provided, use current directory +| - If relative path, convert to absolute path +| - Ensure path ends with a slash or backslash +| +|-- **Check and Normalize Output Folder Path** +| - If `output_folder` not provided, use current directory +| - If relative path, convert to absolute path +| - Ensure path ends with a slash or backslash +| - Create output directory if it doesn't exist +| +|-- 
**Identify Files to Process** +| - List all CSV files matching pattern `*_qca_ustar_*.csv` +| +|-- **Initialize Error String Array** +| +|-- **Process Each Identified File** +| | +| |-- **Open and Read File** +| | - Extract metadata (site, year, lat, lon, timezone, htower, timeres, sc_negl, notes) +| | - Import data from file +| | +| |-- **Identify Columns** +| | - Identify indices for `USTAR`, `NEE`, `TA`, `PPFD_IN`, `SW_IN` +| | - Handle errors if columns are not found +| | +| |-- **Extract and Validate Data** +| | - Extract data for identified columns +| | - Replace invalid data values with `NaN` +| | - Handle `PPFD` data derived from `SW_IN` if necessary +| | +| |-- **uStar Threshold Computation** +| | - Call `cpdBootstrapUStarTh4Season20100901` for 4-season analysis +| | - Call `cpdAssignUStarTh20100901` to assign annual Cp arrays +| | +| |-- **Write Results** +| | - Write computed results to output files +| | - Append metadata and notes to result files +| | +| |-- **Handle Errors** +| | - Collect error messages in `error_str` +| | +|-- **End** From 65f265b350894bf512179740f7879a9a550e0cc6 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:42:00 +0100 Subject: [PATCH 04/17] Create cpdBootstrapUStarTh4Season20100901 --- .../cpdBootstrapUStarTh4Season20100901 | 38 +++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 diff --git a/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 b/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 new file mode 100644 index 00000000..0ce84b9f --- /dev/null +++ b/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 @@ -0,0 +1,38 @@ +# Functional Flow Diagram for `cpdBootstrapUStarTh4Season20100901` Function + +## Start +| +|-- **Initialize Function** +| - Define function with inputs: `t`, `NEE`, `uStar`, `T`, `fNight`, `fPlot`, `cSiteYr`, `nBoot` +| 
+|-- **Initialize Variables** +| - Calculate `nt` (length of `t`) +| - Determine `nPerDay` (number of periods per day based on median difference in `t`) +| - Identify night indices `iNight` +| - Filter out invalid `uStar` values (`uStar < 0` or `uStar > 4`) +| - Set parameters: `nSeasons = 4`, `nStrataN = 4`, `nStrataX = 8`, `nBins = 50`, `nPerBin = 5` +| - Adjust `nPerBin` based on `nPerDay` (24 or 48) +| - Calculate `nPerSeason` and `ntN` +| +|-- **Find Valid Data Indices** +| - Identify non-NaN indices for `NEE`, `uStar`, `T` +| - Intersect with night indices to get `itNee` +| - Initialize `StatsMT` structure with default NaN values +| +|-- **Initialize Result Arrays** +| - Initialize `Cp2`, `Cp3`, `Stats2`, `Stats3` arrays with NaN values +| +|-- **Check Sufficient Data** +| - If `ntNee` (valid nighttime data points) is greater than or equal to `ntN` +| | +| |-- **Bootstrap Loop** +| | - For each bootstrap iteration `iBoot` from 1 to `nBoot` +| | | - Record start time `t0` +| | | - Generate random indices `it` for resampling +| | | - Calculate `ntNee` for resampled data +| | | - Set `fPlot` to 0 if `iBoot` > 1 +| | | - Call `cpdEvaluateUStarTh4Season20100901` with resampled data +| | | - Store results in `Cp2`, `Cp3`, `Stats2`, `Stats3` +| | | - Calculate and log processing time +| +|-- **End** From 236c306d11d7f3c46170da7606469840ffb0a0ca Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:45:49 +0100 Subject: [PATCH 05/17] Create cpdEvaluateUStarTh4Season20100901 --- .../cpdEvaluateUStarTh4Season20100901 | 59 +++++++++++++++++++ 1 file changed, 59 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 diff --git a/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 b/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 new file mode 100644 index 00000000..9ffb6cd2 --- /dev/null +++ 
b/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 @@ -0,0 +1,59 @@ +# Functional Flow Diagram for `cpdEvaluateUStarTh4Season20100901` Function + +## Start +| +|-- **Initialize Function** +| - Define function with inputs: `t`, `NEE`, `uStar`, `T`, `fNight`, `fPlot`, `cSiteYr` +| +|-- **Initializations** +| - Calculate `nt` (length of `t`) +| - Extract year, month, day vectors from `t` using `fcDatevec` +| - Calculate `iYr` (median year) and `EndDOY` (end day of year) +| - Determine `nPerDay` (number of periods per day based on median difference in `t`) +| +|-- **Set Parameters** +| - Set parameters: `nSeasons = 4`, `nStrataN = 4`, `nStrataX = 8`, `nBins = 50`, `nPerBin = 5` +| - Adjust `nPerBin` based on `nPerDay` (24 or 48) +| - Calculate `nPerSeasonN` and `nN` +| +|-- **Filter Invalid Data** +| - Find and filter out invalid `uStar` values (`uStar < 0` or `uStar > 3`) +| - Identify valid indices for annual nighttime data (`itAnnual`) and calculate `ntAnnual` +| +|-- **Initialize Outputs** +| - Initialize `Cp2`, `Cp3`, `Stats2`, `Stats3` arrays with NaN values +| - Initialize `StatsMT` structure with default NaN values +| +|-- **Check Sufficient Data** +| - If `ntAnnual` (valid annual nighttime data points) is less than `nN`, return +| +|-- **Reorder Data** +| - Move December data to the beginning of the year and reorder +| - Update `t`, `T`, `uStar`, `NEE`, `fNight` accordingly +| - Recalculate `itAnnual` and `ntAnnual` +| +|-- **Reset Parameters** +| - Calculate `nSeasons` based on actual number of good data +| - Recalculate `nPerSeason` +| +|-- **Stratify Data** +| - Stratify data by time using moving windows and by temperature class +| - For each season and temperature class, estimate change points `Cp2` and `Cp3` +| +|-- **Plotting (if enabled)** +| - Initialize plot settings if `fPlot` is set to 1 +| - Plot results for each season and strata if `fPlot` is set to 1 +| +|-- **Season Loop** +| - For each season `iSeason` +| | - Determine indices for 
the season (`jtSeason`, `itSeason`) +| | - Calculate number of strata based on `ntSeason` +| | - Calculate temperature thresholds `TTh` +| | - For each strata `iStrata` +| | | - Find indices for the strata (`itStrata`) +| | | - Bin `uStar` and `NEE` data and calculate mean values +| | | - Call `cpdFindChangePoint20100901` to find change points +| | | - Add additional fields to `xs2` and `xs3` not assigned by change-point function +| | | - Store results in `Cp2`, `Cp3`, `Stats2`, `Stats3` +| +|-- **End** From 3f516b6761002d1bccd7c02c579a020a6ef0a912 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:49:00 +0100 Subject: [PATCH 06/17] Create cpdFindChangePoint20100901 --- .../ustar_cp/cpdFindChangePoint20100901 | 56 +++++++++++++++++++ 1 file changed, 56 insertions(+) create mode 100644 oneflux_steps/ustar_cp/cpdFindChangePoint20100901 diff --git a/oneflux_steps/ustar_cp/cpdFindChangePoint20100901 b/oneflux_steps/ustar_cp/cpdFindChangePoint20100901 new file mode 100644 index 00000000..4beff9f1 --- /dev/null +++ b/oneflux_steps/ustar_cp/cpdFindChangePoint20100901 @@ -0,0 +1,56 @@ +# Functional Flow Diagram for `cpdFindChangePoint20100901` Function + +## Start +| +|-- **Initialize Function** +| - Define function with inputs: `xx`, `yy`, `fPlot`, `cPlot` +| +|-- **Initialize Outputs** +| - Set `Cp2`, `Cp3` to NaN +| - Initialize `s2` and `s3` structures with default NaN values +| +|-- **Exclude Missing Data** +| - Reshape `xx` and `yy` to column vectors +| - Find and exclude NaN values in `xx` and `yy` +| - Calculate number of valid data points `n` +| - If `n` is less than 10, return +| +|-- **Exclude Extreme Linear Regression Outliers** +| - Perform linear regression on `x` and `y` to get regression coefficients `a` +| - Calculate predicted values `yHat` and residuals `dy` +| - Calculate mean `mdy` and standard deviation `sdy` of residuals +| - Find and exclude outliers beyond `ns` (4) standard 
deviations +| - Calculate number of valid data points `n` +| - If `n` is less than 10, return +| +|-- **Compute Null Hypothesis Models** +| - Compute mean of `y` as `yHat2` and `SSERed2` +| - Perform linear regression on `x` and `y` to get `yHat3` and `SSERed3` +| - Set `nRed2 = 1`, `nFull2 = 2`, `nRed3 = 2`, `nFull3 = 3` +| +|-- **Compute F Scores** +| - Initialize `MT`, `Fc2`, `Fc3` arrays with NaN values +| - Set `nEndPtsN = 3` and calculate `nEndPts` +| - Loop through each data point to compute F scores: +| | - Fit 2-parameter model and compute `Fc2` +| | - Fit 3-parameter model and compute `Fc3` +| +|-- **Assign Change Points** +| - Find `Fmax2` and `iCp2`, set `xCp2` +| - Perform linear regression for 2-parameter model and calculate `yHat2` +| - Calculate p-value `p2` and assign `Cp2` if significant +| - Find `Fmax3` and `iCp3`, set `xCp3` +| - Perform linear regression for 3-parameter model and calculate `yHat3` +| - Calculate p-value `p3` and assign `Cp3` if significant +| +|-- **Assign Values to s2 and s3** +| - Check if `iCp2` is within valid range, if so, assign values to `s2` +| - Check if `iCp3` is within valid range, if so, assign values to `s3` +| +|-- **Plot Results (if enabled)** +| - If `fPlot` is 1: +| | - Plot `x`, `y`, `yHat2`, `yHat3`, `xCp2`, `xCp3` +| | - Set plot title and adjust plot limits +| | - Format plot appearance +| +|-- **End** From 64b1ecd61bad8c92638094f5e71851fdf415ddc8 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:53:27 +0100 Subject: [PATCH 07/17] Create cpdFindChangePoint20100901.md --- .../ustar_cp/cpdFindChangePoint20100901.md | 56 +++++++++++++++++++ 1 file changed, 56 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901.md diff --git a/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901.md b/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901.md new file mode 100644 index 00000000..4beff9f1 --- /dev/null +++ 
b/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901.md @@ -0,0 +1,56 @@ +# Functional Flow Diagram for `cpdFindChangePoint20100901` Function + +## Start +| +|-- **Initialize Function** +| - Define function with inputs: `xx`, `yy`, `fPlot`, `cPlot` +| +|-- **Initialize Outputs** +| - Set `Cp2`, `Cp3` to NaN +| - Initialize `s2` and `s3` structures with default NaN values +| +|-- **Exclude Missing Data** +| - Reshape `xx` and `yy` to column vectors +| - Find and exclude NaN values in `xx` and `yy` +| - Calculate number of valid data points `n` +| - If `n` is less than 10, return +| +|-- **Exclude Extreme Linear Regression Outliers** +| - Perform linear regression on `x` and `y` to get regression coefficients `a` +| - Calculate predicted values `yHat` and residuals `dy` +| - Calculate mean `mdy` and standard deviation `sdy` of residuals +| - Find and exclude outliers beyond `ns` (4) standard deviations +| - Calculate number of valid data points `n` +| - If `n` is less than 10, return +| +|-- **Compute Null Hypothesis Models** +| - Compute mean of `y` as `yHat2` and `SSERed2` +| - Perform linear regression on `x` and `y` to get `yHat3` and `SSERed3` +| - Set `nRed2 = 1`, `nFull2 = 2`, `nRed3 = 2`, `nFull3 = 3` +| +|-- **Compute F Scores** +| - Initialize `MT`, `Fc2`, `Fc3` arrays with NaN values +| - Set `nEndPtsN = 3` and calculate `nEndPts` +| - Loop through each data point to compute F scores: +| | - Fit 2-parameter model and compute `Fc2` +| | - Fit 3-parameter model and compute `Fc3` +| +|-- **Assign Change Points** +| - Find `Fmax2` and `iCp2`, set `xCp2` +| - Perform linear regression for 2-parameter model and calculate `yHat2` +| - Calculate p-value `p2` and assign `Cp2` if significant +| - Find `Fmax3` and `iCp3`, set `xCp3` +| - Perform linear regression for 3-parameter model and calculate `yHat3` +| - Calculate p-value `p3` and assign `Cp3` if significant +| +|-- **Assign Values to s2 and s3** +| - Check if `iCp2` is within valid range, if so, assign 
values to `s2` +| - Check if `iCp3` is within valid range, if so, assign values to `s3` +| +|-- **Plot Results (if enabled)** +| - If `fPlot` is 1: +| | - Plot `x`, `y`, `yHat2`, `yHat3`, `xCp2`, `xCp3` +| | - Set plot title and adjust plot limits +| | - Format plot appearance +| +|-- **End** From 48137548b7f63eb3e9345dfe8e4990f3fc8a98bb Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:54:27 +0100 Subject: [PATCH 08/17] Rename cpdFindChangePoint20100901.md --- .../{cpdFindChangePoint20100901.md => cpdFindChangePoint20100901} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/flow_diagrams/ustar_cp/{cpdFindChangePoint20100901.md => cpdFindChangePoint20100901} (100%) diff --git a/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901.md b/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901 similarity index 100% rename from docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901.md rename to docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901 From c1af4fe0fffbc82b7fcab7db50e2d2b241e7f419 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:59:06 +0100 Subject: [PATCH 09/17] Create cpdAssignUStarTh20100901 --- .../ustar_cp/cpdAssignUStarTh20100901 | 53 +++++++++++++++++++ 1 file changed, 53 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 diff --git a/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 b/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 new file mode 100644 index 00000000..ed26515e --- /dev/null +++ b/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 @@ -0,0 +1,53 @@ +# Functional Flow Diagram for `cpdAssignUStarTh20100901` Function + +## Start +| +|-- **Initialize Function** +| - Define function with inputs: `Stats`, `fPlot`, `cSiteYr` +| +|-- **Initialize Outputs** +| - Set initial values for outputs: `CpA`, `nA`, `tW`, `CpW`, `fSelect`, `cMode`, 
`cFailure`, `sSine`, `FracSig`, `FracModeD`, `FracSelect` +| +|-- **Compute Window Sizes** +| - Determine the dimensions of `Stats` +| - Set parameters: `nWindowsN = 4`, `nSelectN = nWindowsN * nStrataN * nBoot` +| - Initialize arrays for `CpA`, `nA`, `tW`, `CpW` +| +|-- **Extract Variable Arrays from Stats Structure** +| - Extract variables `mt`, `Cp`, `b1`, `c2`, `cib1`, `cic2`, `p` from `Stats` +| - Convert extracted arrays to column vectors +| - Set significance flag `fP = (p <= 0.05)` +| +|-- **Determine Change-Point Model Type** +| - Check if `c2` values are NaN to determine if it's a 2-parameter model +| - Set `nPar` to 2 or 3 accordingly +| +|-- **Classify Change Points** +| - Identify indices for significant (`iSig`) and non-significant (`iNS`) change points +| - Classify significant change points into modes `ModeD` (`b1 >= c2`) and `ModeE` (`b1 < c2`) +| +|-- **Evaluate Primary Mode** +| - Determine the primary mode (Deficit or Excess) based on the number of significant change points in each mode +| - Update `fSelect`, `fModeD`, `fModeE`, and calculate `FracSig`, `FracModeD`, `FracSelect` +| +|-- **Abort if Too Few Detections** +| - Check if `FracSelect` is below 0.10, if so, set `cFailure` and return +| +|-- **Exclude Outliers** +| - Normalize variables and identify outliers +| - Update `fSelect` to exclude outliers +| - Recalculate `FracSig`, `FracModeD`, `FracSelect` +| - Check if the number of selected points is below `nSelectN`, if so, set `cFailure` and return +| +|-- **Aggregate Values to Season and Year** +| - Aggregate change points to season and year +| - Calculate mean `tW` and `CpW` for each window +| - Fit annual sine curve +| +|-- **Plot Results (if enabled)** +| - If `fPlot` is 1: +| | - Plot raw change points, selected change points, and annual sine curve +| | - Plot histograms of annual change points +| | - Plot fraction of selected points by window +| +|-- **End** From d7b036671d8e03f0a0dfe36debaf4926d185c7f1 Mon Sep 17 00:00:00 2001 
From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 12:04:32 +0100 Subject: [PATCH 10/17] Update cpdAssignUStarTh20100901 --- .../ustar_cp/cpdAssignUStarTh20100901 | 24 +++++++++++-------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 b/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 index ed26515e..f65efec0 100644 --- a/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 +++ b/docs/flow_diagrams/ustar_cp/cpdAssignUStarTh20100901 @@ -9,13 +9,13 @@ | - Set initial values for outputs: `CpA`, `nA`, `tW`, `CpW`, `fSelect`, `cMode`, `cFailure`, `sSine`, `FracSig`, `FracModeD`, `FracSelect` | |-- **Compute Window Sizes** -| - Determine the dimensions of `Stats` -| - Set parameters: `nWindowsN = 4`, `nSelectN = nWindowsN * nStrataN * nBoot` +| - Determine the dimensions of `Stats` using `ndims(Stats)` +| - Set parameters: `nWindowsN`, `nSelectN` | - Initialize arrays for `CpA`, `nA`, `tW`, `CpW` | |-- **Extract Variable Arrays from Stats Structure** -| - Extract variables `mt`, `Cp`, `b1`, `c2`, `cib1`, `cic2`, `p` from `Stats` -| - Convert extracted arrays to column vectors +| - Extract variables `mt`, `Cp`, `b1`, `c2`, `cib1`, `cic2`, `p` from `Stats` using `fcReadFields` +| - Convert extracted arrays to column vectors using `fcx2colvec` | - Set significance flag `fP = (p <= 0.05)` | |-- **Determine Change-Point Model Type** @@ -34,20 +34,24 @@ | - Check if `FracSelect` is below 0.10, if so, set `cFailure` and return | |-- **Exclude Outliers** -| - Normalize variables and identify outliers +| - Normalize variables and identify outliers using `nanmedian` and `fcNaniqr` | - Update `fSelect` to exclude outliers | - Recalculate `FracSig`, `FracModeD`, `FracSelect` | - Check if the number of selected points is below `nSelectN`, if so, set `cFailure` and return | |-- **Aggregate Values to Season and Year** -| - Aggregate change points to season and year -| - 
Calculate mean `tW` and `CpW` for each window -| - Fit annual sine curve +| - Aggregate change points to season and year using `nanmean` and `reshape` +| - Calculate mean `tW` and `CpW` for each window using `fcBin` +| +|-- **Fit Annual Sine Curve** +| - Fit annual sine curve using `nlinfit` and `fcEqnAnnualSine` +| - Calculate goodness of fit `r2` using `fcr2Calc` | |-- **Plot Results (if enabled)** | - If `fPlot` is 1: -| | - Plot raw change points, selected change points, and annual sine curve -| | - Plot histograms of annual change points +| | - Plot raw change points, selected change points, and annual sine curve using `plot` +| | - Plot histograms of annual change points using `hist` | | - Plot fraction of selected points by window +| | - Use functions `fcDatetick`, `fcFigLoc` | |-- **End** From 578fcded9e6b1bcddc35b5ac6656c5bd7921f0e5 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 12:09:13 +0100 Subject: [PATCH 11/17] Create run_launch --- docs/flow_diagrams/ustar_cp/run_launch | 48 ++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/run_launch diff --git a/docs/flow_diagrams/ustar_cp/run_launch b/docs/flow_diagrams/ustar_cp/run_launch new file mode 100644 index 00000000..5340b83f --- /dev/null +++ b/docs/flow_diagrams/ustar_cp/run_launch @@ -0,0 +1,48 @@ +# Functional Flow Diagram for Shell Script + +## Start +| +|-- **Initialize Script** +| - Define `exe_name` as `$0` +| - Define `exe_dir` as `dirname "$0"` +| - Print "------------------------------------------" +| +|-- **Check for Arguments** +| - If no arguments (`x$1 = "x"`): +| | - Print "Usage:" +| | - Print script name and expected arguments +| - Else: +| | - Print "Setting up environment variables" +| | - Set `MCRROOT` to the first argument +| | - Print "---" +| | +| |-- **Setup Environment Variables** +| | - Set `LD_LIBRARY_PATH` to include: +| | - Current directory `.` +| | - 
`${MCRROOT}/runtime/glnxa64` +| | - `${MCRROOT}/bin/glnxa64` +| | - `${MCRROOT}/sys/os/glnxa64` +| | - Set `MCRJRE` to `${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64` +| | - Extend `LD_LIBRARY_PATH` to include: +| | - `${MCRJRE}/native_threads` +| | - `${MCRJRE}/server` +| | - `${MCRJRE}/client` +| | - `${MCRJRE}` +| | - Set `XAPPLRESDIR` to `${MCRROOT}/X11/app-defaults` +| | - Export `LD_LIBRARY_PATH` +| | - Export `XAPPLRESDIR` +| | - Print `LD_LIBRARY_PATH` +| +| |-- **Prepare Arguments for Execution** +| | - Shift to remove the first argument +| | - Initialize `args` as empty +| | - While there are more arguments (`$# > 0`): +| | - Escape spaces in the argument using `sed 's/ /\\\\ /g'` +| | - Append escaped argument to `args` +| | - Shift to next argument +| | +| |-- **Execute Command** +| | - Run `"${exe_dir}"/launch $args` +| +|-- **End** +| - Exit script From 854e361cf1034ee777612e5ce900a5da6260c706 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 12:11:47 +0100 Subject: [PATCH 12/17] Update launch --- docs/flow_diagrams/ustar_cp/launch | 116 ++++++++++++++++++----------- 1 file changed, 72 insertions(+), 44 deletions(-) diff --git a/docs/flow_diagrams/ustar_cp/launch b/docs/flow_diagrams/ustar_cp/launch index 12df4b4f..6cac0b5f 100644 --- a/docs/flow_diagrams/ustar_cp/launch +++ b/docs/flow_diagrams/ustar_cp/launch @@ -3,49 +3,77 @@ ## Start | |-- **Initialize Function** +| - Define function with inputs: `input_folder`, `output_folder` | - Set `exitcode` to 0 -| - Turn off warnings -| -|-- **Check and Normalize Input Folder Path** -| - If `input_folder` not provided, use current directory -| - If relative path, convert to absolute path -| - Ensure path ends with a slash or backslash -| -|-- **Check and Normalize Output Folder Path** -| - If `output_folder` not provided, use current directory -| - If relative path, convert to absolute path -| - Ensure path ends with a slash or backslash -| - 
Create output directory if it doesn't exist -| -|-- **Identify Files to Process** -| - List all CSV files matching pattern `*_qca_ustar_*.csv` -| -|-- **Initialize Error String Array** -| -|-- **Process Each Identified File** -| | -| |-- **Open and Read File** -| | - Extract metadata (site, year, lat, lon, timezone, htower, timeres, sc_negl, notes) -| | - Import data from file -| | -| |-- **Identify Columns** -| | - Identify indices for `USTAR`, `NEE`, `TA`, `PPFD_IN`, `SW_IN` -| | - Handle errors if columns are not found -| | -| |-- **Extract and Validate Data** -| | - Extract data for identified columns -| | - Replace invalid data values with `NaN` -| | - Handle `PPFD` data derived from `SW_IN` if necessary -| | -| |-- **uStar Threshold Computation** -| | - Call `cpdBootstrapUStarTh4Season20100901` for 4-season analysis -| | - Call `cpdAssignUStarTh20100901` to assign annual Cp arrays -| | -| |-- **Write Results** -| | - Write computed results to output files -| | - Append metadata and notes to result files -| | -| |-- **Handle Errors** -| | - Collect error messages in `error_str` -| | +| - Turn off warnings with `warning off` +| +|-- **Check Input Path** +| - Use `exist('input_folder')` to check if `input_folder` exists +| - If not, set `input_folder` to current directory using `pwd` +| - Ensure `input_folder` ends with a backslash or forward slash +| +|-- **Check Output Path** +| - Use `exist('output_folder')` to check if `output_folder` exists +| - If not, set `output_folder` to current directory using `pwd` +| - Ensure `output_folder` ends with a backslash or forward slash +| - Create output directory with `mkdir(output_folder)` +| +|-- **Initialize Variables** +| - Set indices: `USTAR_INDEX`, `NEE_INDEX`, `TA_INDEX`, `PPFD_INDEX`, `RG_INDEX` +| - Define `input_columns_names` +| - Print message with `fprintf` +| - Use `dir` to list files in `input_folder` matching pattern `*_qca_ustar_*.csv` +| - Print number of files found with `fprintf('%d files 
founded.\n\n', numel(d))` +| +|-- **Process Each File** +| - Loop through each file `d(n)` +| | - Print processing message with `fprintf` +| | - Open file with `fid = fopen([input_folder,d(n).name] ,'r')` +| | - If file cannot be opened (`fid == -1`), print error message and continue +| | - Read dataset with `textscan(fid,'%[^\n]')` +| +| |-- **Extract Metadata** +| | - Check and extract `site`, `year`, `lat`, `lon`, `timezone`, `htower`, `timeres`, `sc_negl`, `notes` using `strncmpi` and `strrep` +| | - If any metadata is missing, print error message and continue +| | - Close file with `fclose(fid)` +| +| |-- **Import Data** +| | - Import data using `importdata` +| | - Extract `header` and `data` from `imported_data` +| | - Initialize `columns_index` +| | - Parse header to find column indices using `strcmpi` +| | - If any required column is missing, print error message and continue +| +| |-- **Assign Variables** +| | - Extract data columns for `uStar`, `NEE`, `Ta`, `Rg` +| | - If `PPFD` is missing, calculate from `Rg` +| | - Handle missing data by setting invalid values to `NaN` +| +| |-- **Check Data Validity** +| | - Check if `NEE`, `uStar`, `Ta`, `Rg` are empty, print error message and continue if any are empty +| +| |-- **Calculate Time Vector** +| | - Calculate `t` based on `uStar` length and number of periods per day +| +| |-- **Flag Nighttime Periods** +| | - Set `fNight` where `Rg < 5` +| | - Set `T` to `Ta` +| +| |-- **Plot Inputs (if enabled)** +| | - If `fPlot` is enabled, plot `t, uStar, NEE, Ta, PPFD, Rg` using `plot` and `mydatetick` +| +| |-- **Call Bootstrap Program** +| | - Call `cpdBootstrapUStarTh4Season20100901` with `t, NEE, uStar, T, fNight, fPlot, cSiteYr, nBoot` +| | - Assign results to `Cp2`, `Stats2`, `Cp3`, `Stats3` +| | - Call `cpdAssignUStarTh20100901` with `Stats2, fPlot, cSiteYr` +| +| |-- **Save Results** +| | - If no failure (`isempty(cFailure)`), save results using `dlmwrite` and `fopen`/`fprintf`/`fclose` +| | - Print success 
message with `fprintf('ok\n')` +| | - If failure, print error message and append to `error_str` +| +|-- **Clear Variables** +| - Clear variables used in loop with `clear` +| |-- **End** +| - Return `exitcode` From 50c53a5683ddd4727190a86b1cf180c2c4d03840 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 12:14:07 +0100 Subject: [PATCH 13/17] Update cpdBootstrapUStarTh4Season20100901 --- .../cpdBootstrapUStarTh4Season20100901 | 73 +++++++++++-------- 1 file changed, 44 insertions(+), 29 deletions(-) diff --git a/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 b/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 index 0ce84b9f..cb9573b6 100644 --- a/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 +++ b/docs/flow_diagrams/ustar_cp/cpdBootstrapUStarTh4Season20100901 @@ -5,34 +5,49 @@ |-- **Initialize Function** | - Define function with inputs: `t`, `NEE`, `uStar`, `T`, `fNight`, `fPlot`, `cSiteYr`, `nBoot` | -|-- **Initialize Variables** -| - Calculate `nt` (length of `t`) -| - Determine `nPerDay` (number of periods per day based on median difference in `t`) -| - Identify night indices `iNight` -| - Filter out invalid `uStar` values (`uStar < 0` or `uStar > 4`) -| - Set parameters: `nSeasons = 4`, `nStrataN = 4`, `nStrataX = 8`, `nBins = 50`, `nPerBin = 5` -| - Adjust `nPerBin` based on `nPerDay` (24 or 48) -| - Calculate `nPerSeason` and `ntN` -| -|-- **Find Valid Data Indices** -| - Identify non-NaN indices for `NEE`, `uStar`, `T` -| - Intersect with night indices to get `itNee` -| - Initialize `StatsMT` structure with default NaN values -| -|-- **Initialize Result Arrays** -| - Initialize `Cp2`, `Cp3`, `Stats2`, `Stats3` arrays with NaN values -| -|-- **Check Sufficient Data** -| - If `ntNee` (valid nighttime data points) is greater than or equal to `ntN` -| | -| |-- **Bootstrap Loop** -| | - For each bootstrap iteration `iBoot` from 1 to `nBoot` -| | | - 
Record start time `t0` -| | | - Generate random indices `it` for resampling -| | | - Calculate `ntNee` for resampled data -| | | - Set `fPlot` to 0 if `iBoot` > 1 -| | | - Call `cpdEvaluateUStarTh4Season20100901` with resampled data -| | | - Store results in `Cp2`, `Cp3`, `Stats2`, `Stats3` -| | | - Calculate and log processing time +|-- **Calculate Variables** +| - Calculate `nt` as `length(t)` +| - Calculate `nPerDay` using `round(1/nanmedian(diff(t)))` (MATLAB stats toolbox) +| +|-- **Initialize Arrays and Constants** +| - Identify night periods with `find(fNight)` +| - Identify invalid `uStar` values with `find(uStar<0 | uStar>4)` and set them to `NaN` +| - Define constants: +| - `nSeasons=4` +| - `nStrataN=4` +| - `nStrataX=8` +| - `nBins=50` +| - `nPerBin=3` (if `nPerDay=24`) or `nPerBin=5` (if `nPerDay=48`) +| - Calculate `nPerSeason` as `nStrataN * nBins * nPerBin` +| - Calculate `ntN` as `nSeasons * nPerSeason` +| +|-- **Find Valid Data Points** +| - Find valid indices `itNee` with `find(~isnan(NEE + uStar + T))` +| - Intersect with night periods `itNee = intersect(itNee, iNight)` +| - Calculate `ntNee` as `length(itNee)` +| +|-- **Initialize Statistics Structures** +| - Define `StatsMT` structure with fields initialized to `NaN` +| - Initialize `Cp2`, `Cp3`, `Stats2`, and `Stats3` as arrays of `NaN` +| +|-- **Check Data Sufficiency** +| - Check if `ntNee >= ntN`, otherwise skip bootstrapping +| +|-- **Bootstrapping Loop** +| - For each bootstrap iteration `iBoot`: +| | - Record start time with `t0 = now` +| | - Generate random indices `it = sort(randi(nt, nt, 1))` +| | - Calculate `ntNee` as `sum(ismember(it, itNee))` +| | - Set `fPlot` to 0 after the first iteration +| +| |-- **Evaluate uStar Thresholds** +| | - Call `cpdEvaluateUStarTh4Season20100901(t(it), NEE(it), uStar(it), T(it), fNight(it), fPlot, cSiteYr)` +| | - Assign results to `xCp2`, `xStats2`, `xCp3`, `xStats3` +| | - Calculate elapsed time `dt = (now - t0) * 24 * 60 * 60` +| +| |-- **Store 
Results** +| | - Store results in `Cp2(:,:,iBoot) = xCp2`, `Stats2(:,:,iBoot) = xStats2` +| | - Store results in `Cp3(:,:,iBoot) = xCp3`, `Stats3(:,:,iBoot) = xStats3` | |-- **End** +| - Return `Cp2`, `Stats2`, `Cp3`, `Stats3` From 69be6dd06a0ad84f5c8e818b5b9c24a032df7192 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 12:16:25 +0100 Subject: [PATCH 14/17] Update cpdEvaluateUStarTh4Season20100901 --- .../cpdEvaluateUStarTh4Season20100901 | 92 ++++++++++--------- 1 file changed, 50 insertions(+), 42 deletions(-) diff --git a/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 b/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 index 9ffb6cd2..62b4ea7d 100644 --- a/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 +++ b/docs/flow_diagrams/ustar_cp/cpdEvaluateUStarTh4Season20100901 @@ -6,54 +6,62 @@ | - Define function with inputs: `t`, `NEE`, `uStar`, `T`, `fNight`, `fPlot`, `cSiteYr` | |-- **Initializations** -| - Calculate `nt` (length of `t`) -| - Extract year, month, day vectors from `t` using `fcDatevec` -| - Calculate `iYr` (median year) and `EndDOY` (end day of year) -| - Determine `nPerDay` (number of periods per day based on median difference in `t`) +| - Calculate `nt` as `length(t)` +| - Use `fcDatevec(t)` to extract `y`, `m`, `d` +| - Calculate `iYr` as `median(y)` +| - Calculate `EndDOY` using `fcDoy(datenum(iYr,12,31.5))` +| - Calculate `nPerDay` using `round(1/nanmedian(diff(t)))` (MATLAB stats toolbox) | -|-- **Set Parameters** -| - Set parameters: `nSeasons = 4`, `nStrataN = 4`, `nStrataX = 8`, `nBins = 50`, `nPerBin = 5` -| - Adjust `nPerBin` based on `nPerDay` (24 or 48) -| - Calculate `nPerSeasonN` and `nN` +|-- **Define Constants** +| - Define constants: +| - `nSeasons=4` +| - `nStrataN=4` +| - `nStrataX=8` +| - `nBins=50` +| - `nPerBin=3` (if `nPerDay=24`) or `nPerBin=5` (if `nPerDay=48`) +| - Calculate `nPerSeasonN` as `nStrataN * nBins * 
nPerBin` +| - Calculate `nN` as `nSeasons * nPerSeasonN` | -|-- **Filter Invalid Data** -| - Find and filter out invalid `uStar` values (`uStar < 0` or `uStar > 3`) -| - Identify valid indices for annual nighttime data (`itAnnual`) and calculate `ntAnnual` +|-- **Filter uStar Values** +| - Identify and set invalid `uStar` values to `NaN` using `find(uStar < 0 | uStar > 3)` +| +|-- **Find Valid Data Points** +| - Find valid indices `itAnnual` using `find(fNight == 1 & ~isnan(NEE + uStar + T))` +| - Calculate `ntAnnual` as `length(itAnnual)` | |-- **Initialize Outputs** -| - Initialize `Cp2`, `Cp3`, `Stats2`, `Stats3` arrays with NaN values -| - Initialize `StatsMT` structure with default NaN values +| - Initialize `Cp2` and `Cp3` as `NaN` arrays of size `nSeasons x nStrataX` +| - Initialize `Stats2` and `Stats3` as structures with fields: +| - `n`, `Cp`, `Fmax`, `p`, `b0`, `b1`, `b2`, `c2`, `cib0`, `cib1`, `cic2`, `mt`, `ti`, `tf`, `ruStarVsT`, `puStarVsT`, `mT`, `ciT` all set to `NaN` | -|-- **Check Sufficient Data** -| - If `ntAnnual` (valid annual nighttime data points) is less than `nN`, return +|-- **Check Data Sufficiency** +| - If `ntAnnual < nN`, return +| - Calculate `nPerSeason` as `round(ntAnnual / nSeasons)` | -|-- **Reorder Data** -| - Move December data to the beginning of the year and reorder +|-- **Reorder Data for Analysis** +| - Reorder data to move December to the beginning of the year | - Update `t`, `T`, `uStar`, `NEE`, `fNight` accordingly -| - Recalculate `itAnnual` and `ntAnnual` -| -|-- **Reset Parameters** -| - Calculate `nSeasons` based on actual number of good data -| - Recalculate `nPerSeason` -| -|-- **Stratify Data** -| - Stratify data by time using moving windows and by temperature class -| - For each season and temperature class, estimate change points `Cp2` and `Cp3` -| -|-- **Plotting (if enabled)** -| - Initialize plot settings if `fPlot` is set to 1 -| - Plot results for each season and strata if `fPlot` is set to 1 -| -|-- 
**Season Loop** -| - For each season `iSeason` -| | - Determine indices for the season (`jtSeason`, `itSeason`) -| | - Calculate number of strata based on `ntSeason` -| | - Calculate temperature thresholds `TTh` -| | - For each strata `iStrata` -| | | - Find indices for the strata (`itStrata`) -| | | - Bin `uStar` and `NEE` data and calculate mean values -| | | - Call `cpdFindChangePoint20100901` to find change points -| | | - Add additional fields to `xs2` and `xs3` not assigned by change-point function -| | | - Store results in `Cp2`, `Cp3`, `Stats2`, `Stats3` +| - Recalculate `itAnnual` with the new order +| +|-- **Reset Seasons Based on Valid Data** +| - Calculate `nSeasons` as `round(ntAnnual / nPerSeason)` +| - Update `nPerSeason` as `round(ntAnnual / nSeasons)` +| +|-- **Stratify Data and Estimate Change Points** +| - For each season `iSeason`: +| | - Define `jtSeason` range for the season +| | - Find indices `itSeason` for the current season +| | - Calculate `nStrata` based on data points per bin +| | - Adjust `nStrata` within bounds `nStrataN` and `nStrataX` +| | - Calculate temperature thresholds `TTh` using `prctile` +| | +| |-- **Process Each Temperature Stratum** +| | - For each stratum `iStrata`: +| | | - Identify `itStrata` within temperature range +| | | - Bin `uStar` and `NEE` using `fcBin` +| | | - Call `cpdFindChangePoint20100901(muStar, mNEE, fPlot, cPlot)` +| | | - Calculate additional statistics: `mt`, `ti`, `tf`, `ruStarVsT`, `puStarVsT`, `mT`, `ciT` +| | | - Assign results to `Cp2(iSeason, iStrata)`, `Stats2(iSeason, iStrata)`, `Cp3(iSeason, iStrata)`, `Stats3(iSeason, iStrata)` | |-- **End** +| - Return `Cp2`, `Stats2`, `Cp3`, `Stats3` From 71f569d0d802b201e6eb8d26161771157779e440 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 12:19:11 +0100 Subject: [PATCH 15/17] Update cpdFindChangePoint20100901 --- .../ustar_cp/cpdFindChangePoint20100901 | 87 ++++++++++--------- 1 file 
changed, 45 insertions(+), 42 deletions(-) diff --git a/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901 b/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901 index 4beff9f1..a05f35cf 100644 --- a/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901 +++ b/docs/flow_diagrams/ustar_cp/cpdFindChangePoint20100901 @@ -6,51 +6,54 @@ | - Define function with inputs: `xx`, `yy`, `fPlot`, `cPlot` | |-- **Initialize Outputs** -| - Set `Cp2`, `Cp3` to NaN -| - Initialize `s2` and `s3` structures with default NaN values +| - Set `Cp2 = NaN`, `Cp3 = NaN` +| - Initialize structure `s2` and `s3` with fields: `n`, `Cp`, `Fmax`, `p`, `b0`, `b1`, `b2`, `c2`, `cib0`, `cib1`, `cic2` all set to `NaN` | |-- **Exclude Missing Data** -| - Reshape `xx` and `yy` to column vectors -| - Find and exclude NaN values in `xx` and `yy` -| - Calculate number of valid data points `n` -| - If `n` is less than 10, return +| - Reshape `xx` and `yy` to column vectors `x` and `y` +| - Identify and remove `NaN` values from `x` and `y` +| - Calculate `n` as `length(x + y)` +| - If `n < 10`, return | |-- **Exclude Extreme Linear Regression Outliers** -| - Perform linear regression on `x` and `y` to get regression coefficients `a` -| - Calculate predicted values `yHat` and residuals `dy` -| - Calculate mean `mdy` and standard deviation `sdy` of residuals -| - Find and exclude outliers beyond `ns` (4) standard deviations -| - Calculate number of valid data points `n` -| - If `n` is less than 10, return -| -|-- **Compute Null Hypothesis Models** -| - Compute mean of `y` as `yHat2` and `SSERed2` -| - Perform linear regression on `x` and `y` to get `yHat3` and `SSERed3` -| - Set `nRed2 = 1`, `nFull2 = 2`, `nRed3 = 2`, `nFull3 = 3` -| -|-- **Compute F Scores** -| - Initialize `MT`, `Fc2`, `Fc3` arrays with NaN values -| - Set `nEndPtsN = 3` and calculate `nEndPts` -| - Loop through each data point to compute F scores: -| | - Fit 2-parameter model and compute `Fc2` -| | - Fit 3-parameter model and 
compute `Fc3` -| -|-- **Assign Change Points** -| - Find `Fmax2` and `iCp2`, set `xCp2` -| - Perform linear regression for 2-parameter model and calculate `yHat2` -| - Calculate p-value `p2` and assign `Cp2` if significant -| - Find `Fmax3` and `iCp3`, set `xCp3` -| - Perform linear regression for 3-parameter model and calculate `yHat3` -| - Calculate p-value `p3` and assign `Cp3` if significant -| -|-- **Assign Values to s2 and s3** -| - Check if `iCp2` is within valid range, if so, assign values to `s2` -| - Check if `iCp3` is within valid range, if so, assign values to `s3` -| -|-- **Plot Results (if enabled)** -| - If `fPlot` is 1: -| | - Plot `x`, `y`, `yHat2`, `yHat3`, `xCp2`, `xCp3` -| | - Set plot title and adjust plot limits -| | - Format plot appearance +| - Perform linear regression `a = [ones(n,1) x] \ y` (MATLAB regress) +| - Calculate `yHat`, `dy`, `mdy`, `sdy` +| - Identify outliers `iOut` based on `ns * sdy` threshold +| - Remove outliers from `x` and `y` +| - Recalculate `n` as `length(x + y)` +| - If `n < 10`, return +| +|-- **Compute Statistics for Reduced Models** +| - Calculate `yHat2 = mean(y)` +| - Calculate `SSERed2 = sum((y - yHat2).^2)` +| - Perform linear regression `a = [ones(n,1) x] \ y` (MATLAB regress) +| - Calculate `yHat3`, `SSERed3`, `nRed2`, `nFull2`, `nRed3`, `nFull3` +| +|-- **Compute F Score for Each Data Point** +| - Initialize `Fc2`, `Fc3`, and `MT` as `NaN` arrays +| - Define `nEndPtsN = 3`, calculate `nEndPts` +| - For each data point `i`: +| | - Fit 2-parameter model and calculate `Fc2(i)` +| | - Fit 3-parameter model and calculate `Fc3(i)` +| +|-- **Assign Change Points and Test Significance** +| - Calculate `Fmax2` and `iCp2` from `Fc2` +| - Perform linear regression for `Cp2` and test significance using `cpdFmax2pCp2(Fmax2, n)` +| - Assign `Cp2`, if `p2 > pSig`, set `Cp2 = NaN` +| - Calculate `Fmax3` and `iCp3` from `Fc3` +| - Perform linear regression for `Cp3` and test significance using `cpdFmax2pCp3(Fmax3, n)` +| - 
Assign `Cp3`, if `p3 > pSig`, set `Cp3 = NaN` +| +|-- **Assign Values to `s2` and `s3`** +| - If `iCp2` not too close to end points: +| | - Assign values to `s2` fields: `n`, `Cp`, `Fmax`, `p`, `b0`, `b1`, `cib0`, `cib1` +| - If `iCp3` not too close to end points: +| | - Assign values to `s3` fields: `n`, `Cp`, `Fmax`, `p`, `b0`, `b1`, `b2`, `c2`, `cib0`, `cib1`, `cic2` +| +|-- **Plot Results (if `fPlot == 1`)** +| - Use `cla`, `hold on`, `plot`, `grid on`, `box on` for plotting +| - Plot `x`, `y`, `yHat2`, `yHat3`, change points `xCp2`, `xCp3` +| - Set plot title, axis limits, and font size | |-- **End** +| - Return `Cp2`, `s2`, `Cp3`, `s3` From 2bcfc2d5a080b4282c5193d56d495138918225c3 Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 14:40:45 +0100 Subject: [PATCH 16/17] Create ustar_cp --- docs/flow_diagrams/ustar_cp/ustar_cp | 124 +++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 docs/flow_diagrams/ustar_cp/ustar_cp diff --git a/docs/flow_diagrams/ustar_cp/ustar_cp b/docs/flow_diagrams/ustar_cp/ustar_cp new file mode 100644 index 00000000..e2973c7e --- /dev/null +++ b/docs/flow_diagrams/ustar_cp/ustar_cp @@ -0,0 +1,124 @@ +# uStarTh Method Flow Diagram + +## Overview + +The `ustar_cp` method processes and evaluates the u* threshold (uStarTh) using change-point detection (cpd) for site-year data. The process involves the following main steps: + +1. **Initialization** +2. **Input and Output Path Validation** +3. **Data Loading and Parsing** +4. **Data Filtering** +5. **Data Transformation** +6. **Model Fitting and Change-Point Detection** +7. **Result Aggregation** +8. **Output Writing** + +## Flow Diagram + +### Detailed Steps and Function Calls + +1. **Initialization** + - Set initial parameters and preallocate arrays for outputs. + - **Functions**: `nanmedian()`, `ones()`, `NaN()` + - **Purpose**: Prepares the environment and variables for subsequent computations. 
+ - **Code Example**: + ```matlab + exitcode = 0; + warning off; + USTAR_INDEX = 1; + NEE_INDEX = 2; + TA_INDEX = 3; + PPFD_INDEX = 4; + RG_INDEX = 5; + input_columns_names = {'USTAR', 'NEE', 'TA', 'PPFD_IN', 'SW_IN'}; + ``` + +2. **Input and Output Path Validation** + - Validate and set input and output paths. + - **Functions**: `exist()`, `pwd()`, `mkdir()` + - **Purpose**: Ensures correct paths are used for reading input data and writing output data. + - **Code Example**: + ```matlab + if 0 == exist('input_folder') + input_folder = [pwd '\']; + end + if 0 == exist('output_folder') + output_folder = [pwd '\']; + end + mkdir(output_folder); + ``` + +3. **Data Loading and Parsing** + - Load and parse data from input files. + - **Functions**: `dir()`, `fopen()`, `textscan()`, `importdata()` + - **Purpose**: Reads data from specified input files and prepares it for processing. + - **Code Example**: + ```matlab + d = dir([input_folder, '*_qca_ustar_*.csv']); + fid = fopen([input_folder, d(n).name], 'r'); + dataset = textscan(fid, '%[^\n]'); + imported_data = importdata([input_folder, d(n).name], ',', 9 + length(notes)); + ``` + +4. **Data Filtering** + - Exclude invalid data points and outliers. + - **Functions**: `find()`, `sum()`, `isnan()` + - **Purpose**: Ensures data quality by removing invalid or extreme values. + - **Code Example**: + ```matlab + uStar = data(:, columns_index(USTAR_INDEX)); + NEE = data(:, columns_index(NEE_INDEX)); + Ta = data(:, columns_index(TA_INDEX)); + Rg = data(:, columns_index(RG_INDEX)); + uStar(uStar == -9999) = NaN; + NEE(NEE == -9999) = NaN; + ``` + +5. **Data Transformation** + - Derive the number of records per day and start the time vector. + - **Functions**: `mod()`, `sum()` + - **Purpose**: Gives the change-point routines a time axis `t` in fractional days, with `nrPerDay` inferred from the record count. + - **Code Example**: + ```matlab + nrPerDay = mod(numel(uStar), 365); + if nrPerDay == 0; nrPerDay = mod(numel(uStar), 364); end + t = 1 + (1 / nrPerDay); + ``` + +6. 
**Model Fitting and Change-Point Detection** + - Fit models to the data and detect change-points. + - **Custom Functions**: `cpdBootstrapUStarTh4Season20100901()`, `cpdAssignUStarTh20100901()` + - **Purpose**: Models the relationship between variables and detects points where statistical properties change. + - **Code Example**: + ```matlab + [Cp2, Stats2, Cp3, Stats3] = cpdBootstrapUStarTh4Season20100901(t, NEE, uStar, T, fNight, fPlot, cSiteYr, nBoot); + [Cp, n, tW, CpW, cMode, cFailure, fSelect, sSine, FracSig, FracModeD, FracSelect] = cpdAssignUStarTh20100901(Stats2, fPlot, cSiteYr); + ``` + +7. **Result Aggregation** + - Aggregate and summarize results. + - **Functions**: `dlmwrite()`, `fopen()`, `fprintf()`, `datestr()` + - **Purpose**: Summarizes and aggregates results for final output. + - **Code Example**: + ```matlab + dlmwrite([output_folder, char(site), '_uscp_', char(year), '.txt'], Cp, 'precision', 8); + fid = fopen([output_folder, char(site), '_uscp_', char(year), '.txt'], 'a'); + fprintf(fid, '\n;processed with ustar_mp 1.0 on %s\n', datestr(clock)); + fclose(fid); + ``` + +8. **Output Writing** + - Write the final output to specified files. + - **Functions**: `dlmwrite()`, `fopen()`, `fprintf()` + - **Purpose**: Ensures the results are saved to the output files for further analysis or reporting. + - **Code Example**: + ```matlab + dlmwrite([output_folder, char(site), '_uscp_', char(year), '.txt'], Cp, 'precision', 8); + fid = fopen([output_folder, char(site), '_uscp_', char(year), '.txt'], 'a'); + fprintf(fid, '\n;processed with ustar_mp 1.0 on %s\n', datestr(clock)); + fclose(fid); + ``` + +## Conclusion + +The `ustar_cp` method loads and filters site-year data, fits change-point models, and aggregates the results to evaluate the u* threshold. Each step combines MATLAB built-in functions with the custom `cpd*` routines named above. 
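As a hedged illustration of step 5, the sketch below checks that the `mod`-based expression recovers the number of records per day for common site-year lengths, then builds a full time vector. The record counts, the variable names `recordCounts`, `nRec`, and `perDay`, and the way `t` is extended beyond its first element are assumptions made for this example; the excerpt above only shows the first value of `t`.

```matlab
% Record counts for half-hourly and hourly data in normal and leap years.
% These four cases are illustrative inputs, not values taken from the patch.
recordCounts = [17520 17568 8760 8784];
nrPerDay = zeros(size(recordCounts));

for k = 1:numel(recordCounts)
    nRec = recordCounts(k);
    % Same two-stage expression as in the step 5 excerpt.
    nrPerDay(k) = mod(nRec, 365);
    if nrPerDay(k) == 0
        nrPerDay(k) = mod(nRec, 364);
    end
end
% nrPerDay is now [48 48 24 24]

% Assumed continuation: every later timestamp advances by one record length,
% so t(1) = 1 + 1/perDay matches the excerpt and t is in fractional days.
nRec = 17520;
perDay = 48;
t = 1 + (1:nRec)' / perDay;
```

The fallback to 364 matters because a non-leap year divides evenly by 365 (for example 17520 = 365 * 48), which would otherwise leave `nrPerDay` at zero.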
From 17bf933b27d752d077ade97288198e264c3fdc7a Mon Sep 17 00:00:00 2001 From: James Emberton <60827102+j-emberton@users.noreply.github.com> Date: Mon, 22 Jul 2024 14:51:23 +0100 Subject: [PATCH 17/17] Delete oneflux_steps/ustar_cp/cpdFindChangePoint20100901 --- .../ustar_cp/cpdFindChangePoint20100901 | 56 ------------------- 1 file changed, 56 deletions(-) delete mode 100644 oneflux_steps/ustar_cp/cpdFindChangePoint20100901 diff --git a/oneflux_steps/ustar_cp/cpdFindChangePoint20100901 b/oneflux_steps/ustar_cp/cpdFindChangePoint20100901 deleted file mode 100644 index 4beff9f1..00000000 --- a/oneflux_steps/ustar_cp/cpdFindChangePoint20100901 +++ /dev/null @@ -1,56 +0,0 @@ -# Functional Flow Diagram for `cpdFindChangePoint20100901` Function - -## Start -| -|-- **Initialize Function** -| - Define function with inputs: `xx`, `yy`, `fPlot`, `cPlot` -| -|-- **Initialize Outputs** -| - Set `Cp2`, `Cp3` to NaN -| - Initialize `s2` and `s3` structures with default NaN values -| -|-- **Exclude Missing Data** -| - Reshape `xx` and `yy` to column vectors -| - Find and exclude NaN values in `xx` and `yy` -| - Calculate number of valid data points `n` -| - If `n` is less than 10, return -| -|-- **Exclude Extreme Linear Regression Outliers** -| - Perform linear regression on `x` and `y` to get regression coefficients `a` -| - Calculate predicted values `yHat` and residuals `dy` -| - Calculate mean `mdy` and standard deviation `sdy` of residuals -| - Find and exclude outliers beyond `ns` (4) standard deviations -| - Calculate number of valid data points `n` -| - If `n` is less than 10, return -| -|-- **Compute Null Hypothesis Models** -| - Compute mean of `y` as `yHat2` and `SSERed2` -| - Perform linear regression on `x` and `y` to get `yHat3` and `SSERed3` -| - Set `nRed2 = 1`, `nFull2 = 2`, `nRed3 = 2`, `nFull3 = 3` -| -|-- **Compute F Scores** -| - Initialize `MT`, `Fc2`, `Fc3` arrays with NaN values -| - Set `nEndPtsN = 3` and calculate `nEndPts` -| - Loop through each data 
point to compute F scores: -| | - Fit 2-parameter model and compute `Fc2` -| | - Fit 3-parameter model and compute `Fc3` -| -|-- **Assign Change Points** -| - Find `Fmax2` and `iCp2`, set `xCp2` -| - Perform linear regression for 2-parameter model and calculate `yHat2` -| - Calculate p-value `p2` and assign `Cp2` if significant -| - Find `Fmax3` and `iCp3`, set `xCp3` -| - Perform linear regression for 3-parameter model and calculate `yHat3` -| - Calculate p-value `p3` and assign `Cp3` if significant -| -|-- **Assign Values to s2 and s3** -| - Check if `iCp2` is within valid range, if so, assign values to `s2` -| - Check if `iCp3` is within valid range, if so, assign values to `s3` -| -|-- **Plot Results (if enabled)** -| - If `fPlot` is 1: -| | - Plot `x`, `y`, `yHat2`, `yHat3`, `xCp2`, `xCp3` -| | - Set plot title and adjust plot limits -| | - Format plot appearance -| -|-- **End**
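The F-score scan for the 2-parameter model described in the diagram can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the repository's implementation: the data, the fixed `nEndPts` margin, and the restriction of the loop to interior points are assumptions, and the significance test via `cpdFmax2pCp2` is not reproduced.

```matlab
% Synthetic data (assumed for illustration): y rises with x up to x = 0.3,
% then levels off, with a small deterministic perturbation standing in for noise.
n = 40;
x = linspace(0.05, 0.8, n)';
y = 5 * min(x, 0.3) + 0.02 * sin(37 * (1:n)');

% Reduced (null) model: a constant mean.
yHat2 = mean(y);
SSERed2 = sum((y - yHat2).^2);
nFull2 = 2;          % full model: intercept plus slope below the change point

% Assumed margin of 3 points at each end; the diagram derives nEndPts
% from nEndPtsN = 3 but does not show the formula.
nEndPts = 3;
Fc2 = NaN(n, 1);

for i = nEndPts:(n - nEndPts)
    % Candidate change point at x(i): hold x constant above it so the
    % fitted line is flat for x > x(i).
    x1 = x;
    x1((i + 1):n) = x(i);
    a2 = [ones(n, 1) x1] \ y;
    yFit = a2(1) + a2(2) * x1;
    SSEFull2 = sum((y - yFit).^2);
    Fc2(i) = (SSERed2 - SSEFull2) / (SSEFull2 / (n - nFull2));
end

% The change point is the candidate with the largest F score (NaNs are ignored).
[Fmax2, iCp2] = max(Fc2);
xCp2 = x(iCp2);
```

With data like this, `xCp2` should land close to the true break at 0.3. In the documented flow, `Fmax2` would then be passed to the significance test and `Cp2` set to `NaN` when `p2 > pSig`.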