ml_analyzed error for bad last line of adc files #57

Open
hsosik opened this issue Apr 14, 2020 · 13 comments

Comments

@hsosik

hsosik commented Apr 14, 2020

The MATLAB and Python versions of the ml_analyzed calculation produce an error in some cases where no inhibit time is available from the hdr file AND the last line of the adc file is bad. Previously we had a special case for ml < 0 (from the last adc line), but some cases instead produce ml >> 5 ml (not realistic). I think a better criterion may be to compare the 2nd and 23rd entries on the last line to see whether they differ by more than a few tens of milliseconds (the normal difference). In the MATLAB script I just committed, I've replaced:
if ml_analyzed(count) <= 0
with
if abs(adc.Var23(end)-adc.Var2(end)) > 0.1

I haven't fully tested this, but it works for bin D20180829T144312_IFCB125, which previously gave ml_analyzed = 32 ml, and now gives 3.7348 ml.

@joefutrelle
Owner

joefutrelle commented Apr 14, 2020

For my reference, here is the relevant part of the Python code and the corresponding part of the MATLAB code

The equivalents of those columns in the v2 schema are ADC_TIME (Var2) and RUN_TIME (Var23).
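
For anyone porting between the two implementations, the column correspondence above can be sketched as a small lookup. This assumes the MATLAB table uses default `VarN` names that number columns from 1, and that the Python side indexes columns from 0. The helper name `matlab_var_to_index` is hypothetical, and only the two names confirmed in this thread are listed.

```python
# Names confirmed in this thread for the v2 schema; other columns are omitted.
V2_COLUMN_NAMES = {"Var2": "ADC_TIME", "Var23": "RUN_TIME"}


def matlab_var_to_index(var_name):
    """Convert a MATLAB default variable name such as 'Var23' to a 0-based index.

    Hypothetical helper for illustration; assumes 1-based VarN naming.
    """
    return int(var_name[len("Var"):]) - 1


for var, name in V2_COLUMN_NAMES.items():
    print(var, name, matlab_var_to_index(var))
```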

@joefutrelle
Owner

This is implemented and generates the same result for that bin. I can go ahead and push it now, or if you want to do further testing I can wait for that.

@hsosik
Author

hsosik commented Apr 21, 2020

I did some more testing and found a need to update the criterion for detecting a bad last line in the adc file. The MATLAB script now reads:
if abs(adc.Var23(end)-adc.Var2(end)) > 0.3

https://github.com/hsosik/ifcb-analysis/blob/master/IFCB_tools/IFCB_volume_analyzed_fromADC.m
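
A minimal Python sketch of the bad-last-line check discussed in this thread, not the actual pyifcb code. It assumes the two inputs are the last-row values of the columns MATLAB calls Var2 and Var23, presumably in seconds given the "few tens of milliseconds" normal difference. The function name and the `tolerance` parameter are hypothetical. The default follows the 0.3 threshold above, and the earlier 0.1 value can be passed explicitly.

```python
def last_adc_line_is_bad(adc_time, run_time, tolerance=0.3):
    """Return True when the last ADC row looks corrupt.

    adc_time and run_time are the final-row values of the columns called
    Var2 and Var23 in the MATLAB script. A good row is expected to have
    the two values within a small tolerance of each other.
    """
    return abs(run_time - adc_time) > tolerance


# Illustrative values only, not taken from a real bin.
print(last_adc_line_is_bad(1200.02, 1200.05))  # small gap between the columns
print(last_adc_line_is_bad(1200.02, 0.0))      # large gap between the columns
```

Keeping the tolerance as a parameter makes it easy to re-run old bins against both thresholds while the criterion is still being tuned.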

@joefutrelle
Owner

I modified my solution accordingly and created a PR.

#59

@joefutrelle
Owner

@hsosik are we ready to go on this? If so I will merge the PR and deploy.

@hsosik
Author

hsosik commented May 22, 2020

In the MATLAB implementation, I am now handling another (rare) special case where multiple lines at the end of the adc file are bad (zero run and inhibit times). [A previous commit of mine to ifcb-analysis had an incorrect / incomplete implementation of this solution. I have now committed what I think is a working version of IFCB_volume_analyzed_fromADC.m.]

Can you add this case to the Python implementation? In MATLAB, I'm doing this:
% minor case: files with 0 runtime and inhibit time in numerous rows at file end
if ml_analyzed(count) <= 0
    runtime = adc.Var2(end-1); % next best info after runtime
    ii = find(adc.Var23); % rows with nonzero run time
    modeinhibittime = mode(diff(adc.Var24(ii))); % typical inhibit time per trigger
    inhibittime = adc.Var24(ii(end)) + (size(adc,1)-length(ii)) * modeinhibittime; % last good value plus estimate for bad rows
    looktime = runtime - inhibittime;
    ml_analyzed(count) = flowrate.*looktime/60;
end

@hsosik
Author

hsosik commented May 23, 2020

By the way, here is a bin to test the new special case: \sosiknas1\IFCB_data\NESLTER_transect\data\2019\D20190205\D20190205T122609_IFCB127.adc

@joefutrelle
Owner

What is the correct result for that bin? (so I can check my output)

@joefutrelle
Owner

Can you briefly describe the algorithm? It looks quite different from the other cases in that it's computing some statistics over the whole file.

@hsosik
Author

hsosik commented May 26, 2020

> What is the correct result for that bin? (so I can check my output)

"Correct" is a strong word for this case... but my answer is 1.5722 milliliters for that bin.

@hsosik
Author

hsosik commented May 26, 2020

> Can you briefly describe the algorithm? It looks quite different from the other cases in that it's computing some statistics over the whole file.

Yes, that's correct. It's like the previous case in that run time comes from the end of column 2 of the adc file, but the inhibittime column has bad (zero) values in multiple rows at the bottom of the file. So I'm estimating total inhibittime by taking the last good value (from col 24) and adding an estimate for the rows after it. I assume each row with missing inhibittime contributes the mode of the row-to-row increments in inhibit time in col 24 (i.e., the "typical" dead time to handle a trigger).
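
The estimate described above could be ported to Python roughly as follows. This is a sketch under stated assumptions, not the code that went into the Python implementation. The function name, the plain-list inputs, and the `flow_rate` parameter (ml per minute) are hypothetical. Note also that `statistics.mode` returns the first most-common value on ties (Python 3.8+), whereas MATLAB's `mode` returns the smallest, so results could differ when there is a tie.

```python
from statistics import mode


def ml_analyzed_with_bad_tail(adc_time, run_time, inhibit_time, flow_rate):
    """Estimate ml analyzed when trailing ADC rows have zero run/inhibit time.

    adc_time, run_time, inhibit_time: per-row lists for the columns the
    MATLAB script calls Var2, Var23 and Var24.
    flow_rate: sample flow in ml per minute (hypothetical parameter name).
    """
    # Rows with nonzero run time are treated as good (MATLAB: find(adc.Var23)).
    good = [i for i, value in enumerate(run_time) if value != 0]
    good_inhibit = [inhibit_time[i] for i in good]

    # Typical per-trigger dead time: mode of successive differences.
    steps = [b - a for a, b in zip(good_inhibit, good_inhibit[1:])]
    typical_step = mode(steps)

    # Last good cumulative inhibit time plus one typical step per bad row.
    n_bad = len(run_time) - len(good)
    total_inhibit = good_inhibit[-1] + n_bad * typical_step

    # Run time is taken from the second-to-last ADC time, as in the MATLAB snippet.
    total_run = adc_time[-2]

    look_time = total_run - total_inhibit
    return flow_rate * look_time / 60


# Illustrative values only: four good rows followed by two zeroed rows.
example = ml_analyzed_with_bad_tail(
    adc_time=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    run_time=[1.0, 2.0, 3.0, 4.0, 0.0, 0.0],
    inhibit_time=[0.125, 0.25, 0.375, 0.625, 0.0, 0.0],
    flow_rate=0.25,  # made-up flow rate for the example
)
print(example)
```

Because the mode is taken over floating-point differences, this only behaves sensibly when the inhibit times repeat exactly at the file's stored precision. Rounding the differences first may be worth considering.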

@joefutrelle
Copy link
Owner

implemented in #61

@hsosik
Author

hsosik commented Sep 8, 2022

Is the Python implementation of this already pushed and included with the solution for #70?
