Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement L1 Norm for level2 Misfit Calculation #295

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Aayushi-10
Copy link

This update introduces the implementation of the L1 norm for level2 misfit calculations in MTUQ. This addition provides a more robust alternative, particularly in cases where data may contain outliers.
The L1 norm computation has been seamlessly integrated into the existing misfit evaluation framework, ensuring backward compatibility with the current L2 implementation. Users can now choose between L1 and L2 norms based on their specific requirements.
To demonstrate its effectiveness, the following comparisons between L1 and L2 norms are included, using both, all stations and the best stations:
L1 (All Stations): https://github.com/Aayushi-10/Test/blob/main/l1_all_stations.png
L1 (Best Stations): https://github.com/Aayushi-10/Test/blob/main/l1_best_stations.png
L2 (All Stations): https://github.com/Aayushi-10/Test/blob/main/l2_all_stations.png
L2 (Best Stations): https://github.com/Aayushi-10/Test/blob/main/l2_best_stations.png
This update aligns with findings from Figure 12 of Silwal 2016 (https://doi.org/10.1002/2015JB012588) which illustrates how L1 norm solutions exhibit enhanced stability and resilience against noise compared to L2 norm solutions.

@rmodrak
Copy link
Member

rmodrak commented Jan 29, 2025

Hi @Aayushi-10 , I am trying to understand both the pull request as well as Silwal2016. Agreed that L1 capability would be very useful. Let me respond further in a bit if that's alright. Also others might weigh in of course. Thank you

@rmodrak
Copy link
Member

rmodrak commented Jan 29, 2025

I'm looking at the new implementation. For each waveform, the inner loop computes d^2 + s^2 - 2sd. This factorizes to the L2 norm |s - d|^2.

After computing the L2 norm for each waveform, you do later take the square root. I would tend to call the result a sort of "hybrid L1-L2 norm". Something like this may already implemented here

https://github.com/rmodrak/mtuq/blob/master/mtuq/misfit/waveform/c_ext_L2.c#L268

@rmodrak
Copy link
Member

rmodrak commented Jan 29, 2025

In the end, there seems to be some inconsistency in the literature with regard to what is considered L1, L2 and hybrid.

Is this something Silwal2016 addresses? I could be missing it...

Curious what Zhao & Helmberger mean when they refer to L1. I can only access the abstract now. When I return from travel I will try to check the full text...

@thurinj
Copy link
Member

thurinj commented Jan 30, 2025

Hi @ryan, @Aayushi-10 ,

Here's what I've gathered regarding the norms defined in MTUQ. I think we inherit directly from the Zhao and Helmberger 1994 paper:

image

where the L2 is essentially the squared L2-norm.
Based on the general definition of the Lp norm:
image

the classical L2 norm is the one that is called "hybrid" in MTUQ which is expressed in the C code as:

// hybrid L2 norm
L2_sum += dt * weights(ista,ic) * L2_tmp;

// hybrid L1-L2 norm
L2_sum += dt * weights(ista,ic) * pow(L2_tmp, 0.5);

I've always been puzzled by the denomination, which I think is non-standard (note that our/Zhao & Helmberger "L2" norm is not expressed in the same units as the L1 norm, which could be an issue).

I think it's worth keeping this in mind, as we might want to fix this in the future, to be consistent with typical definition of the Euclidean norm from the measure and integration theory literature.

@rmodrak
Copy link
Member

rmodrak commented Jan 30, 2025 via email

@rmodrak
Copy link
Member

rmodrak commented Jan 30, 2025

The inline graphics didn't appear for me via email. I just saw them now the browser.

norms

I think the Zhao and Helmberger defintions above are the classical definitions for most types of applied math.

@Aayushi-10
Copy link
Author

Hi @rmodrak , @thurinj ,
Screenshot (518)
As mentioned in Silwal (2016), Equations 4 and 5, the L2 norm used here is its squared version. This is because we are not directly comparing the values of these two norms but rather the moment tensors that yield the minima.
As pointed out, I first computed 𝑟𝑖^2 directly, then applied the square root to obtain the residuals (𝑟𝑖), and finally summed them in the outermost loop, following the definition of the L1 norm. A similar implementation can be found in capuaf (https://github.com/uafgeotools/capuaf/blob/master/sub_misfit.c#L97), which references the approach by Zhao and Helmberger.
Looking forward to your input on this.

@rmodrak
Copy link
Member

rmodrak commented Jan 30, 2025

Hi @Aayushi-10,

Very helpful, thanks for the response, which confirms that Silwal and Tape 2016's definitions of L1 and L2 misfit functions diverge from Zhao and Helmberger's.

I'd argue for continuing to use Zhao and Helmberger's definitions in mtuq, as they are already written into the docs here and implemented in the code itself.

If there is agreement on this, then I could try to suggest some helpful ways to leverage the new pull request.

thanks
Ryan

@vsilwal
Copy link
Member

vsilwal commented Jan 31, 2025

Hi @rmodrak , @thurinj ,

Yes, our 2016 implementation was a hybrid approach combining L1 and L2 norms.

In CAPUAF and the current implementation by @Aayushi-10, the L2 norm is applied at the time-series level, while the L1 norm is used at the station window/component level. This approach was chosen to minimize errors primarily at problematic stations or components.

In my view, a true L1 norm would involve summing the absolute values of the entire time-series across all stations and windows. However, this method hasn’t yielded good results for me.

I think, the results likely depend on the level at which the hybridization (or L1 norm) is applied:

True L1: Summing the absolute values of individual time-series elements level.
Hybrid Option 1: Compute the L2 norm for the entire time-series and take the square root at the station level.
Hybrid Option 2: Compute the L2 norm for the entire time-series and take the square root at the station component/window level (our implementation) - provides good results as shown by @Aayushi-10 and 2016 paper
True L2: Compute the L2 norm for the entire time-series and sum for all stations and components and square root at the end.

I agree that both L1 and L2 norms should have the same units after applying absolute summation in L1, or a final square root in L2.

References:
L2 Norm
L1 Norm

@rmodrak
Copy link
Member

rmodrak commented Jan 31, 2025

Hi @vsilwal , very clear description, thank you! I think it'd be great to implement the four different options you mentioned

@rmodrak
Copy link
Member

rmodrak commented Jan 31, 2025

I think the new pull request duplicates what is already implemented via norm='hybrid'. If anyone is interested, they could feel free to double check that the existing norm='hybrid’ works, and that it gives the same result as the pull request

This existing norm='hybrid' approach, which adds only a short conditional block, does avoid having a lot of code duplication between two very similar cython modules. However, there could still be ways to improve the existing implementation, which would be welcome.

As another idea, anyone interested could feel free to implement a L1 norm in the sense of Zhao and Helmberger. I had tried to do this in the past, but the cython bottlenecks turn out to be very different, and I couldn't get anywhere near the performance from the L1 norm as we do from the L2 norm implementation. Any progress on this could be very helpful.

@thurinj
Copy link
Member

thurinj commented Feb 4, 2025

Hi all,

After carefully reviewing the code and directly testing @Aayushi-10 test branch, I confirm Ryan's impression, which is that this implementation is strictly identical to the "hybrid" mode currently implemented in c_ext_L2.c. Essentially, it redoes

L2_sum += dt * weights(ista, ic) * pow(L2_tmp, 0.5);

line, and hence it corresponds 1-to-1 to the hybrid misfit mode.

Regarding the possibility of using the precomputed correlation matrices (greens_greens, data_data, and greens_data) to build an L1 norm in C: from what I have checked, we cannot. Correct me if I'm wrong @rmodrak , but loosely speaking these matrices store the sums of products under the form

$\sum_t G_t^{(i)} G_t^{(j)}$
$\sum_t d_t^2$
and $\sum_t G_t^{(i)} d_t$
where $_t$ is the time dependency and thus we can't just go back to $\sum|s-d|$, where we need access to the whole synthetic and data array to perform the sample-wise comparison.

I've encountered issue in the waveform plotting where all the synthetic traces were misaligned, despite all the time-shift numbers seeming to be correct. The synthetic plot look like if the time-shifts were all maxed out:

image

which is not supposed to occur on the latest master branch, so make sure you're working with the latest version of MTUQ (I don't think the issue comes from my machine).

I hope this helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants