
Possible improvements #7

Open
MaxGhi8 wants to merge 5 commits into main
Conversation

@MaxGhi8 commented Nov 5, 2024

Good evening,
first of all, thank you for your work, which I found very interesting. I have looked at your implementation and have the following three observations/questions.

  • I swapped the scheduler_step and scheduler_gamma arguments in _OtherModels/TrainFNO.py, which appeared to be passed in the wrong order.
  • It seems to me that the number of parameters of the FNO is not counted correctly: complex parameters are counted only once, whereas I would count each of them as two real parameters (one for the real part and one for the imaginary part). I therefore extended the code so that complex parameters are taken into account accordingly; see the sketch below.
  • The last question concerns the calculation of the relative $L^1$ norm. It seems to me that in your code it is computed as
$$\frac{ \sum_{i=1}^{N} \| u_i - u_{\theta, i} \|_{L^1} }{ \sum_{i=1}^{N} \| u_i \|_{L^1} },$$

instead of

$$\frac1N \sum_{i=1}^{N} \frac{ \| u_i - u_{\theta, i} \|_{L^1} }{ \| u_i \|_{L^1} }.$$

So I am sending you a version that seems to me more correct for the calculation of the relative $L^1$ norm. I tested it, and the difference between the two norms is not significant in your experiments because the dataset outputs are renormalized, but they are different quantities in the general case.
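To illustrate the second point, here is a minimal sketch of the count I have in mind, assuming a standard PyTorch module (the function name is illustrative, not the one in the repository):

```python
import torch

def count_trainable_params(model: torch.nn.Module) -> int:
    """Count trainable parameters, treating every complex entry
    as two real parameters (one real part, one imaginary part)."""
    total = 0
    for p in model.parameters():
        if not p.requires_grad:
            continue
        # p.numel() counts each complex entry once; double those.
        total += 2 * p.numel() if p.is_complex() else p.numel()
    return total
```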

Let me know what you think and thanks again for your work.
Best regards,
Massimiliano Ghiotto

@bogdanraonic3 (Member)

Dear Max,

Thank you for the remarks!

I agree that the loss function could/should be changed. However, I believe that both definitions lead to good approximations.

I will check the parameter count for the FNO.

Best regards,
Bogdan

@MaxGhi8 (Author) commented Nov 11, 2024

Hi Bogdan,
in your workflow both definitions of the norm are similar and both lead to good approximations. I have already tested this and (after some epochs) obtained identical results, so there is no problem there.

In my opinion this is because your outputs are normalized (standard practice in machine learning), but in general applications this can introduce bugs.
For example, suppose you have $N = 100$ examples, with $u_i$ the reference solution and $u_{\theta, i}$ the neural operator approximation. If $\| u_0 \|_{L^1} = 10^n$ and $\| u_0 - u_{\theta, 0} \|_{L^1} = 10^{n-3}$, then the relative error of the first example is $10^{-3}$. Now take $\| u_i \|_{L^1} = 1$ and $\| u_i - u_{\theta, i} \|_{L^1} = 1$ for all $i > 0$, so that the relative error equals $1$ for every other example.
Then with

$$err_1 = \frac{ \sum_{i=1}^{N} \| u_i - u_{\theta, i} \|_{L^1} }{ \sum_{i=1}^{N} \| u_i \|_{L^1} },$$

we get

$$err_1 = \frac{10^{n-3} + 99}{10^n + 99} \approx 10^{-3}, \quad \text{if } n \gg 1.$$

Instead, if you use the second option

$$err_2 = \frac1N \sum_{i=1}^{N} \frac{ \| u_i - u_{\theta, i} \|_{L^1} }{ \| u_i \|_{L^1} }$$

you get the following error

$$err_2 = \frac{1}{100} (10^{-3} + 99) \approx 1.$$

Since $99$ of the $100$ per-example relative errors are equal to $1$, the aggregate error should be close to $1$, which is what the second definition reports.
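A quick numerical check of this toy example (a minimal NumPy sketch, using exactly the numbers above):

```python
import numpy as np

n, N = 9, 100  # any n >> 1 shows the same effect

# Per-example L^1 norms from the toy example above.
u_norms = np.array([10.0**n] + [1.0] * (N - 1))
diff_norms = np.array([10.0**(n - 3)] + [1.0] * (N - 1))

# err_1: ratio of sums, dominated by the single large example.
err1 = diff_norms.sum() / u_norms.sum()

# err_2: mean of the per-example ratios.
err2 = (diff_norms / u_norms).mean()

print(f"err1 = {err1:.3e}")  # ~ 1e-3
print(f"err2 = {err2:.3f}")  # ~ 0.99
```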

Let me know what you think about it and thank you so much for the response.
Best regards,
Max
