Resource Health fine tuning lacking #208

stevedistef · 2024-05-02T15:00:56Z

Check for previous/existing GitHub issues

I have checked for previous/existing GitHub issues

Description

One of my customers implemented AMBA GA on production subscriptions, and they are seeing a lot of ResourceHealthUnhealthyAlert alerts. We can see from the policy that the alert can be turned off, but they would like to fine-tune it (for example, with thresholds). Is that possible?

When checking the alert itself, we see only these options. Is there any way to fine tune this alert?

Brunoga-MS · 2024-05-03T11:03:44Z

Hello @stevedistef ,
thanks for your feedback. Based on the UI provided by the Service / Resource Health alerts, it seems there are no options to fine-tune these alerts further. The only fine-tuning I can see is reducing the statuses listed under Previous resource status, which I need to investigate internally. However, one question came to mind: are these alerts fired for the same resource, or for resources that share a name but are located in different subscriptions or resource groups?

Thanks,
Bruno.

stevedistef · 2024-05-03T15:02:18Z

Buon giorno @bruno, and thanks as always for helping us adopt AMBA. You and the whole tiger team you have there under @paulgrimley really make it much easier than a DIY project :-)

OK, so on this one, as we discussed, I can also see that this environment has these resource health alerts. I filtered for one of the resources that shows up a lot, sometimes with the same timestamp. I also filtered for only the last 30 days and then sorted by time:

When we checked the two that seemed to be redundant, they turned out to be different (2 different alerts).
When we check the first one with that same timestamp, we see this:

When we check the other one, which came at the same time, we see this different alert:

So the question becomes: do we really need to see both?
When examining the actual alert rule, we see this (I clicked on Alert Rule in the previous screenshot to get here):

And then Edit:
We see that perhaps we have set up too many "previous conditions":

I am going to ask the team using AMBA to go to Monitor > Alerts > Alert rules and edit the Resource Health alert for each of their subscriptions, removing the two previous conditions, and save it.
So that means repeating this step 4x in this case:

We will see if this is acceptable....
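
For anyone who would rather script that edit than click through the portal, the narrowed condition could be sketched roughly as below. This is only an illustration: the field names (`category`, `properties.currentHealthStatus`, `properties.previousHealthStatus`) follow the activity log alert condition format used for Resource Health alerts, but the helper functions are hypothetical, the chosen statuses are an assumption (the thread does not say which two previous conditions were removed), and this is not the definition AMBA ships.

```python
import json

# Status names reported by Resource Health; which ones to keep is a per-team choice.
VALID_STATUSES = {"Available", "Degraded", "Unavailable", "Unknown"}


def any_of(field, values):
    """Build an 'anyOf' clause that matches any of the given values for one field."""
    return {"anyOf": [{"field": field, "equals": value} for value in values]}


def resource_health_condition(current_statuses, previous_statuses):
    """Return an activity log alert condition limited to the given statuses."""
    for status in [*current_statuses, *previous_statuses]:
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown health status: {status}")
    return {
        "allOf": [
            {"field": "category", "equals": "ResourceHealth"},
            any_of("properties.currentHealthStatus", current_statuses),
            any_of("properties.previousHealthStatus", previous_statuses),
        ]
    }


# Hypothetical choice: alert only when a resource leaves "Available".
condition = resource_health_condition(
    current_statuses=["Degraded", "Unavailable"],
    previous_statuses=["Available"],
)
print(json.dumps(condition, indent=2))
```

The printed JSON is only the `condition` part of an alert rule; scope, action groups, and deployment per subscription are left out on purpose.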

stevedistef · 2024-05-03T18:12:12Z

Customer trying over the weekend!

dbelso · 2024-05-13T06:59:40Z

Hi All,
I tried the workaround and it seemed to work for a few days, but now the issue has gotten worse and is flooding our Monitoring page.

Brunoga-MS · 2024-05-13T08:00:11Z

Hello @dbelso and @stevedistef ,
from your communication it looks like the fine tuning we applied was partially working. At this point we need to investigate further to understand why this is happening. We will keep you posted.

Thanks,
Bruno.

MarcoJanse · 2024-10-25T12:38:12Z

I have a similar question. First of all, I was wondering why the ResourceHealthUnhealthyAlert has a target scope of All resources in Subscription>. I was expecting the MonitorDisable parameter to disable the ResourceHealth alerts for all resources, depending on the tag value I had set (like Dev or Sandbox).

The number of events from ResourceHealth for just one VM that's being powered off is quite overwhelming.

There's even an alert when the status does not actually transition:

For now, I have created a suppression rule for a couple of test VMs that are stopped/started frequently.
Is there a better way to do this that I'm not seeing?
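
One way to gauge how much of that noise comes from events with no real status change is to export the events and filter them offline. The sketch below assumes each exported event has been reduced to a dict with the hypothetical keys `resource`, `previous_status`, and `current_status`; these key names and the sample data are illustrative and are not the Activity Log schema. It only analyzes data and does not change any alert rule or suppression rule.

```python
from collections import Counter


def is_real_transition(event):
    """True when the health status actually changed."""
    return event["previous_status"] != event["current_status"]


def split_events(events):
    """Separate real transitions from events whose status did not change."""
    transitions = [e for e in events if is_real_transition(e)]
    repeats = [e for e in events if not is_real_transition(e)]
    return transitions, repeats


# Made-up sample data for a test VM that is stopped and started.
sample_events = [
    {"resource": "test-vm-01", "previous_status": "Available", "current_status": "Unavailable"},
    {"resource": "test-vm-01", "previous_status": "Unavailable", "current_status": "Unavailable"},
    {"resource": "test-vm-01", "previous_status": "Unavailable", "current_status": "Available"},
]

transitions, repeats = split_events(sample_events)
noise_per_resource = Counter(e["resource"] for e in repeats)
print(len(transitions), "transitions;", dict(noise_per_resource), "non-transitions")
```

A per-resource count like `noise_per_resource` could help decide which resources are worth a suppression rule and which are better handled by narrowing the alert condition itself.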

stevedistef added the question Further information is requested label May 2, 2024

Brunoga-MS self-assigned this May 3, 2024

JoeyBarnes added the Pattern: ALZ 🚁 Issues / PR's related to the ALZ Pattern label Jul 10, 2024
