Feature: Multi-Instance Integration with Unified View #133

Open
EthraZa opened this issue Nov 19, 2024 · 2 comments

Comments

@EthraZa

EthraZa commented Nov 19, 2024

I'd like to propose a new feature that enables multiple instances of Tianji to run simultaneously in different locations while consolidating data from all instances into a single main instance.

Objective:

  • Improve system availability and responsiveness by running multiple instances across various regions.
  • Provide a unified view of the system by integrating data from different instances, ensuring a seamless and comprehensive experience.

Perspectives:

  • Decentralized Architecture:
    Run separate instances in distinct locations with the ability to visually integrate their information, allowing for a more robust and fault-tolerant setup.
  • Hybrid Approach:
    Operate a single main instance alongside minimal agents deployed in other locations to collect uptime monitoring data, which is then periodically synced with the central instance.
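To illustrate the "periodically synced" part of the hybrid approach, here is a minimal sketch of an agent-side buffer that keeps check results locally and only discards them once the main instance has accepted them. Everything in it is hypothetical: `CheckResult`, `ResultBuffer`, and the `send_batch` callback are illustrative names, not part of Tianji.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CheckResult:
    target: str        # what was checked, e.g. a URL or host name
    ok: bool           # whether the check succeeded
    latency_ms: float  # measured response time
    checked_at: float  # Unix timestamp taken on the agent


@dataclass
class ResultBuffer:
    """Holds results on the agent until the main instance has accepted them."""

    # Hypothetical transport: returns True when the batch was accepted.
    send_batch: Callable[[List[CheckResult]], bool]
    pending: List[CheckResult] = field(default_factory=list)

    def record(self, result: CheckResult) -> None:
        self.pending.append(result)

    def sync(self) -> int:
        """Try to deliver all pending results; return how many were delivered."""
        if not self.pending:
            return 0
        batch = list(self.pending)
        try:
            delivered = self.send_batch(batch)
        except OSError:
            delivered = False  # network trouble: keep everything for the next sync
        if not delivered:
            return 0
        # Remove only what was actually sent.
        del self.pending[:len(batch)]
        return len(batch)
```

With this shape, an outage between the agent and the main instance delays the data instead of losing it, which seems to be the main point of collecting from several locations.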

Conclusion:

I would appreciate your thoughts on this proposed feature. I'm particularly interested in hearing from the code maintainers regarding the feasibility of implementing multi-instance integration and unified views.
Are there any technical challenges or concerns that need to be addressed?
What would be the estimated effort required to develop this feature?

Thank you.

@moonrailgun
Contributor

Thank you for the suggestion; I think it is a very interesting idea. The main use case would be the monitor feature, because only the monitor is time-sensitive. Between distant locations (for example, the United States and France), there is a natural delay of 200ms+.

To handle this situation, I introduced trending mode, which focuses on the trend of changes over time as seen from the user's point of view.

As for your proposal, I think the best approach is to deploy multiple proxies/agents on different nodes that regularly perform the requests and report the results to Tianji.

In fact, more abstractly, we could even let Tianji's monitor accept external numerical reports and use Tianji as a simple time series database (which, of course, it is not). That seems more flexible.
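A rough sketch of what "receiving external numerical reports" could mean on the receiving side, assuming an in-memory store; `PushMonitorStore`, `Sample`, and all method names are made up for illustration and do not reflect Tianji's schema or API.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sample:
    timestamp: float  # Unix timestamp supplied by the reporter
    value: float      # any number: latency, queue length, temperature, ...


class PushMonitorStore:
    """Keeps the most recent samples per monitor, ordered by timestamp."""

    def __init__(self, max_samples: int = 1000) -> None:
        if max_samples < 1:
            raise ValueError("max_samples must be at least 1")
        self._max = max_samples
        self._series: Dict[str, List[Sample]] = defaultdict(list)

    def report(self, monitor_id: str, value: float, timestamp: float) -> None:
        # Reject anything that is not a finite number.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("value must be numeric")
        if not math.isfinite(value):
            raise ValueError("value must be finite")
        series = self._series[monitor_id]
        series.append(Sample(timestamp, float(value)))
        # Remote reporters may deliver late, so keep the series time-ordered.
        series.sort(key=lambda s: s.timestamp)
        # Drop the oldest samples beyond the retention limit.
        del series[:-self._max]

    def samples(self, monitor_id: str) -> List[Sample]:
        return list(self._series.get(monitor_id, []))

    def latest(self, monitor_id: str) -> Optional[Sample]:
        series = self._series.get(monitor_id)
        return series[-1] if series else None
```

The store does not care who produced the number, so the same path would serve a Tianji agent, a cron script, or an existing third-party service.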

Or do you have other ideas?

Looking forward to your reply.

@EthraZa
Author

EthraZa commented Nov 28, 2024

I am thrilled that my suggestion was so well received.

I think I am not the only one who runs into trouble when the location running the monitoring has Internet problems that the rest of the world is not experiencing.

In fact, I had already imagined that remote agents would be the preferred option, since Tianji already has this capability with Servers and Telemetry. Running a complete Tianji instance in the remote environment would only make sense if implementing monitoring in the agents turned out to be too costly.

In that case, I believe that building on what Tianji already does with Servers gets us halfway there; what would remain is to replicate the main instance's monitoring mechanism in this agent.

Regarding receiving "external numerical reports", this could be very interesting for cases where the remote side cannot run a Tianji agent but is an existing service capable of actively integrating. I can see the benefits here.

So, a possible implementation of this functionality could follow this roadmap:

1- Add/adapt the API and schema to receive external numerical data. At this point, it would already be possible to create a simple cron script on a remote server that pings a target, formats the result, and sends it to a Tianji instance.

2- Implement agents that mimic the monitoring of the Tianji instance. Now, that simple ping script could be replaced by the more complete agent.

3- Modify the UI to accommodate specific settings for monitoring agents. If a remote agent is going to do partial monitoring of targets, then it is necessary to have this setting in the Monitor UI to select who will be monitored remotely. (Perhaps specific agents should monitor groups of targets instead of single targets - kind of the same concept used in Groups in Pages.)

4- Implement the mechanism for generating and installing remote monitoring agents.

5- Make any necessary updates to the Pages layout to reflect the monitoring location.
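As a sketch of the "simple script in the cron" from step 1, the helpers below measure a target, format the result, and post it as JSON. Several things are assumptions on my side: the payload field names are invented, the receiving URL would be whatever endpoint step 1 ends up adding (none exists today as far as this thread says), and a TCP connect time stands in for a real ICMP ping, which usually needs extra privileges.

```python
import json
import socket
import time
import urllib.request
from typing import Optional


def measure_tcp_latency(host: str, port: int = 443, timeout: float = 5.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if unreachable."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return None
    return (time.perf_counter() - start) * 1000.0


def build_report(monitor_name: str, location: str,
                 latency_ms: Optional[float], timestamp: float) -> dict:
    """Format one measurement; the field names here are hypothetical."""
    return {
        "monitor": monitor_name,
        "location": location,
        "ok": latency_ms is not None,
        "value": None if latency_ms is None else round(latency_ms, 2),
        "timestamp": timestamp,
    }


def send_report(url: str, report: dict, timeout: float = 10.0) -> int:
    """POST the report as JSON to a (hypothetical) receiving endpoint."""
    request = urllib.request.Request(
        url,
        data=json.dumps(report).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A cron job on the remote server would call `measure_tcp_latency`, pass the result with `time.time()` to `build_report`, and hand that to `send_report`. The `location` field is what step 5 would later need in order to show where a measurement came from.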

I see I got ahead of myself by suggesting a development timeline, but I'm excited about the potential implementation! :)
