Explain data flow of self-hosted Sentry's architecture #3585

Open
aldy505 opened this issue Feb 24, 2025 · 6 comments · May be fixed by getsentry/sentry-docs#13745

@aldy505
Collaborator

aldy505 commented Feb 24, 2025

Problem Statement

From @kanadaj on Discord:

We need a step-by-step description of what data goes where, and last time I checked we kinda lacked that. It's hard to know which container you have to debug if something isn't ingesting, since we have dozens of Kafka queues and processing services, and I know some of the services feed data back from Kafka into Kafka.

And yes, there is a rough outline with a chart somewhere, but it's not really specific enough at this point.

Solution Brainstorm

No response

@aldy505
Collaborator Author

aldy505 commented Mar 31, 2025

Question: What do you think would be the best way to explain this, since a diagram would probably duplicate the info at https://develop.sentry.dev/application-architecture/overview/?

Maybe it's better to explain what each container does, instead of the event flow?

...or maybe just update this frequently: https://github.com/getsentry/event-ingestion-graph, and make it more detailed, since I don't think Relay publishes only to the "ingest-events" topic.

@BYK
Member

BYK commented Mar 31, 2025

@hubertdeng123, thoughts? (or just tag other people? Maybe @untitaker or @markstory)

getsantry bot moved this to "Waiting for: Product Owner" in GitHub Issues with 👀 3 on Mar 31, 2025

@markstory
Member

I agree we could do a better job of documenting how the various consumers and tasks interact for ingestion. The event-ingestion-graph diagram/document might be a good place to add the additional context of which topics and consumer pods are involved.

@hubertdeng123
Member

IMO, if we want to explain the data flow here of Sentry's architecture in a diagram we should do so in terms of groups of containers, otherwise the diagram could get pretty overwhelming. Ingest consumers, snuba consumers, post-process-forwarders, etc. As Mark mentioned above, a mapping of topics to consumers would be really useful context to add.
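
To make the topic-to-consumer mapping idea concrete, here is a minimal sketch of how such a mapping could be written down and walked to see what sits downstream of a given container group. This is an illustration only, not Sentry's actual pipeline: "relay", "ingest-events", and the group names come from this thread, while the placeholder topics and the edges between groups are assumptions made up for the example.

```python
from collections import deque

# Hypothetical edges, for illustration only.
# Only "ingest-events" and the group names appear in this thread; the
# "placeholder-topic-*" names and the ordering are assumptions, not real topics.
PRODUCES = {  # container group -> topics it publishes to
    "relay": ["ingest-events"],
    "ingest-consumers": ["placeholder-topic-a"],
    "snuba-consumers": ["placeholder-topic-b"],
}
CONSUMES = {  # topic -> container groups that read from it
    "ingest-events": ["ingest-consumers"],
    "placeholder-topic-a": ["snuba-consumers"],
    "placeholder-topic-b": ["post-process-forwarders"],
}


def trace(start):
    """Return every group and topic reachable from `start`, breadth-first."""
    seen, order, queue = {start}, [start], deque([start])
    while queue:
        node = queue.popleft()
        # A node is either a group (look up its topics) or a topic (look up its readers).
        for nxt in PRODUCES.get(node, []) + CONSUMES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    print(" -> ".join(trace("relay")))
```

Keeping the mapping as plain data would mean a diagram grouped by container type could be generated from it, and someone debugging ingestion could start from the group that looks stuck and list what feeds into or out of it.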

@aldy505
Collaborator Author

aldy505 commented Apr 1, 2025

The event-ingestion-graph diagram/document might be a good place to add the additional context of which topics and consumer pods are involved.

@markstory Would it be a better idea to move the event-ingestion-graph into sentry-docs (the dev section) instead? That way, employees would notice it relatively quickly when they know some part of the ingestion pipeline is changing.

IMO, if we want to explain the data flow here of Sentry's architecture in a diagram we should do so in terms of groups of containers, otherwise the diagram could get pretty overwhelming. Ingest consumers, snuba consumers, post-process-forwarders, etc.

@hubertdeng123 I agree with this one.

@aldy505
Collaborator Author

aldy505 commented Apr 7, 2025

Saw @markstory's thumbs-up of approval. I'll work on this sometime next week.


4 participants