Are there any plans to make Solid Queue more performant on Postgres? #508

Open
salmonsteak1 opened this issue Feb 4, 2025 · 4 comments

@salmonsteak1

Hey there, after running Solid Queue for a few months, we're seeing occasional spikes in our CPU usage. From what I understand from the README, there are some limitations because Postgres doesn't implement loose index scan, and I've made sure my configuration accounts for them. For example, my queue is defined as *, and we've avoided pausing queues.

But we still sometimes see a spike in the query execution time for the SELECT on solid_queue_ready_executions. It seems like this query is already using an index scan but somehow the query still takes almost half a second to run, and it was executed over 500k times within the span of around 15 minutes. This is coming from Solid Queue itself, not from me viewing the dashboard in Mission Control, although I also see similar queries taking this long when I view the Queues tab in Mission Control.

Here are some screenshots of that query:

[Two screenshots, not preserved]

Are there any plans to make Solid Queue more performant on Postgres? Thank you!

@rosa
Member

rosa commented Feb 4, 2025

Hey @salmonsteak1, I'm afraid that query is already completely optimized, but I've got another improvement in mind to reduce the polling frequency.

> It seems like this query is already using an index scan but somehow the query still takes almost half a second to run

Is it possible your DB is under some additional load unrelated to this that makes it so slow?

@salmonsteak1
Author

Hey @rosa, we've actually increased our polling interval to 0.5 seconds, as the default of 0.1s was consuming way too much CPU.
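For a rough sense of scale, the figures quoted in this thread can be put through a back-of-the-envelope calculation. The sketch below assumes a deliberately simple model in which each idle poller issues exactly one query per polling interval. The helper name `idle_polls_per_second` is hypothetical, and Solid Queue's real polling behaviour under load may differ.

```ruby
# Figures from the thread: ~500k executions of the polling query in ~15 minutes.
OBSERVED_EXECUTIONS = 500_000
WINDOW_SECONDS = 15 * 60

observed_per_second = OBSERVED_EXECUTIONS / WINDOW_SECONDS.to_f

# Assumed model (not taken from Solid Queue's source): every idle poller
# runs one query per polling interval.
def idle_polls_per_second(pollers:, polling_interval:)
  pollers / polling_interval.to_f
end

default_rate = idle_polls_per_second(pollers: 1, polling_interval: 0.1)
relaxed_rate = idle_polls_per_second(pollers: 1, polling_interval: 0.5)

# Pollers this model would need to reach the observed rate at a 0.5s interval.
implied_pollers = observed_per_second * 0.5

puts format("observed: %.1f queries/s", observed_per_second)
puts format("per poller: %.1f/s at 0.1s, %.1f/s at 0.5s", default_rate, relaxed_rate)
puts format("implied pollers at 0.5s: %.0f", implied_pollers)
```

Under this model, going from 0.1s to 0.5s cuts idle polling by a factor of five, yet the observed rate would still need a few hundred pollers. So either many pollers are running, or the one-query-per-interval assumption does not hold during surges.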

We're running Solid Queue on a separate Cloud SQL instance that has only a single DB, dedicated to Solid Queue, so I doubt that the DB is under any additional load.

I'd actually like to ask if a fix for the pausing of queues on a Postgres DB is on the roadmap anytime soon? Due to the nature of our traffic, we wanted to use that feature to pause specific queues when there is a surge in jobs. Given the current situation, pausing a queue during such a surge would put even more load on the DB, which really isn't ideal. Thanks!

@rosa
Member

rosa commented Feb 4, 2025

> I'd actually like to ask if a fix for the pausing of queues on a Postgres DB is on the roadmap anytime soon?

Not anytime soon, no. There's a way to improve it with a recursive CTE like the one proposed here, but it needs to be different from that proposal.
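For readers unfamiliar with the technique, here is a minimal sketch of how a recursive CTE can emulate a loose index scan in general. It is not the proposal linked above, nor anything Solid Queue ships. The method `distinct_by_hopping` and the constant `LOOSE_SCAN_SQL` are hypothetical names, and the SQL assumes an index whose leading column is `queue_name` on `solid_queue_ready_executions`.

```ruby
# Illustration only: a sorted array stands in for an ordered index on
# queue_name. Instead of reading every entry, hop from each distinct
# value straight to the first entry greater than it.
def distinct_by_hopping(sorted_queue_names)
  distinct = []
  position = 0
  while position < sorted_queue_names.length
    current = sorted_queue_names[position]
    distinct << current
    # One "index probe" per distinct value: binary search for the next
    # entry that sorts after the current one.
    position = sorted_queue_names.bsearch_index { |name| name > current } ||
               sorted_queue_names.length
  end
  distinct
end

# The same hop written as a PostgreSQL recursive CTE (general pattern;
# assumed table and column names, not a query Solid Queue runs).
LOOSE_SCAN_SQL = <<~SQL
  WITH RECURSIVE queues AS (
    (SELECT queue_name FROM solid_queue_ready_executions
     ORDER BY queue_name LIMIT 1)
    UNION ALL
    SELECT (SELECT queue_name FROM solid_queue_ready_executions
            WHERE queue_name > queues.queue_name
            ORDER BY queue_name LIMIT 1)
    FROM queues
    WHERE queues.queue_name IS NOT NULL
  )
  SELECT queue_name FROM queues WHERE queue_name IS NOT NULL;
SQL

p distinct_by_hopping(%w[default default mailers mailers mailers reports])
```

The cost of this pattern grows with the number of distinct queue names rather than with the number of rows, which is what a plain `SELECT DISTINCT` cannot offer without loose index scan support.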

The improvement I have in mind to reduce the polling frequency would help in your case, I think (the idea is to configure and poll more jobs than the worker has capacity for, and keep them in memory), but even with that, you'd still see bad performance. I don't know why your DB is so slow in these cases. How many rows do you normally have in the solid_queue_ready_executions table? It feels like something is off there.
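The over-polling idea described above could look roughly like the following. This is a hypothetical sketch, not Solid Queue code: `PrefetchingPoller`, its parameters, and the `fetch` callable (which stands in for the database poll) are all invented for illustration.

```ruby
# Hypothetical sketch: fetch more jobs than the worker can run at once and
# serve later requests from memory, so fewer polls reach the database.
class PrefetchingPoller
  attr_reader :polls

  # capacity: jobs the worker can run concurrently
  # prefetch: extra jobs to hold in memory beyond capacity
  # fetch:    callable taking a limit and returning up to that many jobs
  def initialize(capacity:, prefetch:, fetch:)
    @capacity = capacity
    @prefetch = prefetch
    @fetch = fetch
    @buffer = []
    @polls = 0
  end

  # Returns up to free_slots jobs, polling only when the in-memory
  # buffer cannot satisfy the request.
  def next_batch(free_slots)
    if @buffer.size < free_slots
      @polls += 1
      @buffer.concat(@fetch.call(@capacity + @prefetch - @buffer.size))
    end
    @buffer.shift(free_slots)
  end
end

# Usage: ten pending jobs, a worker with two threads, four jobs of prefetch.
pending = (1..10).to_a
poller = PrefetchingPoller.new(
  capacity: 2,
  prefetch: 4,
  fetch: ->(limit) { pending.shift(limit) }
)

batches = Array.new(4) { poller.next_batch(2) }
p batches
p poller.polls
```

The trade-off in a design like this is that jobs held in one worker's memory are unavailable to other workers until that worker gets to them.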

@salmonsteak1
Author

We'd normally have fewer than 100 jobs in our solid_queue_ready_executions table. Occasionally there's a spike in the number of jobs due to a surge in traffic, and I think that's when these long queries occur.
