Describe the problem
This query can be very inefficient when both the system.table_metadata and the system.namespace tables are large: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/tablemetadatacache/table_metadata_updater.go#L155
DELETE FROM system.table_metadata
WHERE table_id IN (
  SELECT table_id
  FROM system.table_metadata
  WHERE table_id NOT IN (
    SELECT id FROM system.namespace
  )
  LIMIT $1
)
RETURNING table_id
We had a cluster that initially contained several thousand tables and thousands of schemas created through an automated process. Updating the system.table_metadata table took a long time, so the cluster owner decided to drop the majority of their tables. However, the database still contained around 80,000 schema objects, so the system.namespace table remained very large. The inner SELECT took nearly 15 minutes to return a batch of just 20 rows to delete. With thousands of recently dropped objects, this cleanup step was taking far too long, and in the end I had to manually remove rows from the system.table_metadata table.
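For anyone reproducing this, the slow part can be examined in isolation; the LIMIT of 20 below just mirrors the batch size observed above, and the plan shape will of course vary by cluster:
EXPLAIN
  SELECT table_id
  FROM system.table_metadata
  WHERE table_id NOT IN (
    SELECT id FROM system.namespace
  )
  LIMIT 20;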
Suggestion
We could replace the above query with the following:
DELETE FROM system.table_metadata
WHERE table_id NOT IN (
  SELECT id FROM system.descriptor
)
LIMIT $1
RETURNING table_id
It does mean we wait a little longer to remove the rows (until the table is GC'd rather than from the point it is dropped), but the query above took just a couple of seconds.
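As a sanity check, a read-only count of the rows the new predicate would target can be run first; this is just an illustrative query, not part of the proposed change:
SELECT count(*)
FROM system.table_metadata
WHERE table_id NOT IN (
  SELECT id FROM system.descriptor
);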
Jira issue: CRDB-57333