
Investigate fuzzy search query performance discrepancy #2164

Open · He1DAr opened this issue Nov 12, 2024 · 1 comment · May be fixed by #2102

He1DAr (Collaborator) commented Nov 12, 2024

There’s a significant difference in performance when running the same query locally vs. in the dev environment, even though the query uses the same execution strategy in both environments.

Query:

SELECT *
FROM stacks_blockchain_api.txs
WHERE canonical = true
  AND microblock_canonical = true
  AND block_height <= 157521
  AND burn_block_time <= 1730591999
  AND (
    CASE
      WHEN EXISTS (
        SELECT 1
        FROM pg_extension
        WHERE extname = 'pg_trgm'
      )
      THEN similarity(contract_call_function_name, 'swab') > 0.3
      ELSE contract_call_function_name ILIKE '%swab%'
    END
  )
ORDER BY burn_block_time DESC, block_height DESC, microblock_sequence DESC, tx_index DESC
LIMIT 30
OFFSET 0;
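For context: the CASE probes pg_extension once (the planner evaluates it as an InitPlan, visible in the plan below) to pick between trigram similarity() and a plain ILIKE fallback. Neither branch can use an index on contract_call_function_name as written, so every candidate row is filtered individually. The usual remedy would be a trigram GIN index, sketched here (the index name is hypothetical):

-- Sketch only: a GIN trigram index that ILIKE '%...%' and the % operator can use.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX IF NOT EXISTS txs_contract_call_function_name_trgm_idx
  ON stacks_blockchain_api.txs
  USING gin (contract_call_function_name gin_trgm_ops);

Note that similarity(col, 'swab') > 0.3 as written is not indexable even with this index in place; the index-friendly equivalent is the % operator with pg_trgm.similarity_threshold set to 0.3.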

EXPLAIN (ANALYZE, FORMAT JSON) output:

[
  {
    "Plan": {
      "Node Type": "Limit",
      "Parallel Aware": false,
      "Async Capable": false,
      "Startup Cost": 821.56,
      "Total Cost": 887.96,
      "Plan Rows": 30,
      "Plan Width": 2475,
      "Actual Startup Time": 730.829,
      "Actual Total Time": 730.837,
      "Actual Rows": 30,
      "Actual Loops": 1,
      "Plans": [
        {
          "Node Type": "Seq Scan",
          "Parent Relationship": "InitPlan",
          "Subplan Name": "InitPlan 1 (returns $0)",
          "Parallel Aware": false,
          "Async Capable": false,
          "Relation Name": "pg_extension",
          "Alias": "pg_extension",
          "Startup Cost": 0.00,
          "Total Cost": 1.01,
          "Plan Rows": 1,
          "Plan Width": 0,
          "Actual Startup Time": 0.019,
          "Actual Total Time": 0.020,
          "Actual Rows": 1,
          "Actual Loops": 1,
          "Filter": "(extname = 'pg_trgm'::name)",
          "Rows Removed by Filter": 1
        },
        {
          "Node Type": "Incremental Sort",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Startup Cost": 820.54,
          "Total Cost": 13030750.71,
          "Plan Rows": 5886325,
          "Plan Width": 2475,
          "Actual Startup Time": 730.827,
          "Actual Total Time": 730.831,
          "Actual Rows": 30,
          "Actual Loops": 1,
          "Sort Key": ["txs.burn_block_time DESC", "txs.block_height DESC", "txs.microblock_sequence DESC", "txs.tx_index DESC"],
          "Presorted Key": ["txs.burn_block_time"],
          "Full-sort Groups": {
            "Group Count": 1,
            "Sort Methods Used": ["quicksort"],
            "Sort Space Memory": {
              "Average Sort Space Used": 85,
              "Peak Sort Space Used": 85
            }
          },
          "Plans": [
            {
              "Node Type": "Index Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Async Capable": false,
              "Scan Direction": "Backward",
              "Index Name": "txs_burn_block_time_index",
              "Relation Name": "txs",
              "Alias": "txs",
              "Startup Cost": 0.43,
              "Total Cost": 12688723.86,
              "Plan Rows": 5886325,
              "Plan Width": 2475,
              "Actual Startup Time": 717.984,
              "Actual Total Time": 730.722,
              "Actual Rows": 31,
              "Actual Loops": 1,
              "Index Cond": "(burn_block_time <= 1730591999)",
              "Rows Removed by Index Recheck": 0,
              "Filter": "(canonical AND microblock_canonical AND (block_height <= 157521) AND CASE WHEN $0 THEN (similarity(contract_call_function_name, 'swab'::text) > '0.3'::double precision) ELSE (contract_call_function_name ~~* '%swab%'::text) END)",
              "Rows Removed by Filter": 981643
            }
          ]
        }
      ]
    },
    "Planning Time": 0.363,
    "Triggers": [
    ],
    "Execution Time": 730.896
  }
]
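Reading the plan: the backward scan of txs_burn_block_time_index read and discarded 981,643 rows ("Rows Removed by Filter") to collect just 30 matches, so runtime is dominated by how densely recent rows match the search term, which naturally differs between the local and dev data sets. Assuming the trigram index sketched above, an index-friendly form of the same search would look roughly like this:

-- Sketch: the same search using the indexable % operator; 0.3 mirrors
-- the original similarity threshold. Requires pg_trgm to be installed.
SET pg_trgm.similarity_threshold = 0.3;

SELECT *
FROM stacks_blockchain_api.txs
WHERE canonical = true
  AND microblock_canonical = true
  AND block_height <= 157521
  AND burn_block_time <= 1730591999
  AND contract_call_function_name % 'swab'
ORDER BY burn_block_time DESC, block_height DESC, microblock_sequence DESC, tx_index DESC
LIMIT 30;

Whether the planner actually prefers the trigram index over walking the burn_block_time index depends on how selective the term is, so this is a starting point rather than a guaranteed win.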
zone117x (Member) commented:

We've had several issues similar to this: a Postgres query is relatively fast on a local machine but slow in dev/staging. We've also seen instances where a query seems fast in prod or locally when those environments are not under load, i.e. a query may be fast on a highly provisioned, idle machine, but once that machine is under load the query ends up becoming a significant bottleneck.

It's generally a good rule of thumb that if a query is not performant in the dev/stg environment, then it will not be performant when used at scale in the prod environment.

With this particular query, I don't think we can squeeze out the roughly 1-2 orders of magnitude better performance we'd need without a significant change to the approach. I think we'd need to do something like: create a new table that is responsible for tracking search terms, and run the search query against that table. That approach may work, but keep in mind it's more involved than it might seem at face value (a rough sketch follows the list). It involves:

  • Create a new table in a SQL migration file
  • Optimize the indexes for very fast search queries
  • Update several areas in the data ingestion code so that we write to this new table
  • Write an optimized SQL read query against that new table
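Purely as an illustration, that table could look something like this (all names are hypothetical):

-- Hypothetical sketch: a dedicated search-terms table populated at
-- ingestion time and indexed for fast trigram lookups.
CREATE TABLE search_terms (
  term         text    NOT NULL,
  tx_id        bytea   NOT NULL,
  block_height integer NOT NULL,
  PRIMARY KEY (term, tx_id)
);

CREATE INDEX search_terms_term_trgm_idx
  ON search_terms USING gin (term gin_trgm_ops);

The read side would then join from search_terms back to txs on tx_id, keeping the trigram scan on a much narrower table than txs itself.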

zone117x linked a pull request on Mar 25, 2025 that will close this issue.
Status: 📋 Backlog