Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Seek Forward Optimization is not modeled in cost model #25902

Open
1 task done
gauravk-in opened this issue Feb 5, 2025 · 0 comments
Open
1 task done

[YSQL] Seek Forward Optimization is not modeled in cost model #25902

gauravk-in opened this issue Feb 5, 2025 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue status/awaiting-triage Issue awaiting triage

Comments

@gauravk-in
Copy link
Contributor

gauravk-in commented Feb 5, 2025

Jira Link: DB-15215

Description

Seek forward optimization reduces the number of expensive seek operations needed in lookup of a key in DocDB.

Before seeking to the next key, we check if the key can be found in the subsequent few tuples by performing cheaper next operations. Only if the key is not found, we proceed to seek to the key. The number of nexts performed is controlled by the max_nexts_to_avoid_seek yb-master gflag.

In the cost model, we currently assume that seek-forward optimization yields no benefit, or in other words the key is never found in the subsequent next operations.

For example,

# CREATE TABLE test (k1 INT, k2 INT, PRIMARY KEY (k1 ASC, k2 ASC));
# INSERT INTO test (SELECT s1, s2 FROM generate_series(1, 20) s1, generate_series(1, 20) s2);
# ANALYZE test;
# SET yb_enable_base_scans_cost_model = ON;

In the following examples, the keys we are close by and the optimization will help reduce the number of seeks. The cost model over-estimates the seeks.

# /*+ IndexScan(test) */ EXPLAIN (ANALYZE, DEBUG, DIST) SELECT * FROM test WHERE k2 IN (4, 5, 6, 7);
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=180.00..1130.52 rows=80 width=8) (actual time=3.339..3.405 rows=80 loops=1)
   Index Cond: (k2 = ANY ('{4,5,6,7}'::integer[]))
...
   Metric rocksdb_number_db_seek: 41.000
   Metric rocksdb_number_db_next: 140.000
...
   Estimated Seeks: 100
   Estimated Nexts: 200
...

# EXPLAIN (ANALYZE, DEBUG, DIST) SELECT * FROM test WHERE k2 IN (4, 6, 8, 10);
colodb=# /*+ IndexScan(test) */ EXPLAIN (ANALYZE, DEBUG, DIST) SELECT * FROM test WHERE k2 IN (4, 6, 8, 10);
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=180.00..1130.52 rows=80 width=8) (actual time=6.214..6.278 rows=80 loops=1)
   Index Cond: (k2 = ANY ('{4,6,8,10}'::integer[]))
...
   Metric rocksdb_number_db_seek: 41.000
   Metric rocksdb_number_db_next: 200.000
...
   Estimated Seeks: 100
   Estimated Nexts: 200
...

In the following case however, when keys are further apart, DocDB must perform seeks. In this case, the estimates from the model are correct.

# /*+ IndexScan(test) */ EXPLAIN (ANALYZE, DEBUG, DIST) SELECT * FROM test WHERE k2 IN (4, 7, 10, 13);
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=180.00..1130.52 rows=80 width=8) (actual time=3.632..3.695 rows=80 loops=1)
   Index Cond: (k2 = ANY ('{4,7,10,13}'::integer[]))
...
   Metric rocksdb_number_db_seek: 101.000
   Metric rocksdb_number_db_next: 200.000
...
   Estimated Seeks: 100
   Estimated Nexts: 200
...

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@gauravk-in gauravk-in added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Feb 5, 2025
@gauravk-in gauravk-in self-assigned this Feb 5, 2025
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue status/awaiting-triage Issue awaiting triage
Projects
None yet
Development

No branches or pull requests

2 participants