Merge branch 'tickets/DM-46111'

lsst · Nov 15, 2024 · f151af9
2 parents 720292e + dff5ded
commit f151af9
Showing 92 changed files with 10,075 additions and 1,823 deletions.
diff --git a/doc/_static/ingest-options-pull.png b/doc/_static/ingest-options-pull.png
diff --git a/doc/_static/ingest-options-push.png b/doc/_static/ingest-options-push.png
diff --git a/doc/_static/ingest-options-read.png b/doc/_static/ingest-options-read.png
diff --git a/doc/_static/ingest-options.pptx b/doc/_static/ingest-options.pptx
diff --git a/doc/_static/ingest-table-types-dependent.png b/doc/_static/ingest-table-types-dependent.png
diff --git a/doc/_static/ingest-table-types-partitioned.png b/doc/_static/ingest-table-types-partitioned.png
diff --git a/doc/_static/ingest-table-types-regular.png b/doc/_static/ingest-table-types-regular.png
diff --git a/doc/_static/ingest-table-types.pptx b/doc/_static/ingest-table-types.pptx
diff --git a/doc/_static/ingest-trans-multiple-chunks.png b/doc/_static/ingest-trans-multiple-chunks.png
diff --git a/doc/_static/ingest-trans-multiple-one.png b/doc/_static/ingest-trans-multiple-one.png
diff --git a/doc/_static/ingest-trans-multiple-scattered.png b/doc/_static/ingest-trans-multiple-scattered.png
diff --git a/doc/_static/ingest-trans-multiple.pptx b/doc/_static/ingest-trans-multiple.pptx
diff --git a/doc/_static/ingest-transaction-fsm.png b/doc/_static/ingest-transaction-fsm.png
diff --git a/doc/_static/ingest-transaction-fsm.pptx b/doc/_static/ingest-transaction-fsm.pptx
diff --git a/doc/_static/ingest-transactions-aborted.png b/doc/_static/ingest-transactions-aborted.png
diff --git a/doc/_static/ingest-transactions-aborted.pptx b/doc/_static/ingest-transactions-aborted.pptx
diff --git a/doc/_static/ingest-transactions-failed.png b/doc/_static/ingest-transactions-failed.png
diff --git a/doc/_static/ingest-transactions-failed.pptx b/doc/_static/ingest-transactions-failed.pptx
diff --git a/doc/_static/ingest-transactions-resolved.png b/doc/_static/ingest-transactions-resolved.png
diff --git a/doc/_static/ingest-transactions-resolved.pptx b/doc/_static/ingest-transactions-resolved.pptx
diff --git a/doc/_static/subchunks.png b/doc/_static/subchunks.png
diff --git a/doc/admin/data-table-indexes.rst b/doc/admin/data-table-indexes.rst
diff --git a/doc/admin/director-index.rst b/doc/admin/director-index.rst
@@ -0,0 +1,108 @@
+.. _admin-director-index:
+
+Director Index
+==============
+
+The *director* indexes in Qserv are optional metadata tables associated with the *director* tables, which are explained in:
+
+- :ref:`ingest-api-concepts-table-types` (CONCEPTS)
+
+Each row in the index table refers to the corresponding row in the related *director* table. The association is done via
+the unique identifier of rows in the *director* table. In addition to the unique identifier, the index table also contains
+the number of the chunk (column ``chunkId``) that contains the row in the *director* table. The index table is used to speed up
+queries that use the primary keys of *director* tables to reference rows.
+
+Here is an example of the index table schema and the schema of the corresponding *director* table ``Object``:
+
+.. code-block:: sql
+
+    CREATE TABLE qservMeta.test101__Object (
+        objectId BIGINT NOT NULL,
+        chunkId INT NOT NULL,
+        subChunkId INT NOT NULL,
+        PRIMARY KEY (objectId)
+    );
+
+    CREATE TABLE test101.Object (
+        objectId BIGINT NOT NULL,
+        ..
+    );
+
+The index speeds up the following types of queries:
+
+- point queries (when an identifier is known)
+- ``JOIN`` queries (when the *director* table is used as a reference table by the dependent tables)
+
+Point queries can be executed without scanning all chunk tables of the *director* table. Once the chunk number is known,
+the query will be sent to the corresponding chunk table at a worker node where the table resides. For example,
+the following query can be several orders of magnitude faster with the index:
+
+.. code-block:: sql
+
+    SELECT * FROM test101.Object WHERE objectId = 123456;
+
+The index is optional. If the index table is not found in the Qserv Czar's database, queries will be executed
+by scanning all chunk tables of the *director* table.
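+
+The lookup path described above can be sketched as follows. This is a toy model only, not Qserv's implementation:
+it uses an in-memory SQLite database, the chunk table naming (``Object_<chunkId>``), the chunk numbers, and the helper
+names ``make_demo_db`` and ``chunks_to_query`` are assumptions made for the illustration, and the ``subChunkId``
+column is omitted for brevity.
+
+.. code-block:: python
+
+    import sqlite3
+
+    CHUNKS = (100, 200, 300)  # hypothetical chunk numbers
+
+
+    def make_demo_db(with_index=True):
+        """Build toy chunk tables and, optionally, the director index."""
+        conn = sqlite3.connect(":memory:")
+        for chunk in CHUNKS:
+            conn.execute(f"CREATE TABLE Object_{chunk} (objectId INTEGER PRIMARY KEY)")
+            conn.execute(f"INSERT INTO Object_{chunk} VALUES (?)", (chunk + 1,))
+        if with_index:
+            conn.execute(
+                "CREATE TABLE test101__Object ("
+                "objectId INTEGER PRIMARY KEY, chunkId INTEGER NOT NULL)"
+            )
+            conn.executemany(
+                "INSERT INTO test101__Object VALUES (?, ?)",
+                [(chunk + 1, chunk) for chunk in CHUNKS],
+            )
+        return conn
+
+
+    def chunks_to_query(conn, object_id):
+        """Return the chunk numbers a point query has to visit."""
+        has_index = conn.execute(
+            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'test101__Object'"
+        ).fetchone()
+        if has_index is None:
+            return list(CHUNKS)  # no index: every chunk table is scanned
+        row = conn.execute(
+            "SELECT chunkId FROM test101__Object WHERE objectId = ?", (object_id,)
+        ).fetchone()
+        return [row[0]] if row else []
+
+With the index in place, ``chunks_to_query`` returns a single chunk for a known identifier. Without it, the function
+returns every chunk, which mirrors the full scan described above.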
+
+The index table can be built in two ways:
+
+- Automatically by the Qserv Replication/Ingest system during transaction commit time if the corresponding flag
+  was set as ``auto_build_secondary_index=1`` when calling the database registration service:
+
+  - :ref:`ingest-db-table-management-register-db` (REST)
+
+  .. note::
+
+    The index tables that are built automatically will be MySQL-partitioned. The partitioning is done
+    to speed up the index construction process and to benefit from using the distributed transactions
+    mechanism implemented in the Qserv Ingest system:
+
+    - :ref:`ingest-api-concepts-transactions` (CONCEPTS)
+
+    Having too many partitions in the index table can slow down user queries that use the index. Another side
+    effect of the partitions is an increased size of the table. The partitions can be consolidated at the database
+    *publishing* stage as described in the following section:
+
+    - :ref:`ingest-api-concepts-publishing-data` (CONCEPTS)
+
+- Manually, on the *published* databases using the following service:
+
+  - :ref:`ingest-director-index-build` (REST)
+
+  Note that the index tables built by this service will not be partitioned.
+
+The following example illustrates rebuilding the index of the *director* table ``Object`` that resides in
+the *published* database ``test101``:
+
+.. code-block:: bash
+
+    curl localhost:25081/ingest/index/secondary \
+        -X POST -H "Content-Type: application/json" \
+        -d '{"database":"test101", "director_table":"Object","rebuild":1,"local":1}'
+
+.. warning::
+
+  The index rebuilding process can be time-consuming and potentially affect the performance of user query processing
+  in Qserv. Depending on the size of the *director* table, the process can take from several minutes to several hours.
+  For *director* tables exceeding 1 billion rows, the process can be particularly lengthy.
+  It's recommended to perform the index rebuilding during a maintenance window or when the system load is low.
+
+Notes on the MySQL table engine configuration for the index
+-----------------------------------------------------------
+
+The current implementation of the Replication/Ingest system offers the following storage engine options for
+the index table:
+
+- ``innodb``: https://mariadb.com/kb/en/innodb/
+- ``myisam``: https://mariadb.com/kb/en/myisam-storage-engine/
+
+Each engine has its own pros and cons.
+
+The ``innodb`` engine is the default choice. The option is controlled by the following configuration parameter of the Master
+Replication Controller:
+
+- ``(controller,director-index-engine)``
+
+The parameter can be set via the command line when starting the controller:
+
+- ``--controller-director-index-engine=<engine>``
diff --git a/doc/admin/index.rst b/doc/admin/index.rst
@@ -1,15 +1,14 @@
-.. warning::
 
-   **Information in this guide is known to be outdated.** A documentation sprint is underway which will
-   include updates and revisions to this guide.
+.. _admin:
 
-####################
-Administration Guide
-####################
+#####################
+Administrator's Guide
+#####################
 
 .. toctree::
    :maxdepth: 4
 
    k8s
-   qserv-ingest/index
-   test-set
+   row-counters
+   data-table-indexes
+   director-index
diff --git a/doc/admin/qserv-ingest/index.rst b/doc/admin/qserv-ingest/index.rst
diff --git a/doc/admin/row-counters.rst b/doc/admin/row-counters.rst
@@ -0,0 +1,176 @@
+
+.. _admin-row-counters:
+
+=========================
+Row counters optimization
+=========================
+
+.. _admin-row-counters-intro:
+
+Introduction
+------------
+
+Soon after the initial public deployment of Qserv, it was noticed that numerous users were executing the following query:
+
+.. code-block:: sql
+
+    SELECT COUNT(*) FROM <database>.<table>
+
+Typically, Qserv handles this query by distributing it to all workers, which then count the rows in each chunk table and aggregate the results
+at the Czar. This process is akin to the one used for *shared scan* (or simply *scan*) queries. The performance of these *scan* queries can
+fluctuate based on several factors:
+
+- The number of chunks in the target table
+- The number of workers available
+- The presence of other concurrent queries (particularly slower ones)
+
+In the best-case scenario, such a scan would take seconds; in the worst case, it could take many minutes or even hours.
+This has led to frustration among users, as this query appears to be (and indeed is) a very trivial non-scan query.
+
+To address this situation, Qserv includes a built-in optimization specifically for this type of query.
+Here's how it works: Qserv Czar maintains an optional metadata table for each data table, which stores the row count for each
+chunk. This metadata table is populated and managed by the Qserv Replication system. If the table is found, the query
+optimizer will use it to determine the number of rows in the table without the need to scan all the chunks.
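+
+The difference between the two execution paths can be sketched as follows. This is a simplified illustration, not
+the actual optimizer code: the function ``count_rows``, the ``scan_chunk`` callback, and the toy data are all
+hypothetical.
+
+.. code-block:: python
+
+    def count_rows(chunks, scan_chunk, counters=None):
+        """Return the total row count of a table.
+
+        chunks: iterable of chunk numbers.
+        scan_chunk: callable that counts the rows of one chunk table.
+        counters: optional {chunk: row_count} metadata.
+        """
+        if counters is not None:
+            # Optimized path: no chunk table is touched.
+            return sum(counters.values())
+        # Fallback path: one scan per chunk table.
+        return sum(scan_chunk(chunk) for chunk in chunks)
+
+
+    rows_per_chunk = {100: 5, 200: 7, 300: 0}  # toy data
+    scans = []  # records which chunk tables were scanned
+
+
+    def scan_chunk(chunk):
+        scans.append(chunk)
+        return rows_per_chunk[chunk]
+
+
+    total_scanned = count_rows(rows_per_chunk, scan_chunk)
+    total_from_counters = count_rows(rows_per_chunk, scan_chunk, counters=rows_per_chunk)
+
+Both calls produce the same total, but only the first one appends to ``scans``, since the second call is answered
+from the metadata alone.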
+
+Note that this optimization is currently optional for the following reasons:
+
+- Collecting counters requires scanning all chunk tables, which can be time-consuming. Performing this during
+  the catalog *publishing* phase would extend the ingest time and increase the likelihood of workflow instabilities
+  (generally, the longer an operation takes, the higher the probability of encountering infrastructure-related failures).
+- The counters are not necessary for the data ingest process itself. They are merely optimizations for query performance.
+- Building the counters before the ingested data have been quality assured (QA-ed) may not be advisable.
+- The counters may need to be rebuilt if the data are modified (e.g., after making corrections to the ingested catalogs).
+
+The following sections provide detailed instructions on building, managing, and utilizing the row counters, along with formal
+descriptions of the corresponding REST services.
+
+.. note::
+
+    In the future, the per-chunk counters will be used for optimizing another class of unconditional queries
+    presented below:
+
+    .. code-block:: sql
+
+        SELECT * FROM <database>.<table> LIMIT <N>
+        SELECT `col`,`col2` FROM <database>.<table> LIMIT <N>
+
+    For these "indiscriminate" data probes, Qserv would dispatch chunk queries to a subset of random chunks that have enough
+    rows to satisfy the requirements specified in ``LIMIT <N>``.
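+
+One way such a selection could work is sketched below. This is purely illustrative, since the optimization is described
+above as future work: ``pick_chunks`` and the ``counters`` mapping are hypothetical names, and the strategy shown
+(shuffle the non-empty chunks, then take them until enough rows are covered) is an assumption.
+
+.. code-block:: python
+
+    import random
+
+
+    def pick_chunks(counters, limit, rng=random):
+        """Pick random non-empty chunks until their row counts add up to ``limit``."""
+        candidates = [chunk for chunk, rows in counters.items() if rows > 0]
+        rng.shuffle(candidates)
+        picked, covered = [], 0
+        for chunk in candidates:
+            if covered >= limit:
+                break
+            picked.append(chunk)
+            covered += counters[chunk]
+        return picked
+
+If the table has fewer than ``<N>`` rows in total, the function returns all non-empty chunks.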
+
+.. _admin-row-counters-build:
+
+Building and deploying
+----------------------
+
+.. warning::
+
+    Depending on the scale of a catalog (the data size of the affected table), this operation may take
+    a while to complete.
+
+.. note::
+
+    Note that the same operation can be performed at catalog publishing time, as explained in:
+
+    - :ref:`ingest-db-table-management-publish-db` (REST)
+
+    The decision to perform this operation during catalog publishing or as a separate step, as described in this document,
+    is left to the discretion of Qserv administrators or developers of the ingest workflows. It is generally recommended
+    to make it a separate stage in the ingest workflow. This approach can expedite the overall transition time of a catalog
+    to its final published state. Ultimately, row counters optimization is optional and does not impact the core functionality
+    of Qserv or the query results presented to users.
+
+To build and deploy the counters, use the following REST service:
+
+- :ref:`ingest-row-counters-deploy` (REST)
+
+The service needs to be invoked for every table in the ingested catalog. Here is a typical example of using this service,
+which will work even if the same operation was performed previously:
+
+.. code-block:: bash
+
+    curl http://localhost:25080/ingest/table-stats \
+      -X POST -H "Content-Type: application/json" \
+      -d '{"database":"test101",
+           "table":"Object",
+           "overlap_selector":"CHUNK_AND_OVERLAP",
+           "force_rescan":1,
+           "row_counters_state_update_policy":"ENABLED",
+           "row_counters_deploy_at_qserv":1,
+           "auth_key":""}'
+
+This method is applicable to all table types: *director*, *dependent*, *ref-match*, or *regular* (fully replicated).
+If the counters already exist in the Replication system's database, they will be rescanned and redeployed.
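+
+Since the service has to be invoked once per table, a workflow might generate the request bodies in a loop. The sketch
+below only builds the JSON bodies using the parameters shown in the example above and does not send them. The helper
+``table_stats_requests`` and the second table name ``Source`` are hypothetical.
+
+.. code-block:: python
+
+    import json
+
+
+    def table_stats_requests(database, tables, auth_key=""):
+        """Return one JSON request body per table for the table-stats service."""
+        return [
+            json.dumps({
+                "database": database,
+                "table": table,
+                "overlap_selector": "CHUNK_AND_OVERLAP",
+                "force_rescan": 1,
+                "row_counters_state_update_policy": "ENABLED",
+                "row_counters_deploy_at_qserv": 1,
+                "auth_key": auth_key,
+            })
+            for table in tables
+        ]
+
+
+    for body in table_stats_requests("test101", ["Object", "Source"]):
+        print(body)
+
+Each printed body could then be passed to ``curl`` via the ``-d`` option, as in the example above.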
+
+It is advisable to compare Qserv's performance for executing the aforementioned queries before and after running this operation.
+Typically, if the table statistics are available in Qserv, the result should be returned in a small fraction of
+a second (approximately 10 milliseconds) on a lightly loaded Qserv.
+
+.. _admin-row-counters-delete:
+
+Deleting
+--------
+
+In certain situations, such as when there is suspicion that the row counters were inaccurately scanned or during the quality
+assurance (QA) process of the ingested catalog, a data administrator might need to remove the counters and allow Qserv
+to perform a full table scan. This can be achieved using the following REST service:
+
+- :ref:`ingest-row-counters-delete` (REST)
+
+Similarly to the previously mentioned service, this one should also be invoked for each table requiring attention. Here is
+an example:
+
+.. code-block:: bash
+
+    curl http://localhost:25080/ingest/table-stats/test101/Object \
+      -X DELETE -H "Content-Type: application/json" \
+      -d '{"overlap_selector":"CHUNK_AND_OVERLAP","qserv_only":1,"auth_key":""}'
+
+Note that with the parameters shown above, the statistics will be removed from Qserv only.
+This means the system would not need to rescan the tables again if the statistics need to be rebuilt. The counters could simply
+be redeployed later at Qserv. To remove the counters from the Replication system's persistent state as well,
+the request should have ``qserv_only=0``.
+
+An alternative approach, detailed in the next section, is to instruct Qserv to bypass the counters for query optimization.
+
+
+.. _admin-row-counters-disable:
+
+Disabling the optimization at run-time
+---------------------------------------
+
+.. warning::
+
+    This is a global setting that affects all users of Qserv. All new queries will be run without the optimization.
+    It should be used with caution. Typically, it is intended for use by the Qserv data administrator to investigate
+    suspected issues with Qserv or the catalogs it serves.
+
+To complement the previously explained methods for scanning, deploying, or deleting row counters for query optimization,
+Qserv also supports a run-time switch. This switch can be turned on or off by submitting the following statements via
+the Qserv front-ends:
+
+.. code-block:: sql
+
+    SET GLOBAL QSERV_ROW_COUNTER_OPTIMIZATION = 1
+    SET GLOBAL QSERV_ROW_COUNTER_OPTIMIZATION = 0
+
+The default behavior of Qserv, when the variable is not set, is to enable the optimization for tables where the counters
+are available.
+
+.. _admin-row-counters-retrieve:
+
+Inspecting
+----------
+
+It's also possible to retrieve the counters from the Replication system's state using the following REST service:
+
+- :ref:`ingest-row-counters-inspect` (REST)
+
+Here is an example:
+
+.. code-block:: bash
+
+    curl http://localhost:25080/ingest/table-stats/test101/Object \
+      -X GET -H "Content-Type: application/json" \
+      -d '{"auth_key":""}'
+
+The retrieved information can be utilized for multiple purposes, including investigating potential issues with the counters,
+monitoring data distribution across chunks, or creating visual representations of chunk density maps. Refer to the REST service
+documentation for more details on this topic.
diff --git a/doc/admin/test-set.rst b/doc/admin/test-set.rst
diff --git a/doc/conf.py b/doc/conf.py
@@ -42,6 +42,7 @@
     r"^https://rubinobs.atlassian.net/wiki/",
     r"^https://rubinobs.atlassian.net/browse/",
     r"^https://www.slac.stanford.edu/",
+    r".*/_images/",
 ]
 
 html_additional_pages = {