binding in-lists in 10g

本文介绍Oracle 10g中新的IN-List绑定方法MEMBEROF,并通过性能对比展示了其与传统TABLE操作符的区别。实验表明,在小规模集合中MEMBEROF使用更简便,但在大规模数据集上效率远低于TABLE方法。

binding in-lists in 10g

This article demonstrates a new method for binding IN-lists in 10g, using the new collection condition MEMBER OF. Also included is a common pre-10g method and a performance comparison (with dramatic results which leads me to conclude that MEMBER OF is much simpler to use but should only be used with very small sets or lists).

Incidentally, how to bind IN-lists (i.e. how to pass a list of values in one parameter and have Oracle treat the single parameter as a list) is a common problem cited on many web forums. Unfortunately, what many developers do is construct a dynamic SQL statement and concatenate a delimited string to the end to represent the list of values. In addition to shifting from static to dynamic SQL, this approach also makes it impossible to use bind variables and, for form-based applications, also leaves the application wide open to SQL injection. An alternative to this is to turn the delimited string into a collection first and reference the collection in the SQL.

In this article, we will use examples that pass collections of data as IN-lists, rather than lists of values composed as delimited strings.

setup

First we will create some simple collection types to use in our examples, as follows.

SQL> CREATE OR REPLACE TYPE varchar2_ntt AS TABLE OF VARCHAR2(4000);
  2  /

Type created.
SQL> CREATE OR REPLACE TYPE number_ntt AS TABLE OF NUMBER;
  2  /

Type created.

in-list binding in previous versions

The correct way to bind an in-list, regardless of Oracle version from 8.0 onwards, is to have the list-values passed as a collection. We can then cast the collection to a pseudo-table (using the TABLE operator). We will see an example of this below. First we will create a simple procedure to find all objects in a list of categories. The list of categories will be passed to our procedure as a collection.

SQL> CREATE PROCEDURE which_objects( which_types_in IN varchar2_ntt ) AS
  2  BEGIN
  3     FOR r IN (SELECT object_name
  4               FROM   user_objects
  5               WHERE  object_type IN (SELECT column_value
  6                                      FROM   TABLE(which_types_in)))
  7     LOOP
  8        DBMS_OUTPUT.PUT_LINE( r.object_name );
  9     END LOOP;
 10  END;
 11  /

Procedure created.

Our client program generates the "in-list" collection and calls the procedure, as follows.

SQL> DECLARE
  2     nt_types varchar2_ntt := varchar2_ntt('TABLE','TYPE','PACKAGE');
  3  BEGIN
  4     which_objects( which_types_in => nt_types );
  5  END;
  6  /
TGT
PLSQL_PROFILER_DATA
PLSQL_PROFILER_UNITS
SRC
PROF_REPORT_UTILITIES
PLSQL_PROFILER_RUNS
PKG
XPLAN_OT
XPLAN_NTT
XPLAN
NUMBER_NTT
VARCHAR2_NTT
ETL_PKG

PL/SQL procedure successfully completed.

in-list binding in 10g

We will now see the 10g alternative using the new MEMBER OF condition for collections. The syntax is very simple and doesn't require us to cast the collection to a pseudo-table.

SQL> CREATE PROCEDURE which_objects_10g( which_types_in IN varchar2_ntt ) AS
  2  BEGIN
  3     FOR r IN (SELECT object_name
  4               FROM   user_objects
  5               WHERE  object_type MEMBER OF which_types_in)
  6     LOOP
  7        DBMS_OUTPUT.PUT_LINE( r.object_name );
  8     END LOOP;
  9  END;
 10  /

Procedure created.

We call this version of the procedure in the same way, as follows.

SQL> DECLARE
  2     nt_types varchar2_ntt := varchar2_ntt('TABLE','TYPE','PACKAGE');
  3  BEGIN
  4     which_objects_10g( which_types_in => nt_types );
  5  END;
  6  /
TGT
PLSQL_PROFILER_DATA
PLSQL_PROFILER_UNITS
SRC
PROF_REPORT_UTILITIES
PLSQL_PROFILER_RUNS
PKG
XPLAN_OT
XPLAN_NTT
XPLAN
NUMBER_NTT
VARCHAR2_NTT
ETL_PKG

PL/SQL procedure successfully completed.

a simple performance test

We will compare the new and old methods for IN-list binding under a couple of simple scenarios, as follows:

  • constraining a SQL statement as an IN-list; and
  • testing a literal value against an IN-list in PL/SQL.
test one: filtering a table based on an in-list

For our first performance test, we will compare methods for binding an IN-list to a SQL statement, using a version of Tom Kyte's RUNSTATS utility. We will create a table with 1 million rows and query it with an IN-list collection of approximately 100 keys. First, we create the sample table as follows.

SQL> CREATE TABLE million_rows
  2  AS
  3     SELECT ROWNUM AS id
  4     ,      RPAD('x',100) AS data
  5     FROM   dual
  6     CONNECT BY ROWNUM < 1000000;

Table created.

SQL> ALTER TABLE million_rows ADD
  2     CONSTRAINT million_rows_pk
  3     PRIMARY KEY (id);

Table altered.

SQL> BEGIN
  2     DBMS_STATS.GATHER_TABLE_STATS(
  3        USER, 'MILLION_ROWS', estimate_percent=>100
  4        );
  5  END;
  6  /

PL/SQL procedure successfully completed.

We will now create a small procedure to compare the new and old methods. This procedure will load a collection with a parameterised sample of IDs to test the TABLE operator and MEMBER OF condition at different volumes.

SQL> CREATE PROCEDURE compare_in_list( sample_in IN NUMBER ) IS
  2
  3     nt_ids number_ntt := number_ntt();
  4
  5     TYPE million_rows_aat IS TABLE OF million_rows%ROWTYPE
  6        INDEX BY PLS_INTEGER;
  7     aa_rows million_rows_aat;
  8
  9  BEGIN
 10
 11     /* Fetch a sample of data to use as the in-list... */
 12     EXECUTE IMMEDIATE
 13        'SELECT id FROM million_rows SAMPLE (' || sample_in || ')'
 14     BULK COLLECT INTO nt_ids;
 15
 16     DBMS_OUTPUT.PUT_LINE( 'IN-list size : ' || nt_ids.COUNT );
 17
 18     runstats_pkg.rs_start;
 19
 20     /* Fetch the matching records using the TABLE method... */
 21     SELECT * BULK COLLECT INTO aa_rows
 22     FROM   million_rows
 23     WHERE  id IN (SELECT column_value FROM TABLE(CAST(nt_ids AS number_ntt)));
 24
 25     runstats_pkg.rs_middle;
 26
 27     /* ...and the 10g MEMBER method... */
 28     SELECT * BULK COLLECT INTO aa_rows
 29     FROM   million_rows
 30     WHERE  id MEMBER OF nt_ids;
 31
 32     runstats_pkg.rs_stop(1000);
 33
 34  END;
 35  /

Procedure created.

We will test with a small sample of the MILLION_ROWS table (representing an IN-list of approximately 100 rows), as follows.

SQL> exec compare_in_list( 0.01 );
IN-list size : 107
Run1 ran in 110 hsecs
Run2 ran in 1606 hsecs
Run1 ran in 6.85% of the time




Name                                  Run1        Run2        Diff
STAT..recursive cpu usage                3       1,230       1,227
LATCH.multiblock read objects            0       2,156       2,156
LATCH.simulator hash latch             111       3,907       3,796
STAT..physical reads cache pre           0      14,112      14,112
STAT..free buffer inspected            128      15,169      15,041
STAT..consistent gets                  323      15,396      15,073
STAT..consistent gets from cac         323      15,396      15,073
STAT..session logical reads            323      15,396      15,073
STAT..free buffer requested            101      15,195      15,094
STAT..physical reads                   101      15,195      15,094
STAT..physical reads cache             101      15,195      15,094
LATCH.cache buffers lru chain          105      15,264      15,159
STAT..no work - consistent rea           1      15,385      15,384
STAT..table scan blocks gotten           0      15,385      15,385
LATCH.object queue header oper         210      30,458      30,248
LATCH.cache buffers chains             623      46,141      45,518
STAT..table scan rows gotten             0   1,000,000   1,000,000


Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
       1,435      99,876      98,441      1.44%

PL/SQL procedure successfully completed.

We can see a dramatic difference in performance between the two methods, with the new MEMBER OF condition taking 16 times longer than when we queried the collection as a pseudo-table. The key to this lack of performance is evident in the statistics, particularly the "table scan rows gotten". This indicates that Oracle is using a full table scan with the MEMBER OF method. We can verify this below with EXPLAIN PLAN (although we will use a literal collection rather than a bind variable for convenience).

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'MEMBER' FOR
  2     SELECT *
  3     FROM   million_rows
  4     WHERE  id MEMBER OF number_ntt(1,2,3);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'MEMBER'));

PLAN_TABLE_OUTPUT
--------------------------------------------------
Plan hash value: 1095651555

-------------------------------------------------- ...
| Id  | Operation         | Name         | Rows  | ...
-------------------------------------------------- ...
|   0 | SELECT STATEMENT  |              | 50000 | ...
|*  1 |  TABLE ACCESS FULL| MILLION_ROWS | 50000 | ...
-------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"MEMBER OF"NUMBER_NTT"(1,2,3))

13 rows selected.

As indicated by the RUNSTATS results, Oracle has chosen a full table scan for the query, with an estimated cardinality of 50,000 (a 20% selectivity) for the filter. We can compare this with the CBO's treatment of a normal IN-list, as follows.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'IN' FOR
  2     SELECT *
  3     FROM   million_rows
  4     WHERE  id IN (1,2,3);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'IN'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------
Plan hash value: 127701680

---------------------------------------------------------------- ...
| Id  | Operation                    | Name            | Rows  | ...
---------------------------------------------------------------- ...
|   0 | SELECT STATEMENT             |                 |     3 | ...
|   1 |  INLIST ITERATOR             |                 |       | ...
|   2 |   TABLE ACCESS BY INDEX ROWID| MILLION_ROWS    |     3 | ...
|*  3 |    INDEX RANGE SCAN          | MILLION_ROWS_PK |     3 | ...
---------------------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID"=1 OR "ID"=2 OR "ID"=3)

15 rows selected.

Oracle is able to recognise the list of literal values in this case and make the correct cardinality estimates, leading to an index range scan. Of course, this article is about parameterised IN-lists, so if we compare the new MEMBER method with the original TABLE operator method, we can see that Oracle handles the collection in a very different way.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'TABLE' FOR
  2     SELECT *
  3     FROM   million_rows
  4     WHERE  id IN (SELECT column_value FROM TABLE(number_ntt(1,2,3)));

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'TABLE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 3090652353

--------------------------------------------------------------------------- ...
| Id  | Operation                               | Name            | Rows  | ...
--------------------------------------------------------------------------- ...
|   0 | SELECT STATEMENT                        |                 |   255 | ...
|   1 |  NESTED LOOPS                           |                 |   255 | ...
|   2 |   SORT UNIQUE                           |                 |       | ...
|   3 |    COLLECTION ITERATOR CONSTRUCTOR FETCH|                 |       | ...
|   4 |   TABLE ACCESS BY INDEX ROWID           | MILLION_ROWS    |     1 | ...
|*  5 |    INDEX UNIQUE SCAN                    | MILLION_ROWS_PK |     1 | ...
--------------------------------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("ID"=VALUE(KOKBF$))

17 rows selected.

With the TABLE operator method, Oracle chooses a nested loops join using the unique index on MILLION_ROWS, hence the major performance improvement. This is due to the semantic difference between the two methods. With the TABLE operator, Oracle unnests the values from the collection and then uses these to probe the MILLION_ROWS_PK index. With the MEMBER condition, however, we only have a filter condition, not a join. We are asking for records from MILLION_ROWS where the ID is in the collection, rather than the other way round. This means that Oracle must probe the collection with values from the MILLION_ROWS.ID column. This has a dramatic effect on performance and there seems to be no way to reverse this behaviour.

There are a number of ways we can try to correct the cardinality or force Oracle to use an index, for example:

  • OPT_ESTIMATE hint, e.g. /*+ OPT_ESTIMATE(table, million_rows, scale_rows=0.0001) */ ;
  • CARDINALITY hint, e.g. /*+ CARDINALITY(million_rows, 3) */ ;
  • FIRST_ROWS hint;
  • INDEX or INDEX_RS hints, e.g. INDEX(_RS)(million_rows, million_rows_pk) ;
  • OPTIMIZER_INDEX_COST_ADJ parameter, e.g. ALTER SESSION SET optimizer_index_cost_adj = 5 .

With any of these methods, Oracle will not rewrite our query in any way. There simply is no way around the fact that we are filtering MILLION_ROWS against the collection. The best we can achieve is an index full scan, forced by an appropriate hint, as follows.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'INDEX' FOR
  2     SELECT /*+ INDEX(million_rows) */ *
  3     FROM   million_rows
  4     WHERE  id MEMBER OF number_ntt(1,2,3);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'INDEX'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------
Plan hash value: 857746450

--------------------------------------------------------------- ...
| Id  | Operation                   | Name            | Rows  | ...
--------------------------------------------------------------- ...
|   0 | SELECT STATEMENT            |                 | 50000 | ...
|   1 |  TABLE ACCESS BY INDEX ROWID| MILLION_ROWS    | 50000 | ...
|*  2 |   INDEX FULL SCAN           | MILLION_ROWS_PK | 50000 | ...
--------------------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ID"MEMBER OF"NUMBER_NTT"(1,2,3))

14 rows selected.

It appears that an index range scan is simply not available as an access path when using this new condition, because Oracle will not reverse the direction of the filter.

The above performance tests were conducted using many probes of a small collection. It follows, therefore, that the larger the collection, the greater the disparity in performance between TABLE and MEMBER, with the latter becoming almost unusable. Some example timings are as follows.

Sample SizeIN-List SizeIN TABLE() (s)MEMBER OF (s)
0.00150.15.1
0.011071.116.1
0.198413.7107.2
1.010,00021.4>1,200

Note that this behaviour was last tested on an 11g Release 1 database (11.1.0.6) and it is unchanged.

test two: probing an in-list with a single value

The previous tests measured the performance of a parameterised IN-list in SQL, which is the most common application for this technique. However, we can also use the MEMBER condition in PL/SQL tests. For our second performance test, we will repeatedly probe a parameterised list with a single value, again comparing the original TABLE operator with the new MEMBER condition. For completeness, we will also measure the performance of a loop through the parameterised list to search for the single value. We will use an anonymous block for these tests, as follows.

SQL> DECLARE
  2     in_list  number_ntt := number_ntt();
  3     v_cnt    PLS_INTEGER;
  4     v_hits   PLS_INTEGER := 0;
  5  BEGIN
  6
  7     /* Add 100 elements to the collection "IN-list"... */
  8     in_list.EXTEND(100);
  9     FOR i IN 1 .. 100 LOOP
 10        in_list(i) := i;
 11     END LOOP;
 12
 13     /* TABLE operator... */
 14     timer.snap;
 15     FOR i IN 1 .. 10000 LOOP
 16        SELECT COUNT(*) INTO v_cnt FROM TABLE(in_list) WHERE column_value = i;
 17        IF v_cnt > 0 THEN
 18           v_hits := v_hits + 1;
 19        END IF;
 20     END LOOP;
 21     timer.show('TABLE  (hits=' || v_hits || ')');
 22
 23     /* MEMBER condition... */
 24     v_hits := 0;
 25     timer.snap;
 26     FOR i IN 1 .. 10000 LOOP
 27        IF i MEMBER OF in_list THEN
 28           v_hits := v_hits + 1;
 29        END IF;
 30     END LOOP;
 31     timer.show('MEMBER (hits=' || v_hits || ')');
 32
 33     /* Collection loop... */
 34     v_hits := 0;
 35     timer.snap;
 36     FOR i IN 1 .. 10000 LOOP
 37        FOR ii IN 1 .. in_list.COUNT LOOP
 38           IF i = in_list(ii) THEN
 39              v_hits := v_hits + 1;
 40              EXIT;
 41           END IF;
 42        END LOOP;
 43     END LOOP;
 44     timer.show('LOOP   (hits=' || v_hits || ')');
 45
 46  END;
 47  /
[TABLE  (hits=100)] 1.22 seconds
[MEMBER (hits=100)] 0.04 seconds
[LOOP   (hits=100)] 0.21 seconds

PL/SQL procedure successfully completed.

As there is no table access involved, we have used a simple TIMER package to measure relative performance. We can see that for this type of IN-list probing, the new MEMBER condition is more efficient and effective than any other method, so it might be worth using this technique under certain circumstances.

further reading

For a good discussion of techniques for binding IN-lists prior to 10g, particularly when they are supplied as delimited strings instead of collections, see this article by William Robertson. For more information on new collection conditions and operators in 10g, see this oracle-developer.net article . The TIMER and RUNSTATS packages used in the performance tests for this article can be found on the Utilities page of this site.

source code

The source code for the examples in this article can be downloaded from here .

Adrian Billington, June 2004 (updated September 2008)

 

 

 

原文:http://www.oracle-developer.net/display.php?id=301

gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ docker-compose build --no-cache superset docker-compose down -v # 删除旧卷(谨慎!如有数据请备份) docker-compose up -d [+] Building 78.2s (12/12) FINISHED => [internal] load local bake definitions 0.0s => => reading from stdin 558B 0.0s => [internal] load build definition from Dockerfile.superset 0.0s => => transferring dockerfile: 1.32kB 0.0s => [internal] load metadata for docker.io/apache/superset:latest 0.2s => [internal] load .dockerignore 0.0s => => transferring context: 2B 0.0s => CACHED [1/6] FROM docker.io/apache/superset:latest@sha256:09735adaae3945c5a8fabbeca31a7954c8c74de195702ba61622aee8604d5c55 0.0s => => resolve docker.io/apache/superset:latest@sha256:09735adaae3945c5a8fabbeca31a7954c8c74de195702ba61622aee8604d5c55 0.0s => [2/6] RUN apt-get update && apt-get install -y --no-install-recommends build-essential libssl-dev libffi-dev python3-dev 39.8s => [3/6] RUN python -m ensurepip --upgrade || (wget https://bootstrap.pypa.io/get-pip.py -O /tmp/get-pip.py && python /tmp/get-pip.py --user) 3.0s => [4/6] RUN python -m pip --version 0.4s => [5/6] RUN python -m pip install --upgrade pip && python -m pip install "gevent>=21.12.0" psutil pymysql flask-appbuilder 21.2s => [6/6] RUN python -c "import gevent; print(f'✅ Successfully imported gevent {gevent.__version__}')" 0.4s => exporting to image 12.7s => => exporting layers 9.7s => => exporting manifest sha256:110799b44b40f865ae592e760327dbea89f0afe78fe517ca6ab7795fc458eb1b 0.0s => => exporting config sha256:73dad9617916977e888fa45071347822fbd95057cd2070b82b89ddf25cf8ea83 0.0s => => exporting attestation manifest sha256:62f7f32d24dba9199b0a69458cb80018db2d1c448676f5ed9814b128e183eb0d 0.0s => => exporting manifest list sha256:b80232b9847656b65d53b3ba460598c5cba5428d2ae0555e98010618df3a4f9c 0.0s => => naming to docker.io/library/superset-with-gevent:latest 0.0s => => unpacking to docker.io/library/superset-with-gevent:latest 2.9s => resolving provenance for metadata file 0.0s [+] Building 1/1 ✔ superset-with-gevent Built 0.0s [+] Running 6/6 ✔ Container superset_init Removed 0.1s ✔ Container superset_celery Removed 0.1s ✔ Container nginx Removed 0.7s ✔ Container superset_app Removed 1.6s ✔ Container redis Removed 0.5s ✔ Network superset-prod_superset-net Removed 0.4s [+] Running 6/6 ✔ Network superset-prod_superset-net Created 0.0s ✔ Container redis Started 0.6s ✔ Container superset_app Started 0.7s ✔ Container superset_celery Started 0.9s ✔ Container nginx Started 1.0s ✔ Container superset_init Started 0.8s gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ docker exec superset_app ps aux | grep gunicorn superset 1 0.3 0.3 34064 27992 ? Ss 00:50 0:00 /app/.venv/bin/python3 /app/.venv/bin/gunicorn -w 4 -k gevent --bind 0.0.0.0:8088 superset.app:create_app() superset 7 10.4 2.8 989260 224104 ? Sl 00:50 0:06 /app/.venv/bin/python3 /app/.venv/bin/gunicorn -w 4 -k gevent --bind 0.0.0.0:8088 superset.app:create_app() superset 8 10.3 2.8 989260 224564 ? Sl 00:50 0:06 /app/.venv/bin/python3 /app/.venv/bin/gunicorn -w 4 -k gevent --bind 0.0.0.0:8088 superset.app:create_app() superset 9 10.2 2.8 989092 224508 ? Sl 00:50 0:05 /app/.venv/bin/python3 /app/.venv/bin/gunicorn -w 4 -k gevent --bind 0.0.0.0:8088 superset.app:create_app() superset 10 10.8 2.8 989204 224768 ? Sl 00:50 0:06 /app/.venv/bin/python3 /app/.venv/bin/gunicorn -w 4 -k gevent --bind 0.0.0.0:8088 superset.app:create_app() gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ docker exec superset_app python -c "from PIL import Image; print('OK')" OK gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ docker logs superset_app | grep -i error [2025-10-22 00:50:25 +0000] [1] [INFO] Starting gunicorn 23.0.0 [2025-10-22 00:50:25 +0000] [1] [INFO] Listening at: http://0.0.0.0:8088 (1) [2025-10-22 00:50:25 +0000] [1] [INFO] Using worker: gevent [2025-10-22 00:50:25 +0000] [7] [INFO] Booting worker with pid: 7 [2025-10-22 00:50:25 +0000] [8] [INFO] Booting worker with pid: 8 [2025-10-22 00:50:25 +0000] [9] [INFO] Booting worker with pid: 9 [2025-10-22 00:50:26 +0000] [10] [INFO] Booting worker with pid: 10 2025-10-22 00:50:29,899:INFO:superset.initialization:Setting database isolation level to READ COMMITTED 2025-10-22 00:50:29,900:INFO:superset.initialization:Setting database isolation level to READ COMMITTED /app/.venv/lib/python3.10/site-packages/flask_limiter/extension.py:333: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend. warnings.warn( /app/.venv/lib/python3.10/site-packages/flask_limiter/extension.py:333: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend. warnings.warn( 2025-10-22 00:50:29,937:INFO:superset.initialization:Setting database isolation level to READ COMMITTED /app/.venv/lib/python3.10/site-packages/flask_limiter/extension.py:333: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend. warnings.warn( 2025-10-22 00:50:29,984:INFO:superset.initialization:Setting database isolation level to READ COMMITTED /app/.venv/lib/python3.10/site-packages/flask_limiter/extension.py:333: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend. warnings.warn( gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ docker compose superset_init Usage: docker compose [OPTIONS] COMMAND Define and run multi-container applications with Docker Options: --all-resources Include all resources, even those not used by services --ansi string Control when to print ANSI control characters ("never"|"always"|"auto") (default "auto") --compatibility Run compose in backward compatibility mode --dry-run Execute command in dry run mode --env-file stringArray Specify an alternate environment file -f, --file stringArray Compose configuration files --parallel int Control max parallelism, -1 for unlimited (default -1) --profile stringArray Specify a profile to enable --progress string Set type of progress output (auto, tty, plain, json, quiet) --project-directory string Specify an alternate working directory (default: the path of the, first specified, Compose file) -p, --project-name string Project name Management Commands: bridge Convert compose files into another model Commands: attach Attach local standard input, output, and error streams to a service's running container build Build or rebuild services commit Create a new image from a service container's changes config Parse, resolve and render compose file in canonical format cp Copy files/folders between a service container and the local filesystem create Creates containers for a service down Stop and remove containers, networks events Receive real time events from containers exec Execute a command in a running container export Export a service container's filesystem as a tar archive images List images used by the created containers kill Force stop service containers logs View output from containers ls List running compose projects pause Pause services port Print the public port for a port binding ps List containers publish Publish compose application pull Pull service images push Push service images restart Restart service containers rm Removes stopped service containers run Run a one-off command on a service scale Scale services start Start services stats Display a live stream of container(s) resource usage statistics stop Stop services top Display the running processes unpause Unpause services up Create and start containers version Show the Docker Compose version information volumes List volumes wait Block until containers of all (or specified) services stop. watch Watch build context for service and rebuild/refresh containers when files are updated Run 'docker compose COMMAND --help' for more information on a command. unknown docker command: "compose superset_init" gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ docker logs superset_init Waiting for Superset DB to be ready... Loaded your LOCAL configuration at [/etc/superset/superset_config.py] 2025-10-22 00:50:42,163:ERROR:superset.app:Failed to create app Traceback (most recent call last): File "/app/.venv/lib/python3.10/site-packages/superset/app.py", line 40, in create_app app_initializer.init_app() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 466, in init_app self.setup_db() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 662, in setup_db pessimistic_connection_handling(db.engine) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 998, in engine return self.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1017, in get_engine return connector.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 594, in get_engine self._engine = rv = self._sa.create_engine(sa_url, options) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1027, in create_engine return sqlalchemy.create_engine(sa_url, **engine_opts) File "<string>", line 2, in create_engine File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned return fn(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 544, in create_engine dbapi = dialect_cls.dbapi(**dbapi_args) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/mysql/pymysql.py", line 80, in dbapi return __import__("pymysql") ModuleNotFoundError: No module named 'pymysql' Traceback (most recent call last): File "/app/.venv/bin/superset", line 10, in <module> sys.exit(superset()) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ return self.main(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1078, in main rv = self.invoke(ctx) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1685, in invoke super().invoke(ctx) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke return __callback(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func return f(get_current_context(), *args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 355, in decorator app = __ctx.ensure_object(ScriptInfo).load_app() File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 309, in load_app app = locate_app(import_name, name) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 238, in locate_app return find_app_by_string(module, app_name) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 166, in find_app_by_string app = attr(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/superset/app.py", line 40, in create_app app_initializer.init_app() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 466, in init_app self.setup_db() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 662, in setup_db pessimistic_connection_handling(db.engine) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 998, in engine return self.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1017, in get_engine return connector.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 594, in get_engine self._engine = rv = self._sa.create_engine(sa_url, options) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1027, in create_engine return sqlalchemy.create_engine(sa_url, **engine_opts) File "<string>", line 2, in create_engine File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned return fn(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 544, in create_engine dbapi = dialect_cls.dbapi(**dbapi_args) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/mysql/pymysql.py", line 80, in dbapi return __import__("pymysql") ModuleNotFoundError: No module named 'pymysql' Loaded your LOCAL configuration at [/etc/superset/superset_config.py] 2025-10-22 00:50:45,273:ERROR:superset.app:Failed to create app Traceback (most recent call last): File "/app/.venv/lib/python3.10/site-packages/superset/app.py", line 40, in create_app app_initializer.init_app() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 466, in init_app self.setup_db() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 662, in setup_db pessimistic_connection_handling(db.engine) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 998, in engine return self.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1017, in get_engine return connector.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 594, in get_engine self._engine = rv = self._sa.create_engine(sa_url, options) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1027, in create_engine return sqlalchemy.create_engine(sa_url, **engine_opts) File "<string>", line 2, in create_engine File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned return fn(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 544, in create_engine dbapi = dialect_cls.dbapi(**dbapi_args) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/mysql/pymysql.py", line 80, in dbapi return __import__("pymysql") ModuleNotFoundError: No module named 'pymysql' Traceback (most recent call last): File "/app/.venv/bin/superset", line 10, in <module> sys.exit(superset()) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ return self.main(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1078, in main rv = self.invoke(ctx) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1685, in invoke super().invoke(ctx) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke return __callback(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func return f(get_current_context(), *args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 355, in decorator app = __ctx.ensure_object(ScriptInfo).load_app() File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 309, in load_app app = locate_app(import_name, name) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 238, in locate_app return find_app_by_string(module, app_name) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 166, in find_app_by_string app = attr(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/superset/app.py", line 40, in create_app app_initializer.init_app() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 466, in init_app self.setup_db() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 662, in setup_db pessimistic_connection_handling(db.engine) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 998, in engine return self.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1017, in get_engine return connector.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 594, in get_engine self._engine = rv = self._sa.create_engine(sa_url, options) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1027, in create_engine return sqlalchemy.create_engine(sa_url, **engine_opts) File "<string>", line 2, in create_engine File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned return fn(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 544, in create_engine dbapi = dialect_cls.dbapi(**dbapi_args) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/mysql/pymysql.py", line 80, in dbapi return __import__("pymysql") ModuleNotFoundError: No module named 'pymysql' Loaded your LOCAL configuration at [/etc/superset/superset_config.py] 2025-10-22 00:50:48,401:ERROR:superset.app:Failed to create app Traceback (most recent call last): File "/app/.venv/lib/python3.10/site-packages/superset/app.py", line 40, in create_app app_initializer.init_app() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 466, in init_app self.setup_db() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 662, in setup_db pessimistic_connection_handling(db.engine) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 998, in engine return self.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1017, in get_engine return connector.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 594, in get_engine self._engine = rv = self._sa.create_engine(sa_url, options) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1027, in create_engine return sqlalchemy.create_engine(sa_url, **engine_opts) File "<string>", line 2, in create_engine File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned return fn(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 544, in create_engine dbapi = dialect_cls.dbapi(**dbapi_args) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/mysql/pymysql.py", line 80, in dbapi return __import__("pymysql") ModuleNotFoundError: No module named 'pymysql' Traceback (most recent call last): File "/app/.venv/bin/superset", line 10, in <module> sys.exit(superset()) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ return self.main(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1078, in main rv = self.invoke(ctx) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1685, in invoke super().invoke(ctx) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) File "/app/.venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke return __callback(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func return f(get_current_context(), *args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 355, in decorator app = __ctx.ensure_object(ScriptInfo).load_app() File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 309, in load_app app = locate_app(import_name, name) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 238, in locate_app return find_app_by_string(module, app_name) File "/app/.venv/lib/python3.10/site-packages/flask/cli.py", line 166, in find_app_by_string app = attr(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/superset/app.py", line 40, in create_app app_initializer.init_app() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 466, in init_app self.setup_db() File "/app/.venv/lib/python3.10/site-packages/superset/initialization/__init__.py", line 662, in setup_db pessimistic_connection_handling(db.engine) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 998, in engine return self.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1017, in get_engine return connector.get_engine() File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 594, in get_engine self._engine = rv = self._sa.create_engine(sa_url, options) File "/app/.venv/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py", line 1027, in create_engine return sqlalchemy.create_engine(sa_url, **engine_opts) File "<string>", line 2, in create_engine File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned return fn(*args, **kwargs) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 544, in create_engine dbapi = dialect_cls.dbapi(**dbapi_args) File "/app/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/mysql/pymysql.py", line 80, in dbapi return __import__("pymysql") ModuleNotFoundError: No module named 'pymysql' ✅ Superset 初始化完成! gapinyc@DESKTOP-9QS7RL5:~/superset-prod$ # Dockerfile.superset FROM apache/superset:latest # Step 1: 切换为 root 用户(必须) USER root # Step 2: 安装系统级编译依赖 RUN apt-get update && \ apt-get install -y --no-install-recommends \ build-essential \ libssl-dev \ libffi-dev \ python3-dev \ libev-dev \ libjpeg-dev \ zlib1g-dev \ && rm -rf /var/lib/apt/lists/* # Step 3: 激活虚拟环境并重新安装 pip # 注意:官方镜像的 Python 虚拟环境位于 /app/.venv ENV VIRTUAL_ENV=/app/.venv ENV PATH="$VIRTUAL_ENV/bin:$PATH" # 方法一:使用 ensurepip 强制重建 pip(推荐) RUN python -m ensurepip --upgrade || \ (wget https://bootstrap.pypa.io/get-pip.py -O /tmp/get-pip.py && \ python /tmp/get-pip.py --user) # 验证 pip 是否可用 RUN python -m pip --version # Step 4: 安装所需 Python 包 RUN python -m pip install --upgrade pip && \ python -m pip install \ "gevent>=21.12.0" \ psutil \ pymysql \ flask-appbuilder \ pillow # Step 5: 验证 gevent 是否成功导入 RUN python -c "import gevent; print(f'✅ Successfully imported gevent {gevent.__version__}')" # Step 6: 切回普通用户(安全) USER superset 这是为什么呢
10-23
### 缓存替换算法中的改进型 Pseudo-LRU 实现原理 Pseudo-LRU(伪最近最少使用)是一种广泛应用于缓存系统中的近似 LRU 替换策略。由于标准 LRU 算法在硬件实现上复杂度较高,尤其是在多路组相联缓存中,因此提出了多种简化版本的 Pseudo-LRU 算法。 #### 基本原理 Pseudo-LRU 的核心思想是通过维护一棵二叉树结构来模拟 LRU 行为,但不需要记录完整的访问历史。对于一个 n 路组相联缓存,需要 (n - 1) 个位来表示每个缓存行的使用状态[^1]。例如,在 4 路组相联缓存中,需要 3 个位来决定哪一路是最久未使用的。 改进型 Pseudo-LRU 算法通常会引入额外的状态管理机制,以提高命中率并减少硬件开销。一些研究提出通过动态调整访问位或引入辅助计数器来优化传统 Pseudo-LRU 的性能。这些改进旨在更精确地跟踪缓存行的使用频率,并在替换时优先选择真正“最不常用”的缓存块。 #### 实现细节 改进型 Pseudo-LRU 的实现通常包括以下几个关键步骤: 1. **状态树维护**:使用位向量来表示每一层的选择结果。例如,在 8 路组相联缓存中,可以构建一个深度为 3 的二叉树,每个节点代表一次访问选择的结果。 2. **访问更新**:当某一路被访问时,沿着状态树路径更新相应的位,标记该路为“最近使用”。 3. **替换决策**:根据当前状态树决定替换哪一路,通常是选择最左侧或最右侧的未被频繁访问的缓存行。 以下是一个简化的 C++ 示例代码,展示如何维护和更新 Pseudo-LRU 状态树: ```cpp #include <iostream> #include <vector> class PseudoLRU { private: std::vector<int> tree; // 状态树 int associativity; public: PseudoLRU(int ways) : associativity(ways) { // 初始化状态树大小为 (associativity - 1) tree.resize(associativity - 1, 0); } void access(int index) { // 更新状态树,index 是被访问的缓存行索引 int node = 0; for (int i = 0; i < log2(associativity); ++i) { int direction = (index >> i) & 1; tree[node] = direction; node = 2 * node + 1 + direction; } } int replace() { // 根据状态树决定替换哪一路 int node = 0; int path = 0; for (int i = 0; i < log2(associativity); ++i) { int direction = tree[node]; path |= (direction << i); node = 2 * node + 1 + direction; } return path; } }; int main() { PseudoLRU plru(4); // 创建一个 4 路组相联的 Pseudo-LRU 缓存 plru.access(1); // 模拟访问第 1 路 plru.access(3); // 模拟访问第 3 路 std::cout << "Next replacement index: " << plru.replace() << std::endl; return 0; } ``` #### 研究论文 关于改进型 Pseudo-LRU 算法的研究论文有很多,其中一些经典文献包括: - *"A Study of Replacement Policies for High-Performance Cache Memories"*,该论文比较了多种缓存替换策略,并提出了基于访问模式预测的改进方案。 - *"An Efficient Implementation of Pseudo-LRU Algorithm in Multiway Set Associative Caches"*,讨论了一种低功耗、高效的 Pseudo-LRU 实现方法。 - *"Adaptive Pseudo-LRU Replacement Policy Based on Access Frequency Prediction"*,提出了一种自适应调整状态树的改进型 Pseudo-LRU 算法[^1]。 这些研究成果为现代处理器缓存设计提供了重要的理论支持和技术指导。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值