Parallel Data Loading Models

This post describes how to speed up data loading with concurrent load sessions and direct path loads. Concurrent conventional path loads simply run several load sessions at the same time on a multiple-CPU system, while concurrent direct path loads offer fast append-only loading, subject to some restrictions and without index maintenance. For direct path loads you need to pay attention to file placement, storage attributes, and temporary segment handling, and you should rebuild indexes and enable constraints manually (in parallel) after the load to improve efficiency.

  • Concurrent Conventional Path Loads

If you want faster loading, then you should consider using concurrent conventional path loads. That is, use multiple load sessions executing concurrently on a multiple-CPU system. Split the input data files into separate files on logical record boundaries, and then load each such input data file with a conventional path load session.

You can use this technique to load the same or different objects concurrently with no restrictions.
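
For example, assuming the input has been split on logical record boundaries into two files, the two conventional path sessions can simply be started at the same time (for instance, from two terminals); load1.ctl, load2.ctl, part1.dat, and part2.dat are hypothetical names:

sqlldr USERID=scott CONTROL=load1.ctl DATA=part1.dat

sqlldr USERID=scott CONTROL=load2.ctl DATA=part2.dat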

  • Concurrent Direct Path Loads

1. Restrictions on Parallel Direct Path Loads

  1. Neither local nor global indexes can be maintained by the load.
  2. Rows can only be appended. REPLACE, TRUNCATE, and INSERT cannot be used (this is due to the individual loads not being coordinated). If you must truncate a table before a parallel load, then you must do it manually.
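
For example, a manual truncate of a hypothetical target table emp, issued before starting the concurrent load sessions:

TRUNCATE TABLE emp;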

In addition, the following are automatically disabled during the load:

  1. Referential integrity constraints (foreign keys)
  2. Triggers
  3. CHECK constraints, unless the ENABLE_CHECK_CONSTRAINTS control file option is used

Also, if a parallel direct path load is being applied to a single partition, you should partition the data first; otherwise, the overhead of record rejection due to a partition mismatch slows down the load.
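
If these objects are still disabled when the loads finish, you can re-enable them manually afterwards; for example, with a hypothetical foreign key emp_dept_fk and trigger emp_audit_trg on table emp:

ALTER TABLE emp ENABLE CONSTRAINT emp_dept_fk;

ALTER TRIGGER emp_audit_trg ENABLE;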

For concurrent loads into the same segment (intrasegment parallelism), you must specify the PARALLEL option.

For example, start three concurrent direct path load sessions:

sqlldr USERID=scott CONTROL=load1.ctl DIRECT=TRUE PARALLEL=TRUE

sqlldr USERID=scott CONTROL=load2.ctl DIRECT=TRUE PARALLEL=TRUE

sqlldr USERID=scott CONTROL=load3.ctl DIRECT=TRUE PARALLEL=TRUE

When you perform a parallel load, SQL*Loader creates temporary segments for each concurrent session and then merges the segments upon completion. The segment created from the merge is then added to the existing segment in the database above the segment's high-water mark. The last extent used of each segment for each loader session is trimmed of any free space before being combined with the other extents of the SQL*Loader session.

Because a parallel direct path load does not maintain indexes, you must rebuild them manually after the load; use the PARALLEL clause when rebuilding to speed this up.
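
For example, rebuilding a hypothetical index emp_name_idx with a parallel degree of 4, then optionally resetting its degree afterwards:

ALTER INDEX emp_name_idx REBUILD PARALLEL 4 NOLOGGING;

ALTER INDEX emp_name_idx NOPARALLEL;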

  2. Parameters for Parallel Direct Path Loads

  2.1 Using the FILE Parameter to Specify Temporary Segments

To allow for maximum I/O throughput, Oracle recommends that each concurrent direct path load session use files located on different disks.

sqlldr USERID=scott CONTROL=load1.ctl DIRECT=TRUE PARALLEL=TRUE FILE='/dat/users01.dat'

sqlldr USERID=scott CONTROL=load2.ctl DIRECT=TRUE PARALLEL=TRUE FILE='/dat/users02.dat'

sqlldr USERID=scott CONTROL=load3.ctl DIRECT=TRUE PARALLEL=TRUE FILE='/dat/users03.dat'

Using the FILE parameter has the following restrictions:

  1. For nonpartitioned tables: the specified file must be in the tablespace of the table being loaded.
  2. For partitioned tables, single-partition load: the specified file must be in the tablespace of the partition being loaded.
  3. For partitioned tables, full-table load: the specified file must be in the tablespace of all partitions being loaded; that is, all partitions must be in the same tablespace.
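
As a quick sanity check against these restrictions, you can look up which tablespace the table lives in and which database files belong to that tablespace (EMP and USERS are hypothetical names; querying DBA_DATA_FILES requires the appropriate privileges):

SELECT tablespace_name FROM user_tables WHERE table_name = 'EMP';

SELECT file_name FROM dba_data_files WHERE tablespace_name = 'USERS';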

  2.2 Using the STORAGE Parameter

You can use the STORAGE parameter to specify the storage attributes of the temporary segments allocated for a parallel direct path load. If the STORAGE parameter is not used, then the storage attributes of the segment containing the object (table, partition) being loaded are used. Also, when the STORAGE parameter is not specified, SQL*Loader uses a default of 2 KB for EXTENTS.

The STORAGE parameter can be specified only in the SQL*Loader control file, for example:

OPTIONS (STORAGE=(INITIAL 100M NEXT 100M PCTINCREASE 0))

Use of the STORAGE parameter to specify anything other than PCTINCREASE of 0, and INITIAL or NEXT values is strongly discouraged and may be silently ignored.

  3. Enabling Constraints After a Parallel Direct Path Load

PRIMARY KEY and UNIQUE KEY constraints create indexes on a table when they are enabled, and enabling them after a direct path load session can therefore take a significant amount of time if the table is very large.

You should consider enabling these constraints manually after a load (and not specifying the automatic enable feature). This lets you create the required indexes in parallel to save time before enabling the constraints.
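
A minimal sketch of this approach, assuming a hypothetical table emp with a disabled primary key constraint emp_pk on column empno: build the unique index in parallel first, then enable the constraint, which can use the existing index instead of building a new one.

CREATE UNIQUE INDEX emp_pk_idx ON emp (empno) PARALLEL 4 NOLOGGING;

ALTER TABLE emp ENABLE CONSTRAINT emp_pk;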
