Using the Oracle Bulk Loader -2

本文介绍如何使用Oracle批量加载工具(sqlldr)进行数据导入。包括创建控制文件与数据文件的方法,以及如何处理日期类型和长字符串等特殊数据类型。同时提供了加载数据的命令示例。

Overview

To use the Oracle bulk loader, you need a control file, which specifies how data should be loaded into the database; and a data file, which specifies what data should be loaded. You will learn how to create these files in turn.

Creating the Control File

A simple control file has the following form:
LOAD DATA
INFILE <datafile>
APPEND INTO TABLE <tablename>
FIELDS TERMINATED BY '<separator>'
(<list load="" to="" names="" attribute="" all="" of="">)</list></separator></tablename></datafile>
  • <datafile></datafile> is the name of the data file. If you did not give a file name extension for <datafile></datafile>, Oracle will assume the default extension ".dat". Therefore, it is a good idea to name every data file with an extension, and specify the complete file name with the extension.
  • <tablename> </tablename> is the name of the table to which data will be loaded. Of course, it should have been created already before the bulk load operation.
  • The optional keyword APPEND says that data will be appended to <tablename> </tablename> . If APPEND is omitted, the table must be empty before the bulk load operation or else an error will occur.
  • <separator></separator> specifies the field separator for your data file. This can be any string. It is a good idea to use a string that you know will never appear in the data, so the separator will not be confused with data fields.
  • Finally, list the names of attributes of <tablename> </tablename> that are set by your data file, separated by commas and enclosed in parentheses. This list need not be the complete list of attributes in the actual schema of the table, nor must it be arranged in the same order as the attributes when the table was created -- sqlldr will match attributes to by their names in the table schema. Any attributes unspecified in the list of attributes will be set to NULL.
As a concrete example, here are the contents of a control file test.ctl:
LOAD DATA
INFILE test.dat
INTO TABLE test
FIELDS TERMINATED BY '|'
(i, s)

Creating the Data File

Each line in the data file specifies one tuple to be loaded into <tablename> </tablename> . It lists, in order, values for the attributes in the list specified in the control file, separated by <separator></separator>. As a concrete example, test.dat might look like:
1|foo
2|bar
3| baz
Recall that the attribute list of test specified in test.ctl is (i, s), where i has the type int, and s has the type char(10). As the result of loading test.dat, the following tuples are inserted into test:
(1, 'foo')
(2, 'bar')
(3, ' baz')
Some Notes of Warning
  • Note that the third line of test.dat has a blank after "|". This blank is not ignored by the loader. The value to be loaded for attribute s is ' baz', a four-character string with a leading blank. It is a common mistake to assume that 'baz', a three-character string with no leading blank, will be loaded instead. This can lead to some very frustrating problems that you will not notice until you try to query your loaded data, because ' baz' and 'baz' are different strings.

  • Oracle literally considers every single line to be one tuple, even an empty line! When it tries to load data from an empty line, however, an error would occur and the tuple will be rejected. Some text editors love to add multiple newlines to the end of a file; if you see any strange errors in your .log file about tuples with all NULL columns, this may be the cause. It shouldn't affect other tuples loaded.

  • If you are using a Microsoft text editor, such as MSWord, you will find that Bill Gates believes in ending lines with the sequence <cr> (carriage return) <lf> (line feed). The UNIX world uses only <lf>, so each <cr> becomes ^M, the last character of strings in your load file. That makes it impossible for you ever to match a stored string in an SQL query. Here's how you remove ^M symbols from your file. Let's say the file with ^M symbols is bad_myRel.dat. Then the following command will create myRel.dat without ^M symbols: </cr></lf></lf></cr>

                 cat bad_myRel.dat | tr -d '\015' > myRel.dat

    If you're an emacs fan, type in the following sequence to modify your current buffer:

                    ESC-x replace-string CTRL-q CTRL-m ENTER ENTER

    Loading Your Data

    The Oracle bulk loader is called sqlldr. It is a UNIX-level command, i.e., it should be issued directly from your UNIX shell, rather than within sqlplus. A bulk load command has the following form:
    sqlldr <yourname> control=<ctlfile> log=<logfile> bad=<badfile></badfile></logfile></ctlfile></yourname>
    Everything but sqlldr is optional -- you will be prompted for your username, password, and control file. <ctlfile></ctlfile> is the name of the control file. If no file name extension is provided, sqlldr will assume the default extension ".ctl". The name of the data file is not needed on the command line because it is specified within the control file. You may designate <logfile></logfile> as the log file. If no file name extension is provided, ".log" will be assumed. sqlldr will fill the log file with relevant information about the bulk load operation, such as the number of tuples loaded, and a description of errors that may have occurred. Finally, you may designate <badfile></badfile> as the file where bad tuples (any tuples for which an error occurs on an attempt to load them) are recorded (if they occur). Again, if no file extension is specified, Oracle uses ".bad". If no log file or bad file are specified, sqlldr will use the name of the control file with the .log and .bad extensions, respectively.

    As a concrete example, if sally wishes to run the control file test.ctl and have the log output stored in test.log, then she should type

    sqlldr sally control=test.ctl log=test.log
    Reminder: Before you run any Oracle commands such as sqlldr and sqlplus, make sure you have already set up the correct environment by sourcing /afs/ir/class/cs145/all.env (see Getting Started With Oracle).

    Loading Without a Separate Data File

    It is possible to use just the control file to load data, instead of using a separate data file. Here is an example:
    LOAD DATA
    INFILE *
    INTO TABLE test
    FIELDS TERMINATED BY '|'
    (i, s)
    BEGINDATA
    1|foo
    2|bar
    3| baz
    The trick is to specify "*" as the name of the data file, and use BEGINDATA to start the data section in the control file.

    Loading DATE Data

    The DATE datatype can have its data loaded in a format you specify with considerable flexibility. First, suppose that you have created a relation with an attribute of type DATE:
    CREATE TABLE foo (
    i int,
    d date
    );
    In the control file, when you describe the attributes of foo being loaded, you follow the attribute d by its type DATE and a date mask. A date mask specifies the format your date data will use. It is a quoted string with the following conventions:
    • Sequences of d, m, or y, denote fields in your data that will be interpreted as days, months, years, respectively. As with almost all of SQL, capitals are equally acceptable, e.g., MM is a month field.
    • The lengths of these fields specify the maximum lengths for the corresponding values in your data. However, the data can be shorter.
    • Other characters, such as dash, are treated literally, and must appear in your data if you put them in the mask.
    Here is an example control file:
    LOAD DATA
    INFILE *
    INTO TABLE foo
    FIELDS TERMINATED BY '|'
    (i, d DATE 'dd-mm-yyyy')
    BEGINDATA
    1|01-01-1990
    2|4-1-1998
    Notice that, as illustrated by the second tuple above, a field can be shorter than the corresponding field in the date mask. The punctuation "-" tells the loader that the day and month fields of the second tuple terminate early.

    Loading Long Strings

    String fields that may be longer than 255 characters, such as for data types CHAR(2000) or VARCHAR(4000), require a special CHAR(n) declaration in the control file. For example, if table foo was created as

    CREATE TABLE foo (x VARCHAR(4000));

    Then a sample control file should look like:

    LOAD DATA
    INFILE <datafile>
    INTO TABLE foo
    FIELDS TERMINATED BY '|'
    (x CHAR(4000))</datafile>
    Note that the declaration takes the form CHAR(n) regardless of whether the field type was declared as CHAR or VARCHAR.

    Entering NULL Values

    You may specify NULL values simply by entering fields in your data file without content. For example, if we were entering integer values into a table with schema (a, b, c) specified in the .ctl file, the following lines in the data file:
    3||5
    |2|4
    1||6
    ||7
    would result in inserting the following tuples in the relation:
    (3, NULL, 5)
    (NULL, 2, 4)
    (1, NULL, 6)
    (NULL, NULL, 7)
    Keep in mind that any primary keys or other constraints requiring that values be non-NULL will reject tuples for which those attributes are unspecified.

    Note:If the final field in a given row of your data file will be unspecified (NULL), you have to include the line TRAILING NULLCOLS after the FIELDS TERMINATED BY line in your control file, otherwise sqlldr will reject that tuple. sqlldr will also reject a tuple whose columns are all set to NULL in the data file.

    If you do not wish to enter values for any row of a given column, you can, as mentioned above, leave that column out of the attribute list altogether.

基于遗传算法的新的异构分布式系统任务调度算法研究(Matlab代码实现)内容概要:本文档围绕基于遗传算法的异构分布式系统任务调度算法展开研究,重点介绍了一种结合遗传算法的新颖优化方法,并通过Matlab代码实现验证其在复杂调度问题中的有效性。文中还涵盖了多种智能优化算法在生产调度、经济调度、车间调度、无人机路径规划、微电网优化等领域的应用案例,展示了从理论建模到仿真实现的完整流程。此外,文档系统梳理了智能优化、机器学习、路径规划、电力系统管理等多个科研方向的技术体系与实际应用场景,强调“借力”工具与创新思维在科研中的重要性。; 适合人群:具备一定Matlab编程基础,从事智能优化、自动化、电力系统、控制工程等相关领域研究的研究生及科研人员,尤其适合正在开展调度优化、路径规划或算法改进类课题的研究者; 使用场景及目标:①学习遗传算法及其他智能优化算法(如粒子群、蜣螂优化、NSGA等)在任务调度中的设计与实现;②掌握Matlab/Simulink在科研仿真中的综合应用;③获取多领域(如微电网、无人机、车间调度)的算法复现与创新思路; 阅读建议:建议按目录顺序系统浏览,重点关注算法原理与代码实现的对应关系,结合提供的网盘资源下载完整代码进行调试与复现,同时注重从已有案例中提炼可迁移的科研方法与创新路径。
【微电网】【创新点】基于非支配排序的蜣螂优化算法NSDBO求解微电网多目标优化调度研究(Matlab代码实现)内容概要:本文提出了一种基于非支配排序的蜣螂优化算法(NSDBO),用于求解微电网多目标优化调度问题。该方法结合非支配排序机制,提升了传统蜣螂优化算法在处理多目标问题时的收敛性和分布性,有效解决了微电网调度中经济成本、碳排放、能源利用率等多个相互冲突目标的优化难题。研究构建了包含风、光、储能等多种分布式能源的微电网模型,并通过Matlab代码实现算法仿真,验证了NSDBO在寻找帕累托最优解集方面的优越性能,相较于其他多目标优化算法表现出更强的搜索能力和稳定性。; 适合人群:具备一定电力系统或优化算法基础,从事新能源、微电网、智能优化等相关领域研究的研究生、科研人员及工程技术人员。; 使用场景及目标:①应用于微电网能量管理系统的多目标优化调度设计;②作为新型智能优化算法的研究与改进基础,用于解决复杂的多目标工程优化问题;③帮助理解非支配排序机制在进化算法中的集成方法及其在实际系统中的仿真实现。; 阅读建议:建议读者结合Matlab代码深入理解算法实现细节,重点关注非支配排序、拥挤度计算和蜣螂行为模拟的结合方式,并可通过替换目标函数或系统参数进行扩展实验,以掌握算法的适应性与调参技巧。
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值