Þ E. OTHER THINGS BEING EQUAL, THE WINDOW WITH THE LONGEST TO RUN...

10. þ E. Other things being equal, the window with the longest to run will open

(or remain open).

ý A, B, C, and D. Only one window can be open at once, and windows

can overlap. The algorithms that manage overlapping windows are not well

documented, but neither C nor D is definitively correct.

CHAPTER 23

Moving and Reorganizing Data

Exam Objectives

In this chapter you will learn to

• 052.17.1 Describe and Use Methods to Move Data

(Directory Objects, SQL*Loader, External Tables)

• 052.17.2 Explain the General Architecture of Oracle Data Pump

• 052.17.3 Use Data Pump Export and Import to Move Data Between

Oracle Databases

• 053.16.2 Describe the Concepts of Transportable Tablespaces and Databases

• 053.16.1 Manage Resumable Space Allocation

• 053.16.3 Reclaim Wasted Space from Tables and Indexes by Using the Segment

Shrink Functionality

831

There are many situations where bulk transfers of data into a database or between

databases are necessary. Common cases include populating a data warehouse with

data extracted from transaction processing systems, or copying data from live systems

to test or development environments. As entering data with standard INSERT statements

is not always the best way to do large-scale operations, the Oracle database comes

with facilities designed for bulk operations. These are SQL*Loader and Data Pump.

There is also the option of reading data without ever actually inserting it into the

database; this is accomplished through the use of external tables.

Data loading operations, as well as DML, may fail because of space problems. This

can be an appalling waste of time. The resumable space allocation mechanism can

provide a way to ameliorate the effect of space problems. There are also techniques to

reclaim space that is inappropriately assigned to objects and make it available for reuse.

SQL*Loader

In many cases you will be faced with a need to do a bulk upload of datasets generated

from some third-party system. This is the purpose of SQL*Loader. The input files may

be generated by anything, but as long as the layout conforms to something that

SQL*Loader can understand, it will upload the data successfully. Your task as DBA

is to configure a SQL*Loader controlfile that can interpret the contents of the input

datafiles; SQL*Loader will then insert the data.

Architecturally, SQL*Loader is a user process like any other: it connects to the

database via a server process. To insert rows, it can use two techniques: conventional

or direct path. A conventional insert uses absolutely ordinary INSERT statements. The

SQL*Loader user process constructs an INSERT statement with bind variables in the

VALUES clause and then reads the source datafile to execute the INSERT once for each

row to be inserted. This method uses the database buffer cache and generates undo

and redo data: these are INSERT statements like any others, and normal commit

processing makes them permanent.

The direct path load bypasses the database buffer cache. SQL*Loader reads the

source datafile and sends its contents to the server process. The server process then

assembles blocks of table data in its PGA and writes them directly to the datafiles. The

write is above the high water mark of the table and is known as a data save. The high

water mark is a marker in the table segment above which no data has ever been

written: the space above the high water mark is space allocated to the table that has

not yet been used. Once the load is complete, SQL*Loader shifts the high water mark

up to include the newly written blocks, and the rows within them are then immediately

visible to other users. This is the equivalent of a COMMIT. No undo is generated, and

if you wish, you can switch off the generation of redo as well. For these reasons, direct

path loading is extremely fast, and furthermore it should not impact on your end

users, because interaction with the SGA is kept to a minimum.

Direct path loads are very fast, but they do have drawbacks:

• Referential integrity constraints must be dropped or disabled for the duration

of the operation.

• INSERT triggers do not fire.

• The table will be locked against DML from other sessions.

• It is not possible to use direct path for clustered tables.

These limitations are a result of the lack of interaction with the SGA while the

load is in progress.

EXAM TIP Only UNIQUE, PRIMARY KEY, and NOT NULL constraints are

enforced during a direct path load; INSERT triggers do not fire; the table is

locked for DML.

SQL*Loader uses a number of files. The input datafiles are the source data that it will

upload into the database. The controlfile is a text file with directives telling SQL*Loader

how to interpret the contents of the input files, and what to do with the rows it extracts

P AR T III

from them. Log files summarize the success (or otherwise) of the job, with detail of any

errors. Rows extracted from the input files may be rejected by SQL*Loader (perhaps

because they do not conform to the format expected by the controlfile) or by the

database (for instance, insertion might violate an integrity constraint); in either case

they are written out to a bad file. If rows are successfully extracted from the input but

rejected because they did not match some record selection criterion, they are written

out to a reject file.

The controlfile is a text file instructing SQL*Loader on how to process the input

datafiles. It is possible to include the actual data to be loaded on the controlfile, but you

would not normally do this; usually, you will create one controlfile and reuse it, on a

regular basis, with different input datafiles. The variety of input formats that SQL*Loader

can understand is limited only by your ingenuity in constructing a controlfile.

Consider this table:

SQL> desc dept; Name Null? Type --- --- --- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13)

And this source datafile, named DEPTS.TXT: