Oracle Data Pump 11g: Little-Known New Feature

While perusing the Oracle 11g Data Pump documentation recently, I noticed a parameter that was actually introduced back in Oracle 10g, but that I had missed at the time. The parameter is TABLE_EXISTS_ACTION, and it applies only to Data Pump Import.

Basically, this feature lets you decide how Data Pump Import should behave when a table it is loading already exists. The options are SKIP (the default), APPEND, TRUNCATE, and REPLACE. Do these look familiar? They are reminiscent of the SQL*Loader loading methods (INTO TABLE x INSERT/REPLACE/TRUNCATE/APPEND), and with the exception of SKIP they work the same way; see the sketch after the list below.

Here’s an explanation of each option.

  • SKIP: The default value for TABLE_EXISTS_ACTION. If the table exists, the table (and any data destined for it) is skipped and nothing is loaded.
  • TRUNCATE: This truncates the existing rows in the table and then loads the data from the dump file. Because the table already exists, no metadata (triggers, for example) is applied.
  • APPEND: This loads the data from the dump file after the existing rows. Again, because the table already exists, no metadata is applied.
  • REPLACE: This drops the existing table, recreates it from the definition in the dump file, and then loads the data.
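For example, here is a minimal sketch of an import that truncates and reloads an existing table (the schema, table, and file names are illustrative, not from the tests below):

impdp scott/tiger tables=emp dumpfile=emp.dmp table_exists_action=truncate

And a roughly equivalent SQL*Loader control file, which also truncates the table before loading:

LOAD DATA
INFILE 'emp.dat'
TRUNCATE
INTO TABLE emp
FIELDS TERMINATED BY ','
( empno, ename, sal )

The one option SQL*Loader has no counterpart for is SKIP: SQL*Loader always attempts to load the data.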

Let’s create a simple test case to demonstrate.

drop table test1;

create table test1 
  ( col1 number primary key, 
    col2 number default 99,
    col3 number );

create index ix_test1_col2 on test1 (col2);

create trigger trg_test1 
before insert or update on test1
for each row
begin
   :new.col3 := :new.col1 + :new.col2;
end;
/

begin
   for i in 1 .. 1001 loop
      insert into test1 (col1) values (i);
   end loop;
end;
/

select col1, col2, col3 from test1 where rownum < 5;

      COL1       COL2       COL3
---------- ---------- ----------
         1         99        100
         2         99        101
         3         99        102
         4         99        103

In the first test, we will export the table, then modify the data and drop the trigger, and finally run a Data Pump import with the REPLACE option. After the import, we can see that the data is back to its exported state and that the trigger has been re-created.

expdp babette/babette tables=test1 dumpfile=test1.dmp

update test1 set col1 = col1 + 2000, col2 = 500;

drop trigger trg_test1;

select col1, col2, col3 from test1 where rownum < 5;

      COL1       COL2       COL3
---------- ---------- ----------
      2001        500       2501
      2002        500       2502
      2003        500       2503
      2004        500       2504

impdp babette/babette dumpfile=test1.dmp table_exists_action=replace

--
-- data is the same as before the update (the existing table was replaced with the data in the dumpfile)
--
SQL> select count(*), col2 from test1 group by col2;

  COUNT(*)       COL2
---------- ----------
      1001         99

select col1, col2, col3 from test1 where rownum < 5;

      COL1       COL2       COL3
---------- ---------- ----------
         1         99        100
         2         99        101
         3         99        102
         4         99        103
--
-- trigger has been re-created
-- 
SQL> select trigger_name from user_triggers;

TRIGGER_NAME
------------------------------
TRG_TEST1

In this second test, we will first update the data (so we don't get primary key violations when importing) and reuse the export dumpfile created earlier. After the Data Pump import with APPEND, notice that the data has been added to the table instead of replacing it.

update test1 set col1 = col1 + 2000, col2 = col3 / 2;

impdp babette/babette dumpfile=test1.dmp table_exists_action=append

-- data is appended and trigger is effective

select count(*), grouping
  from (select case when col2 between 1 and 100 then 1
                    when col2 between 101 and 499 then 2
                    when col2 between 500 and 1000 then 3
                    when col2 between 1001 and 99999 then 4
               end grouping from test1) a
group by grouping;

  COUNT(*)   GROUPING
---------- ----------
      1001          1
      1001          4

In this third test, we will restore the original data (using another impdp with REPLACE). Before the APPEND import, we will drop the trigger and update the data. After the import, we can see that the data has been appended, but the trigger on the table has not been re-created. This is because the table already existed, so no metadata was applied.

impdp babette/babette dumpfile=test1.dmp table_exists_action=replace

drop trigger trg_test1;

update test1 set col1 = col1 + 2000, col2 = col3 / 2;

impdp babette/babette dumpfile=test1.dmp table_exists_action=append

-- notice that the APPEND did NOT execute any DDL:
-- because the table already exists, ALL metadata is excluded.

SQL> select trigger_name from user_triggers;
no rows selected

So what if we want to apply the metadata from the export file even when the table already exists? Oracle has provided a way to do that too: the INCLUDE parameter lets us name the object types we want imported. However, in my testing this only worked with CONTENT=METADATA_ONLY; it did not work with CONTENT=ALL (data and metadata).

impdp babette/babette dumpfile=test1.dmp content=metadata_only include=trigger
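If you only need a specific object rather than every object of a type, INCLUDE also accepts a name filter. Because the quotes are awkward to escape on the command line, a parameter file is the usual approach. A sketch reusing our trigger name (the parameter file name imp_trigger.par is illustrative):

impdp babette/babette parfile=imp_trigger.par

where imp_trigger.par contains:

dumpfile=test1.dmp
content=metadata_only
include=TRIGGER:"IN ('TRG_TEST1')"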

How does this compare with the original import utility? It behaves very similarly: imp does not apply any metadata if the object already exists. However, this behaviour can be overridden with the IGNORE=Y parameter.

exp babette/babette tables=test1 file=test1.exp.dmp

update test1 set col1 = col1 + 2000, col2 = col3 / 2;

drop trigger trg_test1;

imp babette/babette file=test1.exp.dmp full=y ignore=Y

SQL> select count(*) from test1;

  COUNT(*)
----------
      2002

SQL> select trigger_name from user_triggers;

TRIGGER_NAME
------------------------------
TRG_TEST1

Having looked at the parameter's options and how they work, the next logical question is: how can this be useful?

There have been times when I have had full schema imports fail part-way through. One option was to figure out which tables had not yet been loaded and dynamically build a TABLES= parameter to import just those; but if any table were re-imported, its data would be duplicated. The alternative was to clean up the schema and start over. TABLE_EXISTS_ACTION gives us more flexibility: we can simply re-run the import, and Oracle will automatically skip every object that already exists.
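For instance, assuming a schema-level dump file named schema_full.dmp (an illustrative name), the recovery is just a re-run:

impdp babette/babette schemas=babette dumpfile=schema_full.dmp table_exists_action=skip

Since SKIP is the default, the last parameter could even be omitted; it is shown here only to make the intent explicit. Tables that were already imported are passed over, and the import picks up the ones that were missed.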

We can also use the REPLACE option to undo object changes and restore a baseline. For example, suppose you have an export of test data that you use to refresh certain tables in a schema. Rather than manually locating and removing the old versions of the tables (which may have had DDL changes applied in the test environment), you simply run your Data Pump import with the REPLACE option.
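A sketch of such a refresh, assuming a baseline dump of the test tables was taken earlier (the file name baseline.dmp is illustrative):

impdp babette/babette tables=test1 dumpfile=baseline.dmp table_exists_action=replace

Because REPLACE drops and recreates each table from the dump file, any DDL changes made since the baseline are discarded along with the old data.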

Having unearthed this parameter for myself and put it to work, I am sure that others can think of lots of other uses for it. Please add your comments to show how you have used this parameter.
