170 likes | 572 Views
Extraction, Transformation, and Loading (ETL). Loading. Objectives. After completing this lesson, you should be able to implement the following methods that are available for loading data: SQL*Loader External tables OCI and direct-path APIs Data Pump Export/import. Data-Loading Mechanisms.
E N D
Objectives • After completing this lesson, you should be able to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import
Data-Loading Mechanisms • You can use the following mechanisms for loading a data warehouse: • SQL*Loader • External tables • OCI and direct-path APIs • Export/import • Data Pump
Loading Mechanisms • SQL*Loader loads a formatted flat file into an existing table. • It can perform basic transformations while loading. • Direct-path loading may be used to decrease the load time. • When you use this method, data in the flat file is not accessible until the data is loaded.
SQL*Loader: Example • Control file used for loading the SALES table: • The fact table can be loaded with the following command: • LOAD DATA INFILE sh_sales.dat APPEND INTO TABLE sales • FIELDS TERMINATED BY "|" • (PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD) • $ sqlldr sh/sh control=sh_sales.ctl \ direct=true
Loading Mechanisms • External tables are read-only tables where the data is stored outside the database in flat files. • The data can be queried like a virtual table, using any supported language inside the database. • No DML is allowed and no indexes can be created. • The metadata for an external table is created using a CREATE TABLE statement. • An external table describes how the external data should be presented to the database.
Applications of External Tables • External tables: • Allow external data to be queried and joined directly and in parallel without requiring it to be loaded into the database • Eliminate the need for staging the data within the database for ETL in data warehousing applications • Are useful in environments where an external source has to be joined with database objects and then transformed • Are useful when the external data is large and not queried frequently • Complement SQL*Loader functionalities: • Transparent parallelism • Full SQL capabilities for direct-path insertion
Example of Defining External Tables CREATE TABLE sales_delta_xt ( prod_id NUMBER(6), cust_id NUMBER, time_id DATE,unit_cost, unit_price ...) ORGANIZATION external ( -- External Table TYPE oracle_loader–- Access Driver DEFAULT DIRECTORY data_dir–- Files Directory ACCESS PARAMETERS –- Similar to SQL*Loader ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII BADFILE log_dir:'sh_sales_%p.bad' LOGFILE log_dir:'sh_sales_%p.log_xt' FIELDS TERMINATED BY "|" LDRTRIM ) location ( 'sales_delta.dat', data_dir2:'sales_delta2.dat' )) PARALLEL 5 –- Independent from the number of files REJECT LIMIT UNLIMITED;
Populating External Tables with Data Pump CREATE TABLE emp_ext (first_name, last_name, department_name) ORGANIZATION EXTERNAL ( TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY ext_dir LOCATION ('emp1.exp',) ) AS SELECT e.first_name,e.last_name,d.department_name FROM employees e, departments d WHERE e.department_id = d.department_id AND d.department_name in ('Marketing', 'Purchasing');
Privileges for External Tables • Access to the external tables for other users requires: • SELECT on the table definition • READ access to the directory containing the file • WRITE access to the directory for the bad file and log file GRANT SELECT ON sh.sales_delta_xt TO oe; GRANT READ ON DIRECTORY data_dir TO oe; GRANT WRITE ON DIRECTORY log_dir TO oe;
Defining External TablesUsing SQL*Loader • After creating a control file, SQL*Loader can generate a log file with the SQL commands to: • Create the metadata for the external table • Insert the data into the target table • Drop the metadata for the external table sqlldr sh/sh control=sales_dec00.ctl EXTERNAL_TABLE=GENERATE_ONLY LOG=sales_dec00.sql
Data Dictionary Information for External Tables • DBA_EXTERNAL_LOCATIONS • OWNER • TABLE_NAME • LOCATION • DIRECTORY_OWNER • DIRECTORY_NAME • DBA_DIRECTORIES • OWNER • DIRECTORY_NAME • DIRECTORY_PATH • DBA_EXTERNAL_TABLES • OWNER • NAME • TYPE_OWNER • TYPE_NAME • DEFAULT_DIRECTORY_OWNER • DEFAULT_DIRECTORY_NAME • REJECT_LIMIT
Changing External Data Properties • Using the ALTER TABLE command, you can change: • DEFAULT DIRECTORY • ACCESS PARAMETERS • LOCATION • REJECT_LIMIT • Degree of parallelism • Useful in situations where external files are changing: ALTER TABLE sales_delta_xt LOCATION ('newfile1.dat')
Other Loading Methods • OCI and direct-path APIs: • Allow transformation and loading at the same time • Access an online source • Do not require an intermediary step such as a flat file • Export/import: • Is good for small loads • Allows for easy transfers between Oracle databases on different operating systems
Summary • In this lesson, you should have learned how to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import
Practice 5: Overview • This practice covers the following topics: • Loading data from a flat file using SQL*Loader • Loading data from a flat file using external tables