The data warehouse ETL toolkit

Kimball, Ralph.

The data warehouse ETL toolkit [electronic resource] : practical techniques for extracting, cleaning, conforming, and delivering data / Ralph Kimball, Joe Caserta. - Indianapolis, IN : Wiley, c2004. - 1 online resource (xxxiv, 491 p.) : ill.

Includes index.

Requirements, Realities, and Architecture -- Surrounding the Requirements -- The Mission of the Data Warehouse -- The Mission of the ETL Team -- ETL Data Structures -- To Stage or Not to Stage -- Designing the Staging Area -- Data Structures in the ETL System -- Planning and Design Standards -- Data Flow -- Extracting -- The Logical Data Map -- Building the Logical Data Map -- Integrating Heterogeneous Data Sources -- The Challenge of Extracting from Disparate Platforms -- Mainframe Sources -- Flat Files -- XML Sources -- Web Log Sources -- ERP System Sources -- Extracting Changed Data -- Cleaning and Conforming -- Defining Data Quality -- Assumptions -- Design Objectives -- Cleaning Deliverables -- Screens and Their Measurements -- Conforming Deliverables -- Delivering Dimension Tables -- The Basic Structure of a Dimension -- The Grain of a Dimension -- The Basic Load Plan for a Dimension -- Flat Dimensions and Snowflaked Dimensions -- Date and Time Dimensions -- Big Dimensions -- Small Dimensions -- One Dimension or Two -- Dimensional Roles -- Dimensions as Subdimensions of Another Dimension -- Degenerate Dimensions -- Slowly Changing Dimensions -- Type 1 Slowly Changing Dimension (Overwrite) -- Type 2 Slowly Changing Dimension (Partitioning History) -- Precise Time Stamping of a Type 2 Slowly Changing Dimension -- Type 3 Slowly Changing Dimension (Alternate Realities) -- Hybrid Slowly Changing Dimensions -- Late-Arriving Dimension Records and Correcting Bad Data -- Multivalued Dimensions and Bridge Tables.

Use copy

Annotation: Cowritten by Ralph Kimball, the world's leading data warehousing authority, whose previous books have sold more than 150,000 copies. Delivers real-world solutions for the most time- and labor-intensive portion of data warehousing: data staging, or the extract, transform, load (ETL) process. Delineates best practices for extracting data from scattered sources, removing redundant and inaccurate data, transforming the remaining data into correctly formatted data structures, and then loading the end product into the data warehouse. Offers proven time-saving ETL techniques, comprehensive guidance on building dimensional structures, and crucial advice on ensuring data quality.


Electronic reproduction. [S.l.] : HathiTrust Digital Library, 2010.

0764579231 (electronic bk.) 9780764579233 (electronic bk.) 0764567578 (paper/website) 9780764567575 (paper/website)


Data warehousing.
Database design.
COMPUTERS--Desktop Applications--Databases.
COMPUTERS--Database Management--General.
COMPUTERS--System Administration--Storage & Retrieval.
Electronic book collection.
Entrepôts de données (Informatique)
Bases de données--Conception.
Data-Warehouse-Konzept.


Electronic books.

QA76.9.D37 / K53 2004eb

005.74
