Status Updates From The Data Warehouse Toolkit:...
The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling by
Status Updates
Ísis Santos Costa
is on page 576 of 608
p. 542
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data (cont.)
... ↳ Data Governance Best Practices for Big Data
... ... ↳ There is No Such Thing as Big Data Governance
... ... ↳ Dimensionalize the Data before Applying Governance
... ... ↳ Privacy is the Most Important Governance Perspective
... ... ↳ Don't Choose Big Data over Governance
— May 06, 2026 05:24PM
Ísis Santos Costa
is on page 574 of 608
p. 540
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data (cont.)
... ↳ Data Modeling Best Practices for Big Data (cont.)
(...)
... ... ↳ Declare Structure at Analysis Time
... ... ↳ Load Data as Simple Name-Value Pairs
... ... ↳ Rapidly Prototype Using Data Virtualization
— May 06, 2026 05:23PM
Ísis Santos Costa
is on page 574 of 608
p. 540
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data (cont.)
... ↳ Data Modeling Best Practices for Big Data
... ... ↳ Think Dimensionally
... ... ↳ Integrate Separate Data Sources with Conformed Dimensions
... ... ↳ Anchor Dimensions with Durable Surrogate Keys
... ... ↳ Expect to Integrate Structured and Unstructured Data
... ... ↳ Use Slowly Changing Dimensions
— May 06, 2026 05:23PM
Ísis Santos Costa
is on page 571 of 608
p. 537
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data (cont.)
... ↳ Architecture Best Practices for Big Data (cont.)
Data Highway
➤ Raw Source ➤ Real Time Cache ➤ Business Activity Cache ➤ Top Line Cache ➤ DW and Long Time Series (Daily, Periodic, Annual)
— May 06, 2026 05:21PM
Ísis Santos Costa
is on page 571 of 608
p. 537
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data (cont.)
... ↳ Architecture Best Practices for Big Data (cont.)
(...)
... ... ↳ Avoid Boundary Crashes
... ... ↳ Move Prototypes to a Private Cloud
... ... ↳ Strive for Performance Improvements
... ... ↳ Monitor Compute Resources
... ... ↳ Exploit In-Database Analytics
— May 06, 2026 05:20PM
Ísis Santos Costa
is on page 571 of 608
p. 537
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data (cont.)
... ↳ Architecture Best Practices for Big Data
... ... ↳ Plan a Data Highway
... ... ↳ Build a Fact Extractor from Big Data
... ... ↳ Build Comprehensive Ecosystems
... ... ↳ Plan for Data Quality
... ... ↳ Add Value to Data as Soon as Possible
... ... ↳ Implement Backflow to Earlier Caches
... ... ↳ Implement Streaming Data
— May 06, 2026 05:20PM
Ísis Santos Costa
is on page 567 of 608
p. 533
21 • Big Data Analytics (cont.)
↳ Recommended Best Practices for Big Data
... ↳ Management Best Practices for Big Data
... ... ↳ Structure Big Data Environments Around Analytics
... ... ↳ Delay Building Legacy Environments
... ... ↳ Build from Sandbox Results
... ... ↳ Try Simple Applications First
— May 06, 2026 05:18PM
Ísis Santos Costa
is on page 564 of 608
p. 530
21 • Big Data Analytics (cont.)
↳ Big Data Overview (cont.)
Use Cases (cont.)
➤ Loan risk analysis and insurance policy underwriting
➤ Customer churn analysis
... ↳ Extended RDBMS Architecture
... ↳ MapReduce/Hadoop Architecture
... ↳ Comparison of Big Data Architectures
— May 06, 2026 05:17PM
Ísis Santos Costa
is on page 564 of 608
p. 530
21 • Big Data Analytics (cont.)
↳ Big Data Overview (cont.)
Use Cases (cont.)
➤ Smart utility meters
➤ Building sensors
➤ Satellite image comparison
➤ CAT scan comparison
➤ Financial account fraud detection and intervention
➤ Computer system hacking detection and intervention
➤ Online game gesture tracking
➤ Big science data analysis
➤ Generic name-value pair analysis
— May 06, 2026 05:14PM
Ísis Santos Costa
is on page 564 of 608
p. 530
21 • Big Data Analytics
↳ Big Data Overview
Use Cases
➤ Search ranking
➤ Ad tracking
➤ Location and proximity tracking
➤ Causal factor discovery
➤ Social CRM
➤ Document similarity testing
➤ Genomics analysis
➤ Cohort group discovery
➤ In-flight aircraft status
— May 06, 2026 05:13PM
Ísis Santos Costa
is on page 560 of 608
p. 526
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ Real-Time Implications
... ↳ Real-Time Triage
... ↳ Real-Time Architecture Trade-Offs
➤ Replace Batch Files
➤ Limit Data Quality Screenings
➤ Post Facts with Available Dimensions
➤ Eliminate Data Staging
... ↳ Real-Time Partitions in the Presentation Server
... ... ↳ Transaction Real-Time Partition
... ... ↳ Periodic Snapshot Real-Time Partition
— Apr 30, 2026 07:56PM
Ísis Santos Costa
is on page 554 of 608
p. 520
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing
... ... ↳ Step 9: Aggregate Table and OLAP Loads
... ... ↳ Step 10: ETL System Operation and Automation
➤ Schedule Jobs
➤ Automatically Handle Predictable Exceptions and Errors
➤ Gracefully Handle Unpredictable Errors
— Apr 09, 2026 04:02PM
Ísis Santos Costa
is on page 552 of 608
p. 518
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing
... ... ↳ Step 8: Fact Table Incremental Processing
... ... ... ↳ Incremental Fact Table Load
... ... ... ↳ Load Snapshot Fact Tables
... ... ... ↳ Speed Up the Load Cycle
✓ More Frequent Loading
✓ Parallel Processing
✓ Parallel Structure
— Apr 09, 2026 03:56PM
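The "Parallel Processing" item above can be sketched as independent partitions loaded concurrently. This is only an illustration: `load_partition` and the sample partitions are hypothetical stand-ins, and a real loader would write to the database instead of counting rows.

```python
from concurrent.futures import ThreadPoolExecutor

def load_partition(partition):
    """Hypothetical stand-in for loading one partition; returns rows handled."""
    return len(partition)

# Three made-up partitions of fact rows (for example, one per day or per region).
partitions = [[1, 2, 3], [4, 5], [6]]

# pool.map returns results in the same order as the input partitions.
with ThreadPoolExecutor(max_workers=3) as pool:
    loaded_counts = list(pool.map(load_partition, partitions))

total_loaded = sum(loaded_counts)
```

Threads suit loads that mostly wait on I/O; whether parallelism actually shortens the load cycle depends on the target database.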
Ísis Santos Costa
is on page 551 of 608
p. 517
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing
... ... ↳ Step 8: Fact Table Incremental Processing
... ... ... ↳ Late Arriving Facts & the Surrogate Key Pipeline
If SCD Type 2 attributes are involved, the lookup must be performed using the valid from / to information. Even for a freshly arrived fact, the matching dimension row is not necessarily the currently valid one.
— Mar 09, 2026 07:53AM
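A minimal sketch of such a date-bounded lookup, assuming a small in-memory dimension with hypothetical column names (`valid_from` inclusive, `valid_to` exclusive, `date.max` marking the current row):

```python
from datetime import date

# Hypothetical customer dimension rows:
# (surrogate_key, natural_key, valid_from, valid_to)
CUSTOMER_DIM = [
    (101, "C-1", date(2025, 1, 1), date(2025, 7, 1)),
    (102, "C-1", date(2025, 7, 1), date.max),
]

def lookup_surrogate_key(dim_rows, natural_key, event_date):
    """Return the surrogate key whose validity interval contains event_date."""
    for surrogate_key, nk, valid_from, valid_to in dim_rows:
        if nk == natural_key and valid_from <= event_date < valid_to:
            return surrogate_key
    return None  # no match: a referential integrity violation to handle separately

# A late-arriving fact dated March gets the older row, not the current one.
late_fact_key = lookup_surrogate_key(CUSTOMER_DIM, "C-1", date(2025, 3, 15))
current_fact_key = lookup_surrogate_key(CUSTOMER_DIM, "C-1", date(2025, 9, 1))
```

Here `late_fact_key` is 101 and `current_fact_key` is 102: the fact's own date, not its arrival time, decides which row applies.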
Ísis Santos Costa
is on page 550 of 608
p. 516
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing
... ... ↳ Step 8: Fact Table Incremental Processing
... ... ... ↳ FT Transformations & Surrogate Key Pipeline
Handling referential integrity violations:
✗ Halt the load
✓ Throw away (when not relevant)
✗ Write errors file
✓ Create dummy dim + surrogate key
✗ Map to single unknown dim
— Feb 28, 2026 06:05AM
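The "create dummy dim + surrogate key" option above might look roughly like this. The `DimensionKeyResolver` class and the sample keys are hypothetical; a real pipeline would insert the placeholder row into the dimension table and fill in its attributes once the source delivers them.

```python
class DimensionKeyResolver:
    """In-memory stand-in for a dimension table indexed by natural key."""

    def __init__(self, known):
        self.keys = dict(known)   # natural key -> surrogate key
        self.placeholders = []    # natural keys that received a dummy row
        self._next_key = max(self.keys.values(), default=0) + 1

    def resolve(self, natural_key):
        """Return the surrogate key, creating a placeholder row on a miss."""
        if natural_key not in self.keys:
            self.keys[natural_key] = self._next_key
            self.placeholders.append(natural_key)
            self._next_key += 1
        return self.keys[natural_key]

resolver = DimensionKeyResolver({"P-1": 1, "P-2": 2})
# "P-9" is unknown: it gets one new surrogate key, reused on the second hit.
fact_keys = [resolver.resolve(nk) for nk in ["P-1", "P-9", "P-9", "P-2"]]
```

Unlike mapping every miss to a single unknown row, each unseen natural key keeps its own surrogate key, so the facts stay correctly attached when the real dimension attributes arrive.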
Ísis Santos Costa
is on page 550 of 608
p. 516
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing (cont.)
... ... ↳ Step 8: Fact Table Incremental Processing (cont.)
... ... ... ↳ Fact Table Extract & Data Quality Checkpoint
« Data is written to a staging area.
Data Quality metrics of the raw data are computed. »
— Feb 27, 2026 08:14AM
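As a rough illustration of computing quality metrics on staged raw data, the sketch below counts rows and NULLs per column. The `profile_rows` helper and the sample rows are assumptions for illustration, not the book's checkpoint design.

```python
def profile_rows(rows, columns):
    """Compute simple quality metrics for staged rows given as dicts."""
    metrics = {"row_count": len(rows), "null_counts": {}}
    for col in columns:
        metrics["null_counts"][col] = sum(1 for r in rows if r.get(col) is None)
    return metrics

# Made-up staged extract with one missing amount.
staged = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 7.5},
]
quality = profile_rows(staged, ["order_id", "amount"])
```

Metrics like these can be compared against thresholds or previous loads before the data moves further down the pipeline.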
Ísis Santos Costa
is on page 549 of 608
p. 515
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing (cont.)
... ... ↳ Step 8: Fact Table Incremental Processing
You can stop a historic load process;
« the incremental processing, by contrast, must be fully automated. »
— Feb 22, 2026 09:49AM
Ísis Santos Costa
is on page 548 of 608
p. 514
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop Incremental ETL Processing
... ... ↳ Step 7: Dimension Table Incremental Processing
... ... ... ↳ Dimension Table Extracts
... ... ... ↳ Identify New and Changed Dimension Rows
... ... ... ↳ Process Changes to Dimension Attributes
— Feb 21, 2026 05:31AM
Ísis Santos Costa
is on page 546 of 608
p. 512
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop One-Time Historic Load Processing
... ... ↳ Step 6: Perform the Fact Table Historic Load (cont.)
... ... ... ↳ Fact Table Loading
« The main concern when loading the fact table is load performance. Some database technologies support fast loading with a specified batch size. »
— Feb 20, 2026 10:09AM
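A sketch of batched loading, using Python's built-in `sqlite3` only as a stand-in: real bulk loaders are vendor-specific, and the `sales_fact` table, its columns, and the batch size here are all hypothetical.

```python
import sqlite3

def load_in_batches(conn, rows, batch_size):
    """Insert rows into sales_fact, committing once per batch."""
    batches = 0
    for start in range(0, len(rows), batch_size):
        conn.executemany(
            "INSERT INTO sales_fact (date_key, product_key, amount) VALUES (?, ?, ?)",
            rows[start:start + batch_size],
        )
        conn.commit()
        batches += 1
    return batches

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact ("
    "date_key INTEGER NOT NULL, product_key INTEGER NOT NULL, amount REAL)"
)
# Ten made-up fact rows loaded in batches of four (4 + 4 + 2).
rows = [(20260101, i % 5 + 1, float(i)) for i in range(10)]
batch_count = load_in_batches(conn, rows, batch_size=4)
loaded = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```

Committing per batch bounds the work lost on a failure; the best batch size depends on the database and has to be found by measurement.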
Ísis Santos Costa
is on page 545 of 608
p. 511
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop One-Time Historic Load Processing
... ... ↳ Step 6: Perform the Fact Table Historic Load (cont.)
... ... ... ↳ Fact Table Transformations
➤ Null Fact Values
➤ Improve Fact Table Content
➤ Pipeline the Dimension Surrogate Key Lookup
➤ Assign Audit Dimension Key
— Feb 19, 2026 01:53PM
Ísis Santos Costa
is on page 543 of 608
p. 509
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop One-Time Historic Load Processing
... ... ↳ Step 6: Perform the Fact Table Historic Load (cont.)
... ... ... ↳ Fact Table Transformations
« A lot of time (is spent) improving the dimension table, facts usually require modest transformation. »
System placeholder values like -1 should be replaced by NULL. All FKs must be NOT NULL.
— Feb 17, 2026 05:34AM
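These two rules can be sketched as one row-level transformation. The `SENTINEL` value, `clean_fact_row`, and the column names are assumptions for illustration; which source values count as placeholders has to come from the source system's documentation.

```python
SENTINEL = -1  # assumed placeholder the source system uses for "no value"

def clean_fact_row(row, measure_cols, fk_cols):
    """Replace sentinel measures with None and reject NULL foreign keys."""
    cleaned = dict(row)
    for col in measure_cols:
        if cleaned[col] == SENTINEL:
            cleaned[col] = None
    missing = [col for col in fk_cols if cleaned[col] is None]
    if missing:
        raise ValueError(f"NULL foreign key(s): {missing}")
    return cleaned

cleaned_row = clean_fact_row(
    {"date_key": 20260101, "product_key": 7, "discount": -1},
    measure_cols=["discount"],
    fk_cols=["date_key", "product_key"],
)
```

With a real NULL in the measure, aggregates such as averages skip the row instead of averaging in a bogus -1.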
Ísis Santos Costa
is on page 542 of 608
p. 508
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop One-Time Historic Load Processing
... ... ↳ Step 6: Perform the Fact Table Historic Load
... ... ... ↳ Historic Fact Table Extracts
... ... ... ↳ Audit Statistics
It is not always possible to tie the data back to source systems; in these cases, it is crucial that reasons for differences be documented clearly.
— Feb 15, 2026 06:34AM
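One simple form of audit statistics is a reconciliation of row counts and summed measures between source and warehouse. The `reconcile` helper and the sample rows below are hypothetical; they only show the kind of difference that would then need a documented explanation.

```python
def reconcile(source_rows, loaded_rows, measure):
    """Compare row counts and a summed measure between source and warehouse."""
    src_total = sum(r[measure] for r in source_rows)
    dw_total = sum(r[measure] for r in loaded_rows)
    return {
        "row_count_diff": len(loaded_rows) - len(source_rows),
        "total_diff": round(dw_total - src_total, 2),
    }

# Made-up extract: one source row was not loaded into the warehouse.
source = [{"amount": 10.0}, {"amount": 5.5}, {"amount": 4.5}]
loaded = [{"amount": 10.0}, {"amount": 5.5}]
audit = reconcile(source, loaded, "amount")
```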
Ísis Santos Costa
is on page 541 of 608
p. 507
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop One-Time Historic Load Processing
... ... ↳ Step 5: Populate Dim Tables w Historic Data (cont.)
... ... ... ↳ Dim Table Loading
... ... ... ↳ Load Type 2 Dimension Table History
... ... ... ↳ Populate Date and Other Static Dimensions
— Feb 14, 2026 05:58AM
Ísis Santos Costa
is on page 540 of 608
p. 506
20 • ETL System Design + Dev Process & Tasks (cont.)
↳ ETL Process Overview (cont.)
... ↳ Develop One-Time Historic Load Processing
... ... ↳ Step 5: Populate Dim Tables w Historic Data
... ... ... ↳ Populate Type 1 Dim Tables
... ... ... ↳ Dim Transformations
➤ Simple Data Transformations
➤ Combine from Separate Sources
➤ Decode Production Codes
➤ Validate Many-to-One and One-to-One Relationships
➤ Dimension Surrogate Key Assignment
— Feb 10, 2026 09:06AM
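Assuming the validation item in the list above refers to checking many-to-one relationships between attributes, a minimal check could group child values and flag any that map to more than one parent. The helper name and the product/brand sample are hypothetical.

```python
from collections import defaultdict

def many_to_one_violations(rows, child, parent):
    """Return child values that map to more than one parent value."""
    parents = defaultdict(set)
    for r in rows:
        parents[r[child]].add(r[parent])
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}

# Product "A" appears under two brands, which breaks product -> brand.
dim_rows = [
    {"product": "A", "brand": "X"},
    {"product": "B", "brand": "X"},
    {"product": "A", "brand": "Y"},
]
violations = many_to_one_violations(dim_rows, "product", "brand")
```

A one-to-one check can reuse the same helper by running it in both directions, child to parent and parent to child.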