Overview
As a Data Architect or Data Engineer, you've invested significant effort designing your data warehouse, building ETL/ELT pipelines, and ensuring data flows correctly from source systems to analytical models. Validatar helps you automatically validate that your data transformations work as expected—without writing mountains of test scripts.
This guide will help you get immediate value from Validatar by focusing on the testing patterns most relevant to your role:
- Source-to-Target Validation - Ensure data moves correctly between staging and your warehouse
- Row Count and Aggregate Checks - Quick wins to validate loads completed successfully
- Schema Drift Detection - Catch unexpected metadata changes before they break downstream processes
- Template-Based Testing - Scale your test coverage across hundreds of tables efficiently
By the end of this guide, you'll have:
- Connected your data warehouse to Validatar
- Created your first template test covering multiple tables
- Set up automated row count validation for your key fact tables
- Understood how to detect schema changes automatically
Prerequisites
Before starting, ensure you have:
- Access to Validatar - Either Cloud or Server edition (log in here)
- A Project created - Tests live within projects (create a project)
- At least one Data Source configured - Your data warehouse connection (create a data source)
- Schema metadata ingested - Validatar needs to know your tables and columns (ingest metadata)
Part 1: Understanding Key Features for Data Architects
Template Tests - Scale Your Testing
The biggest time-saver for Data Architects is Template Testing. Instead of writing individual tests for each table, you define a test pattern once and apply it across all matching objects.
Example Use Cases:
- Validate all fact tables have non-null surrogate keys
- Check all staging tables have matching row counts in target tables
- Ensure all date columns fall within expected ranges
Template tests use metadata references to dynamically generate child tests. When you add new tables to your warehouse, Validatar can automatically create tests for them.
Source-to-Target Testing
One of the most common validation needs is ensuring data flows correctly from source to target. Validatar's comparison testing makes this straightforward:
| Test Type | Use Case |
|---|---|
| Row Count Match | Verify staging table count matches target after ETL |
| Aggregate Comparison | Compare SUM, AVG, MIN, MAX between source and target |
| Full Data Comparison | Row-by-row validation for critical tables |
| Key-Based Comparison | Match records by business key and compare specific columns |
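As an illustration of the aggregate-comparison pattern, here is a minimal Python sketch using in-memory sqlite3 databases as stand-ins for the source and target connections (the `sales` table and `amount` column are hypothetical; in practice you would run the same aggregate query against your staging and warehouse systems):

```python
import sqlite3

# Stand-in databases; in practice these are your staging and warehouse connections.
src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")

for con in (src, tgt):
    con.execute("CREATE TABLE sales (amount REAL)")
src.executemany("INSERT INTO sales VALUES (?)", [(10.0,), (25.5,), (4.5,)])
tgt.executemany("INSERT INTO sales VALUES (?)", [(10.0,), (25.5,), (4.5,)])

AGG_SQL = "SELECT SUM(amount), MIN(amount), MAX(amount), COUNT(*) FROM sales"

def aggregates(con):
    """Return the aggregate tuple compared between source and target."""
    return con.execute(AGG_SQL).fetchone()

# The comparison test passes only when every aggregate matches.
match = aggregates(src) == aggregates(tgt)
print("aggregates match:", match)
```

Comparing aggregates is much cheaper than row-by-row validation, which is why it is a good default for large fact tables.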
Part 2: Quick Wins - Your First Tests
Let's create tests that deliver immediate value. We'll start with simple validations and progress to template-based testing.
Quick Win #1: Row Count Validation
Row count checks are the fastest way to catch ETL failures. Here's how to create one:
- Navigate to your project's Test Repository
- Click New Test > New Standard Test
- Configure the test:
  - Name: `Fact_Sales - Row Count Check`
  - Test Data Set: Select your fact table
  - Data Retrieval: Choose "Row Count"
- Set the Result Criteria:
  - Compare to Defined Rule
  - Condition: `Row Count > 0`
- Save and run the test
For ongoing validation, compare today's row count against yesterday's using "Compare to Previous Runs" with an acceptable variance threshold (e.g., +/- 10%).
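The "Compare to Previous Runs" logic above can be sketched as a simple variance check. This is an illustration of the rule, not Validatar's implementation; the 10% threshold mirrors the example in the text:

```python
def within_variance(current: int, previous: int, threshold: float = 0.10) -> bool:
    """Return True when current is within +/- threshold of the previous run."""
    if previous == 0:
        # Assumption: an empty baseline only matches another empty load.
        return current == 0
    return abs(current - previous) / previous <= threshold

print(within_variance(1_050_000, 1_000_000))  # 5% growth, inside a 10% threshold
print(within_variance(800_000, 1_000_000))    # 20% drop, outside the threshold
```

Tune the threshold per table: slowly changing dimensions tolerate a tighter band than high-volume transactional facts.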
Quick Win #2: Source-to-Target Count Match
Validate that your ETL loaded all records:
- Create a new Standard Test
- Test Data Set:
  - Data Source: `Staging_DB`
  - Query: `SELECT COUNT(*) as row_count FROM staging.customers`
- Compare To: Another Data Set
- Comparison Data Set:
  - Data Source: `Data_Warehouse`
  - Query: `SELECT COUNT(*) as row_count FROM dw.dim_customer`
- Result Criteria: Data sets should match exactly
Quick Win #3: Null Check on Required Columns
Ensure critical columns aren't null:
- Create a new Standard Test
- Test Data Set: `SELECT COUNT(*) as null_count FROM dw.dim_customer WHERE customer_key IS NULL`
- Result Criteria: `null_count = 0`
Part 3: Template Testing - Scale Your Coverage
Now let's leverage templates to create tests across multiple tables at once.
Creating a Row Count Template
This template will create row count validation tests for all your fact tables:
- Navigate to Test Repository > New Test > New Template Test
- Template Configuration:
  - Name: `Fact Table Row Count Validation`
  - Target Object Type: Table
- Object Selection: Choose "Specific Metadata Objects"
  - Filter tables where the name starts with `Fact_` or matches your naming convention
- Test Data Set: `SELECT COUNT(*) as row_count FROM {{schema.name}}.{{table.name}}`
- Result Criteria: `row_count > 0`
- Materialize the child tests
Validatar will create individual tests for each matching fact table. When you add new fact tables, simply re-materialize to include them.
Creating a Primary Key Uniqueness Template
Ensure all your dimension tables have unique surrogate keys:
- Create a new Template Test
- Name: `Dimension PK Uniqueness`
- Target Object Type: Table (filter to `Dim_*` tables)
- Test Data Set: `SELECT {{table.cf.primary_key_column}}, COUNT(*) as duplicates FROM {{schema.name}}.{{table.name}} GROUP BY {{table.cf.primary_key_column}} HAVING COUNT(*) > 1`
- Result Criteria: Row count = 0 (no duplicates)
Template tests use {{placeholder}} syntax for metadata references. Ensure your metadata has been properly ingested so these references resolve correctly.
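To make the materialization step concrete, here is a rough Python sketch of how `{{placeholder}}` references might expand into one child test per matching object. The metadata list and placeholder names are illustrative; Validatar resolves these from its ingested catalog:

```python
# Hypothetical slice of ingested metadata: one entry per matching fact table.
tables = [
    {"schema.name": "dw", "table.name": "Fact_Sales"},
    {"schema.name": "dw", "table.name": "Fact_Orders"},
]

TEMPLATE = "SELECT COUNT(*) as row_count FROM {{schema.name}}.{{table.name}}"

def materialize(template: str, metadata: list) -> list:
    """Expand {{placeholder}} references into one concrete query per object."""
    child_tests = []
    for obj in metadata:
        sql = template
        for key, value in obj.items():
            sql = sql.replace("{{%s}}" % key, value)
        child_tests.append(sql)
    return child_tests

for sql in materialize(TEMPLATE, tables):
    print(sql)
```

This is why re-materializing after adding new tables is enough to extend coverage: the template stays fixed while the metadata list grows.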
Part 4: Schema Drift Detection
Data Architects need to know when source schemas change unexpectedly. Validatar's metadata profiling helps you detect:
- New columns added to source tables
- Columns dropped or renamed
- Data type changes
- New tables appearing in schemas
Setting Up Schema Monitoring
- Navigate to Settings > Data Sources > Your Source System
- Go to the Metadata tab
- Configure Scheduled Refresh:
- Set a daily or weekly schedule
- Enable change tracking
- Review changes in the Catalog Explorer
You can also create tests that explicitly validate schema expectations:
```sql
-- Test: Customer table should have exactly 15 columns
SELECT COUNT(*) as column_count
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'dim_customer'
```
With result criteria: `column_count = 15`
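The same drift check can be sketched outside Validatar by snapshotting column metadata and diffing the snapshots. This minimal example uses sqlite3's `PRAGMA table_info` as a stand-in; a real warehouse would query `INFORMATION_SCHEMA.COLUMNS` instead:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dim_customer (customer_key INTEGER, name TEXT)")

def snapshot(con, table):
    """Capture (column name, declared type) pairs for a table."""
    return {(row[1], row[2]) for row in con.execute(f"PRAGMA table_info({table})")}

baseline = snapshot(con, "dim_customer")

# Simulate an unannounced schema change in the source system.
con.execute("ALTER TABLE dim_customer ADD COLUMN email TEXT")

drift = snapshot(con, "dim_customer") - baseline
print("new columns:", drift)
```

Storing the baseline snapshot between runs turns this into a repeatable drift alert rather than a one-off check.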
Part 5: Organizing Your Tests
As your test suite grows, organization becomes critical.
Recommended Folder Structure
```
Project: Data Warehouse Validation
├── Source-to-Target Tests/
│   ├── Staging to DW/
│   └── Source System to Staging/
├── Data Quality Tests/
│   ├── Null Checks/
│   ├── Uniqueness/
│   └── Referential Integrity/
├── Row Count Validations/
│   ├── Fact Tables/
│   └── Dimension Tables/
└── Schema Tests/
    └── Structure Validation/
```
Part 6: Automating Test Execution
Creating a Job
Jobs group tests for scheduled execution:
- Navigate to Job Repository > New Job
- Add your tests to the job
- Configure execution settings:
- Parallel execution for faster runs
- Set up a schedule aligned with your ETL runs
Integration with Data Pipelines
For pipeline integration, use the Validatar API to:
- Trigger test execution after ETL completes
- Poll for results and fail pipelines on test failures
- Include data quality metrics in your observability platform
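A hedged sketch of the gate step: after triggering a Validatar job, the pipeline polls for results and aborts when any test fails. The payload shape, statuses, and outcomes below are assumptions for illustration, not Validatar's documented API; consult the API reference for the actual response format:

```python
def should_fail_pipeline(job_result: dict) -> bool:
    """Decide whether to abort the pipeline from a polled job result.

    `job_result` is a hypothetical payload, e.g.
    {"status": "Completed", "tests": [{"name": "...", "outcome": "Passed"}]}
    """
    if job_result.get("status") != "Completed":
        return True  # treat an incomplete or errored job as a failure
    return any(t.get("outcome") != "Passed" for t in job_result.get("tests", []))

result = {
    "status": "Completed",
    "tests": [
        {"name": "Fact_Sales - Row Count Check", "outcome": "Passed"},
        {"name": "Dim_Customer - PK Uniqueness", "outcome": "Failed"},
    ],
}
print("fail pipeline:", should_fail_pipeline(result))
```

Failing fast here keeps bad loads from propagating into downstream models and reports.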
Next Steps
You've now established a solid foundation for automated data validation. Here's where to go next:
| Goal | Resource |
|---|---|
| Explore all test configuration options | Configuring a Test |
| Learn about data profiling | Data Profiles |
| Set up notifications for test failures | Turning On Notifications |
| Build custom reports | Creating Reports |
| Understand the test lifecycle | Test Automation Lifecycle |
Summary
As a Data Architect, Validatar helps you:
- Save time with template-based testing across hundreds of objects
- Catch issues early with source-to-target validation
- Prevent surprises with automated schema drift detection
- Scale confidently knowing your data pipelines are validated
Start with the quick wins to prove immediate value, then progressively expand your test coverage using templates. Your future self (and your stakeholders) will thank you.
Check out our FAQs or contact support if you have questions about implementing these testing patterns.