Getting Started as a Data Architect

Overview

As a Data Architect or Data Engineer, you've invested significant effort designing your data warehouse, building ETL/ELT pipelines, and ensuring data flows correctly from source systems to analytical models. Validatar helps you automatically validate that your data transformations work as expected—without writing mountains of test scripts.

This guide will help you get immediate value from Validatar by focusing on the testing patterns most relevant to your role:

  • Source-to-Target Validation - Ensure data moves correctly between staging and your warehouse
  • Row Count and Aggregate Checks - Quick wins to validate loads completed successfully
  • Schema Drift Detection - Catch unexpected metadata changes before they break downstream processes
  • Template-Based Testing - Scale your test coverage across hundreds of tables efficiently

What You'll Accomplish

By the end of this guide, you'll have:

  • Connected your data warehouse to Validatar
  • Created your first template test covering multiple tables
  • Set up automated row count validation for your key fact tables
  • Understood how to detect schema changes automatically

Prerequisites

Before starting, ensure you have:

  1. Access to Validatar - Either Cloud or Server edition (log in here)
  2. A Project created - Tests live within projects (create a project)
  3. At least one Data Source configured - Your data warehouse connection (create a data source)
  4. Schema metadata ingested - Validatar needs to know your tables and columns (ingest metadata)

Part 1: Understanding Key Features for Data Architects

Template Tests - Scale Your Testing

The biggest time-saver for Data Architects is Template Testing. Instead of writing individual tests for each table, you define a test pattern once and apply it across all matching objects.

Example Use Cases:

  • Validate all fact tables have non-null surrogate keys
  • Check all staging tables have matching row counts in target tables
  • Ensure all date columns fall within expected ranges

Template tests use metadata references to dynamically generate child tests. When you add new tables to your warehouse, Validatar can automatically create tests for them.

Source-to-Target Testing

One of the most common validation needs is ensuring data flows correctly from source to target. Validatar's comparison testing makes this straightforward:

Test Type              Use Case
Row Count Match        Verify staging table count matches target after ETL
Aggregate Comparison   Compare SUM, AVG, MIN, MAX between source and target
Full Data Comparison   Row-by-row validation for critical tables
Key-Based Comparison   Match records by business key and compare specific columns
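The first two comparison types in the table reduce to comparing a single scalar from each side. As a minimal sketch of that idea, here is a self-contained Python example using an in-memory SQLite database; the table names (`staging_customers`, `dw_dim_customer`) and the `amount` column are invented stand-ins for your staging and warehouse objects, not anything Validatar defines:

```python
import sqlite3

# Hypothetical in-memory tables standing in for a staging database and a
# data warehouse; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_customers (id INTEGER, amount REAL);
    CREATE TABLE dw_dim_customer  (id INTEGER, amount REAL);
    INSERT INTO staging_customers VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO dw_dim_customer   VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")

def scalar(sql):
    """Run a single-value query and return that value."""
    return conn.execute(sql).fetchone()[0]

# Row Count Match: staging and target should hold the same number of rows.
assert scalar("SELECT COUNT(*) FROM staging_customers") == \
       scalar("SELECT COUNT(*) FROM dw_dim_customer")

# Aggregate Comparison: SUM of a measure should survive the load unchanged.
assert scalar("SELECT SUM(amount) FROM staging_customers") == \
       scalar("SELECT SUM(amount) FROM dw_dim_customer")

print("row count and aggregate checks passed")
```

In Validatar you configure these as two data sets with a comparison result criterion rather than writing the harness yourself; the sketch just shows what is being compared.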

Part 2: Quick Wins - Your First Tests

Let's create tests that deliver immediate value. We'll start with simple validations and progress to template-based testing.

Quick Win #1: Row Count Validation

Row count checks are the fastest way to catch ETL failures. Here's how to create one:

  1. Navigate to your project's Test Repository
  2. Click New Test > New Standard Test
  3. Configure the test:
    • Name: Fact_Sales - Row Count Check
    • Test Data Set: Select your fact table
    • Data Retrieval: Choose "Row Count"
  4. Set the Result Criteria:
    • Compare to Defined Rule
    • Condition: Row Count > 0
  5. Save and run the test

Pro Tip

For ongoing validation, compare today's row count against yesterday's using "Compare to Previous Runs" with an acceptable variance threshold (e.g., +/- 10%).
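The variance check above is simple arithmetic. As a sketch of what a ±10% threshold means (Validatar applies the threshold for you when you choose "Compare to Previous Runs"; the function below is only illustrative):

```python
def within_variance(today: int, yesterday: int, pct: float = 10.0) -> bool:
    """Return True if today's row count is within +/- pct% of yesterday's."""
    if yesterday == 0:
        # Avoid division by zero: from an empty baseline, only zero passes.
        return today == 0
    return abs(today - yesterday) / yesterday * 100 <= pct

# 1,045,000 rows today vs 1,000,000 yesterday is a 4.5% change: inside +/- 10%.
print(within_variance(1_045_000, 1_000_000))  # True
# A 25% jump breaches the threshold and is worth investigating.
print(within_variance(1_250_000, 1_000_000))  # False
```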

Quick Win #2: Source-to-Target Count Match

Validate that your ETL loaded all records:

  1. Create a new Standard Test
  2. Test Data Set:
    • Data Source: Staging_DB
    • Query: SELECT COUNT(*) as row_count FROM staging.customers
  3. Compare To: Another Data Set
  4. Comparison Data Set:
    • Data Source: Data_Warehouse
    • Query: SELECT COUNT(*) as row_count FROM dw.dim_customer
  5. Result Criteria: Data sets should match exactly

Quick Win #3: Null Check on Required Columns

Ensure critical columns aren't null:

  1. Create a new Standard Test
  2. Test Data Set:
    SELECT COUNT(*) as null_count
    FROM dw.dim_customer
    WHERE customer_key IS NULL
    
  3. Result Criteria: null_count = 0

Part 3: Template Testing - Scale Your Coverage

Now let's leverage templates to create tests across multiple tables at once.

Creating a Row Count Template

This template will create row count validation tests for all your fact tables:

  1. Navigate to Test Repository > New Test > New Template Test
  2. Template Configuration:
    • Name: Fact Table Row Count Validation
    • Target Object Type: Table
  3. Object Selection: Choose "Specific Metadata Objects"
    • Filter tables where name starts with Fact_ or matches your naming convention
  4. Test Data Set:
    SELECT COUNT(*) as row_count
    FROM {{schema.name}}.{{table.name}}
    
  5. Result Criteria: row_count > 0
  6. Materialize the child tests

Validatar will create individual tests for each matching fact table. When you add new fact tables, simply re-materialize to include them.
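Materialization is essentially string substitution driven by ingested metadata. A minimal sketch, assuming a metadata list you'd get from filtering to your Fact_ naming convention (the dictionaries below are invented for illustration; Validatar resolves the `{{...}}` references internally):

```python
# Template SQL as written in the steps above.
TEMPLATE_SQL = "SELECT COUNT(*) AS row_count FROM {{schema.name}}.{{table.name}}"

# Pretend this came from metadata ingestion, filtered to Fact_ tables.
fact_tables = [
    {"schema.name": "dw", "table.name": "Fact_Sales"},
    {"schema.name": "dw", "table.name": "Fact_Inventory"},
]

def materialize(template: str, metadata: dict) -> str:
    """Substitute each {{key}} placeholder with its metadata value."""
    sql = template
    for key, value in metadata.items():
        sql = sql.replace("{{" + key + "}}", value)
    return sql

# One child test per matching table.
child_tests = [materialize(TEMPLATE_SQL, t) for t in fact_tables]
for sql in child_tests:
    print(sql)
```

Adding a third fact table to the metadata list yields a third child test on the next materialization, which is why re-materializing picks up new tables automatically.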

Creating a Primary Key Uniqueness Template

Ensure all your dimension tables have unique surrogate keys:

  1. Create a new Template Test
  2. Name: Dimension PK Uniqueness
  3. Target Object Type: Table (filter to Dim_* tables)
  4. Test Data Set:
    SELECT {{table.cf.primary_key_column}}, COUNT(*) as duplicates
    FROM {{schema.name}}.{{table.name}}
    GROUP BY {{table.cf.primary_key_column}}
    HAVING COUNT(*) > 1
    
  5. Result Criteria: Row count = 0 (no duplicates)

Metadata Reference Syntax

Template tests use {{placeholder}} syntax for metadata references. Ensure your metadata has been properly ingested so these references resolve correctly.

Part 4: Schema Drift Detection

Data Architects need to know when source schemas change unexpectedly. Validatar's metadata profiling helps you detect:

  • New columns added to source tables
  • Columns dropped or renamed
  • Data type changes
  • New tables appearing in schemas

Setting Up Schema Monitoring

  1. Navigate to Settings > Data Sources > Your Source System
  2. Go to the Metadata tab
  3. Configure Scheduled Refresh:
    • Set a daily or weekly schedule
    • Enable change tracking
  4. Review changes in the Catalog Explorer

You can also create tests that explicitly validate schema expectations:

-- Test: Customer table should have exactly 15 columns
SELECT COUNT(*) as column_count
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'dim_customer'
  AND TABLE_SCHEMA = 'dw'

With result criteria: column_count = 15
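To see the same check run end to end, here is a sketch against an in-memory SQLite table (SQLite exposes column metadata via `PRAGMA table_info` rather than `INFORMATION_SCHEMA`; the three-column `dim_customer` below is illustrative):

```python
import sqlite3

# Stand-in for the warehouse table whose shape we want to pin down.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_key INTEGER, name TEXT, email TEXT)"
)

# The baseline you locked in when the test was created.
expected_columns = 3

# SQLite's equivalent of counting rows in INFORMATION_SCHEMA.COLUMNS.
actual_columns = len(conn.execute("PRAGMA table_info(dim_customer)").fetchall())

# A new or dropped column changes the count and flags schema drift.
assert actual_columns == expected_columns
print("schema check passed:", actual_columns, "columns")
```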

Part 5: Organizing Your Tests

As your test suite grows, organization becomes critical.

Recommended Folder Structure

Project: Data Warehouse Validation
├── Source-to-Target Tests/
│   ├── Staging to DW/
│   └── Source System to Staging/
├── Data Quality Tests/
│   ├── Null Checks/
│   ├── Uniqueness/
│   └── Referential Integrity/
├── Row Count Validations/
│   ├── Fact Tables/
│   └── Dimension Tables/
└── Schema Tests/
    └── Structure Validation/

Part 6: Automating Test Execution

Creating a Job

Jobs group tests for scheduled execution:

  1. Navigate to Job Repository > New Job
  2. Add your tests to the job
  3. Configure execution settings:
    • Parallel execution for faster runs
  4. Set up a schedule aligned with your ETL runs

Integration with Data Pipelines

For pipeline integration, use the Validatar API to:

  • Trigger test execution after ETL completes
  • Poll for results and fail pipelines on test failures
  • Include data quality metrics in your observability platform
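The gating logic your pipeline needs after polling is small. A minimal sketch, noting that the Validatar API's actual endpoints and payload shape are not covered in this guide, so the `results` structure and `status` values below are hypothetical:

```python
# Hypothetical job-run results as your pipeline might assemble them after
# polling the Validatar API; field names are illustrative, not the real schema.
def pipeline_should_fail(results: list[dict]) -> bool:
    """Fail the pipeline if any test in the job run did not pass."""
    return any(r["status"] != "Passed" for r in results)

results = [
    {"test": "Fact_Sales - Row Count Check",    "status": "Passed"},
    {"test": "Staging to DW - Customer Count",  "status": "Failed"},
]

if pipeline_should_fail(results):
    print("data quality gate failed; stopping pipeline")
```

In an orchestrator, raising an exception (or exiting nonzero) at that point is what actually halts downstream tasks.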

Next Steps

You've now established a solid foundation for automated data validation. Here's where to go next:

Goal                                       Resource
Explore all test configuration options     Configuring a Test
Learn about data profiling                 Data Profiles
Set up notifications for test failures     Turning On Notifications
Build custom reports                       Creating Reports
Understand the test lifecycle              Test Automation Lifecycle

Summary

As a Data Architect, Validatar helps you:

  • Save time with template-based testing across hundreds of objects
  • Catch issues early with source-to-target validation
  • Prevent surprises with automated schema drift detection
  • Scale confidently knowing your data pipelines are validated

Start with the quick wins to prove immediate value, then progressively expand your test coverage using templates. Your future self (and your stakeholders) will thank you.

Need Help?

Check out our FAQs or contact support if you have questions about implementing these testing patterns.