Overview
When you configure repository replication, Validatar continuously syncs its internal data to your data processing engine (Snowflake or Databricks). This gives your team direct SQL access to metadata catalogs, test definitions, execution results, profiling metrics, trust scores, and security permissions — enabling custom reporting, BI tool integration, and any analysis that benefits from having Validatar's operational data in a queryable warehouse.
This reference section documents every view and table available in the replicated schema, including column definitions, relationships between objects, and practical SQL examples.
Schema Layout
The replicated database contains two schemas:
REPOSITORY Schema
Contains pre-built views that present Validatar's data in a query-friendly format. These are the primary interface for most use cases.
| View | Purpose |
|---|---|
| VW_DATASET_TABLES | All metadata tables/views with trust scores, custom fields, and profile summaries |
| VW_DATASET_COLUMNS | All metadata columns with data types, custom fields, and detailed profile metrics |
| VW_DATASET_TESTS | Test definitions (standard, template, materialized) with quality dimensions and overall status |
| VW_DATASET_TEST_EXECUTIONS | Test execution history with results, durations, and row-level counts |
| VW_DATASET_JOBS | Job definitions and execution history with pass/fail summaries |
| VW_DATASET_METADATA_OBJECTS | Flat list of all schemas, tables, and columns with paths and URLs |
| VW_DATASET_EVENT_HISTORY | Audit log of all platform events |
| VW_SECURE_DATASET_* | Row-level security filtered versions of the above views |
RESULTS Schema
Contains raw test result data and profiling metrics, stored in tables rather than views:
| Table | Purpose |
|---|---|
| TEST_EXECUTION_RESULTS | Row-level results for single-value tests (one row per comparison) |
| TEST_EXECUTION_TABLE_COLUMNS | Column definitions for multi-row test results |
| TEST_EXECUTION_TABLE_RESULTS | Row-level pass/fail for multi-row tests |
| TEST_EXECUTION_TABLE_RESULT_KEYS | Key column values for each result row (long format) |
| TEST_EXECUTION_TABLE_RESULT_VALUES | Value column comparisons for each result row (long format) |
| META_DATA_LATEST_PROFILES | Most recent profile values per column/table |
| META_DATA_PROFILE_SET_EXECUTION_RESULTS | Historical profile execution results |
How the Schemas Relate
┌─────────────────────────────────────────────────────────────────────┐
│ REPOSITORY Schema (Views) │
│ │
│ VW_DATASET_TABLES ──┐ │
│ │ │ │
│ ▼ ▼ │
│ VW_DATASET_COLUMNS VW_DATASET_TESTS │
│ │ │
│ ▼ │
│ VW_DATASET_TEST_EXECUTIONS ──────────────┐ │
│ │ │ │
│ ▼ ▼ │
│ VW_DATASET_JOB_EXECUTIONS RESULTS Schema │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ RESULTS Schema (Tables) │
│ │
│ TEST_EXECUTION_RESULTS (single-value tests) │
│ │
│ TEST_EXECUTION_TABLE_RESULTS ───┐ (multi-row tests) │
│ │ │ │
│ ▼ ▼ │
│ TABLE_RESULT_KEYS TABLE_RESULT_VALUES │
│ │
│ META_DATA_LATEST_PROFILES (profiling) │
│ META_DATA_PROFILE_SET_EXECUTION_RESULTS │
└─────────────────────────────────────────────────────────────────────┘
Key Join Patterns
REPOSITORY to RESULTS
Join test executions to their row-level results using TEST_EXECUTION_ID:
SELECT
te.TEST_NAME,
te.RESULT,
r.NUMERIC_SOURCE_VALUE,
r.NUMERIC_TARGET_VALUE,
r.PERCENT_DIFFERENCE
FROM REPOSITORY.VW_DATASET_TEST_EXECUTIONS te
JOIN RESULTS.TEST_EXECUTION_RESULTS r
ON r.TEST_EXECUTION_ID = te.TEST_EXECUTION_ID
WHERE te.LATEST_TEST_EXECUTION_INDEX = 1;
Tables to Columns
Join using TABLE_ID:
SELECT
t.TABLE_NAME,
t.TRUST_SCORE,
c.COLUMN_NAME,
c.PROFILE_NULL_PERCENT
FROM REPOSITORY.VW_DATASET_TABLES t
JOIN REPOSITORY.VW_DATASET_COLUMNS c
ON c.TABLE_ID = t.TABLE_ID
WHERE t.DATA_SOURCE_NAME = 'My Warehouse';
Tests to Metadata Objects
Join using the metadata link object key:
SELECT
t.TEST_NAME,
t.OVERALL_RESULT_STATUS,
mo.METADATA_PATH
FROM REPOSITORY.VW_DATASET_TESTS t
JOIN REPOSITORY.VW_DATASET_METADATA_OBJECTS mo
ON mo.OBJECT_KEY = t.TEST_PRIMARY_METADATA_LINK_OBJECT_KEY
WHERE t.IS_ACTIVE_VERSION = TRUE;
Connecting to the Schema
Snowflake
Use any Snowflake-compatible client or BI tool. The database name, warehouse, and schema are determined by your Validatar data processing engine configuration.
USE DATABASE <your_validatar_database>;
USE SCHEMA REPOSITORY;
-- Verify you can query
SELECT COUNT(*) FROM VW_DATASET_TABLES;
Databricks
Connect via the Databricks SQL endpoint configured as your data processing engine. The catalog and schema names match your Validatar configuration.
USE CATALOG <your_validatar_catalog>;
USE SCHEMA repository;
SELECT COUNT(*) FROM vw_dataset_tables;
Snowflake vs Databricks Differences
The view and table structures are identical between Snowflake and Databricks deployments. Minor SQL syntax differences to be aware of:
| Operation | Snowflake | Databricks |
|---|---|---|
| Date difference | TIMESTAMPDIFF(second, start, end) |
UNIX_TIMESTAMP(end) - UNIX_TIMESTAMP(start) |
| Boolean cast | col::boolean |
CAST(col AS BOOLEAN) |
| String concat | CONCAT(a, b) or a || b |
CONCAT(a, b) |
| Current timestamp | CURRENT_TIMESTAMP() |
CURRENT_TIMESTAMP() |
| Date arithmetic | DATEADD(day, -30, CURRENT_DATE) |
DATE_ADD(CURRENT_DATE(), -30) |
Custom Fields
Many views (VW_DATASET_TABLES, VW_DATASET_COLUMNS) include columns prefixed with CUSTOM_. These correspond to custom metadata fields defined in your Validatar environment and will vary by deployment. The column names are derived from the custom field names configured by your administrator.
Refresh Frequency
Data in the replicated schema is updated according to the sync schedule configured in Settings > Configuration > Repository Replication. See Replication of Your Metadata to a Data Processing Engine for configuration details.
Results data (RESULTS schema) is loaded at test execution time — results appear in the replicated tables as soon as the test completes and the data processing engine processes the staged files.