What is a Data Processing Engine?
  • 20 Mar 2024

Overview

Validatar uses a data processing engine to compare the results of the two datasets executed within a standard test and determine whether the test passes or fails. There are two types of data processing engines: Built-In and Snowflake.

Each data source is assigned a data processing engine when the primary connection to the database is set up. When a standard test is executed, the data processing engine assigned to the data source selected in the "What do you want to test?" section is used to compare the results of both datasets.

Built-In Engine

The Built-In engine is a licensed feature. When this engine is used, Validatar retrieves the data from both datasets and processes it in memory on the Validatar web server. When the "Scripts Include Order By" option is enabled for a test, the data from both datasets is streamed to the web server instead of being fully loaded into memory, which greatly reduces the total amount of memory used by that test on the Validatar web server.
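To illustrate why ordered results can be compared without holding both datasets in memory, here is a minimal sketch of a merge-style comparison over two sorted row streams. It is not Validatar's implementation: the function name `compare_sorted_streams`, the row layout (a tuple whose first element is the key), the result categories, and the assumption that keys are unique and sorted ascending on both sides are all hypothetical choices made for this example.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Tuple  # assumption for this sketch: row[0] is the comparison key


def compare_sorted_streams(left: Iterable[Row], right: Iterable[Row]) -> Dict[str, int]:
    """Compare two row streams that are both sorted ascending by row[0].

    Only one row from each side is held at a time, so memory use does not
    grow with the size of either dataset. Keys are assumed to be unique.
    """
    counts = {"matched": 0, "mismatched": 0, "left_only": 0, "right_only": 0}
    left_iter, right_iter = iter(left), iter(right)
    l: Optional[Row] = next(left_iter, None)
    r: Optional[Row] = next(right_iter, None)

    while l is not None and r is not None:
        if l[0] == r[0]:
            # Same key on both sides: compare the full rows.
            counts["matched" if l == r else "mismatched"] += 1
            l = next(left_iter, None)
            r = next(right_iter, None)
        elif l[0] < r[0]:
            # Key exists only on the left; advance the left stream.
            counts["left_only"] += 1
            l = next(left_iter, None)
        else:
            # Key exists only on the right; advance the right stream.
            counts["right_only"] += 1
            r = next(right_iter, None)

    # Drain whichever stream still has rows.
    while l is not None:
        counts["left_only"] += 1
        l = next(left_iter, None)
    while r is not None:
        counts["right_only"] += 1
        r = next(right_iter, None)
    return counts


if __name__ == "__main__":
    source = iter([(1, "a"), (2, "b"), (4, "d")])
    target = iter([(1, "a"), (2, "x"), (3, "c")])
    print(compare_sorted_streams(source, target))
```

Without a guaranteed ordering, a comparison like this would generally need to buffer at least one full dataset (for example, in a dictionary keyed by row) before matching rows, which is the kind of memory cost that streaming avoids.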

Results calculated by the Built-In engine can be stored either locally in the Validatar metadata repository or in Snowflake.

Snowflake Engine

The Snowflake engine is a licensed feature. When this engine is used, data from both datasets is processed directly within the corresponding Snowflake account, and only the summary result of the comparison is sent to the Validatar web server. Data that a data agent retrieves from databases outside of Snowflake is pushed from the data agent directly into Snowflake.

Results calculated by the Snowflake engine are always stored in Snowflake.

