Exercise 11: Create a Source to Target Template Test Using a Metadata Script
  • 21 Jun 2024
  • 1 Minute to read
  • PDF

Exercise 11: Create a Source to Target Template Test Using a Metadata Script

  • PDF

Article summary

Learning Goals
  • How do I create a Template Test from scratch?
  • How do I use a script to generate my list of child tests?
  • How do I configure a template test to use variables?
  • How do I connect to an Excel File?
  • How do I compare a file to a table?

Step 1 - Initialize Template Test

  • Create a new Template Test
  • Name your test: "Exercise 11: Source to Target Row Count Excel File vs Table"
  • Set the Quality Dimension to Completeness
  • Set the Severity Level to Low
  • Change the Metadata Selection to Select Objects and Choose the RAW.ACCOUNT table to start with

image.png

Step 2 - Initialize Test Variable

  • Go back to Test Definition tab
  • Add Variable named "source_customer"
  • Variable Type = "Other"
  • Default = "CUSTOMER_1001"

Step 3 - Initialize Test Data Set

  • Set the Data Source to the Snowflake Data Warehouse
  • Set How to get the Data to Script
  • Populate the SQL Script with this:
Select count(*)
From {{schema.name}}.{{table.name}}
where SOURCE_CUSTOMER = '{{var.source_customer}}'
  • Preview & Refresh Options
  • Confirm a single value is returned

image.png

Step 4 - Initialize Control Data Set

  • Compare to "Another Data Set"
  • Set the Data Source to the Windows Directory
  • Set How to get the Data to Script
  • Populate the Python Script with this:
folder = ".\\EXCEL\\{{var.source_customer}}"  
file = "{{#lower table.name}}_data.xlsx"
sheet = 'Sheet1'


# Full path of the file
full_path = parent_folder_path + "\\\\" + folder[1:] 
file_path = os.path.join(full_path, file)
 

# Check if file exists
if not os.path.isfile(file_path):
    raise FileNotFoundError(f"The file {file_path} does not exist.")

# Read excel file
try:
    x = pd.read_excel(file_path, sheet_name=sheet)
    dataframe = pd.DataFrame({'RowCount':[x.index.size]})
except Exception as e:
    raise Exception(f"Failed to read the excel file: {e}")

  • Preview & Refresh Options
  • Confirm a single value is returned

image.png

Step 5 - Configure Metadata Script

  • Scroll to top and change Object Selection and Definition to "Metadata Script"
  • Switch to the Metadata Script tab
  • Set the Data Source to the Snowflake Data Warehouse
  • Populate the SQL Metadata Script with this:
Select t.table_schema || '.' || t.table_name || ' - ' || s.SOURCE_CUSTOMER as "test.key"
, t.table_name as "table.name"
, t.table_schema as "schema.name"
, s.SOURCE_CUSTOMER as "var.source_customer"
from information_schema.tables t
cross join ( select distinct SOURCE_CUSTOMER FROM RAW.ACCOUNT) s
where table_schema = 'RAW'
  • Preview & Validate Metadata Script
  • Confirm that relevant tables and source_customer values are returned

image.png

Step 7 - Save -> Materialize -> Run

  • Save the test
  • Materialize Child Tests
  • Run Child Tests

Was this article helpful?