Data Profiles
  • 20 Mar 2024

What are Data Profiles?

Data profiles are measurements of existing data that help you assess its accuracy, completeness, and quality. Data profiling is typically used within the broader context of ELT, monitoring, and data governance. Done properly, it plays a significant part in cleansing and enriching data and in maintaining data quality across an organization.

Some common data profiles that are useful in many cases are:

  • Record count
  • Distinct value count
  • Nulls
  • Range of values
  • Average values
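
As a rough illustration of what these profiles measure, the sketch below computes each of them for a single column held in memory. It is not how Validatar computes profiles; the function name `profile_column` and the sample data are assumptions made for this example.

```python
from statistics import mean


def profile_column(values):
    """Compute a few common profiles for one column (hypothetical helper)."""
    # Nulls are represented as None and excluded from value-based profiles.
    non_null = [v for v in values if v is not None]
    return {
        "record_count": len(values),
        "distinct_count": len(set(non_null)),
        "null_count": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "mean": mean(non_null) if non_null else None,
    }


# Sample data invented for this example.
order_totals = [10, 20, 20, None, 50]
print(profile_column(order_totals))
```

Comparing these numbers between runs, or against expected thresholds, is what makes a profile useful for monitoring.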

Validatar contains a standard set of 40 data profiles. They are listed on the Data Profiles page, which has filterable columns. To delete a data profile, click the white space next to its Name, then click the Delete button at the top of the page.

Creating a New Data Profile

Creating a new data profile only registers the profile itself. Its definition is configured separately for each database engine: navigate to Settings > Database Engines > Profiling and choose the data profile. This lets different database engines share the same data profile name while configuring it differently.
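
To see why a per-engine definition is useful, consider a profile such as Mean Length: SQL Server measures string length with `LEN`, while PostgreSQL uses `LENGTH`. The sketch below models one profile key mapped to a different query per engine. The dictionary layout, the key `mean_length`, the engine names, and the function `build_profile_query` are all assumptions for illustration, not Validatar's stored format or API.

```python
# Hypothetical per-engine definitions for a single profile.
# Keys, engine names, and SQL templates are assumptions for this sketch.
PROFILE_DEFINITIONS = {
    "mean_length": {
        "sql_server": "SELECT AVG(LEN({column})) FROM {table}",
        "postgresql": "SELECT AVG(LENGTH({column})) FROM {table}",
    },
}


def build_profile_query(reference_key, engine, table, column):
    """Return the SQL text for a profile as configured for one engine."""
    template = PROFILE_DEFINITIONS[reference_key][engine]
    # Table and column names are assumed to be trusted identifiers here.
    return template.format(table=table, column=column)


print(build_profile_query("mean_length", "sql_server", "customers", "last_name"))
print(build_profile_query("mean_length", "postgresql", "customers", "last_name"))
```

The profile name stays the same everywhere it is referenced; only the engine-specific definition changes.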

  1. Click New Profile on the Data Profiles page. A new page opens with fields for the Name (required), Reference Key (required), Description (optional), Profiled Object (required), and Results Format (required).
  2. Configure the settings described in Table 1.
  3. Click Save. The Save button becomes clickable once all required fields are populated.

Table 1

Setting | Description
Name | The data profile name.
Reference Key | The data profile's unique key identifier.
Description (optional) | A description of the data profile.
Profiled Object | Whether the profile is table-level or column-level:
  • Table level - Evaluates the entire table.
  • Column level - Evaluates one column.
Results Format | The data type and format of the result. Options are:
  • Date/Time Array
  • Date/Time Value
  • Numeric Array
  • Numeric Value
  • Percent Value
  • String Array
  • String Value
Restrict data types (checkbox) | When checked, the data profile is only valid for the selected data types. The data type options are:
  • Approximate Numeric
  • Boolean/Logical
  • Date/Time
  • Exact Numeric
  • Geospatial
  • GUID
  • String
  • Other
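
The settings in Table 1 can also be read as a small data structure with a few validation rules. The sketch below models them that way. The class `DataProfile`, its field names, and the method `validation_errors` are hypothetical and do not reflect Validatar's internals; only the option lists are taken from Table 1.

```python
from dataclasses import dataclass, field

# Option lists copied from Table 1.
PROFILED_OBJECTS = {"Table", "Column"}
RESULT_FORMATS = {
    "Date/Time Array", "Date/Time Value", "Numeric Array", "Numeric Value",
    "Percent Value", "String Array", "String Value",
}
DATA_TYPES = {
    "Approximate Numeric", "Boolean/Logical", "Date/Time", "Exact Numeric",
    "Geospatial", "GUID", "String", "Other",
}


@dataclass
class DataProfile:
    """Hypothetical model of the fields on the new-profile page."""
    name: str
    reference_key: str
    profiled_object: str
    results_format: str
    description: str = ""  # optional
    restricted_data_types: set = field(default_factory=set)

    def validation_errors(self):
        """Return a list of problems; an empty list means the profile can be saved."""
        errors = []
        if not self.name.strip():
            errors.append("Name is required")
        if not self.reference_key.strip():
            errors.append("Reference Key is required")
        if self.profiled_object not in PROFILED_OBJECTS:
            errors.append("Profiled object must be Table or Column")
        if self.results_format not in RESULT_FORMATS:
            errors.append("Unknown results format")
        unknown = self.restricted_data_types - DATA_TYPES
        if unknown:
            errors.append(f"Unknown data types: {sorted(unknown)}")
        return errors


# Example values are invented for illustration.
profile = DataProfile(
    name="Null Count",
    reference_key="null_count",
    profiled_object="Column",
    results_format="Numeric Value",
)
print(profile.validation_errors())
```

An empty list here corresponds to the state in which every required field is populated and the profile can be saved.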

Data Profile List

The complete list of the standard profiles pre-written in Validatar is:

Record Count | Lower Quartile | Null Percent | Maximum (Date)
Total Data MB | Upper Quartile | Blank Count | Longest Value
Distinct Count | Minimum (String) | Blank Percent | Shortest Value
Distinct Percent | Maximum (String) | Numeric Count | Distribution (String)
Most Common Value | Standard Deviation | Numeric Percent | Distribution (Numeric)
Most Common Count | Max Length | Zero Count | Top 10 Values
Minimum (Numeric) | Min Length | Zero Percent | Bottom 10 Values
Maximum (Numeric) | Mean Length | Negative Count | Binned (Numeric)
Mean (Numeric) | Length Distribution | Negative Percent | Year Distribution
Median (Numeric) | Null Count | Minimum (Date) | Year Month Distribution


Data Profile Details

Each data profile name is clickable and opens a details page for that profile. The details include the Name, Reference Key, Description, Profiled Object, and Results Format. There is also an optional Restrict Data Type checkbox; when it is checked, a list of data type options is displayed, and more than one option can be selected. Any changes made on the details page can be saved.

