REST API (Swagger) Data Source Template

Overview

The REST API data source template enables Validatar to discover and profile data exposed through REST APIs that publish Swagger/OpenAPI specifications. It uses Python scripts to read the API specification, map endpoints to tables and response fields to columns, and sample data for profiling.

Platform: Any REST API with Swagger/OpenAPI spec
Connection Category: Script
Template Category: Marketplace

What's Included

Default Parameters

| Parameter | Type | Description |
|---|---|---|
| api_base_url | String | Base URL of the API |
| swagger_url | String | URL of the Swagger/OpenAPI spec (e.g., /swagger/v1/swagger.json) |
| api_key | Secret | API authentication key or token |
| auth_type | Dropdown | Authentication type (Bearer, API Key Header, Basic) |
| auth_header_name | String | Custom header name for API key authentication |
| page_size | Integer | Records per page for paginated endpoints |
| max_sample_pages | Integer | Maximum pages to fetch for profiling samples |
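
As a rough illustration of how the authentication parameters could translate into request headers, the sketch below builds a header dictionary from auth_type, api_key, and auth_header_name. The helper name build_auth_headers, the default header name X-API-Key, and the assumption that api_key holds "username:password" for Basic authentication are all hypothetical; the template's actual script may handle these differently.

```python
import base64


def build_auth_headers(auth_type, api_key, auth_header_name="X-API-Key"):
    """Return HTTP headers for the given auth_type (hypothetical helper)."""
    if auth_type == "Bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if auth_type == "API Key Header":
        # The key is sent as-is under the configured custom header name.
        return {auth_header_name: api_key}
    if auth_type == "Basic":
        # Assumption: api_key holds "username:password" for Basic auth.
        token = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
        return {"Authorization": f"Basic {token}"}
    raise ValueError(f"Unsupported auth_type: {auth_type!r}")
```

The resulting dictionary can be passed to whichever HTTP client the script uses.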

Data Type Mappings

OpenAPI/JSON Schema types are mapped as follows:

  • string → String
  • integer → Integer
  • number → Decimal
  • boolean → Boolean
  • array, object → Other
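
The mapping above can be expressed as a simple lookup table. This is a minimal sketch; the names TYPE_MAP and map_type are illustrative, and the fallback to Other for unknown or missing types is an assumption rather than documented behavior.

```python
# OpenAPI/JSON Schema type -> mapped type, as listed above.
TYPE_MAP = {
    "string": "String",
    "integer": "Integer",
    "number": "Decimal",
    "boolean": "Boolean",
    "array": "Other",
    "object": "Other",
}


def map_type(openapi_type):
    """Map an OpenAPI type name to its target type."""
    # Assumption: unknown or missing types fall back to "Other".
    return TYPE_MAP.get(openapi_type, "Other")
```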

Metadata Ingestion

The ingestion script:

  • Fetches the Swagger/OpenAPI specification from the configured URL
  • Maps API tags or resource groups to schemas
  • Maps each endpoint that has a GET operation to a table
  • Maps response schema properties to columns, with types taken from the spec
  • Supports both OpenAPI 2.0 (Swagger) and OpenAPI 3.0 specifications
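
The mapping from a specification to schemas, tables, and columns can be sketched as follows. This is an illustrative outline that works on an already-parsed spec dictionary, not the template's actual script. The function names are hypothetical, and it assumes a JSON response under the "200" status code, local "#/..." references only, the first tag as the schema name, and "default" when an operation has no tags.

```python
def resolve_ref(spec, schema):
    """Follow a local '#/...' $ref if present; otherwise return the schema."""
    ref = schema.get("$ref") if isinstance(schema, dict) else None
    if not ref or not ref.startswith("#/"):
        return schema or {}
    node = spec
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def response_schema(spec, operation):
    """Return the 200 response schema for Swagger 2.0 or OpenAPI 3.0."""
    response = operation.get("responses", {}).get("200", {})
    if "schema" in response:
        # Swagger 2.0 keeps the schema directly on the response.
        schema = response["schema"]
    else:
        # OpenAPI 3.0 nests it under content -> media type.
        content = response.get("content", {})
        schema = content.get("application/json", {}).get("schema", {})
    schema = resolve_ref(spec, schema)
    if schema.get("type") == "array":
        # List endpoints: describe a single item of the array.
        schema = resolve_ref(spec, schema.get("items", {}))
    return schema


def discover(spec):
    """Yield (schema_name, table_name, columns) for each GET operation."""
    for path, item in spec.get("paths", {}).items():
        operation = item.get("get")
        if not operation:
            continue  # only GET operations become tables
        tag = (operation.get("tags") or ["default"])[0]
        props = response_schema(spec, operation).get("properties", {})
        columns = {
            name: resolve_ref(spec, prop).get("type", "object")
            for name, prop in props.items()
        }
        yield tag, path, columns
```

Each yielded tuple corresponds to one table: the tag plays the role of the schema, the path is the table name, and the dictionary lists column names with their OpenAPI types.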

Profiling

The profiling script:

  • Calls each GET endpoint with pagination
  • Samples response data up to the configured page limit
  • Calculates record counts, null counts, distinct counts, and type-specific metrics
  • Respects rate limits with configurable delays between requests
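
A minimal sketch of the sampling and metric steps is shown below. It is not the template's profiling script: fetch_page is a hypothetical callable standing in for the HTTP request, the page-number pagination style and the fixed delay_seconds pause are assumptions, and only the three basic counts are computed.

```python
import time


def sample_records(fetch_page, page_size, max_sample_pages, delay_seconds=0.0):
    """Collect records page by page until the page limit or a short page.

    fetch_page(page, page_size) is a hypothetical callable returning a list
    of dicts for one page; in practice it would wrap an HTTP GET request.
    """
    records = []
    for page in range(1, max_sample_pages + 1):
        if page > 1:
            time.sleep(delay_seconds)  # fixed pause between requests
        batch = fetch_page(page, page_size)
        records.extend(batch)
        if len(batch) < page_size:
            break  # a short page means there is no more data
    return records


def profile_column(records, column):
    """Compute basic metrics for one column of the sampled records."""
    values = [record.get(column) for record in records]
    non_null = [value for value in values if value is not None]
    return {
        "record_count": len(values),
        "null_count": len(values) - len(non_null),
        # repr() lets unhashable values (lists, dicts) be counted too.
        "distinct_count": len(set(map(repr, non_null))),
    }
```

Passing the fetch function in as an argument keeps the sampling loop independent of the pagination style, which is one way to accommodate the customizations described later.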

Installation

Customization

  • Authentication — Modify for OAuth2, JWT, or custom authentication flows
  • Pagination — Adjust pagination logic for APIs that use cursor-based, offset, or link-header pagination
  • Endpoint filtering — Include or exclude specific endpoints from discovery
  • Nested objects — Extend column discovery to flatten nested response objects
  • POST endpoints — Modify the ingestion script to include POST endpoints that return data (e.g., search/filter endpoints)
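
Two of these customizations, endpoint filtering and nested-object flattening, could look roughly like the sketch below. The helper names, the glob-style include/exclude patterns, and the dotted column naming are illustrative assumptions, not part of the template.

```python
from fnmatch import fnmatchcase


def include_endpoint(path, include=("*",), exclude=()):
    """Keep a path if it matches an include pattern and no exclude pattern."""
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(path, pattern) for pattern in include)


def flatten(record, parent="", sep="."):
    """Flatten nested dicts into one level with dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value  # lists and scalars are kept as-is
    return flat
```

With these helpers, a path such as /internal/health could be skipped with exclude=("/internal/*",), and a response field like address.city would surface as its own column.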
