REST API (Swagger) Data Source Template


Overview

The REST API data source template enables Validatar to discover and profile data exposed through REST APIs that publish Swagger/OpenAPI specifications. It uses Python scripts to read the API specification, map endpoints to tables and response fields to columns, and sample data for profiling.

Platform: Any REST API with Swagger/OpenAPI spec
Connection Category: Script
Template Category: Marketplace

What's Included

Default Parameters

| Parameter | Type | Description |
|---|---|---|
| api_base_url | String | Base URL of the API |
| swagger_url | String | URL of the Swagger/OpenAPI spec (e.g., /swagger/v1/swagger.json) |
| api_key | Secret | API authentication key or token |
| auth_type | Dropdown | Authentication type (Bearer, API Key Header, Basic) |
| auth_header_name | String | Custom header name for API key authentication |
| page_size | Integer | Records per page for paginated endpoints |
| max_sample_pages | Integer | Maximum pages to fetch for profiling samples |
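As a rough illustration of how these parameters might be combined, the sketch below builds the spec URL and the authentication headers. The helper names `build_spec_url` and `build_auth_headers`, the default header name `X-API-Key`, and the assumption that `api_key` holds `username:password` for Basic authentication are all hypothetical and are not taken from the template's actual scripts.

```python
import base64
from urllib.parse import urljoin


def build_spec_url(api_base_url, swagger_url):
    """Join the base URL and the spec path (hypothetical helper).

    urljoin returns swagger_url unchanged if it is already absolute.
    """
    return urljoin(api_base_url, swagger_url)


def build_auth_headers(auth_type, api_key, auth_header_name="X-API-Key"):
    """Return HTTP headers for the selected auth_type (hypothetical helper)."""
    if auth_type == "Bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if auth_type == "API Key Header":
        # auth_header_name is the custom header configured on the data source.
        return {auth_header_name: api_key}
    if auth_type == "Basic":
        # Assumption: api_key is stored as "username:password".
        token = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
        return {"Authorization": f"Basic {token}"}
    raise ValueError(f"Unsupported auth_type: {auth_type!r}")
```

Note that `urljoin` treats a `swagger_url` with a leading slash as relative to the host, so any path segment in `api_base_url` is dropped in that case.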

Data Type Mappings

The template maps OpenAPI/JSON Schema types to Validatar data types as follows:

  • string → String
  • integer → Integer
  • number → Decimal
  • boolean → Boolean
  • array, object → Other
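The mapping above can be expressed as a simple lookup. This is a minimal sketch: the names `TYPE_MAP` and `map_type` are hypothetical, and falling back to `Other` for a missing or unrecognized type is an assumption rather than documented behavior.

```python
# OpenAPI/JSON Schema type -> Validatar data type, as listed above.
TYPE_MAP = {
    "string": "String",
    "integer": "Integer",
    "number": "Decimal",
    "boolean": "Boolean",
    "array": "Other",
    "object": "Other",
}


def map_type(property_schema):
    """Map one property schema (a dict from the spec) to a data type name.

    Assumption: anything missing or unrecognized is treated as "Other".
    """
    return TYPE_MAP.get(property_schema.get("type"), "Other")
```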

Metadata Ingestion

The ingestion script:

  • Fetches the Swagger/OpenAPI specification from the configured URL
  • Maps API tags or resource groups to schemas
  • Maps each GET operation to a table
  • Maps response schema properties to columns, with data types taken from the spec
  • Supports OpenAPI 2.0 (Swagger) and 3.0 specifications
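The discovery steps above could be sketched as follows for a spec that has already been loaded into a Python dict. The function names, the `default` schema used for untagged operations, the use of the first tag only, and the choice of the `200` JSON response are illustrative assumptions; the template's own script may differ.

```python
def resolve_ref(spec, schema):
    """Follow a local '#/...' $ref one level; return other schemas unchanged."""
    ref = schema.get("$ref")
    if not ref or not ref.startswith("#/"):
        return schema
    node = spec
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def response_schema(spec, operation):
    """Return the object schema describing one record of a GET response."""
    responses = operation.get("responses", {})
    resp = responses.get("200") or responses.get("default") or {}
    if "schema" in resp:
        # OpenAPI 2.0 (Swagger): the schema sits directly on the response.
        schema = resp["schema"]
    else:
        # OpenAPI 3.0: the schema sits under content -> media type.
        schema = resp.get("content", {}).get("application/json", {}).get("schema", {})
    schema = resolve_ref(spec, schema)
    if schema.get("type") == "array":
        # A list endpoint: describe the items rather than the array itself.
        schema = resolve_ref(spec, schema.get("items", {}))
    return schema


def discover(spec):
    """Map tags to schemas, GET operations to tables, properties to columns."""
    tables = []
    for path, item in spec.get("paths", {}).items():
        operation = item.get("get")
        if not operation:
            continue  # only GET operations become tables
        schema_name = (operation.get("tags") or ["default"])[0]
        props = response_schema(spec, operation).get("properties", {})
        columns = [
            (name, resolve_ref(spec, prop).get("type", "object"))
            for name, prop in props.items()
        ]
        tables.append({"schema": schema_name, "table": path, "columns": columns})
    return tables
```

The column types returned here are the raw OpenAPI types; they would still need to pass through the data type mapping described earlier.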

Profiling

The profiling script:

  • Calls each GET endpoint with pagination
  • Samples response data up to the configured page limit
  • Calculates record counts, null counts, distinct counts, and type-specific metrics
  • Respects rate limits with configurable delays between requests
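A minimal sketch of the sampling and metric steps is shown below. To keep it independent of any particular HTTP client, the page fetch is passed in as a callable; `fetch_page`, `delay_seconds`, the page-number style of pagination, and the "short page means last page" stop condition are all assumptions made for illustration.

```python
import time


def sample_pages(fetch_page, page_size, max_sample_pages, delay_seconds=0.0):
    """Collect rows page by page, stopping at the page limit or a short page.

    fetch_page(page_number, page_size) is a hypothetical callable that
    returns a list of dict records for one page.
    """
    rows = []
    for page in range(1, max_sample_pages + 1):
        batch = fetch_page(page, page_size)
        rows.extend(batch)
        if len(batch) < page_size:
            break  # assumption: a short page signals the end of the data
        time.sleep(delay_seconds)  # simple fixed delay between requests
    return rows


def profile(rows):
    """Compute record, null, and distinct counts per column."""
    columns = sorted({key for row in rows for key in row})
    stats = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v is not None]
        stats[col] = {
            "record_count": len(values),
            "null_count": len(values) - len(non_null),
            # repr() lets unhashable values (lists, dicts) be counted too.
            "distinct_count": len({repr(v) for v in non_null}),
        }
    return stats
```

In this sketch a key that is absent from a record is counted as null, which may or may not match how the actual profiling script treats missing fields.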

Installation

Customization

  • Authentication — Modify for OAuth2, JWT, or custom authentication flows
  • Pagination — Adjust pagination logic for APIs that use cursor-based, offset, or link-header pagination
  • Endpoint filtering — Include or exclude specific endpoints from discovery
  • Nested objects — Extend column discovery to flatten nested response objects
  • POST endpoints — Modify the ingestion script to include POST endpoints that return data (e.g., search/filter endpoints)
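Two of these customizations, endpoint filtering and nested-object flattening, might look like the sketch below. The function names, the glob-style patterns, and the dot-separated column naming are hypothetical choices, not behavior defined by the template.

```python
from fnmatch import fnmatchcase


def include_endpoint(path, include=("*",), exclude=()):
    """Decide whether an endpoint path should be discovered.

    Exclude patterns win over include patterns; both use glob syntax.
    """
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(path, pattern) for pattern in include)


def flatten(record, prefix="", sep="."):
    """Flatten nested dicts into one level, e.g. address.city."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            # Lists and scalars are kept as-is under the flattened name.
            flat[name] = value
    return flat
```

For example, `include_endpoint("/internal/health", exclude=("/internal/*",))` evaluates to `False`, so that endpoint would be skipped during discovery.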
