Glossary

A list of terms and definitions

  • Source: A data source (files, databases, etc.).
  • Pack: A microprogram that analyzes a source.
  • Metrics: Results of a pack's analysis on a source. Example: completeness=0.8
  • Worker: An instance of the CLI that executes packs on data sources. Workers are deployed on-premises and are the only components with direct access to your data.
  • Routine: A recurring execution of a pack on a source, scheduled with a cron expression.
  • Scope: The part of a source (dataset, table, column, cell, item, etc.) that a metric or recommendation refers to.
  • Ticket: A task assigned to one or more people, describing the actions to be performed. Also referred to as an "Issue".
  • Project: An organizational entity that groups information about a set of sources so that teams can collaborate on their quality assurance.
  • Report: A page presenting a set of information about the quality of one or more sources linked to a project. Exportable and shareable.
  • Studio: The AI-powered conversational interface for exploring, investigating, and documenting data quality. Supports multiple LLM providers and streaming responses.
  • LLM Configuration: A configuration entry defining which Large Language Model provider and model to use for Studio's AI features. Supports OpenAI, Azure OpenAI, Anthropic, Mistral, Ollama, and generic OpenAI-compatible APIs.
  • MCP Server: A Model Context Protocol server that extends Studio's capabilities with custom external tools. MCP servers are configured per organization and communicate via JSON-RPC 2.0.
  • Alert: A notification triggered when data quality scores fall below defined thresholds. Alerts can be configured per source and silenced when needed.
  • Certification: A workflow for formally certifying the quality status of data sources based on defined criteria and quality metrics.
  • Curation Plan: A structured plan that brings teams together around data quality challenges, defining actions and responsibilities for improving data quality.
  • Domain: The business domain associated with a data source (e.g., Finance, Marketing, Healthcare). Used for governance and catalog organization.
  • Data Owner: The person or team responsible for the quality and governance of a data source.
  • Habilitation: Granular permissions assigned to users on specific data sources, controlling access levels within the platform.
  • Recommendation: A suggestion produced by a pack to improve the quality of a data source. Recommendations are attached to specific scopes.
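
To make the relationship between Source, Pack, Scope, Metrics, and Alert more concrete, here is a minimal Python sketch. It is not the platform's API: the `Metric` class, the `triggered_alerts` function, the `"column:email"` scope notation, and the source and pack names are all hypothetical and chosen only for illustration. It assumes one threshold per source and treats a metric value below that threshold as an alert, unless the source has been silenced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One result of a pack's analysis (hypothetical model, not the platform's schema)."""
    source: str   # name of the data source that was analyzed
    pack: str     # name of the pack that produced the result
    scope: str    # part of the source the result refers to, e.g. "column:email"
    key: str      # metric name, e.g. "completeness"
    value: float  # metric value, e.g. 0.8


def triggered_alerts(metrics, thresholds, silenced=frozenset()):
    """Return the metrics whose value is below the threshold set for their source.

    thresholds maps a source name to its minimum acceptable value.
    Sources listed in silenced never produce alerts.
    """
    return [
        m for m in metrics
        if m.source not in silenced
        and m.source in thresholds
        and m.value < thresholds[m.source]
    ]


# Illustrative data only; the names below are made up.
metrics = [
    Metric("customers_db", "profiling", "column:email", "completeness", 0.8),
    Metric("customers_db", "profiling", "column:id", "completeness", 0.97),
]
alerts = triggered_alerts(metrics, {"customers_db": 0.9})
for m in alerts:
    print(f"ALERT {m.source} [{m.scope}] {m.key}={m.value}")
```

With a threshold of 0.9 on the hypothetical `customers_db` source, only the metric scoped to `column:email` is reported; passing `silenced={"customers_db"}` suppresses it.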