Area: Product Issues
Sub-Area: Freshness Assertions
Issue
AI-powered freshness assertions fail to train, or remain in a failed or training state, even though the BigQuery tables show recent "last modified" timestamps. Information Schema-based freshness assertions may also fail unexpectedly even when table metadata appears healthy in the BigQuery console.
You Might Be Asking
- Why do AI freshness assertions fail to train even after recreating them?
- Why do Information Schema freshness checks fail when BigQuery shows recent modifications?
- What permissions are needed for freshness assertions on BigQuery?
Solution
Two distinct issues can cause freshness assertion problems. A third check rules out an assertion that is simply misconfigured:
- AI Training Failure Due to Missing Operation Events
  AI freshness assertions do NOT learn from BigQuery's last_modified_time or from Information Schema metadata. They require DataHub operation events (INSERT, UPDATE, LOAD), which must be ingested by the BigQuery source.
  Check that your BigQuery ingestion recipe includes:
      source:
        type: bigquery
        config:
          include_table_lineage: true
          include_usage_statistics: true
          include_tables: true  # Must be true, not false
  Verify that operation events exist by checking the dataset's Timeline tab in the DataHub UI. If no operations are visible, the AI cannot train.
- Information Schema Query Configuration
  Ensure the service account has the bigquery.jobs.listAll permission on all projects where assertions are configured, as Information Schema queries require this permission.
  As a workaround, switch to the Platform API freshness source type:
      # In assertion configuration
      freshness_source_type: PLATFORM_API
  This uses client.get_table().modified, which is more reliable and doesn't depend on AI training.
- Verify Assertion Configuration
  Check that your freshness window matches the table's actual update frequency. If a table updates every 3 days but the assertion expects daily updates, it will legitimately fail.
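The last two checks can be reproduced outside DataHub when debugging. The following is a minimal sketch, not DataHub's implementation: the helper names (is_fresh, largest_gap, fetch_last_modified) are hypothetical, and fetch_last_modified assumes the google-cloud-bigquery package and working credentials. It reads the same modified value that client.get_table() exposes and compares the gap between updates against a candidate freshness window.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_modified: datetime, window: timedelta, now: datetime) -> bool:
    """Return True if the table changed within the freshness window."""
    return now - last_modified <= window


def largest_gap(update_times: list[datetime]) -> timedelta:
    """Return the longest interval between consecutive updates."""
    ordered = sorted(update_times)
    if len(ordered) < 2:
        raise ValueError("need at least two update timestamps")
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def fetch_last_modified(table_id: str) -> datetime:
    """Read the table's last-modified time through the BigQuery API.

    Assumes google-cloud-bigquery is installed and credentials are configured.
    table_id is a "project.dataset.table" string.
    """
    # Imported lazily so the pure helpers above run without the package.
    from google.cloud import bigquery

    client = bigquery.Client()
    modified = client.get_table(table_id).modified
    if modified is None:
        raise RuntimeError(f"no modified timestamp returned for {table_id}")
    return modified


if __name__ == "__main__":
    # Illustrative timestamps only: a table that updates every 3 days.
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    updates = [datetime(2024, 1, d, tzinfo=timezone.utc) for d in (1, 4, 7)]
    print("largest gap:", largest_gap(updates))
    print("passes a 1-day window:", is_fresh(updates[-1], timedelta(days=1), now))
    print("passes a 3-day window:", is_fresh(updates[-1], timedelta(days=3), now))
```

If the largest observed gap is longer than the configured window, the assertion is failing because of its configuration rather than because of missing data.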
Additional Notes
AI freshness assertions require a minimum number of operation events spanning at least 7 days for training. Recreating assertions resets the grace period but doesn't solve underlying missing operation data. The Platform API approach provides immediate functionality while operation event ingestion issues are resolved.
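To judge whether a dataset has enough history before recreating an assertion, the training requirement can be checked against the operation timestamps shown in the Timeline tab. This is a rough sketch under stated assumptions: the 7-day span comes from the note above, but MIN_EVENTS is a placeholder (this article does not state the actual minimum), and training_gaps is a hypothetical helper, not a DataHub API.

```python
from datetime import datetime, timedelta, timezone

MIN_SPAN = timedelta(days=7)  # minimum span stated in the note above
MIN_EVENTS = 10               # placeholder; the real minimum is not documented here


def training_gaps(
    event_times: list[datetime],
    min_span: timedelta = MIN_SPAN,
    min_events: int = MIN_EVENTS,
) -> list[str]:
    """Return the reasons the operation history is too thin to train on."""
    problems = []
    if len(event_times) < min_events:
        problems.append(
            f"only {len(event_times)} operation events, need {min_events}"
        )
    span = max(event_times) - min(event_times) if event_times else timedelta(0)
    if span < min_span:
        problems.append(f"events span {span.days} days, need {min_span.days}")
    return problems


if __name__ == "__main__":
    # Illustrative data: one operation per day for three days.
    base = datetime(2024, 1, 1, tzinfo=timezone.utc)
    sparse = [base + timedelta(days=d) for d in range(3)]
    for problem in training_gaps(sparse):
        print(problem)
```

An empty result means the history meets both thresholds. Any reported gap points to operation event ingestion, which recreating the assertion will not fix.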
Tags: freshness-assertions, ai-training, bigquery, information-schema, operation-events, platform-api, permissions, smart-assertions, table-lineage, usage-statistics