Area: Ingestion Issues
Sub-Area: Scheduled Ingestion Execution
Issue
Scheduled ingestion sources in DataHub Cloud fail to execute on their configured schedule, appear as "aborted" in the execution history, or become invisible in the UI despite the underlying ingestion process potentially running successfully. This can affect multiple types of sources including BigQuery, Snowflake, and other data platforms.
Error Messages
- exit code 247 (SIGKILL/OOM)
- Invalid update to immutable state for aspect dataHubExecutionRequestResult
- Failed to check --report-to support: Command timed out after 10 seconds
- Unknown subquery scope or CooperativeTimeout deadline exceeded
You Might Be Asking
- Why is my scheduled ingestion not running even though it shows as configured?
- Why do my ingestion runs show as "aborted" with no clear error message?
- Why can't I see recent ingestion runs in the UI execution history?
- Why did my ingestion work before but fail after a DataHub upgrade?
Solution
- Check execution history and logs
  1. Navigate to DataHub UI → Ingestion → [Your Source] → Run History
  2. Click into the most recent failed/aborted run
  3. Review the execution logs for specific error messages
- Identify the failure pattern
  - Memory issues (OOM): Look for "exit code 247" or memory-related errors
  - Timeout issues: Look for "timeout" or "Command timed out" messages
  - Missing runs: Check if runs are actually executing but not visible in UI
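The log triage described above can be sketched as a small helper. This is a minimal illustration, not part of DataHub: the function name `classify_failure` and the pattern table are hypothetical, and `MemoryError` is only an assumed example of a "memory-related error". Missing runs cannot be detected from a log excerpt, so they are not covered here.

```python
import re

# Hypothetical category -> pattern table. "exit code 247", "timeout" and
# "Command timed out" come from this article; "MemoryError" is an assumed
# example of a memory-related error and may not match your logs.
FAILURE_PATTERNS = {
    "memory": [r"exit code 247", r"MemoryError"],
    "timeout": [r"timed out", r"timeout"],
}


def classify_failure(log_text: str) -> str:
    """Return the first category whose pattern appears in the log text."""
    # Dicts preserve insertion order, so memory patterns are checked first.
    for category, patterns in FAILURE_PATTERNS.items():
        if any(re.search(p, log_text, re.IGNORECASE) for p in patterns):
            return category
    return "unknown"
```

Pasting the final lines of an aborted run's log into `classify_failure` gives a first guess at which fix to try. An "unknown" result means the log needs a manual read or a support ticket.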
- Apply appropriate fix based on issue type
  For Memory/OOM Issues:
  1. Go to your ingestion source configuration
  2. Click "Advanced Settings"
  3. Change CLI Version from 1.5.0.8 to 1.5.0.10 or later
  4. Save and manually test the ingestion
  5. If still failing, try CLI version 1.4.0.9 as a temporary workaround
  For Timeout/Executor Issues:
  1. Manually trigger the ingestion source to test connectivity
  2. If manual run succeeds, the scheduler may need to be reset
  3. Contact DataHub Support to investigate executor coordinator health
  4. Check if other sources on the same executor pool are affected
  For UI Visibility Issues:
  1. Check if ingestion is actually running by looking at:
     - Dataset freshness/last updated timestamps
     - New metadata appearing in DataHub
     - S3 logs (for DataHub Cloud customers)
  2. If metadata is flowing but runs aren't visible, this indicates an ES indexing issue
  3. Contact support to investigate execution request entity indexing
- Prevent recurrence

      # For large environments, consider staggering ingestion schedules
      # Example: Instead of all sources running at midnight
      Source A: 00:00 (midnight)
      Source B: 02:00 (2 AM)
      Source C: 04:00 (4 AM)

      # For memory-sensitive sources, disable heavy features if needed
      profiling:
        enabled: false  # Temporarily disable if causing OOM
      include_table_lineage: false  # Reduce memory usage
      query_log_duration: 7  # Reduce from default 30 days
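The staggering idea in the example above can be sketched as a helper that assigns each source its own daily start hour. This is an illustration only: `stagger_schedules` and its parameters are hypothetical names, and it assumes schedules are written as standard five-field cron expressions.

```python
def stagger_schedules(sources, start_hour=0, gap_hours=2):
    """Map each source name to a daily cron expression, spaced gap_hours apart.

    Hypothetical helper: hours wrap around after 23, so with many sources
    or a large gap two sources can land on the same hour again.
    """
    schedules = {}
    for index, name in enumerate(sources):
        hour = (start_hour + index * gap_hours) % 24
        # Five-field cron: minute hour day-of-month month day-of-week
        schedules[name] = f"0 {hour} * * *"
    return schedules
```

With the three sources from the example and the default two-hour gap, this produces start times of 00:00, 02:00, and 04:00. Pick a gap longer than the typical run duration of your largest source so that runs do not overlap.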
- Monitor post-fix
  1. Wait for the next scheduled execution
  2. Verify the run completes successfully
  3. Check that new runs appear correctly in the UI
  4. Monitor memory usage if previous issue was OOM-related
Additional Notes
- DataHub Cloud customers do not manage executor infrastructure directly; all scaling, restarts, and resource allocation are handled by DataHub Support.
- Memory regression issues were introduced in CLI version 1.5.0.8 due to sqlglot library upgrades, specifically affecting large Snowflake environments with lineage enabled. The issue was resolved in CLI 1.5.0.10.
- For timeout issues, a 10-second timeout on CLI feature detection can cause fallback to legacy execution paths that overwrite execution request metadata.
- Race conditions between execution heartbeats and result validation can cause UI visibility issues without affecting actual data ingestion.
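When auditing many sources for the memory regression, CLI versions need to be compared numerically, because a plain string comparison would sort "1.5.0.10" before "1.5.0.8". The sketch below is a hypothetical helper, not a DataHub API. It assumes every version from 1.5.0.8 up to, but not including, 1.5.0.10 is affected, which the notes above imply but do not state for intermediate releases.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '1.5.0.10' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Bounds taken from the notes above; the half-open range is an assumption.
AFFECTED_FROM = parse_version("1.5.0.8")
FIXED_IN = parse_version("1.5.0.10")


def has_memory_regression(cli_version: str) -> bool:
    """True if the version falls in the assumed affected range."""
    return AFFECTED_FROM <= parse_version(cli_version) < FIXED_IN
```

Under these assumptions, 1.5.0.8 is flagged, while 1.5.0.10 and the 1.4.0.9 fallback are not.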
Related Documentation
Tags: scheduled-ingestion, executor, memory, oom, timeout, cli-version, sqlglot, bigquery, snowflake, aborted-runs