Area: Observability Issues
Sub-Area: Data Profiling and Smart Assertions
Issue
The Statistics tab appears greyed out on dataset pages in DataHub, preventing users from viewing profiling data such as row counts, null counts, min/max values, and other table statistics. This commonly occurs when profiling is not enabled in ingestion recipes or when Smart Data Quality assertions are not activated in the DataHub environment.
You Might Be Asking
- Why is my Stats tab greyed out and disabled?
- How do I enable data profiling for my datasets?
- What configuration is needed to see statistics in DataHub?
- How do I enable Smart Data Quality assertions?
Solution
The Stats tab is greyed out by design when there is no profiling data available. To enable it, you need to configure profiling in your ingestion recipes:
- Add profiling configuration to your ingestion recipe:

      source:
        type: databricks  # or your source type
        config:
          workspace_url: https://.azuredatabricks.net/
          # ... other config ...
          profiling:
            enabled: true
            method: "ge"  # or "analyze" for Delta tables only
            warehouse_id: " "

- For other data sources, the profiling configuration varies:

      # Snowflake example
      profiling:
        enabled: true

      # BigQuery example
      profiling:
        enabled: true
        profile_table_level_only: false

- Save the updated recipe and re-run ingestion to collect profiling data for your datasets.
- Verify the Stats tab becomes active after successful profiling ingestion.
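To show where the profiling block sits in a complete recipe, here is a minimal sketch for a Snowflake source. The file name, account, warehouse, environment variable names, and server URL are all placeholders assumed for illustration. Check the exact fields against the documentation for your source and DataHub version.

```yaml
# recipe.yaml: hypothetical file name; all connection values are placeholders
source:
  type: snowflake
  config:
    account_id: "my_account"          # placeholder
    username: "${SNOWFLAKE_USER}"     # read from environment variables
    password: "${SNOWFLAKE_PASSWORD}"
    warehouse: "COMPUTE_WH"           # placeholder
    profiling:
      enabled: true                   # without this, no profiling data is collected

sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"   # placeholder: your DataHub endpoint
```

Assuming the DataHub CLI is installed, a file like this is typically run with `datahub ingest -c recipe.yaml`. UI-managed ingestion sources are re-run from the Ingestion page instead.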
For Smart Data Quality Assertions (DataHub Cloud)
If you need Smart DQ assertions enabled, contact DataHub Support to activate the feature flag in your environment:
- Smart Assertions are a Cloud-only feature that requires backend configuration
- Once enabled, you can create freshness, volume, schema, and field assertions via the UI
- Navigate to a dataset → Validations tab → Create Assertion
Common Issues and Troubleshooting
- Missing method field: Ensure your profiling block includes a valid method (e.g., "ge" or "analyze")
- Wrong ingestion source: Verify you're updating the correct recipe if you have multiple sources
- Permissions: Ensure your service principal has SELECT access to tables you want to profile
- Large datasets: Use profile_pattern to limit which tables get profiled to manage run times
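As an illustration of the last point, the sketch below scopes profiling with allow/deny regular expressions on a Snowflake source. The database, schema, and table names are hypothetical. The name and placement of the pattern field can differ between sources (Databricks, for example, may expect it inside the profiling block), so treat this as an assumption to verify against your source's configuration reference.

```yaml
source:
  type: snowflake
  config:
    # ... connection settings ...
    profiling:
      enabled: true
    profile_pattern:
      allow:
        - 'ANALYTICS\.PUBLIC\..*'   # hypothetical: profile only tables in this schema
      deny:
        - '.*\.TMP_.*'              # hypothetical: skip temporary tables
```

Patterns are regular expressions matched against the qualified table name. Tables excluded here are still ingested as datasets; they just receive no profile, so their Stats tab may stay greyed out.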
Additional Notes
Profiling runs additional queries against your data sources, which increases ingestion run time and compute usage. For environments with many tables, consider scoping profiling to critical datasets using allow/deny patterns. The Stats tab becomes active when a dataset has at least one of the following: table profiling data, partition profiling data, usage statistics, or operations history.
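If run time or compute cost is the main concern, a lighter profiling configuration is another option alongside pattern scoping. The sketch below assumes a source that uses DataHub's standard SQL profiler. Whether these options are available depends on the source and DataHub version.

```yaml
profiling:
  enabled: true
  # Collect table-level statistics only (such as row and column counts)
  # and skip column-level statistics
  profile_table_level_only: true
  # Alternatively, keep column-level statistics but skip the costlier metrics:
  # turn_off_expensive_profiling_metrics: true
```

The trade-off is that table-level-only profiling still activates the Stats tab, but column statistics such as null counts and min/max values will not appear.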
Related Documentation
- Dataset Usage & Query History
- Smart Assertions
- Data Health Dashboard
- Databricks Source Configuration
Tags: profiling, statistics, stats-tab, greyed-out, databricks, ingestion, smart-assertions, data-quality, observability, configuration