Area: Product Issues
Sub-Area: Analytics Dashboard
Issue
Users frequently encounter confusion about the terminology and counting methods used in DataHub's Analytics dashboard charts. The "Assets by Platform" chart may show inflated numbers compared to search results, and there's uncertainty about the difference between "Assets" and "Entities" in the various analytics visualizations.
You Might Be Asking
- What's the difference between "Assets" and "Entities" in the analytics charts?
- Why does the "Assets by Platform" chart show different numbers than search results?
- Why does one platform show significantly more assets than another platform?
- What types of objects are included in these counts?
Solution
Understanding the Terminology:
In DataHub's Analytics dashboard, "Entity" and "Asset" refer to the same thing and are used interchangeably. Both terms count first-class metadata objects in your DataHub instance, including:
- Datasets (tables, views, materialized views)
- Dashboards
- Charts
- Data Jobs and Data Flows
- ML Models
- Containers (databases, schemas)
- Other metadata types
Chart Behavior:
- The "Entities by Type" chart breaks down counts by entity type (e.g., "4,000 Datasets, 200 Dashboards")
- The "Assets by Platform" chart aggregates all entity types per platform into a single count
- Both charts exclude soft-deleted entities and query entities from the count
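The counting rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not DataHub's implementation: the `Entity` record, the type strings such as "dataset" and "query", and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a metadata object; not a DataHub class."""
    entity_type: str
    platform: str
    soft_deleted: bool = False


def is_countable(entity: Entity) -> bool:
    # Both charts skip soft-deleted entities and query entities.
    return not entity.soft_deleted and entity.entity_type != "query"


def entities_by_type(entities):
    # "Entities by Type": one count per entity type.
    return Counter(e.entity_type for e in entities if is_countable(e))


def assets_by_platform(entities):
    # "Assets by Platform": all entity types rolled up into one count per platform.
    return Counter(e.platform for e in entities if is_countable(e))


# Made-up sample data for illustration only.
SAMPLE = [
    Entity("dataset", "snowflake"),
    Entity("dataset", "snowflake"),
    Entity("container", "snowflake"),
    Entity("query", "snowflake"),                       # excluded: query entity
    Entity("dataset", "snowflake", soft_deleted=True),  # excluded: soft-deleted
    Entity("dataset", "dremio"),
]

if __name__ == "__main__":
    print(dict(entities_by_type(SAMPLE)))
    print(dict(assets_by_platform(SAMPLE)))
```

With this sample, the per-type view reports 3 datasets and 1 container, while the per-platform view reports 3 assets for snowflake and 1 for dremio. The query entity and the soft-deleted dataset appear in neither count.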
Troubleshooting Count Discrepancies:
- If you notice discrepancies between the analytics chart and search results, verify your search by using the platform filter, for example:
  platform:snowflake
  platform:dremio
- Different platforms showing vastly different asset counts is normal and reflects:
  - Actual differences in platform footprint size
  - Different ingestion scope and configuration
  - What metadata types are available from each platform
- If the analytics chart shows significantly higher numbers than search results, this may indicate a version-related issue where query entities are incorrectly included in the count
Known Issue Resolution:
In DataHub versions prior to v0.3.16.5, there was a bug where the "Assets by Platform" chart incorrectly included query entities (internal tracking records for SQL queries) in the asset count. This caused platforms with heavy query activity to show inflated numbers. This issue was resolved in:
- DataHub v0.3.16.5
- DataHub v0.3.17 and later
If you're experiencing count discrepancies, check your DataHub version and, if it is earlier than v0.3.16.5, upgrade to a release that includes the fix.
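As a rough aid for the version check, the sketch below compares a version string against the first fixed release named above. It assumes versions are plain dotted numbers with an optional leading "v"; the function names are hypothetical, and how you obtain your instance's version string is left to you.

```python
import re

# First release named in this article as containing the fix.
FIRST_FIXED = (0, 3, 16, 5)


def parse_version(version: str) -> tuple:
    """Turn 'v0.3.16.5' or '0.3.17' into a tuple of ints; any suffix is ignored."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(version: str) -> bool:
    """True if the version predates the fix for the inflated asset count."""
    # Tuple comparison is element-wise, so (0, 3, 17) > (0, 3, 16, 5)
    # and (0, 3, 16) < (0, 3, 16, 5).
    return parse_version(version) < FIRST_FIXED


if __name__ == "__main__":
    for v in ("v0.3.16.4", "v0.3.16.5", "v0.3.17"):
        print(v, "affected" if is_affected(v) else "includes the fix")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "0.3.9" as newer than "0.3.16".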
Additional Notes
The analytics charts summarize the composition of your data landscape. Large differences between platforms are usually expected and reflect real differences in footprint and ingestion rather than bugs. If you suspect a problem in your deployment, compare the chart numbers with search results filtered by platform. In versions that include the fix (v0.3.16.5, and v0.3.17 and later), query entities are excluded from the counts.
Tags: analytics, dashboard, metrics, entities, assets, platform, counts, query-entities, troubleshooting, ui