Area: Ingestion Issues
Sub-Area: Sibling Entity Configuration
Issue
When ingesting both dbt models and warehouse tables (Snowflake, Databricks, etc.) into DataHub, the related assets appear as separate entities in search results instead of being combined as sibling entities with a "Composed of" section. This typically occurs when the URNs generated by different ingestion sources don't match exactly, preventing DataHub from recognizing them as the same logical asset.
You Might Be Asking
- Why do I see duplicate entries for the same table when searching in DataHub?
- How can I make dbt models and warehouse tables appear as one combined entity?
- What happened to the "Composed of" section that used to show both components?
- Why did my assets suddenly start appearing separately after they were previously combined?
Solution
The solution depends on your specific warehouse platform and configuration:
For Snowflake + dbt:
- Check if there are URN casing mismatches between your dbt and Snowflake ingestion sources
- If your dbt ingestion recently changed to lowercase URNs, add this to your dbt recipe to maintain the original casing:

      source:
        type: dbt
        config:
          convert_urns_to_lowercase: false

- Re-run your dbt ingestion to restore the sibling relationships
- Verify that both ingestion sources use consistent `env` and `platform_instance` values
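
To illustrate why casing matters, here is a minimal sketch of how a dataset URN is laid out. The `dataset_urn` helper is hand-rolled for illustration and is not the DataHub library function, and the table name is hypothetical. It assumes the warehouse source kept the original uppercase name while the dbt source lowercased the name it links to.

```python
from typing import Optional


def dataset_urn(platform: str, name: str, env: str = "PROD",
                platform_instance: Optional[str] = None) -> str:
    """Hand-rolled sketch of the dataset URN layout (not the DataHub library)."""
    # When a platform instance is set, it is prefixed to the dataset name.
    full_name = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{full_name},{env})"


# URN emitted by the warehouse source (hypothetical table, original casing kept).
warehouse_urn = dataset_urn("snowflake", "ANALYTICS.PUBLIC.ORDERS")
# URN the dbt source would link to as its sibling after lowercasing the name.
dbt_target_urn = dataset_urn("snowflake", "ANALYTICS.PUBLIC.ORDERS".lower())

print(warehouse_urn)
print(dbt_target_urn)
print(warehouse_urn == dbt_target_urn)  # False: the two are treated as different assets
```

Because the comparison is an exact string match, the two URNs above refer to different entities even though they describe the same table.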
For Databricks/Unity Catalog + dbt:
- Ensure the `platform_instance` values match between your dbt and Databricks ingestion recipes:

      # In dbt recipe
      source:
        type: dbt
        config:
          target_platform: databricks
          target_platform_instance:

      # In Databricks/Unity Catalog recipe
      source:
        type: unity-catalog
        config:
          platform_instance:

- Verify that the database field in your dbt models matches the Databricks catalog name exactly
- Set consistent `env` values across both recipes (e.g., both set to "PROD")
- If using Unity Catalog, ensure `include_metastore` is set to `false` or omitted
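
The checks above can be sketched as a small comparison of the two recipes' `config` sections, loaded here as plain Python dictionaries. The `recipe_mismatches` helper and the instance names are hypothetical, and the sketch assumes that a recipe which omits `env` falls back to "PROD".

```python
from typing import Dict, List

DEFAULT_ENV = "PROD"  # assumption: the value used when a recipe does not set env


def recipe_mismatches(dbt_config: Dict[str, str],
                      warehouse_config: Dict[str, str]) -> List[str]:
    """Return human-readable differences that would produce diverging URNs."""
    problems: List[str] = []

    # dbt names the warehouse instance via target_platform_instance;
    # the warehouse recipe names itself via platform_instance.
    dbt_instance = dbt_config.get("target_platform_instance")
    wh_instance = warehouse_config.get("platform_instance")
    if dbt_instance != wh_instance:
        problems.append(
            f"platform instance differs: dbt={dbt_instance!r}, warehouse={wh_instance!r}"
        )

    dbt_env = dbt_config.get("env", DEFAULT_ENV)
    wh_env = warehouse_config.get("env", DEFAULT_ENV)
    if dbt_env != wh_env:
        problems.append(f"env differs: dbt={dbt_env!r}, warehouse={wh_env!r}")

    return problems


# Hypothetical config sections; the instance names are placeholders.
dbt_config = {"target_platform": "databricks", "target_platform_instance": "workspace_a"}
unity_config = {"platform_instance": "workspace_b", "env": "PROD"}

for problem in recipe_mismatches(dbt_config, unity_config):
    print(problem)
```

An empty result means the two recipes agree on the values compared here. It does not cover name casing or the catalog name check.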
General troubleshooting steps:
- Run your warehouse ingestion first, then your dbt ingestion
- Check the "Raw" view on affected entities to verify if sibling aspects exist
- If you have duplicate entities from previous misconfigurations, use the delete CLI to clean them up:

      datahub delete --by-filter --platform

- Re-run both ingestion sources after configuration changes
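
When inspecting the "Raw" view, the check amounts to looking for a `siblings` aspect that lists the other entity's URN. The sketch below assumes you have copied the raw aspects into a dictionary keyed by aspect name. That shape, the `sibling_status` helper, and the example URN are illustrative assumptions, not a DataHub API.

```python
from typing import Any, Dict


def sibling_status(raw_aspects: Dict[str, Any], expected_sibling_urn: str) -> str:
    """Classify an entity as 'missing', 'linked', or 'mismatched'."""
    aspect = raw_aspects.get("siblings")
    if aspect is None:
        # No sibling aspect at all: the link was never written for this entity.
        return "missing"
    if expected_sibling_urn in aspect.get("siblings", []):
        return "linked"
    # A sibling aspect exists but points at a different URN (for example, other casing).
    return "mismatched"


# Hypothetical aspects copied from a dbt entity.
raw_aspects = {
    "siblings": {
        "siblings": [
            "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.public.orders,PROD)"
        ],
        "primary": True,
    }
}

expected = "urn:li:dataset:(urn:li:dataPlatform:snowflake,ANALYTICS.PUBLIC.ORDERS,PROD)"
print(sibling_status(raw_aspects, expected))  # mismatched
```

A "mismatched" result usually points back to the casing, `platform_instance`, or `env` differences described above.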
Alternative approach - Filtering warehouse assets:
If you prefer to show only dbt assets and exclude warehouse-generated duplicates:
- Query the dbt API to get a list of tables managed by dbt
- Use `table_pattern` or `schema_pattern` in your warehouse ingestion recipe to exclude dbt-managed tables
- Configure your ingestion to filter out assets that exist in dbt
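
As a rough sketch of the filtering step, the snippet below turns a list of dbt-managed table names into anchored regular expressions suitable for a `deny` list under `table_pattern`. The `deny_patterns` helper and the table names are hypothetical. It also assumes the warehouse source matches patterns against a `database.schema.table` name, which you should confirm for your platform.

```python
import re
from typing import Dict, List


def deny_patterns(dbt_tables: List[str]) -> Dict[str, List[str]]:
    """Build a table_pattern-style mapping with one anchored regex per table."""
    # re.escape keeps the dots literal; ^ and $ prevent partial-name matches.
    return {"deny": [f"^{re.escape(name)}$" for name in sorted(set(dbt_tables))]}


# Hypothetical list, e.g. collected from your dbt project metadata.
managed_by_dbt = ["analytics.public.orders", "analytics.public.customers"]
table_pattern = deny_patterns(managed_by_dbt)

for pattern in table_pattern["deny"]:
    print(pattern)
```

Anchoring each pattern keeps a table such as `analytics.public.customers_raw` from being excluded just because it shares a prefix with a dbt-managed table.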
Additional Notes
DataHub is designed to show sibling relationships between dbt models and warehouse tables by default. When properly configured, you should see a single search result with both platform logos and the ability to toggle between the dbt view and warehouse view of the same asset. Assets that appear separately usually indicate a configuration mismatch rather than intended behavior. URN alignment is critical: even small differences in casing, platform instance, or environment values will cause assets to appear separately.
Related Documentation
Tags: dbt, snowflake, databricks, unity-catalog, sibling-entities, duplicate-assets, ingestion-configuration, platform-instance, urn-alignment, warehouse-integration