Area: Deployment Issues
Sub-Area: Authentication and Authorization Configuration
Issue
Users enabling authentication and authorization in self-hosted DataHub deployments may find that the expected environment variables are missing from their pods, even after applying configuration changes. This commonly occurs when METADATA_SERVICE_AUTH_ENABLED, VIEW_AUTHORIZATION_ENABLED, and AUTH_POLICIES_ENABLED are set as direct environment variables rather than through the proper Helm chart value paths.
You Might Be Asking
- Why don't VIEW_AUTHORIZATION_ENABLED and AUTH_POLICIES_ENABLED appear in my pod environment variables?
- How do I properly enable authentication and authorization in DataHub?
- What's the difference between setting environment variables directly vs. using Helm values?
- How can I verify that my authentication and authorization settings are working correctly?
Solution
The DataHub Helm chart does not read arbitrary top-level environment variable maps on deployments. Instead, authentication and authorization settings must be configured through specific Helm value paths:
- Configure authentication through the proper Helm values:

      global:
        datahub:
          metadata_service_authentication:
            enabled: true
          view:
            authorization:
              enabled: true

- Apply the configuration changes:

      helm upgrade datahub datahub/datahub -f values.yaml

- Restart the deployments to pick up the new configuration:

      kubectl rollout restart deployment/datahub-datahub-gms -n <namespace>
      kubectl rollout restart deployment/datahub-datahub-frontend -n <namespace>

- Verify the environment variables are set correctly:

      kubectl exec -it deploy/datahub-datahub-gms -n <namespace> -- env | grep -E "(METADATA_SERVICE_AUTH_ENABLED|VIEW_AUTHORIZATION_ENABLED)"

- Test authentication functionality:

      # Test unauthenticated request (should return 401)
      curl -s -o /dev/null -w "%{http_code}" \
        "http://<gms-host>:8080/openapi/v2/entity/dataset/<url-encoded-dataset-urn>/datasetProperties"

      # Test authenticated request (should return 200)
      curl -s -o /dev/null -w "%{http_code}" \
        -H "Authorization: Bearer <token>" \
        "http://<gms-host>:8080/openapi/v2/entity/dataset/<url-encoded-dataset-urn>/datasetProperties"

- Test authorization policies:

      # Test unprivileged user attempting metadata modification (should return 403)
      curl -s -o /dev/null -w "%{http_code}" \
        -X POST "http://<gms-host>:8080/aspects?action=ingestProposal" \
        -H "Authorization: Bearer <unprivileged-user-token>" \
        -H "X-RestLi-Protocol-Version: 2.0.0" \
        -H "Content-Type: application/json" \
        -d '{
          "proposal": {
            "entityUrn": "<dataset-urn>",
            "entityType": "dataset",
            "aspectName": "globalTags",
            "changeType": "UPSERT",
            "aspect": {
              "contentType": "application/json",
              "value": "{\"tags\":[{\"tag\":\"urn:li:tag:TestTag\"}]}"
            }
          }
        }'
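The authentication checks above can be wrapped in a small script so that the expected status codes are compared automatically. The following is a minimal sketch and not part of DataHub: the function names (check_status, http_code, run_auth_checks) and the variables GMS_URL, TOKEN, and DATASET_URN are assumptions introduced for illustration and must be set for your own deployment.

```shell
#!/usr/bin/env bash
# Sketch only. Hypothetical variables you must provide:
#   GMS_URL      base URL of GMS, e.g. the host and port used in the steps above
#   TOKEN        a valid access token
#   DATASET_URN  a URL-encoded dataset URN

# Compare an observed HTTP status code with the expected one.
# Prints PASS or FAIL and returns non-zero on a mismatch.
check_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "PASS ${label}: got ${actual}"
  else
    echo "FAIL ${label}: expected ${expected}, got ${actual}"
    return 1
  fi
}

# Print only the HTTP status code of a request; all arguments go to curl.
http_code() {
  curl -s -o /dev/null -w "%{http_code}" "$@"
}

# Run the unauthenticated (401) and authenticated (200) read checks.
# Returns the number of failed checks.
run_auth_checks() {
  local url="${GMS_URL}/openapi/v2/entity/dataset/${DATASET_URN}/datasetProperties"
  local failures=0
  check_status "unauthenticated read" 401 "$(http_code "$url")" \
    || failures=$((failures + 1))
  check_status "authenticated read" 200 \
    "$(http_code -H "Authorization: Bearer ${TOKEN}" "$url")" \
    || failures=$((failures + 1))
  return "$failures"
}
```

After exporting the three variables, calling run_auth_checks prints one PASS or FAIL line per check. The block only defines functions, so sourcing it does not send any requests.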
Additional Notes
- AUTH_POLICIES_ENABLED defaults to true in the DataHub application configuration and may not appear in pod environment variables even when active. The variable is controlled by application.yaml defaults rather than explicit environment injection.
- METADATA_SERVICE_AUTH_ENABLED must be consistent across all DataHub services, including GMS, frontend, and all metadata consumer pods (MAE, MCE, PE).
- VIEW_AUTHORIZATION_ENABLED and AUTH_POLICIES_ENABLED only affect the GMS service.
- When testing view authorization, remember that DataHub ships with a default "All Users - View Entity Page" policy that grants broad access and may need to be temporarily deactivated during testing.
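One way to check that METADATA_SERVICE_AUTH_ENABLED is consistent across services is to print its value for each deployment side by side. The following is a rough sketch under stated assumptions: env_value and report_auth_env are hypothetical helper names, NAMESPACE is a hypothetical variable, and the deployment names are passed in by the caller because consumer deployment names vary between installations.

```shell
#!/usr/bin/env bash
# Sketch only. NAMESPACE and the helper names below are assumptions.

# Read `env`-style output on stdin and print the value of one variable,
# or "<unset>" if the variable is not present.
env_value() {
  local name="$1" line
  line="$(grep -E "^${name}=" | head -n 1 || true)"
  if [ -n "$line" ]; then
    echo "${line#*=}"
  else
    echo "<unset>"
  fi
}

# Print "<deployment><TAB><value>" for each deployment given after the
# variable name, using kubectl exec to read the container environment.
report_auth_env() {
  local var="$1"
  shift
  local deploy
  for deploy in "$@"; do
    printf '%s\t%s\n' "$deploy" \
      "$(kubectl exec "deploy/${deploy}" -n "${NAMESPACE}" -- env | env_value "$var")"
  done
}
```

For example, with NAMESPACE set, running report_auth_env METADATA_SERVICE_AUTH_ENABLED datahub-datahub-gms datahub-datahub-frontend (plus any consumer deployments you run) should show the same value on every line. Seeing "<unset>" for AUTH_POLICIES_ENABLED is expected when the application.yaml default is in effect.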
Tags: authentication, authorization, deployment, helm, kubernetes, environment-variables, security, configuration, self-hosted