Your Numbers Don't Match My Numbers: Where Does That Data Even Come From? (Intro to DataHub)
Plaid Analytics presented an interactive exploration of DataHub (https://datahubproject.io), an open source metadata platform that supports data governance by allowing you to visualize data lineage between different systems, maintain data and object definitions and descriptions, and manage who has access to this information.
This video is an overall introduction to the tool, focused on higher education use cases. An accompanying workshop is also available; if you're interested, contact us via https://plaid.is to learn more.
This session was conducted as part of the California Association for Institutional Research Data Talks series (https://cair.org/cair-data-talks/).
We covered topics including:
- What is DataHub and how can it help higher education?
- Connecting DataHub with a student information system-like database and a data warehouse, and ingesting metadata about tables, columns, and relationships
- Ingesting metadata about FME workflows (FME is a data integration tool, used here to move data from the student information system into the data warehouse)
- Ingesting metadata about Tableau workbooks, dashboards, worksheets, data sources, calculated fields, relationships, etc.
- Demonstrating lineage between Tableau, the data warehouse, FME workflows (data integration tools), and a student information system-like database
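For readers new to the idea of lineage, here is a minimal sketch in plain Python of what "tracing a dashboard back to its source" means. It does not use the DataHub API, and every asset name in it is a hypothetical placeholder rather than something taken from the session; it only assumes the same chain described above (Tableau, data warehouse, FME workflow, student information system).

```python
# Hypothetical asset names for illustration only; none come from DataHub itself.
# Each key maps an asset to the assets it reads from directly (its upstreams).
UPSTREAM = {
    "tableau:enrollment_dashboard": ["warehouse:fact_enrollment"],
    "warehouse:fact_enrollment": ["fme:load_enrollment"],
    "fme:load_enrollment": ["sis:student_registration"],
    "sis:student_registration": [],
}


def trace_upstream(asset, graph):
    """Return every asset upstream of `asset`, nearest first (breadth-first)."""
    seen = []
    queue = list(graph.get(asset, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # skip assets already visited via another path
        seen.append(current)
        queue.extend(graph.get(current, []))
    return seen


if __name__ == "__main__":
    for step in trace_upstream("tableau:enrollment_dashboard", UPSTREAM):
        print(step)
```

When two reports disagree, walking this chain for each one is a quick way to see whether they really draw on the same source tables.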
Plaid University - Fall Enrollment Summary
- Andrew Drinkwater
- Patrick Lougheed