A Gala of MDM & Data Governance Use Cases: Building Responsible AI without Reckless Data – Part 3

Checklist for Generative AI Product Managers

I hope you found the previous parts, covering MDM and Data Catalog use cases, relatable. Today, instead of a single use case, we are bringing you a checklist. It is built on our collective experience working on generative AI projects, and here it is.

Use Case III: A Generative AI Product Manager’s Checklist

A checklist protects any project executed under pressure, and generative AI is new to everyone, so implementing a project without formal grounding is especially risky. Reliable data is crucial to the success of these AI projects: data is the oil for these engines, and bad oil will corrupt them. Product managers, program managers, and data managers alike must treat data readiness as a prerequisite for building these systems.

Here is a comprehensive checklist of data and AI governance best practices for assessing data readiness before building generative AI systems. It can also serve as an audit tool.

Data & AI Governance Best Practices Checklist: Data Readiness for Building Generative AI Systems

| # | Checklist Item | Score (1 to 5) | Comments |
|---|---|---|---|
| 1 | Data Quality & Management | | |
| 1.1 | Master Data Management is in place | | |
| 1.2 | Data quality rules are set | | |
| 1.3 | Data cataloging is present | | |
| 1.4 | Lineage is tracked | | |
| 1.5 | Data cleanup processes are active | | |
| 2 | Model Governance | | |
| 2.1 | All models are owned by a steward | | |
| 2.2 | Model inputs and outputs are logged | | |
| 2.3 | Model performance metrics are reported | | |
| 2.4 | Change management processes are operational | | |
| 2.5 | Models are versioned and archived with older training data | | |
| 3 | Responsible & Ethical AI Use | | |
| 3.1 | Bias assessment is part of AI model operational management | | |
| 3.2 | Model datasheets are published | | |
| 3.3 | Models are tested for fairness across diverse user groups | | |
| 3.4 | Human testers are hired for sensitive outputs | | |
| 3.5 | An ethics review board exists and evaluates critical use cases | | |
| 4 | Data Privacy & Security | | |
| 4.1 | Personally identifiable information (PII) is anonymized or removed | | |
| 4.2 | Encryption is applied for data at rest and in motion | | |
| 4.3 | Access controls follow the principle of least privilege | | |
| 4.4 | Models are secured against risky prompt injections | | |
| 4.5 | Prompt logs are audited | | |
| 5 | Regulatory & Policy Compliance | | |
| 5.1 | AI governance policies align with GDPR, CCPA, or regional AI laws | | |
| 5.2 | Data localization rules are adhered to | | |
| 5.3 | Permissions or rights are verified for training data | | |
| 5.4 | Model use is documented and reviewed for compliance risk | | |
| 5.5 | Third-party models/tools are vetted for legal and ethical compliance | | |
| 6 | Organizational Readiness & Training | | |
| 6.1 | Staff are trained on data stewardship and Gen AI usage policies | | |
| 6.2 | Clear roles and responsibilities are assigned for AI oversight | | |
| 6.3 | Incident response plans are in place for AI-related issues | | |
| 6.4 | Internal communication channels share Gen AI risk updates | | |
| 6.5 | Executive sponsorship backs governance priorities | | |
| 7 | Continuous Improvement | | |
| 7.1 | User feedback loops inform model retraining and updates | | |
| 7.2 | Governance practices are reviewed quarterly or bi-annually | | |
| 7.3 | Lessons from incidents or audits are used to update policies | | |
| 7.4 | Governance metrics are tracked and reported to leadership | | |
| 7.5 | Generative AI model training datasets are cataloged | | |
| | Total Score | | |
| | % of Total Score (out of a maximum of 175) | Total Score / 175 × 100% | |

Please fill in the scores in the table above. If the total score is 90% or more of the maximum score of 175 (35 items × 5 points each), the generative AI project has a solid foundation. If the score is below 90%, we recommend that the enterprise first invest in foundational data and AI governance work.
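For teams that want to automate this audit, here is a minimal Python sketch of the scoring logic described above. It is illustrative only, not a ThirdEye Data tool: the category and item names mirror the table, the scores shown are made-up placeholders, and the 90% threshold comes from the recommendation above.

```python
# Illustrative sketch: score the readiness checklist and apply the 90% rule.
# Scores below are placeholders; fill in your own assessment (1 to 5 per item).

MAX_SCORE_PER_ITEM = 5
READINESS_THRESHOLD = 0.90

# Each category maps its checklist items to a score from 1 to 5.
checklist = {
    "Data Quality & Management": {
        "Master Data Management is in place": 4,
        "Data quality rules are set": 5,
        "Data cataloging is present": 5,
        "Lineage is tracked": 4,
        "Data cleanup processes are active": 5,
    },
    "Model Governance": {
        "All models are owned by a steward": 5,
        "Model inputs and outputs are logged": 5,
        "Model performance metrics are reported": 4,
        "Change management processes are operational": 5,
        "Models are versioned and archived with older training data": 5,
    },
    # ... the remaining five categories (Responsible & Ethical AI Use,
    # Data Privacy & Security, Regulatory & Policy Compliance,
    # Organizational Readiness & Training, Continuous Improvement)
    # follow the same pattern, five items each.
}

def readiness_report(checklist: dict) -> None:
    """Print per-category scores, the overall percentage, and the verdict."""
    total = 0
    max_total = 0
    for category, items in checklist.items():
        category_score = sum(items.values())
        category_max = MAX_SCORE_PER_ITEM * len(items)
        total += category_score
        max_total += category_max
        print(f"{category}: {category_score}/{category_max}")

    percentage = total / max_total
    print(f"Total: {total}/{max_total} ({percentage:.0%})")
    if percentage >= READINESS_THRESHOLD:
        print("Solid foundation: proceed with the generative AI project.")
    else:
        print("Below 90%: invest in foundational data and AI governance first.")

readiness_report(checklist)
```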

Written By:
Aparajeeta Das
Co-Founder & CDO, ThirdEye Data

Want More Details About the Checklist? Feel Free to Connect with Our Data & AI Governance Experts.

Read Related Blogs on Data and AI Governance Use Cases and Beyond

Don't Build Generative AI Applications on Unstable Data.

Validate Your Governance Now.
