Database Connection Best Practices
Use Read-Only Users
Always create dedicated read-only users. Prevents accidental data modification and limits security exposure.
Use Read Replicas
Connect to read replicas instead of production to avoid performance impact, especially for large extractions.
Example: PostgreSQL Read-Only User
-- Create user CREATE USER metadata_reader WITH PASSWORD 'secure_password'; -- Grant minimum permissions GRANT CONNECT ON DATABASE production_db TO metadata_reader; GRANT USAGE ON SCHEMA public TO metadata_reader; GRANT SELECT ON ALL TABLES IN SCHEMA public TO metadata_reader; GRANT SELECT ON information_schema.tables TO metadata_reader;
Data Dictionary Generation
| Phase | Action | Goal |
|---|---|---|
| First Run | Test with 5-10 tables | Understand process & confidence scoring |
| Second Run | Document 20-50 tables (one domain) | Refine review process & standards |
| Production Run | Document entire database | Complete coverage |
Systematic Review Process
Pass 1
PII Verification (Critical)
Filter to show only PII-flagged columns. Verify every detection.
Pass 2
Low Confidence Descriptions
Sort by confidence score. Focus on scores below 70%. Add business context.
Naming Convention Impact
✅ Good Names
- customer_email"Customer email address"
- order_created_at"Order creation timestamp"
❌ Poor Names
- col_miscLow confidence description
- field_37Requires human input
Credit Management
Incremental Documentation
Document only new or changed tables instead of regenerating the entire database every time.
Plan Monthly Usage
Allocate credits for production DBs (monthly), dev DBs (quarterly), and ad-hoc analysis.
Documentation Maintenance
When to Regenerate
- Always: After major schema changes (new tables, removed columns).
- Consider: When new team members join (to ensure context is fresh).
- Schedule: Weekly for active dev, Monthly for stable production.
Version Control
Export JSON and commit to Git to track schema evolution over time.
Security Best Practices
| Practice | Detail |
|---|---|
| Least Privilege | Grant minimum access needed. Review permissions quarterly. |
| Audit Trails | Track who generated documentation and exported data. |
| Minimal Samples | Configure minimal sample rows (3-5) for sensitive tables. |