Workflow: Data Quality Assessment

Assess data quality, completeness, and validity using Exploratory Data Analysis (EDA) before migration or analytics projects.

Steps 1-2: Run EDA

1. Select Tables

Focus on critical tables (Customers, Orders) or tables with known issues.

2. Configure

  • Sample Size: 10,000 rows
  • Correlation Analysis: Enabled
  • Advanced Stats: Enabled

Step 3: Review Metrics

Completeness Analysis

Review the percentage of null values in each column. Example: products.description has 45% nulls, which would be flagged as a Major Issue.
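As a rough illustration of what a completeness metric measures, the sketch below computes the null percentage per column over a list of row dictionaries. The function name null_percentages, the sample rows, and the choice to count empty strings as missing are assumptions made for this example, not the behavior of any particular EDA tool.

```python
from typing import Any, Dict, Iterable, List, Mapping


def null_percentages(
    rows: Iterable[Mapping[str, Any]], columns: List[str]
) -> Dict[str, float]:
    """Return the percentage of missing values for each column.

    Assumption: None and empty strings both count as missing.
    """
    rows = list(rows)
    total = len(rows)
    result: Dict[str, float] = {}
    for col in columns:
        missing = sum(1 for r in rows if r.get(col) is None or r.get(col) == "")
        # Avoid division by zero when the sample is empty.
        result[col] = 100.0 * missing / total if total else 0.0
    return result


# Hypothetical sample of a products table.
products = [
    {"id": 1, "description": "Blue mug"},
    {"id": 2, "description": None},
    {"id": 3, "description": ""},
    {"id": 4, "description": "Red mug"},
]

completeness = null_percentages(products, ["id", "description"])
print(completeness)
```

In this made-up sample, two of the four descriptions are missing, so the description column comes out at 50% nulls while id is fully populated.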

Uniqueness Analysis

Check for duplicate values in ID columns and in columns covered by unique constraints.
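A duplicate check of this kind can be sketched as a simple count of key values. The helper find_duplicates and the users sample below are hypothetical. Null keys are skipped here on the assumption that missing values are handled by the completeness review instead.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping


def find_duplicates(rows: Iterable[Mapping[str, Any]], key: str) -> Dict[Any, int]:
    """Return each key value that appears more than once, with its count."""
    counts = Counter(r[key] for r in rows if r.get(key) is not None)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample of a users table.
users = [
    {"user_id": 101},
    {"user_id": 102},
    {"user_id": 101},
    {"user_id": 103},
    {"user_id": None},
]

duplicates = find_duplicates(users, "user_id")
print(duplicates)
```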

Validity Analysis

Identify format issues (invalid emails), out-of-range values, and constraint violations.
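The two most common validity checks, format and range, might look roughly like the sketch below. The email pattern is deliberately loose (something, an @, something, a dot, something) and is an assumption for illustration. It does not implement full address validation. The column names and the 1 to 1000 quantity bounds are hypothetical.

```python
import re
from typing import Any, Iterable, List, Mapping

# Loose pattern: non-empty local part, "@", and a domain containing a dot.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def invalid_emails(
    rows: Iterable[Mapping[str, Any]], column: str = "email"
) -> List[Mapping[str, Any]]:
    """Return rows whose non-null email value does not match the pattern."""
    return [
        r
        for r in rows
        if r.get(column) is not None
        and not EMAIL_PATTERN.fullmatch(str(r[column]))
    ]


def out_of_range(
    rows: Iterable[Mapping[str, Any]], column: str, low: float, high: float
) -> List[Mapping[str, Any]]:
    """Return rows whose non-null numeric value falls outside [low, high]."""
    return [
        r
        for r in rows
        if r.get(column) is not None and not (low <= r[column] <= high)
    ]


# Hypothetical samples.
customers = [
    {"email": "a@example.com"},
    {"email": "not-an-email"},
    {"email": "b@site"},
    {"email": None},
]
orders = [{"quantity": 5}, {"quantity": -2}, {"quantity": 5000}]

bad_emails = invalid_emails(customers)
bad_quantities = out_of_range(orders, "quantity", low=1, high=1000)
print(len(bad_emails), len(bad_quantities))
```

Null values are left out of both checks so that a missing value is reported once, under completeness, and not again as a format or range violation.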

Steps 4-5: Identify Issues and Create an Action Plan

Issue                              | Priority | Action                | Owner
Invalid emails in customers (PII)  | Critical | Run validation script | Data Eng
Missing values in shipping address | High     | Investigate source    | Support
Duplicate User IDs                 | Critical | Deduplicate records   | Data Eng
Inconsistent date formats          | Medium   | Standardize pipeline  | Analytics

Step 6: Export EDA Report

Download & Share

Export the HTML report with its visualizations and share it with stakeholders so that improvements can be tracked over time.

Success Criteria

  • Data quality issues prioritized
  • Action plan created with owners
  • Baseline metrics captured
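One possible way to use captured baseline metrics is to compare them against a later run. The sketch below is a minimal example under stated assumptions: the metric names, the later-run values, and the helper compare_to_baseline are hypothetical, and the metrics are assumed to be "lower is better" percentages such as null rates.

```python
import json
from typing import Dict


def compare_to_baseline(
    baseline: Dict[str, float], current: Dict[str, float]
) -> Dict[str, float]:
    """Return current minus baseline for every metric present in both runs.

    For "lower is better" metrics, a negative delta is an improvement.
    """
    return {
        name: round(current[name] - baseline[name], 2)
        for name in baseline
        if name in current
    }


# Baseline from the first assessment (the 45% figure is the example above).
baseline = {"products.description null %": 45.0, "users.user_id duplicate %": 2.0}

# Hypothetical values from a later assessment.
current = {"products.description null %": 30.0, "users.user_id duplicate %": 0.0}

# A baseline can be stored as JSON alongside the exported report.
stored = json.dumps(baseline, indent=2)
deltas = compare_to_baseline(json.loads(stored), current)
print(deltas)
```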