Comprehensive Data Quality

Data Quality Management - Complete Prompting Guide
Overview: Turn Chaos into Intelligence
Poor data quality costs companies 15-25% of revenue. In sales, it means missed opportunities, embarrassing mistakes, and lost deals. This guide teaches you how to use Colby AI to maintain pristine data that drives better decisions and faster sales cycles.
What You'll Learn:
Why data quality directly impacts sales performance
How to identify and fix common data issues
Techniques for maintaining ongoing data hygiene
Advanced strategies for data enrichment
The Real Cost of Bad Sales Data
Before diving into solutions, understand the impact:
Missed Opportunities: Can't find prospects due to incomplete data
Damaged Credibility: Calling with wrong information
Wasted Time: Reps spending hours fixing instead of selling
Poor Forecasting: Decisions based on incorrect data
Lost Revenue: Deals fall through cracks
Understanding Colby's Data Quality Engine
Colby acts as your data quality analyst:
Pattern Recognition: Identifies anomalies and inconsistencies
Validation Logic: Checks data against known standards
Enrichment Capabilities: Fills gaps from multiple sources
Duplicate Detection: Finds and merges redundant records
Standardization: Formats data consistently
Monitoring: Continuously watches for quality degradation
It however does not do any of this independently. Instead it requires that you review and evaluate the outputs
The Complete Data Quality Prompt Framework
"Analyze data quality for [scope]:
Check For:
- Completeness: [Required fields]
- Accuracy: [Validation rules]
- Consistency: [Format standards]
- Duplicates: [Matching criteria]
- Timeliness: [Freshness requirements]
Focus Areas:
- [Specific objects: Accounts, Contacts, etc.]
- [Critical fields for sales process]
- [High-impact data points]
Actions Requested:
- [Report only, fix automatically, or suggest fixes]
- [Priority order]
- [Risk assessment]"
Data Quality Examples: From Reactive to Proactive
❌ Poor Data Request:
"Fix data"
Why It Fails:
No scope definition
No priority
No specific issues
Overwhelming results
✅ Basic Data Request:
"Find accounts missing industry information"
Better Because:
Specific field
Clear object
Actionable result
✅✅ Good Data Request:
"Find all accounts in my territory missing industry, employee count, or website. Sort by opportunity value."
Much Better:
Multiple fields
Scoped to territory
Priority sorting
Business context
✅✅✅ Excellent Data Request:
"Perform data quality audit on Enterprise accounts:
Critical Fields Check:
- Industry classification (for segmentation)
- Employee count (for pricing tiers)
- Annual revenue (for potential calculation)
- Website (for research)
- Parent account (for hierarchy)
Validation Rules:
- Phone numbers: Valid format with area code
- Websites: Proper URL format, no 404s
- Email domain: Matches company website
- Industry: Uses standard picklist values
Duplicate Detection:
- Similar company names (ABC Inc vs ABC Corporation)
- Same website domain
- Matching phone numbers
- Focus on accounts >$100K potential
Data Freshness:
- Flag accounts not updated in 6+ months
- Identify stale opportunities (no activity 30+ days)
- Find contacts who may have changed jobs
Enrichment Opportunities:
- Missing LinkedIn URLs
- Incomplete tech stack info
- No recent news/triggers
- Missing key stakeholders
Output:
- Prioritized fix list by revenue impact
- Quick wins vs major projects
- Automation recommendations
- Team assignments for cleanup"
Why This Is Outstanding:
Comprehensive scope with business rationale
Clear validation rules
Smart prioritization
Actionable output format
Strategic thinking about impact
Mastering Different Data Quality Scenarios
1. New Data Intake Quality
Purpose: Ensure clean data entry from the start Focus: Prevention over correction
Master Prompt:
"Monitor new data quality for [time period]:
Entry Point Analysis:
- Lead forms: Required field completion
- Imports: Data mapping accuracy
- Manual entry: Common mistakes
- Integrations: Sync errors
Validation at Entry:
- Email format and deliverability
- Phone number validity
- Company/website matching
- Duplicate checking
Real-Time Flags:
- Missing critical fields
- Suspicious patterns (test@test.com)
- Duplicate creation attempts
- Data inconsistencies
Automation Opportunities:
- Auto-formatting rules needed
- Default value suggestions
- Lookup table creation
- Validation rule implementation
Training Needs:
- Common rep mistakes
- Fields often skipped
- Misunderstood definitions
- Process improvements"
2. Account Hierarchy Management
Purpose: Maintain accurate parent-child relationships Focus: Organizational structure integrity
Master Prompt:
"Audit account hierarchy structure:
Parent-Child Analysis:
- Orphaned accounts (should have parent)
- Incorrect hierarchies
- Missing subsidiaries
- Duplicate hierarchies
Business Logic:
- Revenue roll-up accuracy
- Territory assignment conflicts
- Contract inheritance issues
- Engagement tracking gaps
Identification Patterns:
- Company name variations
- Website domain matching
- Address proximity
- Industry alignment
Fix Recommendations:
- Merge duplicate hierarchies
- Establish missing connections
- Update ownership rules
- Create hierarchy standards
Impact Assessment:
- Deals affected by bad hierarchy
- Revenue attribution errors
- Territory confusion
- Reporting inaccuracies"
3. Contact Data Hygiene
Purpose: Maintain accurate, reachable contacts Focus: Deliverability and relevance
Master Prompt:
"Comprehensive contact data review:
Deliverability Check:
- Email bounce indicators
- Invalid format patterns
- Personal vs. professional emails
- Domain verification
Job Change Detection:
- Title inconsistencies
- Company mismatches
- LinkedIn cross-reference
- Engagement drop-offs
Completeness Audit:
- Missing phone numbers
- No LinkedIn URLs
- Incomplete titles
- Missing department info
Relationship Mapping:
- Orphaned contacts
- Wrong account associations
- Duplicate person records
- Missing reporting structure
Enrichment Priorities:
- High-value contact gaps
- Key stakeholder info
- Mobile numbers for executives
- Direct dial research
Compliance Concerns:
- Opt-out status accuracy
- GDPR data freshness
- Permission confirmations
- Data retention violations"
4. Opportunity Data Integrity
Purpose: Ensure accurate pipeline and forecasting Focus: Deal data that drives decisions
Master Prompt:
"Pipeline data quality deep dive:
Stage Progression Logic:
- Stuck opportunities (no movement 30+ days)
- Backward stage movements
- Skipped required stages
- Velocity anomalies
Critical Field Analysis:
- Missing close dates
- Zero dollar amounts
- No next steps defined
- Unassigned opportunities
Probability Alignment:
- Stage vs. probability mismatches
- Override justifications
- Historical accuracy check
- Rep consistency analysis
Activity Correlation:
- Opportunities with no activities
- High-value deals with low touch
- Stale opportunities still open
- Dead deals not closed
Forecast Impact:
- Data issues affecting accuracy
- Cleanup impact on pipeline
- Risk identification
- Confidence scoring"
Advanced Data Quality Techniques
Technique 1: Predictive Data Decay
"Identify data likely to become stale:
Decay Indicators:
- Contact at company 2+ years (likely to leave)
- No engagement 6+ months (role change?)
- Company showing acquisition signals
- Industry consolidation trends
Proactive Actions:
- Schedule verification tasks
- Set up monitoring alerts
- Create update campaigns
- Assign relationship managers"
Technique 2: Cross-Object Validation
"Validate data consistency across objects:
Relationship Checks:
- Contact email domain vs. Account website
- Opportunity value vs. Account size
- Activity volume vs. Deal stage
- Territory alignment across objects
Conflict Resolution:
- Source of truth rules
- Update precedence
- Cascade requirements
- Sync patterns"
Technique 3: Quality Scoring
"Create data quality scores:
Account Scoring:
- Completeness: X/10 required fields
- Freshness: Last update aging
- Accuracy: Validation pass rate
- Enrichment: Enhancement level
Use Scores For:
- Cleanup prioritization
- Rep performance metrics
- Account segmentation
- Risk assessment"
Common Data Quality Mistakes
Mistake 1: Fixing Symptoms, Not Causes
❌ Repeatedly cleaning same issues ✅ "Identify why industry field keeps going missing and fix the root cause"
Mistake 2: All-or-Nothing Approach
❌ Trying to fix everything at once ✅ "Start with high-value accounts and fields critical for this quarter's deals"
Mistake 3: Solo Cleanup Efforts
❌ One person fixing everyone's data ✅ "Assign ownership: Each rep cleans their named accounts weekly"
Mistake 4: No Prevention Strategy
❌ Only reactive cleanup ✅ "Implement validation rules to prevent bad data entry"
Measuring Data Quality Success
Quality Metrics:
Field completion rates
Validation pass rates
Duplicate record percentage
Data freshness scores
Enrichment coverage
Business Impact Metrics:
Time saved on research
Email deliverability rates
Connect rates improvement
Forecast accuracy
Deal velocity
Pro Tips for Data Excellence
The Weekly Hygiene Routine
The Data Dictionary
The Enrichment Waterfall
The Quality Dashboard
Practice Scenarios
Scenario 1: The Acquisition Integration
"We just imported 10,000 accounts from an acquisition. Create a data quality plan to integrate them properly."
Scenario 2: The Territory Realignment
"Sales territories are being redrawn. Identify and fix all account ownership and hierarchy issues."
Scenario 3: The Compliance Audit
"Legal needs all contact data to be GDPR compliant. Find and fix all compliance issues."