#2015ConnectCon @connectmembers facebook.com/connectmembers
Connect Data Quality
Hawro Mustafa (DataSpace)
Product Management Director, Data.com Connect
Joshua D’Antuono (SuperIron31)
Director, Data Operations - Data.com
Agenda
• New Domain + Company Management Features
• Bulk Contribution Enhancements
• Telephone Disconnect Processing
• Bounce Report Processing Improvements
• Detecting + Preventing Fraud
• Anti-Fraud Activities
• User Trust Score
• Rookie Tier
New Domain + Company Management Features
5 New Community Features
• Update Domain
• Delete Domain
• Add Domain
• Move Domain
• Merge Companies
1 New Stewardship Feature
• Remove Company
Enabling DQ improvements on 3,000+ companies with 2.3 million+ associated contacts
Bulk Contribution Enhancements
• Data science solution using a Naive Bayes classifier
• Designed to address top issues with bulk contribution: column mapping errors
• Final version of the model launched in early May 2015, leading to a 12% overall avg increase in acceptance rates to date
[Chart: Bulk Contribution File Acceptance Rates by Month (Accepted vs. Rejected); annotations: 70% avg accept rate, 82% avg accept rate, Feature Complete]
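To illustrate how a Naive Bayes classifier can catch column mapping errors, here is a minimal sketch. It is not the model described on the slide: the feature set, the class name `NaiveBayesColumnMapper`, the labels, and the sample values are all assumptions made up for illustration. It guesses what kind of data a column holds from a few of its cell values, which could then be compared with the mapping the contributor chose.

```python
import math
from collections import Counter, defaultdict


def features(value):
    """Turn one cell value into coarse tokens (hypothetical feature set)."""
    v = value.strip()
    digits = sum(ch.isdigit() for ch in v)
    return [
        "has_at" if "@" in v else "no_at",
        "has_space" if " " in v else "no_space",
        "mostly_digits" if v and digits / len(v) > 0.5 else "few_digits",
        "has_dot" if "." in v else "no_dot",
    ]


class NaiveBayesColumnMapper:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()              # training cells per label
        self.token_counts = defaultdict(Counter)   # token counts per label
        self.vocab = set()

    def train(self, label, values):
        for value in values:
            self.label_counts[label] += 1
            for tok in features(value):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)

    def _log_score(self, label, values):
        # Log prior for the label, applied once per column.
        total = sum(self.label_counts.values())
        score = math.log(self.label_counts[label] / total)
        # Smoothed log likelihood of every token in every sampled cell.
        denom = sum(self.token_counts[label].values()) + len(self.vocab)
        for value in values:
            for tok in features(value):
                score += math.log((self.token_counts[label][tok] + 1) / denom)
        return score

    def predict(self, values):
        """Return the most likely label for a column of cell values."""
        return max(self.label_counts, key=lambda lab: self._log_score(lab, values))


# Made-up training samples; a real model would learn from accepted files.
mapper = NaiveBayesColumnMapper()
mapper.train("email", ["ann@example.com", "bob.lee@example.org", "c.d@example.net"])
mapper.train("phone", ["415-555-0100", "(212) 555-0199", "650 555 0123"])
mapper.train("name", ["Ann Smith", "Bob Lee", "Carla Diaz"])

guess = mapper.predict(["joe@example.com", "sue@example.org"])
```

If the predicted label disagrees with the column header chosen during upload, the file could be flagged for a mapping check before it is rejected outright.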
Telephone Disconnect Processing
• A Telephone Quality Pilot was conducted by the Data Stewardship team with 20k selected contacts
• Many contacts with disconnected phone numbers could be improved and left active in Connect
• A small subset should be deactivated until an active phone number is provided by the Community
• ~2 million contact phone numbers are to be run through this process in 2015
[Chart: Telephone Quality Pilot Results; categories: Unknown, Active, Disconnected; values shown: 83%, 12%, 5%; outcome labels: 5% deactivated, 95% updated]
Bounce Report Processing
Bounce Report Processing Feedback
• Data Analyst Observations and Monitoring
• Data Defender Feedback to Data Stewardship Team
• Community Feedback via Corner Posts and Support Cases
Bounce Report Processing Challenges
• DaffyHope51 is contributing facts into a complex system of historical data.
• Some bounce codes are ambiguous in nature.
• Bounce codes received by DaffyHope51 are constantly changing.
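As a rough illustration of why some bounce codes are ambiguous, the sketch below sorts bounce messages by their enhanced mail status code (the class.subject.detail form defined in RFC 3463). The function name `classify_bounce` and the choice of which codes count as clear evidence of a bad address are assumptions for illustration only, not the rules used by Data.com.

```python
import re

# Enhanced status code: class.subject.detail, where class 4 is a persistent
# transient failure and class 5 is a permanent failure (RFC 3463).
# The lookarounds avoid matching pieces of IP addresses such as 10.5.1.1.
_STATUS = re.compile(r"(?<![\d.])([45])\.(\d{1,3})\.(\d{1,3})(?!\.?\d)")

# Hypothetical policy: only these permanent codes clearly say the address is bad.
#   5.1.1 = bad destination mailbox address
#   5.1.2 = bad destination system address
CLEAR_INVALID = {("1", "1"), ("1", "2")}


def classify_bounce(diagnostic):
    """Classify one bounce diagnostic string into a coarse category."""
    match = _STATUS.search(diagnostic)
    if match is None:
        return "unknown"            # no enhanced status code present
    status_class, subject, detail = match.groups()
    if status_class == "4":
        return "temporary"          # retry later; do not touch the contact
    if (subject, detail) in CLEAR_INVALID:
        return "invalid_address"    # strong signal the email is dead
    return "ambiguous"              # permanent, but not clearly about the address


samples = [
    "550 5.1.1 User unknown",
    "452 4.2.2 Mailbox full",
    "554 5.7.1 Message rejected due to policy",
    "Connection timed out",
]
results = [classify_bounce(s) for s in samples]
```

A code such as 5.7.1 is a permanent failure, yet it usually reflects a policy or spam block rather than a dead mailbox, so acting on it as if the address were invalid could damage good contact data.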
Bounce Report Processing Solution
• Based upon feedback and examples provided by the Community + detailed data analysis.
• Analysis methods allow us to react more quickly to new feedback from the Community.
• Should greatly reduce the impact of DaffyHope51’s actions on contact data.
Detecting + Preventing Fraud
• Registration Features
• Anti-Fraud Activities
• Data Stewardship Team
• Rookie Tier: 5 Actions Per Day, Points Delay, Heavily Monitored
• User Trust Score
• Community Reports
• Patterns of Activity
• Timing of Actions
• Data Defender Reports
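The Rookie Tier cap of 5 actions per day can be pictured with a small sketch. Only the limit of 5 comes from the slide; the class `RookieActionLimiter`, its method names, and the in-memory counter are hypothetical, and the Points Delay and monitoring parts of the tier are not modeled here.

```python
from collections import defaultdict
from datetime import date

ROOKIE_DAILY_LIMIT = 5  # from the slide: "5 Actions Per Day"


class RookieActionLimiter:
    """Caps how many actions a rookie member may perform per calendar day."""

    def __init__(self, daily_limit=ROOKIE_DAILY_LIMIT):
        self.daily_limit = daily_limit
        self._used = defaultdict(int)  # (member_id, day) -> actions used

    def try_action(self, member_id, is_rookie, today):
        """Return True and record the action if it is allowed, else False."""
        if not is_rookie:
            return True  # established members are not capped in this sketch
        key = (member_id, today)
        if self._used[key] >= self.daily_limit:
            return False
        self._used[key] += 1
        return True


limiter = RookieActionLimiter()
day_one = date(2015, 6, 1)
day_two = date(2015, 6, 2)

# Six attempts on the same day: the first five pass, the sixth is refused.
attempts = [limiter.try_action("new_member", True, day_one) for _ in range(6)]
# The counter is keyed by day, so the member can act again the next day.
next_day = limiter.try_action("new_member", True, day_two)
```

Keying the counter on the member and the calendar day keeps the check cheap and makes the cap reset naturally without a scheduled job.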
Detecting + Preventing Fraud: Impact
[Chart: Contribution Volume of Locked Members by Registration Week (1 - 4 vs. 5+); labels: Contact Updates, Locked Members by Registration Week, Volume]
Q&A
Thank You!