
Domestic Violence Network People Matching

Domestic Violence Network People Matching. October 22, 2014. Jay Colbert, Indianapolis. Domestic Violence Network. Phase 1: Developed “Domestic Violence in the Criminal Justice System” report in 2013-2014. Phase 2: Updating report.


Presentation Transcript


  1. Domestic Violence Network People Matching • October 22, 2014 • Jay Colbert, Indianapolis

  2. Domestic Violence Network • Phase 1: Developed “Domestic Violence in the Criminal Justice System” report in 2013-2014 • Phase 2: Updating report • There are many different sources of DV data, and the question was “how many unique people are involved?”

  3. Data Sources • Notes • 1 Domestic Violence Shelter • 2 No names given • 3 Special IMPD domestic violence program

  4. Phase 1 people matching • In-house solution using (mainly) Python scripts. • Looked at a combination of name, race, gender, and age. • Gave pretty good final results, but… • It took 10 days to run.

  5. Phase 2 people matching • Researched multiple commercial software solutions geared towards data deduplication. • Some enterprise solutions cost $30,000 and up. • Some desktop solutions cost as little as $500.

  6. Phase 2 name matching • Purchased MatchIT software. • $4,000 initial, $2,000 annual renewal • Discounted since we are a nonprofit • Flexibility on matching algorithms. • We match on name and date of birth first. • Then we move on to matching on name, race, gender, and year of birth (not all sources give us exact DOB). • We previously used age, but that becomes problematic as the number of years of data increases.
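The two-pass order described on this slide can be sketched in a few lines of Python. This is an illustration only, not how MatchIT works internally, which the slides do not describe. The record fields (`name`, `dob`, `yob`, `race`, `gender`), the ISO date strings, and the rule that pass 2 skips a group containing conflicting exact dates of birth are all assumptions made for the example.

```python
from collections import defaultdict

def normalize_name(name):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())

def find(parent, i):
    # Union-find lookup with path compression.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)  # smallest index becomes the person id

def match_people(records):
    """Return one person id per record (hypothetical helper, not a MatchIT API)."""
    parent = list(range(len(records)))

    # Pass 1: same normalized name and same full date of birth.
    by_dob = defaultdict(list)
    for i, r in enumerate(records):
        if r.get("dob"):
            by_dob[(normalize_name(r["name"]), r["dob"])].append(i)
    for ids in by_dob.values():
        for i in ids[1:]:
            union(parent, ids[0], i)

    # Pass 2: same name, race, gender, and year of birth.
    by_demo = defaultdict(list)
    for i, r in enumerate(records):
        yob = r["dob"][:4] if r.get("dob") else r.get("yob")
        if yob:
            key = (normalize_name(r["name"]), r["race"], r["gender"], str(yob))
            by_demo[key].append(i)
    for ids in by_demo.values():
        known_dobs = {records[i]["dob"] for i in ids if records[i].get("dob")}
        # Assumption: if the group holds two different exact DOBs, leave it alone.
        if len(known_dobs) <= 1:
            for i in ids[1:]:
                union(parent, ids[0], i)

    return [find(parent, i) for i in range(len(records))]

# Made-up records for illustration.
records = [
    {"name": "Jane  Doe", "dob": "1980-03-14", "race": "W", "gender": "F"},
    {"name": "jane doe", "dob": "1980-03-14", "race": "W", "gender": "F"},
    {"name": "Jane Doe", "dob": None, "yob": 1980, "race": "W", "gender": "F"},
    {"name": "Jane Doe", "dob": "1983-07-02", "race": "W", "gender": "F"},
]
print(match_people(records))  # [0, 0, 0, 3]
```

The first three records collapse into one person: two by exact date of birth, and the third by year of birth because its source gave no exact DOB. The fourth keeps its own id because its date of birth differs.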

  7. Example • Names have been changed to protect the innocent (and the guilty). • These records identified as same person.

  8. Old Method • Old Method had a hard time seeing people with different birthdates within a few years of one another as different people.
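One plausible reason age-based matching struggles here is that an age only makes sense together with the year it was recorded. The Python sketch below is an assumption-laden illustration, not the Phase 1 scripts themselves: `birth_year_range` and `ages_compatible` are hypothetical helpers that turn an age and a record year into the two possible birth years.

```python
def birth_year_range(age, record_year):
    """Possible birth years for someone recorded as `age` during `record_year`.

    The person was born in (record_year - age) if their birthday had already
    passed when the record was made, otherwise one year earlier.
    """
    return (record_year - age - 1, record_year - age)

def ages_compatible(age_a, year_a, age_b, year_b):
    """True when the two age observations could belong to one birth year."""
    lo_a, hi_a = birth_year_range(age_a, year_a)
    lo_b, hi_b = birth_year_range(age_b, year_b)
    return lo_a <= hi_b and lo_b <= hi_a

# Same person seen five years apart: the raw ages differ by 5, yet they are compatible.
print(ages_compatible(28, 2009, 33, 2014))  # True

# Two people with the same raw age in different years: born about five years apart.
print(ages_compatible(30, 2009, 30, 2014))  # False
```

Even after this correction, the comparison only narrows a person to a two-year window, so people born within a year or two of each other can still be confused.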

  9. New Method • New Method much better since it looks at exact dates of birth

  10. New Method • Even gets it right when both names are accidentally in one field or there is a typo in the birthdate
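The slides do not say how MatchIT handles these cases, so the following Python is only a rough sketch of one way to tolerate them. The field names `first`, `last`, and `dob`, the fixed-width date strings, and the one-character tolerance are all assumptions chosen for the example.

```python
def name_tokens(first, last):
    """Pool both name fields into one set of lowercase tokens.

    Pooling means "John Smith" typed entirely into the first-name field
    compares equal to first="John", last="Smith".
    """
    return frozenset(f"{first or ''} {last or ''}".lower().split())

def dob_within_one_typo(a, b):
    """True when two same-length date strings differ in at most one character."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1

def likely_same_person(r1, r2):
    same_name = name_tokens(r1["first"], r1["last"]) == name_tokens(r2["first"], r2["last"])
    return same_name and dob_within_one_typo(r1["dob"], r2["dob"])

# Made-up records for illustration.
a = {"first": "John Smith", "last": "", "dob": "1975-06-21"}
b = {"first": "John", "last": "Smith", "dob": "1975-06-27"}
c = {"first": "John", "last": "Smith", "dob": "1979-01-21"}

print(likely_same_person(a, b))  # True: names pooled, one digit differs
print(likely_same_person(a, c))  # False: two digits differ
```

A one-character tolerance is a trade-off: it also accepts two real people with the same name born one day or one year apart, so a production matcher would normally weigh it against other fields.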

  11. Many other ways to match • Name • Date of Birth/Age • Race, Gender • Address • Phone numbers • Anything else you can think of

  12. What did it boil down to? • The total 871,681 records boiled down to 400,736 • The 177,545 Marion County DV people boiled down to 92,908.
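The scale of the reduction is easier to see as ratios. The short Python calculation below uses only the four counts from this slide; `dedup_summary` is a hypothetical helper name, and it treats the 177,545 figure as a count of person records.

```python
def dedup_summary(records, unique_people):
    """Summarize a deduplication run from its before and after counts."""
    return {
        "records_per_person": round(records / unique_people, 2),
        "percent_reduction": round(100 * (1 - unique_people / records), 1),
    }

all_sources = dedup_summary(871_681, 400_736)
marion_dv = dedup_summary(177_545, 92_908)

print(all_sources)  # {'records_per_person': 2.18, 'percent_reduction': 54.0}
print(marion_dv)    # {'records_per_person': 1.91, 'percent_reduction': 47.7}
```

In other words, each unique person appears a little over twice on average across all sources, and just under twice among the Marion County DV records.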

  13. Example Output Unique Victims

  14. Example Output

  15. Example Output
