Database Challenge
- To: mathgroup at smc.vnet.net
- Subject: [mg106106] Database Challenge
- From: Nicholas Kormanik <nkormanik at gmail.com>
- Date: Fri, 1 Jan 2010 05:37:54 -0500 (EST)
There are 12 records in this mini database, in two columns. The first column holds social security numbers; the second holds names. Unfortunately, Jane Doe appears three times, under three different versions of her name, all with the same social security number.

Challenge: Remove the duplicates, where the social security number is the same, and keep any one of the names. The final result will be whittled down to 10 records.

(The real-life problem has 6.5 million records and lots of duplicates, with various versions of names.)

025-60-4044 joe average
004-16-4077 jane doe
014-27-9076 mike smith
098-43-2098 rodolfo pilas
073-15-6005 gustavo boksar
004-16-4077 jane a. doe
147-79-9074 bea busaniche
165-63-0189 pablo medrano
124-96-7092 jeff aaron
004-16-4077 jane anne doe
172-30-6069 michael peters
059-85-1062 leroy baker
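For concreteness, here is a minimal sketch of one way the challenge could be approached. It is written in Python purely as an illustration (the post does not name a language), and the function name `dedupe_by_ssn` is hypothetical. It assumes the records fit in memory as (number, name) pairs, and it keeps the first name seen for each social security number, which satisfies "keep any one of the names".

```python
def dedupe_by_ssn(records):
    """Keep the first record seen for each social security number."""
    seen = {}
    for ssn, name in records:
        # setdefault stores the name only if this number has not appeared yet,
        # so later variants of the same person are ignored.
        seen.setdefault(ssn, name)
    # Dicts preserve insertion order (Python 3.7+), so the original
    # ordering of first occurrences is retained.
    return list(seen.items())


# The 12 sample records from the post.
records = [
    ("025-60-4044", "joe average"),
    ("004-16-4077", "jane doe"),
    ("014-27-9076", "mike smith"),
    ("098-43-2098", "rodolfo pilas"),
    ("073-15-6005", "gustavo boksar"),
    ("004-16-4077", "jane a. doe"),
    ("147-79-9074", "bea busaniche"),
    ("165-63-0189", "pablo medrano"),
    ("124-96-7092", "jeff aaron"),
    ("004-16-4077", "jane anne doe"),
    ("172-30-6069", "michael peters"),
    ("059-85-1062", "leroy baker"),
]

unique = dedupe_by_ssn(records)
print(len(unique))  # 10
```

The sketch makes a single pass with one dictionary lookup per record, so the work grows roughly linearly with the number of records. That matters at the 6.5 million record scale mentioned above, where comparing every pair of records would be impractical.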
- Follow-Ups:
- Re: Database Challenge
- From: DrMajorBob <btreat1@austin.rr.com>
- Re: Database Challenge
- From: Leonid Shifrin <lshifr@gmail.com>
- Re: Database Challenge
- From: Adriano Pascoletti <adriano.pascoletti@dimi.uniud.it>
- Re: Database Challenge