I have two databases of messy names such as these:
- Jindal, Bobby
- Fla. Gov. Bobby Jindal
- Bobby Jindal
- 3M Corp.
- 3M Menomonie
I need to find the matches. Can anyone point me to or suggest a good recipe for how to do this in Google Refine?
This link gives me a starting point but I could use further advice: http://blog.ouseful.info/2011/05/06/merging-datesets-with-common-columns-in-google-refine/
You could try our Refine extension; see especially the reconciliation part of the documentation.
The cell.cross function is similar to VLOOKUP in Excel: it matches only when the two cells are identical. If you want to use this method, you will need to cluster and clean your data thoroughly beforehand.
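To illustrate why cleaning has to come before an exact-match lookup, here is a rough Python sketch of a fingerprint-style key, loosely modeled on the key-collision clustering idea in Refine. It is an approximation written for this thread, not Refine's actual implementation; the function names `fingerprint` and `group_by_key` are made up for the example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a normalized key so that reordered or punctuated variants collide."""
    s = name.strip().lower()
    # Fold accented characters to plain ASCII where possible.
    s = unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
    # Drop punctuation, keeping only word characters and whitespace.
    s = re.sub(r"[^\w\s]", "", s)
    # Split into tokens, de-duplicate, and sort so word order no longer matters.
    tokens = sorted(set(s.split()))
    return " ".join(tokens)


def group_by_key(names):
    """Group names that share the same fingerprint key."""
    groups = defaultdict(list)
    for n in names:
        groups[fingerprint(n)].append(n)
    return dict(groups)


names = [
    "Jindal, Bobby",
    "Fla. Gov. Bobby Jindal",
    "Bobby Jindal",
    "3M Corp.",
    "3M Menomonie",
]
groups = group_by_key(names)
for key, members in groups.items():
    print(key, "->", members)
```

With this key, "Jindal, Bobby" and "Bobby Jindal" end up in the same group, so an exact lookup on the key column would pair them. "Fla. Gov. Bobby Jindal" still gets a different key because of the extra title tokens, which is the kind of case where stripping titles first, or using a reconciliation service, is needed.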
I support Michael's answer. Try a reconciliation service: the RDF one or the open reconcile.