Quote:
Biggest challenge is the space delimeter as all rows are not identical for ex
It is more than a challenge: spaces have several uses in your addresses, so using the space as the delimiter is a bad idea, and parsing these rows reliably that way is impossible.
You have devised this RegEx
^(.*?)\s([a-z]{2})\s(.*?)\s(\d{5})$
but think about addresses like:
4 drive rd phoneix Arizona 34445
Alm dr missouri 65034
4 drive rd phoneix New Mexico 34445
Alm dr New York New York 65034 // aka NY city in NY state
4 drive rd phoneix North Carolina 34445
4 drive rd phoneix South Dakota 34445
4 drive rd phoneix Washington District of Columbia 34445
4 drive rd Baton Rouge Louisiana 34445
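To see what goes wrong, here is a small Python sketch that runs your RegEx over a few of the lines above. It assumes the pattern is used with default flags (case-sensitive); the names PATTERN, SAMPLES and capture are only for illustration.

```python
import re

# The pattern from the question, compiled with default flags (an assumption).
PATTERN = re.compile(r"^(.*?)\s([a-z]{2})\s(.*?)\s(\d{5})$")

SAMPLES = [
    "4 drive rd phoneix Arizona 34445",
    "Alm dr missouri 65034",
    "Alm dr New York New York 65034",
    "4 drive rd Baton Rouge Louisiana 34445",
]

def capture(line):
    """Return the four captured groups, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groups() if m else None

for line in SAMPLES:
    print(capture(line))
```

Every line matches, but the two-letter group lands on "rd" or "dr" from the street, and the city and state end up glued together in the third group. The pattern has no way to know where one field ends and the next begins.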
You would need a way to recognize every spelling of every state and city whose name has more than one word (including wrong spellings).
The only reliable way is to have a separator that cannot be part of the address, like:
4 drive rd;phoneix;Washington District of Columbia;34445
4 drive rd;Baton Rouge;Louisiana;34445
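With such a separator, parsing becomes a plain split. A minimal Python sketch, assuming the four fields are always in the order shown above; the function name parse_address and the keys street, city, state and zip are my own labels, not something from your data.

```python
def parse_address(line, sep=";"):
    """Split one delimited address row into its four fields.

    The field names are hypothetical labels inferred from the examples.
    """
    parts = [p.strip() for p in line.split(sep)]
    if len(parts) != 4:
        # Refuse rows that do not have exactly four fields instead of guessing.
        raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
    street, city, state, zip_code = parts
    return {"street": street, "city": city, "state": state, "zip": zip_code}

print(parse_address("4 drive rd;phoneix;Washington District of Columbia;34445"))
print(parse_address("4 drive rd;Baton Rouge;Louisiana;34445"))
```

Multi-word cities and states need no special handling here, because the spaces inside a field are no longer doing the delimiter's job.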
As a programmer, your job is to devise a solution that supports all cases. Adding the cases that a quick first solution did not handle takes much more time than doing things right in the first place.