A few days ago I wrote an article and posted code to parse email address lists from an unstructured array into a structured array. You can read it here.
You may also need to parse an email address list from an "unstructured" string. Here's how to handle that.
Here's what the original string could look like:
$str = 'john.doe@domainone.com,';
$str .= 'johan.doer@domaintwo.com,';
$str .= 'Johann Sebastian Bach <johann.bach@composer.com>,';
$str .= '<w.a.mozart@rockingpiano.com> Wolfgang Amadeus Mozart,';
$str .= 'Ludwig van Beethoven <ludwig@vanbeethoven.com>,';
$str .= 'I. P. Knightly';
Note that the fourth item incorrectly has the email address before the name, and the last item contains no email address. The fourth item would normally trip up an email parser, since most of those explode on the < character and expect the name to come first.
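To make the problem concrete, here is a minimal sketch of the kind of naive parser described above. The function name naiveParseItem and its return shape are hypothetical, made up for illustration. It assumes every item is "Name <email>" and splits on the < character. In this sketch the misordered item does not crash anything, but it is misread and fails validation:

```php
<?php
// Hypothetical naive parser: assumes every item looks like "Name <email>".
function naiveParseItem(string $item): array
{
    // Split once on "<": left side is taken as the name, right side as the email.
    $parts = explode('<', trim($item), 2);
    $hasBracket = isset($parts[1]);
    $email = $hasBracket ? rtrim(trim($parts[1]), '>') : trim($parts[0]);
    return [
        'name'  => $hasBracket ? trim($parts[0]) : '',
        'email' => $email,
        'valid' => filter_var($email, FILTER_VALIDATE_EMAIL) !== false,
    ];
}

$good = naiveParseItem('Ludwig van Beethoven <ludwig@vanbeethoven.com>');
$bad  = naiveParseItem('<w.a.mozart@rockingpiano.com> Wolfgang Amadeus Mozart');

var_dump($good['valid']); // bool(true)
var_dump($bad['valid']);  // bool(false)
```

For the misordered item, everything after the < ends up in the "email" slot, name included, so the item is rejected even though it holds a perfectly good address.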
We are going to handle this in two passes. The first pass removes any items without an email address and restructures each remaining item as name followed by email address.
Here's the code for that first pass:
/**
 * by Andy Prevost
 * accepts string containing email addresses (separated by comma) in almost any format
 *   can be single address or multiple with or without correct spacing, quote marks
 * returns a string with items that contain no email address removed, and name/email
 *   address in correct order and with email address tokenized
 * @param string $str
 * @return string (non-string input is returned unchanged)
 */
function cleanEmailListStr($str) {
  if (!is_string($str)) { return $str; }
  $nstr = '';
  $tarr = explode(',', $str);
  foreach ($tarr as $val) {
    $val = trim($val);
    // skip items with no email address (avoids reading $bk[0] when nothing matched);
    // the TLD length is left open-ended so TLDs longer than 4 letters still match
    if (!preg_match('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $val, $bk)) {
      continue;
    }
    if (filter_var($bk[0], FILTER_VALIDATE_EMAIL)) {
      $email = $bk[0];
      // whatever is left after removing the address and the brackets is the name
      $name = trim(str_replace([$email, '<', '>'], '', $val));
      if ($name != '') { $name .= ' '; }
      $nstr .= $name . '<' . $email . '>' . ',';
    }
  }
  return rtrim($nstr, ',');
}
Generates:
<john.doe@domainone.com>,<johan.doer@domaintwo.com>,Johann Sebastian Bach <johann.bach@composer.com>,Wolfgang Amadeus Mozart <w.a.mozart@rockingpiano.com>,Ludwig van Beethoven <ludwig@vanbeethoven.com>
We now have a cleaned-up string. Where only an email address was given, it is now wrapped in the proper tokens (angle brackets). Where both an email address and a name were given, they are now in the correct order: name first, then the tokenized email address.
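To see how a single item gets restructured, here is a short trace of the misordered Mozart item through the same three steps: extract the address, treat the remainder as the name, rebuild. It is a sketch rather than a replacement for the function. Note that the pattern here leaves the TLD length open-ended ({2,}), which is an assumption made for this example.

```php
<?php
// Trace of one malformed item through the first pass.
$val = '<w.a.mozart@rockingpiano.com> Wolfgang Amadeus Mozart';

// 1. Pull the first thing that looks like an email address out of the item.
//    (TLD length left open-ended here; that is an assumption of this sketch.)
preg_match('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $val, $bk);
$email = $bk[0] ?? '';

// 2. Whatever remains after removing the address and the brackets is the name.
$name = trim(str_replace([$email, '<', '>'], '', $val));

// 3. Rebuild the item as "Name <email>", or just "<email>" when there is no name.
$item = ($name !== '' ? $name . ' ' : '') . '<' . $email . '>';

echo $item, "\n"; // Wolfgang Amadeus Mozart <w.a.mozart@rockingpiano.com>
```

Because the name is simply "everything that is not the address", it does not matter whether it came before or after the address in the original item.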
Next step is to get those into an array. I've rethought the entire array structure. Rather than try to create an array that matches a pre-defined PHP function, I am going to structure this so that the array key is the email address and the name is the value. It's easier to search for a unique array key than search on a name that could be empty or in almost any format/structure. That code looks like:
/**
 * by Andy Prevost
 * accepts string containing email addresses in almost any format
 *   can be single address or multiple with or without correct spacing, quote marks
 * returns a structured array with email => name
 * @param string $str
 * @return array (non-string input is returned unchanged)
 */
function emailStr2Array($str) {
  if (!is_string($str)) { return $str; }
  $rez = [];
  if (preg_match_all('/\s*"?([^><,"]+)"?\s*((?:<[^><,]+>)?)\s*/', $str, $matches, PREG_SET_ORDER) > 0) {
    foreach ($matches as $m) {
      // must contain an email address (avoids reading $bk[0] when nothing matched)
      if (!preg_match('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $m[0], $bk)) {
        continue;
      }
      if (filter_var($bk[0], FILTER_VALIDATE_EMAIL)) {
        if (!empty($m[2])) {
          // "Name <email>": trim the name, since the pattern captures its trailing space
          $rez[trim($m[2], '<>')] = trim($m[1]);
        } else {
          // bare email address, no name
          $rez[trim($m[1])] = '';
        }
      }
    }
  }
  return $rez;
}
That code will generate:
$emails (array) (
[john.doe@domainone.com] =>
[johan.doer@domaintwo.com] =>
[johann.bach@composer.com] => Johann Sebastian Bach
[w.a.mozart@rockingpiano.com] => Wolfgang Amadeus Mozart
[ludwig@vanbeethoven.com] => Ludwig van Beethoven
)
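Here is a minimal sketch of why keying on the email address is convenient. The sample array is hand-written in the same email => name shape, and formatRecipient is a hypothetical helper name, not part of the code above.

```php
<?php
// Sample data in the same shape as the dump above (email => name).
$emails = [
    'john.doe@domainone.com'   => '',
    'johann.bach@composer.com' => 'Johann Sebastian Bach',
    'ludwig@vanbeethoven.com'  => 'Ludwig van Beethoven',
];

// Hypothetical helper: look up one address and format it for display.
function formatRecipient(array $emails, string $address): ?string
{
    // A direct key lookup; no need to loop over the list or inspect names.
    if (!array_key_exists($address, $emails)) {
        return null; // address is not in the list
    }
    $name = $emails[$address];
    return $name === '' ? $address : $name . ' <' . $address . '>';
}

echo formatRecipient($emails, 'ludwig@vanbeethoven.com'), "\n";
echo formatRecipient($emails, 'john.doe@domainone.com'), "\n";
var_dump(formatRecipient($emails, 'nobody@example.com')); // NULL
```

One side effect worth knowing: array keys are unique, so if the same address appears twice in the input, the later entry overwrites the earlier one.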
Note: I will be rewriting the code from the earlier article so that its result is an array structured the same way as this one.
PS. If you are wondering how I make an array dump look like this: I use Notepad++ and NetBeans IDE to write my code. Notepad++ to author, NetBeans IDE to clean up and debug. In Notepad++ I have a snippet that I use to dump arrays. That code snippet looks like:
/* ***** */
ob_start();print_r($emails);$out = ob_get_contents();ob_end_clean();
$out = str_replace("Array",'$emails (array)',$out);$out = str_replace("\n("," (\n",$out);
$out = str_replace('<','&lt;',$out);$out = str_replace('>','&gt;',$out);
echo __LINE__ . "<pre>\n";print_r($out);echo "</pre>\n";
/* ***** */
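If you would rather not paste the snippet each time, the same idea can be wrapped in a small function. This is only a sketch: dumpArray is a hypothetical name, it uses print_r() with its second argument set to true to capture the output instead of output buffering, and it uses htmlspecialchars() to escape the angle brackets.

```php
<?php
// Hypothetical helper wrapping the dump snippet; returns the HTML as a string.
function dumpArray(array $data, string $label): string
{
    // print_r() returns its output instead of printing when the 2nd argument is true.
    $out = print_r($data, true);
    // Relabel only the top-level "Array\n(" header; nested arrays are indented
    // and therefore do not match this exact sequence.
    $out = str_replace("Array\n(", $label . ' (array) (', $out);
    // Escape <, > and & so addresses in angle brackets show up in the browser.
    return "<pre>\n" . htmlspecialchars($out) . "</pre>\n";
}

echo dumpArray(['johann.bach@composer.com' => 'Johann <Bach>'], '$emails');
```

Returning a string instead of echoing inside the function makes it easy to send the dump to a log file as well as to the browser.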
Enjoy!
Andy