Regular Expression to test for valid UTF-8
$field =~
  m/^(
     [\x09\x0A\x0D\x20-\x7E]            # ASCII
   | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
   |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
   | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
   |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
   |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
   | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
   |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
  )*$/x;
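As a rough illustration of how the pattern might be used, the sketch below wraps it in a helper and runs it on a few byte strings. The name is_valid_utf8 and the sample inputs are assumptions made for this example, not part of the original. It also assumes the input is a raw byte string rather than an already-decoded character string. Note that the ASCII branch accepts only tab, LF, CR, and the printable range, so other control characters such as NUL are rejected even though they are legal UTF-8.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): returns 1 if $bytes
# consists only of byte sequences accepted by the pattern, else 0.
# Expects a raw byte string, not a decoded character string.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return $bytes =~ m/^(
         [\x09\x0A\x0D\x20-\x7E]            # ASCII
       | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
       |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
       | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
       |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
       |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
       | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
       |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
    )*$/x ? 1 : 0;
}

# Sample inputs (illustrative only), each paired with a short label.
my @samples = (
    [ "caf\xC3\xA9",       "2-byte sequence for U+00E9" ],
    [ "\xC0\xAF",          "overlong encoding of '/'"   ],
    [ "\xED\xA0\x80",      "UTF-16 surrogate U+D800"    ],
    [ "\xF4\x90\x80\x80",  "beyond U+10FFFF"            ],
);

for my $sample (@samples) {
    my ($bytes, $label) = @$sample;
    printf "%-28s %s\n", $label, is_valid_utf8($bytes) ? "valid" : "invalid";
}
```

Only the first sample is reported as valid. The other three are rejected by the overlong, surrogate, and plane-16 restrictions respectively.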