The auto-linking code basically rewrote the whole string escaping non-ascii
characters in an inefficient way, and building a full character offset map
between the unescaped and escaped texts before sending the contents to
TwitterText's extractor.
Instead of doing that, this commit changes the TwitterText regexps to include
valid IRI characters in addition to valid URI characters.