Serendipity default event_s9ymarkup plugin breaking URLs that contain underscores
The default Serendipity mark-up plugin (event_s9ymarkup) currently breaks URLs that contain underscores.
So
http://en.wikipedia.org/wiki/Statler_%26_Waldorf
will end up as
http://en.wikipedia.org/wiki/Statler</u>%26_Waldorf
because of a faulty regex. Garvin Hicking does not really want to fix this (see this s9y support forum article for arguments for and against fixing it). So if you encounter this problem, your options are:
- replace _ in URLs with %5F (i.e. percent-encode it manually)
- remove the plugin or disable it
- patch the plugin
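For the first option, here is a minimal sketch of doing the replacement in PHP. The function name encode_underscores is made up for illustration and is not part of Serendipity. Note that PHP's own urlencode() and rawurlencode() leave underscores untouched, so a plain string replacement is used instead.

```php
<?php
// Hypothetical helper, not part of Serendipity or the plugin.
// Replaces every underscore in a URL with its percent-encoded form.
function encode_underscores($url)
{
    // urlencode()/rawurlencode() do not encode "_", so replace it explicitly.
    return str_replace('_', '%5F', $url);
}

echo encode_underscores('http://en.wikipedia.org/wiki/Statler_%26_Waldorf'), "\n";
// prints http://en.wikipedia.org/wiki/Statler%5F%26%5FWaldorf
```

The existing %26 is left alone because only the underscore character is replaced.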
Patching basically means changing the following line in
plugins/serendipity_event_s9ymarkup/serendipity_event_s9ymarkup.php:
$text = preg_replace('/\b_([\S ]+?)_\b/','<u>\1</u>',$text);
to
$text = preg_replace('/\ _([\S ]+?)_\ /',' <u>\1</u> ',$text);
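To see what the patched line does outside the plugin, it can be wrapped in a small standalone script. This is only a sketch: the wrapper function underline_patched and the sample strings are assumptions for illustration, while the pattern and replacement are copied from the patched line above.

```php
<?php
// Hypothetical wrapper around the patched preg_replace() call.
// The underscore pair only counts as markup when the whole _..._ span
// has a literal space on both sides.
function underline_patched($text)
{
    return preg_replace('/\ _([\S ]+?)_\ /', ' <u>\1</u> ', $text);
}

$samples = array(
    'This is _important_ text.',
    'See http://en.wikipedia.org/wiki/Statler_%26_Waldorf for details.',
    '_start_ of line',
);

foreach ($samples as $sample) {
    echo underline_patched($sample), "\n";
}
// prints:
// This is <u>important</u> text.
// See http://en.wikipedia.org/wiki/Statler_%26_Waldorf for details.
// _start_ of line
```

The last sample shows the trade-off: because a space is required on both sides, markup at the very start or end of the text is no longer converted.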
If you want to be able to write things like "Haha[lol]" (which I have no real use for ...), extend the "\ " with whatever else you would like to accept as a delimiter around underlined words besides blanks. Use only symbols that are not valid in URLs (so none of "$-_.+!*'(),", which are all valid in URLs according to RFC 1738).
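One possible way to extend the delimiters is sketched below, assuming square brackets should be accepted in addition to blanks. It uses lookarounds instead of the consuming "\ " so the delimiters do not have to be written back in the replacement. That is a different technique from the patch above, and the function name underline_delimited is hypothetical.

```php
<?php
// Hypothetical variant, not the plugin's code.
// Opening underscore must follow a space or "[", closing underscore must be
// followed by a space or "]". Neither bracket is in the RFC 1738 list of
// characters that may appear unencoded in URLs.
function underline_delimited($text)
{
    return preg_replace('/(?<=[ \[])_([\S ]+?)_(?=[ \]])/', '<u>\1</u>', $text);
}

echo underline_delimited('Haha[_lol_] and http://en.wikipedia.org/wiki/Statler_%26_Waldorf ok'), "\n";
// prints Haha[<u>lol</u>] and http://en.wikipedia.org/wiki/Statler_%26_Waldorf ok

echo underline_delimited('a _b_ _c_ d'), "\n";
// prints a <u>b</u> <u>c</u> d
```

Since lookarounds do not consume the delimiter, two marked words separated by a single space are both converted, which the space-consuming pattern would not do for the second word.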
You may also want to consider requiring two or more underscores ("__") instead of one ("_"), to make it more reliable to detect that you actually wanted underlined text.
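A rough sketch of the double-underscore idea, assuming exactly two underscores on each side and the same space delimiters as in the patch. The function name underline_double is hypothetical, and this changes the markup syntax that existing entries rely on.

```php
<?php
// Hypothetical variant, not the plugin's code.
// Only __...__ surrounded by spaces is treated as markup; single
// underscores are left as typed.
function underline_double($text)
{
    return preg_replace('/\ __([\S ]+?)__\ /', ' <u>\1</u> ', $text);
}

echo underline_double('A __marked__ word and a _single_ pair.'), "\n";
// prints A <u>marked</u> word and a _single_ pair.

echo underline_double('See http://example.org/some_file_name here.'), "\n";
// prints See http://example.org/some_file_name here.
```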
Comments
Adult Ühler on :
Nasty. A while back I spent some days writing what I think is a near-bulletproof string-to-URL function that can even turn Chinese characters into their Roman equivalents.