Tags: regex

Replacing MS Word’s quotation marks in VB.NET

Text pasted from Microsoft Word often contains “special” (read: non-ASCII) quotation marks and apostrophes, which can be quite troublesome. Here’s a simple way to convert them to “standard” (read: ASCII) quotation marks and apostrophes:

VB.NET Code:

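' Requires Imports System.Text.RegularExpressions at the top of the file.
myString = Regex.Replace(myString, "[\u201C\u201D]", """") ' curly double quotes
myString = Regex.Replace(myString, "[\u2018\u2019]", "'")  ' curly single quotes and apostrophes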

This doesn’t handle all of Word’s annoying special characters (en and em dashes and the ellipsis character, for example, are left untouched), but it should get you off on the right foot.

URL rewriting template engine

My employer is currently standing up a content management system (CMS). Migrating links from our existing site to the new CMS site is going to require, at least at first, a metric tonne of URL redirects. Since the majority of these will fall into a handful of categories, I began creating IIRF (Ionic ISAPI Rewrite Filter) URL rewriting rules that would, for instance, move a particular list of “Offices” from /offices/officename to http://newserver/offices/officename. (Note: These directives should also be compatible with Apache’s mod_rewrite, and even lighttpd’s url.rewrite.) Read More →
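The pattern-and-substitution idea behind such a rule can be sketched in JavaScript. This is only an illustration of the regex mapping, not IIRF syntax; the `officeRule` object and the `rewrite` function are hypothetical names, and the pattern is one assumption about what an “Offices” rule might look like.

```javascript
// Hypothetical rule: the pattern captures the office name, and the
// target reuses it via $1, mirroring the "Offices" example above.
const officeRule = {
  pattern: /^\/offices\/([^\/?#]+)\/?$/,
  target: "http://newserver/offices/$1",
};

// Returns the redirect target, or null when the rule does not apply.
function rewrite(path, rule) {
  return rule.pattern.test(path)
    ? path.replace(rule.pattern, rule.target)
    : null;
}
```

Calling `rewrite("/offices/accounting", officeRule)` yields the new-server URL for that office, while a path outside /offices/ falls through with `null` so another rule could handle it.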

Consuming newlines with the JavaScript regex engine

In most server-side languages (with an available regex engine), programmers are given a wonderful set of pattern modifiers. One such modifier for PCRE (Perl-compatible regular expressions) is the “s” modifier, which the PHP manual lists as PCRE_DOTALL. This modifier causes the “dot” character, which normally matches any character except a newline, to match newlines as well. This is especially useful if you are dealing with text files and your pattern match may span multiple lines of those files. Read More →
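The rest of the post is not shown here, so the following is only a sketch of two common ways to get the same effect in JavaScript: the `[\s\S]` character class, which works in any engine version, and the `s` (dotAll) flag added in ES2018. The sample string and variable names are assumptions made for illustration.

```javascript
// Hypothetical input whose interesting part spans two lines.
const text = "<p>first line\nsecond line</p>";

// By default "." stops at a newline, so this pattern finds no match.
const plainDot = /<p>(.*)<\/p>/.exec(text);

// Workaround: [\s\S] matches any character, newlines included.
const anyChar = /<p>([\s\S]*)<\/p>/.exec(text);

// ES2018 "s" (dotAll) flag: "." now matches newlines as well.
const dotAll = /<p>(.*)<\/p>/s.exec(text);
```

Here `plainDot` is `null`, while `anyChar[1]` and `dotAll[1]` both hold the full two-line content between the tags.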