June 2005 Archives

Regular expressions can be tricky.

Suppose you want to match C strings, which can include backslashed quotes. A newbie (or anybody like me) will write "([^"]|\\")+", which reads: match a quote, followed by one or more of either anything that is not a quote or a backslashed quote, and end with another quote.

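Unfortunately, this does not work. If you try to match it against, say, 'foo "bar\"zbr" ugh', it will match just '"bar\"'. The reason is that [^"] also matches the backslash, so the quote after it is taken as the closing quote.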

On the other hand, if you swap the alternatives and try "(\\"|[^"])+", which reads: a quote, followed by backslashed quotes or anything that is not a quote, and ending with a quote, it will work.
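The two patterns above can be compared side by side in Java. This is only a minimal sketch: the class name QuotedStringDemo, the constant names and the firstMatch helper are made up for illustration, and the test string is the one used in the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedStringDemo {
    // Regex "([^"]|\\")+" : the non-quote alternative is tried first.
    static final Pattern NON_QUOTE_FIRST = Pattern.compile("\"([^\"]|\\\\\")+\"");

    // Regex "(\\"|[^"])+" : the backslashed-quote alternative is tried first.
    static final Pattern ESCAPE_FIRST = Pattern.compile("\"(\\\\\"|[^\"])+\"");

    // Hypothetical helper: returns the first match of p in input, or null.
    static String firstMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The text: foo "bar\"zbr" ugh
        String input = "foo \"bar\\\"zbr\" ugh";

        System.out.println(firstMatch(NON_QUOTE_FIRST, input)); // prints "bar\"
        System.out.println(firstMatch(ESCAPE_FIRST, input));    // prints "bar\"zbr"
    }
}
```

With the first pattern the backslash is consumed by [^"], so the quote that follows it closes the match. With the second pattern the backslash and the quote are consumed together before [^"] gets a chance.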

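This can be explained by the guts of the Perl (and Java) regular expression engines, which try the alternatives from left to right and keep the first one that lets the match succeed... but I am not the best person to explain the details.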

(or, why JAVA regular expressions can really suck)

Yesterday a friend asked for help with regular expressions in Java. Regular expressions should be much the same on most systems, and I think the Java regexp engine is heavily based on the Perl one. So writing the regular expression itself is no big deal; the problem is embedding it in a string.

Let us see an example. Suppose you want to replace all occurrences of \" with a space. Something you would do in Perl as s/\\"/ /g.

Now, Java uses something like string.replaceAll(regexp, replacement). The regexp is enclosed in a string, so you need to escape special characters. The next step was string.replaceAll("\\\""," "), but it didn't work. Can you guess why?

When the Java parser looks at that string, it interprets the escapes first. So "\\\"" is in fact '\"', and this string, sent to the regexp engine, will match... '"'. Nice, wooo?

The solution is string.replaceAll("\\\\\""," "). Why? Let us do it all again. The Java parser turns "\\\\\"" into '\\"'. Sent to the regexp engine, that matches a backslash followed by a quote.
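To see the two levels of escaping in action, here is a small sketch that runs both calls on the same input. The class and method names are invented for this example. The third method uses String.replace, which takes its argument literally rather than as a regexp; it is not part of the original story, just an alternative for comparison.

```java
public class ReplaceEscapedQuoteDemo {
    // Java literal "\\\"" is the regexp \" , which matches only the quote.
    static String tooFewBackslashes(String s) {
        return s.replaceAll("\\\"", " ");
    }

    // Java literal "\\\\\"" is the regexp \\" , which matches backslash + quote.
    static String enoughBackslashes(String s) {
        return s.replaceAll("\\\\\"", " ");
    }

    // String.replace does no regexp parsing, so one level of escaping is enough.
    static String literalReplace(String s) {
        return s.replace("\\\"", " ");
    }

    public static void main(String[] args) {
        // The text: a\"b"c
        String input = "a\\\"b\"c";

        System.out.println(tooFewBackslashes(input)); // prints a\ b c
        System.out.println(enoughBackslashes(input)); // prints a b"c
        System.out.println(literalReplace(input));    // prints a b"c
    }
}
```

The first call replaces every quote and leaves the backslash behind, which is the wrong result described above. The other two replace only the backslash-quote pair.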

So, God bless Larry Wall for not using strings to write regular expressions.

Female Perversions

This movie (IMDB Link) is the strangest movie I have ever seen. Well, maybe not the strangest, but it was quite strange. The worst thing is that I am almost sure I did not understand the real idea of the film. But did I already tell you this was a strange film?

Oh, yeah, it includes sex, lesbians, and strange things. Also, it wasn't subtitled, and that made it harder to understand. So, it was strange.

About this Archive

This page is an archive of entries from June 2005 listed from newest to oldest.
