I'm working on doing some automated testing of a search interface. Unfortunately, this particular interface does some parsing via browser-side JavaScript to do sanity checking on the phrase to be searched. Nothing out of the ordinary here.
This code is C-style JavaScript, which works the same way in pretty much any language with the appropriate syntactic munging. Here's a brief snippet (simplified for discussion):
    while (i < str.length) {
        switch (str.charAt(i)) {
            case 'a':
            case 'A':
                // Find the end of the current word
                p = str.indexOf(' ', i + 1);
                if (p != -1) {
                    word = str.substring(i, p);
                    words[count] = word;
                    word_lc = word.toLowerCase();
                    if (word_lc == "and") {
                        tokens[count] = "operator";
                    } else {
                        tokens[count] = "word";
                    }
                    count++;
                    i = p + 1;
                } else {
                    // Last word in the string
                    word = str.substring(i, str.length);
                    words[count] = word;
                    tokens[count] = "word";
                    count++;
                    i = str.length;
                }
                break;
            case 'o':
            case 'O':
                // ...
            // ...
        }
    }

I've been using Perl for so long that it's rotted my brain. For example, I can't imagine programming without regexes anymore. If I had to write a tokenizer from scratch, I certainly wouldn't walk through the string character by character.
I don't have a spec on how this string is supposed to be tokenized, and I can't run JavaScript from my test scripts, so the best approach is to rewrite the damn thing in Perl. Since I'm not sure what the output is supposed to be, the safest approach is to make a faithful translation, not to optimize the algorithm. Here is a (simplified) chunk of my first pass in Perl:
    $_ = $str;
    while ($_ ne "") {
        if (s/^(and|or|not)\s//i) {
            push(@tokens, [$1, "operator"]);
        }
        ## ...
        else {
            ## Grab the next word or the last word
            s/^(\S+)\s// || s/^(\S+)$//;
            push(@tokens, [$1, "word"]);
        }
    }

The result? 300+ lines of hard-to-follow JavaScript are now translated into ~80 lines of denser Perl. The details of the tokenization algorithm are easier to spot because the code is working at a higher level.
This is the kind of argument that advocates of Perl (and other dynamic languages) refer to when talking about the productivity gains from "programming at a higher level". However, I don't think I've seen many solid examples of what "programming at a higher level" means.
Of course, the new approach can be faithfully reimplemented in JavaScript (or Python, or Tcl, or Smalltalk, or Lisp, or Java, or ...), but that's beside the point. The key here is walking away from a C-style mindset (where all you can do is walk a string character by character and keep track of data in parallel arrays) to a higher-level mindset (where you push the work into regexes and build up an array of lightweight data structures to construct the result).
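To make that concrete, here is a rough sketch of what the regex-driven version might look like back in JavaScript. It mirrors only the simplified Perl chunk above, so the operator list (and, or, not) is taken from that snippet and everything elided there is still elided here. The function name `tokenize` and the branch that skips stray leading whitespace are my own assumptions, not part of the original code.

```javascript
// Sketch only: a regex-based tokenizer mirroring the simplified Perl chunk.
// "tokenize" is a hypothetical name; the real code handles more cases.
function tokenize(str) {
  const tokens = [];            // array of [text, type] pairs, as in the Perl
  let rest = str;
  while (rest !== "") {
    let m;
    if ((m = /^(and|or|not)\s/i.exec(rest))) {
      // An operator must be followed by whitespace, as in the Perl regex
      tokens.push([m[1], "operator"]);
    } else if ((m = /^(\S+)(?:\s|$)/.exec(rest))) {
      // Grab the next word or the last word
      tokens.push([m[1], "word"]);
    } else {
      // Assumption: drop stray leading whitespace so the loop always advances
      m = /^\s+/.exec(rest);
    }
    rest = rest.slice(m[0].length);
  }
  return tokens;
}
```

Calling `tokenize("apples and Oranges not pears")` yields five pairs: three tagged "word" and two ("and", "not") tagged "operator". As in the snippets above, a trailing "and" with nothing after it comes out as a plain word.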
I don't know if you've seen Theory's work on this subject, but he's ported much of Perl's test suite to JavaScript.
Re:Testing in Javascript
ziggy on 2005-06-07T17:20:22
Yes, but that really doesn't apply here. I'm trying to test the back end, not the browser-side JavaScript. The browser munges a search phrase before submitting a search. I'm trying to submit searches and test outputs, and the only way to submit a search is with a munged search phrase.