Author Archives: Zach Cox

Simple Subversion Branching and Merging

Branching and merging in Subversion is a great way to work on large new features without disrupting mainline development on trunk.  However, it has a reputation for being so difficult that many developers never take advantage of it.  In this post I’ll show just how easy it really is thanks to some newer features in Subversion and Subclipse (a Subversion plug-in for Eclipse).

Java EE 6 and Scala

Last weekend while pondering the question “Is Scala ready for the enterprise?” I decided to write a simple Java EE 6 app entirely in Scala, without using any Java. I had three main reasons for doing this: one was just to see how easy/difficult it would be to write everything in Scala (it was easy).  Another was to document the process for others journeying down the same road (the entire project is on github).  Finally, I wanted to identify advantages of using Scala instead of Java that are specific to Java EE apps (I found several).

Word Counts Example in Ruby and Scala

A while ago I was asked, as a pre-interview task for another company, to write some code in any language that counted word frequencies in these newsgroup articles. I recently came across the Ruby and Scala scripts I wrote and thought it would be fun to post them.

First, here is the wordcounts.rb script. It assumes the newsgroup files have already been downloaded and is executed by running ruby wordcounts.rb.

#Change rootDir to the location of your newsgroup files
rootDir = "/home/zcox/dev/20_newsgroups"
raise rootDir + " does not exist" unless File.exist?(rootDir)

#Iterates over all files under rootDir, opens each one and passes it to the block
def files(rootDir)
  Dir.foreach(rootDir) do |dir|
    if dir != "." && dir != ".."
      puts "Processing " + dir
      Dir.foreach(rootDir + "/" + dir) do |file|
        if file != "." && file != ".."
          open(rootDir + "/" + dir + "/" + file) do |f|
            yield(f)
          end
        end
      end
    end
  end
end

t1 = Time.now
counts = Hash.new(0) #0 will be the default value for non-existent keys
#Reads each file, scans it for words and increments the count of each lowercased word
files(rootDir) do |file|
  file.read.scan(/\w+/) { |word| counts[word.downcase] += 1 }
end

puts "Writing counts in decreasing order"
open("counts-descreasing-ruby", "w") do |out|
  counts.sort { |a, b| b[1] <=> a[1] }.each { |pair| out << "#{pair[0]}\t#{pair[1]}\n" }
end

puts "Writing counts in alphabetical order"
open("counts-alphabetical-ruby", "w") do |out|
  counts.sort { |a, b| a[0] <=> b[0] }.each { |pair| out << "#{pair[0]}\t#{pair[1]}\n" }
end

t2 = Time.now
puts "Finished in " + (t2 - t1).to_s + " seconds"

The code just iterates over each file, splits each file into words and aggregates the word counts using a Hash object. It then sorts the Hash in two different ways and writes each to a separate file. I think Ruby makes this code very easy to read and Ruby’s core library support makes this task pretty simple to accomplish.
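
The counting and sorting steps can be tried without downloading the newsgroup files. Below is a minimal sketch that applies the same Hash-with-default-value approach to an in-memory string. The word_counts helper and the sample text are made up for illustration and are not part of wordcounts.rb.

```ruby
# Hypothetical helper (not in wordcounts.rb): counts case-insensitive word
# frequencies in a string, using a Hash whose default value is 0.
def word_counts(text)
  counts = Hash.new(0)
  text.scan(/\w+/) { |word| counts[word.downcase] += 1 }
  counts
end

# Sample text invented for this sketch.
counts = word_counts("The quick fox. The lazy dog; the end.")

# Two orderings, as in the script: by count (descending) and alphabetical.
# Ties in the count ordering are broken alphabetically here, which is a
# choice made for this sketch so the result is deterministic.
by_count = counts.sort_by { |word, n| [-n, word] }
by_word  = counts.sort_by { |word, _| word }

by_count.each { |word, n| puts "#{word}\t#{n}" }
```

Because the Hash defaults to 0, the += 1 works the first time a word is seen, with no explicit check for a missing key.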

Next, here is the wordcounts.scala version. Again it assumes the newsgroup articles are already downloaded and can be run by executing scala wordcounts.scala.

import java.io._

//Change rootDir to the location of your newsgroup files
val rootDir = new File("/home/zcox/dev/20_newsgroups")
if (!rootDir.exists) throw new IllegalArgumentException(rootDir + " does not exist")

/** Iterates over all files under rootDir, opens each one and passes it to the function */
def files(rootDir: File)(process: File => Unit): Unit = {
  for (dir <- rootDir.listFiles; if dir.isDirectory) {
    println("Processing " + dir)
    for (file <- dir.listFiles; if file.isFile) {
      process(file)
    }
  }
}

val t1 = System.currentTimeMillis
var counts = Map.empty[String, Int].withDefaultValue(0)
files(rootDir) { file => 
  //file is implicitly converted to a String by file2String below
  file.split("""\W+""").foreach { word => 
    val key = word.toLowerCase
    counts = counts.updated(key, counts(key) + 1)
  }
}

println("Writing counts in decreasing order")
write(counts, "counts-descreasing-scala") {_._2 > _._2}
println("Writing counts in alphabetical order")
write(counts, "counts-alphabetical-scala") {_._1 < _._1}

val t2 = System.currentTimeMillis
println("Finished in " + ((t2 - t1)/1000.0) + " seconds")

/** Writes the specified map to the specified file in tab-delimited format, sorted accordingly. */
def write[K, V](map: Map[K, V], file: String)(sort: (Tuple2[K, V], Tuple2[K, V]) => Boolean): Unit = {
  using (new PrintWriter(new FileWriter(file))) { out => 
    map.toList.sortWith(sort).foreach { pair => out.println(pair._1 + "\t" + pair._2) }
  }
}

/** Converts a File to a String. */
implicit def file2String(file: File): String = {
  val builder = new StringBuilder
  using (new BufferedReader(new FileReader(file))) { reader => 
    var line = reader.readLine
    while (line != null) {
      builder.append(line).append('\n')
      line = reader.readLine
    }
  }
  builder.toString
}

/** Performs some operation on the specified closeable object and then ensures it gets closed. */
def using[Closeable <: {def close(): Unit}, B](closeable: Closeable)(getB: Closeable => B): B = 
  try {
    getB(closeable)
  } finally {
    closeable.close()
  }

Notice that the Scala code is statically typed and compiled, yet it can be run as a simple script: Scala really does scale from small one-off scripts to large enterprise systems. It mirrors the Ruby script fairly closely, with the addition of some helper functions at the end to make up for some features that the core Scala library doesn’t provide. Overall I think it’s just as readable as the Ruby code, but it comes with static type-checking and runs on the JVM.

I have included some timing info as well. On my 2GHz Core2 Duo with 3GB RAM, the Ruby script averages about 60 seconds while the Scala script averages about 30 seconds. The Scala version uses immutable Map objects to store the word counts; thus a new Map object is created for each word. I switched it to a mutable Map (which required only minor code changes) and it dropped to about 15 seconds. This is hardly scientific, and I’m sure a lot of optimizations could be done on both scripts, but it shows the performance gains that can be achieved on the JVM using Scala.

If anyone can spot any improvements to make to these scripts, please share them with comments!

Strict Quote Escaping in Tomcat

I just started here at Source Allies (loving it here so far, btw!) and inherited an aging code base to resurrect.  It was originally deployed on Tomcat 5 and one of the issues I encountered upgrading to Tomcat 6 was strict quote escaping.  The code base has lots of JSPs with elements like this:

<some:tag title="<%=(String)request.getAttribute("title")%>">

Apparently this used to fly under the radar up until Tomcat 5.5.26, but Tomcat 5.5.27+ enforces the quoting requirements of the JSP spec.  Running this app with Tomcat 6 produced lots of exceptions like this one:

javax.servlet.jsp.JspException: ServletException in '/WEB-INF/content/admin/editUser.jsp': /WEB-INF/content/admin/editUser.jsp(6,23) Attribute value (String)request.getAttribute("title") is quoted with " which must be escaped when used within the value

Now, we all know that double-quotes within double-quotes is a no-no and should be fixed by either using single quotes to enclose the attribute value:

<some:tag title='<%=(String)request.getAttribute("title")%>'>

or by escaping the inner double-quotes:

<some:tag title="<%=(String)request.getAttribute(\"title\")%>">

However, in this case we just needed to get the app up and running quickly, so I found a quick, temporary workaround instead of fixing all of the improperly formatted quotes.  Setting org.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false in $TOMCAT_HOME/conf/ allows the double-quotes within double-quotes, and no more exceptions!