<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Source Allies Blog &#187; ruby</title>
	<atom:link href="http://blogs.sourceallies.com/tag/ruby/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.sourceallies.com</link>
	<description>Technical and process thinking from Source Allies employees</description>
	<lastBuildDate>Thu, 19 Aug 2010 18:35:29 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Word Counts Example in Ruby and Scala</title>
		<link>http://blogs.sourceallies.com/2009/12/word-counts-example-in-ruby-and-scala/</link>
		<comments>http://blogs.sourceallies.com/2009/12/word-counts-example-in-ruby-and-scala/#comments</comments>
		<pubDate>Tue, 08 Dec 2009 18:00:14 +0000</pubDate>
		<dc:creator>Zach  Cox</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[scala]]></category>

		<guid isPermaLink="false">http://blogs.sourceallies.com/?p=509</guid>
		<description><![CDATA[A while ago I was asked, as a pre-interview task for another company, to write some code in any language that counted word frequencies in these newsgroup articles.  I recently came across the Ruby and Scala scripts I wrote and thought it would be fun to post them.
First, here is the wordcounts.rb script.  [...]]]></description>
			<content:encoded><![CDATA[<p>A while ago I was asked, as a pre-interview task for another company, to write some code in any language that counted word frequencies in these <a href="http://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.html">newsgroup articles</a>.  I recently came across the <a href="http://www.ruby-lang.org/en/">Ruby</a> and <a href="http://www.scala-lang.org/">Scala</a> scripts I wrote and thought it would be fun to post them.</p>
<p>First, here is the wordcounts.rb script.  It assumes the newsgroup files have already been downloaded and is executed by running <code>ruby wordcounts.rb</code>.</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#008000; font-style:italic;">#Change rootDir to the location of your newsgroup files</span>
rootDir = <span style="color:#996600;">&quot;/home/zcox/dev/20_newsgroups&quot;</span>
<span style="color:#CC0066; font-weight:bold;">raise</span> rootDir <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#996600;">&quot; does not exist&quot;</span> <span style="color:#9966CC; font-weight:bold;">unless</span> <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">directory</span>? rootDir
&nbsp;
<span style="color:#008000; font-style:italic;">#Iterates over all files under rootDir, opens each one and passes it to the block</span>
<span style="color:#9966CC; font-weight:bold;">def</span> files<span style="color:#006600; font-weight:bold;">&#40;</span>rootDir<span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#CC00FF; font-weight:bold;">Dir</span>.<span style="color:#9900CC;">foreach</span><span style="color:#006600; font-weight:bold;">&#40;</span>rootDir<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>dir<span style="color:#006600; font-weight:bold;">|</span>
    <span style="color:#9966CC; font-weight:bold;">if</span> dir != <span style="color:#996600;">&quot;.&quot;</span> <span style="color:#006600; font-weight:bold;">&amp;&amp;</span> dir != <span style="color:#996600;">&quot;..&quot;</span>
      <span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;Processing &quot;</span> <span style="color:#006600; font-weight:bold;">+</span> dir
      <span style="color:#CC00FF; font-weight:bold;">Dir</span>.<span style="color:#9900CC;">foreach</span><span style="color:#006600; font-weight:bold;">&#40;</span>rootDir <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#996600;">&quot;/&quot;</span> <span style="color:#006600; font-weight:bold;">+</span> dir<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>file<span style="color:#006600; font-weight:bold;">|</span>
        <span style="color:#9966CC; font-weight:bold;">if</span> file != <span style="color:#996600;">&quot;.&quot;</span> <span style="color:#006600; font-weight:bold;">&amp;&amp;</span> file != <span style="color:#996600;">&quot;..&quot;</span>
          <span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span>rootDir <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#996600;">&quot;/&quot;</span> <span style="color:#006600; font-weight:bold;">+</span> dir <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#996600;">&quot;/&quot;</span> <span style="color:#006600; font-weight:bold;">+</span> file<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>f<span style="color:#006600; font-weight:bold;">|</span>
            <span style="color:#9966CC; font-weight:bold;">yield</span><span style="color:#006600; font-weight:bold;">&#40;</span>f<span style="color:#006600; font-weight:bold;">&#41;</span>
          <span style="color:#9966CC; font-weight:bold;">end</span>
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
t1 = <span style="color:#CC00FF; font-weight:bold;">Time</span>.<span style="color:#9900CC;">now</span>
counts = <span style="color:#CC00FF; font-weight:bold;">Hash</span>.<span style="color:#9900CC;">new</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#008000; font-style:italic;">#0 will be the default value for non-existent keys</span>
files<span style="color:#006600; font-weight:bold;">&#40;</span>rootDir<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>file<span style="color:#006600; font-weight:bold;">|</span>
  file.<span style="color:#9900CC;">read</span>.<span style="color:#9900CC;">scan</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">/</span>\w<span style="color:#006600; font-weight:bold;">+/</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>word<span style="color:#006600; font-weight:bold;">|</span> counts<span style="color:#006600; font-weight:bold;">&#91;</span>word.<span style="color:#9900CC;">downcase</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span> <span style="color:#006600; font-weight:bold;">&#125;</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;Writing counts in decreasing order&quot;</span>
<span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;counts-decreasing-ruby&quot;</span>, <span style="color:#996600;">&quot;w&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>out<span style="color:#006600; font-weight:bold;">|</span>
  counts.<span style="color:#9900CC;">sort</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>a, b<span style="color:#006600; font-weight:bold;">|</span> b<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">&lt;=&gt;</span> a<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">&#125;</span>.<span style="color:#9900CC;">each</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>pair<span style="color:#006600; font-weight:bold;">|</span> out <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#996600;">&quot;#{pair[0]}<span style="color:#000099;">\t</span>#{pair[1]}<span style="color:#000099;">\n</span>&quot;</span> <span style="color:#006600; font-weight:bold;">&#125;</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;Writing counts in alphabetical order&quot;</span>
<span style="color:#CC0066; font-weight:bold;">open</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;counts-alphabetical-ruby&quot;</span>, <span style="color:#996600;">&quot;w&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>out<span style="color:#006600; font-weight:bold;">|</span>
  counts.<span style="color:#9900CC;">sort</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>a, b<span style="color:#006600; font-weight:bold;">|</span> a<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">&lt;=&gt;</span> b<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">&#125;</span>.<span style="color:#9900CC;">each</span> <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#006600; font-weight:bold;">|</span>pair<span style="color:#006600; font-weight:bold;">|</span> out <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#996600;">&quot;#{pair[0]}<span style="color:#000099;">\t</span>#{pair[1]}<span style="color:#000099;">\n</span>&quot;</span> <span style="color:#006600; font-weight:bold;">&#125;</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
t2 = <span style="color:#CC00FF; font-weight:bold;">Time</span>.<span style="color:#9900CC;">now</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;Finished in &quot;</span> <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#006600; font-weight:bold;">&#40;</span>t2 <span style="color:#006600; font-weight:bold;">-</span> t1<span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">to_s</span> <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#996600;">&quot; seconds&quot;</span></pre></div></div>

<p>The code iterates over each file, extracts the words with a regular expression, lowercases them and aggregates the counts in a Hash whose default value is 0.  It then sorts the Hash two ways, by decreasing count and alphabetically by word, and writes each ordering to its own file.  I think Ruby makes this code very easy to read, and Ruby&#8217;s core library makes the task pretty simple to accomplish.</p>
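<p>As a minimal sketch of the same counting and sorting steps, here is the core of the script applied to two in-memory strings instead of the newsgroup files.  The sample text in <code>docs</code> and the alphabetical tie-breaking in <code>by_count</code> are assumptions added for illustration; they are not part of wordcounts.rb.</p>

```ruby
# Hypothetical sample text standing in for the newsgroup files.
docs = [
  "The quick brown fox",
  "the lazy dog and THE fox"
]

# Same idea as the script: 0 is the default value for unseen words.
counts = Hash.new(0)
docs.each do |text|
  text.scan(/\w+/) { |word| counts[word.downcase] += 1 }
end

# Highest count first; ties are broken alphabetically (an added assumption)
# so the output order is predictable.
by_count = counts.sort_by { |word, n| [-n, word] }

# Alphabetical by word.
by_word = counts.sort_by { |word, _n| word }

# Print the three most frequent words in the same tab-delimited format.
by_count.first(3).each { |word, n| puts "#{word}\t#{n}" }
```

<p>Here <code>sort_by</code> with a key stands in for the comparison blocks used in the script; either form should give the two orderings.</p>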
<p>Next, here is the wordcounts.scala version.  Again it assumes the newsgroup articles are already downloaded and can be run by executing <code>scala wordcounts.scala</code>.</p>

<div class="wp_syntax"><div class="code"><pre class="scala" style="font-family:monospace;"><span style="color: #008000; font-style: italic;">//Change rootDir to the location of your newsgroup files</span>
<span style="color: #0000ff; font-weight: bold;">import</span> java.<span style="color: #000000;">io</span>.<span style="color: #000080;">_</span>
<span style="color: #0000ff; font-weight: bold;">val</span> rootDir <span style="color: #000080;">=</span> <span style="color: #0000ff; font-weight: bold;">new</span> File<span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">&quot;/home/zcox/dev/20_newsgroups&quot;</span><span style="color: #F78811;">&#41;</span>
<span style="color: #0000ff; font-weight: bold;">if</span> <span style="color: #F78811;">&#40;</span><span style="color: #000080;">!</span>rootDir.<span style="color: #000000;">exists</span><span style="color: #F78811;">&#41;</span> <span style="color: #0000ff; font-weight: bold;">throw</span> <span style="color: #0000ff; font-weight: bold;">new</span> IllegalArgumentException<span style="color: #F78811;">&#40;</span>rootDir + <span style="color: #6666FF;">&quot; does not exist&quot;</span><span style="color: #F78811;">&#41;</span>
&nbsp;
<span style="color: #00ff00; font-style: italic;">/** Iterates over all files under rootDir, opens each one and passes it to the function */</span>
<span style="color: #0000ff; font-weight: bold;">def</span> files<span style="color: #F78811;">&#40;</span>rootDir<span style="color: #000080;">:</span> File<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#40;</span>process<span style="color: #000080;">:</span> File <span style="color: #000080;">=&gt;</span> Unit<span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span>
  <span style="color: #0000ff; font-weight: bold;">for</span> <span style="color: #F78811;">&#40;</span>dir <span style="color: #000080;">&lt;</span>- rootDir.<span style="color: #000000;">listFiles</span><span style="color: #000080;">;</span> <span style="color: #0000ff; font-weight: bold;">if</span> dir.<span style="color: #000000;">isDirectory</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span>
    println<span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">&quot;Processing &quot;</span> + dir<span style="color: #F78811;">&#41;</span>
    <span style="color: #0000ff; font-weight: bold;">for</span> <span style="color: #F78811;">&#40;</span>file <span style="color: #000080;">&lt;</span>- dir.<span style="color: #000000;">listFiles</span><span style="color: #000080;">;</span> <span style="color: #0000ff; font-weight: bold;">if</span> file.<span style="color: #000000;">isFile</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span>
      process<span style="color: #F78811;">&#40;</span>file<span style="color: #F78811;">&#41;</span>
    <span style="color: #F78811;">&#125;</span>
  <span style="color: #F78811;">&#125;</span>
<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #0000ff; font-weight: bold;">val</span> t1 <span style="color: #000080;">=</span> System.<span style="color: #000000;">currentTimeMillis</span>
<span style="color: #0000ff; font-weight: bold;">var</span> counts <span style="color: #000080;">=</span> Map.<span style="color: #000000;">empty</span><span style="color: #F78811;">&#91;</span>String, Int<span style="color: #F78811;">&#93;</span>.<span style="color: #000000;">withDefaultValue</span><span style="color: #F78811;">&#40;</span><span style="color: #F78811;">0</span><span style="color: #F78811;">&#41;</span>
files<span style="color: #F78811;">&#40;</span>rootDir<span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span> file <span style="color: #000080;">=&gt;</span> 
  file.<span style="color: #000000;">split</span><span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">&quot;&quot;</span><span style="color: #6666FF;">&quot;<span style="color: #0000ff; font-weight: bold;">\W</span>+&quot;</span><span style="color: #6666FF;">&quot;&quot;</span><span style="color: #F78811;">&#41;</span>.<span style="color: #000000;">foreach</span> <span style="color: #F78811;">&#123;</span> word <span style="color: #000080;">=&gt;</span> counts <span style="color: #000080;">=</span> counts<span style="color: #F78811;">&#40;</span>word.<span style="color: #000000;">toLowerCase</span><span style="color: #F78811;">&#41;</span> +<span style="color: #000080;">=</span> <span style="color: #F78811;">1</span> <span style="color: #F78811;">&#125;</span>
<span style="color: #F78811;">&#125;</span>
&nbsp;
println<span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">&quot;Writing counts in decreasing order&quot;</span><span style="color: #F78811;">&#41;</span>
write<span style="color: #F78811;">&#40;</span>counts, <span style="color: #6666FF;">&quot;counts-decreasing-scala&quot;</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span><span style="color: #000080;">_</span>.<span style="color: #000080;">_</span>2 <span style="color: #000080;">&gt;</span> <span style="color: #000080;">_</span>.<span style="color: #000080;">_</span>2<span style="color: #F78811;">&#125;</span>
&nbsp;
println<span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">&quot;Writing counts in alphabetical order&quot;</span><span style="color: #F78811;">&#41;</span>
write<span style="color: #F78811;">&#40;</span>counts, <span style="color: #6666FF;">&quot;counts-alphabetical-scala&quot;</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span><span style="color: #000080;">_</span>.<span style="color: #000080;">_</span>1 <span style="color: #000080;">&lt;</span> <span style="color: #000080;">_</span>.<span style="color: #000080;">_</span>1<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #0000ff; font-weight: bold;">val</span> t2 <span style="color: #000080;">=</span> System.<span style="color: #000000;">currentTimeMillis</span>
println<span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">&quot;Finished in &quot;</span> + <span style="color: #F78811;">&#40;</span><span style="color: #F78811;">&#40;</span>t2 - t1<span style="color: #F78811;">&#41;</span>/<span style="color: #F78811;">1000.0</span><span style="color: #F78811;">&#41;</span> + <span style="color: #6666FF;">&quot; seconds&quot;</span><span style="color: #F78811;">&#41;</span><span style="color: #000080;">;</span>
&nbsp;
<span style="color: #00ff00; font-style: italic;">/** Writes the specified map to the specified file in tab-delimited format, sorted accordingly. */</span>
<span style="color: #0000ff; font-weight: bold;">def</span> write<span style="color: #F78811;">&#91;</span>K, V<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>map<span style="color: #000080;">:</span> Map<span style="color: #F78811;">&#91;</span>K, V<span style="color: #F78811;">&#93;</span>, file<span style="color: #000080;">:</span> String<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#40;</span>sort<span style="color: #000080;">:</span> <span style="color: #F78811;">&#40;</span>Tuple2<span style="color: #F78811;">&#91;</span>K, V<span style="color: #F78811;">&#93;</span>, Tuple2<span style="color: #F78811;">&#91;</span>K, V<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#41;</span> <span style="color: #000080;">=&gt;</span> Boolean<span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span>
  using <span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">new</span> PrintWriter<span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">new</span> FileWriter<span style="color: #F78811;">&#40;</span>file<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span> out <span style="color: #000080;">=&gt;</span> 
    map.<span style="color: #000000;">toList</span>.<span style="color: #000000;">sort</span><span style="color: #F78811;">&#40;</span>sort<span style="color: #F78811;">&#41;</span>.<span style="color: #000000;">foreach</span> <span style="color: #F78811;">&#123;</span> pair <span style="color: #000080;">=&gt;</span> out.<span style="color: #000000;">println</span><span style="color: #F78811;">&#40;</span>pair.<span style="color: #000080;">_</span>1 + <span style="color: #6666FF;">&quot;<span style="color: #0000ff; font-weight: bold;">\t</span>&quot;</span> + pair.<span style="color: #000080;">_</span>2<span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#125;</span>
  <span style="color: #F78811;">&#125;</span>
<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #00ff00; font-style: italic;">/** Converts a File to a String. */</span>
<span style="color: #0000ff; font-weight: bold;">implicit</span> <span style="color: #0000ff; font-weight: bold;">def</span> file2String<span style="color: #F78811;">&#40;</span>file<span style="color: #000080;">:</span> File<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span> String <span style="color: #000080;">=</span> <span style="color: #F78811;">&#123;</span>
  <span style="color: #0000ff; font-weight: bold;">val</span> builder <span style="color: #000080;">=</span> <span style="color: #0000ff; font-weight: bold;">new</span> StringBuilder
  using <span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">new</span> BufferedReader<span style="color: #F78811;">&#40;</span><span style="color: #0000ff; font-weight: bold;">new</span> FileReader<span style="color: #F78811;">&#40;</span>file<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span> reader <span style="color: #000080;">=&gt;</span> 
    <span style="color: #0000ff; font-weight: bold;">var</span> line <span style="color: #000080;">=</span> reader.<span style="color: #000000;">readLine</span>
    <span style="color: #0000ff; font-weight: bold;">while</span> <span style="color: #F78811;">&#40;</span>line <span style="color: #000080;">!=</span> <span style="color: #0000ff; font-weight: bold;">null</span><span style="color: #F78811;">&#41;</span> <span style="color: #F78811;">&#123;</span>
      builder.<span style="color: #000000;">append</span><span style="color: #F78811;">&#40;</span>line<span style="color: #F78811;">&#41;</span>.<span style="color: #000000;">append</span><span style="color: #F78811;">&#40;</span><span style="color: #6666FF;">'<span style="color: #0000ff; font-weight: bold;">\n</span>'</span><span style="color: #F78811;">&#41;</span>
      line <span style="color: #000080;">=</span> reader.<span style="color: #000000;">readLine</span>
    <span style="color: #F78811;">&#125;</span>
  <span style="color: #F78811;">&#125;</span>
  builder.<span style="color: #000000;">toString</span>
<span style="color: #F78811;">&#125;</span>
&nbsp;
<span style="color: #00ff00; font-style: italic;">/** Performs some operation on the specified closeable object and then ensures it gets closed. */</span>
<span style="color: #0000ff; font-weight: bold;">def</span> using<span style="color: #F78811;">&#91;</span>Closeable <span style="color: #000080;">&lt;:</span> <span style="color: #F78811;">&#123;</span><span style="color: #0000ff; font-weight: bold;">def</span> close<span style="color: #F78811;">&#40;</span><span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span> Unit<span style="color: #F78811;">&#125;</span>, B<span style="color: #F78811;">&#93;</span><span style="color: #F78811;">&#40;</span>closeable<span style="color: #000080;">:</span> Closeable<span style="color: #F78811;">&#41;</span><span style="color: #F78811;">&#40;</span>getB<span style="color: #000080;">:</span> Closeable <span style="color: #000080;">=&gt;</span> B<span style="color: #F78811;">&#41;</span><span style="color: #000080;">:</span> B <span style="color: #000080;">=</span> 
  <span style="color: #0000ff; font-weight: bold;">try</span> <span style="color: #F78811;">&#123;</span>
    getB<span style="color: #F78811;">&#40;</span>closeable<span style="color: #F78811;">&#41;</span>
  <span style="color: #F78811;">&#125;</span> <span style="color: #0000ff; font-weight: bold;">finally</span> <span style="color: #F78811;">&#123;</span>
    closeable.<span style="color: #000000;">close</span><span style="color: #F78811;">&#40;</span><span style="color: #F78811;">&#41;</span>
  <span style="color: #F78811;">&#125;</span></pre></div></div>

<p>Notice that Scala is statically typed and compiled, yet this code can be run as a simple script &#8211; Scala really does scale from small one-off scripts to large enterprise systems.  The script mirrors the Ruby one fairly closely, with the addition of some helper functions at the end (<code>write</code>, <code>file2String</code> and <code>using</code>) to make up for features that the core Scala library doesn&#8217;t provide.  Overall I think it&#8217;s just as readable as the Ruby code, but it comes with static type-checking and runs on the JVM.</p>
<p>I have included some timing info as well.  On my 2 GHz Core2 Duo with 3 GB RAM, the Ruby script averages about 60 seconds while the Scala script averages about 30 seconds.  The Scala version uses immutable Map objects to store the word counts; thus a new Map object is created for each word.  I switched it to a mutable Map (which required only minor code changes) and it dropped to about 15 seconds.  This is hardly scientific, and I&#8217;m sure a lot of optimizations could be done on both scripts, but it shows the performance gains that can be achieved on the JVM using Scala.</p>
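<p>To get a feel for why rebuilding the map on every word is costly, here is a rough Ruby analogy, not the Scala change itself.  The input size and the helper names <code>count_by_copying</code> and <code>count_in_place</code> are made up for illustration.  Note that <code>Hash#merge</code> copies the whole Hash each time, whereas an immutable Scala Map may share structure between versions, so this sketch probably overstates the gap.</p>

```ruby
# Hypothetical input: 5,000 tokens drawn from 500 distinct words.
words = Array.new(5_000) { |i| "word#{i % 500}" }

# Immutable style: build a brand-new Hash for every word instead of
# modifying the existing one.
def count_by_copying(words)
  words.reduce({}) do |counts, word|
    counts.merge(word => counts.fetch(word, 0) + 1)
  end
end

# Mutable style: update a single Hash in place.
def count_in_place(words)
  counts = Hash.new(0)
  words.each { |word| counts[word] += 1 }
  counts
end

# Time both approaches the same way the script does, with Time.now.
t1 = Time.now
copied = count_by_copying(words)
t2 = Time.now
in_place = count_in_place(words)
t3 = Time.now

puts "copying: #{t2 - t1} seconds, in place: #{t3 - t2} seconds"
```

<p>Both functions produce the same counts; only the amount of allocation differs, and the timings will vary by machine and Ruby version.</p>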
<p>If anyone can spot any improvements to make to these scripts, please share them in the comments!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.sourceallies.com/2009/12/word-counts-example-in-ruby-and-scala/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>
