Archive for October, 2009

Simile Timeline

October 31st, 2009

An interesting find by Matt and some coaxing led me to implement the MIT Simile Timeline project, found at https://simile.mit.edu/timeline/. The website also provides an API for the few functions Timeline has.  I use Timeline in my actitime-rc project to track when emails are being sent.  This is nice for people who just want a summary of the information being shared in the project. To populate the timeline I created a simple servlet which grabs information from the database and builds the JSON string that Timeline loads.
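
The servlet itself is not shown here, so the following is only a rough sketch of the kind of JSON it might produce. It is written in Python for brevity rather than as a Java servlet, and the field names (dateTimeFormat, events, start, title, description) are my assumption about Timeline's JSON event source layout; check them against the Timeline documentation. The function name and sample row are hypothetical.

```python
import json
from datetime import datetime


def build_timeline_json(emails):
    """Serialize (sent_at, subject, summary) tuples into a Timeline-style JSON string."""
    events = [
        {
            "start": sent_at.isoformat(),  # ISO 8601 timestamp of when the email was sent
            "title": subject,
            "description": summary,
        }
        for sent_at, subject, summary in emails
    ]
    # Assumed layout: a date format hint plus a flat list of events.
    return json.dumps({"dateTimeFormat": "iso8601", "events": events})


# Hypothetical row, standing in for what the database query would return.
sample = [(datetime(2009, 10, 30, 14, 5), "Build status", "Nightly build passed")]
payload = build_timeline_json(sample)
```

In a real deployment the list of tuples would come from the database query, and the string would be written to the HTTP response with a JSON content type.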

Spring LDAP Group Authorization Tip

October 30th, 2009

The folks at Spring have made it extremely easy to allow your application to authenticate and authorize users with Spring LDAP. This blog entry explains how to check your directory structure and use some sparsely documented Spring LDAP parameters ({0} and {1}) to get everything working.

In your Spring Security configuration, pointing to your directory is straightforward:

 <ldap-server id="ldapServer" url="ldap://dir.yourdomain.com:389/" />

But in configuring the ldap-authentication-provider, you need to know a few things about your directory of course! We recommend using Apache Directory Studio to browse your directory – it’s a fantastic application.

If you’re more of a command-line person, just use ldapsearch (example below):

ldapsearch -H ldap://dir.yourdomain.com:389 -ZZ -x \
  -D "cn=AdminUser,dc=yourdomain,dc=com" -W -b "cn=users,ou=groups,dc=yourdomain,dc=com" \
  -s base -a always "(objectClass=*)" "*"

Once connected to your directory, you’ll need to figure out how your groups are configured. Specifically, you’ll want to know if your configuration looks like

Example A:

  • dc=yourdomain,dc=com
    • ou=groups
      • cn=users
        • memberUid=USERNAME

or Example B:

  • dc=yourdomain,dc=com
    • ou=groups
      • cn=users
        • memberUid=uid=USERNAME,ou=people,dc=yourdomain,dc=com

If it’s like Example A, you’ll want your config like this:

<ldap-authentication-provider server-ref="ldapServer"  
	user-search-base="ou=people,dc=yourdomain,dc=com" 
	user-search-filter="(uid={0})"
	group-role-attribute="cn"
	group-search-base="ou=groups,dc=yourdomain,dc=com"
	group-search-filter="(memberUid={1})"
	role-prefix="ROLE_" />

otherwise, you’ll want this config:

<ldap-authentication-provider server-ref="ldapServer"  
	user-search-base="ou=people,dc=yourdomain,dc=com" 
	user-search-filter="(uid={0})"
	group-role-attribute="cn"
	group-search-base="ou=groups,dc=yourdomain,dc=com"
	group-search-filter="(memberUid={0})"
	role-prefix="ROLE_" />

Note the difference in the group-search-filter:

  • {0} contains the user's full distinguished name (the username with the entire LDAP base).
  • {1} contains only the username.
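
To make the difference concrete, here is a small Python sketch that mimics the placeholder substitution. Spring performs the substitution itself; this only illustrates what the resulting filter looks like in each case. The function name and the user jsmith are hypothetical.

```python
def resolve_group_filter(filter_template, user_dn, username):
    """Fill {0} with the user's full DN and {1} with the bare username."""
    return filter_template.format(user_dn, username)


# Hypothetical user, located under the user-search-base from the configs above.
user_dn = "uid=jsmith,ou=people,dc=yourdomain,dc=com"

# Example A: groups list members by plain username.
example_a = resolve_group_filter("(memberUid={1})", user_dn, "jsmith")

# Example B: groups list members by full DN.
example_b = resolve_group_filter("(memberUid={0})", user_dn, "jsmith")
```

If group lookups come back empty, comparing the resolved filter against the actual memberUid values in your directory is usually the quickest way to tell which placeholder you need.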

COBOL.NET and Cuff links

October 30th, 2009

I recently sat through several vendor demos for a client who is in the market for a health care claims administration platform. These systems are large ticket items and the vendors ranged from large and well entrenched vendors with years of experience in the market to small and relatively new vendors with fewer than ten clients.

Several of the key decision makers for the client were very focused on whether the vendors invited to participate had chosen to make their presentation in person or via the web. While this seemed to make little difference to the product overviews once each was under way, it was discussed several times throughout the two-day period during which the demos took place. Each vendor was made aware of the client’s view and made an informed decision to travel or to demo via the web and phone.

Several vendors traveled to Des Moines for their presentation and at least one chose to present exclusively via the web. One well regarded vendor with significant market share chose to send sales staff to present in person. The two salesmen from Vendor ‘X’ arrived to make their presentation in well-tailored suits with ties. The lead salesman was wearing a shirt with French cuffs and cuff links.  So – from the client perspective – these individuals invested the proper time and preparation for their presentation.  So far, so good!

The Vendor X presentation began with the normal pleasantries and they passed out color copies of the PowerPoint slide deck. As the salesmen walked through their presentation they connected a technical resource via WebEx who spoke in greater detail about the different business features and advantages of their system.  To me it was apparent that the salesmen in the room were involved in the presentation but weren’t in fact doing the demonstration.  Maybe that is too fine a distinction to make and I’m not even sure that all present that day would agree with me.  The two guys in suits made a fine impression regardless of their actual function or purpose.

About one hour into the two-hour presentation/demonstration I stepped away for 5 minutes.  When I returned to the room they were covering the underlying technology of one of their core modules. The salesmen were talking enthusiastically about the virtues of COBOL.NET. Wow! I did a double-take, and rather than interrupt and expose my obvious ignorance I quickly Googled COBOL.NET to confirm that the language actually existed. Sure enough, it’s a Micro Focus product.

I know my client was satisfied that the various presentations given were a success and that there are several vendors with products that we want to further investigate. That investigation will include site visits to out-of-state technology vendors. We need to ensure that they aren’t two guys working in a garage.  (hmmm….)

If we (Source Allies) can take anything away from this it is that customer perceptions are very important and that some non-technical customers may have some very specific requirements that have nothing to do with technology. Seller beware!

After the demos wrapped up, the client’s staff and I compared notes internally.  I did not bother to explain to the client that ‘two men in a garage’ is a reference to the founders of Hewlett Packard, a really large company by any standard today.  I also didn’t bother to give them my deep thoughts on COBOL.NET, although we covered that briefly.  As for the suits and cuff links, well, quite frankly I like wearing cuff links.

Functionality, flexibility and a clean user interface seemed to clearly carry the most weight in the analysis of the applications.  People seemed to appreciate a good design.  I am sure many nontechnical people can’t describe a well designed system but actually do know one when they see one.

I don’t know how many silent points were gained or lost by vendors for choosing to arrive in person or to demo via the web from out of state. All vendors did a good job of covering their respective applications.  I’d hate to think that a vendor would lose because they demo’ed a system online while a less effective application would fare better because a person arrived to hand out PowerPoint slides they had ‘thoughtfully’ printed and assembled in advance.  That doesn’t really seem like the wave of the future.  Some vendors traveled 6 hours in a car each way to make their presentation.  Another vendor had staff travel farther, by commercial air carrier.  Wonder who pays for those expenses in the end?

If I get to leave you with a couple of thoughts, I’d like them to be that we should always try to understand what our clients’ hot buttons are. While those issues or items may not make sense to us, we should make note of them and in some manner take them into consideration when working with that client.

Can you imagine having a great product from both a functionality and a technology perspective and ultimately losing the sale to a larger competitor with a product written in COBOL.NET? Think of the time we’d probably spend analyzing what went wrong.  I’d rather not find myself in that situation – but I’m certain those guys in cuff links are going to close another deal sooner or later and COBOL.NET will not slow them down too much with some of their prospective clients.

Using Conga Web Configuration with Red Hat Cluster Suite – Part 1

October 30th, 2009

Overview

Red Hat Cluster Suite provides high availability and clustered storage for RHEL platforms.  Unfortunately, the configuration management for each node can be tedious, as the /etc/cluster/cluster.conf file must be copied to each node as changes are made.  Conga makes life a little simpler.

http://sourceware.org/cluster/conga/

Conga provides a client/server architecture for cluster management with the ricci and luci services.  Luci acts as the configuration interface and sends instructions to the ricci client on each server.  Ricci takes instructions from luci and updates cluster.conf as necessary.

Package Installation and Configuration

Install the Cluster Suite with the following package group installation commands:

yum groupinstall "Clustering"
yum groupinstall "Cluster Storage"

Once installed, luci must be initialized on one node.  The initialization script will ask for a password for the default admin account.

luci_admin init
/etc/init.d/luci restart

Next start ricci on all nodes that will be joined to the cluster.

/etc/init.d/ricci start

Log In to the Web Interface

Use the URL provided by the luci_admin script to log in to the web interface.

Luci Login

Coming in Part 2

In the next post I’ll go over the basics of initializing a cluster via Conga.  Seeing as luci and ricci do occasionally get in a fight I will provide some tips on dealing with these situations also.

Strict Quote Escaping in Tomcat

October 30th, 2009

I just started here at Source Allies (loving it here so far, btw!) and inherited an aging code base to resurrect.  It was originally deployed on Tomcat 5 and one of the issues I encountered upgrading to Tomcat 6 was strict quote escaping.  The code base has lots of JSPs with elements like this:

<some:tag title="<%=(String)request.getAttribute("title")%>">

Apparently this used to fly under the radar up until Tomcat 5.5.26, but Tomcat 5.5.27+ enforces the quoting requirements of the JSP spec.  Running this app with Tomcat 6 produced lots of exceptions like this one:

javax.servlet.jsp.JspException: ServletException in '/WEB-INF/content/admin/editUser.jsp': /WEB-INF/content/admin/editUser.jsp(6,23) Attribute value (String)request.getAttribute("title") is quoted with " which must be escaped when used within the value

Now, we all know that double-quotes within double-quotes is a no-no and should be fixed by either using single quotes to enclose the attribute value:

<some:tag title='<%=(String)request.getAttribute("title")%>'>

or by escaping the inner double-quotes:

<some:tag title="<%=(String)request.getAttribute(\"title\")%>">

However, in this case we just needed to get the app up & running quickly, so I found a quick, temporary workaround instead of fixing all of the improperly formatted quotes.  Setting org.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false in $TOMCAT_HOME/conf/catalina.properties allows the double-quotes within double-quotes, and no more exceptions!
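
For the eventual cleanup, the single-quote fix described above could probably be scripted. This is only a sketch in Python and not something I ran against the real code base: it assumes each offending attribute value is exactly one <%= ... %> expression, and it leaves alone any expression that already contains a single quote. The function name is hypothetical.

```python
import re

# Matches ="<%= ... %>" where the whole attribute value is one JSP expression.
_EXPR_ATTR = re.compile(r'="(<%=.*?%>)"', re.DOTALL)


def requote_jsp_attributes(jsp_source):
    """Switch the outer quotes to single quotes when the expression uses double quotes."""
    def fix(match):
        expr = match.group(1)
        # Only rewrite when it is safe: inner double quotes and no inner single quotes.
        if '"' in expr and "'" not in expr:
            return "='" + expr + "'"
        return match.group(0)

    return _EXPR_ATTR.sub(fix, jsp_source)


before = '<some:tag title="<%=(String)request.getAttribute("title")%>">'
after = requote_jsp_attributes(before)
```

Review the resulting diff by hand before committing; a regular expression is not a JSP parser and will miss unusual cases.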

Open Source Router, Proprietary Cake

October 30th, 2009

Keeping with SAI’s proclivity toward open source software, I present to you Vyatta.  Vyatta is a small company with the goal of taking down Cisco by offering an open source router that can run on standard x86 hardware.  With the prevalence of virtualization, one could realistically open a branch office using just a single x86 server with a T1 card from Vyatta.  The router, firewall, and VPN are covered by Vyatta and the apps could run in a virtualized OS.

Better yet is their current sales promotion.  If Cisco’s gross profit margin is 70%, Vyatta will give you a 30% discount.  As Cisco makes less money, Vyatta gets cheaper.

Lastly, proprietary cake tastes good.  I can prove it, too.

Mark is really excited with the Router Cake at his wedding

Strive for Employee Motivation… or Prevent Demotivation?

October 29th, 2009

Every individual has experienced the kind of day in which they struggle to be motivated.  Think back to when you last experienced a day like this.  Did you have a hard time getting up out of bed?  Take a little longer to get to work?  Go out of your normal route to work to pick up a bagel or coffee?  On these kinds of days, you search for an extra dose of motivation to “get you going” for the day ahead and tasks that are anxiously awaiting your attention.

Employers constantly search for ways to keep their employees motivated.  But are they doing the right thing?  In this article John Roulet makes a very valid argument that employers are not taking the right approach.  By human nature most individuals have a natural born sense of motivation.  When starting a new job, individuals are ready to dive in and learn, learn, learn in order to ramp up as fast as possible.  But certain circumstances can easily deflate an employee to where they are demotivated to work.

As an employer, we have many tools that can help prevent demotivation from occurring.  How the tools are used and applied is what makes them a success or failure.

At Source Allies we are constantly implementing and reviewing different programs and policies, and planning events, to prevent demotivation from setting in.  Our employee surveys show that we have been very successful at this.  Success takes leadership that agrees to give all that it takes to achieve this goal and team members who are willing to step up to help identify and implement positive changes.

I’m proud to work on a team of employees that regularly contributes to every aspect of the business.  Team members participate by assisting with planning company initiatives and goals.  Goals aren’t just identified; they are continuously updated throughout the year.  Meetings are held quarterly to review the progress towards achieving our goals.  In addition, our team meets weekly for technical presentations or just for a social gathering, which is just as important for sustaining a positive, motivated workforce.

Next Monday night we will be presenting the “Source Allies Extra Mile Award” to a very deserving team member.  This award is presented quarterly (or more or less frequently, based upon the nominations made) to a peer-nominated individual who, as the award’s name implies, has gone above and beyond the normal call of duty.  The award winner is chosen by a group of team members in order to have as many people involved in the process as possible.  Stay tuned to hear additional details following the award presentation.

In closing, during my research for this blog post, I came across another idea that we may have to discuss in a future Monday night meeting: an INSTANT rewards motivational program.  Follow the link to watch a video with additional details.  Maybe we should think about adding this to our future open source product offerings??

Now, all joking aside, it’s YOUR turn – what prevents demotivation for you in the workforce?

Musings of a SpringOne 2009 Attendee – Day 3

October 26th, 2009

Agile Architecture – Technologies and Patterns – Kirk Knoernschild

Some of the questions this session set out to answer were:

  1. What is architecture?
  2. What defines architecture?
  3. What are architectural decisions?
  4. Is architecture a forward only decision?

Several definitions of architecture were quoted from prior literature, such as architecture being the shared understanding of the system being built: a shared understanding between groups of people who need to communicate about it, such as developers and architects, or technical staff and management.
Lean principles are you delay » Read more: Musings of a SpringOne 2009 Attendee – Day 3

Nutch and Solr for Open source “Google-like” search??

October 23rd, 2009

This is a follow-up blog to Matt’s earlier post on Open Source Enterprise Search

We all love Google, don’t we? From searching the web or the company intranet to searching internal source code, we just “google” everything. Now, wouldn’t it be more fun to do for yourself what Google does for you? And perhaps even change how Google does a few things for you? And learn how Google does it? All that without having to spend $10k+ on appliances? If this sounds exciting, read on, and welcome to open source search using Lucene, OpenGrok, Nutch and Solr.

A quick introduction to what these are:

Apache Lucene: An open source full-text information retrieval (IR) library written in Java.

Luke: Diagnostic tool to access, display and modify existing Lucene indexes.

Nutch: Open source Java implemented search engine built on top of Lucene.

Solr: Lucene-based search server which scales better than Nutch for Enterprise level usage.

Our basic goal is to develop a full text search engine. Full text search refers to a technique for searching a computer-stored document or database wherein the search engine examines all of the words in every stored document as it tries to match search words supplied by the user.

Steps in Full text search:

  • Indexing: Scan the text of all the documents and build a list of search terms, often called an index, but more correctly named a concordance or an “inverted index”.

  • Search: Perform a specific query, referring only to the index rather than the text of the original documents.

One way to understand an inverted index is as a map from each term to the list of documents that contain it: <term -> Documents[]>.
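
As a toy illustration of the two steps, here is a minimal inverted index in Python. It is nowhere near what Lucene actually does (no analysis, ranking or on-disk storage); it only shows the term-to-documents mapping and a search that never touches the original text. The function names and sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Indexing step: map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Search step: return ids of documents containing every query term, using only the index."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "open source search", 2: "enterprise search server", 3: "open source router"}
index = build_inverted_index(docs)
```

Searching for "open source" returns documents 1 and 3, and the lookup cost depends on the number of query terms rather than on the size of the documents.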

More about Lucene:

Lucene is just the core of a search engine. As such, it does not include things like a web spider or parsers for different document formats. Instead these things need to be added by a developer who uses Lucene.

Lucene does not care about the source of the data, its format, or even its language, as long as you can convert it to text. This means you can use Lucene to index and search data stored in files: web pages on remote web servers, documents stored in local file systems, simple text files, Microsoft Word documents, HTML or PDF files, or any other format from which you can extract textual information.

Lucene supports:

  • Incremental indexing.
  • Ranked searching.
  • Wild card and proximity queries.
  • Search by field (e.g., title, author, contents).
  • Search by date range.
  • Simultaneous update and search of index.
  • Implementations available in other languages with the index being compatible!

Read more about the features and configuration details of Nutch and Solr in their specific blogs:

Nutch – http://blogs.sourceallies.com/2009/10/nutch-features-and-configuration-details/

Solr – http://blogs.sourceallies.com/2009/10/solr-%E2%80%93-features-and-configuration-details/

Nutch Vs. Solr

The Nutch crawler is ideal for crawling unstructured data such as PDF, Word documents and HTML. It also has a feature-rich crawler (filters, authentication, not HTTP based like Solr). On the other hand, Solr is better for indexing structured data such as XML and databases, and it also scales better for enterprise-level search.

Now the question is – Solr or Nutch?

The solution:

  • Use Nutch for indexing unstructured data.

  • Use Solr for databases and structured data.

  • Integrate both the indexes and use Solr to serve search results.

We will be (or at least I am) looking forward to a future when these open source tools make it to a stage where people say “Just Solr it” instead of “Just Google it”. I believe the release of Solr 1.4 after the release of Lucene 2.9.1 will bring in many newer and better features and improve the popularity of open source search engines.

Solr – features and configuration details

October 23rd, 2009

Solr is a standalone enterprise search server with a web-services like API. You put documents in it (called “indexing”) via XML over HTTP. You query it via HTTP GET and receive XML results. Some of the main features of Solr are:

  • Advanced Full-Text Search Capabilities

  • Facilitates Faceted browsing : Narrowing down Search results by category (e.g., Manufacturer, price or author)

  • Optimized for High Volume Web Traffic

  • Standards Based Open Interfaces – XML and HTTP

  • Comprehensive HTML Administration Interfaces

  • Server statistics exposed over JMX for monitoring

  • Scalability – Efficient Replication to other Solr Search Servers

  • Flexible and Adaptable with XML configuration

  • Extensible Plugin Architecture.

Solr uses the Lucene search library and extends it! More details about the features of Solr can be found here.

Indexing databases using Solr:

Step 1: Download Solr and set the directory where it is extracted as solr.solr.home

Step 2: Deploy solr.war in the application server.

Step 3: Indexing

In Solr, indexing and searching are initiated by sending HTTP requests to the web application. It uses the Data Import Handler for handling HTTP requests related to database indexing.

Two commands:

Full-import: To do a full import from the database and add to Solr index

Delta-import: To do a delta import (get new inserts/updates)

  • Create a db-data-config.xml and specify the location of this file in solrconfig.xml under DataImportHandler section

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">/home/username/db-data-config.xml</str>
      </lst>
    </requestHandler>

  • Mention connection info in db-data-config.xml:

    <dataSource name="ds1" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://host:port/dbname" user="db_username" password="db_password"/>

  • Open the DataImportHandler page to verify that everything is in order: http://host:port/solr/dataimport

  • Write full-import query in db-data-config.xml.

  • Write delta-import query in db-data-config.xml.

  • Use http://localhost:8983/solr/db/dataimport?command=full-import to do a full-import.

  • Use http://localhost:8983/solr/db/dataimport?command=delta-import to do a delta-import.

Doing a Full import:

Sample db-data-config.xml would be something like this:

<dataConfig>
  <dataSource driver="org.hsqldb.jdbcDriver" name="ds1" url="jdbc:hsqldb:/temp/example/ex" user="sa" />
  <document name="products">
    <entity name="item" dataSource="ds1" query="select name,id from item">
      <field column="ID" name="id" />
      <field column="NAME" name="name" />
      <entity name="item_category" query="select CATEGORY_ID from item_category where item_id='${item.ID}'">
        <entity name="category" query="select description from category where id='${item_category.CATEGORY_ID}'">
          <field column="description" name="cat" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

Use http://localhost:8983/solr/db/dataimport?command=full-import to do the full import.

Doing a Delta import:

When the delta-import command is executed, it reads the start time stored in conf/dataimport.properties. It uses that timestamp to run the delta queries and, after completion, updates the timestamp in conf/dataimport.properties. This feature requires that the database table has a column holding the timestamp of when each record was last modified.

For setting up Delta queries, in db-data-config.xml, write delta queries as shown below:

<entity name="item" pk="ID"
        query="select name, id from item"
        deltaImportQuery="select name,ID from item where ID='${dataimporter.delta.id}'"
        deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'">

Use http://localhost:8983/solr/db/dataimport?command=delta-import to do the delta import.
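
The two-query flow can be hard to picture, so here is a small Python simulation of the idea using an in-memory SQLite table. This is not Solr's code, just my sketch of the logic: one query finds the keys changed since the last run, a second fetches each changed row, and the stored timestamp moves forward. The table contents, timestamps and function names are invented for the example.

```python
import sqlite3


def delta_import(conn, last_index_time, now):
    """Return (changed_rows, new_last_index_time), mimicking the two-query delta flow."""
    # Plays the role of deltaQuery: primary keys of rows modified since the last run.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM item WHERE last_modified > ? ORDER BY id", (last_index_time,))]
    # Plays the role of deltaImportQuery: fetch the full row for each changed key.
    rows = [conn.execute("SELECT id, name FROM item WHERE id = ?", (item_id,)).fetchone()
            for item_id in ids]
    # The caller would persist `now` where dataimport.properties keeps its timestamp.
    return rows, now


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, last_modified TEXT)")
    conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
        (1, "router", "2009-10-01 08:00:00"),
        (2, "switch", "2009-10-20 09:30:00"),
        (3, "firewall", "2009-10-22 17:45:00"),
    ])
    return delta_import(conn, "2009-10-15 00:00:00", "2009-10-23 00:00:00")
```

Only the two rows modified after the stored timestamp are picked up, which is why the last-modified column is a hard requirement for delta imports.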

Further details can be found here

Step 4: Configuring schema.xml

The schema.xml file specifies the schema of the Solr index.

Define any new fields you have added in the schema.xml file.

E.g.,

<field name="id" type="text" stored="true" indexed="true" />

<field name="temp" type="sfloat" stored="true" indexed="true" />

Integrating Solr and Nutch

It would be nice to index unstructured data using Nutch and structured data using Solr, and then search all the data together using Solr. To do this:

  1. Deploy the Solr web application.

  2. Index structured data using Solr’s full-import and delta-import.

  3. Crawl unstructured data with Nutch until the merge segments stage. (As we need only the segments not the index)

  4. Index all contents from all Nutch segments to Solr.

bin/nutch solrindex http://hostname:port/solr/ crawl/crawldb crawl/linkdb crawl/segments/*

where crawl is the crawl folder created by Nutch

  5. Search through the Solr admin UI:

http://hostname:port/solr/admin

The results are presented in an XML format which can easily be styled using XSLT and presented in a user-friendly manner.
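
If you would rather consume the XML from code than style it with XSLT, something along these lines should work. It is a Python sketch under a few assumptions: the /select path is the standard query handler as I understand it for Solr 1.x, the sample response is hand-written to resemble the result format rather than captured from a live server, and the function names are my own.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_select_url(base_url, query, rows=10):
    """Build an HTTP GET query URL against Solr's select handler."""
    return base_url.rstrip("/") + "/select?" + urlencode({"q": query, "rows": rows})


def parse_docs(xml_text):
    """Turn each <doc> element of a Solr XML response into a dict of field name to text."""
    root = ET.fromstring(xml_text)
    return [{field.get("name"): field.text for field in doc} for doc in root.iter("doc")]


# Hand-written sample that imitates the shape of a Solr XML response.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc><str name="id">1</str><str name="name">router</str></doc>
  </result>
</response>"""

url = build_select_url("http://localhost:8983/solr/", "name:router", rows=5)
docs = parse_docs(SAMPLE_RESPONSE)
```

Fetching the URL with any HTTP client and passing the body to parse_docs gives plain dictionaries that are easy to render in whatever front end you prefer.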

Read more about Solr here:

http://www.ibm.com/developerworks/java/library/j-solr1/

http://www.ibm.com/developerworks/java/library/j-solr2/