Java 8: Parallel vs Sequential Stream Comparison

Motivated by the introduction of lambdas in Java 8, I wrote a couple of examples to see how difficult it would be to follow a functional programming paradigm in real production code.

I will demonstrate some Java 8 features with a simple and fun example.

The application finds the minimum temperature in the United States by zip code. Of course, we could retrieve the same result instantly using Google, but let’s use my application for the sake of understanding some new features in Java 8.

Please use the following url to clone the example.

The following snippets of code illustrate how the application works:

Create a list of REST web-service URLs.

At the beginning, the application loads the USA_Zip.txt text file, which contains all the zip codes of the United States. It creates a parallel stream of these zip codes and maps each zip code to a string that contains the URL of the web service. At the end, the program collects the stream into an ArrayList of URL strings.

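// Resolve the bundled zip-code file on the classpath.
String path = MainApp.class.getClassLoader().getResource("USA_Zip.txt").getPath();

// Map each zip code to its web-service URL (toCollection is a static import from Collectors).
List<String> urls = Files.lines(Paths.get(path)).parallel()
        .map(e -> WEATHER_SERVICE.replace("${zip}", e))
        .collect(toCollection(ArrayList::new));
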
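To try the URL-building step without the zip-code file, here is a minimal sketch that works on an in-memory list. It assumes that WEATHER_SERVICE holds the URL template quoted in this post; the class name UrlListSketch, the method buildUrls, and the two sample zip codes are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class UrlListSketch {
    // Assumption: the constant holds the URL template quoted in the post.
    static final String WEATHER_SERVICE =
            "http://api.openweathermap.org/data/2.5/weather?zip=${zip},us";

    // Maps each zip code to its request URL and collects the results into an ArrayList.
    static List<String> buildUrls(List<String> zipCodes) {
        return zipCodes.parallelStream()
                .map(zip -> WEATHER_SERVICE.replace("${zip}", zip))
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        // Hypothetical sample input standing in for the lines of USA_Zip.txt.
        List<String> urls = buildUrls(Arrays.asList("10001", "94105"));
        urls.forEach(System.out::println);
    }
}
```

Because collect respects the encounter order of the source list, the URLs come out in the same order as the zip codes even though the mapping runs in parallel.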
Extract the minimum temperature by zip code with both sequential and parallel processing.

After extracting the list of URLs, two threads are created: one that processes the list with a parallel stream and one that processes it with a sequential stream.

Notice that both threads are initialized with lambdas. This works because Runnable is a functional interface, that is, an interface with a “Single Abstract Method” (SAM), so Java 8 accepts a lambda wherever a Runnable is expected.

To retrieve the temperature by zip code, I used the following web service: http://api.openweathermap.org/data/2.5/weather?zip=${zip},us

Inside the parallel thread, I used a parallel stream to iterate over all the URLs and fire an HTTP request that returns JSON, using the method capteurTemperature.
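As a small illustration of the SAM idea, the sketch below creates the same kind of Runnable twice: once as a pre-Java 8 anonymous class and once as a lambda. The class name SamSketch and the helper runAndWait are hypothetical and are not part of the application.

```java
class SamSketch {
    // Starts the task on a new thread and waits for it to finish.
    static void runAndWait(Runnable task) {
        Thread thread = new Thread(task);
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    public static void main(String[] args) {
        // Pre-Java 8 style: an anonymous class implementing Runnable's single method.
        Runnable anonymous = new Runnable() {
            @Override
            public void run() {
                System.out.println("Hello from an anonymous class");
            }
        };

        // Java 8 style: the lambda body becomes the implementation of run().
        Runnable lambda = () -> System.out.println("Hello from a lambda");

        runAndWait(anonymous);
        runAndWait(lambda);
    }
}
```

Both variables have the type Runnable, so the Thread constructor cannot tell them apart; the lambda is simply a shorter way to supply the single run() method.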

This method is a function that maps:
* [String url -> Double value (in Kelvin)].
After that, a filter is applied to each item so that only values greater than zero pass:
* [temp -> temp > 0].
The next step is another map that converts Kelvin to Fahrenheit with the following function:
* [Double kelvin -> Double fahrenheit].
Finally, the minimum value is found by calling min(Double::compare), where Double::compare is a method reference (another new Java 8 feature).

The same logic is applied in the sequential thread, except that it uses a sequential stream instead of a parallel one.

Runnable parallel = () -> {
    long timeStarted = System.currentTimeMillis();
    // Fetch each temperature, drop non-positive readings, convert to Fahrenheit, take the minimum.
    System.out.println("Parallel minimum temperature: " +
        urls.parallelStream()
            .map(MainApp::capteurTemperature)
            .filter(temp -> temp > 0)
            .map(MainApp::kelvinToFahrenheit)
            .min(Double::compare).get());
    System.out.println("Parallel processing took: " + (System.currentTimeMillis() - timeStarted));
};

Runnable sequential = () -> {
    long timeStarted = System.currentTimeMillis();
    // Same pipeline, but on a sequential stream.
    System.out.println("Sequential minimum temperature: " +
        urls.stream()
            .map(MainApp::capteurTemperature)
            .filter(temp -> temp > 0)
            .map(MainApp::kelvinToFahrenheit)
            .min(Double::compare).get());
    System.out.println("Sequential processing took: " + (System.currentTimeMillis() - timeStarted));
};

// Both threads are started back to back, so the two runs execute concurrently.
new Thread(sequential).start();
new Thread(parallel).start();

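The filter, map, and min steps can also be tried without any network access. The following is a minimal sketch under stated assumptions: the Kelvin readings are made up, a reading of 0.0 is assumed to stand for a failed request, and kelvinToFahrenheit uses the standard conversion formula because the application's own implementation is not shown in this post. The class name MinTemperatureSketch and the method minFahrenheit are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class MinTemperatureSketch {
    // Standard Kelvin-to-Fahrenheit conversion (an assumption, not the post's own code).
    static double kelvinToFahrenheit(double kelvin) {
        return kelvin * 9.0 / 5.0 - 459.67;
    }

    // filter -> map -> min; returns Optional.empty() when no reading passes the filter.
    static Optional<Double> minFahrenheit(List<Double> kelvinReadings, boolean parallel) {
        return (parallel ? kelvinReadings.parallelStream() : kelvinReadings.stream())
                .filter(temp -> temp > 0)
                .map(MinTemperatureSketch::kelvinToFahrenheit)
                .min(Double::compare);
    }

    public static void main(String[] args) {
        // Made-up readings in Kelvin; 0.0 is assumed to represent a failed request.
        List<Double> readings = Arrays.asList(295.4, 0.0, 288.31, 301.2);
        System.out.println("Sequential: " + minFahrenheit(readings, false).orElse(Double.NaN));
        System.out.println("Parallel:   " + minFahrenheit(readings, true).orElse(Double.NaN));
    }
}
```

Returning the Optional instead of calling get() directly lets the caller decide what to do when every reading is filtered out, a case in which get() would throw NoSuchElementException.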
Results

After the application finished running, the results were remarkable:

Parallel minimum temperature: 59.288000000000046 F
Parallel processing total time: 753.975 seconds

Sequential minimum temperature: 59.288000000000046 F
Sequential processing total time: 2481.188 seconds

The parallel stream finished processing 3.29 times faster than the sequential stream, with the same temperature result: about 59.29 F.

Conclusions

Parallel streams are an efficient approach for processing a large list, especially when the processing is done with ‘pure functions’ (functions that have no side effects and do not modify their input arguments).

When developing programs using streams in Java 8, remember that we only express ‘what’ we expect, not ‘how’ to implement it.
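To make the ‘pure function’ point concrete, here is a minimal sketch of a parallel stream whose only work is a pure function. The class name PureFunctionSketch and the methods square and squares are hypothetical and unrelated to the weather application.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PureFunctionSketch {
    // Pure: the result depends only on the argument and nothing outside is modified.
    static int square(int n) {
        return n * n;
    }

    // Safe in parallel: each element is mapped independently and collect()
    // assembles the result, so no shared mutable state is touched.
    static List<Integer> squares(int count) {
        return IntStream.rangeClosed(1, count)
                .parallel()
                .map(PureFunctionSketch::square)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squares(5)); // prints [1, 4, 9, 16, 25]

        // By contrast, adding to a shared ArrayList from inside forEach() on a
        // parallel stream is a side effect: ArrayList is not thread-safe, so
        // elements can be lost or the call can fail.
    }
}
```

The stream library can split the work across threads freely here because square never reads or writes anything shared.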