Introduction to Habanero Java (HJ)

Presented by David Bunde, Jaime Spacco, and Casey Samoore.
Pre-conference workshop at CCSC-MW 2012.


Abstract:

In the last several years, multi-core processors have gone from being a rarity to becoming standard, so that all new laptops and desktops use them. The number of cores in typical systems is also growing, with exponential growth expected to come from the transistors predicted by Moore's Law. If our students are to make full use of multi-core processors and write software whose performance will improve with future chip generations, it is crucial that we change parallel programming from a specialized subarea taught in graduate courses and specialized electives to a topic covered in required courses where all students experience it.

We believe that widespread parallel programming will rely on high-level parallel languages and tools. Students should see threads and locks, but even simple programs using these require significant code and can be challenging to debug. To address this, several parallel languages are being developed with programmer productivity as a goal. The expressiveness of these languages also makes them particularly suitable for students by reducing the learning curve. We propose a workshop on one such language, Habanero Java (HJ), currently being developed at Rice University. It is an extension of Java, greatly reducing the barrier to adoption; we are using it for a brief introduction to parallelism in CS 2. At the same time, it is powerful enough to be used by researchers and in advanced electives focused on parallelism. In addition, HJ provides constructs similar to those being introduced in other languages, indicating that students will be learning portable concepts.

Since the reader may be unfamiliar with HJ, we now illustrate it briefly. It is Java plus a few additional keywords. The following are three of the common ones:

These keywords (in bold) are used in the following to count zeros in an array:
void countZero(int[] A, int from, int to) {
   if((to - from) < threshold) {
      //serially compute local count using for loop
      isolated { numZeros += localCount; }
   } else {
      int mid = from + (to - from)/2;
      async countZero(A, from, mid);  //1st half w/ new task
      countZero(A, mid, to);          //recurse for 2nd
   }
}
...
   finish {
  	countZero(array, 0, array.length);
   }  //waits at end of finish until all tasks complete
This procedure's recursive style is not how a serial version would be written, but creating many tasks facilitates load balancing and lets the runtime use whichever cores are available (including additional cores on future processors). Task creation and scheduling (via a queue of tasks) is lightweight enough to support this style of programming. (Note that we serially handle small subarrays to limit overhead.)

It is also a bit unusual that the procedure uses a global variable rather than a return value. HJ also supports return values using another construct (future) which let the function run asynchronously, with the main task blocking only once it needs the return value. We present the version above because future has slightly more complicated syntax and to show async, finish, and isolated.

Our workshop will familiarize attendees with HJ and its approach to scalable parallelism, as well as the sharing our experiences and supporting curricular resources. In our experience, it is not hard to adopt a new language after spending some time using it and we will give attendees this experience and also direct them to resources for additional support.