HPX V0.9.11 Available!

The STE||AR Group is proud to announce the release of HPX v0.9.11! In this release our team has focused on developing higher-level C++ programming interfaces which simplify the use of HPX in applications and ensure their portability in terms of code and performance. We paid particular attention to aligning all of these changes with the existing C++ standards or with the ongoing standardization work. Other major features include the introduction of executors and various policies which make it possible to customize the ‘where’ and ‘when’ of task and data placement. Continue reading
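
To give a flavor of these interfaces, here is a minimal sketch of a parallel algorithm call with an execution policy; the header and namespace names are assumptions for an HPX build of roughly this era and have shifted in later releases.

```cpp
// Minimal sketch of the higher-level interface style described above,
// assuming an HPX build of roughly the 0.9.11 era; exact header and
// namespace names have changed between HPX versions.
#include <hpx/hpx_main.hpp>
#include <hpx/include/parallel_for_each.hpp>

#include <vector>

int main()
{
    std::vector<double> v(1000, 1.0);

    // The execution policy (here: hpx::parallel::par) decides the 'where'
    // and 'when' of the work; the algorithm call itself mirrors the
    // standard std::for_each interface.
    hpx::parallel::for_each(hpx::parallel::par, v.begin(), v.end(),
        [](double& x) { x *= 2.0; });

    return 0;
}
```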

GSoC 2015 Results: Success!

This summer has been an exciting time for the STE||AR Group’s GSoC mentors and students alike! We were very pleased with the dedication and effort of all five of our participants. Our students made contributions to three of our software products: HPX, a distributed C++ runtime system which comes with a standards compliant API and allows users to scale their applications across thousands of machines; LibGeoDecomp, an auto-parallelizing library for petascale computer simulations which is able to take advantage of HPX to better adapt fluctuating workloads to the system; and LibFlatArray, a highly efficient multidimensional array library which provides an object-oriented interface but stores data in a vectorization-friendly Struct-of-Arrays format. Just as these three products can work together as a tightly integrated stack, our goal with the GSoC projects was to create synergy between them and steer our development towards a common goal: increasing the adaptivity and efficiency of our software. Below are the summaries of our students’ projects:

Implementation of a New Resource Manager in HPX - Nidhi Makhijani

This project set out to properly assign hardware resources to executors, which are C++ objects that dictate the way a thread should be executed. Nidhi was able to allocate resources to an executor when it is created and return them when it stops. Additionally, Nidhi laid the groundwork for dynamic allocation, where the resource manager can monitor and share resources amongst all of the running executors.

SIMD Wrapper for ARM NEON, Intel AVX512 & KNC in LibFlatArray - Larry Xiao

Vectorization is imperative for writing highly efficient numerical kernels. The goal of this project was to extend the existing SIMD wrappers in LibFlatArray to more architectures (e.g. ARM NEON, Intel AVX512, and KNC) and to extend the capabilities of these wrappers. Larry set out to study the different ISAs (instruction set architectures) and to make the library run efficiently on these architectures.
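
The idea behind such wrappers can be shown with a small, self-contained sketch (hypothetical names, not LibFlatArray's actual API): kernels are written once against a thin wrapper type whose operators map onto NEON, AVX-512, KNC, or plain scalar instructions depending on the build target.

```cpp
// Self-contained illustration of the SIMD-wrapper idea (hypothetical names,
// not LibFlatArray's actual API): the kernel is written once against a thin
// wrapper type; per-architecture specializations map the operators onto
// NEON, AVX-512, KNC, or plain scalar code.
#include <cstddef>

template<typename T, std::size_t ARITY>
class simd_vec {
public:
    explicit simd_vec(T scalar)
    {
        for (std::size_t i = 0; i < ARITY; ++i) data_[i] = scalar;
    }

    explicit simd_vec(const T* ptr)
    {
        for (std::size_t i = 0; i < ARITY; ++i) data_[i] = ptr[i];
    }

    void store(T* ptr) const
    {
        for (std::size_t i = 0; i < ARITY; ++i) ptr[i] = data_[i];
    }

    simd_vec& operator*=(const simd_vec& other)
    {
        // Scalar fallback shown here; an AVX-512 specialization would use
        // _mm512_mul_ps, a NEON specialization vmulq_f32, and so on.
        for (std::size_t i = 0; i < ARITY; ++i) data_[i] *= other.data_[i];
        return *this;
    }

private:
    T data_[ARITY];
};

// A kernel written against the wrapper: scales an array in chunks of ARITY.
void scale(float* x, std::size_t n, float factor)
{
    simd_vec<float, 8> f(factor);
    for (std::size_t i = 0; i + 8 <= n; i += 8) {
        simd_vec<float, 8> v(x + i);
        v *= f;
        v.store(x + i);
    }
}
```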

CSV Formatted Performance Counters for HPX - Devang Bacharwar

HPX provides users with a uniform interface to access arbitrary system information from anywhere in the system. Devang’s project allows users to request these counters in CSV format. Additionally, he added the ability to attach a timestamp to each value. These features will make it easier for HPX users to analyze the performance data gathered from an application.
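
For context, counters can also be queried directly from application code. The sketch below uses the performance-counter client class found in later HPX versions; the header, class, and counter name are assumptions based on HPX's documented interface.

```cpp
// Sketch of querying a counter from application code, using the
// performance-counter client class available in later HPX versions; the
// header, class, and counter name below are assumptions based on HPX's docs.
#include <hpx/hpx_main.hpp>
#include <hpx/include/performance_counters.hpp>

#include <cstdint>
#include <iostream>

int main()
{
    // Cumulative number of HPX threads executed on locality 0.
    hpx::performance_counters::performance_counter counter(
        "/threads{locality#0/total}/count/cumulative");

    // get_value<T>() returns a future; .get() waits for the sampled value.
    std::int64_t value = counter.get_value<std::int64_t>().get();
    std::cout << "threads executed so far: " << value << '\n';

    return 0;
}
```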

Integrate a C++AMP Kernel with HPX - Marcin Copik

The HPX runtime system can coordinate the execution and synchronization of OpenCL kernels on arbitrary OpenCL devices, such as GPUs, in a system. In his GSoC project, Marcin used a C++ AMP compiler to produce an OpenCL kernel from a parallel algorithm implemented by HPX. Marcin integrated the Kalmar AMP compiler into the HPX build system, transformed a parallel for_each algorithm into an OpenCL kernel, dispatched the kernel to a GPU, and synchronized the result with a concurrently running HPX application.

A Flexible IO Infrastructure for LibGeoDecomp - Konstantin Kronfeldner

In LibGeoDecomp, users are able to read from and write to arbitrary regions of the simulation space. These operations are carried out by objects which we call Steerers and Writers. Over the summer, Konstantin has added the ability for these Steerers and Writers to be dynamically created and destroyed. LibGeoDecomp is typically used on supercomputers, where jobs are executed non-interactively via a batch system. Konstantin’s extensions enable users to interact with the application at runtime. They can view and modify the simulation model dynamically. The benefit of this is a significantly lower turnaround time for domain scientists who need to carry out many computational experiments.

CppCon 2015

Grant Mercer and I had the opportunity to present our talk, ‘Parallelizing the STL’, at CppCon 2015. We both consider ourselves lucky to have been able to attend the conference. The buzz of the atmosphere and the C++ community was truly exciting to witness. Attendees came from all over the world and from performance-critical industries ranging from gaming and finance to scientific computing. As Jon Kalb highlighted in his talk, C++ is experiencing a resurgence for several performance-related reasons: the end of Moore’s Law and the subsequent shift to multi-core architectures, increased computational demands from the private sector, and the rise of power-constrained mobile architectures. Combined with the interest in the standardization process (C++17 and beyond), there was a palpable excitement.

The buzz was only complemented by the number of great talks. The opening keynote by Bjarne Stroustrup introduced the C++ core guidelines (github.com/isocpp/CppCoreGuidelines), which aim to limit resource leaks, ensure static type safety, and provide prescriptive guidelines that teams can follow. The idea is to support an open dialogue so that better tools can be used and cleaner code can be written. The guidelines were a theme of the conference. Herb Sutter’s day-2 keynote showed off what upcoming editions of Microsoft Visual Studio can catch. Aside from a push toward more compatible C++ tools, upcoming language features were shown. Friday’s keynote by Eric Niebler demonstrated the upcoming ranges proposal. He introduced programming with his ranges library, which he hopes will become a future STL2. Using ranges, he created a calendar that had no for loops, used lazy evaluation, and had immutable types that can serve to limit bugs.
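
For a flavor of that style, here is a minimal sketch using the C++20 std::ranges facilities that later grew out of the proposal (not the calendar example itself): lazy views compose into a pipeline without explicit loops over intermediate results.

```cpp
// A small taste of the range-based style Eric Niebler demonstrated, written
// against the C++20 std::ranges facilities that grew out of his proposal
// (not his calendar example itself): lazy views replace intermediate loops.
#include <iostream>
#include <ranges>

int main()
{
    // An infinite, lazily evaluated sequence of integers, squared; only the
    // first five values are ever computed.
    auto squares = std::views::iota(1)
                 | std::views::transform([](int i) { return i * i; })
                 | std::views::take(5);

    for (int s : squares) std::cout << s << ' ';
    std::cout << '\n';   // prints: 1 4 9 16 25
}
```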

Grant and I delivered our talk on Tuesday, where we introduced how the parallel algorithms are implemented inside of HPX and how the application developer can communicate with the algorithms. We wanted to highlight, first, that we can communicate with the algorithms via execution policies and their associated executors and executor parameters and, second, that we did this through a generic partitioning scheme. As this was my first talk, I was quite nervous! Luckily, there were other students presenting as well, and a little bit of nerves certainly doesn’t outweigh getting to be a part of the C++ community.

HPX Tutorial Promo Video

As a build-up for our Supercomputing tutorial, the STE||AR Group has put together a promotional video to generate interest in HPX. The video gives viewers a high-level overview of what HPX is and what will be discussed at the tutorial. The SC15 Tutorials Committee will circulate this and other tutorial videos on its YouTube playlist. We would like to thank our colleague Randy Dannenberg and his students for helping us put this together!

On Tour: HPX Tutorial at SC15!

Howdy! The STE||AR Group welcomes you to participate in a hands-on HPX tutorial which will be given this year in Austin, Texas as part of the SC Tutorials program. STE||AR Fellows from Louisiana State University, Friedrich-Alexander-Universität, Lawrence Berkeley National Laboratory, and the University of Oregon will present “Massively Parallel Task-Based Programming with HPX”, which will consist of five parts:

  1. HPX: a New Paradigm – A high-level overview of the kinds of parallel programming problems C++11/14 and HPX were designed to address. The presentation will focus on the use of futures, including waiting for a future, chaining subsequent actions to a future, and composing futures both within and across machines (see the sketch after this list).
  2. An Introduction using Lua – This section of the tutorial will demonstrate HPX concepts by utilizing a Lua wrapper library. A simple serial Lua code example will be converted, step by step, to run in parallel on a single machine, and then in a distributed environment. We intend for this part of the tutorial to explain the mindset behind HPX applications without requiring intimate familiarity with the C++11/14 standard. Interactive code execution will be made available through a web site, as well as through a virtual machine.
  3. Digging into the C++ – This section of the tutorial will start by teaching the basic C++11/14 concurrency mechanisms, then branch out to writing HPX applications, using simple serial code examples (similar to the Lua code) which will be transformed into fully parallelized, distributed applications.
  4. GPUs and Xeon Phis – Here we will demonstrate how the HPX concepts introduced in the previous sections can be seamlessly integrated with the use of accelerators and co-processors. We will demonstrate how, by simply recompiling the application for the device, you can run HPX code on the Xeon Phi. Additionally, we will introduce the HPXCL library, which enables users to take advantage of the GPU, the CPU, or the Phi by integrating OpenCL kernels into their codes and distributing them across a heterogeneous machine.
  5. Performance Analysis of HPX – Finally, we will introduce the TAU Performance System and the policy engine APEX for instrumentation of the applications and runtime. The hands-on session will include an exercise for performance assessment using these performance evaluation tools.
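
Below is a minimal sketch of the future-based composition covered in part 1, assuming an HPX installation; the header names are assumptions, and the same pattern is what the related standardization work (e.g. the Concurrency TS) proposes for standard C++.

```cpp
// Minimal sketch of waiting on, chaining, and composing futures with HPX;
// header names are assumptions and may differ between HPX versions.
#include <hpx/hpx_main.hpp>
#include <hpx/include/async.hpp>
#include <hpx/include/lcos.hpp>

#include <iostream>
#include <vector>

int main()
{
    // Launch a task and attach a continuation: chaining instead of blocking.
    hpx::future<int> f = hpx::async([] { return 20; });
    hpx::future<int> g =
        f.then([](hpx::future<int> done) { return done.get() + 22; });

    // Compose several futures: when_all becomes ready once all of them are.
    std::vector<hpx::future<int>> tasks;
    tasks.push_back(hpx::async([] { return 1; }));
    tasks.push_back(hpx::async([] { return 2; }));
    hpx::when_all(std::move(tasks)).wait();

    std::cout << g.get() << '\n';   // prints 42
    return 0;
}
```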

By the end of this tutorial, we hope that participants will have a clear understanding of the HPX approach to parallelism, as well as some hands-on experience writing HPX applications. We plan to target C++ application developers, researchers, and programmers who are interested in application scalability, performance evaluation, and distributed computing. We are very excited to have the opportunity to present HPX in such a visible venue as the SC Tutorials program. Don’t forget to stop by after the tutorial and say hi at the Louisiana State University booth on the showroom floor. See you in November!

HPX and C++ Executors

By: Daniel Bourgeois

The STE||AR Group has implemented executors in HPX which, as proposed in the C++ standardization proposal ‘Parallel Algorithms Need Executors’ (document number N4406), are objects that choose where and how a function call is completed. This is a step in the right direction for HPX and parallelism, because executors give more flexibility over how and where task-based work should be accomplished and give the programmer a means to compose executors nicely with execution policies inside of algorithm implementations. Continue reading
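
As a rough illustration, the sketch below attaches an executor to a parallel execution policy before handing it to an algorithm; the executor type, headers, and namespaces are assumptions for this era of HPX and have moved in later releases (e.g. into hpx::execution).

```cpp
// Minimal sketch of composing an executor with an execution policy, in the
// spirit of N4406; the executor type and namespaces are assumptions for
// this era of HPX and have been renamed in later releases.
#include <hpx/hpx_main.hpp>
#include <hpx/include/parallel_executors.hpp>
#include <hpx/include/parallel_for_each.hpp>

#include <vector>

int main()
{
    std::vector<int> v(100, 1);

    // The executor decides where/how the work runs; .on() attaches it to
    // the parallel execution policy, which the algorithm then consults.
    hpx::parallel::parallel_executor exec;
    hpx::parallel::for_each(hpx::parallel::par.on(exec),
        v.begin(), v.end(), [](int& i) { ++i; });

    return 0;
}
```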

HPX Mailing List Archives Now on Gmane

In order to make searching and accessing the HPX mailing lists easier, we have made the hpx-users and hpx-devel archives available via Gmane. This service allows users to browse posts through several interfaces, including two web interfaces, an NNTP newsreader, and an RSS feed. These interfaces will help get questions, answers, and other detailed information about HPX out to the public in an easily consumable format. Try it out!

GSoC 2015 Participants Announced!

We can now announce the participants in the STE||AR Group’s 2015 Google Summer of Code! We are very proud to announce the names of the five students who will be funded by Google this year to work on projects for our group.

These recipients represent only a handful of the many excellent proposals that we had to choose from. For those unfamiliar with the program, the Google Summer of Code brings together ambitious students from around the world with open source developers by giving each mentoring organization funds to hire a set number of participants. Students then write proposals, which they submit to a mentoring organization, in hopes of having their work funded. Continue reading

HPX V0.9.10 Available!

The STE||AR Group is proud to announce the release of HPX v0.9.10! In this release our team has focused on making large-scale runs simple and reliable. With these changes we have now demonstrated the ability to run HPX applications on up to 24,000 cores! Other major features include new parcel-port (network-layer) implementations, variadic template support, more parallel algorithms, and the first distributed data structure, hpx::vector. Continue reading

STE||AR Group Accepted as a GSoC 2015 Mentor Organization

The STE||AR Group is proud to announce that it has been accepted as a mentoring organization in the Google Summer of Code 2015 (GSoC) program! This program pays students to work on open source projects for three months over the summer. While the timeline is short, the experience can leave a lasting impression. In fact, some of us met professionally through past GSoC programs. The next step in the process is for students who wish to participate to write proposals for the work that they would like to do over the summer. To get some ideas of what STE||AR projects are available, please check out our GSoC Project Ideas page here. We encourage all interested students to contact us with their questions and project ideas at hpx-users@stellar.cct.lsu.edu. We are looking forward to a great summer of code!