HPX - High Performance ParalleX


Hello World

This program will print out a hello world message on every OS-thread on every locality. The output will look something like this:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 1 on locality 1
hello world from OS-thread 0 on locality 0
hello world from OS-thread 0 on locality 1

The source code for this example can be found here: hello_world.cpp.

To compile this program, go to your HPX build directory (see Getting Started for information on configuring and building HPX) and enter:

$ make examples.quickstart.hello_world

To run the program type:

$ ./bin/hello_world

This should print:

hello world from OS-thread 0 on locality 0

To use more OS-threads, pass the command line option --hpx:threads followed by the number of threads that you wish to use. For example, typing:

$ ./bin/hello_world --hpx:threads 2

will yield:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0

Notice that the ordering of the two print statements may change between runs. To run this program on multiple localities please see the PBS documentation.


Now that you have compiled and run the code, let's look at how the code works, beginning with main() and hpx_main():

int main(int argc, char* argv[])
{
    return hpx::init(argc, argv);       // Initialize and run HPX.
}

In HPX, main() is used to initialize the runtime system and pass the command line arguments to the program. hpx::init() invokes hpx_main() as an HPX thread after setting up HPX; hpx_main() is where the logic of our program is encoded.

Here is hpx_main:

int hpx_main()
{
    {
        // Get a list of all available localities.
        std::vector<hpx::naming::id_type> localities =
            hpx::find_all_localities();

        // Reserve storage space for futures, one for each locality.
        std::vector<hpx::lcos::future<void> > futures;
        futures.reserve(localities.size());

        BOOST_FOREACH(hpx::naming::id_type const& node, localities)
        {
            // Asynchronously start a new task. The task is encapsulated in a
            // future, which we can query to determine if the task has
            // completed.
            typedef hello_world_foreman_action action_type;
            futures.push_back(hpx::async<action_type>(node));
        }

        // The non-callback version of hpx::lcos::wait takes a single parameter,
        // a vector of futures to wait on. hpx::lcos::wait only returns when
        // all of the futures have finished.
        hpx::lcos::wait(futures);
    }

    // Initiate shutdown of the runtime system.
    return hpx::finalize();
}

In this excerpt of the code we again see the use of futures. This time the futures are stored in a vector so that they can easily be accessed. hpx::lcos::wait() is a family of functions that wait for a std::vector<> of futures to become ready. In this piece of code, we are using the synchronous version of hpx::lcos::wait(), which takes one argument (the std::vector<> of futures to wait on). This function will not return until all the futures in the vector are ready.
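The pattern of launching one task per target, collecting the futures in a vector, and then waiting on all of them can be sketched in standard C++ alone. The following is a minimal analogue, not HPX code: it assumes std::async and std::future in place of hpx::async and hpx::lcos::future, and the names square and run_all are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for a unit of work: returns the square of its input.
int square(int x)
{
    return x * x;
}

// Launch one task per input, then block until every task has finished.
// Results are returned in launch order.
std::vector<int> run_all(std::vector<int> const& inputs)
{
    // Reserve storage space for futures, one for each input.
    std::vector<std::future<int>> futures;
    futures.reserve(inputs.size());

    for (int x : inputs)
        futures.push_back(std::async(std::launch::async, square, x));

    // Standard-library analogue of the synchronous wait: get() blocks
    // until the corresponding future is ready.
    std::vector<int> results;
    results.reserve(futures.size());
    for (std::future<int>& f : futures)
        results.push_back(f.get());

    assert(results.size() == inputs.size());
    return results;
}
```

Calling run_all with the inputs 1, 2 and 3 yields 1, 4 and 9 in that order, regardless of which task happens to finish first, because the results are collected by position in the vector.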

In the Fibonacci Example, we used hpx::find_here() to specify the target of our actions. Here, we instead use hpx::find_all_localities(), which returns a std::vector<> containing the identifiers of all the machines in the system, including the one that we are on.

As in the Fibonacci Example, our futures are set using hpx::lcos::async<>(). The hello_world_foreman_action is declared here:

// Define the boilerplate code necessary for the function 'hello_world_foreman'
// to be invoked as an HPX action.
HPX_PLAIN_ACTION(hello_world_foreman, hello_world_foreman_action);

Another way of thinking about this wrapping technique is as follows: functions (the work to be done) are wrapped in actions, and actions can be executed locally or remotely (e.g. on another machine participating in the computation).
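To make the wrapping idea concrete, here is a rough sketch in standard C++ of a type that stands for one particular function, so that the function can be selected by naming a type. This is an illustration only and is not what HPX_PLAIN_ACTION expands to: plain_action, add, add_action and run_local are hypothetical names, and nothing here handles remote execution.

```cpp
#include <utility>

// Hypothetical sketch: a type that identifies exactly one free function.
// Sig is the function's signature, F a pointer to the function itself.
template <typename Sig, Sig* F>
struct plain_action
{
    template <typename... Args>
    static auto invoke(Args&&... args)
        -> decltype(F(std::forward<Args>(args)...))
    {
        return F(std::forward<Args>(args)...);
    }
};

// The work to be done.
int add(int a, int b)
{
    return a + b;
}

// The "action": a type wrapping the function above.
typedef plain_action<int(int, int), &add> add_action;

// Run an action locally by naming its type, similar in spelling to the
// hpx::async<action_type>(...) calls in the text. A real runtime could
// instead ship the arguments to another machine and invoke the action there.
template <typename Action, typename... Args>
auto run_local(Args&&... args)
    -> decltype(Action::invoke(std::forward<Args>(args)...))
{
    return Action::invoke(std::forward<Args>(args)...);
}
```

With these definitions, run_local<add_action>(4, 5) calls add(4, 5). Because the function is named by a type rather than by an address, the same name is meaningful on every machine that was built from the same source, which is the property that makes remote invocation possible.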

Now it is time to look at the hello_world_foreman() function which was wrapped in the action above:

void hello_world_foreman()
{
    // Get the number of worker OS-threads in use by this locality.
    std::size_t const os_threads = hpx::get_os_thread_count();

    // Find the global name of the current locality.
    hpx::naming::id_type const here = hpx::find_here();

    // Populate a set with the OS-thread numbers of all OS-threads on this
    // locality. When the hello world message has been printed on a particular
    // OS-thread, we will remove it from the set.
    std::set<std::size_t> attendance;
    for (std::size_t os_thread = 0; os_thread < os_threads; ++os_thread)
        attendance.insert(os_thread);

    // As long as there are still elements in the set, we must keep scheduling
    // PX-threads. Because HPX features work-stealing task schedulers, we have
    // no way of enforcing which worker OS-thread will actually execute
    // each PX-thread.
    while (!attendance.empty())
    {
        // Each iteration, we create a task for each element in the set of
        // OS-threads that have not said "Hello world". Each of these tasks
        // is encapsulated in a future.
        std::vector<hpx::lcos::future<std::size_t> > futures;

        BOOST_FOREACH(std::size_t worker, attendance)
        {
            // Asynchronously start a new task. The task is encapsulated in a
            // future, which we can query to determine if the task has
            // completed.
            typedef hello_world_worker_action action_type;
            futures.push_back(hpx::async<action_type>(here, worker));
        }

        // Wait for all of the futures to finish. The callback version of the
        // hpx::lcos::wait function takes two arguments: a vector of futures,
        // and a binary callback.  The callback takes two arguments; the first
        // is the index of the future in the vector, and the second is the
        // return value of the future. hpx::lcos::wait doesn't return until
        // all the futures in the vector have returned.
        hpx::lcos::wait(futures,
            [&](std::size_t, std::size_t t) {
                if (std::size_t(-1) != t)
                    attendance.erase(t);
            });
    }
}

Before we discuss hello_world_foreman() itself, let's talk about the hpx::lcos::wait() function. hpx::lcos::wait() provides a way to make sure that all of the futures have finished without having to call hpx::lcos::future::get() on each one. The version of hpx::lcos::wait() used here performs a non-blocking wait that acts on a std::vector<>. It queries the state of the futures, waiting for them to finish. Whenever a future is marked as ready, hpx::lcos::wait() invokes a callback function provided by the user, passing it the index of the future in the std::vector<> and the result of the future.
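The callback form can be approximated with standard futures as follows. This is a simplified sketch and not the HPX implementation: wait_each is a hypothetical helper, and unlike the function described above it visits the futures in index order and blocks on each one in turn, rather than reacting to whichever future becomes ready first.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: call callback(index, value) once for every future
// in the vector, returning only after all futures have been consumed.
template <typename T, typename Callback>
void wait_each(std::vector<std::future<T>>& futures, Callback callback)
{
    for (std::size_t i = 0; i != futures.size(); ++i)
    {
        // Calling get() on a future without shared state is undefined.
        assert(futures[i].valid());

        // get() blocks until future i is ready, then hands back its value.
        callback(i, futures[i].get());
    }
}
```

A caller passes a lambda taking the index and the value, in the same shape as the lambda handed to hpx::lcos::wait() in hello_world_foreman().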

In hello_world_foreman(), a std::set<> called attendance keeps track of which OS-threads have yet to print the hello world message. When an OS-thread prints the message, the future is marked as ready, and hpx::lcos::wait() invokes the callback function, in this case a C++11 lambda. This lambda erases the OS-thread's id from attendance, so that hello_world_foreman() knows which OS-threads still need to print hello world. However, if the future returns a value of -1, the task was executed on an OS-thread other than the one it was meant for. In this case, we have to try again by rescheduling the task in the next round. We do this by leaving the OS-thread id in attendance.

Finally, let us look at hello_world_worker(). Here, hello_world_worker() checks to see if it is on the target OS-thread. If it is executing on the correct OS-thread, it prints out the hello world message and returns the OS-thread id to hpx::lcos::wait() in hello_world_foreman(). If it is not executing on the correct OS-thread, it returns a value of -1, which causes hello_world_foreman() to leave the OS-thread id in attendance.

std::size_t hello_world_worker(std::size_t desired)
{
    // Returns the OS-thread number of the worker that is running this
    // PX-thread.
    std::size_t current = hpx::get_worker_thread_num();
    if (current == desired)
    {
        // The PX-thread has been run on the desired OS-thread.
        char const* msg = "hello world from OS-thread %1% on locality %2%\n";

        hpx::cout << (boost::format(msg) % desired % hpx::get_locality_id())
                  << hpx::flush;

        return desired;
    }

    // This PX-thread has been run by the wrong OS-thread, make the foreman
    // try again by rescheduling it.
    return std::size_t(-1);
}

// Define the boilerplate code necessary for the function 'hello_world_worker'
// to be invoked as an HPX action (by an HPX future). This macro defines the
// type 'hello_world_worker_action'.
HPX_PLAIN_ACTION(hello_world_worker, hello_world_worker_action);

Because HPX features work-stealing task schedulers, there is no way to guarantee that an action will be scheduled on a particular OS-thread. This is why we must use a guess-and-check approach.
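The guess-and-check loop can be simulated without HPX to see how the attendance set shrinks from round to round. The sketch below is an illustration under stated assumptions: try_worker and foreman are hypothetical names, and the pick parameter is a made-up stand-in for the scheduler that maps a task number to the worker that ends up running it. The loop only terminates if pick eventually selects every worker for a task that wants it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

// Sentinel meaning "ran on the wrong worker", mirroring std::size_t(-1).
std::size_t const wrong_worker = static_cast<std::size_t>(-1);

// Hypothetical stand-in for the worker: 'actual' is the worker that the
// scheduler happened to choose for this task.
std::size_t try_worker(std::size_t desired, std::size_t actual)
{
    return actual == desired ? desired : wrong_worker;
}

// Keep scheduling rounds until every worker has been hit once.
// Returns the number of rounds that were needed.
std::size_t foreman(std::size_t workers,
    std::function<std::size_t(std::size_t)> const& pick)
{
    std::set<std::size_t> attendance;
    for (std::size_t w = 0; w != workers; ++w)
        attendance.insert(w);

    std::size_t rounds = 0;
    std::size_t task = 0;
    while (!attendance.empty())
    {
        ++rounds;

        // One task per worker that has not reported yet. Results are
        // collected first so the set is not modified while iterating it.
        std::vector<std::size_t> results;
        for (std::size_t desired : attendance)
            results.push_back(try_worker(desired, pick(task++)));

        // Successful guesses leave the set; failed ones stay for a retry.
        for (std::size_t r : results)
        {
            if (r != wrong_worker)
            {
                assert(r < workers);
                attendance.erase(r);
            }
        }
    }
    return rounds;
}
```

For example, with two workers and a scheduler that sends tasks 0 and 1 to worker 0 and tasks 2 and 3 to worker 1, the first round satisfies worker 0 only, and the second round satisfies worker 1, so foreman returns 2.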