Tuesday, May 18, 2010

Exception-based Timed Locks

A notable feature of Boost.Synchro is the use of exceptions in conjunction with timed locks. This extension is based on an idea Kevlin Henney presented in "More C++ Threading".
In addition to the standard try_lock_until/try_lock_for, which return false on expiry, Boost.Synchro adds two functions, lock_until and lock_for, which lock until a specified time is reached or until a specified amount of time has elapsed, respectively, and throw a timeout exception on expiry.

The implementation of these functions is of course quite simple: just wrap the corresponding try_lock_... call and throw the exception when it returns false.

template<typename Clock, typename Duration>
void lock_until(chrono::time_point<Clock, Duration> const & abs_time)
{
  if(!try_lock_until(abs_time)) 
    throw timeout_exception();
}
template<typename Rep, typename Period>
void lock_for(chrono::duration<Rep, Period> const & rel_time)
{
  if(!try_lock_for(rel_time)) 
    throw timeout_exception();
}

Exception-based Timed Lockers
In addition to supporting timeout exceptions for Lockables, the library also supports them for exception-based timed lockers. The semantics of each constructor and member function are identical to those of boost::unique_lock for the same Lockable, except that the constructor taking a time or a duration as first parameter, in addition to the reference to a mutex, calls m.lock_until(t) or m.lock_for(d) rather than m.try_lock_until(t) or m.try_lock_for(d), so the constructor can throw a timeout_exception.

Let me start with an example of an application that needs to lock several locks at the same time. Once all the locks are locked, something must be done. Otherwise the application does something else and reiterates the lock requests. The natural, exception-safe way to do that is

while (polling) {
  t=now()+100;
  boost::unique_lock<boost::mutex> l1(m1, t);
  boost::unique_lock<boost::mutex> l2(m2, t);
  boost::unique_lock<boost::mutex> l3(m3, t);
  if (l1.owns_lock() && l2.owns_lock() && l3.owns_lock()) {
    foo();
    polling = false;
  } else execute_on_failed();
}

The problem with this code is that it locks m2 even if l1 does not own the lock on m1. The astute reader could argue that if the lock on m1 was not obtained because of a timeout, then, as all the locks share the same time constraint, the failing lock of m2 will not be expensive. Well, the problem is that the locking of m1 can also fail because m1 is already locked. When we try to optimize this

while (polling) {
  t=now()+100;
  boost::unique_lock<boost::mutex> l1(m1, t);
  if (l1.owns_lock()) {
    boost::unique_lock<boost::mutex> l2(m2, t);
    if (l2.owns_lock()) {
      boost::unique_lock<boost::mutex> l3(m3, t);
      if (l3.owns_lock()) {
        foo();
        polling = false;
      } else execute_on_failed();
    } else execute_on_failed();
  } else execute_on_failed();
}

the code becomes unmaintainable. What is even worse is that, like the preceding one, it is subject to deadlock if another thread acquires the locks in a different order.

try_lock_until and try_lock_for non-member functions
To avoid this we can request the acquisition of all the locks together (letting the function try several orders), as the function std::try_lock does, but this time adding an expiration period parameter

while (polling) {
  if (bsynchro::try_lock_for(100, m1, m2, m3)) {
    foo();
    polling = false; 
    bsynchro::unlock(m1, m2, m3);
  } else execute_on_failed();
}
While this solves the deadlock problem, this code is not exception safe.
We can improve this code by using the standard unique_lock but deferring the lock, as follows:

while (polling) {
  t=now()+100;
  boost::unique_lock<boost::mutex> l1(m1, boost::defer_lock);
  boost::unique_lock<boost::mutex> l2(m2, boost::defer_lock);
  boost::unique_lock<boost::mutex> l3(m3, boost::defer_lock);
  bsynchro::try_lock_until(t, l1, l2, l3);
  if (l1.owns_lock()) {
    foo();
    polling = false;
  } else execute_on_failed();
}
Note that we need to check only one of the locks, as either all the locks are locked or none is.

With exception-based lockers we can do the following (note that the time is given as the first argument to the locker constructor):
while (polling) try {
  t=now()+100;
  bsynchro::unique_locker<boost::mutex> l1(t, m1);
  bsynchro::unique_locker<boost::mutex> l2(t, m2);
  bsynchro::unique_locker<boost::mutex> l3(t, m3);
  foo();
  polling = false;
} catch (bsynchro::timeout_exception& ex) {
  execute_on_failed();
} 
While the preceding code is exception safe and does not lock m2 if m1 is not acquired, it does not avoid the deadlock when another thread takes m1, m2 and m3 in a different order.

lock_until and lock_for non-member functions

We can then try with

while (polling) try {
  boost::unique_lock<boost::mutex> l1(m1, boost::defer_lock);
  boost::unique_lock<boost::mutex> l2(m2, boost::defer_lock);
  boost::unique_lock<boost::mutex> l3(m3, boost::defer_lock);
  bsynchro::lock_until(now()+100, l1, l2, l3);
  foo();
  polling = false;
} catch (bsynchro::timeout_exception& ex) {
  execute_on_failed();
}


locker_tuples or locker_array of Lockers
We can go a step further and combine the advantage of taking all the locks at once with making the acquisition block-scoped. To do that we need either an array_locker or a tuple_locker, depending on whether the locks are homogeneous or not. These locker containers follow the same rules as the element-wise lockers: if the time comes after the locks, no exception is thrown on timeout; if it is given as the first parameter, an exception is thrown when the time expires.
Using array_unique_locker, the preceding code becomes, without timeout exceptions,
while (polling) {
  bsynchro::array_unique_locker<boost::mutex, 3> lk(m1, m2, m3, 100);
  if (lk.owns_lock()) {
    foo();
    polling = false;
  } else execute_on_failed();
}
which is exception safe, efficient and helps to avoid deadlocks.
With exception-based timed locks (note that the time is given before the locks):
while (polling) try {
  bsynchro::array_locker<boost::mutex, 3> lk(100, m1, m2, m3);
  foo();
  polling = false;
} catch (bsynchro::timeout_exception& ex) {
  execute_on_failed();
} 

Friday, May 7, 2010

Language-like Synchronized Block

In "A 'synchronized' statement for C++ like in Java" Achilleas Margaritis presents a synchronized macro that emulates the Java behavior.
  synchronized(l) {
    foo();
    return bar();
  } 
I have adapted the macro to use Boost.Thread as follows:
#define synchronized(MUTEX) \
  if (bool _stop_ = false) {} else \
  for (boost::lock_guard<boost::mutex> _lock_(MUTEX); \
      !_stop_; _stop_ = true)
How does it work?
The macro exploits the nature of the if and for statements of C++ to do the following (in the presented order):
  1. The if condition introduces a new _stop_ variable in the scope of the if statement, which is initialized to false, so the else part is executed.
  2. for initialization part: a local _lock_ variable is defined that locks the given mutex.
  3. test part: the !_stop_ expression is found to be true: the code inside the loop is executed.
  4. increment part: the _stop_ variable is set to true.
  5. test part: the !_stop_ expression is found to be false: the loop exits.
  6. exit part: the _lock_ variable is destroyed, unlocking the mutex.
The advantage over the initial proposal:
  • no need to wrap a locker class to add extra functions to set the flag, nor a conversion operator to bool.
Advantages over classic RAII:
  • it makes the code more language-like, pretending synchronized to be a keyword,
  • it helps avoiding declaration of the auxiliary lock variable, and
  • it ties the code to be synchronized with the next statement either simple or compound.
For example, there is no need to add curly brackets to synchronize a single statement as in
  {
    boost::lock_guard<boost::mutex> _lock_(l);
    ++cnt;
  }
Just prefix the statement with
  synchronized(l) ++cnt;

With exception-based timed locks we can define synchronized_until and synchronized_for. Both can throw a timeout_exception:
  synchronized_until(l,abs) ++cnt;

Saturday, May 1, 2010

Shallow-Copy Semantics

I have read in too many places this sentence

"The default copy constructor and assignment operator make shallow copies"

This is not completely right. I would say that the generated copy constructor and assignment operator make default copies of the direct class members and call the copy constructor or assignment operator of the base classes. If a member or base class defines its copy constructor and assignment operator with deep-copy semantics, the generated defaults will mix both copy semantics.

This means that, in the general case, we cannot let the compiler generate a shallow copy for us if the class has members or base classes.

Applications needing shallow copy semantics must follow a strict protocol. Here I present an explicit approach. Classes providing shallow copy semantics must define at least a default shallow copy constructor and a shallow assignment function.

The default copy constructor and assignment for the C++ fundamental and pointer types have shallow copy semantics. For the other compound types we need to distinguish between shallow copy and deep copy. We can use a tag for this purpose.

struct shallow_t {};
const shallow_t shallow = {};

// declared inside a class C providing the protocol:
C(shallow_t, C const&);
C& shallow_assign(C const&);

For base classes we just need to take care of the members.

struct B {
  C c;
  B(shallow_t, B const& rhs) 
    : c(shallow, rhs.c) {}
  B& shallow_assign(B const& rhs){
    if (this!=&rhs) {
      c.shallow_assign(rhs.c);
    }
    return *this;
  }
};

For derived classes we do it like this:
struct D : B {
  D(shallow_t, D const& d) : B(shallow, d)
  // shallow copy construct the fields
  {
  }
  D& shallow_assign(D const& rhs){
    if (this!=&rhs) {
      this->B::shallow_assign(rhs);
      // shallow copy the fields
    }
    return *this;
  }
};

Now we are able to shallow-copy objects as long as we follow the protocol.

D d1; 
D d2(shallow, d1);
D d3;
d3.shallow_assign(d1);

The main problem when trying to integrate shallow copy semantics in C++ is that function parameters and return types cannot be shallow-copied, as C++ calls the copy constructor instead. For function parameters we should pass them by reference and perform the shallow copy inside the function when needed.