Saturday, March 17, 2007

some thoughts on transactional memory

Intel has been around the office lately to talk to us about transactional memory and the panacea that it's going to be. Like any silver bullet, it's only effective against werewolves - and fortunately (or unfortunately) there aren't any werewolves attacking me at the office.

Transactional Memory (or TM) is a mechanism by which multiple processes/threads update shared memory within a 'transaction', similar to a commit in an RDBMS. A transaction only commits if all of its updates to memory complete without conflicting with another thread's updates. In the case of a conflict, we roll back to where the transaction started and, typically, try again.
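To make the commit/rollback idea a bit more concrete, here is a toy software-only sketch in Python. It is not how any real TM system (Intel's or otherwise) is implemented; the names `TVar`, `Transaction`, `Conflict` and `atomically` are made up for illustration, and it assumes a simple scheme where every shared cell carries a version number that is checked at commit time.

```python
import threading

_commit_lock = threading.Lock()  # guards reads of a cell and the commit step


class TVar:
    """A shared memory cell with a version counter (hypothetical name)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Conflict(Exception):
    """Raised when another transaction committed to a cell we read."""


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen when first read
        self.writes = {}  # TVar -> buffered value, invisible until commit

    def read(self, tvar):
        if tvar in self.writes:          # see our own pending write
            return self.writes[tvar]
        with _commit_lock:
            value, version = tvar.value, tvar.version
        self.reads.setdefault(tvar, version)
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value        # buffered, so rollback is free

    def commit(self):
        with _commit_lock:
            # Validate: everything we read must be unchanged since we read it.
            for tvar, seen in self.reads.items():
                if tvar.version != seen:
                    raise Conflict()
            # No conflicts: publish all buffered writes together.
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(body):
    """Run body(tx) until it commits; a conflict discards it and retries."""
    while True:
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue  # "roll back": drop the buffered writes and start over


def transfer(src, dst, amount):
    def body(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomically(body)


a, b = TVar(100), TVar(0)
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a.value, b.value)
```

The caller of `transfer` never touches a lock; the only lock lives inside the toy runtime. A real implementation would also have to stop a transaction from acting on an inconsistent snapshot before it reaches commit, which this sketch ignores.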

To give TM a little bit of credit though, it does solve a certain set of problems in concurrent programming. Basically, TM allows the programmer to minimize the amount of time they spend worrying about getting locks, freeing locks, and, probably more importantly, debugging why the program has deadlocked on a lock. We call this the SPOD problem at work - "Spinning Pizza of Death". This will make sense to those of you who own a Mac.

Of course, right now TM is mostly an exercise left to the student (from "The Landscape of Parallel Computing Research: A View From Berkeley"):

"Transactional memory is a promising but still active research area. Current software-only schemes have high execution time overheads, while hardware-only schemes either lack facilities required for general language support or require very complex hardware. Some form of hybrid hardware-software scheme is likely to emerge, though more practical experience with the use of transactional memory is required before even the functional requirements for such a scheme are well understood."

Nothing comes without a cost, though: some estimates of software transactional memory (STM) implementations have them incurring a 40-50 percent overhead compared with lock-based programming. STM also incurs an additional performance hit if it has to guarantee interoperation between TM code and other code.

The future, I think, holds a couple of directions that will need to go in umm... parallel. We'll need things like transactional memory to deal with things at a low level. We'll also need to do better than source-level markup of existing languages to take full advantage of many-core programming. For example, TM is well suited to applications that would otherwise use mutexes around shared memory. We'll need a better way of representing producer/consumer models, where message passing is a better fit. All in all, an interesting time.
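For contrast with the shared-memory style, here is a minimal producer/consumer sketch using message passing, built on Python's standard `queue.Queue`. The `producer` and `consumer` functions, the `None` sentinel, and the squaring step are all just illustrative choices, not anything from a particular framework.

```python
import queue
import threading


def producer(out, items):
    """Send each item down the channel, then a sentinel to say we're done."""
    for item in items:
        out.put(item)      # blocks if the bounded queue is full
    out.put(None)          # sentinel: no more messages


def consumer(inbox, results):
    """Receive messages until the sentinel arrives."""
    while True:
        item = inbox.get()  # blocks until a message is available
        if item is None:
            break
        results.append(item * item)


channel = queue.Queue(maxsize=4)  # small bound so the producer can't run far ahead
results = []                      # touched only by the consumer thread

p = threading.Thread(target=producer, args=(channel, range(10)))
c = threading.Thread(target=consumer, args=(channel, results))
p.start()
c.start()
p.join()
c.join()
print(results)
```

Nothing here is shared and mutated by both threads except the queue itself, so there is no transaction to roll back and no lock for the programmer to forget. That is the sense in which this model wants a different representation than TM gives you.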
