In brief, optimistic concurrency is a technique for solving coordination problems. It presumes there will rarely be a problem. Should a problem arises you then fix it after the fact. Contrast that with a conservative or pessimistic approach that sets up rules and procedures once set in motion don’t suffer from concurrency problems. This tool, optimistic concurrence, is found in the database community’s workshop.
Recently I’ve become increasingly interested in how this approach is scale free. You can use it to organize database updates or entire social systems.
Consider an example: you have a database, of bank accounts say, and a horde of folks are updating that database. They access account balances and add or subtract amounts from them. A problem can arise: what happens if two people both pull the account balance, do their bit of arithmetic and put back their answers – at “the same time.” One of them goes $100+$10; the other one goes $100+$20, the first one puts back $110; then the second one puts back $120. This is bad, the account now says $120. The correct answer is $130. Trouble! Conflict! Rivalry! Call the police. Write some laws!
The classic solution to this problem is to “lock” the account during the update. This forces the second guy to waits while the first guy finishes. Problem solved – but for one problem. The coordination cost (i.e. managing the locks) is more costly than the work were doing. I’m sure we have all experienced bureaucracies like that! Optimistic concurrency can be a fix for this class of problems.
If we assume that the chance of this kind of collision, e.g. two updates to the same account at the same instant, is amazingly small then we can design an approach that shifts the cost of resolving the conflict caused by the coordination. Instead of bracketing the transaction we can arrange for the expensive part to happen only if a conflict occurs. There are a few ways to do this but the general idea is that each account update checks before it commits it’s change to see if the account was updated while it was working on it’s update. If so then it has to redo the update.
Resolving the conflict after the fact is a pain, but overall it is less painful than taxing every single edit.
Which approach is preferable depends on the chance and cost of a conflict. If the chance/cost of conflict is high then you probably want to go with a more regulated design. If the chance is low then optimistic concurrency is preferable.
One reason I’m thinking about this a lot these days is that this is a question about scarcity, abundance and collegiality. It is a tiny example of the design problems around a public good. The membrane design problem. If conflict is likely then you need a more tightly regulated membrane; if the conflict is rare then you can design a system were typical actions unfold at low cost but you’ll need something to resolve the disputes when conflict arises.
Optimistic concurrency is a scale free tool. You can use it for tiny things like the account balance in the example above. You can use it for big things like the entire source code of a large software project.
We use it extensively in open source projects. Consider a typical transaction that might be made to project P’s code. “Make P faster.” We rarely hand out ownership (i.e. a lock) for a project like that. We rarely create plans (i.e. a lock) for a project like that. Rather we allow any number of contributors to attempt that goal. We resolve conflicts late in the transaction, when they return with the code that achieves the goal. That’s sometimes called “code talks.” But the important thing is not that we shun talk, what’s important is that we resist the lock grabbing. Lock grabbing presumes that the options for what to do with the code are scarce – they aren’t.
I’ve convinced myself that open systems and optimistic concurrency are very similar ideas. Open systems make a choice to be optimistic about the coordination problem. They assume that the system will rarely suffer from rivalry or conflict. Open systems presume that when it does arise, the conflict can be resolved after the fact. I’m very amused by a feedback loop this reveals. When conflict rises, over some threshold, then it appropriate to move to a less optimistic and more regulated system design.
It amuses me is how often some people, usually those familiar with highly regulated systems, will advocate the introduction of a bucket of regulations when ever a conflict occurs. The large angular size of a current conflict often causes even those with the most affection for an open system design to consider switching to a less optimistic design. “Redoing the transaction” is deeply embedded in the optimistic concurrency design. That cost is high – particularly if you personally are stuck doing the reimplementation.
In optimistic systems the cost of freedom is cleaning up the mess when your optimistic presumption is proven wrong. That’s a statistical certainty.
Optimistic concurrency works better in situations where the options for action are abundant and the probability of collision and conflict are low. It’s worth noting that modularity helps to enable just that. Modularity makes the system more granular. The chance that two updates will conflict around one bank account is much lower than the chance that two updates at same bank.
Two thoughts.
1) The maxim “It better to ask for forgiveness than permission” seems like a corrolary to the above.
2) Optimistic concurrancy seems to not be so scale free sometimes since once the group size gets large enough it includes at least one jerk.
When the boss says “ask forgiveness not permission” he’s licensing optomistic concurency and suggesting that the resulting conflict resolution will proceed in a reasonable manner that avoids recrimination. When the subordinate says it it’s can mean that; but it can also mean that he’s breaking the deadlock and will accept the consequences.
There are no end to ways that increasing numbers can effect the nature of the problem. Part of the reason for optimistic concurrency in a large group is to lower the costs of coordinating the binding of tallent to problems – so a large group running optimistically can get the best talent to pick up the problem they are best suited for at minimum cost. The jerk problem is real, but it’s not the only thing who’s probability rises as the size of the pool of workers enlarges.
I would like to quibble with the example given: Surely you can just have a transaction queue? The $20 increase goes in behind the $10 increase, and very soon the balance hits $130. Correct.
A more realistic analogy is where you are logging medicine given to a patient. You shouldn’t give the second medicine after the first one. But here it’s no good to know after the event that you’ve done wrong, you need to have self-refreshing or self-expiring screens to get the latest possible updates. But that’s a different thing altogether.
The solution – optimistic concurrency – only works where you can undo your actions, and it only scales where the number of people accessing a record at one time doesn’t go as the number who can access it.
“optomistic” typo
Thanks Dan, fixed a mess-o-spelling errors.