Saturday, April 5, 2014

Scaling Existing Lock-based Applications with Lock Elision

I remember my first encounter with transactional memory. It was a chapter by Simon Peyton Jones in Beautiful Code (2007), and it truly was beautiful: an elegant addition to the Haskell programming language, relying heavily on that language's strong type system.

And then I saw what transactional memory looks like in C code. Boy, that hurt. All the elegance went out the window; visually, the code brought back dim memories of embedded SQL in C back in the 80s. I decided to forget all about transactional memory for now, and try again in a decade or two.

Well, things are definitely changing now. Andi Kleen's paper Scaling Existing Lock-based Applications with Lock Elision shows how things can remain beautiful and elegant with a little bit of hardware support.

Specifically, Intel's latest processors introduce a new group of instructions, called TSX (Transactional Synchronization Extensions). Effectively, these instructions put the processor into a special "collision detection" mode, and it can report when two different threads (or cores), both in this special mode, access the same memory location in a conflicting manner. Any combination of accesses, except two reads, counts as a conflict.

So, how can this mechanism be used to improve parallelism? You start your code with a special "begin TSX mode" statement (_xbegin() in Intel's C libraries). The code executes normally until _xend() is reached. However, if any conflict (with a different thread) is detected, all changes your code has made to any memory location are reverted, and the code is thrown back to the _xbegin() location, as if _xbegin() had just been invoked. Only this time, the function's return value indicates failure. At this stage, you can simply obtain a lock, and run your code using normal lock semantics.

If you make sure that any successful _xbegin() is followed by a read access to the lock, the immediate result is that as soon as any thread obtains the lock, all other threads fail their transaction and are thrown back to the beginning, where they would also attempt to obtain the lock. (That's because the lock-read implies the lock itself is part of your transaction, and so if somebody else writes to it -- in other words, obtains it -- then your transaction is no good.)
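Here is a sketch of that pattern in C. The wrapper names (elided_lock, elided_unlock) are hypothetical, not a real library API; real implementations, such as the glibc elided pthread mutexes Kleen's paper describes, also retry the transaction a few times and adaptively disable elision. When built with gcc -mrtm on a TSX-capable CPU the lock is elided; otherwise only the plain mutex path is compiled in.

```c
#include <assert.h>
#include <pthread.h>
#if defined(__RTM__)
#include <immintrin.h>   /* _xbegin(), _xend(), _xabort() */
#endif

static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile int lock_is_taken = 0;  /* mirrors the mutex, read inside transactions */

static void elided_lock(void)
{
#if defined(__RTM__)
    if (_xbegin() == _XBEGIN_STARTED) {
        /* Read the lock so it joins our transaction's read set: if any
         * other thread takes it, our transaction aborts and control
         * returns to _xbegin() with a failure code. */
        if (!lock_is_taken)
            return;            /* lock elided: run transactionally */
        _xabort(0xff);         /* lock already held: abort at once */
    }
#endif
    /* Fallback path: conflict, abort, or no TSX support. */
    pthread_mutex_lock(&fallback_lock);
    lock_is_taken = 1;
}

static void elided_unlock(void)
{
    if (lock_is_taken) {       /* we hold the real lock */
        lock_is_taken = 0;
        pthread_mutex_unlock(&fallback_lock);
    }
#if defined(__RTM__)
    else {
        _xend();               /* commit the transaction's writes */
    }
#endif
}
```

With wrappers like these guarding, say, one coarse lock over a whole hashtable, threads touching disjoint buckets commit their transactions without ever taking the mutex.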

How is this helpful? If you replace all lock acquisition/release blocks with transactional attempts like these, lock-protected access to memory will not, normally, obtain the lock. If two threads access the same data structure in a non-mutually-destructive way (neither writes to any memory that the other reads or writes), both succeed -- and there's never any waiting for a lock. If there is a conflict, the threads obtain locks as they normally would.

The implication: whereas modern practice implies that complex data structures should use fine-grained locks, it now makes more sense to use coarse-grained locks. Just put a lock over that whole big hashtable or tree, for example. The lock acquisition isn't really performed, though, unless there's an actual conflict. The result is that existing, simply-written code, when using a standard library with lock-elision code in its lock acquisition and release functions, would suddenly scale almost linearly with the number of processor cores.

Of course, this is an over-simplification, and the actual best practice depends on the nature of your data structure and the data-access patterns. Still, it seems to me like in the most common cases, the multi-threaded programmer's life is about to become significantly simpler.

Thursday, August 22, 2013

The HERMIT in the Machine

A year later, I'm now reading the proceedings of the 2012 Haskell Symposium. The very first paper was mind-blowing: The HERMIT in the Machine: A Plugin for the Interactive Transformation of GHC Core Language Programs, by Farmer, Gill, Komp and Sculthorpe.

HERMIT is a tool that this team has apparently been working on for years. It basically gives you a command-line interface that pops up during the compilation process, allowing you to interactively mess around with the compiler-generated code, check out optimization options, and so forth -- with a toolbox for transforming the program in various ways, as the compiler normally does in its optimization stage, and an invitation to extend and enrich this toolbox.

The actual optimizations and tools offered by HERMIT are nice, but they are hardly the point. Just think about what these guys did. It is a common practice to put a compiler inside your editor: this is essentially what a command-line interpreter is (the REPL of LISP, or the Python command-line prompt). But HERMIT turns things inside out: they've put an editor inside your compiler! Of course, at this stage, you don't really work with raw source code, but rather with the compiler's internal representation of the parsed (and, by now, optimized) code, before it is converted to machine-code in the compiler's final stage.

I think the ability to mess around visually with the compiler's internal representation is, first and foremost, a great educational tool. I wish IDEs with integrated compilers, such as Eclipse's JDT, would offer something similar. Of course, in Eclipse, I'd expect to see the AST as a GUI tree, which has both benefits and limitations compared to the raw command-line approach used by HERMIT. (Remember that GHC, the Haskell compiler they're using, isn't part of an IDE; it's purely a command-line compiler. The Haskell community still programs as Real Men used to program in the Good Old Days.) Being able to see, in HERMIT or its GUI equivalent, the AST before and after each optimization is both a great debugging tool for the compiler itself, and -- as I already mentioned -- a fantastic educational tool. I really do hope other tool-makers pick up this idea; it's exciting to think what they can make of it.

Saturday, March 23, 2013

Programming by voice

As somebody who suffers from RSI, I was really impressed by the talk "Using Python to Code by Voice", by Tavis Rudd. No way I will be memorizing a set of 2,000 commands like he has, though.

Friday, November 11, 2011

Let the Children Code

I've recently read The Design of Kodu: A Tiny Visual Programming Language for Children on the XBox 360 (Matthew MacLaurin, POPL'11). Like me, the designers of Kodu first learned programming as children, using ROM BASIC in the home computers of the 80s (it was an Atari 800XL and then an Apple II in my case). I think this was a wonderful way to learn programming (Dijkstra's quip notwithstanding), but can children today be introduced to programming in a remotely similar manner?

Apparently not. As the paper states, "We realized very early that the kids of today have a very different context than we ourselves had in the 80s, particularly in regard to user interface and the games themselves, which often have art budgets comparable to those of the largest Hollywood films. We didn't want 21st century language for building 1980s games; instead, we felt our new language should be able to achieve the same quality of work product that kids experience in commercial console games today."

To address this goal, the team (in Microsoft Research) created Kodu as a GUI environment where kids use sensors, filters, and selectors to decide when to change the world using actuators and modifiers. A program could be something like:

1. See (sensor) - purple (filter) - move (actuator) - away (modifier)
2. See (sensor) - apple (filter) - move (actuator) - towards (modifier)

with the expected results regarding the behavior of the avatar in the game. "In the case of a purple apple, the rule to move away from purple objects will take priority" (because it appears first). More complex designs are made possible by means of pages, where an actuator can switch a "robot" (as the avatars are called) from one "page" (program) to another, in effect changing the robot's control procedure and allowing for some level of structured programming.
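The first-match priority scheme is easy to picture as an ordered rule table. The sketch below is illustrative only: the type and field names are mine, not Kodu's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *filter;   /* what the "see" sensor must match */
    const char *action;   /* actuator + modifier to perform   */
} Rule;

static const Rule rules[] = {
    { "purple", "move away"    },   /* rule 1: listed first, so it wins */
    { "apple",  "move towards" },   /* rule 2 */
};

/* Return the action for what the robot currently sees (a set of tags),
 * or NULL if no rule matches. Rules are scanned in priority order. */
static const char *decide(const char *seen_tags[], int n_tags)
{
    for (size_t r = 0; r < sizeof rules / sizeof rules[0]; r++)
        for (int i = 0; i < n_tags; i++)
            if (strcmp(seen_tags[i], rules[r].filter) == 0)
                return rules[r].action;   /* first matching rule fires */
    return NULL;
}
```

A purple apple carries both tags, so decide() returns "move away": rule 1 is checked first.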

There are no variables in Kodu: state is stored and read from the world. Apparently, the intended way to keep state was using, for example, objects' colors; but children quickly realized that they could use the score of each robot as an integer variable.

More than anything else, this clarifies to me that programming should be taught using programming languages. I agree that console output, not to mention command-line games, is too limited for today's children. But, aiming at youth the same age I was when I first started programming (10-12 or so), I think that using a real programming language is far superior to using a toy one. My recommendation would be starting with HTML: it's not a programming language per se, but it would teach children the importance of accuracy, nesting, the implication of a simple typo on your output, the difference between a visible mistake and an invisible one, and so forth. Then, add JavaScript. There's no need for expert-level HTML and JS, mind you. The HTML would just set the ground for the JavaScript output; pick up images, or areas for text, etc. with getElementById, and then manipulate that element -- its location, size, and textual content. And input is very simple, too, using either text fields, buttons, or clickable elements with onClick events.

Just make sure you don't try to teach the children object-oriented programming using JavaScript. The language (at least in its current version) is simply not object-oriented, and seeing Java/C#/C++ programmers trying to force object-oriented design into an object-based, but not object-oriented, language is painful, and reminds me of the old adage about FORTRAN programmers who remain FORTRAN programmers, no matter what programming language they use.

Sunday, May 22, 2011

A Co-Relational Model of Data for Large Shared Data Banks

A Co-Relational Model of Data for Large Shared Data Banks is a paper by Erik Meijer (a used programming language salesman) and Gavin Bierman, in Communications of the ACM, vol. 54, no. 4 (April 2011; originally published in Queue vol. 9, no. 3). The paper is rather pretentious, but fascinating nonetheless.

The authors correctly compare the situation in the current noSQL market with that of the early database days, before Codd's landmark paper from 1970 heralded relational databases. Whereas customers can choose among relational (SQL) DB providers based on product merits, all through a standard interface (which makes switching relatively easy, and the market competitive), no such choice exists in the world of noSQL. While there are multiple noSQL products (Cassandra, Dynamo, Hibari, and many others), there is no unifying standard, and once a project picks a noSQL database, it becomes tightly bound to it.

Meijer and Bierman claim that noSQL should actually be named coSQL, since, in terms of category theory, the two models are complementary. Here's some quick background: category theory explores math by viewing the world as entities and arrows between them (domains and functions). Given a category T of objects and arrows, co(T) is the same set of objects, with the direction of arrows reversed. Well, in SQL, the arrows are between properties of objects and the objects themselves (e.g., if a book has multiple authors, then the table AUTHORS will have a foreign key into table BOOK). In co(SQL), the arrows would be from the book to the authors: and indeed, this is the case in noSQL, where (using in-memory representation) a "Book" object has pointers or references (arrows) to one or more "Author" objects. Ergo, noSQL is coSQL.
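The arrow reversal can be made concrete with a pair of C sketches (illustrative types of my own, not any product's schema): in the relational model the child row points at its parent through a foreign key, while in the noSQL/in-memory model the parent object holds references out to its children.

```c
#include <assert.h>

/* Relational (SQL) direction: the AUTHORS row carries a foreign key
 * pointing into the BOOK table. */
typedef struct {
    int  id;
    char title[64];
} BookRow;

typedef struct {
    int  id;
    char name[64];
    int  book_id;              /* arrow: author -> book */
} AuthorRow;

/* noSQL (co-relational) direction: the arrow is reversed, and the
 * book holds pointers out to its authors. */
typedef struct {
    char name[64];
} Author;

typedef struct {
    char    title[64];
    Author *authors[8];        /* arrows: book -> authors */
    int     n_authors;
} Book;

/* Wire a book to one of its authors, noSQL-style. */
static void add_author(Book *b, Author *a)
{
    b->authors[b->n_authors++] = a;
}
```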

The authors then go into further mathematical details, and in particular discuss how, mathematically speaking, a MapReduce operation over a noSQL data store is effectively a monad in category theory. Ultimately, their claim is that obtaining a standardized, mathematical view of noSQL (just as Codd did with pre-SQL databases, using relational algebra) will result in a universal standard for noSQL products, and a truly competitive and thriving market. They do not hesitate to share their belief that this mathematical formalization of noSQL will lead to similar economic growth. In other words, what we have here is a paper that claims in advance to be a landmark paper.
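A one-line gloss of the monad claim (my notation, not the paper's): for a collection monad M, the bind operation is a map followed by a flatten,

```latex
\mathit{bind} : M\,A \to (A \to M\,B) \to M\,B,
\qquad
\mathit{bind}(xs, f) = \mathit{flatten}(\mathit{map}(f, xs))
```

which is exactly the shape of a MapReduce pass: apply f to every element, then collapse the nested results. LINQ's SelectMany is this same operation.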

And it might just be right.

Friday, April 22, 2011

C++ template syntax patterns

A simple and straightforward guide to reading C++ template code: C++ template syntax patterns, by Eli Bendersky. Useful if you've found yourself trying, and failing, to interpret odd generic C++ code. Note, though, that it's not very deep -- more of an overview.

C++ increasingly bothers me. I'll write more about this soon; meanwhile, you can see an earlier post by Bendersky: Modern C++ Scares Me. (In general, Bendersky's blog is worth following.)

Tuesday, March 15, 2011

On OOPSLA 2010's acceptance policy

The opening pages of a proceedings volume, the editor's foreword, are something one normally skips without even noticing. Yet the OOPSLA 2010 proceedings start with a rather unusual "Program Chair Statement" that is well worth reading. And debating.

The chair, Martin Rinard, joins the heated discussion that has been raging on the pages of Communications of the ACM, under the editorial guidance of Moshe Y. Vardi, for quite some time now: how to improve the state of publications in computer science. See, for example, Vardi's editorial from the March 2010 issue: Revisiting the Publication Culture in Computing Research.

Rinard, however, decided to take action on the matter. And he introduced several key changes to the acceptance procedure for OOPSLA. These include:

  • Unilateral paper acceptance, where each program committee member can pick a single paper that will be accepted, regardless of the opinions of the rest of the committee.

  • Standard scientific review criteria, focusing on whether the paper introduces new results or techniques, whether its results are significant and interesting, and whether it contains errors.

  • Reproducibility, measuring whether the paper contained sufficient information for other researchers to reproduce the result.

There were additional, more technical changes (e.g., with regard to submissions by program committee members, page limits, etc.).

Overall, I think these changes are most welcome. I particularly applaud the reproducibility concern, which (finally!) states the obvious: our field has a problem with papers by researchers in industry who work with proprietary systems, and produce papers that, while often fascinating, do not provide sufficient detail to truly advance the science.

It is the unilateral acceptance policy that bothers me; more specifically, the way it was executed. The cause, I should state, is a worthy one: as Rinard claims, "Papers that open up new directions or go against the established value system in the field are routinely rejected, either because they are controversial or because the program committee finds it difficult to evaluate the paper with the same degree of certainty as more mainstream papers"; and, in Rinard's view, "this situation has become severe enough to seriously hamper the development of the field" (!). In other words, the mundane wins over the challenging, and researchers prefer churning out mundane papers over doing ground-breaking research.

And is unilateral acceptance a good solution? I actually think it is. Except for this: Rinard chose to publish exactly which papers were unilaterally accepted, thus singling them out as the "odd ones"; worse still, the table of contents explicitly states which program committee member chose what paper.

The problem, obviously, is that this could create a sense of personal debt in the industry. Let's say committee member X picked a paper by author Y. Next year, Y would be in a committee of some other conference, to which X will submit a paper. Can we truly say that Y will never give X preferential treatment? Can we truly say that X will not hold a grudge against Y if he learns that, for this new conference, Y picked some other paper as part of his "unilateral acceptance prerogative" rather than picking X's?

Even if X and Y are both perfectly honest people, things can still seem wrong to others. If Y does pick X's paper in the following conference, will everyone believe it was a purely impersonal choice? When Y argues for X's paper in the next conference's PC meetings, how will the other PC members weigh his arguments?

With the best of intentions, Rinard and the OOPSLA 2010 committee created a dangerous precedent. Yes, PC members should be able to pick one paper unilaterally. But they should justify their choice clearly to the other PC members: why they believe the paper they've picked is ground-breaking and novel. And more importantly, once the choice is made, the details should never be published in public; much like the scores given by PC members to different papers, during the review process, are never published except anonymously.

[Full disclosure: I have not submitted a paper to OOPSLA 2010. For that matter, since moving to the industry, I haven't submitted a paper to a conference in the past three years or so. So this isn't about me. I did take part in reviewing papers for several academic conferences during this time, though.]