[UK] Paul Butcher, Author of Seven Concurrency Models in Seven Weeks: Achieving Maximum Efficiency with Concurrent Computing (iTuring Interview)

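Paul Butcher is a veteran programmer whose experience ranges from microcontroller coding to high-level declarative programming. He now runs his own independent consultancy, Ten Tenths. He was previously Chief Software Architect at SwiftKey, and before that served as CTO of Texperts and of Smartner. He began his PhD in 1989, specialising in parallel and distributed computing, and was convinced even then that concurrent programming would become mainstream. Twenty years later his view has finally been borne out: the whole world is talking about multi-core and how to take advantage of it. His book Seven Concurrency Models in Seven Weeks continues the style of Seven Languages in Seven Weeks, using seven carefully chosen models to help readers understand the shape of the concurrency landscape. Paul is also the author of Debug It!, which has received all five-star reviews on Amazon.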

iTuring: You started coding early. What inspired you to study concurrent and distributed computation?

You’re right that I started early—I wrote my first program on a first-generation programmable calculator when I was 10 years old :-)

My inspiration was my interest in programming languages. When I was starting my PhD in 1989, I wanted to pick an area with difficult problems to be solved, so I decided to study languages for parallel and distributed computing. I was certainly right that it was an interesting area, but I underestimated how long it would take to become mainstream.

After my PhD, I was lucky enough to be able to work on a large shared-memory multi-threaded system (a parallel PostScript interpreter) which gave me an excellent grounding in the difficulties with threads and locks.

iTuring: What advantages do concurrency models have over traditional serial models? In which scenarios is it best to adopt them?

We need to embrace non-sequential (concurrent and/or parallel) programming to make effective use of today’s multi-core processors. But concurrency is useful for much more than just exploiting multiple cores—used correctly, it can result in software that is more responsive, easier to write and easier to understand than sequential software.

Perhaps the most compelling argument is fault-tolerance. Sequential software can never be as resilient as concurrent software (what happens if the hardware that your sequential code is running on fails, for example?).

iTuring: Is it possible to compare the performance of different concurrency models?

Just as benchmarking one programming language against another is difficult, so is benchmarking one concurrency model against another. Performance depends on so many factors (hardware architecture, the nature of the algorithm you’re implementing, whether communication is over the network or between processes running locally, etc.) that drawing general conclusions is almost impossible.

Different approaches certainly have different “sweet spots” though. Data-parallel code running on a GPGPU will deliver impressive performance if you’re number-crunching, for example. And the Lambda Architecture excels if your data is in the terabytes.

When it comes to general-purpose programming, to my mind the choice between approaches is less about performance and more about whether they fit your mental model and provide the facilities you need. If you need support for distribution and fault-tolerance, for instance, Actors are pretty much the only option at the moment.

iTuring: When multiple logically concurrent programs are executed, which achieves maximum efficiency: serial execution or independent execution?

As with the previous question, the answer to this depends very much on your specific situation. If by “maximum efficiency” you’re thinking about utilising multiple cores, by definition serial execution will be less efficient, as it can only utilise a single core :-)

So I’ll assume that you’re asking about efficiency on a single core. In general, concurrency brings some overhead with it, but today’s well-optimised runtimes mean that that overhead is surprisingly small. Erlang’s processes, Go’s goroutines, Clojure’s core.async, and similar mechanisms in other languages now allow multiple logically-concurrent processes to execute incredibly efficiently.
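To make the idea of many logically concurrent processes concrete, here is a minimal Go sketch. It is an illustration only, not something taken from the interview: `sumSquares` is a hypothetical helper that starts one goroutine per input value and collects the results over a channel. It says nothing about relative performance; it only shows that spawning a thousand goroutines is an ordinary thing to write.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical helper for this illustration. It starts one
// goroutine per integer in 1..n; each goroutine sends its square on a
// channel, and the caller adds up whatever arrives.
func sumSquares(n int) int {
	results := make(chan int)
	var wg sync.WaitGroup

	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			results <- v * v
		}(i)
	}

	// Close the channel once every worker has sent its value,
	// so that the range loop below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for sq := range results {
		total += sq
	}
	return total
}

func main() {
	fmt.Println(sumSquares(1000)) // prints 333833500
}
```

The order in which the goroutines run is not defined, but because addition is commutative the total is the same on every run.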

We’ve reached the point where, outside a very few specialised areas, efficiency is no longer a reason to avoid concurrency.

iTuring: Erlang and Go are influenced by the CSP model, but process algebra has three branches: CSP, ACP, and CCS. Why are there still no new languages that base their designs on ACP or CCS?

Go is certainly influenced by CSP. Erlang has more in common with the Actor model than CSP (although the Actor model and process calculi have certainly influenced each other).

Interestingly, Erlang’s creators hadn’t heard of the Actor model when they were designing the language, and I think that this hints at the answer to your question: the links between academia and practice aren’t very strong in our field, and language design tends to be driven more by pragmatism than theory.

iTuring: Different languages have different concurrency models, with little overlap among them. If subsystems use different concurrency models, they are likely to be written in two or more languages, so how should we handle debugging across multiple languages? When we trace a data stream that may have passed through multiple actors and many language modules, does that mean we face a more challenging debugging process? Do you have any suggestions?

Suppose two subsystems run on the Erlang VM and the JVM and communicate process to process. Under high concurrency, the load puts pressure on the boundary between the two systems. Is there a good solution for that?

Both of these are difficult questions without easy answers. Polyglot programs are challenging at the best of times, and introducing different concurrency models only makes them more so.

The only solution is a sensible high-level design. You need to architect your system along well-understood principles like maximising cohesion and minimising coupling so that communication between subsystems is minimised compared to communication within the subsystems.

iTuring: Erlang and Go don’t have complete type systems, which makes it difficult to deliver structured data such as JSON. Do you have any suggestions for how to deliver structured data over a channel?

The static versus dynamic typing debate is almost as old as programming, and is not unique to concurrency.

When I’m using a dynamically typed system like Erlang, then I use the same techniques to convince myself that my concurrent code is correct as I do my sequential code—lots of tests.
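For the Go half of the question, one possible approach (an assumption for illustration, not something the interview describes) is to decode JSON at the edge of the system and pass only typed values over the channel. In the sketch below, `Order` and `decodeOrders` are hypothetical names. `encoding/json` is Go's standard library package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Order is a hypothetical message type used only for this illustration.
type Order struct {
	ID    int      `json:"id"`
	Items []string `json:"items"`
}

// decodeOrders parses each raw JSON document in a separate goroutine and
// sends the typed result on the returned channel. Documents that fail to
// parse are skipped, so consumers only ever see well-formed Order values.
func decodeOrders(raw []string) <-chan Order {
	out := make(chan Order)
	go func() {
		defer close(out)
		for _, doc := range raw {
			var o Order
			if err := json.Unmarshal([]byte(doc), &o); err != nil {
				continue // malformed input never reaches the consumer
			}
			out <- o
		}
	}()
	return out
}

func main() {
	raw := []string{
		`{"id": 1, "items": ["a", "b"]}`,
		`not json`,
		`{"id": 2, "items": []}`,
	}
	for o := range decodeOrders(raw) {
		fmt.Println(o.ID, len(o.Items))
	}
}
```

Silently skipping malformed documents keeps the sketch short. A real system would more likely report the error, for example on a second channel.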

iTuring: Erlang has a longer history than Scala, but Scala has recently won more favour than Erlang, and it is also very efficient. Is it possible for Scala’s actor concurrency model to take the place of Erlang?

I think it’s too early to tell. Scala’s Akka is very impressive indeed, and I have no hesitation whatsoever recommending it. But Elixir (the new language that targets the Erlang VM) is doing a great deal to rekindle interest in the Erlang ecosystem. We’re very lucky to have two great options to choose between. It’s likely that both will continue to be popular for the foreseeable future.

iTuring: It’s more difficult to write concurrent or parallel programs than serial programs. Is there any way to make it less difficult? Is there a mental model you would like to introduce to readers?

It’s certainly true that concurrent programming with threads and locks is difficult. Worse than difficult: it’s almost impossible to be certain that a program based on threads and locks is correct.

But if you choose the right tools, it doesn’t have to be that way. In many cases, a concurrent solution can be simpler and easier to understand than its sequential equivalent.

Perhaps the best advice I can give is to become familiar with as many different approaches to concurrency as possible so you know what’s available. The larger your toolbox is, the more likely you are to choose the right tool for the problem at hand.


