Bjarne Stroustrup: A simple way of expressing an idea can be optimal in real-world situations (Turing Interview)

Posted by 劉敏ituring on 2016-12-30

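On the last day of 2016, Turing Interview brings all of you a special surprise! To borrow Bjarne's words: "While you are still young, come to love some subject, choose work that is challenging and interesting, and develop good habits!" We hope you find a new direction in 2017!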

Guest:

Bjarne Stroustrup (本賈尼·斯特勞斯特盧普)

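In 1982, Dr. Bjarne Stroustrup of Bell Labs (AT&T, USA) introduced and extended object-oriented concepts on top of the C language and invented a new programming language, C++. The name C++ was chosen to express the language's roots in C. Dr. Stroustrup is therefore honored as the "father of C++".

After that, object-oriented programming swept through the entire development field. The Standard Template Library (STL) and Microsoft's VC++ platform added momentum, and C++ became popular. The contribution of C++ to software development and the IT industry as a whole speaks for itself.

C++ still plays an indispensable role in the areas where it excels. As the father of C++, Bjarne Stroustrup has also devoted himself to improving and promoting the C++ standard. His books, including "The C++ Programming Language", "The Design and Evolution of C++", and "The Annotated C++ Reference Manual", have become classic reading for learning C++.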


Transcripts:

Chinese Version

You are one of the few tech gurus whose words of wisdom are widely quoted. I'm drawn to the thought-provoking talks and dialectical logic in your previous interviews. How do you manage to handle both natural language and programming languages so well?

I try to express ideas directly and succinctly. I don't always succeed, but it is something worth trying. Remember that when you write code, the program text is not just for the compiler. Rather, its "consumers" include all the people who will read and maintain the code. If you write ugly, incomprehensible code, it could fail or cause massive problems in maintenance. Thus, for both code and "ordinary text", the aim is to express ideas clearly and in a way that helps get those ideas into other people's heads. Writing is a way of clarifying ideas – to ourselves and to others.

I remember you've talked about people's misunderstanding of C++ before. (To understand C++, you must first learn C; C++ is an Object-oriented language; For reliable software, you need Garbage Collection; For efficiency, you must write low-level code; C++ is for large, complicated programs only) Has time changed their prejudice?

Some have learned, but many have not. These myths are widespread on the web, in articles, and in textbooks. Often, they are taken for granted – stated as facts without supporting evidence. That makes them hard to counter. People who hold them do not consider themselves prejudiced. Often, they consider themselves enlightened or even superior for having these opinions.

Let me take this opportunity to encourage people to take some time to (re)evaluate their assumptions and very briefly state my position on these myths. For the real technical arguments, people will have to consult my papers and books.

To understand C++, you must first learn C. No, if you are already a programmer, you can go straight to class design and library use. If you are just starting out with a language that can deal with low-level programming, you can learn the basics far more easily and quickly by relying on C++'s stronger type checking and better libraries. There is no good reason to let novices suffer from the problems and complexities inherent in the use of low-level facilities, such as pointers, arrays, malloc()/free(), casts, and macros. Just copying or comparing character strings in C can be a pain for novices, and is always tedious.

Naturally, you can't consider yourself competent in C++ without understanding pointers, arrays, free store (dynamic memory, heap) management, etc., but that can come later, after the basics of programming and C++ are mastered. I designed a course for freshmen (first year university students) based on these ideas, and wrote a textbook for that course. It works.
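To illustrate the point about strings, here is a minimal sketch of copying, concatenating, and comparing text with std::string, with no strcpy(), strcat(), strcmp(), or buffer sizing. The helper names greeting, same_greeting, and string_example are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds a greeting, falling back to "world" for an empty name.
std::string greeting(const std::string& name)
{
    std::string who = name;          // copying is ordinary initialization
    if (who.empty()) who = "world";  // assignment manages the memory for us
    return "Hello, " + who + "!";    // concatenation with +
}

// Hypothetical helper: comparison uses ==, just as for built-in types.
bool same_greeting(const std::string& a, const std::string& b)
{
    return greeting(a) == greeting(b);
}

// Usage example.
void string_example()
{
    assert(greeting("Ada") == "Hello, Ada!");
    assert(same_greeting("", "world"));
}
```

A novice can write and read this without first knowing about pointers or manual memory management.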

C++ is an Object-oriented language. No, not for most classical definitions of OO critically involving inheritance. C++ supports OOP rather well, but that's not all it does, and much of modern C++, including most of the ISO C++ standard library, does not follow that paradigm. Start with simple concrete types and free-standing functions. Use run-time polymorphism (as offered by virtual functions) if and only if your application domain is hierarchically organized and requires run-time resolution of calls. Many of your favorite applications are built in C++ using far more techniques than just classical OO, or even without it.
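A small sketch can make the distinction concrete. Point and distance_between show a concrete type with a free-standing function. Shape, Square, Rectangle, and total_area show a hierarchy where calls are resolved at run time. All of these names are invented for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

// Concrete type plus a free-standing function: no inheritance, no virtual calls.
struct Point { double x; double y; };

double distance_between(Point a, Point b)
{
    return std::hypot(a.x - b.x, a.y - b.y);
}

// A hierarchy, used only because the caller does not know the exact type until run time.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side{s} { assert(s >= 0); }
    double area() const override { return side * side; }
    double side;
};

struct Rectangle : Shape {
    Rectangle(double w, double h) : width{w}, height{h} { assert(w >= 0 && h >= 0); }
    double area() const override { return width * height; }
    double width;
    double height;
};

// Each area() call is dispatched through the virtual function table.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes)
{
    double sum = 0;
    for (const auto& s : shapes) sum += s->area();
    return sum;
}
```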

For reliable software, you need Garbage Collection. No, GC can be an obstacle to reliability. GC does not eliminate all memory leaks, and it does not address the management of non-memory resources at all. You can bring a system to a halt by leaking sockets, file handles, threads, and locks even faster than by leaking memory. The best support for reliability is a comprehensive approach to resource management and error handling, as is offered by C++ (often called RAII, Resource Acquisition Is Initialization). I am currently extending that approach to be a comprehensive system for resource-safe and type-safe C++. The key idea is to guarantee that nothing is leaked, so as to make a garbage collector unnecessary. Doing so without interfering with the programmer's ability to express non-trivial ideas simply and directly in code is hard, but possible.
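As a minimal sketch of RAII, the hypothetical Handle class below stands in for any non-memory resource such as a socket, file handle, or lock. The counter open_handles and the function process are also invented for this illustration. The resource is acquired in the constructor and released in the destructor, so it is released both on normal return and when an exception is thrown.

```cpp
#include <cassert>
#include <stdexcept>

int open_handles = 0;   // number of (hypothetical) handles currently held

class Handle {
public:
    Handle() { ++open_handles; }                     // acquire in the constructor
    ~Handle()                                        // release in the destructor
    {
        assert(open_handles > 0);
        --open_handles;
    }
    Handle(const Handle&) = delete;                  // one owner per resource
    Handle& operator=(const Handle&) = delete;
};

void process(bool fail)
{
    Handle h;                                        // resource acquired here
    if (fail) throw std::runtime_error{"processing failed"};
}                                                    // resource released here, on return or on throw
```

No cleanup code appears in process(), and none is needed at its call sites.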

For efficiency, you must write low-level code. No, modern C++ compilers are so good at low-level optimizations and optimizing through levels of abstraction that for any significant amount of code a programmer can't match them. This is especially true for modern architectures with deep cache hierarchies and optimizers doing aggressive instruction reordering. At a higher level, it is just about impossible for a human to reliably use threads and locks directly for optimal results, so we need higher-level models and algorithms for correctness, reliability, predictability, and raw performance. Fiddling with bits, bytes, and pointers so easily becomes pessimization when some assumption about the machine, the data, or the algorithm turns out to be unwarranted. As an example, have a look at a paper I co-wrote, improving the performance of a spec-mark program by ripping out all the carefully hand-crafted optimizations.

The resulting program is much shorter, cleaner, more maintainable, more scalable at negative cost. I talk a lot about zero-overhead abstraction, but recently, I have seen many examples of negative-overhead abstraction: you optimize by simplifying using appropriate abstractions.
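One commonly cited illustration of this point, offered here as a sketch rather than as the example from the paper mentioned above, is sorting. The low-level qsort() works through untyped pointers and an indirect call to a comparison callback. std::sort states the intent directly and gives the compiler the element type, which often lets it inline the comparison. The function names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Low-level style: comparison callback operating on untyped pointers.
int compare_ints(const void* a, const void* b)
{
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

std::vector<int> sorted_low_level(std::vector<int> v)
{
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    return v;
}

// Higher-level style: the element type and the comparison are known to the compiler.
std::vector<int> sorted_high_level(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    return v;
}

// Usage example: both produce the same sorted sequence.
void sort_example()
{
    const std::vector<int> data{5, 3, 9, 1};
    assert(sorted_low_level(data) == sorted_high_level(data));
    assert(std::is_sorted(data.begin(), data.end()) == false);
}
```

Which version is faster on a given machine has to be measured. The sketch only shows that the simpler formulation loses nothing in expressiveness.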

C++ is for large, complicated programs only. No, at least not unless you consider a page or two of code a large, complicated program. Just remember that to do anything significant you need one or more libraries. This is true for every language. Writing in just the bare language, without libraries, is painful and rarely productive.

Casually perpetuating one of these myths when discussing C++ or (worse) making decisions related to C++ is a sign of intellectual laziness. Last year, I wrote a paper about "Myths".

C++ is not standing still. Your committee will release C++17 very soon. What do you think users should expect?

Expect a lot of minor improvements. I think there is something for every C++ programmer, but don't expect anything major or revolutionary. Expect the new facilities to be available in all major compilers before "the ink is dry" on the 2017 standard. In fact, most C++17 features are already shipping today.

Not every new feature will help everybody; most are aimed at rounding off the language or standard library for some specific sub-group of programmers. You can find long lists and detailed explanations simply by typing "C++17" into your web search engine, but let me briefly mention a few of my favorites:

  • Structured binding: In C++17 we will be able to break out a struct into its members and name them. For example:

```
 map<int,string> mymap;
 //...
 auto [iter,success] = mymap.insert(value);
 if (success) f(*iter);
```

For a map<int,string>, insert() returns a pair<map<int,string>::iterator,bool>. Now we can name the two return values and use them directly instead of creating a pair object and accessing its members using dot.

This can be particularly useful in loops:

```
 for (const auto& [key,value] : mymap)
      cout<<key<<"->"<<value<<'\n';
```

  • In the library, we find std::variant to make many explicit uses of unions redundant. Instead we can write

```
 variant<int,double> v;      //can hold an int or a double
 v=12;
 auto i=get<int>(v);         //i becomes 12
 auto d=get<double>(v);      //will throw bad_variant_access
```

  • The order of evaluation will now be defined in many cases where it was unspecified before. For example

    cout<<f(x)<<" "<<g(y)<<'\n';

is now guaranteed to output the result of f(x) before the result of g(y). Pre-C++17 the evaluation of f(x) and g(y) was allowed to be interleaved. This has been the source of many bugs and much confusion.
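The structured-binding and std::variant fragments above can be assembled into a small compilable sketch, assuming a C++17 compiler. The helper names add_name, listing, and as_double_or are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Hypothetical helper: tries to insert (key, name); returns true if the key was new.
bool add_name(std::map<int, std::string>& names, int key, const std::string& name)
{
    auto [iter, success] = names.insert({key, name});  // names for the pair's two members
    assert(iter->first == key);   // iter refers to the element with this key, new or old
    return success;
}

// Hypothetical helper: formats every entry as "key->value\n".
std::string listing(const std::map<int, std::string>& names)
{
    std::string out;
    for (const auto& [key, value] : names)
        out += std::to_string(key) + "->" + value + '\n';
    return out;
}

// Hypothetical helper: get<double>() throws if the variant currently holds an int.
double as_double_or(const std::variant<int, double>& v, double fallback)
{
    try {
        return std::get<double>(v);
    }
    catch (const std::bad_variant_access&) {
        return fallback;
    }
}
```

In code that expects the "wrong" alternative regularly, std::holds_alternative or std::get_if avoids the exception. The try block is used here only to show the behavior described above.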

Roll on 2020 when we should see C++20 with major improvements such as

  • Concepts for dramatically simpler and better specified generic programming.
  • Modules for better modularity and significantly faster compilations.
  • Coroutines for simpler and faster generators and pipelines.
  • A library for simpler, faster, and more flexible networking.
  • A new version of STL for faster, simpler, and more flexible algorithms and ranges.

All of these are shipping somewhere today, so I'm not indulging in science fiction. The issue is just whether the ISO C++ standards committee can get them approved in time.

Could you use a newly added C++ feature as an example to show us how it fits the evolving principles of C++ (direct hardware access, zero-overhead abstraction, static typing)?

Improving the ability to work with hardware and improving the performance of code that has traditionally been low-level is a long-term effort. Some of it is clearly visible, but some is not so easy to see.

  • We have been working to improve the facilities for computing things at compile-time. Constexpr is the obvious poster child for that. Using constexpr, we can specify that a function can be evaluated at compile time if given constant expressions as arguments. Using constexpr, we can also insist that a calculation is done at compile time. For example:

```
 constexpr int isqrt(int n)     //can be evaluated at compile time
                                //for constant arguments
 {
     int i=1;
     while(i*i<n) ++i;
     return i-(i*i!=n);
 }
 constexpr int s1=isqrt(9);    //s1 is 3
 int x;                        //not a constant
 //…
 constexpr int s2=isqrt(x);    //compile-time error

 cout<<weekday{jun/21/2016}<<'\n'; //Tuesday
 static_assert(weekday{jun/21/2016}==tue);
```

Constexpr, together with better use of const, is delivering major improvements in performance, code size, and ability to place data in code sections and ROM. Also, "you can't have a race condition on a constant," so this helps concurrent systems.

  • A less visible example is that C++17 guarantees copy elision in many cases. That makes it simple and efficient to get values out of functions. For example:

```
 T compute(S a)
 {
     return complicated_computation_yielding_a_T(a);
 }
 T t=compute(s);
```

This saves people from playing around with pointers and dynamic memory. Indirections and the use of dynamic memory are getting increasingly expensive (relatively) on modern hardware. This gets even more interesting when combined with the structured binding that I mentioned above:

 pair<T,T2> compute(S a,S2 b)
 {
     return { comp1(a,b),comp2(a,b) };
 }

 auto [foo,bar] = compute(s,s2);

Again, this is done without copying.
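One way to see guaranteed copy elision at work, assuming a C++17 compiler, is to return a type that can be neither copied nor moved. Before C++17 the code below would not compile. With C++17 the returned object is constructed directly in the caller's variable. Matrix, Stats, make_matrix, and make_stats are hypothetical names used only for this sketch.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Matrix {
    explicit Matrix(int n) : size{n} { assert(n >= 0); }
    Matrix(const Matrix&) = delete;
    Matrix(Matrix&&) = delete;
    int size;
};

// Returning a prvalue: the Matrix is constructed directly in the caller's object.
Matrix make_matrix(int n)
{
    return Matrix{n};
}

// An aggregate of two non-movable members, returned by value.
struct Stats {
    Matrix first;
    Matrix second;
};

Stats make_stats(int n)
{
    return {Matrix{n}, Matrix{n * 2}};   // members are initialized in place
}

// Usage example: plain initialization and structured binding, with no copies or moves.
void elision_example()
{
    Matrix m = make_matrix(3);
    assert(m.size == 3);

    auto [a, b] = make_stats(4);
    assert(a.size == 4 && b.size == 8);
}
```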

Templates have been a staple of zero-overhead abstraction over the last two decades and a run-away success. They have been widely copied in other languages, though typically not in a form that is as flexible or as run-time efficient as C++'s templates. However, templates basically offer compile-time duck typing, rather than programming based on checked interfaces; they are type-checked late, at instantiation time. Consequently, the success of templates has led to seriously complicated programming techniques. We need to make generic code more similar to non-generic code, far easier to write, and far easier for the compiler to check, while not impacting efficiency or limiting what can be expressed.

Constexpr functions do much of that: you no longer have to use a template to get compile-time computation of a value. If what you want is a value of some type, a function is the right way of expressing that. With constexpr, compile-time functions are just like other functions and type-checked just like other functions (as opposed to macro magic or traditional template metaprogramming).
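The contrast can be sketched with the classic factorial example, assuming a C++14 compiler. Factorial is the traditional template-metaprogramming formulation and factorial is the constexpr function. Both names are chosen for this illustration.

```cpp
#include <cassert>

// Traditional template metaprogramming: recursion through template instantiation.
template<int N>
struct Factorial {
    static constexpr long long value = N * Factorial<N - 1>::value;
};

template<>
struct Factorial<0> {
    static constexpr long long value = 1;
};

// constexpr function: ordinary code, type-checked like any other function.
constexpr long long factorial(int n)
{
    long long result = 1;
    for (int i = 2; i <= n; ++i) result *= i;
    return result;
}

static_assert(Factorial<10>::value == 3628800, "computed by template instantiation");
static_assert(factorial(10) == 3628800, "computed by an ordinary-looking function");

// Usage example: the same function also works with run-time arguments.
void factorial_example()
{
    int n = 4;                       // not a constant expression
    assert(factorial(n) == 24);
}
```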

"Concepts" is a language feature supporting specification of template interfaces. Unfortunately, they didn't make it into C++17, but they are currently an ISO Technical Specification, and ship as part of GCC 6.2. They directly address many of the problems with templates. Consider a simplified version of the standard-library function advance() that moves an iterator n elements forward. We need two versions, one for things like lists where we must do the operation by moving forward one element at a time n times and one where we can directly move n elements forward:

 template<Input_iterator Iter>
 void advance(Iter p,int n){while (n--)++p;}

 template<Random_access_iterator Iter>
 void advance(Iter p,int n){p+=n;}

That is, if the argument is a random-access iterator, use the second (fast) version; otherwise, use the first (slow) version:

 void user(vector<int>::iterator pv, list<string>::iterator pl)
 {
      advance(pv,17);         //fast
      advance(pl,17);         //slow
 }

This runs optimally fast and I can explain it to a novice in a couple of minutes. It differs from "traditional template programming" by being basically the way we write other code, and by being as well checked. If I wanted, I could even simplify the definitions of advance further:

 void advance(Input_iterator p, int n){while(n--)++p;}
 void advance(Random_access_iterator p, int n){p+=n;}

That exactly matches the way we speak about such code and is what a really naïve programmer would quite reasonably have expected.
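For comparison, here is a sketch of how the same selection is typically written in "traditional template programming" without concepts, using tag dispatch on the iterator category. This is the older technique that concepts are meant to replace, not the concepts version itself. The names my_advance and advance_impl are hypothetical. Unlike the simplified versions above, the iterator is taken by reference so that the caller sees the result.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Slow version: one step at a time; works for any input iterator.
template<typename Iter>
void advance_impl(Iter& p, int n, std::input_iterator_tag)
{
    assert(n >= 0);
    while (n--) ++p;
}

// Fast version: a single jump; chosen only for random-access iterators.
template<typename Iter>
void advance_impl(Iter& p, int n, std::random_access_iterator_tag)
{
    p += n;
}

// The dispatcher selects an overload through an extra tag argument.
template<typename Iter>
void my_advance(Iter& p, int n)
{
    advance_impl(p, n, typename std::iterator_traits<Iter>::iterator_category{});
}
```

The extra tag parameter and the iterator_traits lookup are exactly the kind of indirection that the concept-based overloads make unnecessary.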

I elaborated these ideas about concepts in a recent paper.

To some extent, C++ is an expert-friendly programming language. Only a few professionals can do it well. How can beginners' difficulties be mitigated?

"Only a few professionals can do it well" is overstating the problem because millions of programmers produce successful systems, but fair enough, much C++ code is not what I would consider professional quality. We can do much better.

C++ is very friendly to experts writing complex, high-performance, low resource-usage code, but that’s not sufficient for making it a good language for large numbers of programmers, so it also needs facilities to ease use.

I try hard to convince the experts in the ISO C++ standards committee – and many teachers – that we need a constant effort to develop and teach simpler ways of expressing things, rather than just focusing on optimal solutions and the cleverest techniques. Often, a simple way of expressing an idea can be optimal in real-world situations and often, "clever" becomes a burden to readers, maintainers, and optimizers. When talking about expressing ideas in code, I mostly use the word "clever" to mean "too complicated". Cleverness is better applied to understanding problems and finding good fundamental solutions.

We have a few successes in this quest to "make simple things simple" in Standard C++. Consider a conventional C++98 STL-style loop using the C-style for-statement:

 for(vector<int>::iterator p=v.begin();p!=v.end();++p)
      cout<<*p<<'\n';

In C++11, we can use the range-for-statement:

 for(auto x:v)
      cout<<x<<'\n';

That reads "for all x in v write out that x." That auto means "let x have the type of its initializer; in this case, the element type of v."
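As a small addition to the range-for loop above: with plain auto, each x is a copy of the element. A sketch of the two usual variations follows, using const auto& to read elements without copying and auto& to modify them in place. The function names total_length and double_all are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reads each element through a const reference, so no string is copied.
std::size_t total_length(const std::vector<std::string>& words)
{
    std::size_t total = 0;
    for (const auto& w : words) total += w.size();
    return total;
}

// Modifies each element in place through a non-const reference.
void double_all(std::vector<int>& v)
{
    for (auto& x : v) x *= 2;
}

// Usage example.
void range_for_example()
{
    std::vector<int> v{1, 2, 3};
    double_all(v);
    assert((v == std::vector<int>{2, 4, 6}));
}
```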

Language features and standard-library components by themselves do not adequately address the problem of complexity. Consequently, I started a project to produce guidelines for better use of modern C++. This effort merged with other similar efforts and we are now developing a set of rules called "The C++ Core Guidelines" together with tool support. You can find it on GitHub under the MIT open-source license. The guidelines try to direct programmers away from suboptimal and error-prone ways of expressing ideas. Their ultimate aims include more readable code, more maintainable code, simpler code, more efficient code, complete type-safety, and complete resource safety. It is not an unambitious project.

Please note that this is not just for experts. The idea is to have tool support that detects problems and guides programmers (novices and experts) away from them. Early versions of tool support can be found in Visual Studio, Clang tidy, and elsewhere.

The Guidelines are already finding a use among medium-to-expert level programmers as reading material and a source of ideas on how to use C++11 and C++14 features effectively. Every rule is supported by a rationale and examples of good and bad code.

What is the development plan for the Guidelines Support Library? Will it be supported by and shipped with all major compilers, like the Standard Template Library?

The Core Guidelines project aims to provide a path forward to better use of C++. It is an answer to the question "What should your code look like in 5 years?" We can already do much better with C++11 and C++14 than we could when much of today's C++ code was written, but the individual programmer is busy and doesn't have the time to evaluate new facilities, so guidance and support are needed.

The concrete support comes in two forms:

  • The GSL (Guidelines Support Library) to supplement the ISO standard library.
  • A static analysis tool to help enforce the rules and provide correctness guarantees.

The GSL is very small (about a dozen classes and functions) primarily aimed at allowing programmers to avoid directly using the most tricky/unsafe parts of C++. For example, there is a not_null type for asserting that a pointer must not be the nullptr, and span to pass (pointer, size) pairs to a function.
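The ideas behind not_null and span can be sketched without the library itself. The types simple_not_null and simple_span below are hypothetical, heavily simplified stand-ins written for this illustration. They are not the GSL's implementation, and the real gsl::not_null and gsl::span offer more checking and a richer interface.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the span idea: a (pointer, size) pair passed as one object.
template<typename T>
struct simple_span {
    T* data;
    std::size_t size;
    T* begin() const { return data; }
    T* end() const { return data + size; }
};

// Hypothetical stand-in for the not_null idea: the null check happens once, at construction.
template<typename T>
class simple_not_null {
public:
    simple_not_null(T* p) : ptr{p} { assert(p != nullptr); }
    T& operator*() const { return *ptr; }
private:
    T* ptr;
};

// The callee cannot receive a pointer without its element count.
int sum(simple_span<const int> s)
{
    int total = 0;
    for (int x : s) total += x;
    return total;
}

// The callee does not need to test for nullptr.
void increment(simple_not_null<int> p)
{
    ++*p;
}
```

A call such as sum({a, 3}) for an array a of three ints keeps the pointer and its size together, which removes a common source of range errors.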

An implementation of the GSL for GCC, Clang, and Microsoft is available on GitHub under the MIT open-source license: https://github.com/Microsoft/GSL . We are working on a Standard-style specification of the GSL to ease the development of compatible implementations.

The Core Guidelines is the work of a few members of the ISO Standards Committee and others, but it is not part of the standard. Some of the GSL has been proposed for the standard and eventually we would like it to be part of the standard library, but for now it is separate.

What makes highly skilled programmers stand out from semi-skilled coders? Early exposure to programming, hard work...?

Curiosity, leading to life-long learning, persistence in the face of hard problems, a solid grounding in the foundations of design, programming, and computers, and a willingness to communicate effectively with users of the systems they build.

No, there is no maximum age for learning how to program. You haven't lost the opportunity to become a great programmer if you didn't start at 10, or 20, or 30. I started at 20. It is important not just to be a programmer. You need a feeling for what you are computing – a knowledge of the subject matter, domain experience. Some of the best programmers I have known were not computer science graduates: Mathematicians, Engineers, Historians, Chemists, Biologists, and even a couple of Philosophers. I suspect that what really matters is to work on something really challenging and interesting while you are young enough to be influenced by a subject and to develop good work habits.

No, I don't think that endless swotting and getting all As is the right approach. Many of the best programmers are really nice well-rounded people, but unfortunately not all.

Some people ask for a recipe, saying "I don't want to know how to play the piano; I want to know how Horowitz played." What would you say to them?

Horowitz spent a lifetime practicing; if you want to be a programming equivalent to Horowitz, plan to spend a lifetime practicing and learning. And remember, Horowitz spent years practicing before his first public performance. I think he spent 15 years of practice and taking instruction before his public debut.

You don't need to be a world-class genius to be a good programmer and I don't recommend 15 years of study before starting programming for real applications, but I do recommend years of serious study and practicing before trying to impose the results of your programming efforts on others.

I'm sure Horowitz started with finger exercises and pieces specifically written for learners or chosen for their simplicity. He did not start with Liszt's Hungarian Rhapsodies; nobody does. I conjecture that very few people who do not start by getting a solid foundation in their chosen field reach the highest levels of accomplishment. Rush – and be forever stuck in the low-to-middle levels.

