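The Art of Readable Code
These are my reading notes on The Art of Readable Code (Chinese edition: 編寫可讀程式碼的藝術), with a few of my own observations added. I strongly recommend the book.
Why code should be easy to understand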
“Code should be written to minimize the time it would take for someone else to understand it.”
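The facts of day-to-day work:
- Thinking before writing code, and reading code, take far more time than actually writing it.
- Reading code is routine, whether it is someone else's or your own; code you wrote six months ago might as well be someone else's.
- When code is readable, you can grasp the program's logic quickly and get to work.
- Fewer lines of code are not necessarily easier to understand.
- Readability does not conflict with efficiency, good architecture, or testability.
The whole book revolves around one goal: making code more readable. Readability is also one of the key criteria of good code.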
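Naming
Pack more information into names
Use words with a precise meaning, for example download rather than get. Some possible substitutions:
send  -> deliver, dispatch, announce, distribute, route
find  -> search, extract, locate, recover
start -> launch, create, begin, open
make  -> create, set up, build, generate, compose, add, new
Avoid generic names
Names like tmp and retval say nothing beyond "this is a temporary variable" or "this is the return value". Adding a meaningful word makes the purpose clear:
tmp_file = tempfile.NamedTemporaryFile()
...
SaveData(tmp_file, ...)
Instead of retval, use a name that says what the value really represents: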
sum_squares += v[i]; // Where's the "square" that we're summing? Bug!
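In nested for loops, i, j, and k can be just as confusing:
for (int i = 0; i < clubs.size(); i++)
    for (int j = 0; j < clubs[i].members.size(); j++)
        for (int k = 0; k < users.size(); k++)
            if (clubs[i].members[k] == users[j])
                cout << "user[" << j << "] is in club[" << i << "]" << endl;
The indices in the if statement are swapped (members[k] and users[j]), which is hard to spot. With more descriptive index names the correct version is much clearer:
if (clubs[ci].members[mi] == users[ui])  // OK. First letters match.
So use a generic name only when you have a good reason to.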
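Prefer concrete names
CanListenOnPort is better than ServerCanStart: "can start" is vague, while "listen on port" says exactly what the method checks.
Likewise, a flag named --run_locally is less clear than --extra_logging, which says what the flag actually does.
Attach important details, such as a unit suffix like _ms, or _raw for an unprocessed string
If some detail about a variable matters, a few extra words in the name make the code easier to read, for example replacing string id; // Example: "af84ef845cd8" with string hex_id;.
Start(int delay)               delay -> delay_secs
CreateCache(int size)          size  -> size_mb
ThrottleDownload(float limit)  limit -> max_kbps
Rotate(float angle)            angle -> degrees_cw
More examples:
password -> plaintext_password
comment  -> unescaped_comment
html     -> html_utf8
data     -> data_urlenc
Use longer names for larger scopes
In a small scope a short variable name is fine; a variable used across a larger scope deserves a longer name, and editor auto-completion keeps the extra typing to a minimum. For abbreviations and prefixes, stick to widely known ones (such as str). A useful test: would a new team member understand what the abbreviation stands for without asking anyone?
Use characters such as _ and - deliberately, for example an _ prefix for private variables.
var x = new DatePicker();    // DatePicker() is a class "constructor", so it starts with a capital letter
var y = pageHeight();        // pageHeight() is an ordinary function
var $all_images = $("img");  // $all_images is a jQuery object
var height = 250;            // height is not
Use different conventions for ids (underscores) and classes (hyphens):
<div id="middle_column" class="main-content"> ...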
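Names should be unambiguous
Before settling on a name, ask whether the word could be read another way. For example:
results = Database.all_objects.filter("year <= 2011")
Does results now contain the objects with year <= 2011, or the objects that are left once those have been removed? "filter" can mean either.
Use min and max for inclusive limits
CART_TOO_BIG_LIMIT = 10
if shopping_cart.num_items() >= CART_TOO_BIG_LIMIT:
    Error("Too many items in cart.")

MAX_ITEMS_IN_CART = 10
if shopping_cart.num_items() > MAX_ITEMS_IN_CART:
    Error("Too many items in cart.")
Compare CART_TOO_BIG_LIMIT and MAX_ITEMS_IN_CART: which is better? With the first name it is unclear whether "limit" means "up to" or "up to and including", and the >= test ends up rejecting a cart with exactly 10 items. The MAX_ prefix makes it obvious that 10 itself is allowed.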
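As a small Python sketch of the same idea (the function name `check_cart` and the use of `ValueError` are illustrative assumptions, not taken from the book), naming the constant as an inclusive maximum lets the boundary test read naturally:

```python
MAX_ITEMS_IN_CART = 10  # inclusive: a cart may hold up to this many items


def check_cart(num_items):
    """Raise ValueError when the cart exceeds the inclusive maximum."""
    if num_items > MAX_ITEMS_IN_CART:
        raise ValueError("Too many items in cart.")
    return True
```

A cart with exactly `MAX_ITEMS_IN_CART` items passes the check; one more item raises.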
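Use first and last for inclusive ranges
print integer_range(start=2, stop=4)
# Does this print [2,3] or [2,3,4] (or something else)?

set.PrintKeys(first="Bart", last="Maggie")
first and last are unambiguous and well suited to inclusive (closed) ranges.
Use begin and end for half-open ranges such as [2, 9)
PrintEventsInRange("OCT 16 12:00am", "OCT 17 12:00am")
PrintEventsInRange("OCT 16 12:00am", "OCT 16 11:59:59.9999pm")
The first call, which treats the end as exclusive, is much more comfortable than the second.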
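A minimal Python sketch of a half-open range, assuming a hypothetical helper `events_in_range` and made-up sample events (neither comes from the book). Because `end` is excluded, "all of October 16" is expressed with two midnights and no `11:59:59.9999pm`:

```python
from datetime import datetime


def events_in_range(events, begin, end):
    """Return the events whose time t satisfies begin <= t < end (half-open)."""
    return [(t, name) for (t, name) in events if begin <= t < end]


# Made-up sample data for illustration.
events = [
    (datetime(2023, 10, 16, 0, 0), "midnight backup"),
    (datetime(2023, 10, 16, 23, 59, 59, 999999), "last-moment event"),
    (datetime(2023, 10, 17, 0, 0), "next day's backup"),
]

# All of Oct 16: begin is included, end is excluded.
oct_16 = events_in_range(events, datetime(2023, 10, 16), datetime(2023, 10, 17))
```

Adjacent ranges built this way never overlap and never leave a gap, which is the main practical benefit of the begin/end convention.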
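Naming Booleans
bool read_password = true;
This is a dangerous name: does it mean the password still needs to be read, or that it has already been read? There is no way to tell, so a name such as user_is_authenticated is better. In general, adding is, has, can, or should makes a Boolean's meaning clearer, and a positive name reads better than a negated one. For example:
SpaceLeft()               -->  hasSpaceLeft()
bool disable_ssl = false  -->  bool use_ssl = true
Match users' expectations
public class StatisticsCollector {
    public void addSample(double x) { ... }

    public double getMean() {
        // Iterate through all samples and return total / num_samples
    }
    ...
}
In this example getMean iterates over every sample and returns total / num_samples, so it is not the lightweight accessor that a "get" method normally is. computeMean would be a better name.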
Aesthetics
Well-formatted code has a certain beauty and is far more comfortable to read. Compare the following two examples:
class StatsKeeper {
public:
// A class for keeping track of a series of doubles
   void Add(double d);  // and methods for quick statistics about them
  private:   int count;        /* how many so far
*/ public:
        double Average();
private:   double minimum;
list<double>
  past_items
      ;double maximum;
};
And what does a pleasing layout look like?
// A class for keeping track of a series of doubles
// and methods for quick statistics about them.
class StatsKeeper {
  public:
    void Add(double d);
    double Average();

  private:
    list<double> past_items;
    int count;  // how many so far

    double minimum;
    double maximum;
};
Make line breaks consistent and compact
This code has to be wrapped to stay within 80 characters per line, and the arguments need explanatory comments:
public class PerformanceTester {
    public static final TcpConnectionSimulator wifi = new TcpConnectionSimulator(
        500, /* Kbps */
        80, /* millisecs latency */
        200, /* jitter */
        1 /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(
            45000, /* Kbps */
            10, /* millisecs latency */
            0, /* jitter */
            0 /* packet loss % */);

    public static final TcpConnectionSimulator cell = new TcpConnectionSimulator(
        100, /* Kbps */
        400, /* millisecs latency */
        250, /* jitter */
        5 /* packet loss % */);
}
For consistency, first break every definition the same way:
public class PerformanceTester {
    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(
            500,   /* Kbps */
            80,    /* millisecs latency */
            200,   /* jitter */
            1      /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(
            45000, /* Kbps */
            10,    /* millisecs latency */
            0,     /* jitter */
            0      /* packet loss % */);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(
            100,   /* Kbps */
            400,   /* millisecs latency */
            250,   /* jitter */
            5      /* packet loss % */);
}
The layout is now consistent, but it is still verbose and uses a lot of vertical space. Moving the repeated comments to the top gives a more compact version:
public class PerformanceTester {
    // TcpConnectionSimulator(throughput, latency, jitter, packet_loss)
    //                            [Kbps]   [ms]    [ms]    [percent]

    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(500,   80,  200, 1);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(45000, 10,  0,   0);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(100,   400, 250, 5);
}
Use functions to remove clutter
// Turn a partial_name like "Doug Adams" into "Mr. Douglas Adams".
// If not possible, 'error' is filled with an explanation.
string ExpandFullName(DatabaseConnection dc, string partial_name, string* error);

DatabaseConnection database_connection;
string error;
assert(ExpandFullName(database_connection, "Doug Adams", &error)
       == "Mr. Douglas Adams");
assert(error == "");
assert(ExpandFullName(database_connection, " Jake Brown ", &error)
       == "Mr. Jacob Brown III");
assert(error == "");
assert(ExpandFullName(database_connection, "No Such Guy", &error) == "");
assert(error == "no match found");
assert(ExpandFullName(database_connection, "John", &error) == "");
assert(error == "more than one result");
This code looks messy and is full of repetition, which a helper function can hide:
CheckFullName("Doug Adams", "Mr. Douglas Adams", "");
CheckFullName(" Jake Brown ", "Mr. Jacob Brown III", "");
CheckFullName("No Such Guy", "", "no match found");
CheckFullName("John", "", "more than one result");

void CheckFullName(string partial_name,
                   string expected_full_name,
                   string expected_error) {
    // database_connection is now a class member
    string error;
    string full_name = ExpandFullName(database_connection, partial_name, &error);
    assert(error == expected_error);
    assert(full_name == expected_full_name);
}
Column alignment
Aligning columns makes a block of code easier on the eye:
CheckFullName("Doug Adams"  , "Mr. Douglas Adams"  , "");
CheckFullName(" Jake Brown ", "Mr. Jacob Brown III", "");
CheckFullName("No Such Guy" , ""                   , "no match found");
CheckFullName("John"        , ""                   , "more than one result");

commands[] = {
    ...
    { "timeout",      NULL,              cmd_spec_timeout},
    { "timestamping", &opt.timestamping, cmd_boolean},
    { "tries",        &opt.ntry,         cmd_number_inf},
    { "useproxy",     &opt.use_proxy,    cmd_boolean},
    { "useragent",    NULL,              cmd_spec_useragent},
    ...
};
Organize code into blocks
class FrontendServer {
  public:
    FrontendServer();
    void ViewProfile(HttpRequest* request);
    void OpenDatabase(string location, string user);
    void SaveProfile(HttpRequest* request);
    string ExtractQueryParam(HttpRequest* request, string param);
    void ReplyOK(HttpRequest* request, string html);
    void FindFriends(HttpRequest* request);
    void ReplyNotFound(HttpRequest* request, string error);
    void CloseDatabase(string location);
    ~FrontendServer();
};
This is readable enough, but grouping related declarations improves it:
class FrontendServer {
  public:
    FrontendServer();
    ~FrontendServer();

    // Handlers
    void ViewProfile(HttpRequest* request);
    void SaveProfile(HttpRequest* request);
    void FindFriends(HttpRequest* request);

    // Request/Reply Utilities
    string ExtractQueryParam(HttpRequest* request, string param);
    void ReplyOK(HttpRequest* request, string html);
    void ReplyNotFound(HttpRequest* request, string error);

    // Database Helpers
    void OpenDatabase(string location, string user);
    void CloseDatabase(string location);
};
Here is another piece of code:
# Import the user's email contacts, and match them to users in our system.
# Then display a list of those users that he/she isn't already friends with.
def suggest_new_friends(user, email_password):
    friends = user.friends()
    friend_emails = set(f.email for f in friends)
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends
    return render("suggested_friends.html", display)
Everything runs together, which is visually tiring. Split it into paragraphs by purpose:
def suggest_new_friends(user, email_password):
    # Get the user's friends' email addresses.
    friends = user.friends()
    friend_emails = set(f.email for f in friends)

    # Import all email addresses from this user's email account.
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)

    # Find matching users that they aren't already friends with.
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)

    # Display these lists on the page.
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends
    return render("suggested_friends.html", display)
Making code pleasant to read takes attention while you write and a few good habits. This matters most on a team: a style choice such as brace placement is neither right nor wrong in itself, but not following the team's convention is wrong.
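Writing comments
While you write code you have a lot in your head, but in the end the reader sees only the code itself; the extra information is lost. The purpose of a comment is to give that information back to the reader.
What to comment
What not to comment
Comments like these are worthless:
// The class definition for Account
class Account {
  public:
    // Constructor
    Account();

    // Set the profit member to a new value
    void SetProfit(double profit);

    // Return the profit from this Account
    double GetProfit();
};
Don't comment just for the sake of commenting. A comment that merely restates the signature adds nothing; if you do write one, make it spell out details the signature cannot, as this one does:
// Find a Node with the given 'name' or return NULL.
// If depth <= 0, only 'subtree' is inspected.
// If depth == N, only 'subtree' and N levels below are inspected.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);
Don't comment bad names; fix the names
// Enforce limits on the Reply as stated in the Request,
// such as the number of items returned, or total byte size, etc.
void CleanReply(Request request, Reply reply);
Most of this comment explains what "clean" means, so it is better to pick an accurate name:
// Make sure 'reply' meets the count/byte/etc. limits from the 'request'
void EnforceLimitsFromRequest(Request request, Reply reply);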
Record your thoughts
We have covered what not to comment; so what should you comment? Comments should record the thinking that went into the code, for example:
// Surprisingly, a binary tree was 40% faster than a hash table for this data.
// The cost of computing a hash was more than the left/right comparisons.

// This heuristic might miss a few words. That's OK; solving this 100% is hard.

// This class is getting messy. Maybe we should create a 'ResourceNode' subclass to
// help organize things.
Comments can also flag unfinished work and explain constants:
// TODO: use a faster algorithm
// TODO(dustin): handle other image formats besides JPEG

NUM_THREADS = 8  # as long as it's >= 2 * num_processors, that's good enough.

// Impose a reasonable limit - no human can read that much anyway.
const int MAX_RSS_SUBSCRIPTIONS = 1000;
Commonly used markers:
- TODO : Stuff I haven't gotten around to yet
- FIXME : Known-broken code here
- HACK : Admittedly inelegant solution to a problem
- XXX : Danger! Major problem here
Put yourself in the reader's shoes
Whatever would make someone reading your code stop and ask a question is exactly what you should comment.
struct Recorder {
    vector<float> data;
    ...
    void Clear() {
        vector<float>().swap(data);  // Huh? Why not just data.clear()?
    }
};
Many C++ programmers reading this would wonder why the code swaps with an empty vector instead of simply calling data.clear(), so that is where a comment belongs:
// Force vector to relinquish its memory (look up "STL swap trick")
vector<float>().swap(data);
Warn about likely pitfalls
If the code relies on a hack, or contains some other trap that readers need to know about, say so in a comment:
void SendEmail(string to, string subject, string body);
In reality this function calls an external service and has a timeout, so it needs a comment:
// Calls an external service to deliver email. (Times out after 1 minute.)
void SendEmail(string to, string subject, string body);
Big-picture comments
Sometimes a comment for the whole file is needed to give readers an overall picture:
// This file contains helper functions that provide a more convenient interface to our
// file system. It handles file permissions and other nitty-gritty details.
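Summary comments
Even inside a function, a comment can summarize a block the way a file comment summarizes a file:
# Find all the items that customers purchased for themselves.
for customer_id in all_customers:
    for sale in all_sales[customer_id].sales:
        if sale.recipient == customer_id:
            ...
Or add a comment for each step of the function:
def GenerateUserReport():
    # Acquire a lock for this user
    ...
    # Read user's info from the database
    ...
    # Write info to a file
    ...
    # Release the lock for this user
Many people are reluctant to write comments, and writing good ones is indeed not easy. One option is to set aside a dedicated place in the file for comments, where you can write down anything you want to say.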
Keep comments precise and compact
The previous section covered what to comment; this one covers how. Comments matter, so they must be precise; they also take up screen space, so they must be compact.
Keep comments compact
// The int is the CategoryType.
// The first float in the inner pair is the 'score',
// the second is the 'weight'.
typedef hash_map<int, pair<float, float> > ScoreMap;
That is too wordy. Compress it as far as possible:
// CategoryType -> (score, weight)
typedef hash_map<int, pair<float, float> > ScoreMap;
Avoid ambiguous pronouns
// Insert the data into the cache, but check if it's too big first.
Here "it" is ambiguous: it could refer to the data or to the cache. Rewrite it as:
// Insert the data into the cache, but check if the data is too big first.
An even better fix restructures the sentence so that "it" can refer to only one thing:
// If the data is small enough, insert it into the cache.
Keep sentences tight and accurate
# Depending on whether we've already crawled this URL before, give it a different priority.
This sentence takes effort to parse. The following is much easier to understand:
# Give higher priority to URLs we've never crawled before.
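Describe function behavior precisely
// Return the number of lines in this file.
int CountLines(string filename) { ... }
A function documented like this can leave callers puzzled, because "line" can be interpreted in many ways:
- "": an empty file. Is that 0 lines or 1?
- "hello": a single line with no newline. Is the result 0 or 1?
- "hello\n": 1 or 2?
- "hello\n world": 1 or 2?
- "hello\n\r cruel\n world\r": 2, 3, or 4?
So the comment should be written like this:
// Count how many newline bytes ('\n') are in the file.
int CountLines(string filename) { ... }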
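To see how the precise comment settles each of those cases, here is a small Python sketch. It is an illustration only: the name `count_newlines` is hypothetical, and it takes the file contents as a string rather than a filename so that it stays self-contained.

```python
def count_newlines(text):
    # Count how many newline characters ('\n') are in the text.
    return text.count("\n")


examples = ["", "hello", "hello\n", "hello\n world", "hello\n\r cruel\n world\r"]
counts = [count_newlines(s) for s in examples]  # [0, 0, 1, 1, 2]
```

With the behavior defined as "number of '\n' characters", none of the five inputs is ambiguous any more, including the one that mixes '\r' and '\n'.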
Use examples to illustrate corner cases
// Rearrange 'v' so that elements < Pivot come before those >= Pivot;
// Then return the largest 'i' for which v[i] < Pivot (or -1 if none are < pivot)
int Partition(vector<int>* v, int pivot);
This description is precise, but adding an example makes it even better:
// ...
// Example: Partition([8 5 9 8 2], 8) might result in [5 2 | 8 9 8] and return 1
int Partition(vector<int>* v, int pivot);
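One way to check that the example in the comment is consistent with the description is to write a tiny implementation and run it. The sketch below is an assumption of mine, not the book's implementation: it uses a simple order-preserving partition built from two list comprehensions, and it happens to reproduce exactly the result shown in the comment (a different partition algorithm could legitimately order the elements differently, which is why the comment says "might").

```python
def partition(v, pivot):
    """Rearrange v in place so elements < pivot come before those >= pivot.

    Return the largest index i for which v[i] < pivot, or -1 if none are.
    """
    smaller = [x for x in v if x < pivot]
    rest = [x for x in v if x >= pivot]
    v[:] = smaller + rest
    return len(smaller) - 1


values = [8, 5, 9, 8, 2]
last_small_index = partition(values, 8)  # values is now [5, 2, 8, 9, 8]; returns 1
```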
State the intent of your code
void DisplayProducts(list<Product> products) {
    products.sort(CompareProductByPrice);

    // Iterate through the list in reverse order
    for (list<Product>::reverse_iterator it = products.rbegin(); it != products.rend();
         ++it)
        DisplayPrice(it->price);
    ...
}
The comment says the list is traversed in reverse, but that only restates the code. It should explain the intent instead:
// Display each price, from highest to lowest
for (list<Product>::reverse_iterator it = products.rbegin(); ... )
Commenting function calls
A call like this is bound to leave the reader guessing:
Connect(10, false);
In a language with named parameters, such as Python, naming the arguments at the call site makes it much clearer:
def Connect(timeout, use_encryption): ...

# Call the function using named parameters
Connect(timeout = 10, use_encryption = False)
Use information-dense words
// This class contains a number of members that store the same information as in the
// database, but are stored here for speed. When this class is read from later, those
// members are checked first to see if they exist, and if so are returned; otherwise the
// database is read from and that data stored in those fields for next time.
The long comment above explains things clearly, but a single well-chosen term says the same thing with nothing lost:
// This class acts as a caching layer to the database.
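Simplifying loops and logic
Keep control flow simple
The goal of this section is to make conditionals, loops, and other control flow as natural as possible, so that readers never have to stop and think or go back and reread.
The order of arguments in conditionals
Compare these two ways of writing the same conditions:
if (length >= 10)
while (bytes_received < bytes_expected)

if (10 <= length)
while (bytes_expected > bytes_received)
Should the operands simply follow greater-than/less-than order, or is there another rule? There is: order them by the role each one plays.
- Left-hand side: the expression being examined, whose value changes more often
- Right-hand side: the value it is compared against, which is more constant
This explains why bytes_received < bytes_expected is easier to read than the reverse.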
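The order of if/else blocks
Usually you are free to choose the order of an if/else; both of the following work:
if (a == b) {
    // Case One ...
} else {
    // Case Two ...
}

if (a != b) {
    // Case Two ...
} else {
    // Case One ...
}
You may never have given this much thought, but sometimes one order really is better than the other:
- Put the positive case first: if (debug) reads better than if (!debug).
- Put the simpler case first, so that the if and the else fit on one screen.
- Put the more interesting or conspicuous case first.
For example:
if (!url.HasQueryParameter("expand_all")) {
    response.Render(items);
    ...
} else {
    for (int i = 0; i < items.size(); i++) {
        items[i].Expand();
    }
    ...
}
When you see the if, the first thing you think of is expand_all. It is like being told "don't think of an elephant": you can't help thinking about it, which causes a moment of confusion. It is better written as:
if (url.HasQueryParameter("expand_all")) {
    for (int i = 0; i < items.size(); i++) {
        items[i].Expand();
    }
    ...
} else {
    response.Render(items);
    ...
}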
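The ternary operator (?:)
time_str += (hour >= 12) ? "pm" : "am";
Avoiding the ternary operator, you might write:
if (hour >= 12) {
    time_str += "pm";
} else {
    time_str += "am";
}
The ternary operator reduces the line count, and the example above is a good case for it. But the real goal is to reduce the time it takes to read the code, so it is a poor fit for a case like this:
return exponent >= 0 ? mantissa * (1 << exponent) : mantissa / (1 << -exponent);
Here the if/else form is longer but easier to follow:
if (exponent >= 0) {
    return mantissa * (1 << exponent);
} else {
    return mantissa / (1 << -exponent);
}
So use the ternary operator only for simple expressions.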
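Avoid do/while loops
do {
    continue;
} while (false);
How many times does this loop run? It takes a moment to work out (once: continue jumps to the condition, which is false). A do/while can almost always be written another way, so avoid it.
Return early
public boolean Contains(String str, String substr) {
    if (str == null || substr == null) return false;
    if (substr.equals("")) return true;
    ...
}
Returning from a function as early as possible makes its logic clearer.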
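The same guard-clause pattern in Python, as a minimal sketch. The function name `contains` and the decision to treat `None` inputs as "no match" mirror the Java fragment above; the final line is my own completion, since the original elides the rest of the body.

```python
def contains(text, substr):
    """Return True if substr occurs in text; None inputs never match."""
    if text is None or substr is None:
        return False  # guard clause: nothing to search
    if substr == "":
        return True   # guard clause: the empty string is always contained
    return substr in text  # main logic, reached only for ordinary inputs
```

Once the special cases have returned, the remaining code can assume well-formed inputs and needs no extra nesting.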
Minimize nesting
if (user_result == SUCCESS) {
    if (permission_result != SUCCESS) {
        reply.WriteErrors("error reading permissions");
        reply.Done();
        return;
    }
    reply.WriteErrors("");
} else {
    reply.WriteErrors(user_result);
}
reply.Done();
This code has only one extra level of nesting, yet it is still slightly confusing. Does your own code have places like this? Look at it from a different angle and apply the return-early principle, and it reads much better:
if (user_result != SUCCESS) {
    reply.WriteErrors(user_result);
    reply.Done();
    return;
}
if (permission_result != SUCCESS) {
    reply.WriteErrors(permission_result);
    reply.Done();
    return;
}
reply.WriteErrors("");
reply.Done();
The same approach works for nesting inside loops:
for (int i = 0; i < results.size(); i++) {
    if (results[i] != NULL) {
        non_null_count++;
        if (results[i]->name != "") {
            cout << "Considering candidate..." << endl;
            ...
        }
    }
}
Inside a loop, the equivalent of returning early is continue:
for (int i = 0; i < results.size(); i++) {
    if (results[i] == NULL) continue;
    non_null_count++;

    if (results[i]->name == "") continue;
    cout << "Considering candidate..." << endl;
    ...
}
Breaking down complex expressions
Clearly, the more complex an expression, the harder it is to read, so break large, complicated expressions into small, digestible pieces.
Use variables
The simplest way to break down a complex expression is to introduce an extra variable:
if line.split(':')[0].strip() == "root":
    ...
# Replace the expression with a variable
username = line.split(':')[0].strip()
if username == "root":
    ...
Or take this example:
if (request.user.id == document.owner_id) {
    // user can edit this document...
}
...
if (request.user.id != document.owner_id) {
    // document is read-only...
}

// Replace the expression with a variable
final boolean user_owns_document = (request.user.id == document.owner_id);
if (user_owns_document) {
    // user can edit this document...
}
...
if (!user_owns_document) {
    // document is read-only...
}
Rewriting logic with De Morgan's laws
- 1) not (a or b or c) <=> (not a) and (not b) and (not c)
- 2) not (a and b and c) <=> (not a) or (not b) or (not c)
So you can rewrite:
if (!(file_exists && !is_protected)) Error("Sorry, could not read file.");
// as
if (!file_exists || is_protected) Error("Sorry, could not read file.");
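Both laws, and the file example, can be checked exhaustively because there are only a handful of Boolean combinations. The following Python sketch does that; the function names `de_morgan_holds` and `cannot_read` are hypothetical and exist only for this illustration.

```python
from itertools import product


def de_morgan_holds():
    """Check both laws for every combination of three Boolean values."""
    for a, b, c in product([False, True], repeat=3):
        if (not (a or b or c)) != ((not a) and (not b) and (not c)):
            return False
        if (not (a and b and c)) != ((not a) or (not b) or (not c)):
            return False
    return True


def cannot_read(file_exists, is_protected):
    # Simplified form of: not (file_exists and not is_protected)
    return (not file_exists) or is_protected
```

An exhaustive check like this is a cheap way to gain confidence before replacing a negated compound condition in real code.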
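Don't abuse Boolean expressions
assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());
Code like this can be replaced with the following. It takes more lines, but is easier to understand:
bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());
Expressions like the one below are also best used with care, because in some languages x is assigned the value of the first operand that is truthy rather than a plain Boolean:
x = a || b || c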
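Python is one of the languages where this happens: its `or` operator returns one of its operands, not necessarily `True` or `False`. A short sketch with made-up values:

```python
a, b, c = 0, "", "fallback"

x = a or b or c          # x is "fallback": the first truthy operand, not True
y = 0 or "" or None      # y is None: the last operand, when none is truthy

# When a Boolean is what the code really means, say so explicitly.
any_set = bool(a or b or c)
```

If the reader is expected to know this idiom (for example for default values), it can be fine; otherwise the explicit form avoids a surprise.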
Breaking down giant statements
var update_highlight = function (message_num) {
    if ($("#vote_value" + message_num).html() === "Up") {
        $("#thumbs_up" + message_num).addClass("highlighted");
        $("#thumbs_down" + message_num).removeClass("highlighted");
    } else if ($("#vote_value" + message_num).html() === "Down") {
        $("#thumbs_up" + message_num).removeClass("highlighted");
        $("#thumbs_down" + message_num).addClass("highlighted");
    } else {
        $("#thumbs_up" + message_num).removeClass("highlighted");
        $("#thumbs_down" + message_num).removeClass("highlighted");
    }
};
There is a lot of repetition here, which a few variables can remove:
var update_highlight = function (message_num) {
    var thumbs_up = $("#thumbs_up" + message_num);
    var thumbs_down = $("#thumbs_down" + message_num);
    var vote_value = $("#vote_value" + message_num).html();
    var hi = "highlighted";

    if (vote_value === "Up") {
        thumbs_up.addClass(hi);
        thumbs_down.removeClass(hi);
    } else if (vote_value === "Down") {
        thumbs_up.removeClass(hi);
        thumbs_down.addClass(hi);
    } else {
        thumbs_up.removeClass(hi);
        thumbs_down.removeClass(hi);
    }
};
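Variables and readability
Eliminating variables
The previous section used variables to break down large expressions; this one is about eliminating variables that are not needed.
Useless temporary variables
now = datetime.datetime.now()
root_message.last_view_time = now
now can be removed here because:
- it does not break down a complex expression;
- it adds no clarity, since `datetime.datetime.now()` is already clear;
- it is used only once.
So the code can simply be written as:
root_message.last_view_time = datetime.datetime.now()
Eliminating control-flow variables
boolean done = false;
while (/* condition */ && !done) {
    ...
    if (...) {
        done = true;
        continue;
    }
}
What done does here is better achieved another way:
while (/* condition */) {
    ...
    if (...) {
        break;
    }
}
This example is easy to fix. With more deeply nested loops a single break may not be enough; in that case, move the code into a function.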
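For the nested-loop case, wrapping the loops in a function lets a single `return` leave every level at once, with no flag variable. A minimal Python sketch, using a hypothetical `find_first_negative` helper and a made-up grid:

```python
def find_first_negative(grid):
    """Return (row, col) of the first negative number, or None if absent."""
    for row_index, row in enumerate(grid):
        for col_index, value in enumerate(row):
            if value < 0:
                return (row_index, col_index)  # exits both loops at once
    return None


position = find_first_negative([[1, 2], [3, -4], [-5, 6]])  # (1, 1)
```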
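Shrink the scope of your variables
We have all heard the advice to avoid global variables. The larger a variable's scope, the harder it is to keep track of, so keep scopes small.
class LargeClass {
    string str_;

    void Method1() {
        str_ = ...;
        Method2();
    }

    void Method2() {
        // Uses str_
    }

    // Lots of other methods that don't use str_ ...
};
The scope of str_ is larger than it needs to be. It can be rewritten like this:
class LargeClass {
    void Method1() {
        string str = ...;
        Method2(str);
    }

    void Method2(string str) {
        // Uses str
    }

    // Now other methods can't see str.
};
Passing str as a function argument shrinks its scope and makes the code easier to read. The same reasoning applies to class design: split a large class into smaller ones.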
Do not rely on nested scopes
# No use of example_value up to this point.
if request:
    for value in request.values:
        if value > 0:
            example_value = value
            break

for logger in debug.loggers:
    logger.log("Example:", example_value)
This code can fail at run time with "example_value is undefined" whenever request is empty or contains no positive value. The fix is not hard:
example_value = None

if request:
    for value in request.values:
        if value > 0:
            example_value = value
            break

if example_value:
    for logger in debug.loggers:
        logger.log("Example:", example_value)
But following the earlier guideline about eliminating intermediate variables, there is a better way:
def LogExample(value):
    for logger in debug.loggers:
        logger.log("Example:", value)

if request:
    for value in request.values:
        if value > 0:
            LogExample(value)  # deal with 'value' immediately
            break
Declare variables where they are used
Older versions of C (before C99) required all variables to be declared at the top of a block. With many variables, that forces the reader to take them all in at once, so delay declaring a variable until just before it is first used:
def ViewFilteredReplies(original_id):
    filtered_replies = []
    root_message = Messages.objects.get(original_id)
    all_replies = Messages.objects.select(root_id=original_id)

    root_message.view_count += 1
    root_message.last_view_time = datetime.datetime.now()
    root_message.save()

    for reply in all_replies:
        if reply.spam_votes <= MAX_SPAM_VOTES:
            filtered_replies.append(reply)

    return filtered_replies
That is too many variables for the reader to hold at once. Move each definition down to where it is needed:
def ViewFilteredReplies(original_id):
    root_message = Messages.objects.get(original_id)
    root_message.view_count += 1
    root_message.last_view_time = datetime.datetime.now()
    root_message.save()

    all_replies = Messages.objects.select(root_id=original_id)
    filtered_replies = []
    for reply in all_replies:
        if reply.spam_votes <= MAX_SPAM_VOTES:
            filtered_replies.append(reply)

    return filtered_replies
Prefer write-once variables
We have seen that too many variables confuse the reader. A single variable that is assigned over and over is just as dizzying; the fewer times a variable changes, the more readable the code.
An example
Suppose a page like the one below, where we need to fill in the first empty input:
<input type="text" id="input1" value="Dustin">
<input type="text" id="input2" value="Trevor">
<input type="text" id="input3" value="">
<input type="text" id="input4" value="Melissa">
...
var setFirstEmptyInput = function (new_value) {
    var found = false;
    var i = 1;
    var elem = document.getElementById('input' + i);
    while (elem !== null) {
        if (elem.value === '') {
            found = true;
            break;
        }
        i++;
        elem = document.getElementById('input' + i);
    }
    if (found) elem.value = new_value;
    return elem;
};
The code works and uses three variables. Let's improve them one at a time. found is an intermediate variable and can be eliminated entirely:
var setFirstEmptyInput = function (new_value) {
    var i = 1;
    var elem = document.getElementById('input' + i);
    while (elem !== null) {
        if (elem.value === '') {
            elem.value = new_value;
            return elem;
        }
        i++;
        elem = document.getElementById('input' + i);
    }
    return null;
};
Next, elem: it is reassigned on every iteration, which makes its value hard to follow, and the handling of i can be expressed as a for loop:
var setFirstEmptyInput = function (new_value) {
    for (var i = 1; true; i++) {
        var elem = document.getElementById('input' + i);
        if (elem === null)
            return null;  // Search Failed. No empty input found.

        if (elem.value === '') {
            elem.value = new_value;
            return elem;
        }
    }
};
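Reorganizing your code
Extracting unrelated subproblems
Engineering is about breaking a big problem into small ones and solving them one by one, which also helps keep the program robust and readable. Some guidelines for separating out subproblems:
- Look at a function or block of code and ask yourself, "What is the high-level goal of this code?"
- For each line, ask, "Does it work directly toward that goal, or is it solving an unrelated subproblem?"
- If enough lines are solving an unrelated subproblem, extract them into a separate function.
Here is an example:
ajax_post({
    url: 'http://example.com/submit',
    data: data,
    on_success: function (response_data) {
        var str = "{\n";
        for (var key in response_data) {
            str += "  " + key + " = " + response_data[key] + "\n";
        }
        alert(str + "}");

        // Continue handling 'response_data' ...
    }
});
The goal of this code is to make an Ajax call and handle the response, so the string-formatting part can be pulled out:
var format_pretty = function (obj) {
    var str = "{\n";
    for (var key in obj) {
        str += "  " + key + " = " + obj[key] + "\n";
    }
    return str + "}";
};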
Unexpected benefits
There are many good reasons to extract format_pretty: a standalone function is easy to extend with features, harden, and teach about edge cases. Here is an improved format_pretty that is considerably more capable:
var format_pretty = function (obj, indent) {
    // Handle null, undefined, strings, and non-objects.
    if (obj === null) return "null";
    if (obj === undefined) return "undefined";
    if (typeof obj === "string") return '"' + obj + '"';
    if (typeof obj !== "object") return String(obj);

    if (indent === undefined) indent = "";

    // Handle (non-null) objects.
    var str = "{\n";
    for (var key in obj) {
        str += indent + "  " + key + " = ";
        str += format_pretty(obj[key], indent + "  ") + "\n";
    }
    return str + indent + "}";
};
For a nested object, this function prints:
{
  key1 = 1
  key2 = true
  key3 = undefined
  key4 = null
  key5 = {
    key5a = {
      key5a1 = "hello world"
    }
  }
}
Doing this regularly builds up a body of reusable code, which can become your own library or be shared with others.
Project-specific functionality
Code unrelated to the main goal can be extracted and reused; business-specific code can be extracted too, to keep things readable. For example:
business = Business()
business.name = request.POST["name"]

url_path_name = business.name.lower()
url_path_name = re.sub(r"['\.]", "", url_path_name)
url_path_name = re.sub(r"[^a-z0-9]+", "-", url_path_name)
url_path_name = url_path_name.strip("-")
business.url = "/biz/" + url_path_name

business.date_created = datetime.datetime.utcnow()
business.save_to_database()
With the URL logic extracted, it reads much better:
import datetime
import re

CHARS_TO_REMOVE = re.compile(r"['\.']+")
CHARS_TO_DASH = re.compile(r"[^a-z0-9]+")

def make_url_friendly(text):
    text = text.lower()
    text = CHARS_TO_REMOVE.sub('', text)
    text = CHARS_TO_DASH.sub('-', text)
    return text.strip("-")

business = Business()
business.name = request.POST["name"]
business.url = "/biz/" + make_url_friendly(business.name)
business.date_created = datetime.datetime.utcnow()
business.save_to_database()
Simplifying an existing interface
Consider this code, which reads a value from a cookie:
var max_results;
var cookies = document.cookie.split(';');
for (var i = 0; i < cookies.length; i++) {
    var c = cookies[i];
    c = c.replace(/^[ ]+/, '');  // remove leading spaces
    if (c.indexOf("max_results=") === 0)
        max_results = Number(c.substring(12, c.length));
}
This code is really ugly. An ideal interface would look more like this:
set_cookie(name, value, days_to_expire);
delete_cookie(name);
When an interface is less than ideal, you can always wrap it in your own functions to make it easier to use.
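As an illustration of such a wrapper, here is a Python sketch of the reading side. Everything in it is an assumption for demonstration: `get_cookie` is a hypothetical helper, it parses a plain "name=value; name=value" string passed in as an argument rather than touching a browser, and the sample cookie string is made up.

```python
def get_cookie(cookie_string, name):
    """Return the value of cookie `name`, or None if it is not present."""
    for part in cookie_string.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key == name:
            return value
    return None


cookie_string = "theme=dark; max_results=25; lang=en"  # made-up sample data
max_results = int(get_cookie(cookie_string, "max_results"))
```

The splitting, trimming, and prefix matching now live in one place, and callers only state which cookie they want.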
Shaping an interface to your needs
user_info = { "username": "...", "password": "..." }
user_str = json.dumps(user_info)
cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
encrypted_bytes = cipher.update(user_str)
encrypted_bytes += cipher.final()  # flush out the current 128 bit block
url = "http://example.com/?user_info=" + base64.urlsafe_b64encode(encrypted_bytes)
...
Although the ultimate goal is to build a URL that carries the user info, most of this code is about turning a Python object into a URL-safe encrypted string, so extract that part:
def url_safe_encrypt(obj):
    obj_str = json.dumps(obj)
    cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
    encrypted_bytes = cipher.update(obj_str)
    encrypted_bytes += cipher.final()  # flush out the current 128 bit block
    return base64.urlsafe_b64encode(encrypted_bytes)
It can now be called from other places as well:
user_info = { "username": "...", "password": "..." }
url = "http://example.com/?user_info=" + url_safe_encrypt(user_info)
Extracting helper functions is a good habit, but do it in moderation: splitting code into too many tiny functions makes it harder to find your way around.
One task at a time
Code should do only one task at a time.
var place = location_info["LocalityName"];  // e.g. "Santa Monica"
if (!place) {
    place = location_info["SubAdministrativeAreaName"];  // e.g. "Los Angeles"
}
if (!place) {
    place = location_info["AdministrativeAreaName"];  // e.g. "California"
}
if (!place) {
    place = "Middle-of-Nowhere";
}
if (location_info["CountryName"]) {
    place += ", " + location_info["CountryName"];  // e.g. "USA"
} else {
    place += ", Planet Earth";
}
return place;
This function builds a place name. It is full of conditionals and hard to read. Can the tasks be separated?
var town    = location_info["LocalityName"];               // e.g. "Santa Monica"
var city    = location_info["SubAdministrativeAreaName"];  // e.g. "Los Angeles"
var state   = location_info["AdministrativeAreaName"];     // e.g. "CA"
var country = location_info["CountryName"];                // e.g. "USA"
The first task is to extract each value into its own variable, so that the rest of the code no longer has to remember those long keys. The second task is to work out the second half of the place name:
// Start with the default, and keep overwriting with the most specific value.
var second_half = "Planet Earth";
if (country) {
    second_half = country;
}
if (state && country === "USA") {
    second_half = state;
}
Then the first half:
var first_half = "Middle-of-Nowhere";
if (state && country !== "USA") {
    first_half = state;
}
if (city) {
    first_half = city;
}
if (town) {
    first_half = town;
}
And finally:
return first_half + ", " + second_half;
Noticing that the logic branches on whether the country is "USA", you could also write:
var first_half, second_half;
if (country === "USA") {
    first_half = town || city || "Middle-of-Nowhere";
    second_half = state || "USA";
} else {
    first_half = town || city || state || "Middle-of-Nowhere";
    second_half = country || "Planet Earth";
}
return first_half + ", " + second_half;
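The final version translates almost line for line into Python, which makes it easy to try against a few inputs. This is a sketch under assumptions: `format_place` is a hypothetical name, `location_info` is a plain dict, and the sample inputs in the usage lines are made up.

```python
def format_place(location_info):
    """Build a "first_half, second_half" place name from a location dict."""
    # Task 1: pull the values out of the dict.
    town = location_info.get("LocalityName")
    city = location_info.get("SubAdministrativeAreaName")
    state = location_info.get("AdministrativeAreaName")
    country = location_info.get("CountryName")

    # Task 2: choose each half, most specific value first.
    if country == "USA":
        first_half = town or city or "Middle-of-Nowhere"
        second_half = state or "USA"
    else:
        first_half = town or city or state or "Middle-of-Nowhere"
        second_half = country or "Planet Earth"
    return first_half + ", " + second_half


us_place = format_place({"LocalityName": "Santa Monica",
                         "AdministrativeAreaName": "CA",
                         "CountryName": "USA"})
unknown_place = format_place({})
```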
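Turning thoughts into code
When you explain something complicated to another person, details easily cause confusion. So imagine explaining your code to someone in plain language: would they understand? A few guidelines help make code clearer:
- Describe what the code needs to do in plain language, as you would to a colleague.
- Pay attention to the key words and phrases in that description.
- Write the code to match the description.
The following code checks whether a user is authorized:
$is_admin = is_admin_request();
if ($document) {
    if (!$is_admin && ($document['username'] != $_SESSION['username'])) {
        return not_authorized();
    }
} else {
    if (!$is_admin) {
        return not_authorized();
    }
}
// continue rendering the page ...
The code is short, but its nested logic is complicated, and as earlier sections showed, nesting gets in the way of understanding. Described in plain language, the logic is:
There are two ways to be authorized:
1) you are an admin
2) you own the document
Otherwise, you are not authorized.
Now write the code to match the description:
if (is_admin_request()) {
    // authorized
} elseif ($document && ($document['username'] == $_SESSION['username'])) {
    // authorized
} else {
    return not_authorized();
}
// continue rendering the page ...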
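To convince yourself that the rewritten logic behaves the same as the nested original, both can be expressed as small pure functions and compared over every case. This Python sketch is illustrative only: the function names, the dict-shaped document, and the sample usernames are assumptions, not part of the original PHP.

```python
def is_authorized_nested(is_admin, document, session_username):
    """Mirror of the original nested logic."""
    if document:
        if not is_admin and document["username"] != session_username:
            return False
    else:
        if not is_admin:
            return False
    return True


def is_authorized(is_admin, document, session_username):
    """Matches the plain-language description: admin, or owner of the document."""
    if is_admin:
        return True
    if document and document["username"] == session_username:
        return True
    return False


# Every combination of admin flag and document state (made-up usernames).
cases = [(admin, doc) for admin in (False, True)
         for doc in (None, {"username": "alice"}, {"username": "bob"})]
same = all(is_authorized(a, d, "alice") == is_authorized_nested(a, d, "alice")
           for a, d in cases)
```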
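Writing less code
The most readable code is no code at all!
- Cut features that add no value, and do not over-engineer.
- Rethink the requirements so that you solve the simplest version of the problem that still meets the overall goal.
- Know the libraries you use regularly, and periodically read through their APIs.
Conclusion
The book also has chapters on testing, which I leave for you to read yourself. Once again, I recommend the book:
- English edition: The Art of Readable Code
- Chinese edition: 編寫可讀程式碼的藝術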