從原始碼解析ERROR 1129 (HY000):Host is blocked because of many connection errors

gaopengtttt發表於2017-06-01
原創水平有限,有錯請指出

原始碼版本5.7.14

今天群裡一個朋友出現如下錯誤:
ERROR 1129 (HY000): Host '10.0.0.8' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'
為了找到這個問題原因,首先在原始碼中找到錯誤碼

點選(此處)摺疊或開啟

  1. { "ER_HOST_IS_BLOCKED", 1129, "Host \'%-.64s\' is blocked because of many connection errors; unblock with \'mysqladmin flush-hosts\'" }
然後找到拋錯地點如下:
位於
sql_connect.cc下的check_connection()函式的

點選(此處)摺疊或開啟

  1. if (!(specialflag & SPECIAL_NO_RESOLVE))
  2.     {
  3.       int rc;
  4.       char *host;
  5.       LEX_CSTRING main_sctx_host;

  6.       rc= ip_to_hostname(&net->vio->remote,
  7.                          main_sctx_ip.str,
  8.                          &host, &connect_errors); //接受ip_to_hostname的返回值到rc
  9.       ......
  10.       if (rc == RC_BLOCKED_HOST)//判斷rc是否為RC_BLOCKED_HOST 1
  11.       {
  12.         /* HOST_CACHE stats updated by ip_to_hostname(). */
  13.         my_error(ER_HOST_IS_BLOCKED, MYF(0),
  14.                  thd->m_main_security_ctx.host_or_ip().str);
  15.         return 1;
  16.       }
  17.     }
這裡如果如果rc == RC_BLOCKED_HOST RC_BLOCKED_HOST是一個宏定義為1
#define RC_BLOCKED_HOST 1
為真則拋錯,接下來我們需要看rc是函式ip_to_hostname的返回值
位於hostname.cc 的ip_to_hostname函式中

點選(此處)摺疊或開啟

  1. if (!(specialflag & SPECIAL_NO_HOST_CACHE))
  2.   {
  3.     mysql_mutex_lock(&hostname_cache->lock);//這裡注意一下整個在cache中查詢的過程是有MUTEX的
  4.     Host_entry *entry= hostname_cache_search(ip_key);//在cache中查詢
  5.    
  6.     if (entry)
  7.     {
  8.       entry->m_last_seen= now;
  9.       *connect_errors= entry->m_errors.m_connect;
  10.       if (entry->m_errors.m_connect >= max_connect_errors) //max_connect_errors就是引數我們設定的引數
  11.       {
  12.         entry->m_errors.m_host_blocked++;
  13.         entry->set_error_timestamps(now);
  14.         mysql_mutex_unlock(&hostname_cache->lock);//這裡解鎖
  15.         DBUG_RETURN(RC_BLOCKED_HOST);
  16.       }
到這裡我們找到了和這個報錯相關的一些事實:
1、max_connect_errors和這個報錯有關
2、SPECIAL_NO_RESOLVE和這個報錯有關
3、SPECIAL_NO_HOST_CACHE和這個報錯有關

max_connect_errors引數不用再解釋,如果!(specialflag & SPECIAL_NO_RESOLVE)返回為假則不會呼叫
ip_to_hostname做IP域名反解析,如果!(specialflag & SPECIAL_NO_HOST_CACHE)返回為假則不會呼叫if
以後報錯的內容。換句話說這裡的位與後然後取反關係到了這個報錯的是否觸發,那麼我們就用必要看看
specialflag和SPECIAL_NO_RESOLVE以及SPECIAL_NO_HOST_CACHE的關係了。
下面是原始碼引數
MYSQLD.CC

點選(此處)摺疊或開啟

  1. case (int) OPT_SKIP_HOST_CACHE:
  2.     opt_specialflag|= SPECIAL_NO_HOST_CACHE;
  3.     break;
  4.   case (int) OPT_SKIP_RESOLVE:
  5.     opt_skip_name_resolve= 1;
  6.     opt_specialflag|=SPECIAL_NO_RESOLVE;
  7.     break;
其實這裡是受到OPT_SKIP_HOST_CACHE、OPT_SKIP_RESOLVE的控制就是我們的引數
skip-host-cache
skip-name-resolve
那麼我們起碼能夠透過skip-host-cache和skip-name-resolve來解決問題,但是這也帶來一個問題
建立使用者的時候將不能使用域名,如果有UNIX網路程式設計基礎的朋友應該知道客戶端透過socket連線
服務端得到的是IP地址和客戶端的埠如果一旦關閉IP->DNS的反解析,user@domain這種使用者將是
不能連線的,只能是user@ip,這點異常重要。本來是可以連線的:

 create user testuuu@'test' identified by '123';
 /etc/hosts設定為:
 192.168.190.60 test
 
root@test1 ~]# mysql -utestuuu -p123 -h192.168.190.93 -P13001
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 5.7.14-7-debug-log Source distribution

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> exit

但是設定skip-host-cache、skip-name-resolve後
[root@test1 ~]# mysql -utestuuu -p123 -h192.168.190.93 -P13001
Warning: Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'testuuu'@'192.168.190.60' (using password: YES)
明顯這裡沒有透過192.168.190.60去做反解析.

其實關於很多解析的報錯都是在函式ip_to_hostname函式中,如果skip-name-resolve將不會發生這些
包含如下:
 sql_print_warning("IP address '%s' could not be resolved: %s",ip_key,gai_strerror(err_code));
 sql_print_warning("IP address '%s' has been resolved " "to the host name '%s', which resembles " "IPv4-address itself.",ip_key,hostname_buffer);
 sql_print_warning("Host name '%s' could not be resolved: %s",hostname_buffer,gai_strerror(err_code));
 sql_print_warning("Hostname '%s' does not resolve to '%s'.",hostname_buffer,ip_key);
 如果遇到問題類似問題緊急情況下先設定skip-name-resolve再說。

注意:
1、
在整個解析期間可能還會出現下面的使用者
unauthenticated user
也就是TCP/IP 握手已經成功,資料已經開始互動,執行緒已經建立,但是還沒有透過MYSQL 許可權認證的使用者(不知這句結論正確與否,自我理解)。

mysql> show processlist;
+----+----------------------+----------------------+------+---------+------+----------+------------------+-----------+---------------+
| Id | User                 | Host                 | db   | Command | Time | State    | Info             | Rows_sent | Rows_examined |
+----+----------------------+----------------------+------+---------+------+----------+------------------+-----------+---------------+
|  5 | unauthenticated user | 192.168.190.60:12770 | NULL | Connect |   35 | login    | NULL             |         0 |             0 |
|  6 | root                 | localhost            | NULL | Query   |    0 | starting | show processlist |         0 |             0 |
+----+----------------------+----------------------+------+---------+------+----------+------------------+-----------+---------------+
如果出現這種使用者優先考慮一下是否是DNS解析緩慢問題,如開啟了mysql 反解析,沒有設定合適的/etc/hosts
2、測試期間,未關閉skip-name-resolve首先使用了user@ip進行了連線,然後刪除使用者建立一個user@domain的使用者
   並且設定了/etc/hosts,這個時候客戶端連線不上服務端,一直報密碼不對,這個問題flush hosts後解決,這是
   因為最開始使用IP反解析的時候得到的域名和我設定的/etc/hosts的域名不一致,雖然得到了正確的IP,但是在
   host cache中,得到的domain並不一致,也就是說本來是username@domain1被反解析為了username@domain2這種
   情況當然在mysql.user中找不到相應的許可權使用者記錄了,這個時候flush hosts後清空了host cache,得以重新
   生成相應的host cache後正常,這點在前面的程式碼
   Host_entry *entry= hostname_cache_search(ip_key);//在cache中查詢
   也有體現。
3、select * from performance_schema.host_cache 這裡記錄了IP->DOMAIN的類容
如下:
mysql> select * from performance_schema.host_cache \G;
*************************** 1. row ***************************
                                        IP: 192.168.190.60
                                      HOST: test
                            HOST_VALIDATED: YES
..................
                                FIRST_SEEN: 2017-05-31 17:17:40
                                 LAST_SEEN: 2017-05-31 17:17:40
                          FIRST_ERROR_SEEN: NULL
                           LAST_ERROR_SEEN: NULL
4、反解析的作用
如前面所講述,反解析IP->DOMAIN的作用就在於,在建立如下使用者的時候
create user testuuu@'test' identified by '123';
的時候MYSQL能夠透過ip地址判斷出他的許可權資訊,因為在MYSQL.USER中儲存的是域名(DOMAIN),而為了加速反解析的速度,而有了HOST CACHE
那就是
第一次連線
1、拿到客戶端IP地址(socket連線客戶端IP地址資訊)
2、進行DNS反解析 (如/etc/hosts)
3、儲存反解析資訊到host cache
4、從host cache中拿到這個反解析出來的domain進行許可權驗證
如果不是第一次連線跳過第二步,也正是因為跳過了第二步,產生了注意2中的問題
flush host就是來清理host cache從而解決這種問題,重新進行DNS反解析                        
其實整個定位問題的過程還是比較簡單,但是我至今沒有找到entry->m_errors.m_connect什麼時候增加,而且原始碼中
還有很多地方 沒有理解由於時間原因我沒有去仔細看(因為這要消耗很多很多的時間),這裡只能拋磚了,同時也記錄
了我的分析過程。
   
下面是MYSQL官方手冊的解釋:   
9.12.6.2 DNS Lookup Optimization and the Host Cache   
The server handles entries in the host cache like this:
1. When the first TCP client connection reaches the server from a given IP address, a new entry is
created to record the client IP, host name, and client lookup validation flag. Initially, the host name is
set to NULLand the flag is false. This entry is also used for subsequent client connections from the
same originating IP.
2. If the validation flag for the client IP entry is false, the server attempts an IP-to-host name DNS
resolution. If that is successful, the host name is updated with the resolved host name and the
validation flag is set to true. If resolution is unsuccessful, the action taken depends on whether the error
is permanent or transient. For permanent failures, the host name remains NULLand the validation flag
is set to true. For transient failures, the host name and validation flag remain unchanged. (Another DNS
resolution attempt occurs the next time a client connects from this IP.)
3. If an error occurs while processing an incoming client connection from a given IP address, the server
updates the corresponding error counters in the entry for that IP. For a description of the errors
recorded, see Section 23.9.16.1, “The host_cache Table”.
The server uses the host cache for several purposes:
? By caching the results of IP-to-host name lookups, the server avoids doing a DNS lookup for each client
connection. Instead, for a given host, it needs to perform a lookup only for the first connection from that
host.
? The cache contains information about errors that occur during the connection process. Some errors are
considered “blocking.”If too many of these occur successively from a given host without a successful
connection, the server blocks further connections from that host. The max_connect_errorssystem
variable determines the number of permitted errors before blocking occurs. See Section B.5.2.6, “Host
'host_name' is blocked”.
To unblock blocked hosts, flush the host cache by issuing a FLUSH HOSTSstatement or executing a
mysqladmin flush-hostscommand.

作者微信:

               


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/7728585/viewspace-2140143/,如需轉載,請註明出處,否則將追究法律責任。

相關文章