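Problem description
According to the pricing-tier documentation, Azure Cache for Redis supports up to 40,000 concurrent connections, depending on the tier. However, when a large number of connection requests arrive within a short time, the load on the Redis server becomes very high and reaches 100%, which seriously hurts the requirement for fast responses. In the worst cases, Redis client "Operation timeout" errors occur frequently.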
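Problem analysis
By design, Redis uses only a single thread for command processing. Azure Cache for Redis also uses additional cores for I/O processing. Having more cores may not produce linear scaling, but it does improve throughput. In addition, larger VMs typically have higher bandwidth limits than smaller ones, which helps avoid network saturation and the application timeouts it causes.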
- Basic and Standard caches
  - C0 (250 MB) cache - up to 256 connections
  - C1 (1 GB) cache - up to 1,000 connections
  - C2 (2.5 GB) cache - up to 2,000 connections
  - C3 (6 GB) cache - up to 5,000 connections
  - C4 (13 GB) cache - up to 10,000 connections
  - C5 (26 GB) cache - up to 15,000 connections
  - C6 (53 GB) cache - up to 20,000 connections
- Premium caches
  - P1 (6 GB - 60 GB) - up to 7,500 connections
  - P2 (13 GB - 130 GB) - up to 15,000 connections
  - P3 (26 GB - 260 GB) - up to 30,000 connections
  - P4 (53 GB - 530 GB) - up to 40,000 connections
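Although each cache size allows up to a certain number of connections, every connection to Redis carries its own overhead. One example of such overhead is the CPU and memory used for TLS/SSL encryption. The maximum connection limit for a given cache size assumes a lightly loaded cache. If the load from connection overhead plus the load from client operations exceeds the capacity of the system, the cache can run into capacity problems even though the connection limit for its size has not been exceeded.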
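Solution
Enable connection pooling and reuse connections. Creating a new connection is an expensive operation that adds latency, so reuse connections whenever possible. If you do choose to create new connections, make sure to close the old connections before releasing them (even in managed-memory languages such as .NET or Java).
Avoid expensive operations - some Redis operations, such as the KEYS command, are very expensive and should be avoided. For details, see the considerations around long-running commands.
If the connection and performance requirements exceed the limits of a single server, consider using Redis Cluster (clustering, with an increased shard count).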
Connection pool examples
1. Jedis connection pool settings
    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPool;
    import redis.clients.jedis.JedisPoolConfig;

    .................
    // boolean useSsl = true;
    String cacheHostname = "xxxx.redis.cache.chinacloudapi.cn";
    String cachekey = " Key";
    JedisPool jdspool = getPool(cacheHostname, 6379, cachekey);

    // try-with-resources returns the connection to the pool even if the command fails
    try (Jedis jedis1 = jdspool.getResource()) {
        System.out.println("Cache Response : " + jedis1.set("Message-1", "hello pool"));
    }

    String msg = "Hello! The cache is working from Java!";
    for (int i = 0; i < 10000; i++) {
        // Borrow a connection for each operation and always give it back
        try (Jedis jedis = jdspool.getResource()) {
            System.out.println("Cache Response : " + jedis.set("Message-Java-" + i, msg + i));
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
    ....................

    /**
     * Get the connection pool.
     *
     * @return the connection pool instance
     */
    public static JedisPool getPool(String ip, int port, String cachekey) {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxIdle(20);
        config.setMaxTotal(20);
        config.setMinIdle(10);
        config.setMaxWaitMillis(2000);
        config.setTestOnBorrow(true);
        config.setTestOnReturn(true);
        JedisPool pool = null;
        try {
            /*
             * If you run into java.net.SocketTimeoutException: Read timed out,
             * try setting your own timeout when constructing the JedisPool.
             * The default JedisPool timeout is 2 seconds (the argument is in milliseconds).
             */
            pool = new JedisPool(config, ip, port, 2000, cachekey);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return pool;
    }
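Note: the key settings are setMaxTotal and setMaxIdle. maxTotal is the maximum number of connections in the pool, while maxIdle is the maximum number of connections allowed to stay idle so that a Jedis connection is always ready to be handed out. Ideally, maxTotal and maxIdle are set to the same value.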
| Parameter | Description | Default value | Recommended settings |
| --- | --- | --- | --- |
| maxTotal | The maximum number of connections that are supported by the pool. | 8 | For more information, see Recommended settings. |
| maxIdle | The maximum number of idle connections in the pool. | 8 | For more information, see Recommended settings. |
| minIdle | The minimum number of idle connections in the pool. | 0 | For more information, see Recommended settings. |
| blockWhenExhausted | Specifies whether the client must wait when the resource pool is exhausted. Only when this parameter is set to true, the maxWaitMillis parameter takes effect. | true | We recommend that you use the default value. |
| maxWaitMillis | The maximum number of milliseconds that the client must wait when no connection is available. | -1 (the connection never times out) | We recommend that you do not use the default value. |
| testOnBorrow | Specifies whether to validate connections by using the PING command before the connections are borrowed from the pool. Invalid connections are removed from the pool. | false | We recommend that you set this parameter to false when the workload is heavy. This allows you to reduce the overhead of a ping test. |
| testOnReturn | Specifies whether to validate connections by using the PING command before the connections are returned to the pool. Invalid connections are removed from the pool. | false | We recommend that you set this parameter to false when the workload is heavy. This allows you to reduce the overhead of a ping test. |
| jmxEnabled | Specifies whether to enable Java Management Extensions (JMX) monitoring. | true | We recommend that you enable JMX monitoring. Take note that you must also enable the fe |

Recommended settings (https://partners-intl.aliyun.com/help/doc-detail/98726.htm)
maxTotal: The maximum number of connections.
To set a proper value of maxTotal, take note of the following factors:
- The expected concurrent connections based on your business requirements.
- The amount of time that is consumed by the client to run the command.
- The limit of Redis resources. For example, if you multiply maxTotal by the number of nodes (ECS instances), the product must be smaller than the supported maximum number of connections in Redis. You can view the maximum connections on the Instance Information page in the ApsaraDB for Redis console.
- The cost of creating and releasing connections. If a single request creates and releases a large number of connections, that overhead adversely affects the request itself.
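The third factor above (maxTotal multiplied by the number of client nodes must stay below the server's connection limit) is simple arithmetic, and a small check like the following can catch an oversized pool early. This is only a sketch: the class and method names are made up for illustration, the limits are copied from the Azure tier list earlier in this post, and the 40-node deployment is a hypothetical example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Checks whether all client pools together stay under a tier's connection limit. */
final class ConnectionBudget {
    // Connection limits copied from the tier list earlier in this post.
    static final Map<String, Integer> TIER_LIMITS = new LinkedHashMap<>();
    static {
        TIER_LIMITS.put("C0", 256);
        TIER_LIMITS.put("C1", 1_000);
        TIER_LIMITS.put("C2", 2_000);
        TIER_LIMITS.put("C3", 5_000);
        TIER_LIMITS.put("C4", 10_000);
        TIER_LIMITS.put("C5", 15_000);
        TIER_LIMITS.put("C6", 20_000);
        TIER_LIMITS.put("P1", 7_500);
        TIER_LIMITS.put("P2", 15_000);
        TIER_LIMITS.put("P3", 30_000);
        TIER_LIMITS.put("P4", 40_000);
    }

    /** Worst case: every client node has filled its pool up to maxTotal. */
    static long worstCaseConnections(int maxTotal, int clientNodes) {
        return (long) maxTotal * clientNodes;
    }

    /** True when the worst case stays strictly below the tier limit. */
    static boolean fitsTier(String tier, int maxTotal, int clientNodes) {
        Integer limit = TIER_LIMITS.get(tier);
        if (limit == null) {
            throw new IllegalArgumentException("Unknown tier: " + tier);
        }
        return worstCaseConnections(maxTotal, clientNodes) < limit;
    }

    public static void main(String[] args) {
        // Hypothetical deployment: 40 application nodes, each with maxTotal = 50.
        System.out.println(worstCaseConnections(50, 40)); // 2000
        System.out.println(fitsTier("C2", 50, 40));       // false (2000 is not below 2000)
        System.out.println(fitsTier("C3", 50, 40));       // true
    }
}
```

Keep in mind that the tier limits assume a lightly loaded cache, so passing this check is necessary but not sufficient.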
For example, suppose the average time to run a command, including borrowing and returning the connection and the network overhead, is approximately 1 ms. One connection can then serve about 1 s / 1 ms = 1,000 queries per second (QPS). If the expected QPS of an individual Redis instance is 50,000 (the total QPS divided by the number of Redis shards), the theoretically required size of the resource pool (maxTotal) is 50,000 / 1,000 = 50.
However, this is only a theoretical value. To reserve some resources, the value of the maxTotal parameter can be larger than the theoretical value. However, if the value of the maxTotal parameter is too large, the connections consume a large amount of client and server resources. For Redis servers that have a high QPS, if a large number of commands are blocked, the issue cannot be solved even by a large resource pool.
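The sizing rule above can be written down as a tiny helper. The sketch below reproduces the 50,000 QPS / 1 ms example and then adds some headroom; the class name, the method names, and the 20% headroom figure are illustrative assumptions, not values from the quoted documentation.

```java
/** Back-of-the-envelope sizing for maxTotal, following the worked example above. */
final class PoolSizing {
    /**
     * Theoretical pool size: expected QPS multiplied by the average time one
     * command holds a connection (equivalent to QPS / per-connection QPS).
     */
    static int theoreticalMaxTotal(long expectedQps, double avgCommandMillis) {
        return (int) Math.ceil(expectedQps * avgCommandMillis / 1000.0);
    }

    /** Adds a reserve on top of the theoretical size, rounding up. */
    static int withHeadroom(int theoretical, int headroomPercent) {
        return theoretical + (theoretical * headroomPercent + 99) / 100;
    }

    public static void main(String[] args) {
        int theoretical = theoreticalMaxTotal(50_000, 1.0);
        System.out.println(theoretical);                   // 50
        System.out.println(withHeadroom(theoretical, 20)); // 60
    }
}
```

If monitoring shows that the average command time is closer to 2.5 ms, the same call yields 125, which shows how sensitive the result is to latency.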
maxIdle and minIdle
maxIdle is the actual maximum number of connections required by workloads. maxTotal includes the number of idle connections as a surplus. If the value of maxIdle is too small on heavily loaded systems, new Jedis connections are created to serve the requests. minIdle specifies the minimum number of established connections that must be kept in the pool. The connection pool achieves its best performance when maxTotal = maxIdle. This way, the performance is not affected by the scaling of the connection pool. We recommend that you set the maxIdle and minIdle parameters to the same value if the user traffic fluctuates. If the number of concurrent connections is small or the value of the maxIdle parameter is too large, the connection resources are wasted.
You can evaluate the size of the connection pool used by each node based on the actual total QPS and the number of clients that Redis serves.
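To see why maxTotal = maxIdle avoids pool scaling, it helps to count how many connections a pool has to create under repeated traffic bursts. The following is a deliberately simplified model, not Jedis code: it assumes that a returned connection is destroyed once maxIdle connections are already idle, and it ignores minIdle and idle eviction entirely. The class and method names are hypothetical.

```java
/** Simplified model of connection churn caused by a maxIdle smaller than maxTotal. */
final class IdleChurnModel {
    /**
     * Simulates `bursts` traffic bursts. In each burst, `burstSize` connections
     * are borrowed at once and then all returned. Returns the total number of
     * connections the pool had to create.
     */
    static int connectionsCreated(int maxTotal, int maxIdle, int burstSize, int bursts) {
        if (burstSize > maxTotal) {
            throw new IllegalArgumentException("burstSize must not exceed maxTotal");
        }
        int idle = 0;
        int created = 0;
        for (int i = 0; i < bursts; i++) {
            int reused = Math.min(idle, burstSize); // idle connections are reused first
            created += burstSize - reused;          // the rest must be newly created
            int stillIdle = idle - reused;
            // On return, only maxIdle connections are kept; the others are destroyed.
            idle = Math.min(maxIdle, stillIdle + burstSize);
        }
        return created;
    }

    public static void main(String[] args) {
        // 100 bursts of 20 concurrent requests against a pool with maxTotal = 20.
        System.out.println(connectionsCreated(20, 20, 20, 100)); // 20
        System.out.println(connectionsCreated(20, 8, 20, 100));  // 1208
    }
}
```

With maxIdle equal to maxTotal the pool is filled once and then only reused, whereas with maxIdle = 8 (the default) every burst after the first has to re-create 12 connections.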
Retrieve proper values based on monitoring data
In actual scenarios, a more reliable method is to try to retrieve optimal values based on monitoring data. You can use JMX monitoring or other monitoring tools to find proper values.
You cannot obtain resources from the resource pool in the following cases:
- Timeout:
    redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    …
    Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
        at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:449)
- Pool exhausted: when the blockWhenExhausted parameter is set to false, the time specified by borrowMaxWaitMillis is not used, and the borrowObject call does not wait; it fails immediately if no idle connection is available:
    redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    …
    Caused by: java.util.NoSuchElementException: Pool exhausted
        at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:464)
This exception may not be caused by a limited pool size. For more information, see Recommended settings. To fix this issue, we recommend that you check the network, the parameters of the resource pool, the resource pool monitoring (JMX monitoring), the code (for example, jedis.close() is not executed), slow queries, and the domain name system (DNS).
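A missing close() is the easiest of these causes to reproduce. The sketch below uses a tiny stand-in pool built on the Java standard library instead of Jedis, so TinyPool, Conn, and run are hypothetical names and the real implementation in commons-pool2 is considerably more involved. It models the two failure modes from the stack traces above: a blocking pool that gives up after maxWaitMillis, and a non-blocking pool that fails right away.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Minimal stand-in pool that shows how leaked connections exhaust a pool. */
final class TinyPool {
    /** Stand-in for a pooled connection; close() hands it back to the pool. */
    final class Conn implements AutoCloseable {
        final int id;
        Conn(int id) { this.id = id; }
        @Override public void close() { idle.offer(this); }
    }

    private final BlockingQueue<Conn> idle;
    private final boolean blockWhenExhausted;
    private final long maxWaitMillis;

    TinyPool(int maxTotal, boolean blockWhenExhausted, long maxWaitMillis) {
        this.idle = new ArrayBlockingQueue<>(maxTotal);
        this.blockWhenExhausted = blockWhenExhausted;
        this.maxWaitMillis = maxWaitMillis;
        for (int i = 0; i < maxTotal; i++) {
            idle.add(new Conn(i));
        }
    }

    Conn borrow() {
        if (!blockWhenExhausted) {
            Conn c = idle.poll(); // no waiting at all
            if (c == null) throw new NoSuchElementException("Pool exhausted");
            return c;
        }
        try {
            Conn c = idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
            if (c == null) throw new NoSuchElementException("Timeout waiting for idle object");
            return c;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a connection", e);
        }
    }

    /** Borrows `attempts` times; when `leak` is true the connections are never closed. */
    static String run(TinyPool pool, int attempts, boolean leak) {
        try {
            for (int i = 0; i < attempts; i++) {
                if (leak) {
                    pool.borrow(); // bug: the connection is never returned
                } else {
                    try (Conn c = pool.borrow()) {
                        // use the connection here; it is returned automatically
                    }
                }
            }
            return "ok";
        } catch (NoSuchElementException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new TinyPool(2, false, 0), 5, false)); // ok
        System.out.println(run(new TinyPool(2, false, 0), 5, true));  // Pool exhausted
        System.out.println(run(new TinyPool(2, true, 50), 5, true));  // Timeout waiting for idle object
    }
}
```

In both leaking runs the pool holds only two connections, so the third borrow fails no matter how large the server's connection limit is. Wrapping every borrowed Jedis instance in try-with-resources avoids this class of failure.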