java.net.BindException: Address already in use: connect

Posted by 醉面韋陀 on 2010-04-19

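Solution:
In network programming, and especially when many new connections are created within a short time, the exception java.net.BindException: Address already in use: JVM_Bind comes up frequently. Many online write-ups of this exception say that the port you want to use is already taken by another program, but that is not always the cause. After some careful searching I found several good references, which I record here one by one.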
**********************************************************************
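Article 1
  Too many new Socket operations happen in a short time.
  socket.close() does not release the bound port immediately;
  instead the port is put into the TIME_WAIT state
  and is released only after a while (240 s by default; visible with netstat -na).
  Eventually system resources are exhausted
  (on Windows, the pool of ephemeral ports, which spans 1024-5000, runs out).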
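The exhaustion described above can be put into rough numbers. The following Java sketch is a back-of-the-envelope calculation, not a measurement: it assumes that every closed connection holds its local port for the full TIME_WAIT period and that no other program takes ports from the pool. The class and method names are illustrative only.

```java
// Rough estimate of how quickly short-lived connections can be opened
// before the ephemeral port pool runs dry. Assumption: each closed
// connection keeps its local port in TIME_WAIT for the full period.
class PortBudget {
    /** Number of ports in the inclusive range [lowPort, highPort]. */
    static int poolSize(int lowPort, int highPort) {
        return highPort - lowPort + 1;
    }

    /** New connections per second that the pool can sustain indefinitely. */
    static double maxConnectionsPerSecond(int poolSize, int timeWaitSeconds) {
        return (double) poolSize / timeWaitSeconds;
    }

    public static void main(String[] args) {
        int pool = poolSize(1024, 5000);                    // range quoted in the text
        double rate = maxConnectionsPerSecond(pool, 240);   // 240 s default TIME_WAIT
        System.out.println("ports in pool: " + pool);
        System.out.println("sustainable connections per second: " + rate);
    }
}
```

With the figures from the text (ports 1024-5000, 240 s), the pool holds 3977 ports, so a client that opens more than roughly 16 to 17 new connections per second for long enough will eventually find no free port.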

RESOLUTION
Warning: Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall your operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.
The default maximum number of ephemeral TCP ports is 5000 in the products that are included in the 'Applies to' section. A new parameter has been added in these products. To increase the maximum number of ephemeral ports, follow these steps:
1. Start Registry Editor.
2. Locate the following subkey in the registry, and then click Parameters:
   HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
3. On the Edit menu, click New, and then add the following registry entry:
   Value Name: MaxUserPort
   Value Type: DWORD
   Value data: 65534
   Valid Range: 5000-65534 (decimal)
   Default: 0x1388 (5000 decimal)
   Description: This parameter controls the maximum port number that is used when a program requests any available user port from the system. Typically, ephemeral (short-lived) ports are allocated between the values of 1024 and 5000 inclusive.
4. Quit Registry Editor.
Note: An additional TCPTimedWaitDelay registry parameter determines how long a closed port waits until the closed port can be reused.
Original link: http://blog.chinaunix.net/u/29553/showart_450701.html
**********************************************************************
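Article 2
The java.net.BindException: Address already in use: connect problem
The likely cause is that many new Socket operations happen in a short time, while socket.close() does not release the bound port immediately. Instead the port is put into the TIME_WAIT state and released only after a while (240 s by default; visible with netstat -na), until system resources are finally exhausted (on Windows, the pool of ephemeral ports, which spans 1024-5000, runs out).
There are two ways to avoid the problem. One is to raise the maximum number of connection threads of your web server; 1024 or 2048 is adequate. Taking Resin as an example, change thread-pool.thread_max in resin.conf; if Apache sits in front of Resin, remember to adjust Apache as well.
The other is to change the operating system network configuration of the machine that runs the web server and lower the TIME_WAIT duration, for example to 30 s.
On Red Hat, inspect the relevant options: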
[xxx@xxx~]$ /sbin/sysctl -a|grep net.ipv4.tcp_tw
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_tw_recycle = 0
[xxx@xxx~]$ vi /etc/sysctl.conf    (change the following)
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
[xxx@xxx~]$ sysctl -p    (makes the kernel parameters take effect)
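Besides the server and kernel settings above, the client code itself can avoid churning through ephemeral ports by sending many requests over one connection instead of creating a new Socket per request. This is a general technique, not something the quoted article proposes. The sketch below assumes a hypothetical line-based echo protocol on the loopback interface; all class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ReuseConnectionSketch {
    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoFlush = true so that println() pushes each line out immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    /** Accepts one client in a background thread and echoes every line back. */
    static void echoOneClient(ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept(); BufferedReader in = reader(s); PrintWriter out = writer(s)) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.println(line);
                }
            } catch (IOException ignored) {
                // demo server: nothing to clean up
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /** Sends `requests` lines over a single socket; returns how many were echoed correctly. */
    static int sendOverOneConnection(int requests) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            echoOneClient(server);
            try (Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 BufferedReader in = reader(s);
                 PrintWriter out = writer(s)) {
                int echoed = 0;
                for (int i = 0; i < requests; i++) {
                    String msg = "request " + i;
                    out.println(msg);
                    if (msg.equals(in.readLine())) {
                        echoed++;
                    }
                }
                return echoed; // only one local port was used for all requests
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendOverOneConnection(1000) + " requests echoed over one connection");
    }
}
```

However many requests are sent, only one client port ends up in TIME_WAIT when the connection is finally closed. Whether this applies in practice depends on whether the protocol in use supports persistent connections.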

The passage of the socket FAQ that explains TIME_WAIT is excerpted below:
2.7. Please explain the TIME_WAIT state.

Remember that TCP guarantees all data transmitted will be delivered,
if at all possible. When you close a socket, the server goes into a
TIME_WAIT state, just to be really really sure that all the data has
gone through. When a socket is closed, both sides agree by sending
messages to each other that they will send no more data. This, it
seemed to me was good enough, and after the handshaking is done, the
socket should be closed. The problem is two-fold. First, there is no
way to be sure that the last ack was communicated successfully.
Second, there may be "wandering duplicates" left on the net that must
be dealt with if they are delivered.

Andrew Gierth (andrew@erlenstar.demon.co.uk) helped to explain the
closing sequence in the following usenet posting:

Assume that a connection is in ESTABLISHED state, and the client is
about to do an orderly release. The client's sequence no. is Sc, and
the server's is Ss.

    Client                                                  Server
    ======                                                  ======
    ESTABLISHED                                             ESTABLISHED
    (client closes)
    ESTABLISHED                                             ESTABLISHED
             <CTL=FIN+ACK><SEQ=Sc><ACK=Ss> ------->>
    FIN_WAIT_1
             <<-------- <CTL=ACK><SEQ=Ss><ACK=Sc+1>
    FIN_WAIT_2                                              CLOSE_WAIT
             <<-------- <CTL=FIN+ACK><SEQ=Ss><ACK=Sc+1>     (server closes)
                                                            LAST_ACK
             <CTL=ACK><SEQ=Sc+1><ACK=Ss+1> ------->>
    TIME_WAIT                                               CLOSED
    (2*msl elapses...)
    CLOSED

Note: the +1 on the sequence numbers is because the FIN counts as one
byte of data. (The above diagram is equivalent to fig. 13 from RFC
793).
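The closing sequence can also be read as a small state machine. The Java sketch below covers only the transitions that appear in the sequence described above (it leaves out simultaneous close and every other TCP state), and the event names are invented here for illustration; they are not TCP terminology.

```java
// Minimal state machine for the orderly close described above.
// Only the transitions of that sequence are modelled; event names are illustrative.
class CloseSequenceSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSE_WAIT, LAST_ACK, TIME_WAIT, CLOSED }
    enum Event { APP_CLOSE, RECV_ACK, RECV_FIN, TWO_MSL_ELAPSED }

    static State next(State state, Event event) {
        switch (state) {
            case ESTABLISHED:
                if (event == Event.APP_CLOSE) return State.FIN_WAIT_1;  // we send the first FIN
                if (event == Event.RECV_FIN) return State.CLOSE_WAIT;   // peer closed first; we ACK
                break;
            case FIN_WAIT_1:
                if (event == Event.RECV_ACK) return State.FIN_WAIT_2;   // our FIN was acknowledged
                break;
            case FIN_WAIT_2:
                if (event == Event.RECV_FIN) return State.TIME_WAIT;    // we send the final ACK
                break;
            case CLOSE_WAIT:
                if (event == Event.APP_CLOSE) return State.LAST_ACK;    // we send our FIN
                break;
            case LAST_ACK:
                if (event == Event.RECV_ACK) return State.CLOSED;       // final ACK arrived
                break;
            case TIME_WAIT:
                if (event == Event.TWO_MSL_ELAPSED) return State.CLOSED;
                break;
            default:
                break;
        }
        throw new IllegalStateException(event + " is not modelled in state " + state);
    }

    /** Applies the events in order and returns the resulting state. */
    static State run(State start, Event... events) {
        State s = start;
        for (Event e : events) {
            s = next(s, e);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("client: " + run(State.ESTABLISHED,
                Event.APP_CLOSE, Event.RECV_ACK, Event.RECV_FIN));
        System.out.println("server: " + run(State.ESTABLISHED,
                Event.RECV_FIN, Event.APP_CLOSE, Event.RECV_ACK));
    }
}
```

The side that closes first ends up in TIME_WAIT and reaches CLOSED only after the 2*MSL timer, while the other side goes straight from LAST_ACK to CLOSED.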

Now consider what happens if the last of those packets is dropped in
the network. The client has done with the connection; it has no more
data or control info to send, and never will have. But the server does
not know whether the client received all the data correctly; that's
what the last ACK segment is for. Now the server may or may not care
whether the client got the data, but that is not an issue for TCP; TCP
is a reliable protocol, and must distinguish between an orderly
connection close where all data is transferred, and a connection abort
where data may or may not have been lost.

So, if that last packet is dropped, the server will retransmit it (it
is, after all, an unacknowledged segment) and will expect to see a
suitable ACK segment in reply. If the client went straight to CLOSED,
the only possible response to that retransmit would be a RST, which
would indicate to the server that data had been lost, when in fact it
had not been.

(Bear in mind that the server's FIN segment may, additionally, contain
data.)

DISCLAIMER: This is my interpretation of the RFCs (I have read all the
TCP-related ones I could find), but I have not attempted to examine
implementation source code or trace actual connections in order to
verify it. I am satisfied that the logic is correct, though.

More commentary from Vic:

The second issue was addressed by Richard Stevens (rstevens@noao.edu,
author of "Unix Network Programming", see ``1.5 Where can I get source
code for the book [book title]?''). I have put together quotes from
some of his postings and email which explain this. I have brought
together paragraphs from different postings, and have made as few
changes as possible.

From Richard Stevens (rstevens@noao.edu):

If the duration of the TIME_WAIT state were just to handle TCP's full-
duplex close, then the time would be much smaller, and it would be
some function of the current RTO (retransmission timeout), not the MSL
(the packet lifetime).

A couple of points about the TIME_WAIT state.

o The end that sends the first FIN goes into the TIME_WAIT state,
because that is the end that sends the final ACK. If the other
end's FIN is lost, or if the final ACK is lost, having the end that
sends the first FIN maintain state about the connection guarantees
that it has enough information to retransmit the final ACK.

o Realize that TCP sequence numbers wrap around after 2**32 bytes
have been transferred. Assume a connection between A.1500 (host A,
port 1500) and B.2000. During the connection one segment is lost
and retransmitted. But the segment is not really lost, it is held
by some intermediate router and then re-injected into the network.
(This is called a "wandering duplicate".) But in the time between
the packet being lost & retransmitted, and then reappearing, the
connection is closed (without any problems) and then another
connection is established between the same host, same port (that
is, A.1500 and B.2000; this is called another "incarnation" of the
connection). But the sequence numbers chosen for the new
incarnation just happen to overlap with the sequence number of the
wandering duplicate that is about to reappear. (This is indeed
possible, given the way sequence numbers are chosen for TCP
connections.) Bingo, you are about to deliver the data from the
wandering duplicate (the previous incarnation of the connection) to
the new incarnation of the connection. To avoid this, you do not
allow the same incarnation of the connection to be reestablished
until the TIME_WAIT state terminates.

Even the TIME_WAIT state doesn't completely solve the second problem,
given what is called TIME_WAIT assassination. RFC 1337 has more
details.

o The reason that the duration of the TIME_WAIT state is 2*MSL is
that the maximum amount of time a packet can wander around a
network is assumed to be MSL seconds. The factor of 2 is for the
round-trip. The recommended value for MSL is 120 seconds, but
Berkeley-derived implementations normally use 30 seconds instead.
This means a TIME_WAIT delay between 1 and 4 minutes. Solaris 2.x
does indeed use the recommended MSL of 120 seconds.
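The "between 1 and 4 minutes" figure follows directly from the 2*MSL rule. A minimal Java check of that arithmetic, using only the two MSL values quoted above (the class name is illustrative):

```java
import java.time.Duration;

// TIME_WAIT lasts 2 * MSL; the two MSL values are the ones quoted in the text.
class TimeWaitDuration {
    static Duration timeWait(Duration msl) {
        return msl.multipliedBy(2);
    }

    public static void main(String[] args) {
        Duration berkeley = timeWait(Duration.ofSeconds(30));     // Berkeley-derived stacks
        Duration recommended = timeWait(Duration.ofSeconds(120)); // recommended MSL
        System.out.println("MSL 30 s  -> TIME_WAIT " + berkeley.getSeconds() + " s");
        System.out.println("MSL 120 s -> TIME_WAIT " + recommended.getSeconds() + " s");
    }
}
```

An MSL of 30 seconds gives 60 seconds and an MSL of 120 seconds gives 240 seconds, which matches the 240 s default mentioned earlier in this post.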

A wandering duplicate is a packet that appeared to be lost and was
retransmitted. But it wasn't really lost ... some router had
problems, held on to the packet for a while (order of seconds, could
be a minute if the TTL is large enough) and then re-injects the packet
back into the network. But by the time it reappears, the application
that sent it originally has already retransmitted the data contained
in that packet.
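To see why a wandering duplicate can be mistaken for fresh data, it helps to look at how 32-bit sequence numbers wrap. The Java sketch below is a deliberately simplified acceptance test: it only asks whether a sequence number falls inside a receive window modulo 2**32, and ignores segment length and the other checks a real TCP stack performs. The method name and the example numbers are assumptions made for illustration.

```java
// Simplified model of TCP sequence arithmetic (modulo 2^32).
// Java int arithmetic wraps the same way, so int is used for sequence numbers.
class SeqWrapSketch {
    /** True if seq lies in [windowStart, windowStart + windowSize) modulo 2^32. */
    static boolean inWindow(int seq, int windowStart, int windowSize) {
        // The subtraction wraps around; comparing as unsigned gives the distance
        // from the start of the window in the forward direction.
        return Integer.compareUnsigned(seq - windowStart, windowSize) < 0;
    }

    public static void main(String[] args) {
        int window = 65535;

        // New incarnation expects data starting just before the wrap point.
        int newStart = 0xFFFFFFF0;

        // A delayed segment from the previous incarnation whose sequence number
        // happens to land just after the wrap point.
        int staleSeq = 0x00000010;

        System.out.println("stale segment accepted: " + inWindow(staleSeq, newStart, window));
        System.out.println("far-away segment accepted: " + inWindow(0x7FFFFFFF, newStart, window));
    }
}
```

If the old and new incarnations use overlapping sequence ranges, nothing in the number itself distinguishes the stale segment from a valid one, which is why the connection pair is kept out of use for the TIME_WAIT period.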

Because of these potential problems with TIME_WAIT assassinations, one
should not avoid the TIME_WAIT state by setting the SO_LINGER option
to send an RST instead of the normal TCP connection termination
(FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it's
your friend and it's there to help you :-)
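In Java the option in question is set with Socket.setSoLinger(true, 0), which makes close() abort the connection instead of going through the normal termination. The sketch below only inspects the option on unconnected sockets, so it shows how such a setting could be detected in code review or at runtime; it does not send any traffic. The helper name wouldAbortOnClose is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

class LingerCheckSketch {
    /**
     * True if SO_LINGER is enabled with a timeout of 0, the setting the text
     * warns against. getSoLinger() returns -1 when the option is disabled.
     */
    static boolean wouldAbortOnClose(Socket s) throws SocketException {
        return s.getSoLinger() == 0;
    }

    /** Returns { result for a default socket, result for a socket with linger 0 }. */
    static boolean[] demo() {
        try (Socket normal = new Socket(); Socket abortive = new Socket()) {
            abortive.setSoLinger(true, 0); // what NOT to do just to dodge TIME_WAIT
            return new boolean[] { wouldAbortOnClose(normal), wouldAbortOnClose(abortive) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("default socket aborts on close: " + r[0]);
        System.out.println("linger(0) socket aborts on close: " + r[1]);
    }
}
```

A socket left at its defaults keeps SO_LINGER disabled, so close() follows the normal FIN/ACK sequence and the TIME_WAIT protection stays in place.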

I have a long discussion of just this topic in my just-released
"TCP/IP Illustrated, Volume 3". The TIME_WAIT state is, indeed, one of
the most misunderstood features of TCP.

I'm currently rewriting "Unix Network Programming" (see ``1.5 Where
can I get source code for the book [book title]?'') and will include
lots more on this topic, as it is often confusing and misunderstood.

An additional note from Andrew:

Closing a socket: if SO_LINGER has not been called on a socket, then
close() is not supposed to discard data. This is true on SVR4.2 (and,
apparently, on all non-SVR4 systems) but apparently not on SVR4; the
use of either shutdown() or SO_LINGER seems to be required to
guarantee delivery of all data.
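In Java, the counterpart of shutdown() for the sending direction is Socket.shutdownOutput(). The following is a minimal sketch of an orderly close that waits for the peer to confirm what it received before closing. It assumes a hypothetical loopback peer that counts bytes until end-of-stream and then replies with the total as an 8-byte long; that little protocol and all names are invented for the example.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class OrderlyCloseSketch {
    /** Sends payload, half-closes, and returns the byte count reported by the peer. */
    static long sendAndConfirm(byte[] payload) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Hypothetical peer: reads until end-of-stream, then reports the total.
            Thread peer = new Thread(() -> {
                try (Socket s = server.accept()) {
                    InputStream in = s.getInputStream();
                    byte[] buf = new byte[8192];
                    long total = 0;
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        total += n;
                    }
                    new DataOutputStream(s.getOutputStream()).writeLong(total);
                } catch (IOException ignored) {
                    // demo peer: nothing to clean up
                }
            });
            peer.setDaemon(true);
            peer.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.getOutputStream().write(payload);
                client.shutdownOutput(); // signals end of data; the read side stays open
                // Wait for the peer's confirmation before the socket is closed.
                return new DataInputStream(client.getInputStream()).readLong();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = new byte[100_000];
        System.out.println("peer confirmed " + sendAndConfirm(payload) + " of " + payload.length + " bytes");
    }
}
```

The design choice here is to treat the peer's reply as the delivery confirmation, so close() is only called once the application already knows the data arrived.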
Original link: http://hi.baidu.com/w_ge/blog/item/105877c6a361df1b9c163d21.html

Article 3
You receive error 'WSAENOBUFS (10055)' when you try to connect from TCP ports greater than 5000

SYMPTOMS
If you try to set up TCP connections from ports that are greater than 5000, the local computer responds with the following WSAENOBUFS (10055) error message: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.
RESOLUTION
Important: This section, method, or task contains steps that tell you how to modify the registry. However, serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore the registry if a problem occurs. For more information about how to back up and restore the registry, click the following article number to view the article in the Microsoft Knowledge Base:
322756 (http://support.microsoft.com/kb/322756/) How to back up and restore the registry in Windows

The default maximum number of ephemeral TCP ports is 5000 in the products that are included in the 'Applies to' section. A new parameter has been added in these products. To increase the maximum number of ephemeral ports, follow these steps:
1. Start Registry Editor.
2. Locate the following subkey in the registry, and then click Parameters:
   HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
3. On the Edit menu, click New, and then add the following registry entry:
   Value Name: MaxUserPort
   Value Type: DWORD
   Value data: 65534
   Valid Range: 5000-65534 (decimal)
   Default: 0x1388 (5000 decimal)
   Description: This parameter controls the maximum port number that is used when a program requests any available user port from the system. Typically, ephemeral (short-lived) ports are allocated between the values of 1024 and 5000 inclusive.
4. Exit Registry Editor, and then restart the computer.
Note: An additional TCPTimedWaitDelay registry parameter determines how long a closed port waits until the closed port can be reused.
