rhel4: wget usage explained

Posted by wisdomone1 on 2010-03-17
WGET(1)                                                                            GNU Wget                                                                            WGET(1)

NAME
       Wget - The non-interactive network downloader.

SYNOPSIS
       wget [option]... [URL]...

DESCRIPTION
       GNU Wget is a free utility for non-interactive download of files from the Web.  It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.

       Wget is non-interactive, meaning that it can work in the background, while the user is not logged on.  This allows you to start a retrieval and disconnect from the
       system, letting Wget finish the work.  By contrast, most of the Web browsers require constant user’s presence, which can be a great hindrance when transferring a lot
       of data.

       Wget can follow links in HTML and XHTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site.  This is
       sometimes referred to as ‘‘recursive downloading.’’  While doing that, Wget respects the Robot Exclusion Standard (/robots.txt).  Wget can be instructed to convert the
       links in downloaded HTML files to the local files for offline viewing.

       Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole
       file has been retrieved.  If the server supports regetting, it will instruct the server to continue the download from where it left off.

OPTIONS
       Option Syntax

       Since Wget uses GNU getopt to process command-line arguments, every option has a long form along with the short one.  Long options are more convenient to remember, but
       take time to type.  You may freely mix different option styles, or specify options after the command-line arguments.  Thus you may write:

               wget -r --tries=10 -o log

       The space between the option accepting an argument and the argument may be omitted.  Instead of -o log you can write -olog.

       You may put several options that do not require arguments together, like:

               wget -drc

       This is a complete equivalent of:

               wget -d -r -c

       Since the options can be specified after the arguments, you may terminate them with --.  So the following will try to download URL -x, reporting failure to log:

               wget -o log -- -x

       The options that accept comma-separated lists all respect the convention that specifying an empty list clears its value.  This can be useful to clear the .wgetrc
       settings.  For instance, if your .wgetrc sets "exclude_directories" to /cgi-bin, the following example will first reset it, and then set it to exclude /~nobody and
       /~somebody.  You can also clear the lists in .wgetrc.

               wget -X '' -X /~nobody,/~somebody

       Most options that do not accept arguments are boolean options, so named because their state can be captured with a yes-or-no ("boolean") variable.  For example,
       --follow-ftp tells Wget to follow FTP links from HTML files and, on the other hand, --no-glob tells it not to perform file globbing on FTP URLs.  A boolean option is
       either affirmative or negative (beginning with --no).  All such options share several properties.

       Unless stated otherwise, it is assumed that the default behavior is the opposite of what the option accomplishes.  For example, the documented existence of
       --follow-ftp assumes that the default is to not follow FTP links from HTML pages.

       Affirmative options can be negated by prepending the --no- to the option name; negative options can be negated by omitting the --no- prefix.  This might seem
       superfluous---if the default for an affirmative option is to not do something, then why provide a way to explicitly turn it off?  But the startup file may in fact
       change the default.  For instance, using "follow_ftp = on" in .wgetrc makes Wget follow FTP links by default, and using --no-follow-ftp is the only way to restore the
       factory default from the command line.
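
The interplay between the startup file and a negated option can be tried out without touching your real ~/.wgetrc. The following is a minimal sketch, assuming a throwaway startup file selected through the WGETRC environment variable; the file name demo.wgetrc and the example.com URL are placeholders, and the final command is only printed, not executed.

```shell
# Write a throwaway startup file that turns FTP-following on by default.
rcfile="${TMPDIR:-/tmp}/demo.wgetrc"        # hypothetical file name
printf 'follow_ftp = on\n' > "$rcfile"

# Wget loads the file named by WGETRC instead of ~/.wgetrc.
# --no-follow-ftp then restores the factory default for this one run.
cmd="WGETRC=$rcfile wget --no-follow-ftp -r http://example.com/"   # placeholder URL
echo "$cmd"     # printed only; paste it into a shell to actually run it
```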

       Basic Startup Options

       -V
       --version
           Display the version of Wget.

       -h
       --help
           Print a help message describing all of Wget’s command-line options.

       -b
       --background
           Go to background immediately after startup.  If no output file is specified via the -o, output is redirected to wget-log.

       -e command
       --execute command
           Execute command as if it were a part of .wgetrc.  A command thus invoked will be executed after the commands in .wgetrc, thus taking precedence over them.  If you
           need to specify more than one wgetrc command, use multiple instances of -e.

       Logging and Input File Options

       -o logfile
       --output-file=logfile
           Log all messages to logfile.  The messages are normally reported to standard error.

       -a logfile
       --append-output=logfile
           Append to logfile.  This is the same as -o, only it appends to logfile instead of overwriting the old log file.  If logfile does not exist, a new file is created.

       -d
       --debug
           Turn on debug output, meaning various information important to the developers of Wget if it does not work properly.  Your system administrator may have chosen to
           compile Wget without debug support, in which case -d will not work.  Please note that compiling with debug support is always safe---Wget compiled with the debug
           support will not print any debug info unless requested with -d.

       -q
       --quiet
           Turn off Wget’s output.

       -v
       --verbose
           Turn on verbose output, with all the available data.  The default output is verbose.

       -nv
       --no-verbose
           Turn off verbose without being completely quiet (use -q for that), which means that error messages and basic information still get printed.

       -i file
       --input-file=file
           Read URLs from file.  If - is specified as file, URLs are read from the standard input.  (Use ./- to read from a file literally named -.)

           If this function is used, no URLs need be present on the command line.  If there are URLs both on the command line and in an input file, those on the command line
           will be the first ones to be retrieved.  The file need not be an HTML document (but no harm if it is)---it is enough if the URLs are just listed sequentially.

           However, if you specify --force-html, the document will be regarded as html.  In that case you may have problems with relative links, which you can solve either by
           adding "<base href="url">" to the documents or by specifying --base=url on the command line.

       -F
       --force-html
           When input is read from a file, force it to be treated as an HTML file.  This enables you to retrieve relative links from existing HTML files on your local disk,
           by adding "<base href="url">" to HTML, or using the --base command-line option.

       -B URL
       --base=URL
           Prepends URL to relative links read from the file specified with the -i option.
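
A short sketch of the input-file options in use. The file name urls.txt and the example.com addresses are made up for illustration, and the small run helper is an assumption of this sketch: it echoes each wget command rather than executing it, so the snippet also works offline.

```shell
# Dry-run helper: print the command line instead of executing it.
# Set DRY_RUN=0 in the environment to really invoke wget.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# A plain list of URLs, one per line (placeholder addresses).
list="${TMPDIR:-/tmp}/urls.txt"
cat > "$list" <<'EOF'
http://example.com/a.tar.gz
http://example.com/b.tar.gz
EOF

# Fetch everything in the list, logging to a file instead of standard error.
run wget -i "$list" -o fetch.log

# The same list fed through standard input.
# (In dry-run mode only the command is printed; the input is not consumed.)
run wget -i - < "$list"
```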

       Download Options

       --bind-address=ADDRESS
           When making client TCP/IP connections, bind to ADDRESS on the local machine.  ADDRESS may be specified as a hostname or IP address.  This option can be useful if
           your machine is bound to multiple IPs.

       -t number
       --tries=number
           Set number of retries to number.  Specify 0 or inf for infinite retrying.  The default is to retry 20 times, with the exception of fatal errors like ‘‘connection
           refused’’ or ‘‘not found’’ (404), which are not retried.

       -O file
       --output-document=file
           The documents will not be written to the appropriate files, but all will be concatenated together and written to file.  If - is used as file, documents will be
           printed to standard output, disabling link conversion.  (Use ./- to print to a file literally named -.)

           Note that a combination with -k is only well-defined for downloading a single document.

       -nc
       --no-clobber
           If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc.  In certain cases, the local file will be
           clobbered, or overwritten, upon repeated download.  In other cases it will be preserved.

           When running Wget without -N, -nc, or -r, downloading the same file in the same directory will result in the original copy of file being preserved and the second
           copy being named file.1.  If that file is downloaded yet again, the third copy will be named file.2, and so on.  When -nc is specified, this behavior is
           suppressed, and Wget will refuse to download newer copies of file.  Therefore, "no-clobber" is actually a misnomer in this mode---it's not clobbering that's
           prevented (as the numeric suffixes were already preventing clobbering), but rather the multiple version saving that's prevented.

           When running Wget with -r, but without -N or -nc, re-downloading a file will result in the new copy simply overwriting the old.  Adding -nc will prevent this
           behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.

           When running Wget with -N, with or without -r, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and
           size of the file.  -nc may not be specified at the same time as -N.

           Note that when -nc is specified, files with the suffixes .html or .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.
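
The numeric-suffix rule described above can be imitated in a few lines of shell. This is only an illustration of the naming scheme, not Wget's own code; the function name next_free_name and the scratch files are made up for the demonstration.

```shell
# Return the name a fresh copy of "$1" would get when Wget runs without
# -N, -nc, or -r: the name itself if unused, otherwise file.1, file.2, ...
next_free_name() {
  name=$1
  if [ ! -e "$name" ]; then
    printf '%s\n' "$name"
    return
  fi
  n=1
  while [ -e "$name.$n" ]; do
    n=$((n + 1))
  done
  printf '%s\n' "$name.$n"
}

# Demonstration in a scratch directory that already holds two copies.
demo_dir=$(mktemp -d)
touch "$demo_dir/index.html" "$demo_dir/index.html.1"
next_free_name "$demo_dir/index.html"      # prints <scratch dir>/index.html.2
```

With -nc the lookup never happens: the existing file wins and nothing new is written.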

       -c
       --continue
           Continue getting a partially-downloaded file.  This is useful when you want to finish up a download started by a previous instance of Wget, or by another program.
           For instance:

                   wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z

           If there is a file named ls-lR.Z in the current directory, Wget will assume that it is the first portion of the remote file, and will ask the server to continue
           the retrieval from




Additional notes: a localized summary of wget usage

Installing and using wget

Source: ChinaUnix blog  Date: 2009.07.16 19:59
 
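Installation:
First unpack the downloaded wget source archive with tar,
then cd into the wget directory:
# ./configure
# make
# make install
# make clean
Usage:
wget url (the address to download), for example: wget http://music.qqihome.shangdu.com/music/15637008169/2009/6/15637008169_200906050942.wma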
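Command format:
  wget [options] [URL of the target software or web page]
  -V, --version  print the version number and exit
  -h, --help  print help information
  -e, --execute=COMMAND  execute a ".wgetrc"-style command
  -o, --output-file=FILE  save the program's messages to FILE
  -a, --append-output=FILE  append the program's messages to FILE
  -d, --debug  print debug output
  -q, --quiet  print no output
  -i, --input-file=FILE  read URLs from FILE
  -t, --tries=NUMBER  number of retries (0 means unlimited)
  -O, --output-document=FILE  save the download under a different file name
  -nc, --no-clobber  do not overwrite files that already exist
  -N, --timestamping  download only files newer than the local copy
  -T, --timeout=SECONDS  set the timeout
  -Y, --proxy=on/off  turn the proxy on or off
  -nd, --no-directories  do not create directories
  -x, --force-directories  force creation of directories
  --http-user=USER  set the HTTP user name
  --http-passwd=PASS  set the HTTP password
  --proxy-user=USER  set the proxy user name
  --proxy-passwd=PASS  set the proxy password
  -r, --recursive  download an entire site or directory tree
  -l, --level=NUMBER  maximum recursion depth
  -A, --accept=LIST  file types to accept
  -R, --reject=LIST  file types to reject
  -D, --domains=LIST  domains to accept
  --exclude-domains=LIST  domains to reject
  -L, --relative  follow relative links only
  --follow-ftp  follow FTP links found in HTML documents
  -H, --span-hosts  allow downloads from other hosts
  -I, --include-directories=LIST  directories to allow
  -X, --exclude-directories=LIST  directories to exclude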

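Further details:

Startup:
-V, --version  display the version of Wget and exit.
-h, --help  print this help.
-b, --background  go to background after startup.
-e, --execute=COMMAND  execute a ".wgetrc"-style command.
Logging and input file:
-o, --output-file=FILE  write log messages to FILE.
-a, --append-output=FILE  append log messages to the end of FILE.
-d, --debug  print debug output.
-q, --quiet  quiet mode (no output).
-v, --verbose  verbose output (the default).
-nv, --non-verbose  turn off verbose output without going quiet (listed as --no-verbose in the man page above).
-i, --input-file=FILE  download the URLs found in FILE.
-F, --force-html  treat the input file as HTML.
-B, --base=URL  prepend URL to relative links when using -F -i FILE.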
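Download:
-t, --tries=NUMBER  set the number of retries (0 means unlimited).
--retry-connrefused  retry even if the connection is refused.
-O, --output-document=FILE  write the data to FILE.
-nc, --no-clobber  do not change existing files, and do not write new copies with a .# suffix (# being a number).
-c, --continue  resume getting a partially downloaded file.
--progress=TYPE  select how download progress is displayed.
-N, --timestamping  do not re-retrieve files unless the remote copy is newer.
-S, --server-response  print the server's response.
--spider  do not download anything.
-T, --timeout=SECONDS  set the read timeout in seconds.
-w, --wait=SECONDS  wait SECONDS between retrievals of different files.
--waitretry=SECONDS  wait between retries (from 1 second up to SECONDS).
--random-wait  wait between retrievals of different files (from 0 to 2*WAIT seconds).
-Y, --proxy=on/off  turn the proxy server on or off.
-Q, --quota=SIZE  set the retrieval quota to SIZE.
--bind-address=ADDRESS  connect from the given local address (host name or IP).
--limit-rate=RATE  limit the download rate.
--dns-cache=off  disable caching of DNS lookups.
--restrict-file-names=OS  restrict characters in file names to those the given OS (operating system) allows.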
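
The "from 1 second up to SECONDS" wording for --waitretry corresponds to what the Wget manual calls linear backoff: one second after the first failure on a file, two after the second, and so on up to the cap. The sketch below only prints such a schedule; the function name waitretry_schedule is made up, and the code illustrates the rule rather than reproducing Wget's implementation.

```shell
# Print the pause, in seconds, before each of the first RETRIES retries
# when Wget is run with --waitretry=MAX.
waitretry_schedule() {
  max=$1
  retries=$2
  schedule=
  i=1
  while [ "$i" -le "$retries" ]; do
    # The pause grows by one second per failure and is capped at MAX.
    if [ "$i" -lt "$max" ]; then
      pause=$i
    else
      pause=$max
    fi
    schedule="${schedule:+$schedule }$pause"
    i=$((i + 1))
  done
  printf '%s\n' "$schedule"
}

waitretry_schedule 5 8    # prints: 1 2 3 4 5 5 5 5
```
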
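Directories:
-nd, --no-directories  do not create directories.
-x, --force-directories  force creation of directories.
-nH, --no-host-directories  do not create directories named after the remote host.
-P, --directory-prefix=PREFIX  create the directory PREFIX and save files under it.
--cut-dirs=NUMBER  ignore NUMBER levels of remote directory components.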
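
How -nH and --cut-dirs combine is easiest to see on a concrete path. The function below is a hypothetical helper (local_dir is not part of Wget) that imitates the mapping for the ftp.xemacs.org/pub/xemacs example used in the Wget manual; treat it as a sketch of the rule, not as Wget's code.

```shell
# local_dir HOST REMOTE_DIR NO_HOST CUT_DIRS
#   HOST        remote host name
#   REMOTE_DIR  remote directory without a leading slash, e.g. pub/xemacs
#   NO_HOST     1 to emulate -nH, 0 otherwise
#   CUT_DIRS    value given to --cut-dirs (0 for none)
local_dir() {
  host=$1
  dir=$2
  no_host=$3
  cut=$4
  # Drop the first CUT_DIRS components of the remote directory.
  while [ "$cut" -gt 0 ] && [ -n "$dir" ]; do
    case $dir in
      */*) dir=${dir#*/} ;;
      *)   dir= ;;
    esac
    cut=$((cut - 1))
  done
  # Prefix the host name unless -nH is in effect.
  if [ "$no_host" = "1" ]; then
    result=$dir
  else
    result=$host${dir:+/$dir}
  fi
  printf '%s\n' "${result:-.}"
}

local_dir ftp.xemacs.org pub/xemacs 0 0   # ftp.xemacs.org/pub/xemacs
local_dir ftp.xemacs.org pub/xemacs 1 0   # pub/xemacs
local_dir ftp.xemacs.org pub/xemacs 1 1   # xemacs
local_dir ftp.xemacs.org pub/xemacs 1 2   # .
local_dir ftp.xemacs.org pub/xemacs 0 1   # ftp.xemacs.org/xemacs
```
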
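HTTP options:
--http-user=USER  set the http user name.
--http-passwd=PASS  set the http password.
-C, --cache=on/off  use (or do not use) data cached on the server (used by default).
-E, --html-extension  give every file of MIME type text/html an .html extension.
--ignore-length  ignore the "Content-Length" header field.
--header=STRING  add STRING to the request headers.
--proxy-user=USER  set the proxy server user name.
--proxy-passwd=PASS  set the proxy server password.
--referer=URL  include a "Referer: URL" header in the HTTP request.
-s, --save-headers  save the HTTP headers to the file.
-U, --user-agent=AGENT  identify as AGENT instead of Wget/VERSION.
--no-http-keep-alive  disable HTTP keep-alive (persistent connections).
--cookies=off  do not use cookies.
--load-cookies=FILE  load cookies from FILE before the session starts.
--save-cookies=FILE  save cookies to FILE after the session ends.
--post-data=STRING  use the POST method and send STRING.
--post-file=FILE  use the POST method and send the contents of FILE.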
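
A common way to combine the cookie and POST options is a two-step fetch: log in once, then reuse the stored cookies. The sketch below assumes a made-up site (example.com), made-up form fields, and a hypothetical run helper that echoes commands instead of executing them. Whether a real server's cookies end up in the file depends on the server, since cookies without an expiry time are session-only and are not saved by default.

```shell
# Dry-run helper: print the command line instead of executing it.
# Set DRY_RUN=0 in the environment to really invoke wget.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

cookies="${TMPDIR:-/tmp}/cookies.txt"    # placeholder cookie jar

# Step 1: submit the login form with POST and store the cookies received.
run wget --save-cookies "$cookies" \
    --post-data 'user=demo&password=secret' \
    http://example.com/login

# Step 2: present the stored cookies when fetching the protected page.
run wget --load-cookies "$cookies" \
    --referer=http://example.com/login \
    http://example.com/private/report.pdf
```
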
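HTTPS (SSL) options:
--sslcertfile=FILE  optional client certificate.
--sslcertkey=KEYFILE  optional key file for this certificate.
--egd-file=FILE  file name of the EGD socket.
--sslcadir=DIR  directory holding the CA hash list.
--sslcafile=FILE  file containing the CAs.
--sslcerttype=0/1  client certificate type: 0=PEM (default), 1=ASN1 (DER).
--sslcheckcert=0/1  check the server certificate against the given CA.
--sslprotocol=0-3  choose the SSL protocol: 0=automatic, 1=SSLv2, 2=SSLv3, 3=TLSv1.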
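FTP options:
-nr, --dont-remove-listing  do not remove ".listing" files.
-g, --glob=on/off  turn expansion of wildcard file names on or off.
--passive-ftp  use the "passive" transfer mode.
--retr-symlinks  when recursing, download the files that links point to (except links to directories).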
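Recursive download:
-r, --recursive  download recursively.
-l, --level=NUMBER  maximum recursion depth (inf or 0 means unlimited).
--delete-after  delete files locally after downloading them.
-k, --convert-links  convert absolute links to relative links.
-K, --backup-converted  before converting file X, back it up as X.orig.
-m, --mirror  shortcut equivalent to -r -N -l inf -nr.
-p, --page-requisites  download all files needed to display a page completely, such as images.
--strict-comments  turn on strict (SGML) handling of HTML comments.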
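Accept/reject options for recursive download:
-A, --accept=LIST  comma-separated list of file name patterns to accept.
-R, --reject=LIST  comma-separated list of file name patterns to reject.
-D, --domains=LIST  comma-separated list of domains to accept.
--exclude-domains=LIST  comma-separated list of domains to exclude.
--follow-ftp  follow FTP links found in HTML files.
--follow-tags=LIST  comma-separated list of HTML tags to follow.
-G, --ignore-tags=LIST  comma-separated list of HTML tags to ignore.
-H, --span-hosts  allow recursion to enter other hosts.
-L, --relative  follow relative links only.
-I, --include-directories=LIST  list of directories to download.
-X, --exclude-directories=LIST  list of directories to exclude.
-np, --no-parent  do not ascend to the parent directory.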
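
Two typical combinations of the recursive and accept/reject options, as a sketch. The example.com URLs are placeholders and the run helper is an assumption of this example: it prints each command instead of running it, so nothing is downloaded unless DRY_RUN=0 is set.

```shell
# Dry-run helper: print the command line instead of executing it.
# Set DRY_RUN=0 in the environment to really invoke wget.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Mirror a documentation tree for offline reading:
#   -m   mirror (recursive, timestamping, unlimited depth)
#   -k   rewrite links so the local copy can be browsed
#   -p   also fetch images and other page requisites
#   -np  never ascend above /docs/
run wget -m -k -p -np http://example.com/docs/

# Collect only PDF and PostScript files, at most two levels deep,
# skipping /cgi-bin and saving everything flat in the current directory.
run wget -r -l 2 -np -nd -A pdf,ps -X /cgi-bin http://example.com/papers/
```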

From the ITPUB blog, link: http://blog.itpub.net/9240380/viewspace-629759/. If you repost this, please credit the source; otherwise legal action may be taken.