Handling command-line options and arguments: using getopt() and getopt_long()

Posted by vonzhou on 2014-07-04


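When we run a program, we usually configure its behavior through command-line arguments. Command-line options and arguments control UNIX programs and tell them what to do. By the time the C runtime startup code calls our entry function main(int argc, char *argv[]), the command line has already been split into separate arguments: argc holds the number of arguments, and argv is an array of pointers to them.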

A program's arguments fall into three kinds: options, values associated with options, and non-option arguments. For example:

$gcc getopt_test.c -o testopt
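Here getopt_test.c is a non-option argument, -o is an option, and testopt is the value associated with -o. By Linux convention, an option starts with a single hyphen followed by one letter or digit. An option may take a required value, no value, or an optional value; options that take no value can be combined after one hyphen, as in ls -al. There are also long options, introduced by two hyphens: for example, -o filename and --output filename both name the output file. The rest of this post pulls together some English-language resources for learning both styles.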

getopt(): handling short options

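getopt() is declared in the system header unistd.h. Its prototype is:
int getopt(int argc, char *const argv[], const char *optstring);
getopt takes main's argc and argv as its first two arguments. optstring is a list of characters, each naming one single-character option. A character followed by a colon (:) takes a value, given either in the rest of the same argument or in the next one; two colons (::), a GNU extension, mean the value is optional. getopt returns the next option character found in argv and keeps the index of the next argv element to process in optind. When no options remain it returns -1. When it meets an option it does not recognize, it returns a question mark (?) and stores the offending character in optopt.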

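If an option requires a value and none is supplied on the command line, getopt also returns a question mark (?). If the first character of optstring is a colon (:), it returns a colon instead, so the two errors can be told apart.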

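Once the options have been processed, optind is the index of the first remaining non-option argument. The GNU implementation of getopt actually permutes argv as it scans, moving the non-option arguments to the end of the array.
The global variables that getopt() sets (declared in unistd.h) include:
optarg: pointer to the current option's value, if it has one.
optind: index in argv of the next argument getopt() will process.
optopt: the option character that caused the most recent error (an unrecognized option, or one missing its value).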

Here is a simple example that uses getopt:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int opt = 0;
    int i = 0;
    /* The leading ':' makes getopt return ':' for a missing value.
       -v takes no value; -V and -h each require one. */
    const char *optstring = ":vV:h:";

    /* Print argv as it was passed in. */
    for (i = 0; i < argc; i++)
        printf("%d:%s\n", i, argv[i]);

    /* Handle each option in turn. */
    while ((opt = getopt(argc, argv, optstring)) != -1) {
        switch (opt) {
        case 'v':
            printf("verbose\n");
            break;
        case 'V':
            printf("option %c:the Version is %s\n", opt, optarg);
            break;
        case 'h':
            printf("The option %c  is %s...\n", opt, optarg);
            break;
        case ':':
            printf("Option %c requires a value\n", optopt);
            break;
        case '?':
            printf("Unknown option %c\n", optopt);
            break;
        }
    }

    /* optind ends up at the first non-option argument. */
    printf("After getopt the optind = %d \n", optind);

    /* Print argv again after getopt has run. */
    for (i = 0; i < argc; i++)
        printf("%d:%s\n", i, argv[i]);

    return 0;
}
Output (run with the arguments arg1 -v -V 2.1 -h help -x arg2):
X:\1.KEEP MOVING\3.C\MyCodes\GetOpt\Debug\GetOpt.exe: invalid option -- x
0:X:\1.KEEP MOVING\3.C\MyCodes\GetOpt\Debug\GetOpt.exe
1:arg1
2:-v
3:-V
4:2.1
5:-h
6:help
7:-x
8:arg2
verbose
option V:the Version is 2.1
The option h  is help...
Unknown option x
After getopt the optind = 7
0:X:\1.KEEP MOVING\3.C\MyCodes\GetOpt\Debug\GetOpt.exe
1:-v
2:-V
3:2.1
4:-h
5:help
6:-x
7:arg1
8:arg2

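As the output shows, after getopt has run the non-option arguments have all been moved to the end of argv, and optind points at the first of them. The first line of the output is a diagnostic printed by the getopt implementation used for this run; glibc prints no such message when optstring begins with a colon.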

getopt_long(): handling long options

Function prototype: int getopt_long(int argc, char *const *argv, const char *shortopts, const struct option *longopts, int *indexptr);
Here is a clear description of the function, quoted from the GNU C Library manual:

Decode options from the vector argv (whose length is argc). The argument shortopts describes the short options to accept, just as it does in getopt. The argument longopts describes the long options to accept (see above).

When getopt_long encounters a short option, it does the same thing that getopt would do: it returns the character code for the option, and stores the option's argument (if it has one) in optarg.

When getopt_long encounters a long option, it takes actions based on the flag and val fields of the definition of that option.

If flag is a null pointer, then getopt_long returns the contents of val to indicate which option it found. You should arrange distinct values in the val field for options with different meanings, so you can decode these values after getopt_long returns. If the long option is equivalent to a short option, you can use the short option's character code in val.

If flag is not a null pointer, that means this option should just set a flag in the program. The flag is a variable of type int that you define. Put the address of the flag in the flag field. Put in the val field the value you would like this option to store in the flag. In this case, getopt_long returns 0.

For any long option, getopt_long tells you the index in the array longopts of the option's definition, by storing it into *indexptr. You can get the name of the option with longopts[*indexptr].name. So you can distinguish among long options either by the values in their val fields or by their indices. You can also distinguish in this way among long options that set flags.

When a long option has an argument, getopt_long puts the argument value in the variable optarg before returning. When the option has no argument, the value in optarg is a null pointer. This is how you can tell whether an optional argument was supplied.

When getopt_long has no more options to handle, it returns -1, and leaves in the variable optind the index in argv of the next remaining argument.
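The point about optarg being a null pointer when an optional argument is omitted can be sketched as follows. The helper name verbosity_level and the meaning given to the returned numbers are assumptions made for this example, not part of the manual.

```c
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical helper: returns the verbosity requested on the command line.
 * 0 = option absent, 1 = --verbose without a value, N = --verbose=N. */
int verbosity_level(int argc, char *argv[])
{
    static const struct option longopts[] = {
        { "verbose", optional_argument, NULL, 'v' },
        { NULL, 0, NULL, 0 }
    };
    int level = 0;
    int c;

    optind = 1;  /* restart scanning so the helper can be called repeatedly */
    while ((c = getopt_long(argc, argv, "v::", longopts, NULL)) != -1) {
        if (c == 'v')
            level = optarg ? atoi(optarg) : 1;  /* NULL means no value given */
    }
    return level;
}
```

An optional value has to be attached to its option: --verbose=3 or -v3. Written as --verbose 3, the 3 is treated as a separate non-option argument and optarg stays null.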

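The options for getopt_long are defined with struct option:
struct option {
    const char *name;  // name of the long option
    int has_arg;       // no_argument, required_argument, or optional_argument
    int *flag;         // explained in detail above; usually NULL
    int val;           // value to return, or to store in *flag
};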
This structure describes a single long option name for the sake of getopt_long. The argument longopts must be an array of these structures, one for each long option. Terminate the array with an element containing all zeros.

The struct option structure has these fields:
name - This field is the name of the option. It is a string. 
has_arg - This field says whether the option takes an argument. It is an integer, and there are three legitimate values: no_argument, required_argument and optional_argument.
flag, val - These fields control how to report or act on the option when it occurs.
If flag is a null pointer, then the val is a value which identifies this option. Often these values are chosen to uniquely identify particular long options.
If flag is not a null pointer, it should be the address of an int variable which is the flag for this option. The value in val is the value to store in the flag to indicate that the option was seen.
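A small sketch may help show both uses of the flag and val fields side by side. The option names --brief and --output, the struct flag_result type and the parse_flags function are all invented for this illustration.

```c
#include <getopt.h>
#include <stddef.h>

static int brief_flag;  /* getopt_long stores 1 here when it sees --brief */

/* Hypothetical result record for this sketch. */
struct flag_result {
    int brief;                   /* copy of brief_flag after parsing */
    const char *last_flag_name;  /* name of the last flag-setting option */
    const char *output;          /* value of --output / -o, or NULL */
};

struct flag_result parse_flags(int argc, char *argv[])
{
    static const struct option longopts[] = {
        { "brief",  no_argument,       &brief_flag, 1   },  /* flag != NULL */
        { "output", required_argument, NULL,        'o' },  /* flag == NULL */
        { NULL, 0, NULL, 0 }
    };
    struct flag_result r = { 0, NULL, NULL };
    int index = 0;
    int c;

    brief_flag = 0;
    optind = 1;  /* restart scanning so the function can be called again */
    while ((c = getopt_long(argc, argv, "o:", longopts, &index)) != -1) {
        if (c == 0)
            r.last_flag_name = longopts[index].name;  /* a flag was set */
        else if (c == 'o')
            r.output = optarg;  /* val was returned as the option code */
    }
    r.brief = brief_flag;
    return r;
}
```

With --brief --output x on the command line, the first call to getopt_long returns 0 after setting brief_flag, and the second returns 'o' with optarg pointing at x.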

The English description above is very clear. Here is a simple example that uses getopt_long:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>

int main(int argc, char **argv) {
    const char *short_options = "vhVo:";

    const struct option long_options[] = {
        { "verbose", optional_argument, NULL, 'v' },
        { "help", no_argument, NULL, 'h' },
        { "version", no_argument, NULL, 'V' },
        { "output", required_argument, NULL, 'o' },
        { NULL, 0, NULL, 0 }  /* Required at end of array. */
    };

    for (;;) {
        int c;
        c = getopt_long(argc, argv, short_options, long_options, NULL);
        if (c == -1) {
            break;  /* no more options */
        }
        switch (c) {
        case 'h':
            printf("The usage of this program...\n");
            break;
        case 'v':
            printf("set the program's log verbose...\n");
            break;
        case 'V':
            printf("The version is 0.1 ...\n");
            break;
        case 'o':
            printf("The output file is %s.\n", optarg);
            break;
        case '?':
            printf("Invalid option, abort the program.\n");
            exit(EXIT_FAILURE);
        default:  /* unexpected */
            abort();
        }
    }

    return 0;
}

Arguments:

Output:
The usage of this program...
set the program's log verbose...
The version is 0.1 ...
The output file is outputfile.

A real-world use case

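In the Open vSwitch source code, starting up each component involves parsing command-line arguments, and the approach is much the same everywhere. Below I have extracted the relevant part of ovsdb-client to make clear what the process does.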
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>
#include <limits.h>


void out_of_memory( void ){
      printf( "virtual memory exhausted\n" );
      abort();
}

// xmalloc ultimately calls the standard C malloc, wrapped so that the caller
// never sees a failure: if the allocation fails, the program aborts.
void *xmalloc( size_t size){
    void *p = malloc (size ? size : 1);
    if (p == NULL) {
        out_of_memory();
    }
    return p;
}

char *xmemdup0( const char *p_, size_t length){
    char *p = xmalloc(length + 1);
    memcpy(p, p_, length);
    p[length] = '\0';
    return p;
}

//Duplicates a character string without fail, using xmalloc to obtain memory.
char *xstrdup( const char *s){
    return xmemdup0(s, strlen (s));
}

/* Given the GNU-style long options in 'options', returns a string that may be
 * passed to getopt() with the corresponding short options.  The caller is
 * responsible for freeing the string. */
char *long_options_to_short_options( const struct option options[]){
    char short_options[UCHAR_MAX * 3 + 1];
    char *p = short_options;

    for (; options-> name; options++) {
        const struct option *o = options;
        if (o->flag == NULL && o-> val > 0 && o-> val <= UCHAR_MAX) {
            *p++ = o-> val;
            if (o->has_arg == required_argument) {
                *p++ = ':';
            } else if (o->has_arg == optional_argument) {
                *p++ = ':';
                *p++ = ':';
            }
        }
    }
    *p = '\0';
    // A local char array cannot be returned directly; copy it to the heap and return that pointer.
    return xstrdup(short_options);
}

static void
parse_options( int argc, char *argv[])
{
    enum {
        OPT_BOOTSTRAP_CA_CERT = UCHAR_MAX + 1,
        OPT_TIMESTAMP ,
        DAEMON_OPTION_ENUMS ,
        TABLE_OPTION_ENUMS
    };
    static struct option long_options[] = {
        { "verbose", optional_argument, NULL, 'v' },
        { "help", no_argument, NULL, 'h' },
        { "version", no_argument, NULL, 'V' },
        { "timestamp", no_argument, NULL, OPT_TIMESTAMP },
        {NULL, 0, NULL, 0},
    };

    char *short_options = long_options_to_short_options(long_options);
    // With the short options derived from the long ones, parsing follows the same pattern as above.
    // Here we only print the short options.
    printf( "%s\n" ,short_options);

    free(short_options);
}

int main( int argc, char **argv) {
     parse_options(argc, argv);

      return 0;
}
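The enum in parse_options starts at UCHAR_MAX + 1 so that options with no short form get codes that can never collide with a character. The sketch below shows how a parsing loop might consume such a code. It is not taken from the OVS source: struct settings, parse_settings and the reduced option table are assumptions made for illustration.

```c
#include <getopt.h>
#include <limits.h>
#include <stddef.h>

/* Code for a long-only option; it lies outside the range of any character. */
enum { OPT_TIMESTAMP = UCHAR_MAX + 1 };

/* Hypothetical settings record for this sketch. */
struct settings {
    int verbose;
    int timestamp;
};

struct settings parse_settings(int argc, char *argv[])
{
    static const struct option long_options[] = {
        { "verbose",   no_argument, NULL, 'v' },
        { "timestamp", no_argument, NULL, OPT_TIMESTAMP },
        { NULL, 0, NULL, 0 }
    };
    struct settings s = { 0, 0 };
    int c;

    optind = 1;  /* restart scanning so the function can be called again */
    /* "v" is what a helper like long_options_to_short_options would derive
     * from this table: --timestamp is skipped because its val > UCHAR_MAX. */
    while ((c = getopt_long(argc, argv, "v", long_options, NULL)) != -1) {
        switch (c) {
        case 'v':
            s.verbose = 1;
            break;
        case OPT_TIMESTAMP:
            s.timestamp = 1;
            break;
        default:
            break;  /* '?': unknown option, ignored in this sketch */
        }
    }
    return s;
}
```

Here -v and --verbose both reach the 'v' case, while --timestamp can only be given in its long form.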

References:
1. http://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html
2. http://www.ibm.com/developerworks/cn/aix/library/au-unix-getopt.html
3. http://www.cppblog.com/cuijixin/archive/2010/06/13/117788.html
4. OVS source code


