Open-Source Projects: CGI (cgicc)

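Posted by 工程師WWW on 2015-03-30
CGI is short for Common Gateway Interface. It is a mechanism through which an HTTP server "talks" to other programs, and those programs must run on the web server machine. A CGI program runs on the server and provides an interface to the HTML pages on the client side.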


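What CGI does: a CGI program is typically used to interpret and process input from a form, carry out the corresponding processing on the server, and send the resulting information back to the browser. CGI programs make web pages interactive.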


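CGI processing steps:
1. The user's request is sent to the server over the Internet.
2. The server receives the request and hands it to the CGI program.
3. The CGI program sends its result back to the server.
4. The server returns the result to the user.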

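A CGI program can be written in any language, as long as that language supports standard input, standard output, and environment variables.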


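Output of a CGI program: the standard output of a CGI program is redirected. Nothing the program prints appears on the server; it is forwarded to the client's browser instead. So when you write a CGI program in C and print an HTML document to its stdout, that document is displayed in the client's browser. This is one of the basic principles of CGI.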



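The first thing a CGI program outputs must be a header such as "Content-Type: text/html", followed by a blank line. This header precedes the HTML itself. It is needed because a CGI program can send the browser not only HTML text but also images, audio, and other data, so the HTTP server has to tell the remote side what type of content it is sending.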


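Two important CGI environment variables

QUERY_STRING: the data entered in a form submitted with the GET method; it is the part of the URL after the question mark.

CONTENT_LENGTH: the number of bytes of data submitted with the POST method.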


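List of CGI environment variables:
SERVER_NAME: the host name or IP address of the machine running the CGI program.
SERVER_SOFTWARE: the type of web server, for example CERN or NCSA.
SERVER_PROTOCOL: the communication protocol, for example HTTP/1.0.
SERVER_PORT: the TCP port; the usual web port is 80.
HTTP_ACCEPT: the data types the browser can accept, as defined by HTTP.
HTTP_REFERER: the URL of the document that sent the form. (Not all browsers send this variable.)
HTTP_USER_AGENT: information about the browser that sent the form.
GATEWAY_INTERFACE: the version of the CGI specification; under UNIX this is CGI/1.1.
PATH_TRANSLATED: the actual file-system path that PATH_INFO maps to.
PATH_INFO: the additional path information that follows the script name in the URL.
SCRIPT_NAME: the path of the CGI program.
QUERY_STRING: the form data; the part of the URL after the question mark.
REMOTE_HOST: the host name of the machine making the request; this value is not always available.
REMOTE_ADDR: the IP address of the machine making the request.
REMOTE_USER: the name of the user making the request.
CONTENT_TYPE: the type of the data sent with POST, usually application/x-www-form-urlencoded.
CONTENT_LENGTH: the number of bytes of data submitted with the POST method.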
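Form data of type application/x-www-form-urlencoded arrives as name=value pairs joined by "&", with spaces sent as "+" and other special bytes as "%XX". The sketch below shows one way to decode it by hand; urlDecode and parseFormData are hypothetical names, and a library such as cgicc does this work for you.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: decodes "+" and "%XX" escapes in one URL-encoded token.
std::string urlDecode(const std::string& s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '+') {
            out += ' ';
        } else if (s[i] == '%' && i + 2 < s.size()
                   && std::isxdigit(static_cast<unsigned char>(s[i + 1]))
                   && std::isxdigit(static_cast<unsigned char>(s[i + 2]))) {
            out += static_cast<char>(std::stoi(s.substr(i + 1, 2), nullptr, 16));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += s[i];  // ordinary character, or a malformed escape kept as-is
        }
    }
    return out;
}

// Hypothetical helper: splits "a=1&b=2" into decoded name/value pairs.
// A later duplicate name overwrites an earlier one in this simple version.
std::map<std::string, std::string> parseFormData(const std::string& raw)
{
    std::map<std::string, std::string> fields;
    std::size_t start = 0;
    while (start <= raw.size()) {
        std::size_t end = raw.find('&', start);
        if (end == std::string::npos) end = raw.size();
        const std::string pair = raw.substr(start, end - start);
        if (!pair.empty()) {
            const std::size_t eq = pair.find('=');
            const std::string name = urlDecode(pair.substr(0, eq));
            const std::string value =
                (eq == std::string::npos) ? std::string() : urlDecode(pair.substr(eq + 1));
            fields[name] = value;
        }
        start = end + 1;
    }
    return fields;
}
```

For example, parseFormData("name=Al+Smith&age=30") yields the entries name = "Al Smith" and age = "30".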


2. cgicc: a CGI library implemented in C++

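cgicc is a C++ library for developing CGI programs, built on the STL. From a user's point of view it has two parts. The first handles and encapsulates the data passed in by the web server, and consists of the classes Cgicc, CgiEnvironment, CgiInput, FormEntry, and FormFile. The second is the output module: a family of subclasses of the base class MStreamable that wrap HTTP headers and HTML elements. HTTPCookie derives from MStreamable, yet incoming cookies are also represented by HTTPCookie, perhaps because a cookie usually has to be preserved across requests.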

 
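  Cgicc: encapsulates the hand-over of data between the web server and the CGI program. From the web server's side it is the object that receives the parameters the server emits; from the CGI program's side it is the proxy through which the program extracts the data passed by the web server (browser information, the web server's own data, and the data submitted by the user).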
  
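  CgiEnvironment: represents the environment variables of the running CGI program. These variables are initialized by the web server and are part of the data the CGI program has to process. A CgiEnvironment is stored as a data member of the Cgicc object, and you can obtain a const reference to it directly by calling getEnvironment().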
  
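  CgiInput: an abstraction of the way the web server delivers data. For a traditional CGI program this is standard input; for FastCGI it is a separate socket. For FastCGI or any user-defined input channel you can derive a custom class from CgiInput; overriding the read and getenv member functions in the subclass is enough to make it work.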
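The idea behind CgiInput can be illustrated with a small stand-alone sketch. InputSource and MemoryInput below are hypothetical classes written for this illustration; they are not cgicc's own classes and only mirror the read/getenv pair described above. The base class reads from standard input and the process environment, and the subclass replaces both with an in-memory request.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for the input abstraction (NOT cgicc's CgiInput).
class InputSource {
public:
    virtual ~InputSource() {}
    // Default behavior: classic CGI, data from stdin and the environment.
    virtual std::size_t read(char* data, std::size_t length) {
        std::cin.read(data, static_cast<std::streamsize>(length));
        return static_cast<std::size_t>(std::cin.gcount());
    }
    virtual std::string getenv(const char* name) {
        const char* v = std::getenv(name);
        return v ? v : "";
    }
};

// Custom source: serves a request held in memory, e.g. for another transport.
class MemoryInput : public InputSource {
public:
    MemoryInput(const std::map<std::string, std::string>& env, const std::string& body)
        : env_(env), body_(body), pos_(0) {}

    std::size_t read(char* data, std::size_t length) override {
        const std::size_t n = std::min(length, body_.size() - pos_);
        std::memcpy(data, body_.data() + pos_, n);
        pos_ += n;  // remember how much of the body has been consumed
        return n;
    }
    std::string getenv(const char* name) override {
        std::map<std::string, std::string>::const_iterator it = env_.find(name);
        return it == env_.end() ? std::string() : it->second;
    }
private:
    std::map<std::string, std::string> env_;
    std::string body_;
    std::size_t pos_;
};
```

Code that only talks to the InputSource interface keeps working regardless of which subclass supplies the data, which is the point of the abstraction.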
            
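  FormEntry and FormFile: abstractions of the data submitted by the user. FormEntry describes an ordinary name-value pair, while FormFile describes an uploaded file. In essence the only difference is that a FormFile additionally carries a file name and a file type.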
  
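  There are many output classes, so only the basic design idea is described here; see the cgicc documentation for the detailed interfaces.
  cgicc overloads the stream output operator: CGICC_API std::ostream& operator<<(std::ostream& out, const MStreamable& obj); and declares it a friend in MStreamable, the base class of everything that can be output. Any expression of the form "outstream << MStreamable" therefore calls this operator, which in turn calls obj.render(outstream). In other words, overriding the render member function in a subclass of MStreamable is all it takes to customize that subclass's output. In practice a subclass usually converts its internal data to a string and then calls outstream << data_str; by supplying an output stream whose << behaves differently, various output protocols (including FastCGI) can be supported. In traditional CGI this outstream is simply std::cout.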


  Detailed cgicc documentation and examples: cgicc-3.2\doc\html\index.html

Example:

HTML page:

<form method="post" action="http://change_this_path/cgi-bin/foo.cgi">
Your name : <input type="text" name="name" /><br />
Your age : <input type="text" name="age" /><br />
Your sex : <input type="radio" name="sex" value="male" checked="checked" />Male
<input type="radio" name="sex" value="female" />Female <br />
<input type="submit" name="submit" value="Submit" />
</form>

CGI program:

#include <iostream>
#include <vector>
#include <string>

#include "cgicc/Cgicc.h"
#include "cgicc/HTMLClasses.h"

using namespace std;
using namespace cgicc;

int main(int argc, char **argv)
{
   try {
      Cgicc cgi;

      // Send HTTP header
      cout << HTTPHTMLHeader() << endl;

      // Set up the HTML document
      cout << html() << head(title("Cgicc example")) << endl;
      cout << body() << endl;

      // Print out the submitted element
      form_iterator name = cgi.getElement("name");
      if(name != cgi.getElements().end()) {
         cout << "Your name: " << **name << endl;
      }

      // Close the HTML document
      cout << body() << html();
   }
   catch(exception& e) {
      // handle any errors - omitted for brevity
   }
}

Initialization

The three main classes of cgicc you will use to process the submitted data are cgicc::Cgicc, cgicc::CgiEnvironment, and cgicc::FormEntry. These classes will be explained in detail later; for now, it is sufficient to know that:

  • The class cgicc::Cgicc is used for retrieving information on the submitted form elements.

  • The class cgicc::CgiEnvironment is used to retrieve information on environment variables passed from the HTTP server.

  • The class cgicc::FormEntry (or cgicc::FormFile for uploaded files) is used to extract various types of data from the submitted form elements.

All of cgicc's functionality is accessed through class cgicc::Cgicc. Thus, the first step in CGI processing is to instantiate an object of type cgicc::Cgicc:

cgicc::Cgicc cgi;

or

using namespace cgicc;
Cgicc cgi;

Upon instantiation, the class cgicc::Cgicc parses all data passed to the CGI script by the HTTP server.

Since errors are handled using exceptions, you may wish to wrap your CGI code in a try block to better handle unexpected conditions:

try {
   cgicc::Cgicc cgi;
}
catch(std::exception& e) {
   // Caught a standard library exception
}


Extracting Form Information

Each element of data entered by the user is parsed into a cgicc::FormEntry. A cgicc::FormEntry contains methods for accessing data as strings, integers, and doubles. In the form mentioned above, a user would enter their name, age, and sex. Regardless of the type of value, the data is accessed using cgicc::FormEntry (Note: this is not entirely true. For uploaded files, the data is accessed via the class cgicc::FormFile). You obtain cgicc::FormEntry objects via cgicc::Cgicc's getElement methods, all of which return typedefs of C++ standard template library (STL) iterators:

cgicc::form_iterator name = cgi.getElement("name");

If the item is not found, the iterator will refer to an invalid element, and should not be dereferenced using operator* or operator->. cgicc::Cgicc provides methods for determining whether an iterator refers to a valid element:

if(name != cgi.getElements().end()) {
   // iterator refers to a valid element
}


Output of Form Data

Once you have a valid element, you will more than likely want to do something with the data. The simplest thing to do is just echo it back to the user. You can extract a string from a cgicc::FormEntry by calling the getValue method. Since ostream has an overload for writing basic_string objects, it is trivial to output objects of this type:

cout << "Your name is " << name->getValue() << endl;

Since both iterator and cgicc::FormEntry overload operator*, the code given above may also be written as:

cout << "Your name is " << **name << endl;

The first * returns an object of type cgicc::FormEntry, and the second * returns an object of type string.
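The double dereference can be reproduced in a small stand-alone sketch. Entry and findEntry are hypothetical stand-ins for cgicc::FormEntry and getElement; they only imitate the behavior described above and are not cgicc code.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a form entry: a name/value pair whose
// operator* yields the value, so that **iterator gives a string.
class Entry {
public:
    Entry(const std::string& name, const std::string& value)
        : name_(name), value_(value) {}
    const std::string& getName() const { return name_; }
    const std::string& getValue() const { return value_; }
    const std::string& operator*() const { return value_; }
private:
    std::string name_;
    std::string value_;
};

typedef std::vector<Entry>::const_iterator entry_iterator;

// Returns an iterator to the first entry called `name`,
// or entries.end() when no such entry exists.
entry_iterator findEntry(const std::vector<Entry>& entries, const std::string& name)
{
    for (entry_iterator it = entries.begin(); it != entries.end(); ++it) {
        if (it->getName() == name) return it;
    }
    return entries.end();
}
```

Given an iterator it returned by findEntry, *it is the Entry and **it is its string value, so **it and it->getValue() are equivalent. As with cgicc, the iterator must be compared against end() before it is dereferenced.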


The HTTP Response

A CGI response will generally consist of an HTML document. The HTTP protocol requires that a certain set of headers precede all documents, to inform the client of the size and type of data being received, among other things. In a normal CGI response, the HTTP server will take care of sending many of these headers for you. However, it is necessary for the CGI script to supply the type of content it is returning to the HTTP server and the client. This is done by emitting a Content-Type header. If you're interested, the full HTTP 1.1 specification may be found in RFC 2068 at http://www.w3.org/Protocols/rfc2068/rfc2068

cgicc provides several classes for outputting HTTP headers, all of which begin with HTTP. A standard HTML 4.0 document need only output a single header:

cout << HTTPHTMLHeader() << endl;

This will generate the output

Content-Type: text/html\n\n

Simple HTML Output

cgicc provides one class for every HTML tag defined in the HTML 4.0 standard in the header file "cgicc/HTMLClasses.h". These classes have the same name as the HTML tags. For example, in HTML, to indicate the start of a document you write <html>; this can be accomplished using cgicc by writing

cout << html() << endl;

The class html keeps state internally, so the code above will produce as output <html>; conversely, the code

cout << html() << "html text!" << html() << endl;

will produce as output <html>html text!</html>.

All of cgicc's HTML output classes are subclasses of the abstract class cgicc::HTMLElement. You can embed the text for the element directly in the constructor:

cout << html("html text!") << endl;

Furthermore, it is possible to embed one cgicc::HTMLElement in another:

cout << head(title("Title")) << endl;

This produces as output

<head><title>Title</title></head>

And, if you wish to be more specific about the type of HTML 4.0 you are going to return (strict, transitional, or frameset), you can use the class cgicc::HTMLDoctype before the cgicc::html tag:

cout << HTMLDoctype(HTMLDoctype::eStrict) << endl;

which produces

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">

More Complex HTML Output

In real HTML, most tags possess a set of attributes. For example, the HTML <img> tag requires certain attributes specifying the source image file, the image width, height, and so on. There are a bewildering number of possible attributes in HTML 4.0. For a definitive list, see the HTML 4.0 specification at http://www.w3.org/TR/REC-html40/. A typical <img> tag might look like:

<img src="file.jpg" width="100" height="100" alt="description" />

This tag has four attributes: src, width, height, and alt, with the values file.jpg, 100, 100, and description, respectively. Attributes in HTML tags are represented by the class cgicc::HTMLAttribute, which essentially is a name/value pair. To build a cgicc::HTMLElement containing cgicc::HTMLAttribute objects, use the set method on cgicc::HTMLElement. To generate the <img> tag given above:

cout << img().set("src", "file.jpg")
             .set("width", "100").set("height", "100")
             .set("alt", "description") << endl;

In a similar way, multiple cgicc::HTMLElement objects may be embedded at the same level inside another cgicc::HTMLElement. To build a cgicc::HTMLElement containing multiple embedded cgicc::HTMLElement objects, use the add method on cgicc::HTMLElement:

cout << tr().add(td("0")).add(td("1")).add(td("2")) << endl;

This produces as output

<tr><td>0</td><td>1</td><td>2</td></tr>


Notes on Output

All of cgicc's output is written to the C++ standard output stream, cout. It is not necessary to use cgicc's HTML output classes; they are provided as a convenience. If you prefer, you may output the HTML code directly to cout.