ZKUI: Chinese character encoding and running it with Docker

Posted by weixin_34377065 on 2016-11-20

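Chinese encoding in ZKUI

The problem

Last week a colleague reported that node values containing Chinese, once uploaded through ZKUI, could no longer be displayed as Chinese. It turned out that the values were being stored as NCRs, short for Numeric Character References.

What is an NCR?

Here is the description from Wikipedia.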
A numeric character reference (NCR) is a common markup construct used in SGML and SGML-derived markup languages such as HTML and XML. It consists of a short sequence of characters that, in turn, represents a single character. Since WebSgml, XML and HTML 4, the code points of the Universal Character Set (UCS) of Unicode are used. NCRs are typically used in order to represent characters that are not directly encodable in a particular document (for example, because they are international characters that don't fit in the 8-bit character set being used, or because they have special syntactic meaning in the language). When the document is interpreted by a markup-aware reader, each NCR is treated as if it were the character it represents.
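
To make the definition concrete, here is a minimal sketch of converting text to decimal NCRs and back. The class and method names (NcrDemo, toNcr, fromNcr) are made up for illustration and are not part of ZKUI; the Chinese characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not ZKUI code: shows what an NCR looks like.
final class NcrDemo {
    // Matches decimal (&#20013;) and hexadecimal (&#x4E2D;) references.
    private static final Pattern NCR =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    /** Replaces every non-ASCII code point with a decimal NCR. */
    static String toNcr(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    /** Turns decimal and hexadecimal NCRs back into the characters they stand for. */
    static String fromNcr(String text) {
        Matcher m = NCR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            // Character.toChars rejects values that are not valid code points.
            String ch = new String(Character.toChars(cp));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String ncr = toNcr("name=\u4e2d\u6587"); // the two characters of "Chinese"
        System.out.println(ncr);
        System.out.println(fromNcr(ncr));
    }
}
```

A node value that shows up in this "&#NNNNN;" form is what my colleague was seeing instead of readable Chinese.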

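Confirming that ZKUI is at fault

ZooKeeper itself can store Chinese (see the excerpt from the ZooKeeper website below), so it is fairly safe to conclude that the problem lies in the ZKUI tool.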
The ZooKeeper Data Model
ZooKeeper has a hierarchal name space, much like a distributed file system. The only difference is that each node in the namespace can have data associated with it as well as children. It is like having a file system that allows a file to also be a directory. Paths to nodes are always expressed as canonical, absolute, slash-separated paths; there are no relative reference. Any unicode character can be used in a path subject to the following constraints:

  • The null character (\u0000) cannot be part of a path name. (This causes problems with the C binding.)
  • The following characters can't be used because they don't display well, or render in confusing ways: \u0001 - \u0019 and \u007F - \u009F.
  • The following characters are not allowed: \ud800 -uF8FFF, \uFFF0-uFFFF, \uXFFFE - \uXFFFF (where X is a digit 1 - E), \uF0000 - \uFFFFF.
  • The "." character can be used as part of another name, but "." and ".." cannot alone be used to indicate a node along a path, because ZooKeeper doesn't use relative paths. The following would be invalid: "/a/b/./c" or "/a/b/../c".
  • The token "zookeeper" is reserved.
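
As a rough illustration that Chinese characters pass these rules, here is a simplified sketch of a checker. It is not ZooKeeper's own validation code, and it rests on assumptions: the class name ZkPathCheckDemo is invented, the garbled range in the third bullet is read as U+D800 to U+F8FF, and the supplementary-plane ranges and the reserved "zookeeper" token are left out.

```java
// Hypothetical, simplified checker for the path rules quoted above.
final class ZkPathCheckDemo {
    /** Returns null when the path passes the checks, otherwise a short reason. */
    static String check(String path) {
        if (path == null || !path.startsWith("/")) {
            return "path must be absolute";
        }
        // "." and ".." may not stand alone as a path component.
        for (String part : path.substring(1).split("/", -1)) {
            if (part.equals(".") || part.equals("..")) {
                return "relative component: " + part;
            }
        }
        for (int i = 0; i < path.length(); i++) {
            int c = path.charAt(i);
            if (c == 0x0000) {
                return "null character at index " + i;
            }
            if ((c >= 0x0001 && c <= 0x0019) || (c >= 0x007F && c <= 0x009F)) {
                return "control character at index " + i;
            }
            // Assumption: the quoted range is read as U+D800 to U+F8FF.
            if ((c >= 0xD800 && c <= 0xF8FF) || (c >= 0xFFF0 && c <= 0xFFFF)) {
                return "disallowed character at index " + i;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("/config/\u4e2d\u6587")); // Chinese node name
        System.out.println(check("/a/b/../c"));
    }
}
```

Chinese characters sit in the U+4E00 block and beyond, well clear of the forbidden ranges, which matches the conclusion that ZooKeeper is not the culprit.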

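What is ZKUI built on?

ZKUI is an open-source Java tool, and its source code can be downloaded. Reading it shows that the web layer is built directly on HttpServlet, without any higher-level framework such as Spring MVC. Once you know it is plain HttpServlet, you can compare it with how Spring MVC handles Chinese, and the fix for Chinese being turned into NCRs becomes straightforward.

Doesn't this project structure look a lot like Spring MVC?

Fixing the encoding of HttpServlet requests

A filter can be added that sets UTF-8 on both the request and the response object. With this in place, the single-node CRUD operations in ZKUI handle Chinese node values correctly.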

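import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;

// Applies to every URL so that all servlets see UTF-8 requests and responses.
@WebFilter(filterName = "filtercharset", urlPatterns = "/*")
public class CharsetFilter implements Filter {

    @Override
    public void init(FilterConfig fc) throws ServletException {
        // Nothing to initialise.
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain fc) throws IOException, ServletException {
        // Must run before any parameter is read; once the body has been
        // decoded, changing the request encoding has no effect.
        req.setCharacterEncoding(StandardCharsets.UTF_8.name());
        res.setCharacterEncoding(StandardCharsets.UTF_8.name());

        fc.doFilter(req, res);
    }

    @Override
    public void destroy() {
        // Nothing to clean up.
    }

}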
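
Why the filter matters can be shown without a servlet container. As far as I understand the Servlet specification of that era, a request body with no declared charset is decoded as ISO-8859-1. The sketch below (RequestCharsetDemo is an invented name, not ZKUI code) decodes the same UTF-8 bytes both ways.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the same bytes decoded with two different charsets.
final class RequestCharsetDemo {
    /** Decodes raw body bytes with the given charset, as a container would. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, sent by the browser as six UTF-8 bytes.
        byte[] body = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);

        String wrong = decode(body, StandardCharsets.ISO_8859_1);
        String right = decode(body, StandardCharsets.UTF_8);

        // Each byte becomes one Latin-1 character, so the text is garbled.
        System.out.println(wrong.length());
        // The correct charset yields the original two characters.
        System.out.println(right.length());
    }
}
```

Setting the encoding in a filter means every servlet gets the second behaviour without having to repeat the call.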

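Fixing the encoding of uploaded files

The filter above only fixes the encoding of single-node CRUD done through page GET and POST requests; file upload needs separate handling. Here is the relevant source. I removed the details and kept only the overall structure and the encoding-related code: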

public class Import extends HttpServlet {

    private final static Logger logger = LoggerFactory.getLogger(Import.class);

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        logger.debug("Importing Action!");
        try {

            Iterator iter = items.iterator();
            while (iter.hasNext()) {
                FileItem item = (FileItem) iter.next();
                if (item.isFormField()) {
                    if (item.getFieldName().equals("scmOverwrite")) {
                        scmOverwrite = item.getString();
                    }
                    if (item.getFieldName().equals("scmServer")) {
                        scmServer = item.getString();
                    }
                    if (item.getFieldName().equals("scmFilePath")) {
                        scmFilePath = item.getString();
                    }
                    if (item.getFieldName().equals("scmFileRevision")) {
                        scmFileRevision = item.getString();
                    }

                } else {
                    uploadFileName = item.getName();
                    // The original code called item.getString(); I added an explicit encoding.
                    sbFile.append(item.getString(StandardCharsets.UTF_8.toString()));
                }
            }


            List<String> importFile = new ArrayList<>();
            // Read the node values......

            ZooKeeperUtil.INSTANCE.importData(importFile, Boolean.valueOf(scmOverwrite), ServletUtil.INSTANCE.getZookeeper(request, 
            // Handle the remaining work.
            request.getSession().setAttribute("flashMsg", "Import Completed!");
            response.sendRedirect("/home");
        } catch (FileUploadException | IOException | InterruptedException | KeeperException ex) {
            logger.error(Arrays.toString(ex.getStackTrace()));
            ServletUtil.INSTANCE.renderError(request, response, ex.getMessage());
        }
    }
}

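The uploaded file also has to meet some format requirements:

  • It must be UTF-8 without a BOM. If the file has a BOM, the leading byte-order mark (the bytes EF BB BF) has to be removed by hand before processing, or this regular expression has to be changed: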
  else if (!inputLine.matches("/.+=.+=.*")) {
     throw new IOException("Invalid format at line " + lineCnt + ": " + inputLine);
  }
  • If the file is not UTF-8 encoded, the imported content will be garbled.
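
The BOM problem can be reproduced in a few lines. This is a sketch under assumptions: ImportLineDemo, stripBom and isValidLine are names invented here, and only the regular expression is taken from the ZKUI snippet above. Java keeps the BOM as the character U+FEFF when decoding UTF-8, so the first line no longer starts with "/".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not ZKUI code: strips a leading BOM before validation.
final class ImportLineDemo {
    private static final char BOM = (char) 0xFEFF;

    /** Removes a leading byte-order mark, if there is one. */
    static String stripBom(String content) {
        if (!content.isEmpty() && content.charAt(0) == BOM) {
            return content.substring(1);
        }
        return content;
    }

    /** Applies the same pattern as the check quoted above. */
    static boolean isValidLine(String line) {
        return line.matches("/.+=.+=.*");
    }

    public static void main(String[] args) {
        // EF BB BF followed by "/app=k=v", as an editor that writes a BOM would save it.
        byte[] file = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
                '/', 'a', 'p', 'p', '=', 'k', '=', 'v'};
        String line = new String(file, StandardCharsets.UTF_8);

        System.out.println(isValidLine(line));           // rejected because of the BOM
        System.out.println(isValidLine(stripBom(line))); // accepted once it is stripped
    }
}
```

Stripping the mark once, right after decoding the upload, avoids having to loosen the regular expression.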

Publishing ZKUI as a Docker image

Upload the compiled jar, config.cfg and the project's Dockerfile to the Linux server.

A short walk through the Dockerfile

FROM java:8

MAINTAINER jim <jiangmin168168@hotmail.com>

WORKDIR /var/app

ADD zkui-*.jar /var/app/zkui.jar
ADD config.cfg /var/app/config.cfg

ENTRYPOINT [ "java", "-jar", "/var/app/zkui.jar" ]

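EXPOSE 9090

  • FROM: the base image the build depends on, for example java 8, ubuntu or redis.
  • MAINTAINER: contact information for the maintainer.
  • WORKDIR: the working directory. A shell opened in the container starts in this directory, e.g. root@zkui-host:/var/app#
  • ADD: copies files into the image.
  • ENTRYPOINT: the command run automatically when the container starts; here it launches our jar directly.
  • EXPOSE: the container-side port that the host connects to. When the container is actually started, -p sets the mapping between host port and container port.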

Building the image

docker build -t jiangmin168168/zkui .

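The jiangmin168168 above is my ID on the Docker website, which allows the image to be uploaded there; for local use only, the ID can be left out. Even with the ID, build alone does not upload anything: you still have to log in to Docker and run the push command.

docker build is very slow

The official Docker registry is very slow from inside China. I hardly ever got a build to finish; after a long wait it would mercilessly report a timeout. In the end, on a colleague's recommendation, I used the Docker accelerator provided by DaoCloud, which is a registry mirror hosted in China. Most of what would otherwise be downloaded from docker.io then comes from the mirror, and the speed-up is dramatic.

Inspecting the built image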

docker images

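The image turns out to be quite large, probably because the java:8 base image is pulled in. Later I may look into bundling my own JDK to see whether that reduces the size.

Starting the container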

docker run -dit --name zkui  --hostname  zkui-host  -v /data:/data -p 9090:9090 zkui:latest

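A brief explanation of these parameters:

  • -dit: -d runs the container in the background (detached); -i keeps STDIN open and -t allocates a pseudo-TTY.
  • --name: the name of the container.
  • --hostname: the host name shown once you are inside the container.
  • -v: maps a host directory to a container directory, so the container can access files on the host.
  • -p: maps a host port to a container port. The container port is fixed while the host port is configurable, which makes it possible to deploy several container nodes.
  • zkui:latest: the image name followed by the tag; if only the image name is given, the tag defaults to latest. The name has to match the one passed to docker build -t, so an image built as jiangmin168168/zkui is started as jiangmin168168/zkui:latest.

Open issues

Where in ZKUI does the NCR encoding get introduced? I still need to track that down.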