First, thanks to the country. Thanks also to the people whose stories reminded me how much a proper education matters, and left me feeling obliged to write about subsystem initialization.

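Initializing the individual subsystems is a basic task that the kernel's overall initialization must complete. These tasks follow a fixed pattern and fall into two parts: parsing the kernel options, and calling the subsystems' entry (initialization) functions.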

Kernel options

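Linux lets users pass configuration options to the kernel. During initialization the kernel calls the parse_args function to parse these options and invoke the corresponding handlers.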

parse_args can parse strings of the form "name=value". It is also called at module load time to parse module parameters.

Kernel options use the same "name=value" format. Open the system's grub configuration file and find the kernel line, for example:

    kernel  /boot/vmlinuz-2.6.18 root=/dev/sda1 ro splash=silent vga=0x314 pci=noacpi

Here "pci=noacpi" and the other items after the kernel image path are all kernel options.
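To make the "name=value" shape concrete, here is a minimal userspace sketch of splitting one token from such a line. It is not the kernel's parse_args; the helper name split_option is made up for illustration, and it only handles the simplest case (no quoting, first "=" wins).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the kernel's parse_args(): splits one
 * "name=value" token in place. On return *name points at the name and
 * *value at the text after the first '=', or NULL if there is no '='. */
void split_option(char *token, char **name, char **value)
{
    assert(token != NULL && name != NULL && value != NULL);

    char *eq = strchr(token, '=');

    *name = token;
    if (eq == NULL) {
        *value = NULL;      /* a bare flag such as "ro" */
        return;
    }
    *eq = '\0';             /* terminate the name where '=' was */
    *value = eq + 1;
}
```

Called on a writable copy of "pci=noacpi", it yields the name "pci" and the value "noacpi"; called on "ro", it yields the name "ro" and a NULL value.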

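Kernel options differ from module parameters. Module parameters are normally given as "name=value" when the module is loaded, not when the kernel boots. To set a module parameter at boot time, you must prefix it with the module name, in the form "module.parameter=value". For example, the following command sets the module parameter autosuspend to 2 when loading usbcore: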

    $ modprobe usbcore autosuspend=2

To specify it at kernel boot time instead, you must use the following form:

    usbcore.autosuspend=2

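The kernel options that a subsystem has registered can be looked up in Documentation/kernel-parameters.txt. For example, the PCI subsystem registers the following kernel options: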
 pci=option[,option…] [PCI] various PCI subsystem options:
    off  [X86-32] don't probe for the PCI bus
    bios  [X86-32] force use of PCI BIOS, don't access
        the hardware directly. Use this if your machine
        has a non-standard PCI host bridge.
    nobios  [X86-32] disallow use of PCI BIOS, only direct
        hardware access methods are allowed. Use this
        if you experience crashes upon bootup and you
        suspect they are caused by the BIOS.
    conf1  [X86-32] Force use of PCI Configuration
        Mechanism 1.
    conf2  [X86-32] Force use of PCI Configuration
        Mechanism 2.
    nommconf [X86-32,X86_64] Disable use of MMCONFIG for PCI
        Configuration
    nomsi  [MSI] If the PCI_MSI kernel config parameter is
        enabled, this kernel boot option can be used to
        disable the use of MSI interrupts system-wide.
    nosort  [X86-32] Don't sort PCI devices according to
        order given by the PCI BIOS. This sorting is
        done to get a device order compatible with
        older kernels.
    biosirq  [X86-32] Use PCI BIOS calls to get the interrupt
        routing table. These calls are known to be buggy
        on several machines and they hang the machine
        when used, but on other computers it's the only
        way to get the interrupt routing table. Try
        this option if the kernel is unable to allocate
        IRQs or discover secondary PCI buses on your
        motherboard.
    rom  [X86-32] Assign address space to expansion ROMs.
        Use with caution as certain devices share
        address decoders between ROMs and other
        resources.
    irqmask=0xMMMM [X86-32] Set a bit mask of IRQs allowed to be
        assigned automatically to PCI devices. You can
        make the kernel exclude IRQs of your ISA cards
        this way.
    pirqaddr=0xAAAAA [X86-32] Specify the physical address
        of the PIRQ table (normally generated
        by the BIOS) if it is outside the
        F0000h-100000h range.
    lastbus=N [X86-32] Scan all buses thru bus #N. Can be
        useful if the kernel is unable to find your
        secondary buses and you want to tell it
        explicitly which ones they are.
    assign-busses [X86-32] Always assign all PCI bus
        numbers ourselves, overriding
        whatever the firmware may have done.
    usepirqmask [X86-32] Honor the possible IRQ mask stored
        in the BIOS $PIR table. This is needed on
        some systems with broken BIOSes, notably
        some HP Pavilion N5400 and Omnibook XE3
        notebooks. This will have no effect if ACPI
        IRQ routing is enabled.
    noacpi  [X86-32] Do not use ACPI for IRQ routing
        or for PCI scanning.
    routeirq Do IRQ routing for all PCI devices.
        This is normally done in pci_enable_device(),
        so this option is a temporary workaround
        for broken drivers that don't call it.
    firmware [ARM] Do not re-enumerate the bus but instead
        just use the configuration from the
        bootloader. This is currently used on
        IXP2000 systems where the bus has to be
        configured a certain way for adjunct CPUs.
    noearly  [X86] Don't do any early type 1 scanning.
        This might help on some broken boards which
        machine check when some devices' config space
        is read. But various workarounds are disabled
        and some IOMMU drivers will not work.
    bfsort  Sort PCI devices into breadth-first order.
        This sorting is done to get a device
        order compatible with older (<= 2.4) kernels.
    nobfsort Don't sort PCI devices into breadth-first order.
    cbiosize=nn[KMG] The fixed amount of bus space which is
        reserved for the CardBus bridge's IO window.
        The default value is 256 bytes.
    cbmemsize=nn[KMG] The fixed amount of bus space which is
        reserved for the CardBus bridge's memory
        window. The default value is 64 megabytes.

Registering kernel options

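We do not need to understand the implementation details of parse_args. But we do need to know how kernel options are registered: module parameters are registered with the module_param family of macros, while kernel options are registered with the __setup macro.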

The __setup macro is defined in include/linux/init.h.

    #define __setup(str, fn) \
        __setup_param(str, fn, fn, 0)

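__setup takes two arguments: str is the name of the kernel option, and fn is the handler associated with that option. The __setup macro tells the kernel to run fn if it sees the kernel option str at boot. For an option that takes a value, str includes the trailing "=" character in addition to the option name.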

Different kernel options can share the same handler. For example, the kernel options netdev and ether are both associated with the netdev_boot_setup function.
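The idea of pairing an option string with a handler, and letting two strings share one handler, can be sketched in userspace C as a small lookup table. This is a simplification for illustration only: the real __setup places its entries in a special section at link time, and the names setup_entry, demo_netdev_setup, and dispatch_option below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one __setup() registration: an option prefix
 * (ending in '=') and the handler to call with the text after it. */
struct setup_entry {
    const char *str;
    int (*fn)(char *value);
};

static int netdev_calls;    /* counts handler invocations, for illustration */

/* Hypothetical handler shared by two options, in the way the text says
 * netdev_boot_setup is shared by "netdev" and "ether". */
static int demo_netdev_setup(char *value)
{
    (void)value;            /* a real handler would parse the value here */
    netdev_calls++;
    return 1;               /* non-zero: the option was handled */
}

static const struct setup_entry setup_table[] = {
    { "netdev=", demo_netdev_setup },
    { "ether=",  demo_netdev_setup },
};

/* Find the first entry whose str is a prefix of arg and call its handler.
 * Returns the handler's result, or 0 if no entry matched. */
int dispatch_option(char *arg)
{
    assert(arg != NULL);

    for (size_t i = 0; i < sizeof(setup_table) / sizeof(setup_table[0]); i++) {
        size_t n = strlen(setup_table[i].str);

        if (strncmp(arg, setup_table[i].str, n) == 0)
            return setup_table[i].fn(arg + n);
    }
    return 0;
}
```

Passing either "netdev=..." or "ether=..." to dispatch_option ends up in the same handler, while an unregistered string such as "quiet" falls through and returns 0.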

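Besides __setup, kernel options can also be registered with the early_param macro. The two are used the same way. The difference is that options registered with early_param are processed before the other kernel options.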

Two parsing passes

Corresponding to the two registration forms, __setup and early_param, the kernel calls parse_args twice during initialization.

    parse_early_param();
    parse_args("Booting kernel", static_command_line, __start___param,
               __stop___param - __start___param,
               &unknown_bootoption);

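The first call to parse_args happens inside parse_early_param. Why is parse_args called twice? Because kernel options come in two kinds, much like people in the real world: ordinary ones and privileged ones, and the privileged ones have to be processed before the ordinary ones.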

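In real life the definition of privilege seems rather vague, and different people interpret it differently. For example, when CCTV interviewed the discipline inspection secretary of the Second Affiliated Hospital of Harbin Medical University about the "5.5 million yuan hospital bill for an elderly patient", the secretary said: "We are simply a people's hospital... a hospital for the poor and lower-middle peasants. We have never used privilege to seek any benefit beyond what is ours... Not only did we not overcharge, we actually undercharged."
Life is just that complicated and strange. Kernel options are much simpler by comparison: their privilege is out in the open, never hidden. It is declared directly with the early_param macro, so you can tell at a glance that an option is privileged. Options declared with early_param are parsed first, by parse_early_param.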
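The ordering guarantee can be shown with a small userspace sketch: run over the same argument list twice, handling only the "early" registrations in the first pass and the rest in the second. Everything here is an assumption made for illustration (the option names demo_early and demo_normal, the early flag, the handled log); the kernel's own bookkeeping is different, and the arguments are bare names without values to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOG 8

/* Hypothetical registration record: an option name plus a flag saying
 * whether it was registered as "early". */
struct option_entry {
    const char *name;
    int early;
};

static const struct option_entry options[] = {
    { "demo_early",  1 },   /* stands in for an early_param() option */
    { "demo_normal", 0 },   /* stands in for a __setup() option */
};

static const char *handled[MAX_LOG];    /* order in which options ran */
static size_t handled_count;

/* One pass over the arguments: handle only the registered options whose
 * early flag equals the requested one, and log them in order. */
void parse_pass(const char *const *args, size_t nargs, int early)
{
    assert(args != NULL);

    for (size_t i = 0; i < nargs; i++) {
        for (size_t j = 0; j < sizeof(options) / sizeof(options[0]); j++) {
            if (options[j].early == early &&
                strcmp(args[i], options[j].name) == 0 &&
                handled_count < MAX_LOG)
                handled[handled_count++] = args[i];
        }
    }
}

/* Two passes, mirroring the order described in the text:
 * early options first, then everything else. */
void parse_all(const char *const *args, size_t nargs)
{
    parse_pass(args, nargs, 1);
    parse_pass(args, nargs, 0);
}
```

Even if demo_normal appears before demo_early on the command line, demo_early is logged first, because the early pass runs to completion before the ordinary pass starts.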