Don't Assume You Really Understand OpenStack: 50 Steps and 100 Knowledge Points of Virtual Machine Creation (4)

Posted by hxw2ljj on 2015-11-18

VI. Libvirt


With Libvirt, before a virtual machine can be started it must first be defined; the definition is an XML-format file.

List all instances:

# virsh list
Id    Name                           State
----------------------------------------------------
10    instance-00000006              running

# virsh dumpxml instance-00000006
<domain type='kvm' id='10'>
  <name>instance-00000006</name>
  <uuid>73b896bb-7c7d-447e-ab6a-c4089532f003</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>2014.1.1</entry>
      <entry name='serial'>80590690-87d2-e311-b1b0-a0481cabdfb4</entry>
      <entry name='uuid'>73b896bb-7c7d-447e-ab6a-c4089532f003</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-trusty'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/disk'/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:ae:f4:17'/>
      <source bridge='qbrc51a349e-87'/>
      <target dev='tapc51a349e-87'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/console.log'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='pty'>
      <source path='/dev/pts/20'/>
      <target port='1'/>
      <alias name='serial1'/>
    </serial>
    <console type='file'>
      <source path='/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/console.log'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-73b896bb-7c7d-447e-ab6a-c4089532f003</label>
    <imagelabel>libvirt-73b896bb-7c7d-447e-ab6a-c4089532f003</imagelabel>
  </seclabel>
</domain>

We can see that it defines the virtualization type (kvm), the vcpu, memory, disk, pty, and so on. Note the network device in particular: it is a tap device attached to the qbr bridge.
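The interesting elements of this XML can be pulled out programmatically. Below is a minimal sketch using only Python's standard library; the XML literal is abridged from the `virsh dumpxml` output above, and the dictionary keys are my own naming, not any libvirt API:

```python
import xml.etree.ElementTree as ET

# Abridged from the `virsh dumpxml instance-00000006` output above.
domain_xml = """
<domain type='kvm' id='10'>
  <name>instance-00000006</name>
  <memory unit='KiB'>2097152</memory>
  <vcpu placement='static'>1</vcpu>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'>
      <mac address='fa:16:3e:ae:f4:17'/>
      <source bridge='qbrc51a349e-87'/>
      <target dev='tapc51a349e-87'/>
    </interface>
  </devices>
</domain>
"""

root = ET.fromstring(domain_xml)
info = {
    "virt_type": root.get("type"),              # virtualization type: kvm
    "name": root.findtext("name"),
    "memory_kib": int(root.findtext("memory")),
    "vcpus": int(root.findtext("vcpu")),
    "disk_bus": root.find("./devices/disk/target").get("bus"),
    # the NIC is a tap device plugged into the qbr Linux bridge
    "nic_bridge": root.find("./devices/interface/source").get("bridge"),
    "nic_tap": root.find("./devices/interface/target").get("dev"),
}
print(info)
```

The same traversal works on the full dump; `virsh dumpxml` output is what Nova hands to libvirt's define call.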

There are many types of virtualization; see the following articles:

Virtualization Technology

[Repost] Virtualization Basics

Once the virtual machine has started, inspecting its process reveals an enormously complex set of parameters:

# ps aux | grep instance-00000006
libvirt+ 22200  6.3  0.4 5464532 282888 ?      Sl   09:51   0:09 qemu-system-x86_64 -enable-kvm -name instance-00000006 -S -machine pc-i440fx-trusty,accel=kvm,usb=off -cpu SandyBridge,+erms,+smep,+fsgsbase,+pdpe1gb,+rdrand,+f16c,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 73b896bb-7c7d-447e-ab6a-c4089532f003 -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=2014.1.1,serial=80590690-87d2-e311-b1b0-a0481cabdfb4,uuid=73b896bb-7c7d-447e-ab6a-c4089532f003 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-00000006.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:ae:f4:17,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

But who can explain what each of these parameters does?

Please read the following articles carefully:

 

QEMU KVM libvirt Manual (3) - Storage Media

QEMU KVM Libvirt Manual (7): Hardware Virtualization

QEMU KVM Libvirt Manual (8): The Paravirtualized Device virtio

The machine parameter is the board/bus architecture; the available types can be listed with qemu-system-x86_64 --machine ?, and the value shown in the parameters above is the default.

accel=kvm indicates that KVM is used as the virtualization accelerator.

The cpu parameter gives the processor model together with its feature flags; the available models can be listed with qemu-system-x86_64 --cpu ?.

smp stands for symmetric multiprocessing:

-smp 1,sockets=1,cores=1,threads=1

QEMU emulates a processor with 1 vcpu: one socket, one core, one thread.

What exactly do socket, core, and thread mean?

(1) socket is the number of CPU sockets on the motherboard, what administrators call "ways"
(2) core is what we usually call a "core", as in dual-core, quad-core, and so on
(3) thread is the number of hardware threads per core, i.e. hyper-threading

A concrete example: on a 2-way, quad-core, hyper-threaded server (hyper-threading usually defaults to 2 threads), cat /proc/cpuinfo shows 2*4*2=16 processors, which many people casually call "16 cores"!
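The arithmetic above can be sketched directly. A small illustrative helper (the function name is my own, not from any library):

```python
def logical_processors(sockets: int, cores: int, threads: int) -> int:
    """Number of `processor` entries /proc/cpuinfo would show."""
    return sockets * cores * threads

# The 2-way, quad-core, hyper-threaded server from the example:
print(logical_processors(sockets=2, cores=4, threads=2))  # 16

# The guest defined by `-smp 1,sockets=1,cores=1,threads=1`:
print(logical_processors(sockets=1, cores=1, threads=1))  # 1
```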

SMBIOS stands for System Management BIOS and describes the hardware information of an x86 machine, including the BIOS and the motherboard. Here everything says OpenStack, so it is all fake.

-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-00000006.monitor,server,nowait

-mon chardev=charmonitor,id=monitor,mode=control

These two go together: they expose the monitor over a UNIX socket so that it can be driven through virsh.

rtc refers to the system clock; -no-hpet means the more precise HPET timer is not used.

-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 is the USB controller, attached to PCI bus 0 as device 1, function 2

The next two go together:

-drive file=/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none

-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1

This is the disk: the drive points at the backing file, and the device uses virtio, attached to PCI bus 0 as device 4, function 0

The next two go together:

-netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25

-device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:ae:f4:17,bus=pci.0,addr=0x3

This is the network card: a tap device, using virtio, attached to PCI bus 0 as device 3, function 0

The next two go together:

-chardev file,id=charserial0,path=/var/lib/nova/instances/73b896bb-7c7d-447e-ab6a-c4089532f003/console.log

-device isa-serial,chardev=charserial0,id=serial0

This is a chardev that redirects the console output to console.log

The next two go together; this is the pty:

-chardev pty,id=charserial1

-device isa-serial,chardev=charserial1,id=serial1

This is the video card:

-device cirrus-vga,id=video0,bus=pci.0,addr=0x2

This is the memory balloon device:

-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

All of these are attached to the PCI bus. From within virsh, the command qemu-monitor-command instance-00000024 --hmp "info pci" lists every device on the PCI bus.
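The pairs above are linked by their id fields: a -device's drive= (or netdev=) property refers back to the id= of the matching -drive (or -netdev) option. A sketch of how a -device option string decomposes; parse_device_opts is a hypothetical helper for illustration, not part of QEMU:

```python
def parse_device_opts(optstr: str) -> dict:
    """Split a QEMU -device option string into the driver name plus
    key=value properties (hypothetical helper, not part of QEMU)."""
    driver, *props = optstr.split(",")
    out = {"driver": driver}
    for prop in props:
        key, _, val = prop.partition("=")
        out[key] = val if val else True  # bare flags become True
    return out

# The disk and NIC devices from the command line above.  disk["drive"]
# points back to the id= of the matching -drive option, and nic["netdev"]
# to the id= of the matching -netdev option.
disk = parse_device_opts(
    "virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,"
    "drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1")
nic = parse_device_opts(
    "virtio-net-pci,netdev=hostnet0,id=net0,"
    "mac=fa:16:3e:ae:f4:17,bus=pci.0,addr=0x3")

print(disk["driver"], disk["addr"], disk["drive"])
print(nic["driver"], nic["addr"], nic["netdev"])
```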

Many of these are paravirtualized devices, which improve performance:

[Repost] KVM VirtIO paravirtualized drivers: why they matter

Virtio: An I/O virtualization framework for Linux

QEMU KVM Libvirt Manual (8): The Paravirtualized Device virtio

[Repost] Virtio balloon

Besides hardware virtualization and paravirtualization, QEMU and KVM also have mechanisms of their own for networking:

 

QEMU KVM Libvirt Manual (9): network

QEMU Networking

Virtual Networking

Storage likewise has its own mechanisms:

QEMU KVM Libvirt Manual (11): Managing Storage

The last topic of this section is libvirt's management of virtual machines.

There is a powerful tool called the monitor, which supports many kinds of operations and serves as the machine's management interface; it can also be driven through virsh. See QEMU KVM libvirt Manual (2).

The most important command-line tool is virsh; see QEMU KVM Libvirt Manual (10): Managing Virtual Machines with libvirt.

 

VII. Neutron


In this step, the instance is connected to the network devices that have already been created.

Step 33: Create the qbr Linux bridge

Step 34: Create the veth pair, qvo and qvb

Step 35: Attach qvb to qbr

Step 36: Attach qvo to br-int
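The four steps above correspond roughly to the following commands. This is only a sketch: the exact invocations Nova and Neutron issue differ in detail (bringing links up, setting OVS port attributes, and so on), and the interface names are derived from the first 11 characters of the Neutron port ID:

```python
def plumbing_commands(port_id_prefix: str) -> list:
    """Sketch of the wiring in steps 33-36 for a Neutron port whose
    ID begins with `port_id_prefix` (e.g. 'c51a349e-87')."""
    qbr = f"qbr{port_id_prefix}"   # Linux bridge        (step 33)
    qvb = f"qvb{port_id_prefix}"   # bridge-side veth end (step 34)
    qvo = f"qvo{port_id_prefix}"   # OVS-side veth end    (step 34)
    return [
        f"brctl addbr {qbr}",                              # step 33
        f"ip link add {qvb} type veth peer name {qvo}",    # step 34
        f"brctl addif {qbr} {qvb}",                        # step 35
        f"ovs-vsctl add-port br-int {qvo}",                # step 36
    ]

# The port from the domain XML above (tap/qbr name suffix c51a349e-87):
for cmd in plumbing_commands("c51a349e-87"):
    print(cmd)
```

The extra qbr bridge exists so that iptables-based security groups can be applied; iptables rules cannot be attached directly to an OVS port.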

The wiring looks complex yet orderly. To understand why it is done this way, you need to understand Neutron's network device architecture.

In fact, it was diagrammed long ago, as in the figures below.

On the network node:

[image: under-the-hood-scenario-1-ovs-network]

On the compute node:

[image: under-the-hood-scenario-1-ovs-compute]

At this point many readers' heads start spinning: why does OpenStack create so many virtual network devices, how are they related, and what on earth are dl_vlan and mod_vlan_vid?

See the article The Basic Principles of Neutron.

Different private networks in Neutron are isolated from one another. There are three commonly used tenant-isolation technologies, VLAN, GRE, and VXLAN, each with its own strengths and weaknesses.

How VLANs Work

A virtual LAN (VLAN) is a group of networking devices in the same broadcast domain.


There are two kinds of VLAN:

Static VLAN/Port-based VLAN

  • manually assign a port on a switch to a VLAN using an Interface Subconfiguration mode command.

Dynamic VLANs

  • the switch automatically assigns the port to a VLAN using information from the user device, such as its MAC address, IP address, or even directory information (a user or group name, for instance).
  • The switch then consults a policy server, called a VLAN membership policy server (VMPS), which contains a mapping of device information to VLANs.

There are two kinds of connection:

Access-Link Connections

  • a device that has a standardized Ethernet NIC that understands only standardized Ethernet frames
  • Access-link connections can only be associated with a single VLAN.

Trunk Connections

  • trunk connections are capable of carrying traffic for multiple VLANs.


IEEE’s 802.1Q


Advantages

Increased performance

  • reducing collisions
  • limiting broadcast traffic
  • Less traffic needs to be routed

Improved manageability

  • Manage logical groups

Increased security options

  • packets only to other members of the VLAN.

Disadvantages

Limited number of VLANs: the 12-bit VLAN ID allows roughly 4000, and in practice often only about 1000 are usable

Limited number of MAC addresses supported in switch tables
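The VLAN-count ceiling comes from the 12-bit VLAN ID field in the 802.1Q tag that trunk ports insert into Ethernet frames. A small sketch constructing that 4-byte tag (the function is my own illustration, not from any library):

```python
import struct

def dot1q_tag(vid: int, pcp: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag inserted after the source MAC:
    TPID 0x8100, then TCI = PCP(3 bits) | DEI(1 bit) | VID(12 bits)."""
    if not 0 < vid < 4095:            # VIDs 0 and 4095 are reserved
        raise ValueError("VLAN ID must be 1..4094")
    tci = (pcp << 13) | vid
    return struct.pack("!HH", 0x8100, tci)

print(dot1q_tag(100).hex())   # '81000064'
```

Twelve bits give 4096 values, two of which are reserved, hence the roughly-4000 limit.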

How GRE Works

Generic Routing Encapsulation (GRE) is a tunneling protocol that can encapsulate a wide variety of network layer protocols inside virtual point-to-point links over an Internet Protocol internetwork.


Header


Advantages

Resolves the VLAN and MAC limitations by encapsulating communications within point-to-point "tunnels", which hide the guest MAC information and expose only the MAC addresses of the host systems.

L2 over L3: after leaving the encapsulated L2 virtual network, the traffic is forwarded to a gateway, which de-encapsulates it and routes it out onto the underlying, unencapsulated network.

 

Disadvantages

Point-to-point tunnels

Poor extensibility

Few switches understand the GRE header, so load distribution and ACLs (both of which depend on IPs and ports) cannot be applied
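GRE's encapsulation itself is simple to sketch: the inner frame is prefixed with a small GRE header and carried inside an outer IP packet (IP protocol 47). A minimal illustration of the base header from RFC 2784; note that OVS GRE tunnels additionally use the optional GRE key field to carry the tenant segment ID, which this sketch omits:

```python
import struct

def gre_encapsulate(inner_frame: bytes, protocol_type: int = 0x6558) -> bytes:
    """Prefix a payload with a minimal GRE header (RFC 2784: no
    checksum, no key, no sequence number).  EtherType 0x6558 is
    Transparent Ethernet Bridging, i.e. an L2 frame in the tunnel."""
    flags_and_version = 0x0000   # all optional fields absent, version 0
    header = struct.pack("!HH", flags_and_version, protocol_type)
    return header + inner_frame

frame = b"\x00" * 14             # placeholder inner Ethernet header
packet = gre_encapsulate(frame)
print(packet[:4].hex(), len(packet))   # 4-byte GRE header + 14 bytes
```

Because the inner addresses are hidden inside this opaque header, hardware that hashes on IPs and ports for ECMP or ACLs cannot see them, which is exactly the drawback listed above.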


How VXLAN Works

Allows virtual machines to live in two disparate networks yet still operate as if they were attached to the same L2 segment.

Components:

  • Multicast support, IGMP and PIM
  • VXLAN Network Identifier (VNI): 24-bit segment ID
  • VXLAN Gateway
  • VXLAN Tunnel End Point (VTEP)
  • VXLAN Segment/VXLAN Overlay Network


When VM1 wants to send a packet to VM2, it needs the MAC address of VM2. This is the process that is followed:

  1. VM1 sends an ARP packet requesting the MAC address associated with 192.168.0.101
  2. The ARP is encapsulated by VTEP1 into a multicast packet to the multicast group associated with VNI 864
  3. All VTEPs see the multicast packet and add the association of VTEP1 and VM1 to their VXLAN tables
  4. VTEP2 receives the multicast packet, decapsulates it, and sends the original broadcast on the portgroups associated with VNI 864
  5. VM2 sees the ARP packet and responds with its MAC address
  6. VTEP2 encapsulates the response as a unicast IP packet and sends it back to VTEP1 using IP routing
  7. VTEP1 decapsulates the packet and passes it on to VM1
  8. At this point VM1 knows the MAC address of VM2 and can send directed packets to it, as shown in Figure 2: VM to VM communication:
  9. VM1 sends the IP packet to VM2, from IP address 192.168.0.100 to 192.168.0.101
  10. VTEP1 takes the packet and encapsulates it by adding the following headers:
  11. A VXLAN header with VNI=864
  12. A standard UDP header with the UDP checksum set to 0x0000 and the destination port being the VXLAN IANA designated port. Cisco N1KV currently uses port 8472.
  13. A standard IP header with the destination being VTEP2's IP address and protocol 0x11 for the UDP packet used for delivery
  14. A standard MAC header with the MAC address of the next hop. In this case it is the router interface with MAC address 00:10:11:FE:D8:D2, which will use IP routing to send the packet to the destination
  15. VTEP2 receives the packet, as it has its MAC address as the destination. The packet is decapsulated and found to be a VXLAN packet due to the UDP destination port. At this point the VTEP looks up the portgroups associated with VNI 864, found in the VXLAN header. It then verifies that the target, VM2 in this case, is allowed to receive frames for VNI 864 due to its portgroup membership, and passes the packet on if the verification passes.
  16. VM2 receives the packet and deals with it like any other IP packet.
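The encapsulation in the walkthrough can be sketched at the header level. A minimal illustration of the 8-byte VXLAN header with its 24-bit VNI; this is not a complete encapsulation (the outer MAC/IP/UDP headers are omitted), and the function is my own illustration:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags byte with the I bit set
    (VNI is valid), 24 reserved bits, 24-bit VNI, 8 reserved bits.
    The full packet is: outer MAC / outer IP / UDP / this / inner frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI is a 24-bit value")
    flags = 0x08000000           # I flag set, all other bits reserved
    return struct.pack("!II", flags, vni << 8)

print(vxlan_header(864).hex())   # VNI 864 from the walkthrough: '0800000000036000'
```

The 24-bit VNI is what lifts the 4K VLAN limit: it allows about 16 million segments.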

Advantages

Addresses the 4K VLAN limitation

Solves MAC address scaling issues

Better scalability and failover

Disadvantages

VXLAN expects multicast to be enabled on the physical network, and it uses MAC flooding to learn end points.

But IP multicast is usually disabled

Needs MAC preprovisioning via an SDN controller

Software VTEPs may have performance issues

In OpenStack, many of Neutron's network functions are implemented by Open vSwitch, so I made a point of studying Open vSwitch; see the following articles:

OpenFlow Study Notes

Open vSwitch Manual (1)

Open vSwitch Manual (2)

Open vSwitch Manual (3)

Open vSwitch Manual (4)

[Repost] Comparing sFlow and NetFlow in a vSwitch

[Repost] Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX

Open vSwitch Manual (5)

Open vSwitch Manual (6)

Open vSwitch Manual (7)

Open vSwitch Manual (8)

Open vSwitch Manual (9)

Open vSwitch Study Notes

For managing the network, there are many good tools available:

[Repost] iptables

Notes on the HTB Linux queuing discipline manual - user guide

iproute2 Study Notes

tcpdump

[Repost] Detailed Packet-Capture Analysis with tcpdump on Linux

[Repost] IPTables for KVM Host

[Repost] Firewall and network filtering in libvirt

[Repost] XEN, KVM, Libvirt and IPTables

From the "ITPUB Blog". Link: http://blog.itpub.net/18796236/viewspace-1840124/. If you repost, please credit the source; otherwise legal liability will be pursued.
