all-about-virtual-link
5. Put it all together

This chapter presents how to put all the components together and make them work well.

1. How much memory must be reserved for the virtual link

I would say the maximum is 1 GB, i.e. 512 huge pages. The reason is simple: we must map the DPDK huge-page memory into a guest device via QEMU, and the PCI device memory limitation makes it 1 GB. So, is it possible for a virtual link to run well within this limit? The memory mapping facilitates packet rx/tx between guest and host, so the memory left for other purposes is limited. We address this limitation by making the host DPDK huge-page memory exclusive to packet transmission; applications should be aware of this and manage their own memory if necessary.

2. Set up the environment for the virtual link

Build the projects. Clone the repository and export its directory as VLINK_SDK, then change directory to $(VLINK_SDK)/vlinkinfra and build and install the vlink infrastructure library:

git clone https://github.com/chillancezen/vlinky2
export VLINK_SDK=/root/vlinky2
cd $VLINK_SDK/vlinkinfra
make install

Then we can start the demo agent:

make -C sample
./sample/bin/virt_compute

From this point on, the agent is working for QEMU and DPDK.

Next, build an EAL-modified DPDK: git clone https://github.com/chillancezen/dpdk-16.07-vlink. This project changes some EAL code and introduces a new PMD driver, eth_vlink<x>,link=<name>,mode=[bond|mq]. Build it as usual.

Next, also build a special version of QEMU, which introduces a new PCI device: -device pci-vlink,vm_domain=testvm1,init_config=./link.config. The initial config file has the following format:

tap123 00:12:23:34:45:56 3
tap456 00:12:23:34:45:57 5

The fields are name, mac-address, and number-of-channels. The channel number should be the number of data channels plus 1; the extra one is the ctrl channel.

Note that the QEMU side allocates virtual links and the DPDK side attaches to them, so we define the virtual links in the initial config script and then start an instance with QEMU. A possible command line is presented below:

taskset -c 9-11 ./qemu-system-x86_64 --enable-kvm -m 2G -smp 6 -cpu host -hda mywork/centos.qcow2 -boot c -nographic -vnc :0 -netdev user,id=mynet0,net=192.168.76.0/24,dhcpstart=192.168.76.99 -device virtio-net-pci,netdev=mynet0 -redir tcp:2222::22 -device pci-vlink,vm_domain=testvm1,init_config=./link.config

Next we can ssh into the virtual machine. Through lspci, we get:

[root@localhost ~(keystone_admin)]# lspci -vvv -nn -s 00:04.0
00:04.0 Ethernet controller [0200]: Catapult Communications Device [cccc:2222]
        Subsystem: Red Hat, Inc Device [1af4:1100]
        Physical Slot: 4
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin B routed to IRQ 10
        Region 0: I/O ports at c000 [size=1K]
        Region 2: Memory at 80000000 (32-bit, prefetchable) [size=1G]
        Region 3: Memory at c1000000 (32-bit, prefetchable) [size=8M]

There are three regions in the PCI device. Region 0 is the I/O port region; the driver can fetch information through I/O port reads/writes, which is a common way to configure a device or retrieve information from it.

The next two regions are the host DPDK huge-page memory region and the virtual link control channel memory. Note that the whole host DPDK memory is mmapped into the device. Its size is 1 GB, which is rather large, and it is rounded up to 1 GB even when the real space is smaller, because a PCI device memory region size must be a power of 2. Even though the virtual address space may not be contiguous in the host DPDK context, this memory is physically contiguous, so we know how to translate addresses between the different contexts; we describe this later. The ctrl channel memory is the place which holds the virt queues; it is a little smaller (8 MB here).

[root@localhost uio_vlink(keystone_admin)]# make 
[root@localhost uio_vlink(keystone_admin)]# make install
[root@localhost uio_vlink(keystone_admin)]# lspci -vvv -n -s 00:04.0
00:04.0 0200: cccc:2222
        Subsystem: 1af4:1100
        Physical Slot: 4
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin B routed to IRQ 10
        Region 0: I/O ports at c000 [size=1K]
        Region 2: Memory at 80000000 (32-bit, prefetchable) [size=1G]
        Region 3: Memory at c1000000 (32-bit, prefetchable) [size=8M]
        Kernel driver in use: uio_vlink

Here we can find out which kernel driver is in use. Furthermore, the uio framework generates a char device under /dev for every uio device; let's check that.

[root@localhost uio_vlink(keystone_admin)]# ls /dev/uio0 
/dev/uio0
[root@localhost uio_vlink(keystone_admin)]# ls /sys/class/uio/uio0/
dev  device  event  maps  name  portio  power  subsystem  uevent  version
[root@localhost uio_vlink(keystone_admin)]#

1) I/O ports

I/O ports are system-wide; even from user space it is easy to read/write them. Please refer to sys/io.h.

2) I/O memory

I/O memory must be remapped into the process virtual address space. There are usually two ways to do that: mapping /dev/uioX with a certain offset, or mapping the /sys resource files. I will not detail them here.

Here in my project, a library does all of this: libvlinkguest.so. So change pwd to $(VLINK_SDK)/vlinkguest, then again make; make install.

Next comes librte_vlink, a DPDK extension which regulates how an interface rx/tx's packets (and more) from/to virtual links:

[root@localhost lib(keystone_admin)]# pwd
/root/dpdk-16.07/lib
[root@localhost lib(keystone_admin)]# ls -l
total 60
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_acl
drwxrwxr-x. 2 root root   91 Jul 28 14:48 librte_cfgfile
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_cmdline
drwxrwxr-x. 2 root root   40 Jul 28 14:48 librte_compat
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_cryptodev
drwxrwxr-x. 2 root root  103 Jul 28 14:48 librte_distributor
drwxrwxr-x. 5 root root   62 Oct 20 23:32 librte_eal
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_ether
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_hash
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_ip_frag
drwxrwxr-x. 2 root root   91 Jul 28 14:48 librte_ivshmem
drwxrwxr-x. 2 root root   94 Jul 28 14:48 librte_jobstats
drwxrwxr-x. 2 root root  100 Jul 28 14:48 librte_kni
drwxrwxr-x. 2 root root   88 Jul 28 14:48 librte_kvargs
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_lpm
drwxrwxr-x. 2 root root   82 Jul 28 14:48 librte_mbuf
drwxrwxr-x. 2 root root 4096 Oct 25 02:46 librte_mempool
drwxrwxr-x. 2 root root   85 Jul 28 14:48 librte_meter
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_net
drwxrwxr-x. 2 root root   85 Jul 28 14:48 librte_pdump
drwxrwxr-x. 2 root root   94 Jul 28 14:48 librte_pipeline
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_port
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_power
drwxrwxr-x. 2 root root   91 Jul 28 14:48 librte_reorder
drwxrwxr-x. 2 root root   82 Jul 28 14:48 librte_ring
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_sched
drwxrwxr-x. 2 root root 4096 Jul 28 14:48 librte_table
drwxrwxr-x. 2 root root   85 Oct 25 02:36 librte_timer
drwxrwxr-x. 6 root root 4096 Jul 28 14:48 librte_vhost
lrwxrwxrwx  1 root root   26 Nov 14 09:49 librte_vlink -> /root/vlinky2/librte_vlink
-rw-rw-r--. 1 root root 3218 Oct 25 02:43 Makefile

Make the project, then go to $(VLINK_SDK)/librte_vlink/test and build the test program. Then run it; it starts a loop that receives packets from the virtual links and echoes them back.

[root@localhost test(keystone_admin)]# ./build/rte_vlink_test 
EAL: Detected 6 lcore(s)
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
PMD: bnxt_rte_pmd_init() called for (null)
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   probe driver: 1af4:1000 rte_virtio_pmd
[rte_vlink]nr of host hugepages:512
[rte_vlink]host hugepages mapping:0x7faa81c00000
[rte_vlink]ctrl channel mapping:0x7faa81200000
[p2v_tbl]:number of entries:2048
[p2v_tbl]:aligned(original) table base:0x7faac51afbc0
[v2p_tbl]nr_pages:512(real_entries:512)
[v2p_tbl]allocated aligned address:0x7faac51aeb80
[rte_vlink]scan virtual links... ...
[rte_vlink]link-id: 0  ctrl-channel-index:0 data-channels:2 mac:00:12:23:34:45:56
        [x]data-channel-index:1  (rx:67584) (free:84480) (alloc:101376) (tx:118272)
        [x]data-channel-index:2  (rx:135168) (free:152064) (alloc:168960) (tx:185856)
[rte_vlink]link-id: 1  ctrl-channel-index:3 data-channels:4 mac:00:12:23:34:45:57
        [x]data-channel-index:4  (rx:270336) (free:287232) (alloc:304128) (tx:321024)
        [x]data-channel-index:5  (rx:337920) (free:354816) (alloc:371712) (tx:388608)
        [x]data-channel-index:6  (rx:405504) (free:422400) (alloc:439296) (tx:456192)
        [x]data-channel-index:7  (rx:473088) (free:489984) (alloc:506880) (tx:523776)

We probed two virtual links, and before starting rx/tx we set up the translation tables in both directions.

The last step is launching pktgen-dpdk and generating flows. We put QEMU and dpdk-pktgen on the same NUMA node (numa 1), and also disable hyper-threading:

./app/app/x86_64-native-linuxapp-gcc/pktgen -c 1c0 --vdev=eth_vlink0,link=tap123,mode=mq -- -P -T -f themes/black-yellow.theme -m '[7/8:7/8].0'

Still, the throughput does not change at all, because of zero-copy.


Next we will build a uio driver which takes over our virtual PCI NIC. Still clone the git repo (https://github.com/chillancezen/vlinky2.git), then export the environment like this: declare -x VLINK_SDK="/root/vlinky2". Then change pwd to $(VLINK_SDK)/uio_vlink, where we build the uio kernel driver with make; make install.

As a matter of fact, there are lots of ways to access I/O port and I/O memory resources in userspace. If you are interested in uio device internals, please refer to http://nairobi-embedded.org/category/device-drivers.html, which gives an overview of how uio works. Here I probe my own uio device through the /sys filesystem and then make these resources available. There are two kinds of resources:

With 64-byte packets (the pktgen screenshot is not reproduced here): remember, this is only a two-channel virtual link, and a single channel can achieve more than 10.5 Mpps of throughput. One thing more: packet size imposes the same overhead on virtual links regardless of size, which is quite different from other PMD implementations.
