4. performance of vecring
4.1 hardware environment
processor capability:
We can see that two CPU sockets are available. Next we look at the CPU flags it supports:
We can see that even AVX2 is supported.
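Both facts can be checked from the shell; a minimal sketch:

```bash
# number of CPU sockets and NUMA nodes
lscpu | grep -E 'Socket\(s\)|NUMA node'
# confirm AVX2 is among the supported CPU flags
grep -m1 -o avx2 /proc/cpuinfo
```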
Next, we allocate memory on each socket (1024 2 MB hugepages per socket):
Note that 4 pages had already been allocated by something else.
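The per-socket allocation is typically done through sysfs; a sketch, assuming node0 and node1 correspond to the two sockets:

```bash
# reserve 1024 x 2 MB hugepages on each NUMA node (run as root)
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 1024 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
# verify the per-node and global counters
cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
grep -i hugepages /proc/meminfo
```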
4.2 core allocation
We choose socket 1 as the preferred socket in our test. Next we assign cores to the rx/tx tasks (their socket placement is verified right after the table):
| core id | role | location |
| ------- | ---- | -------- |
| 13      | rx   | host     |
| 15      | tx   | host     |
| 17      | rx   | LXC      |
| 19      | tx   | LXC      |
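A quick way to confirm that all four cores sit on socket 1:

```bash
# print CPU id, NUMA node, and socket for the selected cores
lscpu -p=CPU,NODE,SOCKET | grep -E '^(13|15|17|19),'
```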
4.3 pktgen-dpdk setup
Host-side pktgen-dpdk command line:
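As a rough sketch of the shape of such an invocation (the vecring vdev name, the main core 11, and the memory sizes are placeholders; only the rx/tx core mapping follows the table in 4.2):

```bash
# EAL options come first, pktgen options follow the "--" separator
./pktgen -l 11,13,15 -n 4 --socket-mem 0,1024 --file-prefix host \
    --vdev net_vecring0 \
    -- -P -m "[13:15].0"   # core 13 handles rx, core 15 handles tx for port 0
```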
LXC-side pktgen-dpdk command line:
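The container-side sketch mirrors it with the LXC cores from the table (again, the vdev name, main core, and memory size are placeholders):

```bash
./pktgen -l 21,17,19 -n 4 --socket-mem 0,1024 --file-prefix lxc \
    --vdev net_vecring0 \
    -- -P -m "[17:19].0"   # core 17 handles rx, core 19 handles tx for port 0
```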
4.4 throughput rate overview
First, we use 64-byte minimum-size frames as the baseline. With traffic flowing in only one direction, we get the result below:
Each core can generate or receive more than 20 Mpps, which is quite a heavy load.
With traffic in both directions, the pps a single core can generate/receive decreases.
At ~18 Mpps, single-core performance drops a little and the total throughput does not scale linearly. We attribute this to memory: the memory bandwidth available to each core/socket is limited.
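For reference, a minimal way to drive the two cases from the pktgen prompts, assuming the single vecring link shows up as port 0 on both sides:

```
# one direction: start traffic from the host-side prompt only
pktgen> start 0

# dual direction: additionally run the same command at the LXC-side prompt
pktgen> start 0
```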
Next we increase the frame size to 1514 bytes.
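The frame size can be changed at runtime from both pktgen prompts before restarting traffic; for example, for port 0:

```
pktgen> stop 0
pktgen> set 0 size 1514
pktgen> start 0
```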
Again, here is what a single core can generate or receive:
About 2.4 Mpps (~30 Gbps) for the 1514-byte frame size (2.4 Mpps × 1514 B × 8 bit ≈ 29 Gbps).
As with the 64-byte case, with traffic in both directions we expect a single core to generate/receive somewhat less than it can in one direction; we verify this:
Indeed, single-core capability is about 1.2 Mpps (~15 Gbps), almost halved.
4.5 smoketest
We create four virtual links: two are assigned to socket 0 and the other two to socket 1. Here is the host EAL command line:
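A sketch of the general shape of such a command line; the vecring vdev names, core ids, and memory sizes below are placeholders rather than the values used in the test, with two port mappings pinned to cores of each socket:

```bash
# placeholder layout: even cores on socket 0, odd cores on socket 1, core 1 as main core
./pktgen -l 1-9 -n 4 --socket-mem 1024,1024 --file-prefix host \
    --vdev net_vecring0 --vdev net_vecring1 \
    --vdev net_vecring2 --vdev net_vecring3 \
    -- -P -m "[2:4].0, [6:8].1, [3:5].2, [7:9].3"
```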
The container side has similar parameters:
We now have the initial port view:
Virtual links on different sockets do not influence each other much; we see this by running `pktgen> start 0,2`:
That is to say, only virtual links on the same socket affect each other.
We see this by running `pktgen> start 0,1,2`:
Finally, we set the frame size to 1514 and start all ports in one direction only:
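One way to drive this case from the host-side prompt, assuming ports 0-3 are the four virtual links; for the dual-direction run described next, the same commands are repeated at the container-side prompt:

```
pktgen> set all size 1514
pktgen> start all
```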
Note that we achieve a total of 4.9 Mpps (~60 Gbps). Next we start all ports in both directions:
So far, we know that no matter how we arrange the CPU, NUMA, and virtual-link layout, memory bandwidth limits the total throughput of the system.
The following figure, total throughput with 64-byte frames, makes us quite sure of this:
So we can be a little proud that a bundle of virtual links can handle packet flows of almost 60 Gbps at any frame size.
4.6 summary
The bottleneck of vecring is definitely memory bandwidth, the same as for the virtio-backed virtual link.
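One way to sanity-check this conclusion is to measure the per-socket memory bandwidth directly and compare it with the aggregate vecring throughput; a sketch, assuming Intel Memory Latency Checker (mlc) is installed:

```bash
# report the NUMA topology and per-node memory sizes
numactl --hardware
# measure intra- and cross-socket memory bandwidth (requires root)
./mlc --bandwidth_matrix
```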