Hello everybody,
I am not sure where I should post this question, so I am trying here
first.
On my KVM VPS, I am trying to get the UnixBench score of kernel 4.1.15
to the same level as that of the Devuan kernel 3.16.7-ckt11-1+deb8u3.
The issue is mainly the performance when running parallel processes; the
scores for a single process are quite similar. The setup is exactly the
same: I just boot each kernel and run UnixBench. But the results are
quite different, as you can see at the bottom of this email.
One thing I am not sure about is the use of the virtio-pci driver. When
I boot my KVM VPS with the Devuan kernel 3.16.7-ckt11-1+deb8u3, I get
the following in dmesg:
virtio-pci 0000:00:03.0: irq 40 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 41 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 42 for MSI/MSI-X
virtio-pci 0000:00:04.0: irq 43 for MSI/MSI-X
virtio-pci 0000:00:04.0: irq 44 for MSI/MSI-X
On kernel 4.1.15, I get the following instead:
virtio-pci 0000:00:03.0: virtio_pci: leaving for legacy driver
virtio-pci 0000:00:04.0: virtio_pci: leaving for legacy driver
virtio-pci 0000:00:05.0: virtio_pci: leaving for legacy driver
I based the kernel config for 4.1.15 on the one from 3.16.7: I just ran
"make oldconfig" and accepted the default values for the new entries.
When I disabled CONFIG_VIRTIO_PCI_LEGACY on kernel 4.1.15, my VPS failed
to boot and dropped to the initramfs prompt. There is no
CONFIG_VIRTIO_PCI_LEGACY option in kernel 3.16.7.
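To see exactly which options "make oldconfig" added or changed between
the two kernels, the two config files can be compared. Below is a
minimal sketch in Python, assuming both files use the usual .config
syntax ("CONFIG_X=value" and "# CONFIG_X is not set"). The function
names and the sample fragments are made up for illustration; they are
not the real configs of either kernel.

```python
def parse_config(text):
    """Parse kernel .config text into {option: value}.

    Options written as '# CONFIG_X is not set' are recorded as 'n'.
    """
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(old, new):
    """Return {option: (old_value, new_value)} for every differing option.

    None means the option does not appear in that config at all.
    """
    changed = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changed[name] = (old.get(name), new.get(name))
    return changed


if __name__ == "__main__":
    # Hypothetical fragments, only to show the shape of the output.
    old_text = "CONFIG_VIRTIO_PCI=y\n# CONFIG_EXAMPLE is not set\n"
    new_text = (
        "CONFIG_VIRTIO_PCI=y\n"
        "CONFIG_VIRTIO_PCI_LEGACY=y\n"
        "# CONFIG_EXAMPLE is not set\n"
    )
    for name, (before, after) in diff_configs(
        parse_config(old_text), parse_config(new_text)
    ).items():
        print(f"{name}: {before} -> {after}")
```

In practice the two strings would be read from the real config files of
the 3.16.7 and 4.1.15 builds, and the output would list only the options
that are new, removed, or set differently.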
I have the impression that kernel 3.16.7 does not use the legacy driver.
Is that true, or is it just an artifact of the dmesg output, so that
kernel 3.16.7 actually uses the legacy driver as well?
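One way to look at this independently of dmesg is to check what the
hypervisor exposes on the PCI bus. The sketch below is only an
illustration under stated assumptions: it reads the standard sysfs files
"vendor" and "device", and it uses the PCI ID ranges from the virtio 1.0
specification (vendor 0x1af4, device IDs 0x1000-0x103f for transitional
devices, 0x1040-0x107f for modern-only devices). The function names are
hypothetical. Note that a transitional ID does not by itself say which
interface the guest driver ended up using.

```python
from pathlib import Path

VIRTIO_VENDOR_ID = 0x1AF4  # PCI vendor ID used by virtio devices


def classify_virtio_device(device_id):
    """Classify a virtio PCI device ID using the virtio 1.0 ID ranges."""
    if 0x1000 <= device_id <= 0x103F:
        return "transitional"  # offers the legacy interface, may also offer the modern one
    if 0x1040 <= device_id <= 0x107F:
        return "modern"        # virtio 1.0 only, no legacy interface
    return "unknown"


def list_virtio_pci_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return (address, device_id, kind, bound_driver) for each virtio PCI device."""
    results = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        try:
            vendor = int((dev / "vendor").read_text().strip(), 16)
            device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue
        if vendor != VIRTIO_VENDOR_ID:
            continue
        driver_link = dev / "driver"
        driver = driver_link.resolve().name if driver_link.exists() else None
        results.append((dev.name, hex(device), classify_virtio_device(device), driver))
    return results


if __name__ == "__main__":
    for entry in list_virtio_pci_devices():
        print(entry)
```

Running this under both kernels on the same VPS would show whether the
devices themselves are identical in both boots, which narrows the
difference down to the guest kernel.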
Do you have any suggestions on where to look to improve the performance
of kernel 4.1.15?
Thanks in advance for your help.
Kind regards,
Anto
----
OS: GNU/Linux -- 3.16.0-4-amd64 -- #1 SMP Debian 3.16.7-ckt11-1+deb8u3 (2015-08-04)
------------------------------------------------------------------------
Benchmark Run: Sat Jan 23 2016 16:17:26 - 16:45:46
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables    50659090.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                  7125.6 MWIPS (10.0 s, 7 samples)
Execl Throughput                            6829.9 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks    1164709.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks       369515.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks    2681384.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                          3296092.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching              614170.0 lps   (10.0 s, 7 samples)
Process Creation                           13784.1 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                9228.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                1468.9 lpm   (60.0 s, 2 samples)
System Call Overhead                     4345996.5 lps   (10.0 s, 7 samples)
System Benchmarks Index Values          BASELINE       RESULT    INDEX
Dhrystone 2 using register variables    116700.0   50659090.7   4341.0
Double-Precision Whetstone                  55.0       7125.6   1295.6
Execl Throughput                            43.0       6829.9   1588.4
File Copy 1024 bufsize 2000 maxblocks     3960.0    1164709.8   2941.2
File Copy 256 bufsize 500 maxblocks       1655.0     369515.8   2232.7
File Copy 4096 bufsize 8000 maxblocks     5800.0    2681384.0   4623.1
Pipe Throughput                          12440.0    3296092.2   2649.6
Pipe-based Context Switching              4000.0     614170.0   1535.4
Process Creation                           126.0      13784.1   1094.0
Shell Scripts (1 concurrent)                42.4       9228.9   2176.6
Shell Scripts (8 concurrent)                 6.0       1468.9   2448.1
System Call Overhead                     15000.0    4345996.5   2897.3
                                                              ========
System Benchmarks Index Score                                   2269.1
OS: GNU/Linux -- 4.1.15-kvm-v3 -- #1 SMP Sat Jan 23 15:34:21 CET 2016
------------------------------------------------------------------------
Benchmark Run: Sat Jan 23 2016 17:45:23 - 18:14:43
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables    33186817.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                  6982.4 MWIPS (9.7 s, 7 samples)
Execl Throughput                            4174.1 lps   (29.7 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks     874549.1 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks       269488.6 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks    1738899.1 KBps  (30.0 s, 2 samples)
Pipe Throughput                          2204174.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching              282724.3 lps   (10.0 s, 7 samples)
Process Creation                            8668.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                6661.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                1067.7 lpm   (60.1 s, 2 samples)
System Call Overhead                     3376500.3 lps   (10.0 s, 7 samples)
System Benchmarks Index Values          BASELINE       RESULT    INDEX
Dhrystone 2 using register variables    116700.0   33186817.4   2843.8
Double-Precision Whetstone                  55.0       6982.4   1269.5
Execl Throughput                            43.0       4174.1    970.7
File Copy 1024 bufsize 2000 maxblocks     3960.0     874549.1   2208.5
File Copy 256 bufsize 500 maxblocks       1655.0     269488.6   1628.3
File Copy 4096 bufsize 8000 maxblocks     5800.0    1738899.1   2998.1
Pipe Throughput                          12440.0    2204174.7   1771.8
Pipe-based Context Switching              4000.0     282724.3    706.8
Process Creation                           126.0       8668.9    688.0
Shell Scripts (1 concurrent)                42.4       6661.8   1571.2
Shell Scripts (8 concurrent)                 6.0       1067.7   1779.5
System Call Overhead                     15000.0    3376500.3   2251.0
                                                              ========
System Benchmarks Index Score                                   1558.2
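
To make the two runs easier to compare, here is a small Python sketch
that recomputes the numbers from the tables above. It assumes that each
index is RESULT / BASELINE * 10 and that the overall score is the
geometric mean of the indices, which is consistent with the figures
UnixBench printed. The variable and function names are only for
illustration.

```python
import math

# BASELINE and RESULT columns copied from the two UnixBench reports above.
BASELINE = {
    "Dhrystone 2 using register variables": 116700.0,
    "Double-Precision Whetstone": 55.0,
    "Execl Throughput": 43.0,
    "File Copy 1024 bufsize 2000 maxblocks": 3960.0,
    "File Copy 256 bufsize 500 maxblocks": 1655.0,
    "File Copy 4096 bufsize 8000 maxblocks": 5800.0,
    "Pipe Throughput": 12440.0,
    "Pipe-based Context Switching": 4000.0,
    "Process Creation": 126.0,
    "Shell Scripts (1 concurrent)": 42.4,
    "Shell Scripts (8 concurrent)": 6.0,
    "System Call Overhead": 15000.0,
}
RESULTS_3_16_7 = dict(zip(BASELINE, [
    50659090.7, 7125.6, 6829.9, 1164709.8, 369515.8, 2681384.0,
    3296092.2, 614170.0, 13784.1, 9228.9, 1468.9, 4345996.5,
]))
RESULTS_4_1_15 = dict(zip(BASELINE, [
    33186817.4, 6982.4, 4174.1, 874549.1, 269488.6, 1738899.1,
    2204174.7, 282724.3, 8668.9, 6661.8, 1067.7, 3376500.3,
]))


def index_values(results):
    """Per-test index: result relative to the baseline, scaled by 10."""
    return {name: results[name] / BASELINE[name] * 10.0 for name in BASELINE}


def overall_score(indices):
    """Overall score: geometric mean of the per-test indices."""
    logs = [math.log(value) for value in indices.values()]
    return math.exp(sum(logs) / len(logs))


def ratios(new, old):
    """Per-test ratio new/old; values below 1.0 mean the new kernel is slower."""
    return {name: new[name] / old[name] for name in old}


if __name__ == "__main__":
    ranked = sorted(ratios(RESULTS_4_1_15, RESULTS_3_16_7).items(),
                    key=lambda item: item[1])
    for name, ratio in ranked:
        print(f"{name:40s} {ratio:.2f}")
    print("score 3.16.7:", round(overall_score(index_values(RESULTS_3_16_7)), 1))
    print("score 4.1.15:", round(overall_score(index_values(RESULTS_4_1_15)), 1))
```

Sorting by ratio puts the tests that lost the most on 4.1.15 at the top
of the list, which may help to narrow down where to look first.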