How to tune bmv2 parameters so that it can use a larger cache and the maximum number of processors when running P4 programs?

Duang · March 26, 2023, 4:39am

When I run a P4 program with simple_switch and test it with UDP traffic using iperf, the top command shows that only one process is doing the work. That single process quickly reaches 100% CPU, while total CPU usage stays below 5%. How can I adjust the P4 program or the bmv2 configuration parameters to get more throughput and lower latency for my P4 program?
I read on the forum that the way to get the best performance out of the bmv2 switch is to recompile it:

./configure 'CXXFLAGS=-g -O3' 'CFLAGS=-g -O3' --disable-logging-macros --disable-elogger

If this is the way to go, is there anything else to do after the make and make install commands are executed?

DavideS · March 26, 2023, 8:35am

Hi @Duang

Here you can find more detail about it: behavioral-model/performance.md at main · p4lang/behavioral-model · GitHub. That said, I don't think you can improve performance much further, because bmv2 is designed to provide full compatibility with the P4 language, not to achieve high performance.
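
As a rough sketch of the full rebuild, the steps could be wrapped in a small shell function. This assumes a source checkout of p4lang/behavioral-model and the autotools flow described in the repository README. The function name and the default path are hypothetical. The configure flags are the ones quoted in the question. The final `sudo ldconfig` is included on the assumption that freshly installed shared libraries may not be picked up without it on some Linux distributions.

```shell
# Sketch only: rebuild bmv2 with the optimization flags quoted in the question.
# rebuild_bmv2 and the default checkout path are hypothetical, not part of bmv2.
# The body runs in a subshell so the caller's working directory is unchanged.
rebuild_bmv2() (
    # First argument: path to the behavioral-model source checkout.
    cd "${1:-$HOME/behavioral-model}" || exit 1

    # Regenerate the build system, configure with -O3 and logging disabled,
    # then build, install, and refresh the shared-library cache.
    ./autogen.sh &&
    ./configure 'CXXFLAGS=-g -O3' 'CFLAGS=-g -O3' \
        --disable-logging-macros --disable-elogger &&
    make -j"$(nproc)" &&
    sudo make install &&
    sudo ldconfig
)
```

Calling `rebuild_bmv2 /path/to/behavioral-model` runs the whole sequence and stops at the first failing step. Since logging macros are disabled in this build, the resulting simple_switch is less useful for debugging P4 programs.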

andyfingerhut · March 26, 2023, 10:54am

I do not recall if it is the default behavior, or whether it requires command line options to enable, but BMv2 can use separate threads for ingress processing vs. egress processing, and also for ingress processing for packets arriving on different input ports, and for egress processing for packets being transmitted on different output ports.

That will not help total throughput if all of your packets are going from one input port to one output port, and most of the processing is ingress processing, though.
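
To check whether a single thread is the bottleneck, a per-thread view is more informative than the default per-process view of top. A minimal sketch, assuming Linux with /proc mounted and a running process named simple_switch; the helper name `thread_count` is made up for illustration:

```shell
# Count the threads of a process by listing /proc/<pid>/task (Linux only).
# thread_count is a hypothetical helper name, not part of bmv2.
thread_count() {
    ls "/proc/$1/task" | wc -l
}

# Example usage (assumes simple_switch is running):
#   pid="$(pidof simple_switch)"
#   thread_count "$pid"    # number of threads in the switch process
#   top -H -p "$pid"       # live CPU usage per thread instead of per process
```

If one thread sits near 100% while the others are idle, the traffic pattern is likely the single input port to single output port case described above.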
