While reading the P4Runtime API specification, I noticed that a data plane can have multiple controllers, even outside the target. The specification states that multiple controllers with write access can be connected to the same data plane if they have different roles. However, it doesn’t specify how to define these roles for write access to specific entities.
I’m using the P4Runtime shell for Python and BMv2 (simple_switch_grpc), but I couldn’t find a way to define roles in either of these.
How can I define roles?
Is there a way to allow a remote controller to connect to the target gRPC server (I can only connect from the switch, since the server is running on the loopback address)?
I would recommend writing two extremely simple controller programs that fill in this role field with string values different from each other, and seeing whether they can both connect and continuously add/delete entries of a table (a different P4 table for each controller), to verify that this actually works.
I suspect that BMv2 probably currently does nothing to check that the value of the role field has particular values, only whether they are the same or different for different client connections.
If the role value is the same for a set of clients, then only one of them is selected to be the primary for that role, and only the primary should be able to successfully perform WriteRequest messages.
If the role value is different for two clients, both of them should be able to simultaneously be considered primaries, and both succeed at performing WriteRequest messages.
As the P4Runtime API spec mentions, it is currently out of scope, and not defined anywhere public that I have seen, to more completely specify the meaning of particular values for the role field. I suspect that the existing open source P4Runtime API server should simply compare them to see if they are equal or different, and behave as described above.
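To make the expected behavior concrete, here is a minimal toy model in Python of per-role primary election as described above. It is only a sketch of the rules, not BMv2 or PI code; the class name RoleArbiter, the client names, and the role names are all made up for illustration.

```python
class RoleArbiter:
    """Toy model of per-role primary election (illustrative only)."""

    def __init__(self):
        # role name -> {client name: election_id}
        self.clients = {}

    def connect(self, client, role, election_id):
        members = self.clients.setdefault(role, {})
        # The spec requires election_id to be unique within one (device, role).
        if election_id in members.values():
            raise ValueError("election_id already used for this role")
        members[client] = election_id

    def primary(self, role):
        # The client with the highest election_id is primary for its role.
        members = self.clients.get(role, {})
        return max(members, key=members.get) if members else None

    def can_write(self, client, role):
        return self.primary(role) == client


arbiter = RoleArbiter()
arbiter.connect("ctrl_a", "role_a", 2)         # primary for role_a
arbiter.connect("ctrl_a_backup", "role_a", 1)  # backup for role_a
arbiter.connect("ctrl_b", "role_b", 1)         # primary for role_b
```

In this model ctrl_a and ctrl_b can both write at the same time, because each is the only or highest-numbered client within its own role, while ctrl_a_backup cannot.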
When you start a simple_switch_grpc process on a Linux system, it should start listening for incoming TCP connection requests on the default P4Runtime API TCP port number (port number 9559 by default, but you can change that via a command line option when starting simple_switch_grpc).
If you do not have any kind of firewall software configured on the Linux system, it should be able to accept connections from other hosts that can reach it over the Internet. From another system, first try to “ping” the host running simple_switch_grpc to see if they can communicate at all. If that succeeds, then try ‘telnet <host> 9559’ (with the address of the host running simple_switch_grpc) and see if it can connect. If not, start by finding out why the connection is being blocked, by looking up firewall configuration options for the operating system where you are running simple_switch_grpc.
I tried to connect multiple controllers to the switch. To do this I had to use increasing election_id values.
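If telnet is not installed on the other system, the same TCP reachability check can be sketched in a few lines of Python using only the standard library. The function name can_connect and the address in the usage note are placeholders, and 9559 is just the default port mentioned above.

```python
import socket


def can_connect(host, port=9559, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (including timeouts and "connection refused") on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, can_connect("192.0.2.10") returning False while ping to the same address works would point at a firewall rule or at the server listening only on the loopback address. This only tests that the TCP port is open, not that a P4Runtime session can be established.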
According to the specifications:
P4Runtime’s client arbitration mechanism ensures that only the current primary can modify state on the switch, and that the election_id is monotonically increasing. For example, the switch must finish all previous write operations before selecting a different primary, and must only accept write requests from the current primary.
The result is that only the controller connected with the highest election_id (the primary) can use write access and packet I/O. If I try to perform actions that require write access from another controller, even with a different role name, I get the “Not primary” error.
This is exactly what I expected, since I didn’t define the role and the corresponding non-overlapping entities, as required by the specification:
Multiple controllers may have orthogonal, non-overlapping, “roles” (or “realms”) and should be able to push forwarding entities simultaneously. The control plane can be partitioned into multiple roles and each role will have a set of controllers, one of which is the primary and the rest are backups. Role definition, i.e. how P4 entities get assigned to each role, is out-of-scope of this document.
It could be that BMv2 is simply ignoring the role, or it is ignoring the role name I’m using because no such role has been defined (is there currently a way to define roles using BMv2?), so the default role is used.
A controller can omit the role message in MasterArbitrationUpdate. This implies the “default role”, which corresponds to “full pipeline access”.
I am working in a Mininet-defined topology, and from devices other than the switch I can’t access the gRPC server (the switch doesn’t have an IP address).
Outside of the Mininet topology I can of course access the server, since it is running on the loopback address of the operating system: maybe I should place the controller outside of the Mininet topology?
If you have multiple controllers that connect with different roles, then the election_id values should be irrelevant. If only one controller connects with a particular role value, it should automatically be the one with the highest election_id, and thus should become primary for that (device, role).
You said that you know how to have multiple controllers connect with different election_id values. Have you actually tried setting the role field to different values in different controllers? That is not clear to me from your message.
If I understand correctly, Mininet uses Linux network namespaces, creating a separate one for each switch and host in your topology.
I would guess that there is a way to enable connections to port X from outside of the host system, to be “forwarded” to connect to port Y inside of a specified network namespace, but I have not done so before. If there is, you could, for example, configure:
connection requests to TCP port 5910 from outside the physical host to be forwarded to TCP port 5995 on network namespace 1 (whatever the name of the network namespace is running one of your BMv2 switches)
also configure connection requests to TCP port 5911 from outside of the physical host to be forwarded to TCP port 5995 on network namespace 2 where the second BMv2 switch is running (if you have more than one switch)
etc. for any additional BMv2 switches in your topology.
Then connect from an outside host to your physical host’s IPv4 address on TCP port 5910 to connect to the first switch, or on TCP port 5911 to connect to the second switch, etc.
Sorry I do not have details on how to configure such port forwarding handy right now, but hopefully ChatGPT or Google are your friends there.
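One userspace alternative to kernel-level port forwarding is a small TCP relay. The sketch below is not something I have tested against BMv2. It assumes the relay process runs somewhere that can reach both the outside network and the switch's gRPC port (for example on the physical host, with simple_switch_grpc listening on 127.0.0.1). The names start_relay and _pipe are made up for this example.

```python
import asyncio


async def _pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # the peer went away; just stop copying
    finally:
        writer.close()


async def start_relay(listen_host, listen_port, target_host, target_port):
    """Accept TCP connections and relay each one to (target_host, target_port)."""

    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(
                target_host, target_port)
        except OSError:
            client_writer.close()  # target unreachable: drop the client
            return
        # Relay both directions until either side closes.
        await asyncio.gather(
            _pipe(client_reader, up_writer),
            _pipe(up_reader, client_writer),
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

To use it, await start_relay("0.0.0.0", 5910, "127.0.0.1", 9559) inside a coroutine run with asyncio.run(), then await serve_forever() on the returned server. The port numbers here are only examples. One relay per switch, each on its own listening port, would mirror the forwarding scheme described above.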
If I use different role names but the same election_id, I can’t connect (I get the error “Election id already exists”). Since I never defined the role and the related entities, maybe the switch simply ignores the role because it doesn’t recognize it.
I am not certain, but skimming through some of the code in the PI repository [1] for creating new connections, especially add_connection [2], which is the only place in that code that the error message “Election id already exists” exists, it appears that it uses only the election id when deciding whether to accept the new client – it completely ignores the role field of the message. Thus it appears that this project never implemented multiple roles in any form at all. That would explain the behavior you are seeing.
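To illustrate why a server that ignores the role would produce exactly the symptoms reported in this thread, here is a toy Python model. It is not the PI code and I have not checked it line by line against add_connection; the class name RoleBlindArbiter and the client and role names are made up, and the only assumption is that election ids are tracked in a single space per device.

```python
class RoleBlindArbiter:
    """Toy model of arbitration that ignores the role field (illustrative only)."""

    def __init__(self):
        self.election_ids = {}  # election_id -> client name, one space per device

    def connect(self, client, role, election_id):
        # The role argument is accepted but never consulted.
        if election_id in self.election_ids:
            raise ValueError("Election id already exists")
        self.election_ids[election_id] = client

    def can_write(self, client):
        # Only the client with the highest election_id overall is primary.
        if not self.election_ids:
            return False
        return self.election_ids[max(self.election_ids)] == client


arbiter = RoleBlindArbiter()
arbiter.connect("ctrl_a", "role_a", 1)
arbiter.connect("ctrl_b", "role_b", 2)  # different role, higher election_id
```

In this model ctrl_a loses write access as soon as ctrl_b connects, even though the roles differ, and a third client connecting with election_id 1 or 2 under any role name is rejected with "Election id already exists".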
I do not know how much work it would be to enable support for multiple distinct roles, each with their own independent election id space, but I would guess not a large amount. Most of it would be familiarizing oneself with that part of the code.
In searching the PI repository [1] for “role”, I found this relevant issue [2]. See especially this comment [3], saying that if you skip sending the MasterArbitrationUpdate message completely when establishing a session from any client, it should still have the ability to read and write all P4 object state, even if multiple clients are connected at the same time.
This is a workaround, and it does have the limitation that none of the controllers can receive PacketIn messages from the device when they connect in this way, according to this comment: [4]
Unfortunately, I need to use Packet I/O.
However, I have found another solution to the problem: using a single embedded controller that connects to a central controller outside the target. This way, I can delegate some of the work to the remote controller without using roles.