Hello,
I am trying to run multiple controllers and let each of them handle a subset of switches and/or a subset of tables, based on the P4Runtime Specification.
If I understood correctly, it is possible to assign different role ids to different P4 entities (I guess in my case those would be the tables), and then let different controllers have different read/write access to each (device, role) pair.
However, section 5 of the doc says that the assignment of roles to P4 entities is out of scope of the document, and I can’t find any other document that explains this part.
Does this mean for now I can’t have multiple controllers managing different tables, or is there something I misunderstood?
Also, are there any additional materials I can use to learn more about what I am trying to do?
Short answer: The spec states that you can use different controllers that control different entities within the data plane. The last time I heard about the implementation status (maybe a year or two ago), I think it was not yet implemented, and thus not supported. The sentence about role assignment being out-of-scope means that the spec cannot tell you how to create and manage roles from a design perspective. It is up to you which functionality each role performs and which entities it may read/write.
If you need a more trustworthy answer about current multi-controller support in terms of roles, I think @antonin, @ccascone, or M. Pudelko (you can ask at Stratum repository) could be more aware of the current status.
Long answer:
This has already been possible with OpenFlow switches and the ONOS controller. Other control planes might also support it, of course. It is about keeping a single control plane channel for every controller and switch you want to “pair”. In particular, for ONOS, you can use the balance-masters command to distribute switches across controllers in a cluster. I am not sure if it works with P4 switches, though; you will have to test it if you are interested in ONOS.
Well, now, when it comes to letting each controller manage different tables, things get a little more complicated. From my perspective, each switch would have to be able to maintain n individual P4Runtime channels. This is mentioned in the spec, but I am not sure how far development has come on the control plane side or, more importantly, in the data plane P4Runtime agent.
I think you are right for the most part. The spec says:
Partitioning of the control plane: Multiple controllers may have orthogonal, non-overlapping, “roles” (or “realms”) and should be able to push forwarding entities simultaneously. The control plane can be partitioned into multiple roles and each role will have a set of controllers, one of which is the primary and the rest are backups.
I understand that each controller can have a different role (functionality or duty). With the word “realm” I believe the document means that each controller will take part in a different domain within the data plane scope. For instance, you might have a table that handles ARP and another IPv4. You might have a local (not primary) controller that handles the ARP table but a centralized primary controller that handles the IPv4 table. Both tables can also be considered to be different domains or scopes, and each controller has “something to say” regarding each table.
I believe that a role is both a real concept in the P4 data plane and also a representation of functions. So you can assign a particular role to a controller, represented by an entity named “role” in the P4Runtime messages. Please see the following text from the spec, which explains it better than I do:
5.1. Default Role
A controller can omit the role message in MasterArbitrationUpdate. This implies the “default role”, which corresponds to “full pipeline access”. This also implies that a default role has a role_id of "" (default). If using a default role, all RPCs from the controller (e.g. Write) must leave the role unset.
5.2. Role Config
The role.config field in the MasterArbitrationUpdate message sent by the controller describes the role configuration, i.e. which operations are in the scope of a given role. In particular, the definition of a role may include the following:
A list of P4 entities for which the controller may issue Write updates and receive notification messages (e.g. DigestList and IdleTimeoutNotification).
Whether the controller is able to receive PacketIn messages, along with a filtering mechanism based on the values of the PacketMetadata fields to select which PacketIn messages should be sent to the controller.
Whether the controller is able to send PacketOut messages, along with a filtering mechanism based on the values of the PacketMetadata fields to select which PacketOut messages are allowed to be sent by the controller.
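To make the three items above a bit more concrete, here is a minimal Python sketch of what a role definition could capture. This is only a toy model: the RoleConfig class, its field names, the may_write helper and the table names are all hypothetical, since the spec does not define the format of the role config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleConfig:
    """Hypothetical role definition; the spec does not fix this format."""
    writable_tables: frozenset          # P4 entities the role may write
    receives_packet_in: bool = False    # may the role receive PacketIn?
    can_send_packet_out: bool = False   # may the role send PacketOut?


def may_write(config, table):
    """Return True if a role with this config may write to the table."""
    return table in config.writable_tables


# Assumed table names, following the ARP/IPv4 example in this thread.
arp_role = RoleConfig(frozenset({"arp_table"}), receives_packet_in=True)
ipv4_role = RoleConfig(frozenset({"ipv4_lpm"}))
```

With such a definition, a switch agent could reject a Write from the ARP controller that targets the IPv4 table, while still accepting its writes to the ARP table.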
And you can also see how the proto looks for the MasterArbitrationUpdate message (see more messages here):
message MasterArbitrationUpdate {
  uint64 device_id = 1;
  // The role for which the primary client is being arbitrated. For use-cases
  // where multiple roles are not needed, the controller can leave this unset,
  // implying default role and full pipeline access.
  Role role = 2;
  // The stream RPC with the highest election_id is the primary. The 'primary'
  // controller instance populates this with its latest election_id. Switch
  // populates with the highest election ID it has received from all connected
  // controllers.
  Uint128 election_id = 3;
  // Switch populates this with OK for the client that is the primary, and
  // with an error status for all other connected clients (at every primary
  // client change). The controller does not populate this field.
  .google.rpc.Status status = 4;
}
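As a rough illustration of the arbitration described in those comments, the following Python sketch models how a switch could pick one primary per (device, role) pair from the election ids it receives. It is a simplified model and not the P4Runtime API: the ArbitrationModel class, its methods and the controller names are made up for this example.

```python
from dataclasses import dataclass, field

DEFAULT_ROLE = ""  # role left unset: default role, full pipeline access


@dataclass
class ArbitrationModel:
    """Toy model of primary election per (device, role); not the real API."""
    # (device_id, role) -> {controller name: election_id}
    sessions: dict = field(default_factory=dict)

    def arbitrate(self, controller, device_id, election_id, role=DEFAULT_ROLE):
        """Record an arbitration update and return the current primary."""
        group = self.sessions.setdefault((device_id, role), {})
        group[controller] = election_id
        return self.primary(device_id, role)

    def primary(self, device_id, role=DEFAULT_ROLE):
        """The controller with the highest election_id is the primary."""
        group = self.sessions.get((device_id, role), {})
        return max(group, key=group.get) if group else None


model = ArbitrationModel()
model.arbitrate("onos", device_id=1, election_id=10)
model.arbitrate("arp-ctl", device_id=1, election_id=5, role="arp")
# A second client joins the default role with a higher election id
# and takes over from "onos" for that role only.
model.arbitrate("other-client", device_id=1, election_id=20)
```

In this model, "arp-ctl" stays primary for its own role regardless of what happens in the default role, which is the partitioning the spec describes.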
As I understand it, the sentence about role assignment being out-of-scope means that it is up to you which role or functionality you assign to each controller. It is up to you which tables each controller handles, what read/write permissions it has, and whether it is centralized or local. That depends on your use case, so the spec cannot discuss “the best solution” or the best role design.
Therefore, whether multiple controllers are supported (which depends on the current implementation status) and the spec's statement that role assignment is out of scope are two unrelated matters.
This answer is probably where I last heard about roles. Consider that this information is related to P4Runtime. Other protocols, proprietary or not, might support a functionality similar to roles.
Unfortunately there are a few problems that make it hard to realize the scenario you describe.
The P4Runtime API defines a mechanism for role config, where multiple masters can connect to the same switch as long as they use different roles.
ONOS currently supports establishing sessions with the “default role” only. If your P4Runtime switch implementation supports role config, then you should be able to establish a new session as master via a different P4Runtime client as long as you don’t use the default role. If you are using Stratum, unfortunately that doesn’t support role config, yet. I’m not sure if p4runtime-sh supports setting a role config.
If you use a separate P4Runtime client to become master for the default role, that will interfere with ONOS operations, as the switch is expected to notify ONOS that it is no longer the master, which immediately triggers a mechanism in ONOS to regain mastership, thus taking it away from the other client.
If all you want to do with the other client is reading state, then there should be no need to gain mastership or deal with role config, simply issue a Read RPC for the tables you’re interested in.
However, it sounds like you are interested in using the other client to write state. So, if you manage to bring up a separate client with role config and make it master, the state that you write might unfortunately get removed by the ONOS data plane state reconciliation mechanism. For example, the FlowRuleManager periodically asks the P4Runtime driver to report all table entries for a given device; during this process, all table entries that are not known (i.e., not stored in the ONOS core) are removed.
Ideally, ONOS should not remove entries written by other controllers with different roles, but this functionality is currently not implemented.
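The reconciliation issue can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not ONOS code: entries are (table, match) tuples, and reconcile, reconcile_role_aware and own_tables are hypothetical names. The first function removes everything the controller's store does not know about; the second only touches tables that belong to the controller's own role.

```python
def reconcile(device_entries, store_entries):
    """Return (kept, removed): entries missing from the store are removed."""
    kept = {e for e in device_entries if e in store_entries}
    return kept, set(device_entries) - kept


def reconcile_role_aware(device_entries, store_entries, own_tables):
    """Remove unknown entries only in tables owned by this controller's role."""
    removed = {e for e in device_entries
               if e[0] in own_tables and e not in store_entries}
    return set(device_entries) - removed, removed


# The device holds one entry written by this controller and one written
# by another client into a table of a different role (assumed names).
device = {("ipv4_lpm", "10.0.0.0/24"), ("arp_table", "10.0.0.1")}
store = {("ipv4_lpm", "10.0.0.0/24")}

kept, removed = reconcile(device, store)
kept_aware, removed_aware = reconcile_role_aware(
    device, store, own_tables={"ipv4_lpm"})
```

The plain version drops the ARP entry written by the other client, while the role-aware version leaves it in place.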