P4_16 language design: why can tables not be updated from the data plane?

Hey

I was reading through the P4_16 and P4Runtime specifications, trying to understand how tables are modified (the lookup tables populated at run time, not the table objects defined in the P4 program), and I would like to understand the design logic behind the following:

  1. Section 6.4.1 of the P4Runtime API (P4~16~ Language Specification) and section 14.2.1.4 of the P4_16 spec (P4~16~ Language Specification) simply mention that table modifications can be done by the control plane. Neither states explicitly that table updates are not possible from P4 programs installed on the target, i.e. from the data plane. Is it true that table updates are only allowed via the control plane? If yes, then please consider the following question as well:
  2. Why are table modifications (adding, deleting, or modifying table entries) only allowed through the control plane? Why can code in a P4 program not update tables from the data plane? (For example, if packet processing metadata indicates an increase or decrease in traffic on some link, the packet processing pipeline could update packet forwarding rules directly by calling the necessary APIs.)

I read somewhere that updating a table costs time, and the switch needs to be fast.

Thanks for the prompt response! I agree that might be a supporting reason for the design choice. I am also looking for reasons, if any, from an implementation-constraints or security point of view. Any other reasons would help as well.
But is it still the case that updates are only possible from the CP and not from P4 programs in the DP?

Yeah, it is not possible from the data plane. You should write a controller, send the packets to the controller, and have the controller update the table.

The basic answer to why P4 does not stress the ability to modify tables in the data plane is that doing this in general is an expensive, complex feature to implement in the packet processing devices with the best price/performance/power ratios, like switch ASICs and NICs.

Yes, it can be implemented on general-purpose CPU cores, but such packet processing devices do not have the best price/performance/power ratios; they are typically orders of magnitude worse in those ratios.

The Portable NIC Architecture has in the last couple of years introduced a couple of ways that the data plane can modify tables without the control plane having to do so, that did not exist in previous architectures:

(1) The add_on_miss feature, where if you do an apply() operation on a table and it gets a miss, the default_action (the action that is executed when the table gets a miss) can call a new extern function called add_entry that can add a new entry. It is proposed to be limited to only adding exact match keys, and only the key field values that were just looked up and missed, but the action and action parameters for this data-plane-added entry can be any of the possible hit actions of the table from the P4 source code.

Admittedly, this is not a fully general mechanism to add new entries, because it does not let the P4 program add any key other than the one just looked up and missed, nor does it enable adding entries to other tables, only the one that experienced a miss, and it does not support anything except exact match key fields.

However, this does cover a large number of use cases, particularly flow tables where you want to add a new entry if you get a miss, and in particular a new entry that will match future packets with exactly the same key.

This is implemented in many switch ASICs and NICs.
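
To make the restrictions in (1) concrete, here is a minimal sketch in Python (a toy model, not P4 code and not any vendor's implementation). The class name `AddOnMissTable`, the `learn_flow` default action, and the 5-tuple key are all made up for illustration. The point it tries to show is that the `add_entry` callback handed to the default action is bound to the key that just missed, and only accepts one of the table's hit actions.

```python
class AddOnMissTable:
    """Toy model of an exact-match table with add-on-miss behavior."""

    def __init__(self, hit_actions, default_action):
        self.hit_actions = set(hit_actions)   # actions allowed for added entries
        self.default_action = default_action  # runs on every miss
        self.entries = {}                     # exact-match key -> (action, params)

    def apply(self, key):
        if key in self.entries:
            return ("hit",) + self.entries[key]

        # On a miss, the default action gets an add_entry callback that is
        # bound to the key that just missed, so it cannot add any other key
        # and cannot touch any other table.
        def add_entry(action, params):
            if action not in self.hit_actions:
                raise ValueError(f"{action!r} is not a hit action of this table")
            self.entries[key] = (action, params)
            return True

        self.default_action(key, add_entry)
        return ("miss", None, None)


def learn_flow(key, add_entry):
    # Hypothetical default action: install a forwarding entry for the new flow.
    add_entry("forward", {"port": 1})


flow_table = AddOnMissTable(hit_actions={"forward", "drop"}, default_action=learn_flow)
flow = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)  # illustrative 5-tuple key
first = flow_table.apply(flow)   # miss; the entry is added as a side effect
second = flow_table.apply(flow)  # hit on the data-plane-added entry
```

The first packet of a flow takes the miss path and the second packet with exactly the same key hits the entry the data plane added, which is the flow-table use case described above.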

(2) The ability to delete old entries in the data plane that have not been matched in a configurable time interval. See the documentation for pna_idle_timeout = PNA_IdleTimeout_t.AUTO_DELETE in the PNA specification: P4 Portable NIC Architecture (PNA)

This ability is also not a fully general capability to delete arbitrary table entries without control plane help. Again, it is a feature supported by multiple switch ASICs and NICs in the industry, and it goes hand-in-hand with feature (1), usually on the same tables. If you want to add entries at high rate without control plane help, you often also wish to delete entries at high rate without control plane help.
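
As a rough illustration of (2), here is a toy Python model of idle-timeout auto-deletion. The class `IdleTimeoutTable` and its methods are hypothetical names, time is passed in explicitly as a number, and expiry is checked lazily on each lookup. Real devices typically age entries in the background, so treat this only as a sketch of the observable behavior: an entry that has not been matched for the configured interval disappears without any control plane involvement.

```python
class IdleTimeoutTable:
    """Toy model of a table whose idle entries are deleted automatically."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.entries = {}  # key -> [action, time of insertion or last hit]

    def add(self, key, action, now):
        self.entries[key] = [action, now]

    def expire(self, now):
        # Delete every entry that has not been matched for idle_timeout units.
        stale = [k for k, (_, last) in self.entries.items()
                 if now - last >= self.idle_timeout]
        for k in stale:
            del self.entries[k]

    def apply(self, key, now):
        self.expire(now)
        entry = self.entries.get(key)
        if entry is None:
            return None      # miss
        entry[1] = now       # a hit refreshes the idle timer
        return entry[0]


table = IdleTimeoutTable(idle_timeout=10)
table.add("flow-a", "forward", now=0)
hit_early = table.apply("flow-a", now=5)    # matched, timer refreshed
hit_later = table.apply("flow-a", now=14)   # only 9 units idle, still present
after_idle = table.apply("flow-a", now=30)  # 16 units idle, entry is gone
```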

Another possible use case: stateful data plane processing. For example, you want the data plane to implement a certain policy for the first 5 packets of a new flow (e.g. create mirror copies and forward), and thereafter implement another policy (e.g. just forward, no mirroring). This requires the data plane to have a feedback loop that takes real-time data plane information and feeds it back into the data plane. I think the original poster’s question hinted at this, but I am not sure if it was the primary use case.

@rgrindley I will link to another reply to @Svenzer’s same question in another forum, the p4-discuss email list: https://groups.google.com/a/lists.p4.org/g/p4-discuss/c/r6i33CjTsys/m/XesIgni8CAAJ

That reply links to this pull request to the language spec: Data plane writable per table entry state v1 by jafingerhut · Pull Request #1239 · p4lang/p4-spec · GitHub

That PR proposes a (for now) experimental enhancement to the language that lets you concisely write P4 actions where some of the action parameters can be modified in the data plane while processing a packet, and the results of those changes will be visible to the next packet matching that table entry.

Note: You can do these kinds of things already today, without this change to the language spec, by instead using P4 register arrays that exist in many P4 implementations. P4 register arrays can be read, and new values written back, while processing data packets, and enable pretty much the same features as the language spec extension above does, but with a different syntax. So in some ways, that language spec extension does not actually provide new functionality to P4 – just a more concise syntax to do things that could already be done before.

There are many, many interesting features implemented on Tofino using P4 register arrays in ways that I would not have even imagined before I saw the research papers describing them. For example, implementing “the fast path” of Paxos message processing in a switch ASIC, while handling the relatively rare occurrences of link and/or switch failures in control plane software to get a full Paxos implementation:

The example of “handle the first 5 packets of a flow differently than the 6th or later packets of the same flow” is a much simpler thing to do. It has been done using P4 register arrays for years now, and the above proposal would make the syntax a bit nicer.
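
For readers who have not used register arrays, the “first 5 packets” idea can be sketched in Python like this. It is a toy model, not P4: the class name `FirstNPacketsPolicy`, the array size, the CRC32 hash, and the action names are all assumptions made for the example. A list stands in for the register array, and each packet reads its slot, decides, and writes the new count back.

```python
import zlib


class FirstNPacketsPolicy:
    """Toy model of a per-flow packet counter kept in a register array."""

    def __init__(self, num_slots=1024, limit=5):
        self.limit = limit
        self.counts = [0] * num_slots  # stands in for the register array

    def process(self, flow_key: bytes) -> str:
        # Hash the flow key to a register index. Two flows that collide
        # share a counter, which this sketch does not try to handle.
        index = zlib.crc32(flow_key) % len(self.counts)
        seen = self.counts[index]
        if seen < self.limit:
            self.counts[index] = seen + 1  # write back while processing the packet
            return "mirror_and_forward"
        return "forward"


policy = FirstNPacketsPolicy()
decisions = [policy.process(b"10.0.0.1->10.0.0.2:80") for _ in range(7)]
```

Here `decisions` holds `mirror_and_forward` for the first five packets and `forward` for the sixth and seventh. The counter saturates at the limit, so it never needs to grow beyond a few bits.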

Thank you so much for your inputs @andyfingerhut. This really helped.

Adding a follow up question and answer here, in hopes that this web forum discussion will be easier to point to in the future:

Question:

Is the data plane, specifically the P4 program, aware of table entry updates made by the control plane? I understand the answer is no, since the P4RT server calls the switch OS/SDK to make table updates and P4 programs are not involved in that process.

Please correct me if I am wrong.

My answer:

If by your question you are asking “Is there a way to write code in a P4 program, such that its execution is triggered by the addition of a table entry from the control plane, regardless of whether a packet is being processed?” then the answer is “No, there is nothing in any P4 specification or implementation that I am aware of that implements such a feature.”

If your question is “Does the data plane’s behavior change as a result of the addition of a table entry from the control plane?” then the answer is “Yes, after a new table entry is added, then every time after that when an apply() call is made on that table, if the lookup key matches the new entry (and for tables with priorities on entries the new entry is the highest priority among all matching entries), then the new entry’s action will be executed.”
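
A small Python sketch of that second case, under stated assumptions: `TernaryTable`, `insert`, and the action names are made up for illustration, `insert` stands in for a control plane write, and a numerically larger priority is assumed to win. The data plane code (`apply`) never changes; only the entries it finds do.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "value mask priority action")


class TernaryTable:
    """Toy model of a ternary table with per-entry priorities."""

    def __init__(self, default_action):
        self.default_action = default_action
        self.entries = []

    def insert(self, value, mask, priority, action):
        # Stands in for the control plane adding an entry.
        self.entries.append(Entry(value, mask, priority, action))

    def apply(self, key):
        # Data plane lookup: among all matching entries, the one with the
        # highest priority determines the action (assumed: larger wins).
        matches = [e for e in self.entries
                   if (key & e.mask) == (e.value & e.mask)]
        if not matches:
            return self.default_action
        return max(matches, key=lambda e: e.priority).action


table = TernaryTable(default_action="drop")
key = 0x0A000001                                             # 10.0.0.1
before = table.apply(key)                                    # no entries yet
table.insert(0x0A000000, 0xFF000000, 10, "forward_port_1")   # 10.0.0.0/8
after_first = table.apply(key)
table.insert(0x0A000001, 0xFFFFFFFF, 20, "forward_port_2")   # exact host, higher priority
after_second = table.apply(key)
```

The same `apply` call returns `drop`, then `forward_port_1`, then `forward_port_2` as entries are added, without the P4 program being notified of any of the additions.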

If your question was something other than one of those two, please feel free to follow up here with a more detailed question.

so now think about netflow ipfix jflow rfcs site:rfc-editor.org and you’ll see…

even worse if you want to code a 32x400g firewall without the help of the control plane on the first some packets and it just gets updated so reboots and starts to pick up that 100-1000m flows in a minute…

but with the help of a control plane, you can even do a tls.server_name_indication filtering/logging firewall… VC-minar 021: Stateful firewall/packet inspection in freeRtr - YouTube

So add-on-miss mentioned earlier is specifically to cover the cases where the rule you want to add into the data plane is simple enough for the data plane to determine what it is, without bothering the control plane software. The precise performance of this capability is not mandated by P4.org specifications – it is a quality-of-implementation issue, but the spec encourages that if you implement it, you should do so at “a large percentage of line rate”. In general, if a device vendor claims to support P4 add-on-miss, you should talk to them to find out exactly what its performance characteristics are.

Clearly, if the logic to determine what rule needs to be added to a table is too complex to be calculated in the data plane, because your data plane has too limited of calculation capabilities, then you need to execute that more complex logic somewhere else, typically a general purpose CPU somewhere nearby. That will have different price/power/performance tradeoffs, and peak rates of adding new entries, but if the logic is too complex for the data plane, you need to do it elsewhere.

issue?

well then try to reproduce the above in theory, just the fw restart case lets say fb . com orders one per ingress link of them…

then watch the video again and again and pls understand the below:

clearly speaking, we can produce chips that unroll the aes256gcm up to 9200bytes in linerate (khm i have one in my closet btw but im still thinking on how to boot it up… XDDDD) then truncate it to ip.len before sending out but we shouldnt.

'till that ill reoccur here and every “why?” question will shortly end in “calc i=42!” in 1 answer from me. <— the warning had been spoken here…

and now lets restaurant in peace: http://fun.nop.hu/autocorrect-died.jpg

If “quality-of-implementation issue” is unclear, let me try to be clearer.

There are multiple companies making silicon that want to make them P4-programmable. They did not get together and make a common design for all of their ASICs. No. They each designed their own ASICs with different capabilities. So if one can add-on-miss at 10% of line rate, and another 50% of line rate, and another 100% of line rate, sure that is a big difference. But all of those are hugely faster than the 0.01% to 0.1% of line rate that a lot of switch ASICs enable from the control plane CPU.

let u do but lemme not read further. the reasons are almost clearly spoken now…

about the rest, just think about the github / p4 and the backends

same story but at github… and i already know 3 more that not yet reached the official merge, tbh… XDDDDDDDD

google ? q = p4 dpdk site:bme.hu

thats how those began, independently of my work… my fav was tapas but in a hacker language… happy googling and reading, and dont forget to bounce the public url if/when you succeeded…

until that, my fav rfc is this: RFC 3173 - IP Payload Compression Protocol (IPComp)

good luck unrolling the gzip, soon shipping linerate in dpdk from our shitload… a way better than wireguard could even be btw… aand wireshark decodes it nicely when i run

wget freertr.org/rtr.zip
unzip rtr.zip
cd src
./c.sh
./tw.sh conn-ipcomp01 capture r1 eth1 nowait

  • summary: 2023-06-25 07:09:39, took 00:00:04, with 1 workers, on 1 cases, 0 failed, 0 traces, 0 retries

compiled 4 a while to amd64elf at portable.freertr.org but i still consider this nsfw because its not jav just rusty if you dont update daily. for a quick demo well nothing wrong with the elf64s but as a long running process it just bloats the i-caches and the jump prediction of the x86-64s… i consider this feature for this use case worse than the same public cpu bug…

edit: ^^^^ hmmm dunno dontcare, i swap between the two (embedded openjre-latest or graalvm-latest builds of the same control plane) yearly or so…

edit2: at the moment it’s openjre and some *.class files inside and have some bgp also, and thats my preferred rtr2.bin for a long ago tbh… XDDDDD