Is there an approximate limit on the number of tables or register reads/writes for one packet?

Hi,
I learned that programmable switches have strict limits on memory accesses per packet, but I cannot find a specific explanation of them. I wonder:

  1. How many tables can be applied to one packet at most?
  2. What is the limit on register accesses for one packet?

Can someone help me with this?
Thanks a lot!

The answers to these questions depend entirely upon what target device your program is being compiled for. The answers for particular devices are not always published, i.e. sometimes the exact limits of a switch ASIC’s capabilities are only given to potential customers under non-disclosure agreements.

One example is the Tofino1 switch ASIC. It was mentioned in a publicly recorded talk in early 2017 that at least one model of it (there are several variants) has 12 processing stages in ingress, plus another 12 processing stages in egress.

If you write a P4 program where every table lookup has a data dependency on the previous table lookup, e.g. its key might be assigned a value in an action of the previous table, then each table must be placed in a later processing stage than the table before it. Such a P4 program would get at most 12 table lookups in ingress, plus 12 more in egress.

It is quite common for P4 programs NOT to have so many data dependencies. If there are tables that can correctly be done in parallel, the P4 compiler will determine this at compile time and perform multiple table lookups in the same pipeline stage. There are limits on how many table lookups can be done in the same processing stage. I do not know whether those limits are public, but I think I can safely say that the limit is more than 1 and not more than 10.
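To make the stage reasoning above concrete, here is a rough Python sketch of how dependencies and a per-stage table limit interact. It is only a toy model, not how any real P4 compiler allocates tables. The function names, the greedy placement strategy, and the per-stage limit of 4 are all assumptions chosen for illustration; only the figure of 12 stages comes from the discussion above.

```python
from collections import defaultdict


def assign_stages(tables, deps, max_tables_per_stage):
    """Greedily place tables into pipeline stages (toy model).

    tables: table names in program order.
    deps: maps a table to the set of earlier tables it depends on.
    max_tables_per_stage: hypothetical per-stage limit (an assumption).
    Returns a dict mapping each table to a 0-based stage index.
    """
    stage_of = {}
    load = defaultdict(int)  # number of tables already placed in each stage
    for table in tables:
        # A table must come strictly after every table it depends on.
        stage = max((stage_of[d] + 1 for d in deps.get(table, ())), default=0)
        # Skip stages that are already full.
        while load[stage] >= max_tables_per_stage:
            stage += 1
        stage_of[table] = stage
        load[stage] += 1
    return stage_of


def stages_needed(tables, deps, max_tables_per_stage):
    """Number of stages used by the greedy placement above."""
    placement = assign_stages(tables, deps, max_tables_per_stage)
    return max(placement.values(), default=-1) + 1


# A chain where every table depends on the previous one.
chain = [f"t{i}" for i in range(12)]
chain_deps = {chain[i]: {chain[i - 1]} for i in range(1, len(chain))}
print(stages_needed(chain, chain_deps, max_tables_per_stage=4))  # 12

# 24 tables with no dependencies at all can share stages.
independent = [f"u{i}" for i in range(24)]
print(stages_needed(independent, {}, max_tables_per_stage=4))  # 6
```

In this model the fully dependent chain of 12 tables uses all 12 stages no matter how large the per-stage limit is, while 24 independent tables fit in 6 stages under the assumed limit of 4 per stage.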

The answers for register accesses are similar to those for table accesses per packet, but not necessarily identical.

Thanks a lot for your information! So if I read and then write one register, does it actually need two stages?
And does it cost the same to read/write registers of different sizes? For example, reading a 1-bit register versus a 320-bit one.

I cannot speak to all P4 implementations, but the Tofino hardware is designed so that a single P4 register array can be both read and written back in a single stage, at the same address. The address can change from packet to packet, but for each packet, the address read and then written back must be the same.
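The same-address constraint can be pictured with a small Python model. This is only a sketch of the behavior described above, not Tofino code or any vendor API: the class name `RegisterArray`, its method, and the 8-bit width are hypothetical. The point is that each packet gets exactly one operation, which takes a single index, so the value read and the value written back necessarily share an address.

```python
class RegisterArray:
    """Toy model of a register array (hypothetical, for illustration only)."""

    def __init__(self, size, width_bits):
        self.mask = (1 << width_bits) - 1  # values wrap at the register width
        self.cells = [0] * size

    def read_modify_write(self, index, update):
        """One per-packet access: read cells[index], write back update(old).

        Because there is a single index argument, the read and the
        write-back cannot target different addresses.
        Returns the value that was read.
        """
        old = self.cells[index]
        self.cells[index] = update(old) & self.mask
        return old


# Per-index packet counter: each packet picks its own index,
# but reads and writes only that one entry.
counts = RegisterArray(size=4, width_bits=8)
for packet_index in [1, 1, 3]:
    counts.read_modify_write(packet_index, lambda value: value + 1)
print(counts.cells)  # [0, 2, 0, 1]
```

An operation such as "read entry 1, then write entry 3" for the same packet cannot be expressed with this single call, which mirrors the restriction described above.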

Tofino is among a category of switch ASICs where ingress is guaranteed a throughput of one packet per clock cycle per pipeline, and similarly for egress. There are many different switch ASICs that have such a performance guarantee, due to the way their hardware is designed.

For any such switch ASIC, only a finite amount of work can be done per clock cycle. Thus if such a switch ASIC implements a P4 register, there must be a maximum number of bits that can be read and written back per packet. This maximum can differ between switch ASIC designs, and is chosen by the hardware designers when they design that particular chip.

In general, all resource limits are finite. Some examples of other limits:

  • the maximum width of a search key in a table
  • the maximum number of different actions that a table can support
  • the maximum number of bits in all action parameters of a single table entry
  • the maximum number of counter and/or meter updates that can be done
  • the maximum number of tables that can be looked up in a single cycle for a packet

etc. If you want to know what these limits are for a particular switch ASIC design, you must ask the manufacturer. As I mentioned earlier, the answers are not always made public: sometimes the exact limits of a switch ASIC’s capabilities are only given to potential customers under non-disclosure agreements.

Many thanks for your comprehensive reply. You really helped me a lot.