Registers size for flow tracking

Hello, I have a question about register size. I want to write a P4 program that saves each flow’s information in registers. However, the number of registers that can be defined is limited, and the number of flows will probably exceed the number of registers available in the P4 program. How can this problem be handled?

Thanks,
Sara

Hi,

I want to write a P4 program that saves each flow’s information in registers

What kind of information do you want to save?

Cheers,

The information is the IP source and destination addresses and destination TCP port of the flow and ingress and egress port of the switch.

Hi Sarah,

In this case, it probably makes sense to send a digest to the control plane. I am not sure that you can store that much information in registers. :thinking:

These slides probably explain the same as what you want to achieve (see slide 5).

You also have one example here from the p4-guide.
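A minimal sketch of the digest approach for the v1model architecture follows. It is an illustration under assumptions, not the p4-guide example itself: the names flow_digest_t, ReportFlow and the receiver id 1 are hypothetical, only the ingress control is shown (parser, deparser and the V1Switch instantiation are omitted), and the forwarding decision is assumed to have been made already so that egress_spec is meaningful.

```p4
#include <core.p4>
#include <v1model.p4>

header ipv4_t {
    bit<4>  version;   bit<4>  ihl;        bit<8>  diffserv;
    bit<16> totalLen;  bit<16> identification;
    bit<3>  flags;     bit<13> fragOffset;
    bit<8>  ttl;       bit<8>  protocol;   bit<16> hdrChecksum;
    bit<32> srcAddr;   bit<32> dstAddr;
}

header tcp_t {
    bit<16> srcPort;    bit<16> dstPort;
    bit<32> seqNo;      bit<32> ackNo;
    bit<4>  dataOffset; bit<4>  res;      bit<8>  flags;
    bit<16> window;     bit<16> checksum; bit<16> urgentPtr;
}

// Hypothetical digest layout: the five values named in the question.
struct flow_digest_t {
    bit<32> src_addr;
    bit<32> dst_addr;
    bit<16> dst_port;
    bit<9>  ingress_port;
    bit<9>  egress_port;
}

// Trimmed to the headers this sketch uses.
struct headers_t {
    ipv4_t ipv4;
    tcp_t  tcp;
}

struct metadata_t {
    flow_digest_t flow_digest;
}

control ReportFlow(inout headers_t hdr,
                   inout metadata_t meta,
                   inout standard_metadata_t standard_metadata) {
    apply {
        if (hdr.ipv4.isValid() && hdr.tcp.isValid()) {
            meta.flow_digest.src_addr     = hdr.ipv4.srcAddr;
            meta.flow_digest.dst_addr     = hdr.ipv4.dstAddr;
            meta.flow_digest.dst_port     = hdr.tcp.dstPort;
            meta.flow_digest.ingress_port = standard_metadata.ingress_port;
            // Assumption: egress_spec was set earlier in this control.
            meta.flow_digest.egress_port  = standard_metadata.egress_spec;
            // Receiver id 1 is an arbitrary choice for this sketch.
            digest<flow_digest_t>(1, meta.flow_digest);
        }
    }
}
```

As written, this sends a digest for every TCP packet. A real program would probably report only the first packet of a flow, for example by consulting a table that the controller fills in as it learns flows.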

Maybe others can give a better solution.

Cheers,

Hello Sarah,

I think that the specific answer depends on a variety of additional circumstances that I could not glean from your post so far.

  1. We need to recognize that the answers to many (if not most) questions will be quite different depending on the P4 target and its P4 architecture. Therefore it is really important to mention which one you are asking about; otherwise you will probably get no more than a general, non-specific answer.

  2. It is very important to explain the details of the desired use case. For example, I’d wonder about the following: do you want to record these values for each and every packet? Are you OK having duplicates? How do you plan to use those values further?

Without this information it is really difficult to provide meaningful suggestions.

Happy hacking,
Vladimir


Thanks for the reply.

I want to use BMv2 and SmartNICs as the target platforms.
Also, I want to save the information related to each flow in the switch, and send this information to the controller once a threshold of the available registers has been used. Is this possible?

How much data can I save in registers, and what is the maximum size of registers I can use? Does it depend on the platform?

This is nice, but consider that software switches and SmartNICs might support registers differently (or not support them at all).

Which information exactly: IPv4 or IPv6 addresses? Ports for TCP, or also for UDP? Also the protocol identifier from the L3 header? It all depends on how much information you can store in the registers, which is defined by each target.

You would have to think about how to count when a register cell is used. Maybe you could count every time you write into the register. But that would also count repeated writes to the same cell, so you would have to find a smart way of counting. You should send the information you want to the controller using a PACKET_IN packet. However, sending all the information held in the registers might be difficult, so you would have to think about how to do it smartly.

This is a nice question, but I cannot remember the limits right now for each of them. It definitely depends on the target (rather than the platform). Software switches like BMv2 will be much more flexible than hardware switches or NICs. For BMv2:

register<T>(bit<32> size): declares a register array with size cells, each of type T (e.g. bit<8>).
    void read(out T result, in bit<32> index): reads the cell at index and stores its content in the variable result (which must have type T).
    void write(in bit<32> index, in T value): writes value (also of type T) to the cell at index.

The size of a register array is given as a 32-bit value, but I cannot remember the limit on T. For other targets it will be different.
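One way to count only the first write to each cell is to keep a one-bit "used" flag per cell. The sketch below assumes v1model on BMv2 and picks the cell by hashing the flow key. The names FlowTable, FLOW_SLOTS, REPORT_THRESHOLD and report_needed are hypothetical, and only the ingress control is shown.

```p4
#include <core.p4>
#include <v1model.p4>

header ipv4_t {
    bit<4>  version;   bit<4>  ihl;        bit<8>  diffserv;
    bit<16> totalLen;  bit<16> identification;
    bit<3>  flags;     bit<13> fragOffset;
    bit<8>  ttl;       bit<8>  protocol;   bit<16> hdrChecksum;
    bit<32> srcAddr;   bit<32> dstAddr;
}

header tcp_t {
    bit<16> srcPort;    bit<16> dstPort;
    bit<32> seqNo;      bit<32> ackNo;
    bit<4>  dataOffset; bit<4>  res;      bit<8>  flags;
    bit<16> window;     bit<16> checksum; bit<16> urgentPtr;
}

struct headers_t  { ipv4_t ipv4; tcp_t tcp; }
struct metadata_t { bit<1> report_needed; }

const bit<32> FLOW_SLOTS       = 1024;  // hypothetical array size
const bit<32> REPORT_THRESHOLD = 768;   // hypothetical: report at 75% occupancy

control FlowTable(inout headers_t hdr,
                  inout metadata_t meta,
                  inout standard_metadata_t standard_metadata) {
    register<bit<1>>(FLOW_SLOTS)  slot_used;   // 1 once a cell holds a flow
    register<bit<32>>(FLOW_SLOTS) flow_src;
    register<bit<32>>(FLOW_SLOTS) flow_dst;
    register<bit<16>>(FLOW_SLOTS) flow_dport;
    register<bit<32>>(1)          used_count;  // number of occupied cells

    apply {
        if (hdr.ipv4.isValid() && hdr.tcp.isValid()) {
            bit<32> idx;
            // Map the flow key to a cell index in [0, FLOW_SLOTS).
            hash(idx, HashAlgorithm.crc32, (bit<32>) 0,
                 { hdr.ipv4.srcAddr, hdr.ipv4.dstAddr, hdr.tcp.dstPort },
                 FLOW_SLOTS);
            // The read-modify-write below must not interleave with another
            // packet's, hence the @atomic block.
            @atomic {
                bit<1> used;
                slot_used.read(used, idx);
                if (used == 0) {
                    slot_used.write(idx, 1);
                    flow_src.write(idx, hdr.ipv4.srcAddr);
                    flow_dst.write(idx, hdr.ipv4.dstAddr);
                    flow_dport.write(idx, hdr.tcp.dstPort);
                    bit<32> n;
                    used_count.read(n, 0);
                    n = n + 1;
                    used_count.write(0, n);
                    if (n >= REPORT_THRESHOLD) {
                        meta.report_needed = 1;  // later code can notify the controller
                    }
                }
                // If used == 1 the cell belongs to this flow or to a colliding
                // one; this sketch does not compare the stored key.
            }
        }
    }
}
```

Because different flows can hash to the same cell, a real design would have to decide what to do on a collision, for example by comparing the stored key and reporting the displaced flow to the controller.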

Hope it helps


Thanks a lot for your reply. I have some questions.

  1. register(bit<32> size) means that we can define a register of size 2^32, right?

  2. Can you explain more about this sentence: “But, that would also count when the same register is written multiple times”? When might this happen? In my program, the index is saved after each write, and the next write goes to the next index of the register.

  3. How does the program handle writes to the same register when two packets arrive at the same time? (Because I am saving the index of the last written cell of the register, how this is handled matters.)

Hi @sarahshakeri ,

  1. You can find the definition in the language spec, let me show:

For example, the following extern declaration describes a generic block of registers, where the type of the elements stored in each register is an arbitrary T.

extern Register<T> {
    Register(bit<32> size);
    T read(bit<32> index);
    void write(bit<32> index, T value);
}

Therefore, T is the type (and thus the width) of the information you want to store. Let’s say you want to store numbers 0 to 3 (imagine this is the value of a header field). Then bit<2> for T should be enough. But you want to store that number for packets arriving on each of your hypothetical 16 ports. Then you can instantiate it as register<T>(16). And because the size argument for registers in V1Model is a bit<32>, you could in principle make the array as big as 2^32 - 1 cells (the largest bit<32> value). This is further explained in the specification:

The type T has to be specified when instantiating a set of registers, by specializing the Register type:

Register<bit<16>>(128) registerBank; //I modified T size to 16

The instantiation of registerBank is made using the Register type specialized with the bit<16> bound to the T type argument.
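Note that the Register extern quoted above is the specification's generic illustration. In v1model the extern is spelled register, and read() returns its result through an out parameter. As a rough sketch of the 16-port example, assuming v1model, ingress ports numbered 0 to 15, and the 2-bit ECN field of IPv4 as the stored value (the names PortState and last_ecn are hypothetical):

```p4
#include <core.p4>
#include <v1model.p4>

header ipv4_t {
    bit<4>  version;   bit<4>  ihl;
    bit<6>  dscp;      bit<2>  ecn;        // the 2-bit value stored below
    bit<16> totalLen;  bit<16> identification;
    bit<3>  flags;     bit<13> fragOffset;
    bit<8>  ttl;       bit<8>  protocol;   bit<16> hdrChecksum;
    bit<32> srcAddr;   bit<32> dstAddr;
}

struct headers_t  { ipv4_t ipv4; }
struct metadata_t { bit<2> previous_ecn; }

control PortState(inout headers_t hdr,
                  inout metadata_t meta,
                  inout standard_metadata_t standard_metadata) {
    // 16 cells, one per port; each cell holds a value in the range 0..3.
    register<bit<2>>(16) last_ecn;

    apply {
        bit<32> idx = (bit<32>) standard_metadata.ingress_port;
        // Guard against ports outside the assumed 0..15 range.
        if (hdr.ipv4.isValid() && idx < 16) {
            // Fetch what the previous packet on this port stored ...
            last_ecn.read(meta.previous_ecn, idx);
            // ... then overwrite the cell with this packet's value.
            last_ecn.write(idx, hdr.ipv4.ecn);
        }
    }
}
```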

If this was not clear enough, maybe the following figure makes it clear:

  2. Oh, I see. I understand now. I meant that if you implement an example that counts every time you write to a register index, then it counts every write, not only the first write to each cell (because you would have to track which cells were already written, which is complicated, I guess). But if you increment the index yourself and then count, it will work as you expect. :slight_smile:

  3. To be honest, I lack deep knowledge about how this is handled in hardware or software switches. But let’s say two packets are received at the same time and both write to a register: I guess the switch has to handle that sequentially, or block write access, if your write actions have to happen at pretty much the same time. However, P4 allows operations to be performed atomically by means of the @atomic annotation. In particular, if you need to save the index of the last written register cell, then you need to write and read atomically, and then let the next packet’s write and read be done atomically too (and so on). Let me show what the specification says about it:

In contrast, extern blocks instantiated by a P4 program are global, shared across all threads. If extern blocks mediate access to state (e.g., counters, registers)—i.e., the methods of the extern block read and write state, these stateful operations are subject to data races. P4 mandates that execution of a method call on an extern instance is atomic.

To allow users to express atomic execution of larger code blocks, P4 provides an @atomic annotation, which can be applied to block statements, parser states, control blocks, or whole parsers.

Consider the following example:

extern Register { /* body omitted */ }
control Ingress() {
  Register() r;
  table flowlet { /* read state of r in an action */ }
  table new_flowlet { /* write state of r in an action */ }
  apply {
    @atomic {
       flowlet.apply();
       if (ingress_metadata.flow_ipg > FLOWLET_INACTIVE_TIMEOUT)
          new_flowlet.apply();
}}}

This program accesses an extern object r of type Register in actions invoked from tables flowlet (reading) and new_flowlet (writing). Without the @atomic annotation these two operations would not execute atomically: a second packet may read the state of r before the first packet had a chance to update it.

Note that even within an action definition, if the action does something like reading a register, modifying it, and writing it back, in a way that only the modified value should be visible to the next packet, then, to guarantee correct execution in all cases, that portion of the action definition should be enclosed within a block annotated with @atomic .

A compiler backend must reject a program containing @atomic blocks if it cannot implement the atomic execution of the instruction sequence. In such cases, the compiler should provide reasonable diagnostics.

In other words, you need to write and read atomically to prevent another write operation from changing the value in the nanoseconds between the write and read operations of your action.
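Applied to the "write to the next index" design from the question, this could look roughly like the sketch below. It assumes v1model. The names AppendLog, LOG_SIZE, next_index and log_src are hypothetical, only one field is logged to keep it short, and whether a given backend accepts the @atomic block depends on the target.

```p4
#include <core.p4>
#include <v1model.p4>

header ipv4_t {
    bit<4>  version;   bit<4>  ihl;        bit<8>  diffserv;
    bit<16> totalLen;  bit<16> identification;
    bit<3>  flags;     bit<13> fragOffset;
    bit<8>  ttl;       bit<8>  protocol;   bit<16> hdrChecksum;
    bit<32> srcAddr;   bit<32> dstAddr;
}

struct headers_t  { ipv4_t ipv4; }
struct metadata_t { bit<1> log_full; }

const bit<32> LOG_SIZE = 1024;  // hypothetical number of log cells

control AppendLog(inout headers_t hdr,
                  inout metadata_t meta,
                  inout standard_metadata_t standard_metadata) {
    register<bit<32>>(1)        next_index;  // index of the next free cell
    register<bit<32>>(LOG_SIZE) log_src;     // logged source addresses

    apply {
        if (hdr.ipv4.isValid()) {
            // Reading the index, writing the cell and advancing the index
            // form one unit: without @atomic, two packets could read the
            // same index and overwrite each other's entry.
            @atomic {
                bit<32> idx;
                next_index.read(idx, 0);
                if (idx < LOG_SIZE) {
                    log_src.write(idx, hdr.ipv4.srcAddr);
                    next_index.write(0, idx + 1);
                } else {
                    meta.log_full = 1;  // no free cell left; later code can react
                }
            }
        }
    }
}
```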

I am sorry for the long response. I hope I made no mistakes, but if anyone spots an error, please write a reply and I will fix the answer :slight_smile:

Cheers,


To add a little bit of detail for one example of a P4-programmable switch ASIC, Tofino, a 6.4 Tbps Tofino1 with 64 100GigE Ethernet ports internally has 4 “pipes”.

Pipe 0 is physically connected to 16 of the 100GigE ports, call them ports 0 through 15.
Pipe 1 is physically connected to 16 of the 100GigE ports, 16 through 31
Pipe 2 is connected to ports 32 through 47
Pipe 3 is connected to ports 48 through 63

Each pipe processes many packets at a time in a pipelined fashion, starting up to 1 new packet per clock cycle. At each stage of the pipeline, there is either exactly 1 packet, or no packet (a “bubble”). Each register is accessed at only 1 place in the pipeline for a read, and shortly afterwards at 1 place in the pipeline for a write. So there are never 2 packets at the same stage in a pipeline, and no conflicts are possible.

Each pipe has physically separate memory storage, so a P4 register array in pipe 0 is physically distinct from a P4 register array in pipe 1. Packets processed by pipe 0 cannot access P4 register arrays in any pipe except the ones in pipe 0.

There can be 4 packets in the same pipeline stage of pipe 0, pipe 1, pipe 2, and pipe 3 simultaneously, but they are packets from different input ports, and they cannot access the same instance of a P4 register array, only the one in their own pipe.

Hopefully that clarifies the part of the question regarding “in case two packets come at the same time”, at least for Tofino. That isn’t the only way to implement it, but all practical implementations I can imagine would do something similar.


Thanks a lot for your replies, @andyfingerhut @ederollora.
Now it is clear enough for me.
Just one more question: I understand the maximum size of registers in BMv2, but do you have any idea about hardware switches or SmartNICs? Is there any constraint on the maximum amount of memory dedicated to registers on each of them?

Thanks,
Sara

The memory of switch ASICs that can achieve tens of Tbps tends to be entirely on-chip, in SRAM and TCAM (or perhaps also embedded DRAM). If it is limited to SRAM and TCAM, then it tends to total tens to hundreds of megabytes, and there may be restrictions on how it may be allocated for different purposes.

Some switch ASICs have external DRAM, and NICs frequently have external DRAM, meaning that GBytes of storage are accessible, again perhaps with limits on how many times per packet you can access it, and/or with on-chip caches that can cause packet processing performance to vary from packet to packet depending upon cache hit rates.

In general, you need to check with the vendor of the device to get any more detail than that.
