How to create a loop over a register?

Hello everyone.

I’m new to P4. I am trying to implement a method that collects statistics from registers. To do that, it needs to check every element of a register array. However, I saw that P4 does not support loops such as for or while. So the question is: how can I check every element of a register array, or write to them, in a control block?

Hi @mertkanakkoc ,

It is not possible to create loops in the ingress or egress pipeline of P4-conformant targets. As far as I know, P4 has no loop construct such as the ones you mentioned. I believe this is a design criterion tied to P4-based targets: it would be very difficult to support loops and still achieve line-rate packet processing. The only loop you can create is in the parser, for instance when extracting MPLS headers or IP options. Someone correct me if I am wrong in this paragraph :slight_smile:

Extending my response, please check this paragraph extracted from the P4 spec:

Although parsers may contain loops, provided some header is extracted on each cycle, the packet itself provides a bound on the total execution of the parser. In other words, under these assumptions, the computational complexity of a P4 program is linear in the total size of all headers, and never depends on the size of the state accumulated while processing data (e.g., the number of flows, or the total number of packets processed). These guarantees are necessary (but not sufficient) for enabling fast packet processing across a variety of targets.
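To make the bound in the spec paragraph concrete, here is a small Python model (not P4) of a looping parser state that extracts one MPLS entry per iteration. The function names and the `max_labels` limit are made up for illustration; the limit stands in for the fixed size of a header stack. The loop always ends because every iteration consumes 4 bytes of a finite packet.

```python
import struct

def mpls_entry(label, bottom, tc=0, ttl=64):
    """Build one 4-byte MPLS entry: label(20) | tc(3) | bottom-of-stack(1) | ttl(8)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (bottom << 8) | ttl)

def parse_mpls_stack(packet, max_labels=8):
    """Extract MPLS labels until the bottom-of-stack bit is set.

    Returns (labels, bytes_consumed). Each iteration consumes 4 bytes,
    so the packet length bounds the number of iterations.
    """
    labels = []
    offset = 0
    while True:
        if offset + 4 > len(packet):
            raise ValueError("packet too short")
        if len(labels) == max_labels:
            raise ValueError("header stack overflow")
        (word,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        labels.append(word >> 12)
        if (word >> 8) & 1:  # bottom-of-stack bit ends the loop
            return labels, offset

packet = mpls_entry(100, 0) + mpls_entry(200, 0) + mpls_entry(300, 1) + b"payload"
labels, consumed = parse_mpls_stack(packet)
print(labels, consumed)
```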

In your case, you need to design a “search” method based purely on the register index. In other words, you need a way to obtain the index (calculated, extracted from a packet header, or taken from a hash output, to name a few examples) that leads you to the data you need. In the example posted below, the port number is used as the index, and packets are counted per port.

See an example here (p4-guide) of working with registers to perform packet counting.

The demo6 programs are nearly identical to the corresponding programs in the demo2 directory. The only difference is that they demonstrate the use of a P4 register for maintaining counts of packets received, with a separate count for each input port.

Yes, it is certainly possible to use a P4 counter to do this, so this example does not demonstrate the full power of P4 registers. I only created it to have a minimal working example that uses P4 registers.
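The per-port counting idea can be modelled in a few lines of Python. This is only a sketch of the behaviour, not the demo6 program itself: `RegisterArray`, `NUM_PORTS` and `ingress` are names I made up, and the 16-port size is an assumption. The point is that each packet touches exactly one index, so no loop is needed.

```python
class RegisterArray:
    """Toy model of a register array: fixed size, fixed-width cells, zero-initialised."""

    def __init__(self, size, width_bits=32):
        self.mask = (1 << width_bits) - 1
        self.cells = [0] * size

    def read(self, index):
        return self.cells[index]

    def write(self, index, value):
        # Values wrap around at the cell width, like a fixed-width bit field.
        self.cells[index] = value & self.mask

NUM_PORTS = 16  # assumed port count, for illustration only
port_pkt_count = RegisterArray(NUM_PORTS)

def ingress(ingress_port):
    """Per-packet work: one read and one write at the index given by the port."""
    count = port_pkt_count.read(ingress_port)
    port_pkt_count.write(ingress_port, count + 1)

for port in [1, 1, 3]:
    ingress(port)
print(port_pkt_count.read(1), port_pkt_count.read(3), port_pkt_count.read(0))
```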

Dear @mertkanakkoc

Aside from the excellent reply from @ederollora , I’d like to suggest looking at the problem at hand as opposed to looking at what’s not available in the language, because there are, indeed, very good reasons for that.

To tell you the truth, it is completely unclear to me what you would have done if you had loops available. What would be the purpose of “checking every element of a register array” in the first place?

So, it might be more productive if you try to describe what you would like to achieve as opposed to how you would like to do it.

You can state something like this: “I have a data plane program that collects XXX in a register array and I use YYY to select an individual register on a per-packet basis. Now I would like to use this information to do ZZZ, but I am not sure if that’s possible”. The better you describe what ZZZ is, the more helpful an answer we might be able to provide. It might also allow you to better understand what you are doing and why (e.g. why you would insist on doing ZZZ in the data plane and not in the control plane).

Happy hacking,

Hi @ederollora,

Thank you for your detailed reply.
Actually, I also have another solution in mind: creating an action and supplying its parameters (in my case, the indexes) from the controller through tables in the ingress or egress blocks. But neither this solution nor the ones you suggested will be efficient, because in some scenarios the size of the register array may be larger or smaller. Every time the size of the register array changes, I have to change the code accordingly. On the other hand, it makes sense not to have loops when we are trying to achieve line-rate packet processing. Anyway, thank you again for your detailed response; it seems that I have to think differently.

Hi @vgurevich,

Thank you as well for your reply and suggestions. As I said above, it seems that I have to think differently. While doing that, I will also take your suggestions into account.

Hi @mertkanakkoc ,

I second Vladimir’s opinion, in the sense that if you explain what you want to do (Vladimir’s ZZZ), we might be able to point you to a solution that is already published in some repository… or, at least, give you some (pseudo) code.

If your control plane provides the register index (let’s say, as part of a header), you probably do not need a table or an action. This method is an alternative way to collect stateful values from registers: the controller sends a packet with an index and gets a response with the register value. Unless, of course, the purpose of the packet-out is to trigger the execution of a table because that design suits you better.

Check this README and the example in that folder from A. Fingerhut:

The program registeraccess.p4 and the PTF tests in demonstrate perhaps an unusual way to access a P4 register array from the controller, which is to use PacketOut messages to inject packets from the controller into the data plane, which are then processed by the P4 program. The P4 program is written so that at least some such injected packets from the controller perform read or write operations on a P4 register array, and send a packet back to the controller.
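With this approach the loop moves to the controller: it sends one probe per index and the data plane answers each probe with a single register read. Below is a minimal Python sketch of that exchange. The 9-byte probe layout (`op`, `index`, `value`), the opcodes and the function names are all hypothetical; this is not the packet format used by registeraccess.p4, and the "switch" is just a Python function standing in for the data plane.

```python
import struct

# Hypothetical probe header: op (1 byte), index (4 bytes), value (4 bytes).
PROBE = struct.Struct("!BII")
OP_READ = 1
OP_REPLY = 2

def switch_handle_probe(registers, packet):
    """Stand-in for the data plane: one register read per probe, no loop."""
    op, index, _ = PROBE.unpack(packet)
    if op != OP_READ or index >= len(registers):
        return None  # model of dropping an invalid probe
    return PROBE.pack(OP_REPLY, index, registers[index])

def controller_read_all(send_probe, size):
    """Controller side: the loop lives here, one packet-out per index."""
    values = []
    for index in range(size):
        reply = send_probe(PROBE.pack(OP_READ, index, 0))
        if reply is None:
            raise RuntimeError(f"no reply for index {index}")
        _, reply_index, value = PROBE.unpack(reply)
        if reply_index != index:
            raise RuntimeError("reply does not match the probe")
        values.append(value)
    return values

registers = [5, 0, 7, 42]
print(controller_read_all(lambda pkt: switch_handle_probe(registers, pkt), len(registers)))
```

Note that the values are not read at a single instant: counters may change between probes, which matters if you need a consistent snapshot.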

What do you mean by “size of the register”? When you declare one, both the index size and the value size are fixed at compile time. Let’s imagine the following case: you want to store the last source IPv4 address (the value) for every source TCP port you see (the index). If your key is 16 bits wide (so the array has 2^16 positions) and the value is 32 bits wide (so it ranges from 0 to 2^32-1), then the size is already predefined. Sometimes the field you want to store is not 32 bits long, but you can always cast it to 32 bits. Please check this answer about registers. As said before, if you can explain the use case, and what you want to achieve and how, our answers will be more helpful :slight_smile:
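As a quick check of the numbers in that example, the arithmetic can be written out in Python. `register_footprint` is just a helper name I chose, and it only counts raw value bits, ignoring any target-specific overhead.

```python
def register_footprint(index_bits, value_bits):
    """Return (number of entries, total bytes) for a register array."""
    entries = 1 << index_bits           # 2^index_bits positions
    total_bytes = entries * value_bits // 8
    return entries, total_bytes

entries, total_bytes = register_footprint(index_bits=16, value_bits=32)
print(entries, total_bytes, total_bytes // 1024)  # entries, bytes, KiB
```

So a register indexed by the 16-bit TCP source port with 32-bit values has 65536 entries and takes 256 KiB of state, and both numbers are known before the program is compiled.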


Thank you again for your response @ederollora.

What I want to do is take a snapshot of the register when a packet arrives at the switch. More specifically, let’s say we store, in separate elements of the register array, the number of packets that a switch sends to each of its adjacent switches. Then, at some moment, we send a packet that carries a specific value in a custom header. When this packet arrives, the switch checks its register and collects the value of each element, because we want to see how many packets were sent to each adjacent switch.

In that scenario, whether we give the register indexes to the switch or calculate them in some way, they will depend on the topology of the network. We will create the register accordingly (the number of elements in the register array, which is what I meant by the size of the register array, will be the number of switches adjacent to this switch) and assign indexes to those elements. But if the topology changes, we have to change both the size of the register array and the indexes that were assigned. Changing the size of the register array is okay, because it is static, as you said. But also changing the assigned indexes every time will not be efficient.

That was a pretty long answer, but I hope I explained the problem. Yes, there are other ways to collect statistics from the register. I just wondered whether there is a way to collect the values of all elements of the register at once (that is, when a single packet arrives). When I saw that P4 does not support loops in the ingress and egress blocks, I wondered: what if we want to check all elements of the register at once when a packet triggers it? How can we do this? That’s the story.

Cheers :grinning:

Hi @mertkanakkoc ,

Let me begin with my long response.

To achieve this, you need either the actual output port or the standard_metadata.egress_port field in the egress pipeline on bmv2 (or the equivalent field on any other target). If you want a specific port, then that is the procedure. If you want more than one port at the same time, there are a couple of ways to achieve it.

Then you are probably not carrying the index; rather, you will need a specific table that recognizes the “packet that contains a specific value in its custom header” or, if possible, uses the IPv4/v6 headers or transport protocol fields to identify that particular packet (that is, a table that matches on packet fields). You have several ways to do it.

You actually do not need to worry about that. How many actual ports could the switch have? 16? 32? 64? Even if you included all the ports, the values at the indexes of unused ports will always be 0 (if I remember register initialization correctly from the last time I read about it).

As I just mentioned, the size will stay the same: as long as you create a register array with enough positions, you do not really need to worry about it. You might use ports 1 and 2 as output ports at first, and later maybe port 13. Reading the register for all ports and getting the packet counters is going to be fine as long as you do not need to track hundreds of ports. If the switch has 16 ports, you can add one statement per port, each doing a register read and copying the value into a new value-holding header.
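If writing one statement per port by hand feels tedious, one option is to generate the unrolled statements with a small script and paste (or include) the result in the P4 program. Here is a hedged Python sketch of such a generator. The register name `port_pkt_count`, the header `hdr.snapshot` and its `count_N` fields are hypothetical names; the generated text assumes the v1model register `read(result, index)` form, so adapt it for other architectures.

```python
def unrolled_reads(register_name, header_name, num_ports):
    """Generate one register read statement per port, as P4 source text.

    All identifiers in the output are placeholders chosen by the caller;
    nothing here checks that they exist in the P4 program.
    """
    lines = []
    for port in range(num_ports):
        lines.append(f"{register_name}.read({header_name}.count_{port}, {port});")
    return "\n".join(lines)

print(unrolled_reads("port_pkt_count", "hdr.snapshot", 4))
```

The loop runs before compilation, on your machine, so the data plane still sees a fixed sequence of reads. Changing the port count then means rerunning the generator and recompiling, not editing statements by hand.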

If adding one read operation and one header/field assignment per register position is too costly, inconvenient, or not optimal, there is another way of collecting the information. You can try to imitate the implementation of In-band Network Telemetry (INT) and carry instruction bits. You can use these bits as a table key and select the appropriate registers, like we do in INT with metadata actions. Please check this file to contextualize my reply. As seen in the file, with 4 instruction bits you can command the switch to include any combination of 4 ports (16 combinations). Therefore, if you are more interested in any possible combination of ports than in all ports, you need an instruction bitmap as large as the number of ports.
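To illustrate the instruction-bitmap idea, here is a Python model of a table keyed on the bitmap, where each entry lists the ports whose counters should be reported. It is a sketch of the concept, not the INT implementation: the function names are mine, and I assume bit i of the bitmap selects port i, which may not match the bit ordering INT uses.

```python
def build_bitmap_table(num_ports):
    """Model of the match table: one entry per bitmap value.

    Each entry maps a bitmap to the tuple of ports it selects, playing
    the role of the action chosen for that key.
    """
    table = {}
    for bitmap in range(1 << num_ports):
        table[bitmap] = tuple(p for p in range(num_ports) if bitmap & (1 << p))
    return table

def apply_instruction(table, registers, bitmap):
    """Per-packet work: one table lookup, then read only the selected registers."""
    return [(port, registers[port]) for port in table[bitmap]]

table = build_bitmap_table(4)
registers = [10, 20, 30, 40]
print(len(table))                                    # number of table entries
print(apply_instruction(table, registers, 0b0101))   # ports 0 and 2
print(apply_instruction(table, registers, 0b1111))   # all four ports
```

The table has 2^N entries for N instruction bits, which matches the 16 combinations for 4 bits but grows quickly, so for many ports a per-bit check may be more practical than one entry per combination.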


Thank you again for your detailed response @ederollora

Yeah, adding one statement per port is fine when we need fewer than hundreds of ports.

But I was actually interested in all ports, and curious about what happens if we get more and more ports. That scenario may not be realistic, but as I said, I just wondered what we could do if it became real and we had hundreds of ports. On the other hand, you showed very nice and elegant methods and suggestions for the case where we have more ports, but not hundreds of them.

Actually, for a moment I thought about continuing the questions on this subject, because elegant methods and detailed explanations might keep coming. :slight_smile: But these responses seem to be enough.

Thank you again for your time and responses.

Cheers :slight_smile: