Parser does not use the correct key

Hello everybody!

I am trying to run a customized network by following the scripts in the tutorials folder of p4lang.
I modified run_exercise.py (which lives in utils) and the basic.p4 program to fit my requirements. My topology contains six hosts; each host is connected to a leaf switch and each leaf is connected to a spine switch. The spines form a sort of star at the core of the network.
I created my own topology.json file and all the JSON files for the switch configuration. The compiler compiles the P4 program, and a Makefile creates the build, pcaps and log folders, which are populated with the corresponding files when Mininet is launched.
Everything works fine; I can even (apparently) install the rules in the switches, as the log files show. It works exactly as it does with the basic exercise of the tutorials.
However, when I try to ping between two hosts, the destination is unreachable. Checking the logs, I can see that the parser never uses 0x0800 (IPv4) as a key for parsing the Ethernet frame; it only uses 0x86DD (IPv6), which I can disable in hosts and switches, and 0x0806 (ARP).
Again, if I run the basic exercise, everything works fine.
P.S. Even the JSON file created by the compiler shows that the parser should use 0x0800 as a key.
I hope you can help me with that, thank you in advance!

Thanks! Can you share your program (ideally, the complete code), the commands you’re running on the hosts, and the BMv2 logs so we can diagnose?

Very likely you are seeing the parser use ethertype 0x86DD in the logs because the switch is receiving IPv6 packets. This can happen even if you do not attempt to send IPv6 packets, because hosts often send IPv6 control packets on their own for various keepalive-like protocols. If your P4 program drops them, the only harm is that they can confuse you when you look at the switch logs. Commands like the ones linked here can often disable the sending of such IPv6 packets: p4-guide/veth_setup.sh at master · jafingerhut/p4-guide · GitHub.

ARP is a normal thing for a host to send out just before sending an IPv4 packet on an Ethernet link, if the host does not yet have an entry in its ARP table for that IPv4 address. It should be possible to force the addition of ARP entries in a host, but sorry, I do not have the necessary Linux commands handy at the moment. It is possible that a host will never send an IPv4 packet to a switch until it has successfully created an ARP entry for the destination IPv4 address of the packet.

I had already disabled IPv6 with the commands you suggest. They do not appear anymore in the log file. Not a big deal.

As for ARP, in the tutorial exercises there are no ARP entries set up between the hosts, apart from the commands in the topology.json file, which I have in mine too. Nonetheless, pings work perfectly between them.

I am attaching the Makefile, the Python script, the P4 program, and the JSON files for the topology and the paths that are instantiated. It is ready to run with “sudo make run”. Note that it has to run in the same directory as the “utils” directory of p4lang/tutorials, together with the files receive.py and send.py.

I had to slightly change the file “convert.py” in utils/p4runtime_lib, at line 63, from len(x)==1 to len(x)==2.
One last thing: the path at the beginning of j.py has to be changed so that it points to the P4 libraries on your system.

You can find the files at: https://drive.google.com/drive/folders/1FWGGhjx5OXawAt46JesXzXayWwiiLJbj?usp=sharing

I am trying to understand your use case. I have downloaded the files and checked some of them. Could you please paste some information here and answer a few questions? :slight_smile:

The switch P4 file seems correct, although you apply two tables that might be “rewriting” the output port. So you have to be aware of that.

Then I looked at your topology.json file. Have you checked whether h1 and h2 can ping each other? Is “destination unreachable” the message at the console when you ping from h1 to h2? Is it the same when you ping h3?
Can you also paste here the output of arp -n, and the output of route -n at h1 and h2?

Besides, I see that you reference X-runtime.json files from the switches object in topology.json. Are you actually using those runtime.json files? Or are you installing rules from run_exercise.py?

Also, are you trying to build a typical leaf-spine topology? I tried to reconstruct the network from the links in your topology.json file, but it does not resemble this picture. I could be wrong, but please confirm that the links are actually established the way you intend. I see a single link from each leaf to its spine and multiple links among the spines, so keep this in mind for your rules as well: you might be installing rules for a topology that is not the one you configured with your links.

Still, I am pretty sure that the problem is related to ARP. Can you ping from h1 to h2 and sniff the traffic with Wireshark? Do you see ICMP packets or ARP packets? If your Python script does not install an ARP entry on h1 (with h2’s IP-MAC pair), ping will never work. If your program does not handle layer 2 packets (which seems to be the case from your P4 program) and you do not run an ARP proxy in the control plane, h1 will never send an ICMP request, because it is still waiting for the ARP response.

Besides, if you don’t see IPv4 packets being parsed, it probably means they are never sent, either because an ARP entry is missing or because the routing is wrong. If ICMP packets were actually being sent, you would need to look into the P4 code or the rules you install.

Cheers,

Thank you for your answer. Let me reply by bullets.

  1. I have two tables, but only to separate the table for the leaves from the one for the spines. In fact, the JSON file I use to configure each switch contains only the relevant table (MyIngress.basic_forward for leaves and MyIngress.core_forward for spines).
  2. I tried to ping from h1 to h2 and vice versa; here is the output:
mininet> h1 ping h2
E0924 12:54:53.440247983    2147 fork_posix.cc:63]           Fork support is only compatible with the epoll1 and poll polling strategies
E0924 12:54:53.445850901    2147 fork_posix.cc:63]           Fork support is only compatible with the epoll1 and poll polling strategies
PING 160.0.1.2 (160.0.1.2) 56(84) bytes of data.
From 160.0.1.1 icmp_seq=1 Destination Host Unreachable
From 160.0.1.1 icmp_seq=2 Destination Host Unreachable
From 160.0.1.1 icmp_seq=3 Destination Host Unreachable
From 160.0.1.1 icmp_seq=4 Destination Host Unreachable
From 160.0.1.1 icmp_seq=5 Destination Host Unreachable
From 160.0.1.1 icmp_seq=6 Destination Host Unreachable
^C
--- 160.0.1.2 ping statistics ---
8 packets transmitted, 0 received, +6 errors, 100% packet loss, time 7156ms
pipe 4

The same happens when I ping from h2 to h1 and from h1 to h3 (although in that case the ping is expected to fail, since I did not install rules for that route).

  3. When I type h1 arp -n before pinging h2 I get:
mininet> h1 arp -n
E0924 13:05:07.210047798    3366 fork_posix.cc:63]           Fork support is only compatible with the epoll1 and poll polling strategies

after the ping to h2:

E0924 13:06:40.937631073    3366 fork_posix.cc:63]           Fork support is only compatible with the epoll1 and poll polling strategies
Address                  HWtype  HWaddress           Flags Mask            Iface
160.0.1.2                        (incomplete)                              eth0
  4. The output of h1 route -n:
mininet> h1 route -n
E0924 13:07:14.631316551    3366 fork_posix.cc:63]           Fork support is only compatible with the epoll1 and poll polling strategies
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
160.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 eth0

while from h2:

mininet> h2 route -n
E0924 13:08:51.349454600    3366 fork_posix.cc:63]           Fork support is only compatible with the epoll1 and poll polling strategies
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
160.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 eth0
  5. To install the rules on the switches, I use topology.json and paths.json to write an X-runtime.json file for every switch in the network. Then I use the program_switch function from simple_controller to write the rules to the switches. I never call run_exercise.py, since all the functions I need are in j.py.

  6. The topology I am using is a bit different from the one in your picture. Basically, every host is connected to a leaf, and every leaf is connected to only one spine. The spines are connected to each other with multiple links, but not in a full mesh (e.g. S1 is connected to S2, S4 and S6). The rules are designed to respect that topology. In particular, they are designed to follow some pre-determined paths that I establish during the setup. For simplicity, I have shared paths.json, which contains one configuration of them.

  7. Sniffing the packets with Wireshark during h1 ping h2, I can see:
    a. On the h1-l1 interface of l1, messages of this kind:

13:22:15.650950 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
13:22:15.907105 IP6 jaff > ip6-allrouters: ICMP6, router solicitation, length 16
13:22:16.007697 IP6 jaff.mdns > ff02::fb.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)
13:22:16.073626 IP6 jaff.mdns > ff02::fb.mdns: 0 [2q] [2n] ANY (QM)? f.f.d.1.3.a.e.f.f.f.f.1.b.8.4.d.0.0.0.0.0.0.0.0.0.0.0.0.0.8.e.f.ip6.arpa. ANY (QM)? jaff.local. (148)
13:23:04.085783 ARP, Request who-has 160.0.1.2 tell 160.0.1.1, length 28

Apparently the network carries only IPv6 traffic (plus the ARP request) instead of the IPv4 traffic that I want and need. Moreover, topology.json shows that every host has an arp command to execute during its setup, as in the basic exercise in p4lang/tutorials. I cannot understand why I have this problem, since I am using exactly the same P4 program structure as the tutorials, and also the same Python file.

Hi,

It seems that the routes are fine on the hosts. But, judging from your output, the ARP table has no complete entry. Could you please confirm the MAC and IP pairs by running ifconfig on both h1 and h2? I guess one can already read them from topology.json, but confirming them does not hurt.

Assuming the pairs are there:

  • 160.0.1.1 - 00:00:00:00:01:01, for h1
  • 160.0.1.2 - 00:00:00:00:01:02, for h2

Then you should run these commands:

  • in h1: arp -s 160.0.1.2 00:00:00:00:01:02
  • in h2: arp -s 160.0.1.1 00:00:00:00:01:01

Once those entries are added, could you try to ping again? If the flows are incorrect (which could still happen) the ping might not work, but if you sniff the traffic with Wireshark you should see ICMP packets.
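
With more than two hosts, typing these entries by hand gets tedious. A minimal sketch of generating them, assuming a hypothetical `hosts` dict that maps a host name to its (IP, MAC) pair; the helper name `static_arp_commands` is made up for illustration, and only the h1 and h2 addresses come from this thread:

```python
from itertools import permutations


def static_arp_commands(hosts):
    """Return, per host, the `arp -s` commands that pin every other host's IP to its MAC."""
    commands = {name: [] for name in hosts}
    for src, dst in permutations(hosts, 2):
        dst_ip, dst_mac = hosts[dst]
        commands[src].append(f"arp -s {dst_ip} {dst_mac}")
    return commands


# Addresses of h1 and h2 as listed above; extend the dict for further hosts.
hosts = {
    "h1": ("160.0.1.1", "00:00:00:00:01:01"),
    "h2": ("160.0.1.2", "00:00:00:00:01:02"),
}

for name, cmds in static_arp_commands(hosts).items():
    for cmd in cmds:
        print(f"{name}: {cmd}")
```

Each printed command can then be run on the named host, for example from the Mininet prompt.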

Let us know if it works or still fails.

Cheers,

Good evening,

The MAC and IP addresses are correct. Now, with the ARP entries manually inserted, I can see the ICMP packets in the pcap file of l1. Finally, in the log file, I can see that the parser uses 0x800 as the key for the header matching. So, apparently, this problem is solved.

As you predicted, since I have two tables, l1 looks up the first one (MyIngress.basic_forward) and then the second (MyIngress.core_forward), where it does the match again and misses. How can I solve this?

Cheers,

Hi,

I am happy to see that the first issue is corrected.

Your second problem seems easier. But first, a clarification: when I mentioned a possible issue with the two tables, my concern was that the second table (core_forward), if it matches, overwrites whatever the first table (basic_forward) did when it also matched. However, you said that each table is used by a different kind of switch, so I guess this is not a problem: if your rules are correct, the two tables should never both match in the same switch.

Your problem here seems to be the default action (in both tables). See here:

table basic_forward {
    key = {
        standard_metadata.ingress_port : exact;
    }
    actions = {
        ipv4_forward;
        drop;
    }
    size = 1024;
    default_action = drop();
}

The default action should be NoAction instead of drop(). Right now, you are dropping every packet that misses in a table. NoAction does nothing, so if the first table matched and assigned an output port, a miss in the second table leaves that decision alone and the packet continues being processed. A correct version (for both tables) would be:

table basic_forward {
    key = {
        standard_metadata.ingress_port : exact;
    }
    actions = {
        ipv4_forward;
        drop;
        NoAction;
    }
    size = 1024;
    default_action = NoAction;
}
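
The difference between the two default actions can be modelled outside P4. The following is only a rough Python sketch of the table semantics, not BMv2 code; the rule `{1: 2}` (ingress port 1 goes to output port 2) and the function names are made up for illustration:

```python
DROP = "drop"


def apply_table(entries, key, default_action, egress):
    """Return the forwarding decision after one exact-match table lookup.

    entries maps a key to an output port; default_action is "drop" or "NoAction";
    egress is the decision made by earlier tables (None if there is none yet).
    """
    if key in entries:
        return entries[key]      # hit: the forwarding action sets the output port
    if default_action == "drop":
        return DROP              # miss with default drop: the packet is dropped
    return egress                # miss with NoAction: the earlier decision is kept


def leaf_pipeline(ingress_port, default_action):
    """Model a leaf switch: only basic_forward has entries, core_forward is empty."""
    basic_forward = {1: 2}       # hypothetical leaf rule
    core_forward = {}            # a leaf has no core_forward entries
    egress = apply_table(basic_forward, ingress_port, default_action, None)
    egress = apply_table(core_forward, ingress_port, default_action, egress)
    return egress


print(leaf_pipeline(1, "drop"))      # drop
print(leaf_pipeline(1, "NoAction"))  # 2
```

With drop as the default, the miss in the empty second table discards a packet that the first table had already forwarded; with NoAction the first decision survives.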

I should warn you that, according to your code, you run an L3 table (i.e., you forward the packet based on the IP address, rewrite the MACs, etc.). The problem here is that h1 and h2 belong to the same subnet, so h1 pings h2 directly, without going through a gateway, yet you still rewrite the MACs (in the ipv4_forward action). Therefore, you will probably not receive a reply at h1: h2 will receive an ICMP request whose source MAC differs from h1's, and even if it answers, the reply will carry a source MAC/IP pair different from what h1 expects. If this is the case, you can fix it in several ways: (1) program a table that works as an L2 switch (forwarding based on MACs), or (2) create another action for the basic_forward table (instead of ipv4_forward) that does not change the MACs.

Finally, keep in mind that even if you think your rules are right, they can still fail because of a misconfiguration, so do not rule this out :slight_smile: . I have not checked them, so they might be all right.

Option (2) would be something like this:

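// Forward on the given port without rewriting the Ethernet addresses.
// Note: mac_dst is not used in the body.
action ipv4_simple_forward(macAddr_t mac_dst, egressSpec_t port) {
    standard_metadata.egress_spec = port;
    hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
}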

Hi,

Thank you for your answer. I followed the second option, but since I do not need to change the MAC address, I got back to my first idea, which was to perform a match on the following fields:

  • hdr.ipv4.ip_src
  • hdr.ipv4.ip_dst
  • hdr.ethernet.mac_src
  • hdr.ethernet.mac_src

The idea is to make the match stronger and increase its security. Can this cause any problem?

I put NoAction as the default action for both tables I am writing.
However, I noticed that the problem of the match with both tables might be caused by the apply function in the p4 program. In fact, at the moment it is written like this:

apply {
    if (hdr.ipv4.isValid()) {
        basic_forward.apply();
    }
    if (hdr.ipv4.isValid()) {
        core_forward.apply();
    }
}

or even like this (that produces the same result):

apply {
    if (hdr.ipv4.isValid()) {
        basic_forward.apply();
        core_forward.apply();
    }
}

In the JSON file created by the compiler, this produces the following fragment:

"tables":[ ...
   "name":"MyIngress.basic_forward",
   ...
   "next_tables":{
     "MyIngress.ipv4_forward":"node_4",
     "MyIngress.drop":"node_4",
     "NoAction":"node_4"}
    ...
    "name":"MyIngress.core_forward",
    ...
    "next_tables":{
        "MyIngress.ipv4_forward":null,
        "MyIngress.drop":null,
        "NoAction":null}
    ...
   ]

Since I am not using any other kind of header that could distinguish the two apply() calls, how can I prevent the output of table basic_forward from being overwritten by core_forward?

Best,

Hi,

I can see that you wrote hdr.ethernet.mac_src twice; I guess you meant mac_dst or similar for one of them.

“The idea is to make the match stronger and increase its security. Can this cause any problem?”

Do you mean “stronger and more secure” as in preventing malicious use of the network, or as in making sure that the rule is actually matched? If you mean the first, I am not sure how matching 4 fields makes it more secure than matching 1 or 2 :thinking:. If you mean the second, then a single key field will be easier to debug (which is what I recommend, so just use the destination IP). However, it is totally fine to use 4 keys and there is no reason not to do so. You just need to be sure that the rules are correct and that packets carry the MACs and IPs that you expect (so that they match the rules).

If you want to differentiate both table execution, I can see a few different ways:

  1. I remember a student/developer who wrote different P4 data plane files for different switches, but I cannot remember whether this is available out-of-the-box in the tutorial repository. You might be able to tweak your tutorial code so that, using a configuration file, the compiled JSON of each data plane is assigned to the corresponding switches.

  2. Still, I think the easiest is using hit or miss fields:

apply{
    if(hdr.ipv4.isValid()){
        if(basic_forward.apply().miss){
            //if basic_forward is not matched then apply core_forward
            core_forward.apply();
        }
    }
}        

Then, if you install the correct rules in each switch’s tables, core_forward will never be applied when basic_forward is matched.
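
To see why gating on a miss protects the first table's result, here is a small Python model of the two apply styles. It is only a sketch: tables are plain dicts keyed by a destination IP, and the ports 1 and 3 are invented for the example.

```python
def lookup(table, key):
    """Return (hit, port) for an exact-match table modelled as a dict."""
    if key in table:
        return True, table[key]
    return False, None


def sequential(basic_forward, core_forward, key):
    """Both tables are always applied: a core_forward hit overwrites basic_forward."""
    egress = None
    for table in (basic_forward, core_forward):
        hit, port = lookup(table, key)
        if hit:
            egress = port
    return egress


def miss_gated(basic_forward, core_forward, key):
    """core_forward is consulted only when basic_forward misses."""
    hit, port = lookup(basic_forward, key)
    if hit:
        return port
    hit, port = lookup(core_forward, key)
    return port if hit else None


# Hypothetical entries in which both tables match the same destination.
basic = {"160.0.1.2": 1}
core = {"160.0.1.2": 3}

print(sequential(basic, core, "160.0.1.2"))  # 3
print(miss_gated(basic, core, "160.0.1.2"))  # 1
```

In the sequential version the second table wins whenever both match; in the miss-gated version the first match is final.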

Hope it helps.

Cheers,

Hi,

Yes, I meant mac_dst. Actually, it is not a matter of the number of fields but rather of the type of fields. Matching on both IP addresses and MAC addresses can prevent IP spoofing, so it is a (small) defense that acts as a firewall. Moreover, I also need a match on both IP addresses for reasons related to what I am implementing. So far I am matching only on the IP addresses, and it is working: I can finally ping the hosts.

About differentiating the tables, I still have to try your approach, but in the meantime I just deleted MyIngress.basic_forward and I apply only MyIngress.core_forward, even on the leaves, and it works.

Thank you for your help and time. I will surely post more questions on the forum, since I am still far from the end.

Cheers,

JB

Hi,

I am happy that this works for you.

“Actually, is not a matter of the number of fields but more of the type of fields. Having a match on IP addresses and MAC addresses can prevent IP spoofing, hence it is a (small) defense that acts as a firewall.”

Oh, now I see what you mean. Then what I would personally do is match on source MAC, source IP and ingress port. In this way, you make sure that traffic with a given MAC/IP pair is always tied to one port. This prevents spoofed packets from entering on any other port (unless someone has physical access to the right one). If you just use MACs and IPs as keys, nothing prevents anyone with access to a third computer on the same switch, or somewhere else, from generating the same packet. In P4 it would be something like:

table your_forward{
    key = {
        hdr.ipv4.ip_src: exact;
        hdr.ethernet.mac_src: exact;
        standard_metadata.ingress_port : exact;
    }
    actions = {
        drop;
        NoAction;
    }
    size = 1024;
    default_action = drop;
}

As you can see, the rule you insert for a host in a particular switch should invoke NoAction: if there is a match, do nothing. If there is no match, the default drop action takes place, so any packet that does not match the combination of port, mac_src and ip_src is dropped. In this way, a packet spoofed on another port never gets through.

Please ask any other questions you have.
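
The filtering logic of such a table boils down to a set-membership test. A minimal Python sketch, assuming h1 sits on ingress port 1 of its switch (the port number and the function name `is_allowed` are made up; the IP and MAC are h1's from earlier in the thread):

```python
def is_allowed(bindings, ip_src, mac_src, ingress_port):
    """Return True when the table would hit (NoAction), False on a miss (default drop)."""
    return (ip_src, mac_src, ingress_port) in bindings


# One installed rule: h1's IP and MAC are only accepted on port 1 (assumed port).
bindings = {("160.0.1.1", "00:00:00:00:01:01", 1)}

print(is_allowed(bindings, "160.0.1.1", "00:00:00:00:01:01", 1))  # True
print(is_allowed(bindings, "160.0.1.1", "00:00:00:00:01:01", 2))  # False
```

The second call models the same packet injected on a different port: it misses and is dropped. Note that such a table only decides pass or drop; a forwarding table still has to choose the output port for packets that pass, and it should probably run only when this filter hits, so that it cannot undo the drop.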

Cheers,

Hi,

Thank you for your tips; for the moment this part is fixed. I still don’t know why the match on the MAC address does not work, but that is fine, I just use the IP.

I have a question on the use of p4runtime-shell, but I think it is better to open a new topic for it.

So thanks again for your time and help!

Cheers,

JB