One of the best ways I know to debug P4 program behavior is to enable logging when you start the simple_switch process (for example with the --log-console or --log-file command line options), and then look at those logs to see what happens to your packets as they are being processed. It can be tedious to do, but once you find the packets of interest to you in the log, it tells you what happened on most, if not all, of the lines of P4 code executed on those packets.
It helps a lot if you can reproduce the problem with as few packets as possible, ideally only one packet sent to the switch if you can arrange that, since that makes the packet of interest much easier to find in the log.
When a packet-out packet arrives at the switch, it starts with the controller header that you define, followed immediately by whatever packet the controller sent. There should not be anything else in it. Note that the controller header is just a sequence of bytes, just like the packet that follows it, so the only way you can distinguish these packets from others is by knowing that they arrived at the switch on the CPU port. You will typically want to call setInvalid() on the controller header to remove it from your packet before it leaves the switch. If you sent the packet, with that controller header still in front of it, to another device expecting Ethernet frames, that device would interpret those bytes as the beginning of the Ethernet header.
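To make the byte layout concrete, here is a small Python sketch of what a packet-out looks like on the wire and what a downstream device would see if the controller header were left in place. The 2-byte header layout (a 9-bit egress port followed by 7 bits of padding) and the function names are assumptions made up for illustration. Your real header is whatever your P4 program defines.

```python
import struct
from typing import Tuple

# Hypothetical packet-out header, for illustration only:
# 9-bit egress port followed by 7 bits of padding (2 bytes total).
HEADER_LEN = 2

def build_packet_out(egress_port: int, frame: bytes) -> bytes:
    """Prepend the (assumed) controller header to an Ethernet frame."""
    if not 0 <= egress_port < 512:
        raise ValueError("egress_port must fit in 9 bits")
    header = struct.pack("!H", egress_port << 7)
    return header + frame

def split_packet_out(packet: bytes) -> Tuple[int, bytes]:
    """Roughly what parsing the header and then invalidating it achieves:
    read the port, then drop the header bytes from the front."""
    (word,) = struct.unpack("!H", packet[:HEADER_LEN])
    return word >> 7, packet[HEADER_LEN:]

def dst_mac(frame: bytes) -> str:
    """Interpret the first 6 bytes as an Ethernet destination address."""
    return ":".join(f"{b:02x}" for b in frame[:6])

# Broadcast destination, made-up source address, EtherType 0x0800, dummy payload.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
packet_out = build_packet_out(3, frame)
port, inner = split_packet_out(packet_out)

print("egress port from header:", port)
print("dst MAC with header removed:", dst_mac(inner))
print("dst MAC if header is left on:", dst_mac(packet_out))
```

With the header left on, the first two bytes of the "destination MAC" are really the controller header, so every Ethernet field after it is shifted by two bytes as well.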