To what extent can P4 extract attributes/headers/etc from HTTPS packets

bcheevers123 · November 27, 2023, 2:42pm

To what extent can P4 extract attributes/headers/etc from HTTPS packets?

I understand that P4 can extract information from protocols at Layers 4 and below:

```
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

header ipv4_t {
    bit<4>  version;
    bit<4>  ihl;
    bit<8>  diffserv;
    bit<16> totalLen;
    bit<16> identification;
    bit<3>  flags;
    bit<13> fragOffset;
    bit<8>  ttl;
    bit<8>  protocol;
    bit<16> hdrChecksum;
    bit<32> srcAddr;
    bit<32> dstAddr;
}

header tcp_t {
    bit<16> srcPort;
    bit<16> dstPort;
    bit<32> seqNo;
    bit<32> ackNo;
    bit<4>  dataOffset;
    bit<3>  res;
    bit<9>  tcp_flags;
    bit<16> window;
    bit<16> checksum;
    bit<16> urgentPtr;
}

header udp_t {
    bit<16> srcPort;
    bit<16> dstPort;
    bit<16> udplen;
    bit<16> udpchk;
}

header icmp_t {
    bit<8>  type;
    bit<8>  code;
    bit<16> checksum;
}

struct headers {
    ethernet_t ethernet;
    ipv4_t     ipv4;
    tcp_t      tcp;
    udp_t      udp;
    icmp_t     icmp;
}
```

But how would one go about extracting Layer 6 protocol information, such as HTTPS? Are there examples of this?

andyfingerhut · November 27, 2023, 6:12pm

To the extent that packets are encrypted, and your P4-programmable device does not have the decryption keys available, any part of the packet that is encrypted is effectively random gibberish to your P4 program.

I believe that for HTTPS, the parts of a packet you can see unencrypted are the Ethernet + IPv4/IPv6 + TCP headers, and that everything after the TCP header is encrypted, but I have not recently reviewed HTTPS to verify that.

bcheevers123 · November 28, 2023, 8:17am

Yes, I am aware that one would not be able to see the actual HTTPS content (as it is encrypted), but could one obtain packet features such as “Content-Length” from the header?

andyfingerhut · November 28, 2023, 8:40am

If by “Content-Length” you mean a field value defined by HTTP, then I am fairly sure that it and every other HTTP field and value are part of the TCP payload data, and are encrypted when you use HTTPS.

bcheevers123 · December 3, 2023, 10:14am

Ah yes, you are correct. However, I would be able to access the TLS packet information:

```
header tls_handshake_t {
    bit<8>  handshake_type;
    bit<24> length;
}
```
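For context on where that handshake header sits: every TLS record starts with a 5-byte cleartext record header (content type, legacy version, length), and the 4-byte handshake header follows it in handshake records. Below is a minimal Python sketch of reading both from a TCP payload. The function name `parse_tls_record` and the returned dictionary keys are made up for this illustration, and the sketch assumes the payload starts exactly at a record boundary, which a single TCP segment does not guarantee.

```python
import struct

# TLS record content types (RFC 8446, section 5.1).
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}


def parse_tls_record(payload: bytes) -> dict:
    """Parse the cleartext TLS record header at the start of `payload`.

    Hypothetical helper: assumes `payload` begins at a record boundary.
    """
    if len(payload) < 5:
        raise ValueError("need at least 5 bytes for a TLS record header")
    # 1-byte content type, 2-byte legacy version, 2-byte record length.
    content_type, version, length = struct.unpack("!BHH", payload[:5])
    info = {
        "content_type": CONTENT_TYPES.get(content_type, "unknown"),
        "legacy_version": version,
        "record_length": length,
    }
    # The 4-byte handshake header (1-byte type, 24-bit length) is only
    # meaningful for handshake messages sent before encryption starts,
    # such as ClientHello; later handshake data is encrypted.
    if content_type == 22 and len(payload) >= 9:
        info["handshake_type"] = payload[5]
        info["handshake_length"] = int.from_bytes(payload[6:9], "big")
    return info
```

The record header fields map directly onto fixed-width P4 header fields (8, 16, and 16 bits), so the same layout could in principle be extracted by a P4 parser when the record starts right after the TCP header.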

p4prof · December 10, 2023, 2:06am

@bcheevers123 ,

In addition to the excellent responses provided by @andyfingerhut , I think I should point out an often overlooked fact: TCP is a streaming protocol, meaning that packet boundaries are largely irrelevant there. An individual message can be split into any number of packets (at the extreme end it can be 1 byte per packet), and it does not need to start on a packet boundary (unless you are looking at the very first byte of the stream), etc.

Usually, properly matching on anything inside a TCP stream (even an unencrypted one) requires you to (a) reassemble the stream and (b) use regular expression matching.

Both (a) and (b) are pretty difficult to implement in P4 unless the target provides a lot of specialized support for this functionality.
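To make step (a) concrete, here is a minimal sketch of what reassembly involves when done off-switch in Python. The class `StreamReassembler` is hypothetical and handles one direction of one flow; it assumes no sequence-number wraparound and puts no limit on buffered data, both of which a real implementation would need to address.

```python
class StreamReassembler:
    """Reorder TCP segments of one direction of one flow into a byte stream.

    Simplifications (assumptions of this sketch): no sequence-number
    wraparound and no bound on buffered out-of-order data.
    """

    def __init__(self, initial_seq: int):
        self.next_seq = initial_seq  # sequence number of the next byte to deliver
        self.pending = {}            # seq -> payload of segments not yet delivered

    def add_segment(self, seq: int, payload: bytes) -> bytes:
        """Buffer one segment and return any bytes that are now contiguous."""
        # Keep the longest payload seen for a given starting sequence number.
        if len(payload) > len(self.pending.get(seq, b"")):
            self.pending[seq] = payload
        out = bytearray()
        for s in sorted(self.pending):
            if s > self.next_seq:
                break  # gap: an earlier segment has not arrived yet
            data = self.pending.pop(s)
            skip = self.next_seq - s  # bytes already delivered (retransmission/overlap)
            if skip < len(data):
                out += data[skip:]
                self.next_seq += len(data) - skip
        return bytes(out)
```

Even this small version needs per-flow state that grows with the amount of out-of-order data, which is the kind of storage most P4 targets do not offer without specialized support.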

Happy hacking,
Vladimir

bcheevers123 · December 10, 2023, 10:32am

Thank you @p4prof for your insight here.

Yes, since TCP is a streaming protocol, I’ve written a Python script that interfaces with P4 and attributes each packet to the corresponding flow/stream.

I suppose my main query was which fields one could extract from an HTTPS or TLS packet without knowing the decryption key, such as the number of bytes, size, content length, etc.

Has anyone created a header(s) for this?
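As a rough illustration of the size-related features that need no decryption key, the sketch below derives the TCP payload length from the `totalLen`, `ihl`, and `dataOffset` fields of the headers defined at the top of the thread and accumulates it per flow. The `FlowStats` class and its method names are invented for this example and are not the script mentioned above.

```python
from collections import defaultdict


def tcp_payload_len(total_len: int, ihl: int, data_offset: int) -> int:
    """Bytes of TCP payload in one IPv4 packet.

    `ihl` and `data_offset` count 32-bit words, as in ipv4_t and tcp_t.
    """
    return total_len - 4 * ihl - 4 * data_offset


class FlowStats:
    """Hypothetical per-flow counters keyed by the TCP 4-tuple."""

    def __init__(self):
        self.bytes = defaultdict(int)    # payload bytes seen per flow
        self.packets = defaultdict(int)  # packets seen per flow

    def account(self, src_addr, dst_addr, src_port, dst_port,
                total_len, ihl, data_offset):
        key = (src_addr, dst_addr, src_port, dst_port)
        self.bytes[key] += tcp_payload_len(total_len, ihl, data_offset)
        self.packets[key] += 1
        return key
```

The byte count obtained this way includes TLS record headers and encryption overhead, so it only approximates the HTTP `Content-Length`, which itself stays encrypted.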
