Hi Niklas,
Short and efficient answer: Scapy can dissect headers automatically (bind_layers() etc.). At the time I could not get this working, so I went step by step and extracted header by header. This is very inefficient, so if you can investigate how to do it properly, I recommend it. Only take my example if you have nothing else. I looked at how Scapy processes IP and other headers, but could not implement it in time, so I went the longest but easiest way (see the long answer below). Keep in mind that some headers, such as IP or TCP, carry options, and headers built from TLVs work in a similar way (to a certain extent) to what you need for extracting the INT metadata stack. All of those headers have some kind of length field that defines how many bytes you have to extract.
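To illustrate the length-field idea, here is a minimal sketch in plain Python (no Scapy) that reads the IPv4 IHL field to find out how many bytes the header occupies. The function name and the sample buffers are made up for illustration; only the IHL semantics (low nibble of the first byte, counted in 32-bit words) come from the IPv4 header definition.

```python
def ipv4_header_length(buf: bytes) -> int:
    """Return the IPv4 header length in bytes, taken from the IHL field."""
    if len(buf) < 20:
        raise ValueError("buffer shorter than a minimal IPv4 header")
    version = buf[0] >> 4        # high nibble: IP version
    ihl_words = buf[0] & 0x0F    # low nibble: header length in 32-bit words
    if version != 4 or ihl_words < 5:
        raise ValueError("not a valid IPv4 header")
    header_len = ihl_words * 4
    if len(buf) < header_len:
        raise ValueError("buffer shorter than the announced header length")
    return header_len


# Hypothetical sample buffers: only the first byte matters here.
plain = bytes([0x45]) + bytes(19)          # IHL = 5 words, no options
with_options = bytes([0x46]) + bytes(23)   # IHL = 6 words, 4 bytes of options

print(ipv4_header_length(plain))          # 20
print(ipv4_header_length(with_options))   # 24
```

The INT shim header plays the same role further down: its length field tells you how many bytes of metadata follow.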
Long but not so efficient answer: Let me show you my example. I have a private repository with an INT demo VM (hopefully public some day, when I have time) that does exactly this. My way of “decoding” the INT metadata header from each hop is not the best, but it worked. I am not an expert in Scapy, and when I wrote this I needed it done quickly, so this is the solution I came up with. If I had to do it again, I would probably make it more efficient and change some parts of the code. Note that at the time I extracted the information from Telemetry Reports. I am not sure whether the header is still the same; headers and fields have probably changed if you check the latest specification.
First I define the metadata that I collect and that can be added to the stack, for example the ingress global timestamp or the queue ID and queue occupancy. I list just a couple of examples so you get the idea:
from scapy.all import Packet, BitField, ByteField, IntField, ShortField

class INT_q_occupancy(Packet):
    name = "Queue Occupancy"
    fields_desc = [
        ByteField('q_id', 0),
        BitField('q_occupancy', 0, 24),
    ]

class INT_ingress_tstamp(Packet):
    name = "Ingress Timestamp"
    fields_desc = [
        IntField('ingress_global_timestamp', 0),
    ]
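For reference, this is what those two metadata words look like on the wire. The following is a minimal sketch in plain Python (no Scapy) that decodes the same layouts by hand; the function names and sample bytes are hypothetical, while the field widths are taken from the two classes above.

```python
def decode_q_occupancy(word: bytes) -> dict:
    """Decode one 4-byte word: 8-bit queue ID followed by 24-bit occupancy."""
    if len(word) != 4:
        raise ValueError("expected exactly 4 bytes")
    return {
        "q_id": word[0],
        "q_occupancy": int.from_bytes(word[1:4], "big"),
    }


def decode_ingress_tstamp(word: bytes) -> dict:
    """Decode one 4-byte word holding the 32-bit ingress timestamp."""
    if len(word) != 4:
        raise ValueError("expected exactly 4 bytes")
    return {"ingress_global_timestamp": int.from_bytes(word, "big")}


# Hypothetical sample: queue 2 with an occupancy of 0x00012C = 300.
sample_q = bytes([0x02, 0x00, 0x01, 0x2C])
print(decode_q_occupancy(sample_q))   # {'q_id': 2, 'q_occupancy': 300}
```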
Then, let me show you the INT shim, the INT metadata header and the Telemetry Report too:
class INT_shim(Packet):
    name = "INT Shim Header"
    fields_desc = [
        ByteField('int_type', 0),
        ByteField('rsvd1', 0),
        ByteField('len', 0),
        BitField('dscp', 0, 6),
        BitField('rsvd2', 0, 2)
    ]

class INT_meta(Packet):
    name = "INT Metadata Header"
    fields_desc = [
        BitField('ver', 0, 4),
        BitField('rep', 0, 2),
        BitField('c', 0, 1),
        BitField('e', 0, 1),
        BitField('m', 0, 1),
        BitField('rsvd1', 0, 7),
        BitField('rsvd2', 0, 3),
        BitField('hop_metadata_len', 0, 5),
        ByteField('remaining_hop_cnt', 0),
        BitField('instruction_mask_0003', 0, 4),
        BitField('instruction_mask_0407', 0, 4),
        BitField('instruction_mask_0811', 0, 4),
        BitField('instruction_mask_1215', 0, 4),
        ShortField('rsvd3', 0),
    ]

class TelemetryReport(Packet):
    name = "INT telemetry report"
    fields_desc = [
        BitField("ver", 1, 4),
        BitField("len", 4, 4),
        BitField("nProto", 0, 3),
        BitField("repMdBits", 0, 6),
        BitField("rsvd", 0, 6),
        BitField("d", 0, 1),
        BitField("q", 0, 1),
        BitField("f", 0, 1),
        BitField("hw_id", 0, 6),
        IntField("switch_id", None),
        IntField("seq_no", None),
        IntField("ingress_tstamp", None)
    ]
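If it helps to see the bit layout spelled out, here is a minimal sketch that decodes the 8-byte INT metadata header with struct and bit shifts instead of Scapy. It assumes exactly the field widths of the INT_meta class above (fields packed most significant bit first); the function name and the sample bytes are hypothetical.

```python
import struct


def decode_int_meta(buf: bytes) -> dict:
    """Decode the 8-byte INT metadata header laid out as in INT_meta above."""
    if len(buf) != 8:
        raise ValueError("INT metadata header must be 8 bytes")
    # Three single bytes, the hop counter, the 16-bit instruction mask, rsvd3.
    b0, b1, b2, remaining_hop_cnt, masks, _rsvd3 = struct.unpack("!BBBBHH", buf)
    return {
        "ver": b0 >> 4,
        "rep": (b0 >> 2) & 0x3,
        "c": (b0 >> 1) & 0x1,
        "e": b0 & 0x1,
        "m": b1 >> 7,
        "hop_metadata_len": b2 & 0x1F,   # in 4-byte words
        "remaining_hop_cnt": remaining_hop_cnt,
        "instruction_mask_0003": (masks >> 12) & 0xF,
        "instruction_mask_0407": (masks >> 8) & 0xF,
        "instruction_mask_0811": (masks >> 4) & 0xF,
        "instruction_mask_1215": masks & 0xF,
    }


# Hypothetical sample: version 1, 2 words of metadata per hop,
# 5 hops remaining, instruction mask 0xA000.
sample_meta = bytes([0x10, 0x00, 0x02, 0x05, 0xA0, 0x00, 0x00, 0x00])
decoded = decode_int_meta(sample_meta)
print(decoded["hop_metadata_len"], decoded["instruction_mask_0003"])   # 2 10
```

The two values printed at the end are the ones the stack extraction further down depends on.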
This is how I handled the packet. Probably not the most efficient way, but it worked:
import sys
from datetime import datetime
from scapy.all import Ether, IP, ICMP, TCP, UDP, Raw

def handle_pkt(packet, conn, flows):
    info = {}
    print("Handling report.")
    info["rec_time"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")
    pkt = bytes(packet)
    #print("## PACKET RECEIVED ##")

    # IP protocol numbers
    ICMP_PROTO = 1
    TCP_PROTO = 6
    UDP_PROTO = 17

    # Header lengths in bytes (fixed sizes, options are not handled)
    ETHERNET_HEADER_LENGTH = 14
    IP_HEADER_LENGTH = 20
    ICMP_HEADER_LENGTH = 8
    UDP_HEADER_LENGTH = 8
    TCP_HEADER_LENGTH = 20
    INT_REPORT_HEADER_LENGTH = 16
    INT_SHIM_LENGTH = 4
    INT_SHIM_WORD_LENGTH = 1
    INT_META_LENGTH = 8
    INT_META_WORD_LENGTH = 2

    # Offsets of the outer headers, relative to the start of the packet
    OUTER_ETHERNET_OFFSET = 0
    OUTER_IP_HEADER = OUTER_ETHERNET_OFFSET + ETHERNET_HEADER_LENGTH
    OUTER_L4_HEADER_OFFSET = OUTER_IP_HEADER + IP_HEADER_LENGTH

    # Offsets of the inner headers, relative to the start of the UDP payload
    INNER_ETHERNET_OFFSET = INT_REPORT_HEADER_LENGTH
    INNER_IP_HEADER_OFFSET = INNER_ETHERNET_OFFSET + ETHERNET_HEADER_LENGTH
    INNER_L4_HEADER_OFFSET = INNER_IP_HEADER_OFFSET + IP_HEADER_LENGTH
    INT_SHIM_OFFSET = (INT_REPORT_HEADER_LENGTH +
                       ETHERNET_HEADER_LENGTH +
                       IP_HEADER_LENGTH)

    eth_report = Ether(pkt[0:ETHERNET_HEADER_LENGTH])
    #eth_report.show()
    ip_report = IP(pkt[OUTER_IP_HEADER:OUTER_IP_HEADER+IP_HEADER_LENGTH])
    #ip_report.show()
    udp_report = UDP(pkt[OUTER_L4_HEADER_OFFSET:OUTER_L4_HEADER_OFFSET+UDP_HEADER_LENGTH])
    #udp_report.show()

    raw_payload = bytes(packet[Raw]) # to get payload
    telemetry_report = TelemetryReport(raw_payload[0:INT_REPORT_HEADER_LENGTH])
    #telemetry_report.show()
    inner_eth = Ether(raw_payload[INNER_ETHERNET_OFFSET:INNER_ETHERNET_OFFSET+ETHERNET_HEADER_LENGTH])
    #inner_eth.show()
    inner_ip = IP(raw_payload[INNER_IP_HEADER_OFFSET : INNER_IP_HEADER_OFFSET+IP_HEADER_LENGTH])
    #inner_ip.show()

    info["ip_src"] = (inner_ip.src).strip("'")
    info["ip_dst"] = (inner_ip.dst).strip("'")
    info["ip_proto"] = inner_ip.proto
    info["port_dst"] = 0
    info["port_src"] = 0
    inner_tcp = None
    inner_udp = None

    # The INT shim follows the inner L4 header, so its offset depends on the protocol
    if inner_ip.proto == ICMP_PROTO:
        INT_SHIM_OFFSET += ICMP_HEADER_LENGTH
        inner_icmp = ICMP(raw_payload[INNER_L4_HEADER_OFFSET : INNER_L4_HEADER_OFFSET+ICMP_HEADER_LENGTH])
        #inner_icmp.show()
    elif inner_ip.proto == TCP_PROTO:
        INT_SHIM_OFFSET += TCP_HEADER_LENGTH
        inner_tcp = TCP(raw_payload[INNER_L4_HEADER_OFFSET : INNER_L4_HEADER_OFFSET+TCP_HEADER_LENGTH])
        #inner_tcp.show()
        info["port_src"] = inner_tcp.sport
        info["port_dst"] = inner_tcp.dport
    elif inner_ip.proto == UDP_PROTO:
        INT_SHIM_OFFSET += UDP_HEADER_LENGTH
        inner_udp = UDP(raw_payload[INNER_L4_HEADER_OFFSET : INNER_L4_HEADER_OFFSET+UDP_HEADER_LENGTH])
        #inner_udp.show()
        info["port_src"] = inner_udp.sport
        info["port_dst"] = inner_udp.dport
    else:
        return

    INT_META_OFFSET = INT_SHIM_OFFSET + INT_SHIM_LENGTH
    #print("SHIM OFFSET: "+str(INT_SHIM_OFFSET))
    int_shim = INT_shim(raw_payload[INT_SHIM_OFFSET : INT_SHIM_OFFSET+INT_SHIM_LENGTH])
    #int_shim.show()
    int_meta = INT_meta(raw_payload[INT_META_OFFSET : INT_META_OFFSET+INT_META_LENGTH])
    int_meta.show()

    INT_METADATA_STACK_OFFSET = INT_META_OFFSET + INT_META_LENGTH
    # This is the key variable, it will tell you how many bytes of the stack you need to extract
    INT_METADATA_STACK_LENGTH = (int_shim.len - INT_SHIM_WORD_LENGTH - INT_META_WORD_LENGTH) * 4
    stack_payload = raw_payload[INT_METADATA_STACK_OFFSET:INT_METADATA_STACK_OFFSET+INT_METADATA_STACK_LENGTH]
    info = extract_metadata_stack(stack_payload,
                                  INT_METADATA_STACK_LENGTH,
                                  int_meta.hop_metadata_len * 4,
                                  int_meta.instruction_mask_0003,
                                  int_meta.instruction_mask_0407,
                                  info)
    #Uncomment and magic happens
    #print(info)

    # get_flow_uuid and insert_data_to_db are my database helpers (part of the whole file)
    info["mon_id"] = get_flow_uuid(conn, info)
    insert_data_to_db(conn, info)
    sys.stdout.flush()
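The length arithmetic for the key variable can be checked in isolation. The sketch below repeats the formula used for INT_METADATA_STACK_LENGTH with the same two word-length constants and derives the hop count from it; the function name and the sample numbers are made up for illustration, and it assumes (as the code above does) that the shim length counts 4-byte words including the shim and the metadata header.

```python
INT_SHIM_WORD_LENGTH = 1   # shim header: one 4-byte word
INT_META_WORD_LENGTH = 2   # INT metadata header: two 4-byte words


def stack_layout(shim_len_words: int, hop_metadata_len_words: int):
    """Return (stack length in bytes, number of hops) from the two length fields."""
    stack_words = shim_len_words - INT_SHIM_WORD_LENGTH - INT_META_WORD_LENGTH
    if stack_words < 0:
        raise ValueError("shim length is smaller than the INT headers themselves")
    if hop_metadata_len_words <= 0:
        raise ValueError("per-hop metadata length must be positive")
    if stack_words % hop_metadata_len_words:
        raise ValueError("stack length is not a multiple of the per-hop length")
    return stack_words * 4, stack_words // hop_metadata_len_words


# Hypothetical report: shim.len = 9 words, each hop pushes 2 words.
# (9 - 1 - 2) * 4 = 24 bytes of stack, 24 / 8 = 3 hops.
print(stack_layout(9, 2))   # (24, 3)
```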
At this point, the info variable holds the INT metadata from all hops. If you uncomment print(info) you should be able to see the INT metadata stack. The stack is extracted like this:
# One extractor per possible value of the 4-bit instruction mask (0 to 15).
# Unused combinations return an empty dict so the caller can always merge the result.
def extract_0003_i0(b):
    return {}

def extract_0003_i1(b):
    return {}

#more of them until i10

def extract_0003_i10(b):
    # INT_switch_id and INT_hop_latency are not shown here; they follow the
    # same pattern as the metadata classes above.
    data = {}
    s_id = INT_switch_id(b[0:4])
    s_id.show()
    hop_l = INT_hop_latency(b[4:8])
    hop_l.show()
    data["switch_id"] = s_id.switch_id
    data["hop_latency"] = hop_l.hop_latency
    return data

# more until finished

def extract_0003_i15(b):
    return {}

def extract_ins_00_03(instruction, b):
    if instruction == 0:
        return extract_0003_i0(b)
    elif instruction == 1:
        return extract_0003_i1(b)
    # more until i10, the one I decided to use for this example
    elif instruction == 10:
        return extract_0003_i10(b)
    # more until the end of possible instructions (4 bits, 0 to 15)
    elif instruction == 15:
        return extract_0003_i15(b)

def extract_metadata_stack(b, total_data_len, hop_m_len, instruction_mask_0003, instruction_mask_0407, info):
    # Integer division, because range() below needs an int
    numHops = total_data_len // hop_m_len
    info["instruction_mask_0003"] = instruction_mask_0003
    info["instruction_mask_0407"] = instruction_mask_0407
    info["data"] = {}
    #print("##[ INT Metadata Stack ]##")
    i = 0
    # The first entry in the stack is stored under the highest hop number
    for hop in range(numHops, 0, -1):
        offset = i * hop_m_len
        hop_bytes = b[offset:offset+hop_m_len]
        #print("##[ Data from hop "+str(hop)+" ]##")
        hop_data = {}
        if instruction_mask_0003 != 0:
            # "or {}" covers extractors that return nothing
            hop_data.update(extract_ins_00_03(instruction_mask_0003, hop_bytes) or {})
        if instruction_mask_0407 != 0:
            # extract_ins_04_07 follows the same pattern as extract_ins_00_03 (not shown)
            hop_data.update(extract_ins_04_07(instruction_mask_0407, hop_bytes) or {})
        info["data"]["hop_"+str(hop)] = hop_data
        i += 1
    return info
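As an alternative to one function per mask value, the per-hop decoding could be driven by a table. What follows is only a sketch in plain Python, not what my file does. The bit-to-field mapping is an assumption: it matches the mask 10 case above (switch ID plus hop latency), while the other two entries and all names are hypothetical and should be checked against the specification version you implement.

```python
# Assumed layout of the first instruction nibble, most significant bit first.
# Each entry: (bit in the nibble, field name, size in bytes).
FIELDS_0003 = [
    (0b1000, "switch_id", 4),
    (0b0100, "port_ids", 4),          # assumption, not used in the example above
    (0b0010, "hop_latency", 4),
    (0b0001, "q_id_occupancy", 4),    # assumption, raw 4-byte word
]


def decode_hop_0003(mask: int, chunk: bytes) -> dict:
    """Decode one hop's metadata for every bit set in the 4-bit mask."""
    data, offset = {}, 0
    for bit, name, size in FIELDS_0003:
        if mask & bit:
            data[name] = int.from_bytes(chunk[offset:offset + size], "big")
            offset += size
    return data


def split_stack(stack: bytes, hop_bytes: int, mask: int) -> dict:
    """Split the stack into per-hop chunks; the first chunk gets the highest hop number."""
    if hop_bytes <= 0:
        raise ValueError("per-hop length must be positive")
    num_hops = len(stack) // hop_bytes
    hops = {}
    for i in range(num_hops):
        chunk = stack[i * hop_bytes:(i + 1) * hop_bytes]
        hops["hop_" + str(num_hops - i)] = decode_hop_0003(mask, chunk)
    return hops


# Hypothetical stack with two hops, mask 10 (0b1010): switch ID + hop latency.
sample_stack = b"".join(v.to_bytes(4, "big") for v in (2, 50, 1, 30))
print(split_stack(sample_stack, 8, 10))
# {'hop_2': {'switch_id': 2, 'hop_latency': 50}, 'hop_1': {'switch_id': 1, 'hop_latency': 30}}
```

The hop numbering follows the loop above: the entry at the start of the stack is labelled with the highest hop number.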
Keep in mind that the Telemetry Report length and the INT metadata header length (to name two examples) follow the version of the specification I implemented. You need to adjust most of the “constants”, like INT_META_WORD_LENGTH or INT_REPORT_HEADER_LENGTH, if my implementation does not fit your use case.
And of course, let me give you the whole file. It will be easier in the end. Ignore how I insert data into the database; I would probably use Influx now (MariaDB was easy at that point in time).
Cheers,