IoT Network Intrusion Detection Analysis
Proposal
Dataset
Rows: 123,117
Columns: 77
$ id.orig_p <dbl> 38667, 51143, 44761, 60893, 51087, 48579, 540…
$ id.resp_p <dbl> 1883, 1883, 1883, 1883, 1883, 1883, 1883, 188…
$ proto <chr> "tcp", "tcp", "tcp", "tcp", "tcp", "tcp", "tc…
$ service <chr> "mqtt", "mqtt", "mqtt", "mqtt", "mqtt", "mqtt…
$ flow_duration <dbl> 32.01160, 31.88358, 32.12405, 31.96106, 31.90…
$ fwd_pkts_tot <dbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, …
$ bwd_pkts_tot <dbl> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
$ fwd_data_pkts_tot <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
$ bwd_data_pkts_tot <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
$ fwd_pkts_per_sec <dbl> 0.281148, 0.282277, 0.280164, 0.281593, 0.282…
$ bwd_pkts_per_sec <dbl> 0.156193, 0.156821, 0.155647, 0.156440, 0.156…
$ flow_pkts_per_sec <dbl> 0.437341, 0.439097, 0.435811, 0.438033, 0.438…
$ down_up_ratio <dbl> 0.555556, 0.555556, 0.555556, 0.555556, 0.555…
$ fwd_header_size_tot <dbl> 296, 296, 296, 296, 296, 296, 296, 296, 296, …
$ fwd_header_size_min <dbl> 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 3…
$ fwd_header_size_max <dbl> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 4…
$ bwd_header_size_tot <dbl> 168, 168, 168, 168, 168, 168, 168, 168, 168, …
$ bwd_header_size_min <dbl> 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 3…
$ bwd_header_size_max <dbl> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 4…
$ flow_FIN_flag_count <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ flow_SYN_flag_count <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ flow_RST_flag_count <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ fwd_PSH_flag_count <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
$ bwd_PSH_flag_count <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
$ flow_ACK_flag_count <dbl> 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 1…
$ fwd_URG_flag_count <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ fwd_pkts_payload.min <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ fwd_pkts_payload.max <dbl> 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 3…
$ fwd_pkts_payload.avg <dbl> 8.444444, 8.444444, 8.222222, 8.222222, 8.444…
$ fwd_pkts_payload.std <dbl> 13.11594, 13.11594, 12.85280, 12.85280, 13.11…
$ bwd_pkts_payload.min <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ bwd_pkts_payload.max <dbl> 23, 23, 21, 21, 23, 23, 23, 23, 23, 23, 23, 2…
$ bwd_pkts_payload.tot <dbl> 32, 32, 30, 30, 32, 32, 32, 32, 32, 32, 32, 3…
$ bwd_pkts_payload.avg <dbl> 6.4, 6.4, 6.0, 6.0, 6.4, 6.4, 6.4, 6.4, 6.4, …
$ bwd_pkts_payload.std <dbl> 9.555103, 9.555103, 8.689074, 8.689074, 9.555…
$ flow_pkts_payload.avg <dbl> 7.714286, 7.714286, 7.428571, 7.428571, 7.714…
$ flow_pkts_payload.std <dbl> 11.61848, 11.61848, 11.22987, 11.22987, 11.61…
$ fwd_iat.min <dbl> 761.9858, 247.0016, 283.9565, 288.9633, 387.9…
$ fwd_iat.max <dbl> 29729183, 29855277, 29842149, 29913775, 29814…
$ fwd_iat.tot <dbl> 32011598, 31883584, 32124053, 31961063, 31902…
$ fwd_iat.avg <dbl> 4001450, 3985448, 4015507, 3995133, 3987795, …
$ fwd_iat.std <dbl> 10403074, 10463456, 10442378, 10482528, 10447…
$ bwd_iat.min <dbl> 4438.87711, 4214.04839, 2456.90346, 3933.9065…
$ bwd_iat.max <dbl> 1511694, 1576436, 1476049, 1551892, 1632083, …
$ bwd_iat.tot <dbl> 2026391, 1876261, 2013770, 1883784, 1935984, …
$ bwd_iat.avg <dbl> 506597.8, 469065.2, 503442.5, 470946.0, 48399…
$ bwd_iat.std <dbl> 680406.1, 741351.7, 660344.4, 724569.3, 76854…
$ flow_iat.min <dbl> 761.98578, 247.00165, 283.95653, 288.96332, 3…
$ flow_iat.max <dbl> 29729183, 29855277, 29842149, 29913775, 29814…
$ flow_iat.tot <dbl> 32011598, 31883584, 32124053, 31961063, 31902…
$ flow_iat.avg <dbl> 2462431, 2452583, 2471081, 2458543, 2454028, …
$ flow_iat.std <dbl> 8199747, 8242459, 8230593, 8257786, 8230584, …
$ payload_bytes_per_second <dbl> 3.373777, 3.387323, 3.237450, 3.253959, 3.385…
$ fwd_subflow_pkts <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
$ bwd_subflow_pkts <dbl> 1.666667, 1.666667, 1.666667, 1.666667, 1.666…
$ fwd_subflow_bytes <dbl> 25.33333, 25.33333, 24.66667, 24.66667, 25.33…
$ bwd_subflow_bytes <dbl> 10.66667, 10.66667, 10.00000, 10.00000, 10.66…
$ fwd_bulk_bytes <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ bwd_bulk_bytes <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ fwd_bulk_packets <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ bwd_bulk_packets <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ fwd_bulk_rate <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ bwd_bulk_rate <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ active.min <dbl> 2282415, 2028307, 2281904, 2047288, 2087657, …
$ active.max <dbl> 2282415, 2028307, 2281904, 2047288, 2087657, …
$ active.tot <dbl> 2282415, 2028307, 2281904, 2047288, 2087657, …
$ active.avg <dbl> 2282415, 2028307, 2281904, 2047288, 2087657, …
$ active.std <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ idle.min <dbl> 29729183, 29855277, 29842149, 29913775, 29814…
$ idle.max <dbl> 29729183, 29855277, 29842149, 29913775, 29814…
$ idle.tot <dbl> 29729183, 29855277, 29842149, 29913775, 29814…
$ idle.avg <dbl> 29729183, 29855277, 29842149, 29913775, 29814…
$ idle.std <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ fwd_init_window_size <dbl> 64240, 64240, 64240, 64240, 64240, 64240, 642…
$ bwd_init_window_size <dbl> 26847, 26847, 26847, 26847, 26847, 26847, 268…
$ fwd_last_window_size <dbl> 502, 502, 502, 502, 502, 502, 502, 502, 502, …
$ Attack_type <chr> "MQTT_Publish", "MQTT_Publish", "MQTT_Publish…
id.orig_p id.resp_p proto service
Min. : 0 Min. : 0 Length:123117 Length:123117
1st Qu.:17702 1st Qu.: 21 Class :character Class :character
Median :37221 Median : 21 Mode :character Mode :character
Mean :34639 Mean : 1014
3rd Qu.:50971 3rd Qu.: 21
Max. :65535 Max. :65389
flow_duration fwd_pkts_tot bwd_pkts_tot fwd_data_pkts_tot
Min. : 0.00 Min. : 0.000 Min. : 0.00 Min. : 0.000
1st Qu.: 0.00 1st Qu.: 1.000 1st Qu.: 1.00 1st Qu.: 1.000
Median : 0.00 Median : 1.000 Median : 1.00 Median : 1.000
Mean : 3.81 Mean : 2.269 Mean : 1.91 Mean : 1.471
3rd Qu.: 0.00 3rd Qu.: 1.000 3rd Qu.: 1.00 3rd Qu.: 1.000
Max. :21728.34 Max. :4345.000 Max. :10112.00 Max. :4345.000
bwd_data_pkts_tot fwd_pkts_per_sec bwd_pkts_per_sec flow_pkts_per_sec
Min. : 0.00 Min. : 0.0 Min. : 0.0 Min. : 0.0
1st Qu.: 0.00 1st Qu.: 74.5 1st Qu.: 72.9 1st Qu.: 149.1
Median : 0.00 Median : 246723.8 Median : 246723.8 Median : 493447.5
Mean : 0.82 Mean : 351806.3 Mean : 351762.0 Mean : 703568.3
3rd Qu.: 0.00 3rd Qu.: 524288.0 3rd Qu.: 524288.0 3rd Qu.:1048576.0
Max. :10105.00 Max. :1048576.0 Max. :1048576.0 Max. :2097152.0
down_up_ratio fwd_header_size_tot fwd_header_size_min fwd_header_size_max
Min. :0.0000 Min. : 0.00 Min. : 0.00 Min. : 0.00
1st Qu.:1.0000 1st Qu.: 20.00 1st Qu.:20.00 1st Qu.:20.00
Median :1.0000 Median : 20.00 Median :20.00 Median :20.00
Mean :0.8546 Mean : 53.89 Mean :19.78 Mean :20.65
3rd Qu.:1.0000 3rd Qu.: 20.00 3rd Qu.:20.00 3rd Qu.:20.00
Max. :6.0879 Max. :69296.00 Max. :44.00 Max. :52.00
bwd_header_size_tot bwd_header_size_min bwd_header_size_max
Min. : 0.0 Min. : 0.0 Min. : 0.00
1st Qu.: 20.0 1st Qu.:20.0 1st Qu.:20.00
Median : 20.0 Median :20.0 Median :20.00
Mean : 46.6 Mean :17.7 Mean :18.43
3rd Qu.: 20.0 3rd Qu.:20.0 3rd Qu.:20.00
Max. :323592.0 Max. :40.0 Max. :44.00
flow_FIN_flag_count flow_SYN_flag_count flow_RST_flag_count fwd_PSH_flag_count
Min. : 0.0000 Min. :0.0000 Min. : 0.0000 Min. : 0.0000
1st Qu.: 0.0000 1st Qu.:1.0000 1st Qu.: 1.0000 1st Qu.: 0.0000
Median : 0.0000 Median :1.0000 Median : 1.0000 Median : 0.0000
Mean : 0.1156 Mean :0.9509 Mean : 0.7965 Mean : 0.3513
3rd Qu.: 0.0000 3rd Qu.:1.0000 3rd Qu.: 1.0000 3rd Qu.: 0.0000
Max. :10.0000 Max. :8.0000 Max. :10.0000 Max. :864.0000
bwd_PSH_flag_count flow_ACK_flag_count fwd_URG_flag_count
Min. : 0.0000 Min. : 0.000 Min. :0.00000
1st Qu.: 0.0000 1st Qu.: 1.000 1st Qu.:0.00000
Median : 0.0000 Median : 1.000 Median :0.00000
Mean : 0.3936 Mean : 2.678 Mean :0.01629
3rd Qu.: 0.0000 3rd Qu.: 1.000 3rd Qu.:0.00000
Max. :1446.0000 Max. :11772.000 Max. :1.00000
fwd_pkts_payload.min fwd_pkts_payload.max fwd_pkts_payload.avg
Min. : 0.00 Min. : 0.0 Min. : 0.0
1st Qu.: 120.00 1st Qu.: 120.0 1st Qu.: 120.0
Median : 120.00 Median : 120.0 Median : 120.0
Mean : 96.26 Mean : 120.7 Mean : 100.5
3rd Qu.: 120.00 3rd Qu.: 120.0 3rd Qu.: 120.0
Max. :1097.00 Max. :1420.0 Max. :1319.4
fwd_pkts_payload.std bwd_pkts_payload.min bwd_pkts_payload.max
Min. : 0.000 Min. : 0.000 Min. : 0.00
1st Qu.: 0.000 1st Qu.: 0.000 1st Qu.: 0.00
Median : 0.000 Median : 0.000 Median : 0.00
Mean : 8.108 Mean : 3.817 Mean : 52.41
3rd Qu.: 0.000 3rd Qu.: 0.000 3rd Qu.: 0.00
Max. :731.579 Max. :1357.000 Max. :5124.00
bwd_pkts_payload.tot bwd_pkts_payload.avg bwd_pkts_payload.std
Min. : 0 Min. : 0.00 Min. : 0.00
1st Qu.: 0 1st Qu.: 0.00 1st Qu.: 0.00
Median : 0 Median : 0.00 Median : 0.00
Mean : 513 Mean : 18.79 Mean : 20.55
3rd Qu.: 0 3rd Qu.: 0.00 3rd Qu.: 0.00
Max. :13610415 Max. :1457.05 Max. :1506.01
flow_pkts_payload.avg flow_pkts_payload.std fwd_iat.min
Min. : 0.00 Min. : 0.00 Min. : 0
1st Qu.: 60.00 1st Qu.: 50.22 1st Qu.: 0
Median : 60.00 Median : 84.85 Median : 0
Mean : 65.01 Mean : 76.04 Mean : 8843
3rd Qu.: 60.00 3rd Qu.: 84.85 3rd Qu.: 0
Max. :1156.08 Max. :924.65 Max. :300252571
fwd_iat.max fwd_iat.tot fwd_iat.avg
Min. : 0 Min. :0.000e+00 Min. : 0
1st Qu.: 0 1st Qu.:0.000e+00 1st Qu.: 0
Median : 0 Median :0.000e+00 Median : 0
Mean : 1721566 Mean :3.780e+06 Mean : 237357
3rd Qu.: 0 3rd Qu.:0.000e+00 3rd Qu.: 0
Max. :300252571 Max. :2.173e+10 Max. :300252571
fwd_iat.std bwd_iat.min bwd_iat.max bwd_iat.tot
Min. : 0 Min. : 0 Min. : 0 Min. :0.000e+00
1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.:0.000e+00
Median : 0 Median : 0 Median : 0 Median :0.000e+00
Mean : 577557 Mean : 3765 Mean : 407727 Mean :1.780e+06
3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.:0.000e+00
Max. :212296532 Max. :43196220 Max. :300028179 Max. :1.876e+10
bwd_iat.avg bwd_iat.std flow_iat.min flow_iat.max
Min. : 0 Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 0 1st Qu.: 1 1st Qu.: 1
Median : 0 Median : 0 Median : 4 Median : 4
Mean : 87652 Mean : 147480 Mean : 4283 Mean : 1725999
3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 5 3rd Qu.: 5
Max. :150148934 Max. :211961260 Max. :43510042 Max. :299999988
flow_iat.tot flow_iat.avg flow_iat.std
Min. :0.000e+00 Min. : 0 Min. : 0
1st Qu.:1.000e+00 1st Qu.: 1 1st Qu.: 0
Median :4.000e+00 Median : 4 Median : 0
Mean :3.811e+06 Mean : 139654 Mean : 450136
3rd Qu.:5.000e+00 3rd Qu.: 5 3rd Qu.: 0
Max. :2.173e+10 Max. :72835758 Max. :134122073
payload_bytes_per_second fwd_subflow_pkts bwd_subflow_pkts
Min. : 0 Min. : 0.000 Min. : 0.000
1st Qu.: 2581 1st Qu.: 1.000 1st Qu.: 1.000
Median : 29606852 Median : 1.000 Median : 1.000
Mean : 41053452 Mean : 1.552 Mean : 1.338
3rd Qu.: 55924053 3rd Qu.: 1.000 3rd Qu.: 1.000
Max. :125829120 Max. :276.833 Max. :1685.333
fwd_subflow_bytes bwd_subflow_bytes fwd_bulk_bytes bwd_bulk_bytes
Min. : 0.0 Min. : 0.0 Min. : 0.0 Min. : 0
1st Qu.: 120.0 1st Qu.: 0.0 1st Qu.: 0.0 1st Qu.: 0
Median : 120.0 Median : 0.0 Median : 0.0 Median : 0
Mean : 136.5 Mean : 217.5 Mean : 19.2 Mean : 155
3rd Qu.: 120.0 3rd Qu.: 0.0 3rd Qu.: 0.0 3rd Qu.: 0
Max. :52067.8 Max. :2268402.5 Max. :465095.0 Max. :6805208
fwd_bulk_packets bwd_bulk_packets fwd_bulk_rate bwd_bulk_rate
Min. : 0.0000 Min. : 0.000 Min. : 0 Min. : 0
1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 0 1st Qu.: 0
Median : 0.0000 Median : 0.000 Median : 0 Median : 0
Mean : 0.0241 Mean : 0.131 Mean : 3836 Mean : 48415
3rd Qu.: 0.0000 3rd Qu.: 0.000 3rd Qu.: 0 3rd Qu.: 0
Max. :343.0000 Max. :5052.500 Max. :46336283 Max. :28300874
active.min active.max active.tot
Min. : 0 Min. : 0 Min. :0.000e+00
1st Qu.: 1 1st Qu.: 1 1st Qu.:1.000e+00
Median : 4 Median : 4 Median :4.000e+00
Mean : 133155 Mean : 178590 Mean :2.929e+05
3rd Qu.: 5 3rd Qu.: 5 3rd Qu.:5.000e+00
Max. :312507974 Max. :848097909 Max. :2.945e+09
active.avg active.std idle.min
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 1 1st Qu.: 0 1st Qu.: 0
Median : 4 Median : 0 Median : 0
Mean : 148135 Mean : 23536 Mean : 1616655
3rd Qu.: 5 3rd Qu.: 0 3rd Qu.: 0
Max. :437493062 Max. :477486236 Max. :299999988
idle.max idle.tot idle.avg
Min. : 0 Min. :0.000e+00 Min. : 0
1st Qu.: 0 1st Qu.:0.000e+00 1st Qu.: 0
Median : 0 Median :0.000e+00 Median : 0
Mean : 1701956 Mean :3.518e+06 Mean : 1664985
3rd Qu.: 0 3rd Qu.:0.000e+00 3rd Qu.: 0
Max. :299999988 Max. :2.097e+10 Max. :299999988
idle.std fwd_init_window_size bwd_init_window_size
Min. : 0 Min. : 0 Min. : 0
1st Qu.: 0 1st Qu.: 64 1st Qu.: 0
Median : 0 Median : 64 Median : 0
Mean : 45502 Mean : 6119 Mean : 2740
3rd Qu.: 0 3rd Qu.: 64 3rd Qu.: 0
Max. :120802871 Max. :65535 Max. :65535
fwd_last_window_size Attack_type
Min. : 0.0 Length:123117
1st Qu.: 64.0 Class :character
Median : 64.0 Mode :character
Mean : 751.6
3rd Qu.: 64.0
Max. :65535.0
Description
The dataset consists of 123,117 rows and 77 columns, capturing network traffic flow data. Key features include:
Network Identifiers: Columns like id.orig_p and id.resp_p capture the originating and responding port numbers.
Protocol and Service Information: Columns such as proto (protocol) and service (e.g., MQTT) identify the communication protocol and services in use.
Traffic Statistics: These include metrics like packet counts (fwd_pkts_tot, bwd_pkts_tot), packet rates (fwd_pkts_per_sec), and header sizes.
Flow Characteristics: Features like flow duration (flow_duration) and TCP flag counts such as flow_FIN_flag_count, flow_ACK_flag_count, flow_SYN_flag_count, and flow_RST_flag_count capture communication patterns.
Attack Type: The Attack_type column labels the type of attack or event detected (e.g., MQTT_Publish).
Payload Information: These variables describe the size of the packets flowing through the network during an attack versus normal traffic, for example fwd_pkts_payload.avg, fwd_pkts_payload.min, and fwd_pkts_payload.max, which we expect to be higher during an attack.
Bandwidth Information: The amount of data flowing through the IoT infrastructure varies with the type of attack. For example, a flooding denial-of-service attack pushes far more traffic than normal operation, whereas DDoS Slowloris holds many connections open at a low rate. The following variables carry the bandwidth information: fwd_pkts_tot, bwd_pkts_tot, fwd_data_pkts_tot, bwd_data_pkts_tot, fwd_pkts_per_sec, bwd_pkts_per_sec, flow_pkts_per_sec.
Inter-arrival time information: Variables like fwd_iat.min, fwd_iat.avg, and flow_iat.min measure the time between consecutive packets. They relate to the payload information: we expect inter-arrival times (IAT) to grow as payloads get larger.
Idle vs In-use information: active.avg and idle.avg indicate whether an IoT device is forwarding network traffic or sitting idle.
The dataset appears to be useful for studying network behavior, identifying attacks, and analyzing flow-based communication statistics.
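Several of the rate features listed above appear to be simple functions of the raw counts. The sketch below recomputes a few of them for the first row of the glimpse output, as a sanity check on how the columns relate. It assumes that flow_iat.tot is recorded in microseconds and that the forward payload total can be recovered as average times packet count; neither assumption is confirmed by the dataset documentation quoted here.

```python
# Values copied from the first row of the glimpse output.
first_row = {
    "fwd_pkts_tot": 9,
    "bwd_pkts_tot": 5,
    "flow_iat.tot": 32011598,        # assumed to be microseconds
    "fwd_pkts_payload.avg": 8.444444,
    "bwd_pkts_payload.tot": 32,
}

def derived_rates(row):
    """Recompute rate-style features from raw counts (illustrative only)."""
    # Assumption: total inter-arrival time in microseconds equals the flow duration.
    duration_s = row["flow_iat.tot"] / 1_000_000
    fwd, bwd = row["fwd_pkts_tot"], row["bwd_pkts_tot"]
    # Assumption: forward payload total = average payload * number of forward packets.
    payload_bytes = round(row["fwd_pkts_payload.avg"] * fwd) + row["bwd_pkts_payload.tot"]
    return {
        "fwd_pkts_per_sec": fwd / duration_s,
        "bwd_pkts_per_sec": bwd / duration_s,
        "flow_pkts_per_sec": (fwd + bwd) / duration_s,
        "down_up_ratio": bwd / fwd,
        "payload_bytes_per_second": payload_bytes / duration_s,
    }

rates = derived_rates(first_row)
for name, value in rates.items():
    print(f"{name}: {value:.6f}")
```

For this row the recomputed values agree with the glimpse output to about five decimal places, which suggests the rate columns carry little information beyond the counts and the duration.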
Source of the data
The RT-IoT2022 dataset, available from the UCI Machine Learning Repository, is designed for research on detecting attacks in IoT (Internet of Things) systems. It contains network flow data from various IoT devices and captures both normal and malicious traffic, making it valuable for studying cybersecurity in IoT environments. The dataset includes features such as packet counts, traffic flow statistics, and communication protocols, which are essential for intrusion detection and anomaly analysis in smart systems. Researchers often use it to train machine learning models to detect cyberattacks.
Dataset Generation
The RT-IoT2022 dataset was created specifically to train and test an intrusion detection system (IDS). It comprises normal and attack traffic captured from real IoT devices such as ThingSpeak-LED, MQTT-Temp, Amazon Alexa, and Wipro Bulb. The authors used a router to connect both the victim IoT devices and the attacker devices, and captured the network traffic with the open-source tool Wireshark, which recorded the traces and saved them as PCAP files.
Attack Simulation:
SSH Brute-Force Attack: Metasploit modules were used to launch SSH brute-force attacks after scanning for open ports with Nmap.
DDoS Attack: The Hping3 tool from Kali Linux was used to generate DDoS attacks, transmitting thousands of packets to simulate high traffic.
Feature Engineering: The collected PCAP files were processed with the CICFlowMeter tool, which converts network traffic into bidirectional flow features for analysis. Irrelevant information such as source and destination addresses was removed, and categorical features were numerically encoded to prevent overfitting. This method produced a realistic and comprehensive dataset covering both benign and malicious IoT traffic, which was critical for developing and testing the quantized autoencoder (QAE) IDS model.
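To make the flow-feature step concrete, the following sketch shows how inter-arrival time statistics of the kind found in the fwd_iat.* and bwd_iat.* columns could be derived from packet timestamps. It is not CICFlowMeter's implementation: the function name iat_stats, the toy timestamps, the use of the sample standard deviation, and the choice to report zeros for flows with fewer than two packets are all assumptions made for illustration.

```python
import statistics

def iat_stats(timestamps_us):
    """Summarise gaps between consecutive packet timestamps (microseconds)."""
    ordered = sorted(timestamps_us)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        # A flow with fewer than two packets has no inter-arrival time.
        return {"min": 0, "max": 0, "tot": 0, "avg": 0, "std": 0}
    return {
        "min": min(gaps),
        "max": max(gaps),
        "tot": sum(gaps),
        "avg": sum(gaps) / len(gaps),
        # Assumption: sample standard deviation; the tool's formula is not stated here.
        "std": statistics.stdev(gaps) if len(gaps) > 1 else 0,
    }

# Hypothetical arrival times for one direction of a single flow.
toy_timestamps = [0, 100, 300, 600]
stats = iat_stats(toy_timestamps)
print(stats)
```

Under this reading, a flow with n packets in one direction has n - 1 gaps. That is consistent with the first glimpse row, where fwd_iat.avg is roughly fwd_iat.tot divided by 8 for 9 forward packets.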
Questions
Our project will address the following questions:
Which protocols, services, and port numbers are used in the different attack scenarios, and how can that knowledge help prevent future network attacks?
How do the different types of attack show unique patterns across bandwidth, inter-arrival time, payload, and flow characteristics? Do these patterns distinguish reliably between attacks?
Which combinations of dimensions characterize the different types of attack?
Analysis plan
Question 1:
- The variables proto, service, and id.resp_p will be compared across attack types and against normal traffic in which legitimate devices communicate over the same protocol, service, and port number.
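One possible shape for this comparison is a count of each (proto, service, id.resp_p) combination per Attack_type value. The sketch below uses a handful of toy rows: the first two mirror the glimpse output, while the remaining rows and their label names are hypothetical placeholders, not values taken from the dataset.

```python
from collections import Counter, defaultdict

# Toy rows: the MQTT_Publish rows mirror the glimpse output;
# the other rows and their labels are hypothetical.
rows = [
    {"proto": "tcp", "service": "mqtt", "id.resp_p": 1883, "Attack_type": "MQTT_Publish"},
    {"proto": "tcp", "service": "mqtt", "id.resp_p": 1883, "Attack_type": "MQTT_Publish"},
    {"proto": "tcp", "service": "http", "id.resp_p": 80, "Attack_type": "DDOS_Slowloris"},
    {"proto": "tcp", "service": "ssh", "id.resp_p": 22, "Attack_type": "SSH_Brute_Force"},
]

def combos_by_attack(records):
    """Count (protocol, service, responder port) combinations per attack label."""
    counts = defaultdict(Counter)
    for record in records:
        key = (record["proto"], record["service"], record["id.resp_p"])
        counts[record["Attack_type"]][key] += 1
    return counts

combos = combos_by_attack(rows)
for attack, counter in combos.items():
    # most_common lists the dominant combinations for each label first.
    print(attack, counter.most_common(3))
```

A combination that dominates one label but is rare under all others would be a candidate answer to Question 1, while a combination shared by attack and normal labels would not separate them on its own.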
Question 2:
- We will determine the relationship between attack type and the bandwidth variables, e.g. fwd_pkts_tot, bwd_pkts_tot, fwd_data_pkts_tot, bwd_data_pkts_tot, fwd_pkts_per_sec, bwd_pkts_per_sec, and flow_pkts_per_sec. This will show how bandwidth behaves during an attack.
- We will use inter-arrival time variables like fwd_iat.min, fwd_iat.avg, and flow_iat.min to draw the same comparison between attack and normal operation.
- Every attack type exhibits different payload behaviour. We will use the variables fwd_pkts_payload.avg, fwd_pkts_payload.min, and fwd_pkts_payload.max to identify extremely large packets and empty packets.
- A SYN flood overwhelms the server with TCP SYN messages. Using the flow characteristics, we will compare the number of TCP SYN messages with the number of other TCP messages for DDoS Slowloris and the other attack types.
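The flag comparison in this plan could be expressed as a per-flow SYN share, summarised by attack type with a median. The sketch below is one way to do that under stated assumptions: the helper names syn_share and median_by_attack are illustrative, the first row uses the flag counts from the glimpse output, and the rows labelled Hypothetical_SYN_Flood are invented for the example.

```python
import statistics
from collections import defaultdict

# Flow-level flag counters; the per-direction PSH and URG counts are left out.
FLAG_COLUMNS = [
    "flow_FIN_flag_count",
    "flow_SYN_flag_count",
    "flow_RST_flag_count",
    "flow_ACK_flag_count",
]

def syn_share(row):
    """Fraction of the counted TCP flags in a flow that are SYN."""
    total = sum(row[column] for column in FLAG_COLUMNS)
    return row["flow_SYN_flag_count"] / total if total else 0.0

def median_by_attack(rows, value_fn):
    """Group rows by Attack_type and take the median of value_fn(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Attack_type"]].append(value_fn(row))
    return {attack: statistics.median(values) for attack, values in groups.items()}

flows = [
    # Flag counts copied from the first row of the glimpse output.
    {"Attack_type": "MQTT_Publish", "flow_FIN_flag_count": 0,
     "flow_SYN_flag_count": 2, "flow_RST_flag_count": 1, "flow_ACK_flag_count": 13},
    # Hypothetical flows, invented for illustration.
    {"Attack_type": "Hypothetical_SYN_Flood", "flow_FIN_flag_count": 0,
     "flow_SYN_flag_count": 1, "flow_RST_flag_count": 1, "flow_ACK_flag_count": 0},
    {"Attack_type": "Hypothetical_SYN_Flood", "flow_FIN_flag_count": 0,
     "flow_SYN_flag_count": 1, "flow_RST_flag_count": 0, "flow_ACK_flag_count": 0},
]

syn_medians = median_by_attack(flows, syn_share)
print(syn_medians)
```

The same median_by_attack helper could be reused for the bandwidth, inter-arrival time, and payload variables by passing a different value function. The median is used rather than the mean because the summary output shows heavily skewed distributions.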
Question 3:
- By comparing the different variables across attack scenarios, we will determine which dimensions, and how many, are affected during attacks.
Ethical concerns
Our dataset does not raise ethical concerns because it contains no identifying information about any organization's internal network. We are visualising and analysing the flows without source and destination addresses, which has its pros and cons.