mirror of
https://github.com/20kaushik02/real-time-traffic-analysis-clickhouse.git
synced 2026-01-25 08:04:04 +00:00
ip2location data
@@ -1,5 +1,7 @@
# Data filtering, preprocessing and selection for further use
## Traffic data
- IP packet traces are taken [from here](https://mawi.wide.ad.jp/mawi/samplepoint-F/2023/)
- Filtering
  - L4 - Limit to TCP and UDP
@@ -15,6 +17,11 @@
- Packet size - in bytes
- `sample_output.csv` contains a subset of `202310081400.pcap` (~600K packets)
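
The L4 filtering step listed above (limit to TCP and UDP) could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `proto` and `size` keys and the `keep_tcp_udp` helper are hypothetical names, and packets are assumed to be already parsed into dictionaries. Only the protocol numbers (6 for TCP, 17 for UDP) are standard.

```python
# IANA-assigned IP protocol numbers for the two L4 protocols we keep.
TCP, UDP = 6, 17


def keep_tcp_udp(packets):
    """Yield only packets whose L4 protocol is TCP or UDP.

    `packets` is assumed to be an iterable of dicts with a numeric
    "proto" key (hypothetical layout, for illustration only).
    """
    for pkt in packets:
        if pkt["proto"] in (TCP, UDP):
            yield pkt


# Made-up sample records, not taken from the real trace.
sample = [
    {"proto": 6, "size": 60},    # TCP, kept
    {"proto": 1, "size": 84},    # ICMP, dropped
    {"proto": 17, "size": 120},  # UDP, kept
]
filtered = list(keep_tcp_udp(sample))
print(filtered)
```

Using a generator keeps memory use flat, which matters when the input is a full trace rather than a small sample.
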
## IP geolocation database
- This project uses the IP2Location LITE database for [IP geolocation](https://lite.ip2location.com)
- A bit of preprocessing drops the country code and converts each IP address from decimal (integer) format to dotted-string format
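
The decimal-to-dotted conversion described above can be sketched with Python's standard `ipaddress` module. This is an assumption-laden sketch, not the project's actual script: the column order (`ip_from`, `ip_to`, `country_code`, `country_name`) and the helper names `decimal_to_dotted` and `preprocess_row` are hypothetical, and the sample row uses placeholder values.

```python
import ipaddress


def decimal_to_dotted(ip_decimal: int) -> str:
    """Convert an IPv4 address stored as an integer to dotted-quad notation."""
    return str(ipaddress.IPv4Address(ip_decimal))


def preprocess_row(row):
    """Drop the country code and convert both range bounds to dotted strings.

    Assumed input layout (hypothetical): ip_from, ip_to, country_code, country_name.
    """
    ip_from, ip_to, _country_code, country_name = row
    return [
        decimal_to_dotted(int(ip_from)),
        decimal_to_dotted(int(ip_to)),
        country_name,
    ]


# Placeholder row for illustration; not real database content.
print(preprocess_row(["16777216", "16777471", "ZZ", "Exampleland"]))
```

`ipaddress.IPv4Address` raises `AddressValueError` for integers outside the IPv4 range, so malformed rows fail loudly instead of producing a wrong address.
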
# Setting up Kafka
- Download and install Kafka [from here](https://kafka.apache.org/downloads)
- Run each command in its own terminal, from the Kafka installation directory