Mirror of https://github.com/20kaushik02/real-time-traffic-analysis-clickhouse.git (synced 2025-12-06 07:54:07 +00:00)

Commit 63c0ceba60 (parent 1549c39325): final submission

CSE512 Final Report.pdf: binary file, not shown

Demo_Video.mp4: binary file, not shown

README.md (35 changes)

@@ -1 +1,36 @@

# Real-time analytics of Internet traffic flow data
## Download the dataset

- The full preprocessed dataset (1.4 GB) is hosted [here](https://tmp.knravish.me/512_proj/1M_sample_2023_10_01-2023_10_31.csv)
- Place this file in the `preprocessing` directory
- For testing purposes, you can instead use the sample CSV, which has 10k records from each day; to do so, change the bind path in the Compose file

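The download step above can also be scripted. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the script is run from the repository root so that `preprocessing` resolves correctly.

```python
import shutil
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# Dataset URL as given in the list above.
DATASET_URL = "https://tmp.knravish.me/512_proj/1M_sample_2023_10_01-2023_10_31.csv"


def dataset_destination(url: str, directory: str = "preprocessing") -> Path:
    """Return <directory>/<file name taken from the URL path>."""
    return Path(directory) / PurePosixPath(urlparse(url).path).name


def download_dataset(url: str = DATASET_URL, directory: str = "preprocessing") -> Path:
    """Stream the CSV to disk so the 1.4 GB file is never held in memory."""
    target = dataset_destination(url, directory)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Calling `download_dataset()` saves the file under `preprocessing/` and returns its path.
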
## To run the project

- From the `scripts` directory:
  - Run `deploy.ps1 -M` on Windows
  - Run `deploy.sh -M` on Linux/macOS (add `-S` if `sudo` is needed for Docker)
  - See the `README` in `scripts` for more details
- This sets up the whole stack

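If you prefer a single cross-platform entry point, the choice between the two deploy scripts could be wrapped as below. This is only a sketch under assumptions: the helper names are hypothetical, launching `deploy.ps1` through `powershell -File` is an assumed invocation, and only the `-M` and `-S` flags come from the list above.

```python
import platform
import subprocess


def deploy_command(system: str, use_sudo: bool = False) -> list[str]:
    """Build the deploy command for an OS name as returned by platform.system()."""
    if system == "Windows":
        # Assumption: the script is launched through the PowerShell executable.
        return ["powershell", "-File", "deploy.ps1", "-M"]
    command = ["./deploy.sh", "-M"]
    if use_sudo:
        # -S is the flag the README names for "sudo needed for docker".
        command.append("-S")
    return command


def run_deploy(use_sudo: bool = False, scripts_dir: str = "scripts") -> None:
    """Run the deploy script from the scripts directory; raises if it fails."""
    subprocess.run(deploy_command(platform.system(), use_sudo), cwd=scripts_dir, check=True)
```

`run_deploy(use_sudo=True)` corresponds to running `deploy.sh -M -S` from `scripts` on Linux/macOS.
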
### Access the UI

- The Grafana web interface is located at `http://localhost:7602`
- Login:
  - Username: `thewebfarm`
  - Password: `mrafbeweht`
- Go to `Dashboards` > `Internet traffic capture analysis`

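Before logging in, it can help to confirm that Grafana has finished starting. The sketch below polls Grafana's `/api/health` endpoint at the address given above; the helper names are hypothetical, and the check assumes the endpoint reports `"database": "ok"` when healthy.

```python
import json
import urllib.request

GRAFANA_URL = "http://localhost:7602"  # address from the list above


def health_url(base_url: str = GRAFANA_URL) -> str:
    """Return the health-check URL for a Grafana base URL."""
    return base_url.rstrip("/") + "/api/health"


def database_ok(payload: bytes) -> bool:
    """Interpret a health response body; healthy instances report "database": "ok"."""
    return json.loads(payload).get("database") == "ok"


def grafana_is_up(base_url: str = GRAFANA_URL, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers 200 with a healthy body."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            return response.status == 200 and database_ok(response.read())
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return False
```
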
### To run the shard creation and scaling script

- From the `scripts` directory:
  - Install dependencies: `python3 -m pip install -r ../clickhouse/update_config_scripts/requirements.txt`
  - Run `python3 ../clickhouse/update_config_scripts/update_trigger.py`
- The script checks resource utilization every 2 minutes and, based on it, creates a new shard with two server nodes

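To illustrate the behaviour described above, here is a minimal sketch of a check-and-scale loop. It is not the code in `update_trigger.py`: the CPU threshold, the node naming scheme, and every function name are assumptions made for illustration. Only the 2-minute interval and the two nodes per shard come from the description.

```python
import time
from typing import Callable, Optional

CHECK_INTERVAL_SECONDS = 120   # "every 2 minutes", from the description above
NODES_PER_SHARD = 2            # two server nodes per new shard, from the description above
CPU_THRESHOLD_PERCENT = 80.0   # hypothetical: no threshold is stated in the README


def needs_new_shard(cpu_percent: float, threshold: float = CPU_THRESHOLD_PERCENT) -> bool:
    """Decide whether utilization is high enough to add a shard."""
    return cpu_percent >= threshold


def new_shard_nodes(existing_shards: int) -> list[str]:
    """Hypothetical naming scheme for the nodes of the next shard."""
    shard = existing_shards + 1
    return [f"clickhouse-shard{shard}-node{n}" for n in range(1, NODES_PER_SHARD + 1)]


def run(
    read_cpu_percent: Callable[[], float],
    add_nodes: Callable[[list[str]], None],
    shards: int,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll utilization; add one shard per check that exceeds the threshold."""
    checks = 0
    while max_checks is None or checks < max_checks:
        if needs_new_shard(read_cpu_percent()):
            add_nodes(new_shard_nodes(shards))
            shards += 1
        checks += 1
        sleep(CHECK_INTERVAL_SECONDS)
    return shards
```

Passing the utilization reader and the node-creation step as callables keeps the loop independent of how metrics are gathered and how containers are started.
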
## Limitations

- For multi-node deployments using Docker Swarm, the manager node must run on Linux with a standalone Docker installation (i.e. outside Docker Desktop), due to limitations in the Docker Swarm engine

architecture_diagram.png: binary file, not shown (size: 120 KiB)