final submission

This commit is contained in:
Kaushik Narayan R 2024-12-03 00:28:17 -07:00
parent 1549c39325
commit 63c0ceba60
5 changed files with 35 additions and 601819 deletions

BIN
CSE512 Final Report.pdf Normal file

Binary file not shown.

BIN
Demo_Video.mp4 Normal file

Binary file not shown.

View File

@ -1 +1,36 @@
# Real-time analytics of Internet traffic flow data
![Architecture diagram](./architecture_diagram.png)
## Download the dataset
- The full preprocessed dataset is hosted [here](https://tmp.knravish.me/512_proj/1M_sample_2023_10_01-2023_10_31.csv) - 1.4GB
- Place this file in the `preprocessing` directory
- For testing purposes, you can use the sample CSV that has 10k records from each day instead, change the bind path in the Compose file
## To run the project
- From the `scripts` directory:
- Run `deploy.ps1 -M` for Windows
- Run `deploy.sh -M` for Linux/macOS (add `-S` if sudo needed for docker)
- See the `README` in `scripts` for more
- This sets up the whole stack
### Access the UI
- The Grafana web interface is located at `http://localhost:7602`
- Login:
- Username: `thewebfarm`
- Password: `mrafbeweht`
- Go to `Dashboards` > `Internet traffic capture analysis`
### To run the shard creation and scaling script
- From the `scripts` directory:
- Install dependencies: `python3 -r ../clickhouse/update_config_scripts/requirements.txt`
- Run `python3 ../clickhouse/update_config_scripts/update_trigger.py`
- This checks every 2 minutes and creates a new shard and two server nodes for it based on resource utilization
## Limitations
- For multi-node deployments using Docker Swarm, the manager node needs to be running on Linux (outside Docker Desktop i.e. standalone Docker installation) due to limitations in the Docker Swarm engine

BIN
architecture_diagram.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 120 KiB

File diff suppressed because it is too large Load Diff