Beijing Taxi Monitoring System
Real-time processing of 10,000+ taxis with Apache Flink, Kafka, and Redis
Real-time processing of 10,000+ taxis with Apache Flink, Kafka, and Redis
Real-time Taxi Monitoring Platform
Processing GPS data from 10,357 taxis across Beijing with Apache Flink
Each taxi has a separate file with timestamped GPS coordinates:
Processes GPS events with sub-second latency using Apache Flink
Computes real-time speed between consecutive GPS points
Monitors taxis within 15km radius of Beijing center
End-to-End Data Pipeline
How GPS data flows through the system in real-time
Python script processing GPS files
Distributed event streaming
Stream processing engine
In-memory data store
Real-time visualization
Apache Flink Job Details
Real-time computation of taxi metrics and alerts
Calculates speed between consecutive GPS points using Haversine formula and time difference
Maintains running average of speed for each taxi using ValueState
Accumulates total distance traveled by each taxi
Detects when taxis leave monitored area (15km from Beijing center)
State automatically expires after 5 minutes of inactivity
Redis Data Structures
Optimized for low-latency real-time access
Stored as Redis Hashes with TTL (5 minutes)
Stored in Redis Hashes for fast access
Sorted Set with timestamp as score
Development & Operations Commands
Essential commands for building, deploying and monitoring the system
Build the Java project using a temporary Maven container.
docker run -it --rm ^
-v "%cd%\taxi_locations\consumer\taxi_flink:/app" ^
-w /app ^
maven:3.8.6-eclipse-temurin-17 ^
mvn clean package
docker run -it --rm \
-v "$(pwd)/taxi_locations/consumer/taxi_flink:/app" \
-w /app \
maven:3.8.6-eclipse-temurin-17 \
mvn clean package
Pseudo Implementation of Real-time Monitoring Dashboard
Interactive visualization of taxi movements and alerts
Beijing City Map