A technical monitoring product for understanding bird strike risk at operational speed.

Aerorisk is a solo-built end-to-end machine learning project focused on one question: given current weather, recent bird activity, migration pressure, and airport context, which runways and flights are under the most bird strike risk right now?

The frontend is intentionally designed as a working instrument panel. The goal is not to market aviation data, but to make live risk telemetry readable, defensible, and inspectable by someone who wants to understand how the score was formed.

Build ownership

Frontend, backend, data pipeline, feature engineering, model training, and deployment were all built as one integrated system.

The product is aimed at a hiring manager or engineer who wants to see real product thinking paired with practical ML and data-platform execution.

System architecture

Deployment and data flow

┌──────────────────────────┐      ┌─────────────────────────┐
│ Next.js Frontend (Vercel)│ ───▶ │ FastAPI Backend         │
│ SWR polling + detail UI  │ API  │ risk endpoints + cache  │
└──────────────────────────┘      └──────────┬──────────────┘
                                             │
                               hourly pipeline orchestration
                                             │
                 ┌──────────────┬────────────┼─────────────┬──────────────┐
                 │ NOAA METAR   │ eBird      │ BirdCast    │ FAA history  │
                 └──────────────┴────────────┴─────────────┴──────────────┘
                                             │
                                   LightGBM + SHAP cache
                                             │
                                   Supabase / PostgreSQL

Delivery principles

Next.js frontend on Vercel, FastAPI backend on Railway, Supabase/Postgres for persisted scores and source data.

LightGBM predicts a continuous 0–1 risk score using 33 engineered features spanning weather, seasonality, geography, and migration context.

SHAP TreeExplainer runs in the pipeline, not in the request path, so the UI can open detailed reasoning panels without expensive runtime compute.

Frontend polling is SWR-based with stale-while-revalidate semantics so the interface remains useful during transient backend failures.
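The fallback behavior described above can be illustrated with a small sketch. The real frontend relies on the SWR library in the browser; the Python class below is only a synchronous, simplified stand-in for the idea of serving the last good value when a refresh fails. The class name `StaleWhileRevalidate` and its `poll` method are hypothetical and do not come from the project.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class StaleWhileRevalidate(Generic[T]):
    """Hypothetical helper: keep the last good value when a refresh fails."""

    def __init__(self, fetch: Callable[[], T]) -> None:
        self._fetch = fetch
        self._value: Optional[T] = None
        self.last_error: Optional[Exception] = None

    def poll(self) -> Optional[T]:
        """Try to refresh; on failure, return the previously cached value."""
        try:
            self._value = self._fetch()
            self.last_error = None
        except Exception as exc:  # a transient backend failure
            self.last_error = exc
        return self._value

    @property
    def is_stale(self) -> bool:
        """True when the value shown is older than the latest (failed) poll."""
        return self.last_error is not None and self._value is not None
```

A UI built on this pattern can keep rendering the cached score and use `is_stale` to show a "last updated" hint instead of an error screen.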

Modeling approach

Model: LightGBM regression on a calibrated 0–1 risk scale.

Features: 33 engineered variables covering temporal cycles, migration season, airport geography, strike history, live weather, and interaction terms.
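As an illustration of what a "temporal cycle" feature can look like, the sketch below uses the common sine/cosine encoding, which keeps hour 23 close to hour 0 instead of treating them as far apart. Whether Aerorisk encodes its temporal features exactly this way is an assumption; the function name `encode_cycle` is hypothetical.

```python
import math
from typing import Tuple


def encode_cycle(value: float, period: float) -> Tuple[float, float]:
    """Map a cyclic quantity (hour of day, day of year) onto the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def cycle_distance(a: float, b: float, period: float) -> float:
    """Euclidean distance between two encoded values, for comparison only."""
    sa, ca = encode_cycle(a, period)
    sb, cb = encode_cycle(b, period)
    return math.hypot(sa - sb, ca - cb)
```

With a 24-hour period, `cycle_distance(23, 0, 24)` is much smaller than `cycle_distance(12, 0, 24)`, which is the property a tree model cannot recover from the raw hour alone.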

Explainability: SHAP TreeExplainer caches top positive and negative contributors during the hourly pipeline so the frontend can render immediate reasoning panels.
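The caching step can be sketched as follows, assuming the per-feature SHAP values for one prediction are already available as a plain mapping from feature name to contribution. The function name `top_contributors`, the cutoff `k`, and the feature names in the usage note are hypothetical; the actual cache layout is not described in the source.

```python
from typing import Dict, List, Tuple

Contribution = Tuple[str, float]


def top_contributors(
    contribs: Dict[str, float], k: int = 3
) -> Dict[str, List[Contribution]]:
    """Split contributions by sign and keep the k strongest on each side."""
    positive = sorted(
        ((name, v) for name, v in contribs.items() if v > 0),
        key=lambda item: item[1],
        reverse=True,  # largest risk-increasing values first
    )[:k]
    negative = sorted(
        ((name, v) for name, v in contribs.items() if v < 0),
        key=lambda item: item[1],  # most negative values first
    )[:k]
    return {"positive": positive, "negative": negative}
```

For example, `top_contributors({"migration_intensity": 0.12, "visibility": -0.05, "hour_sin": 0.03}, k=1)` keeps `migration_intensity` as the top positive driver and `visibility` as the top negative one. Storing only this small summary per score is what lets a request handler return reasoning without running the explainer.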

Operational framing: runway-level ambient risk informs flight-level overlays, with bird observations and migration intensity acting as live modifiers on top of the baseline model output.
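The source does not give the formula for the live modifiers, so the sketch below is purely an assumed example of the idea: a baseline model score is scaled up by normalized bird-observation and migration signals, then clamped back into the 0–1 range. The function name, the multiplicative form, and both weights are hypothetical.

```python
def apply_live_modifiers(
    baseline: float,
    observation_signal: float,
    migration_signal: float,
    observation_weight: float = 0.5,  # hypothetical weight, not from the project
    migration_weight: float = 0.5,    # hypothetical weight, not from the project
) -> float:
    """Scale a baseline risk score by live signals and clamp to [0, 1]."""
    # Signals are assumed to be normalized to [0, 1] upstream; clamp defensively.
    obs = min(1.0, max(0.0, observation_signal))
    mig = min(1.0, max(0.0, migration_signal))
    scaled = baseline * (1.0 + observation_weight * obs + migration_weight * mig)
    return min(1.0, max(0.0, scaled))
```

With both signals at zero the baseline passes through unchanged, which keeps the model output interpretable on quiet days.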

Data sources

FAA Wildlife Strike Database

Historical strike reports used to learn airport-specific baseline hazard patterns.

NOAA Aviation Weather

Live METAR observations for visibility, ceiling, wind, temperature, and precipitation.
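To show what extracting one of these fields from a raw report can involve, here is a minimal sketch that parses only the wind group of a METAR string reported in knots. It is not the project's parser; the function name `parse_wind` is hypothetical, the sample report in the usage note is made up for illustration, and reports using other wind units are not handled.

```python
import re
from typing import Dict, Optional

# Wind group: direction (3 digits or VRB), speed, optional gust, unit KT.
WIND_GROUP = re.compile(r"\b(VRB|\d{3})(\d{2,3})(?:G(\d{2,3}))?KT\b")


def parse_wind(metar: str) -> Optional[Dict[str, Optional[int]]]:
    """Return direction, speed, and gust from the first wind group, if any."""
    match = WIND_GROUP.search(metar)
    if match is None:
        return None
    direction, speed, gust = match.groups()
    return {
        "direction_deg": None if direction == "VRB" else int(direction),
        "speed_kt": int(speed),
        "gust_kt": int(gust) if gust else None,
    }
```

For the illustrative string `"KJFK 121651Z 28016G24KT 10SM FEW050 22/12 A3001"`, this yields a direction of 280 degrees, a speed of 16 knots, and gusts of 24 knots.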

eBird (Cornell Lab)

Recent nearby observations used as a real-time bird activity modifier around each airport.
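One way to make "nearby" concrete is a great-circle distance filter around the airport. The sketch below uses the standard haversine formula; the 10 km radius, the function names, and the tuple-based observation format are assumptions for illustration, not details taken from the project.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

LatLon = Tuple[float, float]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_nearby(
    airport: LatLon, observations: Iterable[LatLon], radius_km: float = 10.0
) -> int:
    """Count observations within radius_km of the airport (radius is assumed)."""
    return sum(
        1
        for lat, lon in observations
        if haversine_km(airport[0], airport[1], lat, lon) <= radius_km
    )
```

A count like this, restricted to a recent time window, could then be normalized into the activity modifier described above.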

BirdCast

Migration radar intensity and movement features cached by county for each monitored airport.