# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build and Run Commands

```bash
# Install dependencies
pip install -r requirements.txt

# Run the Streamlit application
streamlit run app.py

# Run tests (if a tests/ directory exists)
pytest -q
```

## Architecture Overview

This is a Streamlit-based traffic safety analysis system with a layered architecture:

### Layer Structure

```text
app.py (Main Entry & UI Orchestration)
    ↓
ui_sections/ (UI Components - render_* functions)
    ↓
services/ (Business Logic)
    ↓
config/settings.py (Configuration)
```
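
The call chain above can be sketched with hypothetical function bodies; only the module roles and the `render_*` naming convention come from this file, everything else is illustrative:

```python
from __future__ import annotations

# Hypothetical sketch of the layered call chain; real function names
# in this repo may differ except for the render_* convention.

def load_settings() -> dict:
    # config/settings.py: constants consumed by the services layer
    return {"MIN_PRE_DAYS": 30}

def evaluate_strategy(daily_counts: list[int], settings: dict) -> float:
    # services/: pure business logic, no Streamlit calls
    return sum(daily_counts) / max(len(daily_counts), 1)

def render_strategy_eval(daily_counts: list[int]) -> str:
    # ui_sections/: a render_* function that delegates to services
    score = evaluate_strategy(daily_counts, load_settings())
    return f"score={score:.2f}"

# app.py routes the active tab to the matching render_* function
print(render_strategy_eval([1, 2, 3]))  # score=2.00
```

Keeping Streamlit calls out of `services/` is what makes the business logic testable without a running app.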

### Data Flow

  1. Input: Two Excel files uploaded via the Streamlit sidebar: 事故数据 (accident data) and 策略数据 (strategy data)
  2. Processing: services/io.py handles loading, column aliasing, and cleaning
  3. Aggregation: Data aggregated to daily time series with aggregate_daily_data()
  4. Analysis: Various services process the aggregated data
  5. Output: Interactive Plotly charts, CSV exports, AI-generated reports
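
Step 3's daily aggregation can be illustrated with a stdlib-only sketch (the real `aggregate_daily_data()` presumably operates on pandas DataFrames loaded by `services/io.py`; `records` here is illustrative):

```python
from __future__ import annotations
from collections import Counter
from datetime import date

def aggregate_daily_data(timestamps: list[date]) -> dict[date, int]:
    """Count records per calendar day, sorted chronologically.

    Stdlib stand-in for the repo's pandas-based implementation.
    """
    return dict(sorted(Counter(timestamps).items()))

records = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 2)]
print(aggregate_daily_data(records))
# {datetime.date(2024, 1, 1): 2, datetime.date(2024, 1, 2): 1}
```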

## Key Services

| Module | Purpose |
| --- | --- |
| `services/io.py` | Data loading, column normalization (`COLUMN_ALIASES`), region inference |
| `services/forecast.py` | ARIMA grid search, KNN counterfactual, GLM/SVR extrapolation |
| `services/strategy.py` | Strategy effectiveness evaluation (F1/F2 metrics, safety states) |
| `services/hotspot.py` | Location extraction, risk scoring, strategy generation |
| `services/metrics.py` | Model evaluation metrics (RMSE, MAE) |
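
`services/metrics.py` computes RMSE and MAE; the standard formulas, as a stdlib sketch (the repo's actual signatures may differ):

```python
from __future__ import annotations
import math

def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root-mean-square error: sqrt(mean((t - p)^2))."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true: list[float], y_pred: list[float]) -> float:
    """Mean absolute error: mean(|t - p|)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

print(round(rmse([1, 2, 3], [1, 2, 5]), 4))  # 1.1547
print(round(mae([1, 2, 3], [1, 2, 5]), 4))   # 0.6667
```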

## UI Sections

Each tab in the app corresponds to a `render_*` function in `ui_sections/`:

- `render_overview`: KPI dashboard and time series visualization
- `render_forecast`: Multi-model prediction comparison
- `render_model_eval`: Model accuracy metrics
- `render_strategy_eval`: Single strategy evaluation
- `render_hotspot`: Accident hotspot analysis with risk levels

## Session State Pattern

The app uses `st.session_state['processed_data']` to persist:

- Loaded DataFrames (`combined_city`, `combined_by_region`, `accident_records`)
- Filter state (`region_sel`, `date_range`, `strat_filter`)
- Derived metadata (`all_regions`, `all_strategy_types`, `min_date`, `max_date`)
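
The pattern can be sketched with a plain dict standing in for `st.session_state` (key names from the list above; the helper name is hypothetical):

```python
from __future__ import annotations

def ensure_processed_data(session_state: dict) -> dict:
    """Create the 'processed_data' bucket on first run; reuse it afterwards.

    A plain dict stands in for st.session_state here; in the app the same
    lookup survives Streamlit reruns within one browser session.
    """
    if "processed_data" not in session_state:
        session_state["processed_data"] = {
            "combined_city": None,  # loaded DataFrame (placeholder)
            "region_sel": [],       # filter state
            "all_regions": [],      # derived metadata
        }
    return session_state["processed_data"]

state: dict = {}
ensure_processed_data(state)["region_sel"].append("Region A")
print(ensure_processed_data(state)["region_sel"])  # ['Region A']
```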

## AI Integration

Uses the DeepSeek API (OpenAI-compatible) to generate analysis reports. Configuration is entered in the sidebar:

- Base URL: `https://api.deepseek.com`
- Model: `deepseek-chat`
- Streaming response rendered incrementally
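
A sketch of the integration, assuming the `openai` Python SDK; the prompt wording and function names are hypothetical, only the base URL and model name come from this file:

```python
from __future__ import annotations
from typing import Iterator

def build_report_messages(summary: str) -> list[dict]:
    """Assemble the chat prompt (wording here is illustrative)."""
    return [
        {"role": "system", "content": "You are a traffic-safety analyst."},
        {"role": "user", "content": summary},
    ]

def stream_report(summary: str, api_key: str) -> Iterator[str]:
    # Requires the `openai` package; imported lazily so the sketch
    # stays importable without it.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.deepseek.com", api_key=api_key)
    stream = client.chat.completions.create(
        model="deepseek-chat",
        messages=build_report_messages(summary),
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta  # feed to the UI for incremental rendering
```

In the app, something like `st.write_stream(stream_report(...))` (or a manual placeholder loop) would render the chunks as they arrive.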

## Coding Conventions

- Python 3.8+ with type hints (`from __future__ import annotations`)
- Functions/variables: `snake_case`; classes: `PascalCase`; constants: `UPPER_SNAKE_CASE`
- Use `@st.cache_data` for expensive computations
- Column aliases defined in the `COLUMN_ALIASES` dict for flexible Excel input
- Prefer pandas vectorization over loops
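
The alias mechanism can be sketched as follows; the alias pairs below are illustrative, not the repo's actual `COLUMN_ALIASES` dict:

```python
from __future__ import annotations

# Illustrative alias pairs; the real COLUMN_ALIASES lives in
# services/io.py and may map different names.
COLUMN_ALIASES = {
    "时间": "事故时间",
    "街道": "所在街道",
    "类型": "事故类型",
}

def normalize_columns(columns: list[str]) -> list[str]:
    """Map recognized aliases to canonical names; pass others through."""
    return [COLUMN_ALIASES.get(col, col) for col in columns]

print(normalize_columns(["时间", "所在街道", "备注"]))
# ['事故时间', '所在街道', '备注']
```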

## Data Format Requirements

The accident data Excel must contain the following columns (or aliases of them):

- 事故时间 (accident time)
- 所在街道 (street/region)
- 事故类型 (accident type: 财损 property damage / 伤人 injury / 亡人 fatality)

The strategy data Excel must contain:

- 发布时间 (publish date)
- 交通策略类型 (traffic strategy type)
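
A minimal validation sketch based on the required columns above (the validator name is hypothetical; in the app this check would run after `COLUMN_ALIASES` normalization):

```python
from __future__ import annotations

# Required columns taken from the lists above.
REQUIRED_ACCIDENT_COLUMNS = {"事故时间", "所在街道", "事故类型"}
REQUIRED_STRATEGY_COLUMNS = {"发布时间", "交通策略类型"}

def missing_columns(columns: list[str], required: set[str]) -> set[str]:
    """Return required columns absent from an uploaded sheet."""
    return required - set(columns)

print(missing_columns(["事故时间", "所在街道"], REQUIRED_ACCIDENT_COLUMNS))
# {'事故类型'}
```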

## Configuration (`config/settings.py`)

Key parameters:

- `ARIMA_P`/`ARIMA_D`/`ARIMA_Q`: Grid search ranges for ARIMA
- `MIN_PRE_DAYS` / `MAX_PRE_DAYS`: Historical data requirements
- `ANOMALY_CONTAMINATION`: Isolation Forest contamination rate
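
The shape of `config/settings.py` might look like this; the values below are placeholders, not the repo's actual settings:

```python
# Illustrative settings module; values are placeholders.
ARIMA_P = range(0, 3)         # grid-search range for the AR order p
ARIMA_D = range(0, 2)         # grid-search range for the differencing order d
ARIMA_Q = range(0, 3)         # grid-search range for the MA order q
MIN_PRE_DAYS = 30             # minimum pre-period history required
MAX_PRE_DAYS = 180            # cap on pre-period history used
ANOMALY_CONTAMINATION = 0.05  # IsolationForest contamination rate
```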