3.3 KiB
3.3 KiB
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Build and Run Commands
# Install dependencies
pip install -r requirements.txt
# Run the Streamlit application
streamlit run app.py
# Run tests (if tests/ directory exists)
pytest -q
Architecture Overview
This is a Streamlit-based traffic safety analysis system with a three-layer architecture:
Layer Structure
app.py (Main Entry & UI Orchestration)
↓
ui_sections/ (UI Components - render_* functions)
↓
services/ (Business Logic)
↓
config/settings.py (Configuration)
Data Flow
- Input: Excel files uploaded via Streamlit sidebar (事故数据 + 策略数据)
- Processing:
services/io.pyhandles loading, column aliasing, and cleaning - Aggregation: Data aggregated to daily time series with
aggregate_daily_data() - Analysis: Various services process the aggregated data
- Output: Interactive Plotly charts, CSV exports, AI-generated reports
Key Services
| Module | Purpose |
|---|---|
services/io.py |
Data loading, column normalization (COLUMN_ALIASES), region inference |
services/forecast.py |
ARIMA grid search, KNN counterfactual, GLM/SVR extrapolation |
services/strategy.py |
Strategy effectiveness evaluation (F1/F2 metrics, safety states) |
services/hotspot.py |
Location extraction, risk scoring, strategy generation |
services/metrics.py |
Model evaluation metrics (RMSE, MAE) |
UI Sections
Each tab in the app corresponds to a render_* function in ui_sections/:
render_overview: KPI dashboard and time series visualizationrender_forecast: Multi-model prediction comparisonrender_model_eval: Model accuracy metricsrender_strategy_eval: Single strategy evaluationrender_hotspot: Accident hotspot analysis with risk levels
Session State Pattern
The app uses st.session_state['processed_data'] to persist:
- Loaded DataFrames (
combined_city,combined_by_region,accident_records) - Filter state (
region_sel,date_range,strat_filter) - Derived metadata (
all_regions,all_strategy_types,min_date,max_date)
AI Integration
Uses DeepSeek API (OpenAI-compatible) for generating analysis reports. Configuration in sidebar:
- Base URL:
https://api.deepseek.com - Model:
deepseek-chat - Streaming response rendered incrementally
Coding Conventions
- Python 3.8+ with type hints (
from __future__ import annotations) - Functions/variables:
snake_case; Classes:PascalCase; Constants:UPPER_SNAKE_CASE - Use
@st.cache_datafor expensive computations - Column aliases defined in
COLUMN_ALIASESdict for flexible Excel input - Prefer pandas vectorization over loops
Data Format Requirements
Accident Data Excel must contain (or aliases of):
事故时间(accident time)所在街道(street/region)事故类型(accident type: 财损/伤人/亡人)
Strategy Data Excel must contain:
发布时间(publish date)交通策略类型(strategy type)
Configuration (config/settings.py)
Key parameters:
ARIMA_P/D/Q: Grid search ranges for ARIMAMIN_PRE_DAYS/MAX_PRE_DAYS: Historical data requirementsANOMALY_CONTAMINATION: Isolation Forest contamination rate