Compare commits
5 Commits
v1.2.1
...
mian-lates
| Author | SHA1 | Date |
|---|---|---|
| | 2726ba7bfc | |
| | 105a6b5076 | |
| | 28999baf85 | |
| | d94b5c5ba4 | |
| | d8eea8e3a9 | |
36
AGENTS.md
@@ -1,36 +0,0 @@
|
||||
# Repository Guidelines
|
||||
|
||||
## Project Structure & Module Organization
|
||||
- Source: `app.py` (Streamlit UI, data processing, forecasting, anomaly detection, evaluation).
|
||||
- Docs & outputs: `docs/`, `overview_series.html`, `strategy_evaluation_results.csv`.
|
||||
- Samples: `sample/` for example data only; avoid sensitive content.
|
||||
- Meta: `requirements.txt`, `readme.md`, `LICENSE`, `CHANGELOG.md`.
|
||||
|
||||
## Build, Test, and Development Commands
|
||||
- Create env: `python -m venv .venv && source .venv/bin/activate` (or follow conda steps in `readme.md`).
|
||||
- Install deps: `pip install -r requirements.txt`.
|
||||
- Run app: `streamlit run app.py` then open `http://localhost:8501`.
|
||||
- Export artifacts: charts save as HTML (Plotly); forecasts may be written to CSV as noted in `readme.md`.
|
||||
|
||||
## Coding Style & Naming Conventions
|
||||
- Python ≥3.8; 4-space indentation; UTF-8.
|
||||
- Names: functions/variables `snake_case`; classes `PascalCase`; constants `UPPER_SNAKE_CASE`.
|
||||
- Files: keep scope focused; use descriptive output names (e.g., `arima_forecast.csv`).
|
||||
- Data handling: prefer pandas/NumPy vectorization; validate inputs; avoid global state except constants.
|
||||
|
||||
## Testing Guidelines
|
||||
- Framework: pytest (recommended). Place tests under `tests/`.
|
||||
- Naming: `test_<module>.py` and `test_<behavior>()`.
|
||||
- Run: `pytest -q`. Focus on `load_and_clean_data`, aggregation, model selection, and metrics.
|
||||
- Keep tests fast and deterministic; avoid large I/O. Use small DataFrame fixtures.
|
||||
|
||||
## Commit & Pull Request Guidelines
|
||||
- Messages: concise, present tense. Prefixes seen: `modify:`, `Add`, `Update`.
|
||||
- Include scope and reason: e.g., `modify: update requirements for statsmodels`.
|
||||
- PRs: clear description, linked issues, repro steps/screenshots for UI, and notes on any schema or output changes.
|
||||
|
||||
## Security & Configuration Tips
|
||||
- Do not commit real accident data or secrets. Use `sample/` for examples.
|
||||
- Optional envs: `LOG_LEVEL=DEBUG`. Keep any API keys in environment variables, not in code.
|
||||
- Validate Excel column names before processing; handle missing columns/rows defensively.
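
As one way to follow the API-key tip above, the sketch below reads a key from an environment variable and fails early when it is missing. The variable name `AI_API_KEY` and the helper `load_api_key` are hypothetical and are not taken from the repository.

```python
import os


def load_api_key(env_var: str = "AI_API_KEY") -> str:
    """Return the API key stored in `env_var` (hypothetical variable name).

    Raises RuntimeError when the variable is unset or blank, so a missing
    credential is reported before any network call is attempted.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


if __name__ == "__main__":
    os.environ["AI_API_KEY"] = "sk-example"  # demo value only, not a real key
    print(load_api_key())
```

Keeping the lookup in one helper means the UI code never needs a hardcoded default value for the key field.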
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
## [1.1.0] - 2025-08-28
|
||||
|
||||
### Added
|
||||
- Integrated GPT-based analysis for comprehensive traffic safety insights
|
||||
- Integrated AI-based analysis for comprehensive traffic safety insights
|
||||
- Added automated report generation with AI-powered recommendations
|
||||
- Implemented natural language query processing for data exploration
|
||||
- Added export functionality for analysis reports (PDF/CSV formats)
|
||||
@@ -22,7 +22,7 @@
|
||||
- Addressed memory leaks in large dataset processing
|
||||
|
||||
### Documentation
|
||||
- Updated README with new GPT analysis features and usage examples
|
||||
- Updated README with new AI analysis features and usage examples
|
||||
- Added API documentation for extended functionality
|
||||
- Included sample datasets and tutorial guides
|
||||
|
||||
@@ -44,4 +44,4 @@
|
||||
|
||||
### Fixed
|
||||
|
||||
- Resolved session state KeyError.
|
||||
- Resolved session state KeyError.
|
||||
|
||||
99
CLAUDE.md
Normal file
@@ -0,0 +1,99 @@
|
||||
# CLAUDE.md
|
||||
|
||||
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
||||
|
||||
## Build and Run Commands
|
||||
|
||||
```bash
|
||||
# Install dependencies
|
||||
pip install -r requirements.txt
|
||||
|
||||
# Run the Streamlit application
|
||||
streamlit run app.py
|
||||
|
||||
# Run tests (if tests/ directory exists)
|
||||
pytest -q
|
||||
```
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
This is a Streamlit-based traffic safety analysis system with a three-layer architecture:
|
||||
|
||||
### Layer Structure
|
||||
|
||||
```
|
||||
app.py (Main Entry & UI Orchestration)
|
||||
↓
|
||||
ui_sections/ (UI Components - render_* functions)
|
||||
↓
|
||||
services/ (Business Logic)
|
||||
↓
|
||||
config/settings.py (Configuration)
|
||||
```
|
||||
|
||||
### Data Flow
|
||||
|
||||
1. **Input**: Excel files uploaded via Streamlit sidebar (事故数据 + 策略数据)
|
||||
2. **Processing**: `services/io.py` handles loading, column aliasing, and cleaning
|
||||
3. **Aggregation**: Data aggregated to daily time series with `aggregate_daily_data()`
|
||||
4. **Analysis**: Various services process the aggregated data
|
||||
5. **Output**: Interactive Plotly charts, CSV exports, AI-generated reports
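
Step 3 of the data flow can be pictured with a small standard-library sketch. This is not the repository's `aggregate_daily_data()`, which may well operate on pandas DataFrames; the function name `count_per_day` and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import date, datetime, timedelta
from typing import Dict, List


def count_per_day(timestamps: List[datetime]) -> Dict[date, int]:
    """Count events per calendar day, filling days without events with 0."""
    if not timestamps:
        return {}
    counts = Counter(ts.date() for ts in timestamps)
    first, last = min(counts), max(counts)
    n_days = (last - first).days + 1
    # Walk every day in the closed range so gaps show up as explicit zeros.
    days = (first + timedelta(days=i) for i in range(n_days))
    return {day: counts.get(day, 0) for day in days}


if __name__ == "__main__":
    sample = [
        datetime(2025, 1, 1, 8, 30),
        datetime(2025, 1, 1, 17, 5),
        datetime(2025, 1, 3, 9, 0),
    ]
    for day, n in count_per_day(sample).items():
        print(day.isoformat(), n)
```

Filling empty days with zeros matters for time-series models, which generally expect an evenly spaced series.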
|
||||
|
||||
### Key Services
|
||||
|
||||
| Module | Purpose |
|
||||
|--------|---------|
|
||||
| `services/io.py` | Data loading, column normalization (COLUMN_ALIASES), region inference |
|
||||
| `services/forecast.py` | ARIMA grid search, KNN counterfactual, GLM/SVR extrapolation |
|
||||
| `services/strategy.py` | Strategy effectiveness evaluation (F1/F2 metrics, safety states) |
|
||||
| `services/hotspot.py` | Location extraction, risk scoring, strategy generation |
|
||||
| `services/metrics.py` | Model evaluation metrics (RMSE, MAE) |
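
For reference, the two metrics named for `services/metrics.py` can be written in a few lines. This is a minimal standard-library sketch; the function names `mae` and `rmse` are assumptions and may not match the module's actual API.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: the average of |actual - predicted|."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: sqrt of the average squared difference."""
    _check(actual, predicted)
    mean_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]
    forecast = [1.0, 2.0, 6.0]
    print(mae(observed, forecast), rmse(observed, forecast))
```

RMSE penalizes large misses more heavily than MAE, which is why the two are usually reported together.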
|
||||
|
||||
### UI Sections
|
||||
|
||||
Each tab in the app corresponds to a `render_*` function in `ui_sections/`:
|
||||
- `render_overview`: KPI dashboard and time series visualization
|
||||
- `render_forecast`: Multi-model prediction comparison
|
||||
- `render_model_eval`: Model accuracy metrics
|
||||
- `render_strategy_eval`: Single strategy evaluation
|
||||
- `render_hotspot`: Accident hotspot analysis with risk levels
|
||||
|
||||
### Session State Pattern
|
||||
|
||||
The app uses `st.session_state['processed_data']` to persist:
|
||||
- Loaded DataFrames (`combined_city`, `combined_by_region`, `accident_records`)
|
||||
- Filter state (`region_sel`, `date_range`, `strat_filter`)
|
||||
- Derived metadata (`all_regions`, `all_strategy_types`, `min_date`, `max_date`)
|
||||
|
||||
### AI Integration
|
||||
|
||||
Uses DeepSeek API (OpenAI-compatible) for generating analysis reports. Configuration in sidebar:
|
||||
- Base URL: `https://api.deepseek.com`
|
||||
- Model: `deepseek-chat`
|
||||
- Streaming response rendered incrementally
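
The incremental rendering mentioned above boils down to appending each streamed text fragment and re-rendering the joined text. The sketch below imitates that loop without any network call: `fake_stream` and `collect_stream` are hypothetical names, and the chunk objects only mimic the `choices[0].delta.content` shape that the app's code reads.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, Iterator, List, Optional


def fake_stream(pieces: Iterable[Optional[str]]) -> Iterator[SimpleNamespace]:
    """Yield stand-in chunks shaped like streamed chat-completion chunks."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect_stream(stream: Iterable[SimpleNamespace],
                   on_update: Callable[[str], None]) -> str:
    """Accumulate text fragments, calling `on_update` with the text so far."""
    parts: List[str] = []
    for chunk in stream:
        delta = chunk.choices[0].delta if chunk.choices else None
        piece = getattr(delta, "content", None) if delta else None
        if piece:  # skip empty or missing fragments
            parts.append(piece)
            on_update("".join(parts))
    return "".join(parts)


if __name__ == "__main__":
    final = collect_stream(fake_stream(["Hel", None, "lo"]), print)
    print("final:", final)
```

In the app the `on_update` role is played by a Streamlit placeholder that is overwritten on every fragment.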
|
||||
|
||||
## Coding Conventions
|
||||
|
||||
- Python 3.8+ with type hints (`from __future__ import annotations`)
|
||||
- Functions/variables: `snake_case`; Classes: `PascalCase`; Constants: `UPPER_SNAKE_CASE`
|
||||
- Use `@st.cache_data` for expensive computations
|
||||
- Column aliases defined in `COLUMN_ALIASES` dict for flexible Excel input
|
||||
- Prefer pandas vectorization over loops
|
||||
|
||||
## Data Format Requirements
|
||||
|
||||
**Accident Data Excel** must contain (or aliases of):
|
||||
- `事故时间` (accident time)
|
||||
- `所在街道` (street/region)
|
||||
- `事故类型` (accident type: 财损/伤人/亡人)
|
||||
|
||||
**Strategy Data Excel** must contain:
|
||||
- `发布时间` (publish date)
|
||||
- `交通策略类型` (strategy type)
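
A hedged sketch of how alias-based column checking could work before processing. The `EXAMPLE_ALIASES` mapping and the `normalize_columns` helper are invented for illustration (the alias spellings in particular are assumptions); the repository's `COLUMN_ALIASES` may be structured differently.

```python
from typing import Dict, Iterable, List, Sequence

# Required accident-data columns, as listed above.
REQUIRED_ACCIDENT_COLUMNS = ("事故时间", "所在街道", "事故类型")

# Hypothetical alias table: alternative header -> canonical column name.
EXAMPLE_ALIASES: Dict[str, str] = {
    "发生时间": "事故时间",
    "街道": "所在街道",
    "类型": "事故类型",
}


def normalize_columns(columns: Iterable[str],
                      required: Sequence[str] = REQUIRED_ACCIDENT_COLUMNS,
                      aliases: Dict[str, str] = EXAMPLE_ALIASES) -> List[str]:
    """Map aliased headers to canonical names and verify required ones exist."""
    renamed = [aliases.get(col.strip(), col.strip()) for col in columns]
    missing = [col for col in required if col not in renamed]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return renamed


if __name__ == "__main__":
    print(normalize_columns([" 发生时间", "街道", "事故类型", "备注"]))
```

Raising on missing columns up front gives the user a clear message instead of a `KeyError` deep inside the analysis code.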
|
||||
|
||||
## Configuration (config/settings.py)
|
||||
|
||||
Key parameters:
|
||||
- `ARIMA_P/D/Q`: Grid search ranges for ARIMA
|
||||
- `MIN_PRE_DAYS` / `MAX_PRE_DAYS`: Historical data requirements
|
||||
- `ANOMALY_CONTAMINATION`: Isolation Forest contamination rate
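
To make the `ARIMA_P/D/Q` grid concrete, the sketch below enumerates candidate `(p, d, q)` orders and picks the one with the lowest score. The ranges are the ones shown in the README's `config/settings.py` excerpt; `candidate_orders`, `best_order`, and the toy scoring function are hypothetical stand-ins, since a real search would fit a model per order and compare a criterion such as AIC.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

Order = Tuple[int, int, int]

# Search ranges as shown in the README excerpt of config/settings.py.
ARIMA_P = range(0, 4)
ARIMA_D = range(0, 2)
ARIMA_Q = range(0, 4)


def candidate_orders(p: Iterable[int] = ARIMA_P,
                     d: Iterable[int] = ARIMA_D,
                     q: Iterable[int] = ARIMA_Q) -> List[Order]:
    """Return every (p, d, q) combination in the grid."""
    return list(product(p, d, q))


def best_order(orders: Iterable[Order],
               score: Callable[[Order], float]) -> Order:
    """Pick the order with the lowest score (lower is better, as with AIC)."""
    return min(orders, key=score)


if __name__ == "__main__":
    orders = candidate_orders()
    print(len(orders))  # 4 * 2 * 4 combinations

    # Toy score for demonstration only; it is not a model-fit criterion.
    def toy_score(order: Order) -> float:
        return abs(order[0] - 2) + abs(order[1] - 1) + abs(order[2] - 1)

    print(best_order(orders, toy_score))
```

With these ranges the grid has 32 candidates, so an exhaustive search stays cheap for daily series of modest length.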
|
||||
36
Dockerfile
Normal file
@@ -0,0 +1,36 @@
|
||||
# Use an official Python runtime as a parent image
FROM python:3.12-slim

# Prevents writing .pyc files and buffering stdout/stderr
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# Set working directory
WORKDIR /app

# Copy requirements first to leverage Docker cache
COPY requirements.txt .

# Install system dependencies (if needed) and Python deps
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt

# Copy app code
COPY . .

# Expose Streamlit default port
EXPOSE 8501

# Streamlit config: run headless and bind to 0.0.0.0
ENV STREAMLIT_SERVER_HEADLESS=true
ENV STREAMLIT_SERVER_ENABLE_CORS=false
ENV STREAMLIT_SERVER_ENABLE_XSRF_PROTECTION=false
ENV STREAMLIT_SERVER_PORT=8501
ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0

# Run Streamlit
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
|
||||
158
app.py
@@ -294,9 +294,9 @@ def run_streamlit_app():
|
||||
|
||||
# Add OpenAI API key input in sidebar
|
||||
st.sidebar.markdown("---")
|
||||
st.sidebar.subheader("GPT API 配置")
|
||||
openai_api_key = st.sidebar.text_input("GPT API Key", value='sk-sXY934yPqjh7YKKC08380b198fEb47308cDa09BeE23d9c8a', type="password", help="用于GPT分析结果的API密钥")
|
||||
open_ai_base_url = st.sidebar.text_input("GPT Base Url", value='https://aihubmix.com/v1', type='default')
|
||||
st.sidebar.subheader("AI API 配置")
|
||||
openai_api_key = st.sidebar.text_input("AI API Key", value='sk-959e0b065c774b1db6e30bf7589680f9', type="password", help="用于 AI 分析结果的 API 密钥")
|
||||
open_ai_base_url = st.sidebar.text_input("AI Base Url", value='https://api.deepseek.com', type='default')
|
||||
|
||||
# Process data only when Apply button is clicked
|
||||
if apply_button and accident_file and strategy_file:
|
||||
@@ -404,14 +404,14 @@ def run_streamlit_app():
|
||||
|
||||
tab_labels = [
|
||||
"🏠 总览",
|
||||
"📍 事故热点",
|
||||
"🔍 AI 分析",
|
||||
"📈 预测模型",
|
||||
"📊 模型评估",
|
||||
"⚠️ 异常检测",
|
||||
"📝 策略评估",
|
||||
"⚖️ 策略对比",
|
||||
"🧪 情景模拟",
|
||||
"🔍 GPT 分析",
|
||||
"📍 事故热点",
|
||||
]
|
||||
default_tab = st.session_state.get("active_tab", tab_labels[0])
|
||||
if default_tab not in tab_labels:
|
||||
@@ -426,17 +426,94 @@ def run_streamlit_app():
|
||||
st.session_state["active_tab"] = selected_tab
|
||||
|
||||
|
||||
if selected_tab == "📍 事故热点":
|
||||
if selected_tab == "🏠 总览":
|
||||
if render_overview is not None:
|
||||
render_overview(base, region_sel, start_dt, end_dt, strat_filter)
|
||||
else:
|
||||
st.warning("概览模块未能加载,请检查 `ui_sections/overview.py`。")
|
||||
|
||||
elif selected_tab == "📍 事故热点":
|
||||
if render_hotspot is not None:
|
||||
render_hotspot(accident_records, accident_source_name)
|
||||
else:
|
||||
st.warning("事故热点模块未能加载,请检查 `ui_sections/hotspot.py`。")
|
||||
|
||||
elif selected_tab == "🏠 总览":
|
||||
if render_overview is not None:
|
||||
render_overview(base, region_sel, start_dt, end_dt, strat_filter)
|
||||
elif selected_tab == "🔍 AI 分析":
|
||||
from openai import OpenAI
|
||||
st.subheader("AI 数据分析与改进建议")
|
||||
if not HAS_OPENAI:
|
||||
st.warning("未安装 `openai` 库。请安装后重试。")
|
||||
elif not openai_api_key:
|
||||
st.info("请在左侧边栏输入 OpenAI API Key 以启用 AI 分析。")
|
||||
else:
|
||||
st.warning("概览模块未能加载,请检查 `ui_sections/overview.py`。")
|
||||
if all_strategy_types:
|
||||
# Generate results if not already
|
||||
results, recommendation = generate_output_and_recommendations(base, all_strategy_types,
|
||||
region=region_sel if region_sel != '全市' else '全市')
|
||||
df_res = pd.DataFrame(results).T
|
||||
kpi_json = json.dumps(kpi, ensure_ascii=False, indent=2)
|
||||
results_json = df_res.to_json(orient="records", force_ascii=False)
|
||||
recommendation_text = recommendation
|
||||
|
||||
# Prepare data to send
|
||||
data_to_analyze = {
|
||||
"kpis": kpi_json,
|
||||
"strategy_results": results_json,
|
||||
"recommendation": recommendation_text
|
||||
}
|
||||
data_str = json.dumps(data_to_analyze, ensure_ascii=False)
|
||||
|
||||
prompt = (
|
||||
"你是一名资深交通安全数据分析顾问。请基于以下结构化数据输出一份专业报告,需包含:\n"
|
||||
"1. 核心指标洞察:按要点总结事故趋势、显著波动及可能原因。\n"
|
||||
"2. 策略绩效评估:对比主要策略的优势、短板与适用场景。\n"
|
||||
"3. 优化建议:为短期(0-3个月)、中期(3-12个月)与长期(12个月以上)分别给出2-3条可操作措施。\n"
|
||||
"请保持正式语气,引用关键数值支撑结论,并用清晰的小节或列表呈现。\n"
|
||||
f"数据摘要:{data_str}\n"
|
||||
)
|
||||
if st.button("上传数据至 AI 并获取分析"):
|
||||
if not openai_api_key.strip():
|
||||
st.info("请提供有效的 AI API Key。")
|
||||
elif not open_ai_base_url.strip():
|
||||
st.info("请提供可访问的 AI Base Url。")
|
||||
else:
|
||||
try:
|
||||
client = OpenAI(
|
||||
base_url=open_ai_base_url,
|
||||
# sk-xxx替换为自己的key
|
||||
api_key=openai_api_key
|
||||
)
|
||||
st.markdown("### AI 分析结果与改进思路")
|
||||
placeholder = st.empty()
|
||||
accumulated_response: list[str] = []
|
||||
with st.spinner("AI 正在生成专业报告,请稍候…"):
|
||||
stream = client.chat.completions.create(
|
||||
model="deepseek-chat",
|
||||
messages=[
|
||||
{
|
||||
"role": "system",
|
||||
"content": "You are a professional traffic safety analyst who writes concise, well-structured Chinese reports."
|
||||
},
|
||||
{"role": "user", "content": prompt},
|
||||
],
|
||||
stream=True,
|
||||
)
|
||||
for chunk in stream:
|
||||
delta = chunk.choices[0].delta if chunk.choices else None
|
||||
piece = getattr(delta, "content", None) if delta else None
|
||||
if piece:
|
||||
accumulated_response.append(piece)
|
||||
placeholder.markdown("".join(accumulated_response), unsafe_allow_html=True)
|
||||
final_text = "".join(accumulated_response)
|
||||
if not final_text:
|
||||
placeholder.info("AI 未返回可用内容,请稍后重试或检查凭据配置。")
|
||||
except Exception as e:
|
||||
st.error(f"调用 OpenAI API 失败:{str(e)}")
|
||||
else:
|
||||
st.warning("没有策略数据可供分析。")
|
||||
|
||||
# Update refresh time
|
||||
st.session_state['last_refresh'] = datetime.now()
|
||||
|
||||
elif selected_tab == "📈 预测模型":
|
||||
if render_forecast is not None:
|
||||
@@ -652,67 +729,6 @@ def run_streamlit_app():
|
||||
else:
|
||||
st.info("请设置模拟参数并点击“应用模拟参数”按钮。")
|
||||
|
||||
# --- New Tab 8: GPT 分析
|
||||
elif selected_tab == "🔍 GPT 分析":
|
||||
from openai import OpenAI
|
||||
st.subheader("GPT 数据分析与改进建议")
|
||||
# open_ai_key = f"sk-dQhKOOG48iVEfgJfAb14458dA4474fB09aBbE8153d4aB3Fc"
|
||||
if not HAS_OPENAI:
|
||||
st.warning("未安装 `openai` 库。请安装后重试。")
|
||||
elif not openai_api_key:
|
||||
st.info("请在左侧边栏输入 OpenAI API Key 以启用 GPT 分析。")
|
||||
else:
|
||||
if all_strategy_types:
|
||||
# Generate results if not already
|
||||
results, recommendation = generate_output_and_recommendations(base, all_strategy_types,
|
||||
region=region_sel if region_sel != '全市' else '全市')
|
||||
df_res = pd.DataFrame(results).T
|
||||
kpi_json = json.dumps(kpi, ensure_ascii=False, indent=2)
|
||||
results_json = df_res.to_json(orient="records", force_ascii=False)
|
||||
recommendation_text = recommendation
|
||||
|
||||
# Prepare data to send
|
||||
data_to_analyze = {
|
||||
"kpis": kpi_json,
|
||||
"strategy_results": results_json,
|
||||
"recommendation": recommendation_text
|
||||
}
|
||||
data_str = json.dumps(data_to_analyze, ensure_ascii=False)
|
||||
|
||||
prompt = str(f"""
|
||||
请分析以下交通安全分析结果,包括KPI指标、策略评估结果和推荐。
|
||||
提供数据结果的详细分析,以及改进思路和建议。
|
||||
数据:{str(data_str)}
|
||||
""")
|
||||
if st.button("上传数据至 GPT 并获取分析"):
|
||||
if False:
|
||||
st.info("请将 GPT Base Url 更新为实际可访问的接口地址。")
|
||||
else:
|
||||
try:
|
||||
client = OpenAI(
|
||||
base_url=open_ai_base_url,
|
||||
# sk-xxx替换为自己的key
|
||||
api_key=openai_api_key
|
||||
)
|
||||
response = client.chat.completions.create(
|
||||
model="gpt-5-mini",
|
||||
messages=[
|
||||
{"role": "system", "content": "You are a helpful assistant that analyzes traffic safety data."},
|
||||
{"role": "user", "content": prompt}
|
||||
],
|
||||
stream=False
|
||||
)
|
||||
gpt_response = response.choices[0].message.content
|
||||
st.markdown("### GPT 分析结果与改进思路")
|
||||
st.markdown(gpt_response, unsafe_allow_html=True)
|
||||
except Exception as e:
|
||||
st.error(f"调用 OpenAI API 失败:{str(e)}")
|
||||
else:
|
||||
st.warning("没有策略数据可供分析。")
|
||||
|
||||
# Update refresh time
|
||||
st.session_state['last_refresh'] = datetime.now()
|
||||
|
||||
else:
|
||||
st.info("请先在左侧上传事故数据与策略数据,并点击“应用数据与筛选”按钮。")
|
||||
|
||||
|
||||
@@ -68,6 +68,25 @@ pip install streamlit-autorefresh openpyxl xlrd cryptography openai
|
||||
|
||||
3. Open `http://localhost:8501` in your browser. The home page should load without import errors.
|
||||
|
||||
## 5. Run with Docker (optional)
|
||||
|
||||
If you prefer an isolated container build, use the included `Dockerfile`:
|
||||
|
||||
```bash
|
||||
docker build -t trafficsafeanalyzer .
|
||||
docker run --rm -p 8501:8501 trafficsafeanalyzer
|
||||
```
|
||||
|
||||
To work with local data, mount the host folder containing Excel files:
|
||||
|
||||
```bash
|
||||
docker run --rm -p 8501:8501 \
|
||||
-v "$(pwd)/sample:/app/sample" \
|
||||
trafficsafeanalyzer
|
||||
```
|
||||
|
||||
The container exposes Streamlit on port 8501 by default. Override configuration via environment variables when needed, for example `-e STREAMLIT_SERVER_PORT=8502`.
|
||||
|
||||
## Troubleshooting tips
|
||||
|
||||
- **Missing package**: Re-run `pip install -r requirements.txt`.
|
||||
|
||||
@@ -4,7 +4,7 @@ TrafficSafeAnalyzer delivers accident analytics and decision support through a S
|
||||
|
||||
## Start the app
|
||||
|
||||
1. Activate your virtual or conda environment.
|
||||
1. Activate your virtual or conda environment (or run in a container; see below).
|
||||
2. From the project root, run:
|
||||
|
||||
```bash
|
||||
@@ -13,6 +13,8 @@ TrafficSafeAnalyzer delivers accident analytics and decision support through a S
|
||||
|
||||
3. Open `http://localhost:8501`. Keep the terminal running while you work in the browser.
|
||||
|
||||
> Using Docker? Run `docker build -t trafficsafeanalyzer .` and `docker run --rm -p 8501:8501 trafficsafeanalyzer`, then open `http://localhost:8501` as usual.
|
||||
|
||||
## Load input data
|
||||
|
||||
Use the sidebar form labelled “数据与筛选”.
|
||||
@@ -41,7 +43,7 @@ Use the sidebar form labelled “数据与筛选”.
|
||||
- **📝 策略评估 (Strategy evaluation)** — Aggregates metrics per strategy type, recommends the best option, writes `strategy_evaluation_results.csv`, and updates `recommendation.txt`.
|
||||
- **⚖️ 策略对比 (Strategy comparison)** — side-by-side metrics for selected strategies, useful for “what worked best last month” reviews.
|
||||
- **🧪 情景模拟 (Scenario simulation)** — apply intervention models (persistent/decay, lagged effects) to test potential roll-outs.
|
||||
- **🔍 GPT 分析** — enter your own OpenAI-compatible API key and base URL in the sidebar to generate narrative insights. Keys are read at runtime only.
|
||||
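- **🔍 AI 分析 (AI analysis)** — a sample API key and base URL are prefilled so you can try the feature right away; to use your own credentials, update them in the sidebar before generating insights. Keys are read at runtime only and are not written to disk.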
|
||||
- **📍 事故热点 (Hotspot)** — reuse the already uploaded accident data to identify high-risk intersections and produce targeted mitigation ideas; no separate hotspot upload is required.
|
||||
|
||||
Each tab remembers the active filters from the sidebar so results stay consistent.
|
||||
|
||||
349
readme.md
@@ -1,25 +1,67 @@
|
||||
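# TrafficSafeAnalyzer

A Streamlit-based traffic safety analysis system that supports accident data analysis, forecasting models, anomaly detection, and strategy evaluation.
A Streamlit-based traffic safety analysis system that supports accident data analysis, multi-model forecasting, anomaly detection, strategy evaluation, and AI-assisted analysis.

## Features
## Feature Highlights

- Load and clean accident and strategy data (Excel format)
- Forecast accident trends with ARIMA, KNN, GLM, SVR, and other models
- Detect anomalous accident points
- Evaluate the effect of traffic strategies and provide recommendations
- Identify accident hotspot intersections and generate risk ratings and remediation suggestions
- Generate natural-language insights through GPT analysis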
|
||||
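### Core Modules

| Module | Description |
|------|----------|
| Overview (总览) | Visualizes accident trends and shows KPIs (accidents today/this week, forecast deviation, strategy coverage, etc.) |
| Accident Hotspots (事故热点) | Identifies high-incidence intersections and generates risk ratings and remediation suggestions |
| AI Analysis (AI 分析) | Generates professional analysis reports and improvement suggestions via the DeepSeek API |
| Forecast Models (预测模型) | Compares forecasts from ARIMA, KNN, GLM, SVR, and other models |
| Model Evaluation (模型评估) | Compares model accuracy (RMSE, MAE, and other metrics) |
| Anomaly Detection (异常检测) | Detects anomalous accident points with the Isolation Forest algorithm |
| Strategy Evaluation (策略评估) | Evaluates the effect of a single traffic strategy |
| Strategy Comparison (策略对比) | Compares the effects of multiple strategies side by side |
| Scenario Simulation (情景模拟) | Simulates how rolling out a strategy affects accident trends |

### Technical Highlights

- Real-time auto-refresh
- Interactive Plotly charts
- Multi-format export (CSV, HTML)
- Containerized deployment with Docker
- Chinese word segmentation (jieba)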
|
||||
|
||||
## 项目结构
|
||||
|
||||
```
|
||||
TrafficSafeAnalyzer/
|
||||
├── app.py # 主应用入口
|
||||
├── services/ # 业务逻辑层
|
||||
│ ├── forecast.py # 预测模型(ARIMA、KNN、GLM、SVR)
|
||||
│ ├── hotspot.py # 热点分析
|
||||
│ ├── io.py # 数据加载与清洗
|
||||
│ ├── metrics.py # 模型评估指标
|
||||
│ └── strategy.py # 策略评估
|
||||
├── ui_sections/ # UI 组件层
|
||||
│ ├── overview.py # 总览页面
|
||||
│ ├── forecast.py # 预测页面
|
||||
│ ├── model_eval.py # 模型评估页面
|
||||
│ ├── strategy_eval.py # 策略评估页面
|
||||
│ └── hotspot.py # 热点分析页面
|
||||
├── config/
|
||||
│ └── settings.py # 配置参数
|
||||
├── docs/ # 文档
|
||||
│ ├── install.md # 安装指南
|
||||
│ └── usage.md # 使用说明
|
||||
├── Dockerfile # Docker 配置
|
||||
├── requirements.txt # Python 依赖
|
||||
└── environment.yml # Conda 环境配置
|
||||
```
|
||||
|
||||
## 安装步骤
|
||||
|
||||
### 前提条件
|
||||
|
||||
- Python 3.8+
|
||||
- Python 3.8+(推荐 3.12)
|
||||
- Git
|
||||
- 可选:Docker(用于容器化部署)
|
||||
|
||||
### 安装
|
||||
### 方式一:本地安装
|
||||
|
||||
1. 克隆仓库:
|
||||
|
||||
@@ -31,132 +73,213 @@ cd TrafficSafeAnalyzer
|
||||
2. 创建虚拟环境(推荐):
|
||||
|
||||
```bash
|
||||
conda create -n trafficsa python=3.8 -y
|
||||
# 使用 conda
|
||||
conda create -n trafficsa python=3.12 -y
|
||||
conda activate trafficsa
|
||||
pip install -r requirements.txt
|
||||
streamlit run app.py
|
||||
|
||||
# 或使用 venv
|
||||
python -m venv venv
|
||||
source venv/bin/activate # Linux/macOS
|
||||
# venv\Scripts\activate # Windows
|
||||
```
|
||||
|
||||
3. 安装依赖:
|
||||
|
||||
(1) 基本安装(必需依赖)
|
||||
|
||||
```bash
|
||||
pip install streamlit pandas numpy matplotlib plotly scikit-learn statsmodels scipy
|
||||
```
|
||||
|
||||
(2) 完整安装(包含所有可选依赖)
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
(3) 或者手动安装可选依赖
|
||||
|
||||
```bash
|
||||
pip install streamlit-autorefresh openpyxl xlrd cryptography
|
||||
```
|
||||
|
||||
(4) 运行应用:
|
||||
|
||||
```bash
|
||||
streamlit run app.py
|
||||
```
|
||||
|
||||
## 依赖项
|
||||
|
||||
列于 `requirements.txt`:
|
||||
|
||||
```txt
|
||||
streamlit>=1.20.0
|
||||
pandas>=1.3.0
|
||||
numpy>=1.21.0
|
||||
matplotlib>=3.4.0
|
||||
plotly>=5.0.0
|
||||
scikit-learn>=1.0.0
|
||||
statsmodels>=0.13.0
|
||||
scipy>=1.7.0
|
||||
streamlit-autorefresh>=0.1.5
|
||||
python-dateutil>=2.8.2
|
||||
pytz>=2021.3
|
||||
openpyxl>=3.0.9
|
||||
xlrd>=2.0.1
|
||||
cryptography>=3.4.7
|
||||
openai>=2.0.0
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
## 配置参数
|
||||
4. 运行应用:
|
||||
|
||||
- **数据文件**:上传事故数据(`accident_file`)和策略数据(`strategy_file`),格式为 Excel;事故热点分析会直接复用事故数据,无需额外上传。
|
||||
- **环境变量**(可选):
|
||||
- `LOG_LEVEL=DEBUG`:启用详细日志
|
||||
- 示例:`export LOG_LEVEL=DEBUG`(Linux/macOS)或 `set LOG_LEVEL=DEBUG`(Windows)
|
||||
|
||||
## 示例数据
|
||||
|
||||
`sample/` 目录提供了脱敏示例数据,便于快速体验:
|
||||
|
||||
- `sample/事故/*.xlsx`:按年份划分的事故记录
|
||||
- `sample/交通策略/*.xlsx`:策略发布记录
|
||||
|
||||
使用前建议复制到临时位置再进行编辑。
|
||||
|
||||
## 输入输出格式
|
||||
|
||||
### 输入
|
||||
- **事故数据 Excel**:需包含 `事故时间`、`所在街道`、`事故类型` 列
|
||||
- **策略数据 Excel**:需包含 `发布时间`、`交通策略类型` 列
|
||||
|
||||
### 输出
|
||||
- **预测结果**:CSV 文件(例如 `arima_forecast.csv`)
|
||||
- **图表**:HTML 文件(例如 `overview_series.html`)
|
||||
- **策略推荐**:文本文件(`recommendation.txt`)
|
||||
|
||||
## 调用示例
|
||||
|
||||
运行 Streamlit 应用:
|
||||
```bash
|
||||
streamlit run app.py
|
||||
```
|
||||
|
||||
访问 http://localhost:8501,上传数据文件并交互分析。
|
||||
### 方式二:Docker 部署
|
||||
|
||||
## 常见问题排查
|
||||
```bash
|
||||
# 构建镜像
|
||||
docker build -t trafficsafeanalyzer .
|
||||
|
||||
**问题**:`ModuleNotFoundError: No module named 'streamlit'`
|
||||
**解决**:运行 `pip install -r requirements.txt` 或检查 Python 环境
|
||||
# 运行容器
|
||||
docker run --rm -p 8501:8501 trafficsafeanalyzer
|
||||
```
|
||||
|
||||
**问题**:数据加载失败
|
||||
**解决**:确保 Excel 文件格式正确,检查列名是否匹配
|
||||
访问 `http://localhost:8501` 即可使用。
|
||||
|
||||
**问题**:预测模型页面点击后图表未显示
|
||||
**解决**:确认干预日期之前至少有 10 条历史记录,或缩短预测天数重新提交
|
||||
如需挂载本地数据目录:
|
||||
|
||||
**问题**:热点分析提示“请上传事故数据”
|
||||
**解决**:侧边栏上传事故数据后点击“应用数据与筛选”,热点模块会复用相同数据集
|
||||
```bash
|
||||
docker run --rm -p 8501:8501 \
|
||||
-v "$(pwd)/data:/app/data" \
|
||||
trafficsafeanalyzer
|
||||
```
|
||||
|
||||
## 日志分析
|
||||
自定义端口:
|
||||
|
||||
- **日志文件**:`logs/app.log`(需在代码中配置 logging 模块)
|
||||
- **查看日志**:`tail -f logs/app.log`
|
||||
- **常见错误**:
|
||||
- `ValueError`:检查输入数据格式
|
||||
- `ConnectionError`:验证网络连接或文件路径
|
||||
```bash
|
||||
docker run --rm -p 8080:8501 \
|
||||
-e STREAMLIT_SERVER_PORT=8501 \
|
||||
trafficsafeanalyzer
|
||||
```
|
||||
|
||||
## 升级说明
|
||||
## 依赖项
|
||||
|
||||
- **当前版本**:v1.0.0
|
||||
- **升级步骤**:
|
||||
1. 备份数据和配置文件
|
||||
2. 拉取最新代码:`git pull origin main`
|
||||
3. 更新依赖:`pip install -r requirements.txt --upgrade`
|
||||
4. 重启应用:`streamlit run app.py`
|
||||
### 核心依赖
|
||||
|
||||
参考 `CHANGELOG.md` 查看版本变更详情。
|
||||
| Package | Version | Purpose |
|------|----------|------|
| streamlit | >=1.20.0 | Web application framework |
| pandas | >=1.3.0 | Data processing |
| numpy | >=1.21.0 | Numerical computing |
| matplotlib | >=3.4.0 | Static charts |
| plotly | >=5.0.0 | Interactive charts |
| scikit-learn | >=1.0.0 | Machine learning models |
| statsmodels | >=0.13.0 | Statistical models (ARIMA) |
|
||||
|
||||
### Optional Dependencies

| Package | Purpose |
|------|------|
| scipy | Statistical tests (t-test, Mann-Whitney U) |
| streamlit-autorefresh | Automatic page refresh |
| openpyxl / xlrd | Reading and writing Excel files |
| openai | AI analysis (compatible with the DeepSeek API) |
| jieba | Chinese word segmentation |
| cryptography | Encryption |
|
||||
|
||||
## Usage

### Data Format Requirements

**Accident data Excel**:

| Required column | Description |
|--------|------|
| 事故时间 | Time of the accident |
| 所在街道 | Accident location |
| 事故类型 | Accident category |

Optional columns: `region`, severity, and others

**Strategy data Excel**:

| Required column | Description |
|--------|------|
| 发布时间 | Strategy publish date |
| 交通策略类型 | Strategy category |

### Basic Workflow

1. After starting the app, upload the accident data and strategy data (Excel format) in the left sidebar
2. Set the global filters: region, time range, strategy type
3. Click the "应用数据与筛选" (Apply data and filters) button to load the data
4. Switch between the tabs at the top to work with the different modules
|
||||
|
||||
### AI Analysis Configuration

The system uses the DeepSeek API for AI analysis:

| Setting | Default | Notes |
|--------|--------|------|
| API Key | Prefilled sample key | Replace with your own key in the sidebar |
| Base URL | `https://api.deepseek.com` | DeepSeek API endpoint |

The AI analysis feature can generate:
- Key metric insights
- Strategy performance evaluation
- Short-, medium-, and long-term optimization recommendations
|
||||
|
||||
### Output Files

| Type | Example filename | Description |
|------|------------|------|
| Forecast results | `arima_forecast.csv` | ARIMA model forecast data |
| Model evaluation | `model_evaluation.csv` | Metric comparison across models |
| Anomaly detection | `anomalies.csv` | List of anomalous dates |
| Strategy comparison | `strategy_compare.csv` | Strategy effect comparison table |
| Interactive chart | `simulation.html` | Plotly chart export |
|
||||
|
||||
## Configuration

### Environment Variables

| Variable | Description | Default |
|--------|------|--------|
| `LOG_LEVEL` | Log level | INFO |
| `STREAMLIT_SERVER_PORT` | Server port | 8501 |
| `STREAMLIT_SERVER_HEADLESS` | Headless mode | true (in Docker) |
|
||||
|
||||
### Model Parameters

Configuration file: `config/settings.py`

```python
# ARIMA parameter search ranges
ARIMA_P = range(0, 4)
ARIMA_D = range(0, 2)
ARIMA_Q = range(0, 4)

# Forecasting and evaluation
DEFAULT_HORIZON_PREDICT = 30  # default forecast horizon (days)
DEFAULT_HORIZON_EVAL = 14     # default evaluation window
MIN_PRE_DAYS = 5              # minimum days of historical data
MAX_PRE_DAYS = 120            # maximum days of historical data

# Anomaly detection
ANOMALY_N_ESTIMATORS = 50     # number of Isolation Forest estimators
ANOMALY_CONTAMINATION = 0.10  # expected anomaly fraction
```
|
||||
|
||||
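## FAQ

| Problem | Solution |
|------|----------|
| `ModuleNotFoundError` | Run `pip install -r requirements.txt` |
| Data fails to load | Check the Excel file format and make sure the required column names are present |
| Forecast chart not displayed | Make sure there are at least 10 historical records before the intervention date |
| AI analysis does not respond | Check that the API key is valid and the network connection works |
| Hotspot analysis reports no data | Upload accident data first, then click "应用数据与筛选" (Apply data and filters) |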
|
||||
|
||||
## Changelog

See [CHANGELOG.md](CHANGELOG.md)

**Current version**: v1.3.0

### v1.3.0 Highlights

- Integrated DeepSeek AI analysis (streaming output)
- Added the accident hotspot analysis module
- Improved forecast model performance
- Added Docker containerized deployment
- Improved interactivity of data visualizations
- Fixed tab navigation state issues
|
||||
|
||||
## 升级指南
|
||||
|
||||
```bash
|
||||
# 备份现有数据
|
||||
cp -r data data_backup
|
||||
|
||||
# 拉取最新代码
|
||||
git pull origin main
|
||||
|
||||
# 更新依赖
|
||||
pip install -r requirements.txt --upgrade
|
||||
|
||||
# 重启应用
|
||||
streamlit run app.py
|
||||
```
|
||||
|
||||
## 许可证
|
||||
|
||||
MIT License - 详见 LICENSE 文件。
|
||||
MIT License - 详见 [LICENSE](LICENSE)
|
||||
|
||||
[](https://github.com/tongnian0613/TrafficSafeAnalyzer/LICENSE)
|
||||
[](https://travis-ci.org/tongnian0613/repo)
|
||||
## 贡献
|
||||
|
||||
欢迎提交 Issue 和 Pull Request。
|
||||
|
||||
---
|
||||
|
||||
[](https://github.com/tongnian0613/TrafficSafeAnalyzer/blob/main/LICENSE)
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime
|
||||
from typing import Iterable
|
||||
from typing import Iterable, Optional
|
||||
|
||||
import numpy as np
|
||||
import pandas as pd
|
||||
@@ -211,11 +211,24 @@ def generate_hotspot_strategies(
|
||||
return strategies
|
||||
|
||||
|
||||
def serialise_datetime_columns(df: pd.DataFrame, columns: Iterable[str]) -> pd.DataFrame:
|
||||
def serialise_datetime_columns(df: pd.DataFrame, columns: Optional[Iterable[str]] = None) -> pd.DataFrame:
|
||||
result = df.copy()
|
||||
if columns is None:
|
||||
columns = result.columns
|
||||
for column in columns:
|
||||
if column in result.columns and pd.api.types.is_datetime64_any_dtype(result[column]):
|
||||
result[column] = result[column].dt.strftime("%Y-%m-%d %H:%M:%S")
|
||||
if column not in result.columns:
|
||||
continue
|
||||
series = result[column]
|
||||
if pd.api.types.is_datetime64_any_dtype(series):
|
||||
result[column] = series.dt.strftime("%Y-%m-%d %H:%M:%S")
|
||||
else:
|
||||
has_timestamp = series.map(lambda value: isinstance(value, (datetime, pd.Timestamp))).any()
|
||||
if has_timestamp:
|
||||
result[column] = series.map(
|
||||
lambda value: value.strftime("%Y-%m-%d %H:%M:%S")
|
||||
if isinstance(value, (datetime, pd.Timestamp))
|
||||
else value
|
||||
)
|
||||
return result
|
||||
|
||||
|
||||
@@ -224,4 +237,3 @@ def _mode_fallback(series: pd.Series) -> str:
|
||||
return ""
|
||||
mode = series.mode()
|
||||
return str(mode.iloc[0]) if not mode.empty else str(series.iloc[0])
|
||||
|
||||
|
||||
@@ -154,10 +154,7 @@ def render_hotspot(accident_records, accident_source_name: str | None) -> None:
|
||||
)
|
||||
|
||||
with download_cols[1]:
|
||||
serializable = serialise_datetime_columns(
|
||||
top_hotspots.reset_index(),
|
||||
columns=[col for col in top_hotspots.columns if "time" in col or "date" in col],
|
||||
)
|
||||
serializable = serialise_datetime_columns(top_hotspots.reset_index())
|
||||
report_payload = {
|
||||
"analysis_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
|
||||
"time_window": time_window,
|
||||
@@ -186,4 +183,3 @@ def render_hotspot(accident_records, accident_source_name: str | None) -> None:
|
||||
preview_cols = ["事故时间", "所在街道", "事故类型", "事故具体地点", "道路类型"]
|
||||
preview_df = hotspot_data[preview_cols].copy()
|
||||
st.dataframe(preview_df.head(10), use_container_width=True)
|
||||
|
||||
|
||||