5 Commits

Author SHA1 Message Date
2726ba7bfc fix: readme.md 2026-01-04 10:18:21 +08:00
105a6b5076 fix: last commit 2026-01-04 10:16:34 +08:00
28999baf85 modify: use deepseek api 2025-11-02 23:01:36 +08:00
d94b5c5ba4 modify: add docker 2025-11-02 22:30:59 +08:00
d8eea8e3a9 modify: update apps 2025-11-02 21:56:35 +08:00
10 changed files with 502 additions and 235 deletions

View File

@@ -1,36 +0,0 @@
# Repository Guidelines
## Project Structure & Module Organization
- Source: `app.py` (Streamlit UI, data processing, forecasting, anomaly detection, evaluation).
- Docs & outputs: `docs/`, `overview_series.html`, `strategy_evaluation_results.csv`.
- Samples: `sample/` for example data only; avoid sensitive content.
- Meta: `requirements.txt`, `readme.md`, `LICENSE`, `CHANGELOG.md`.
## Build, Test, and Development Commands
- Create env: `python -m venv .venv && source .venv/bin/activate` (or follow conda steps in `readme.md`).
- Install deps: `pip install -r requirements.txt`.
- Run app: `streamlit run app.py` then open `http://localhost:8501`.
- Export artifacts: charts save as HTML (Plotly); forecasts may be written to CSV as noted in `readme.md`.
## Coding Style & Naming Conventions
- Python ≥3.8; 4-space indentation; UTF-8.
- Names: functions/variables `snake_case`; classes `PascalCase`; constants `UPPER_SNAKE_CASE`.
- Files: keep scope focused; use descriptive output names (e.g., `arima_forecast.csv`).
- Data handling: prefer pandas/NumPy vectorization; validate inputs; avoid global state except constants.
## Testing Guidelines
- Framework: pytest (recommended). Place tests under `tests/`.
- Naming: `test_<module>.py` and `test_<behavior>()`.
- Run: `pytest -q`. Focus on `load_and_clean_data`, aggregation, model selection, and metrics.
- Keep tests fast and deterministic; avoid large I/O. Use small DataFrame fixtures.
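A minimal sketch of that fixture style (the inline `daily_counts` helper is a stand-in; swap it for the project's real aggregation helper and adjust the import):
```python
# tests/test_aggregation.py (sketch): fixture style only; `daily_counts` below is a
# stand-in for the project's real daily-aggregation helper.
import pandas as pd
import pytest


def daily_counts(df: pd.DataFrame) -> pd.Series:
    # Stand-in helper: number of accidents per calendar day.
    return df.groupby(df["事故时间"].dt.floor("D")).size()


@pytest.fixture
def tiny_accidents() -> pd.DataFrame:
    # Small, deterministic DataFrame fixture using the expected column names.
    return pd.DataFrame(
        {
            "事故时间": pd.to_datetime(["2025-01-01 08:00", "2025-01-01 17:30", "2025-01-02 09:15"]),
            "所在街道": ["街道A", "街道A", "街道B"],
            "事故类型": ["财损", "伤人", "财损"],
        }
    )


def test_daily_counts_are_order_independent(tiny_accidents: pd.DataFrame) -> None:
    shuffled = tiny_accidents.sample(frac=1, random_state=0)
    assert daily_counts(shuffled).tolist() == [2, 1]
```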
## Commit & Pull Request Guidelines
- Messages: concise, present tense. Prefixes seen: `modify:`, `Add`, `Update`.
- Include scope and reason: e.g., `modify: update requirements for statsmodels`.
- PRs: clear description, linked issues, repro steps/screenshots for UI, and notes on any schema or output changes.
## Security & Configuration Tips
- Do not commit real accident data or secrets. Use `sample/` for examples.
- Optional envs: `LOG_LEVEL=DEBUG`. Keep any API keys in environment variables, not in code.
- Validate Excel column names before processing; handle missing columns/rows defensively.
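A hedged sketch of that defensive check; the required column names follow the data-format notes elsewhere in this repo, and the helper name is illustrative:
```python
from __future__ import annotations

from typing import Iterable

import pandas as pd

REQUIRED_ACCIDENT_COLUMNS = ("事故时间", "所在街道", "事故类型")


def validate_accident_frame(df: pd.DataFrame, required: Iterable[str] = REQUIRED_ACCIDENT_COLUMNS) -> pd.DataFrame:
    """Fail fast with a readable message instead of a KeyError deep in the pipeline."""
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"Uploaded Excel is missing required columns: {missing}")
    cleaned = df.copy()
    # Coerce unparseable timestamps to NaT and drop those rows instead of crashing later.
    cleaned["事故时间"] = pd.to_datetime(cleaned["事故时间"], errors="coerce")
    return cleaned.dropna(subset=["事故时间"])
```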

View File

@@ -3,7 +3,7 @@
 ## [1.1.0] - 2025-08-28
 ### Added
-- Integrated GPT-based analysis for comprehensive traffic safety insights
+- Integrated AI-based analysis for comprehensive traffic safety insights
 - Added automated report generation with AI-powered recommendations
 - Implemented natural language query processing for data exploration
 - Added export functionality for analysis reports (PDF/CSV formats)
@@ -22,7 +22,7 @@
 - Addressed memory leaks in large dataset processing
 ### Documentation
-- Updated README with new GPT analysis features and usage examples
+- Updated README with new AI analysis features and usage examples
 - Added API documentation for extended functionality
 - Included sample datasets and tutorial guides
@@ -44,4 +44,4 @@
 ### Fixed
 - Resolved session state KeyError.

CLAUDE.md Normal file
View File

@@ -0,0 +1,99 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Build and Run Commands
```bash
# Install dependencies
pip install -r requirements.txt
# Run the Streamlit application
streamlit run app.py
# Run tests (if tests/ directory exists)
pytest -q
```
## Architecture Overview
This is a Streamlit-based traffic safety analysis system with a three-layer architecture:
### Layer Structure
```
app.py (Main Entry & UI Orchestration)
ui_sections/ (UI Components - render_* functions)
services/ (Business Logic)
config/settings.py (Configuration)
```
### Data Flow
1. **Input**: Excel files uploaded via Streamlit sidebar (事故数据 / accident data + 策略数据 / strategy data)
2. **Processing**: `services/io.py` handles loading, column aliasing, and cleaning
3. **Aggregation**: Data aggregated to daily time series with `aggregate_daily_data()`
4. **Analysis**: Various services process the aggregated data
5. **Output**: Interactive Plotly charts, CSV exports, AI-generated reports
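For orientation, a simplified sketch of step 3 (not the project's exact `aggregate_daily_data` implementation):
```python
import pandas as pd


def to_daily_series(records: pd.DataFrame, time_col: str = "事故时间") -> pd.Series:
    """Collapse cleaned accident records into a continuous daily count series."""
    counts = records.groupby(records[time_col].dt.floor("D")).size()
    # Reindex onto a complete calendar so days with zero accidents are explicit,
    # which the forecasting and anomaly-detection steps can then consume.
    full_range = pd.date_range(counts.index.min(), counts.index.max(), freq="D")
    return counts.reindex(full_range, fill_value=0).rename("accidents")
```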
### Key Services
| Module | Purpose |
|--------|---------|
| `services/io.py` | Data loading, column normalization (COLUMN_ALIASES), region inference |
| `services/forecast.py` | ARIMA grid search, KNN counterfactual, GLM/SVR extrapolation |
| `services/strategy.py` | Strategy effectiveness evaluation (F1/F2 metrics, safety states) |
| `services/hotspot.py` | Location extraction, risk scoring, strategy generation |
| `services/metrics.py` | Model evaluation metrics (RMSE, MAE) |
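As a reference point, the evaluation metrics reduce to the usual formulas; the helper names below are illustrative rather than the module's actual API:
```python
import numpy as np


def rmse(y_true, y_pred) -> float:
    """Root mean squared error between observed and predicted daily counts."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred) -> float:
    """Mean absolute error between observed and predicted daily counts."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))
```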
### UI Sections
Each tab in the app corresponds to a `render_*` function in `ui_sections/`:
- `render_overview`: KPI dashboard and time series visualization
- `render_forecast`: Multi-model prediction comparison
- `render_model_eval`: Model accuracy metrics
- `render_strategy_eval`: Single strategy evaluation
- `render_hotspot`: Accident hotspot analysis with risk levels
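The call sites in `app.py` suggest the following contract for a `render_*` function (parameter meanings inferred; the real implementations live in `ui_sections/`):
```python
import streamlit as st


def render_overview(base, region_sel, start_dt, end_dt, strat_filter) -> None:
    """Draw one tab: apply the sidebar filters to the shared data, then emit widgets."""
    st.subheader("总览")
    st.caption(f"区域:{region_sel},时间范围:{start_dt} ~ {end_dt}")
    # ... filtering, KPI metrics, and Plotly charts for the filtered slice go here ...
```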
### Session State Pattern
The app uses `st.session_state['processed_data']` to persist:
- Loaded DataFrames (`combined_city`, `combined_by_region`, `accident_records`)
- Filter state (`region_sel`, `date_range`, `strat_filter`)
- Derived metadata (`all_regions`, `all_strategy_types`, `min_date`, `max_date`)
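A minimal sketch of this persistence pattern (key names taken from the list above; the helper is illustrative):
```python
import streamlit as st

if "processed_data" not in st.session_state:
    st.session_state["processed_data"] = {}


def remember(key: str, value) -> None:
    """Stash derived objects so they survive Streamlit's top-to-bottom reruns."""
    st.session_state["processed_data"][key] = value

# e.g. after the sidebar "apply" button:
# remember("combined_city", combined_city)
# remember("region_sel", region_sel)
```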
### AI Integration
Uses DeepSeek API (OpenAI-compatible) for generating analysis reports. Configuration in sidebar:
- Base URL: `https://api.deepseek.com`
- Model: `deepseek-chat`
- Streaming response rendered incrementally
- Streaming responses are rendered incrementally in the UI
## Coding Conventions
- Python 3.8+ with type hints (`from __future__ import annotations`)
- Functions/variables: `snake_case`; Classes: `PascalCase`; Constants: `UPPER_SNAKE_CASE`
- Use `@st.cache_data` for expensive computations
- Column aliases defined in `COLUMN_ALIASES` dict for flexible Excel input
- Prefer pandas vectorization over loops
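A hedged sketch of the alias-plus-cache pattern; the real `COLUMN_ALIASES` mapping in `services/io.py` contains more (and different) entries than shown here:
```python
import pandas as pd
import streamlit as st

# Illustrative aliases only: map spreadsheet variants to the canonical column names.
COLUMN_ALIASES = {
    "事故发生时间": "事故时间",
    "街道": "所在街道",
}


@st.cache_data
def normalise_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Rename recognised aliases so downstream code can rely on canonical names."""
    present = {old: new for old, new in COLUMN_ALIASES.items() if old in df.columns}
    return df.rename(columns=present)
```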
## Data Format Requirements
**Accident Data Excel** must contain the following columns (or recognised aliases):
- `事故时间` (accident time)
- `所在街道` (street/region)
- `事故类型` (accident type: 财损 property damage / 伤人 injury / 亡人 fatality)
**Strategy Data Excel** must contain:
- `发布时间` (publish date)
- `交通策略类型` (strategy type)
## Configuration (config/settings.py)
Key parameters:
- `ARIMA_P/D/Q`: Grid search ranges for ARIMA
- `MIN_PRE_DAYS` / `MAX_PRE_DAYS`: Historical data requirements
- `ANOMALY_CONTAMINATION`: Isolation Forest contamination rate
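These settings are consumed roughly as follows (the wiring is illustrative; the values match the defaults documented in the README):
```python
import pandas as pd
from sklearn.ensemble import IsolationForest

ANOMALY_N_ESTIMATORS = 50      # Isolation Forest estimator count
ANOMALY_CONTAMINATION = 0.10   # expected share of anomalous days


def flag_anomalous_days(daily_counts: pd.Series) -> pd.Series:
    """Return a boolean Series marking days Isolation Forest considers anomalous."""
    model = IsolationForest(
        n_estimators=ANOMALY_N_ESTIMATORS,
        contamination=ANOMALY_CONTAMINATION,
        random_state=42,
    )
    labels = model.fit_predict(daily_counts.to_numpy().reshape(-1, 1))
    return pd.Series(labels == -1, index=daily_counts.index, name="is_anomaly")
```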

Dockerfile Normal file
View File

@@ -0,0 +1,36 @@
# Use an official Python runtime as a parent image
FROM python:3.12-slim
# Prevents writing .pyc files and buffering stdout/stderr
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Set working directory
WORKDIR /app
# Copy requirements first to leverage Docker cache
COPY requirements.txt .
# Install system dependencies (if needed) and Python deps
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
# Copy app code
COPY . .
# Expose Streamlit default port
EXPOSE 8501
# Streamlit config: run headless and bind to 0.0.0.0
ENV STREAMLIT_SERVER_HEADLESS=true
ENV STREAMLIT_SERVER_ENABLE_CORS=false
ENV STREAMLIT_SERVER_ENABLE_XSRF_PROTECTION=false
ENV STREAMLIT_SERVER_PORT=8501
ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
# Run Streamlit
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]

app.py
View File

@@ -294,9 +294,9 @@ def run_streamlit_app():
     # Add OpenAI API key input in sidebar
     st.sidebar.markdown("---")
-    st.sidebar.subheader("GPT API 配置")
-    openai_api_key = st.sidebar.text_input("GPT API Key", value='sk-sXY934yPqjh7YKKC08380b198fEb47308cDa09BeE23d9c8a', type="password", help="用于GPT分析结果的API密钥")
-    open_ai_base_url = st.sidebar.text_input("GPT Base Url", value='https://aihubmix.com/v1', type='default')
+    st.sidebar.subheader("AI API 配置")
+    openai_api_key = st.sidebar.text_input("AI API Key", value='sk-959e0b065c774b1db6e30bf7589680f9', type="password", help="用于 AI 分析结果的 API 密钥")
+    open_ai_base_url = st.sidebar.text_input("AI Base Url", value='https://api.deepseek.com', type='default')
     # Process data only when Apply button is clicked
     if apply_button and accident_file and strategy_file:
@@ -404,14 +404,14 @@ def run_streamlit_app():
         tab_labels = [
             "🏠 总览",
+            "📍 事故热点",
+            "🔍 AI 分析",
             "📈 预测模型",
             "📊 模型评估",
             "⚠️ 异常检测",
             "📝 策略评估",
             "⚖️ 策略对比",
             "🧪 情景模拟",
-            "🔍 GPT 分析",
-            "📍 事故热点",
         ]
         default_tab = st.session_state.get("active_tab", tab_labels[0])
         if default_tab not in tab_labels:
@@ -426,17 +426,94 @@ def run_streamlit_app():
         st.session_state["active_tab"] = selected_tab
-        if selected_tab == "📍 事故热点":
+        if selected_tab == "🏠 总览":
+            if render_overview is not None:
+                render_overview(base, region_sel, start_dt, end_dt, strat_filter)
+            else:
+                st.warning("概览模块未能加载,请检查 `ui_sections/overview.py`。")
+        elif selected_tab == "📍 事故热点":
             if render_hotspot is not None:
                 render_hotspot(accident_records, accident_source_name)
             else:
                 st.warning("事故热点模块未能加载,请检查 `ui_sections/hotspot.py`。")
-        elif selected_tab == "🏠 总览":
-            if render_overview is not None:
-                render_overview(base, region_sel, start_dt, end_dt, strat_filter)
+        elif selected_tab == "🔍 AI 分析":
+            from openai import OpenAI
+            st.subheader("AI 数据分析与改进建议")
+            if not HAS_OPENAI:
+                st.warning("未安装 `openai` 库。请安装后重试。")
+            elif not openai_api_key:
+                st.info("请在左侧边栏输入 OpenAI API Key 以启用 AI 分析。")
             else:
-                st.warning("概览模块未能加载,请检查 `ui_sections/overview.py`。")
+                if all_strategy_types:
+                    # Generate results if not already
+                    results, recommendation = generate_output_and_recommendations(base, all_strategy_types,
+                        region=region_sel if region_sel != '全市' else '全市')
+                    df_res = pd.DataFrame(results).T
+                    kpi_json = json.dumps(kpi, ensure_ascii=False, indent=2)
+                    results_json = df_res.to_json(orient="records", force_ascii=False)
+                    recommendation_text = recommendation
+                    # Prepare data to send
+                    data_to_analyze = {
+                        "kpis": kpi_json,
+                        "strategy_results": results_json,
+                        "recommendation": recommendation_text
+                    }
+                    data_str = json.dumps(data_to_analyze, ensure_ascii=False)
+                    prompt = (
+                        "你是一名资深交通安全数据分析顾问。请基于以下结构化数据输出一份专业报告,需包含:\n"
+                        "1. 核心指标洞察:按要点总结事故趋势、显著波动及可能原因。\n"
+                        "2. 策略绩效评估:对比主要策略的优势、短板与适用场景。\n"
+                        "3. 优化建议:为短期(0-3个月)、中期(3-12个月)与长期(12个月以上)分别给出2-3条可操作措施。\n"
+                        "请保持正式语气,引用关键数值支撑结论,并用清晰的小节或列表呈现。\n"
+                        f"数据摘要:{data_str}\n"
+                    )
+                    if st.button("上传数据至 AI 并获取分析"):
+                        if not openai_api_key.strip():
+                            st.info("请提供有效的 AI API Key。")
+                        elif not open_ai_base_url.strip():
+                            st.info("请提供可访问的 AI Base Url。")
+                        else:
+                            try:
+                                client = OpenAI(
+                                    base_url=open_ai_base_url,
+                                    # sk-xxx 替换为自己的 key
+                                    api_key=openai_api_key
+                                )
+                                st.markdown("### AI 分析结果与改进思路")
+                                placeholder = st.empty()
+                                accumulated_response: list[str] = []
+                                with st.spinner("AI 正在生成专业报告,请稍候…"):
+                                    stream = client.chat.completions.create(
+                                        model="deepseek-chat",
+                                        messages=[
+                                            {
+                                                "role": "system",
+                                                "content": "You are a professional traffic safety analyst who writes concise, well-structured Chinese reports."
+                                            },
+                                            {"role": "user", "content": prompt},
+                                        ],
+                                        stream=True,
+                                    )
+                                    for chunk in stream:
+                                        delta = chunk.choices[0].delta if chunk.choices else None
+                                        piece = getattr(delta, "content", None) if delta else None
+                                        if piece:
+                                            accumulated_response.append(piece)
+                                            placeholder.markdown("".join(accumulated_response), unsafe_allow_html=True)
+                                    final_text = "".join(accumulated_response)
+                                    if not final_text:
+                                        placeholder.info("AI 未返回可用内容,请稍后重试或检查凭据配置。")
+                            except Exception as e:
+                                st.error(f"调用 OpenAI API 失败:{str(e)}")
+                else:
+                    st.warning("没有策略数据可供分析。")
+            # Update refresh time
+            st.session_state['last_refresh'] = datetime.now()
         elif selected_tab == "📈 预测模型":
             if render_forecast is not None:
@@ -652,67 +729,6 @@ def run_streamlit_app():
             else:
                 st.info("请设置模拟参数并点击“应用模拟参数”按钮。")
-        # --- New Tab 8: GPT 分析
-        elif selected_tab == "🔍 GPT 分析":
-            from openai import OpenAI
-            st.subheader("GPT 数据分析与改进建议")
-            # open_ai_key = f"sk-dQhKOOG48iVEfgJfAb14458dA4474fB09aBbE8153d4aB3Fc"
-            if not HAS_OPENAI:
-                st.warning("未安装 `openai` 库。请安装后重试。")
-            elif not openai_api_key:
-                st.info("请在左侧边栏输入 OpenAI API Key 以启用 GPT 分析。")
-            else:
-                if all_strategy_types:
-                    # Generate results if not already
-                    results, recommendation = generate_output_and_recommendations(base, all_strategy_types,
-                        region=region_sel if region_sel != '全市' else '全市')
-                    df_res = pd.DataFrame(results).T
-                    kpi_json = json.dumps(kpi, ensure_ascii=False, indent=2)
-                    results_json = df_res.to_json(orient="records", force_ascii=False)
-                    recommendation_text = recommendation
-                    # Prepare data to send
-                    data_to_analyze = {
-                        "kpis": kpi_json,
-                        "strategy_results": results_json,
-                        "recommendation": recommendation_text
-                    }
-                    data_str = json.dumps(data_to_analyze, ensure_ascii=False)
-                    prompt = str(f"""
-                    请分析以下交通安全分析结果,包括KPI指标、策略评估结果和推荐。
-                    提供数据结果的详细分析,以及改进思路和建议。
-                    数据:{str(data_str)}
-                    """)
-                    if st.button("上传数据至 GPT 并获取分析"):
-                        if False:
-                            st.info("请将 GPT Base Url 更新为实际可访问的接口地址。")
-                        else:
-                            try:
-                                client = OpenAI(
-                                    base_url=open_ai_base_url,
-                                    # sk-xxx替换为自己的key
-                                    api_key=openai_api_key
-                                )
-                                response = client.chat.completions.create(
-                                    model="gpt-5-mini",
-                                    messages=[
-                                        {"role": "system", "content": "You are a helpful assistant that analyzes traffic safety data."},
-                                        {"role": "user", "content": prompt}
-                                    ],
-                                    stream=False
-                                )
-                                gpt_response = response.choices[0].message.content
-                                st.markdown("### GPT 分析结果与改进思路")
-                                st.markdown(gpt_response, unsafe_allow_html=True)
-                            except Exception as e:
-                                st.error(f"调用 OpenAI API 失败:{str(e)}")
-                else:
-                    st.warning("没有策略数据可供分析。")
-            # Update refresh time
-            st.session_state['last_refresh'] = datetime.now()
     else:
         st.info("请先在左侧上传事故数据与策略数据,并点击“应用数据与筛选”按钮。")

View File

@@ -68,6 +68,25 @@ pip install streamlit-autorefresh openpyxl xlrd cryptography openai
 3. Open `http://localhost:8501` in your browser. The home page should load without import errors.
+## 5. Run with Docker (optional)
+If you prefer an isolated container build, use the included `Dockerfile`:
+```bash
+docker build -t trafficsafeanalyzer .
+docker run --rm -p 8501:8501 trafficsafeanalyzer
+```
+To work with local data, mount the host folder containing Excel files:
+```bash
+docker run --rm -p 8501:8501 \
+  -v "$(pwd)/sample:/app/sample" \
+  trafficsafeanalyzer
+```
+The container exposes Streamlit on port 8501 by default. Override configuration via environment variables when needed, for example `-e STREAMLIT_SERVER_PORT=8502`.
 ## Troubleshooting tips
 - **Missing package**: Re-run `pip install -r requirements.txt`.

View File

@@ -4,7 +4,7 @@ TrafficSafeAnalyzer delivers accident analytics and decision support through a S
 ## Start the app
-1. Activate your virtual or conda environment.
+1. Activate your virtual or conda environment(或在容器中运行,见下).
 2. From the project root, run:
 ```bash
@@ -13,6 +13,8 @@ TrafficSafeAnalyzer delivers accident analytics and decision support through a S
 3. Open `http://localhost:8501`. Keep the terminal running while you work in the browser.
+> 使用 Docker:运行 `docker build -t trafficsafeanalyzer .` 与 `docker run --rm -p 8501:8501 trafficsafeanalyzer` 后,同样访问 `http://localhost:8501`。
 ## Load input data
 Use the sidebar form labelled “数据与筛选”.
@@ -41,7 +43,7 @@ Use the sidebar form labelled “数据与筛选”.
 - **📝 策略评估 (Strategy evaluation)** — Aggregates metrics per strategy type, recommends the best option, writes `strategy_evaluation_results.csv`, and updates `recommendation.txt`.
 - **⚖️ 策略对比 (Strategy comparison)** — side-by-side metrics for selected strategies, useful for “what worked best last month” reviews.
 - **🧪 情景模拟 (Scenario simulation)** — apply intervention models (persistent/decay, lagged effects) to test potential roll-outs.
-- **🔍 GPT 分析** — enter your own OpenAI-compatible API key and base URL in the sidebar to generate narrative insights. Keys are read at runtime only.
+- **🔍 AI 分析** — 默认示例 API Key/Base URL 已预填,可直接体验;如需切换自有凭据,可在侧边栏更新后生成洞察(运行时读取,不会写入磁盘)。
 - **📍 事故热点 (Hotspot)** — reuse the already uploaded accident data to identify high-risk intersections and produce targeted mitigation ideas; no separate hotspot upload is required.
 Each tab remembers the active filters from the sidebar so results stay consistent.

readme.md
View File

@@ -1,25 +1,67 @@
 # TrafficSafeAnalyzer
-一个基于 Streamlit 的交通安全分析系统,支持事故数据分析、预测模型、异常检测、策略评估。
-## 功能
-- 加载和清洗事故与策略数据(Excel 格式)
-- 使用 ARIMA、KNN、GLM、SVR 等模型预测事故趋势
-- 检测异常事故点
-- 评估交通策略效果并提供推荐
-- 识别事故热点路口并生成风险分级与整治建议
-- 支持 GPT 分析生成自然语言洞察
+基于 Streamlit 的交通安全分析系统,支持事故数据分析、多模型预测、异常检测、策略评估和 AI 智能分析。
+## 功能特性
+### 核心功能模块
+| 模块 | 功能说明 |
+|------|----------|
+| 总览 | 可视化事故趋势、KPI 指标展示(今日/本周事故数、预测偏差、策略覆盖率等) |
+| 事故热点 | 识别高发路口,生成风险分级与整治建议 |
+| AI 分析 | 基于 DeepSeek API 生成专业分析报告和改进建议 |
+| 预测模型 | 支持 ARIMA、KNN、GLM、SVR 等多模型预测对比 |
+| 模型评估 | 对比各模型预测效果(RMSE、MAE 等指标) |
+| 异常检测 | 基于 Isolation Forest 算法检测异常事故点 |
+| 策略评估 | 评估单一交通策略实施效果 |
+| 策略对比 | 多策略效果横向对比分析 |
+| 情景模拟 | 模拟策略上线对事故趋势的影响 |
+### 技术亮点
+- 支持实时自动刷新
+- 交互式 Plotly 图表
+- 多格式数据导出(CSV、HTML)
+- Docker 容器化部署
+- 中文分词支持(jieba)
+## 项目结构
+```
+TrafficSafeAnalyzer/
+├── app.py               # 主应用入口
+├── services/            # 业务逻辑层
+│   ├── forecast.py      # 预测模型(ARIMA、KNN、GLM、SVR)
+│   ├── hotspot.py       # 热点分析
+│   ├── io.py            # 数据加载与清洗
+│   ├── metrics.py       # 模型评估指标
+│   └── strategy.py      # 策略评估
+├── ui_sections/         # UI 组件层
+│   ├── overview.py      # 总览页面
+│   ├── forecast.py      # 预测页面
+│   ├── model_eval.py    # 模型评估页面
+│   ├── strategy_eval.py # 策略评估页面
+│   └── hotspot.py       # 热点分析页面
+├── config/
+│   └── settings.py      # 配置参数
+├── docs/                # 文档
+│   ├── install.md       # 安装指南
+│   └── usage.md         # 使用说明
+├── Dockerfile           # Docker 配置
+├── requirements.txt     # Python 依赖
+└── environment.yml      # Conda 环境配置
+```
 ## 安装步骤
 ### 前提条件
-- Python 3.8+
+- Python 3.8+(推荐 3.12)
 - Git
 - 可选:Docker(用于容器化部署)
-### 安装
+### 方式一:本地安装
 1. 克隆仓库:
@@ -31,132 +73,213 @@ cd TrafficSafeAnalyzer
 2. 创建虚拟环境(推荐):
 ```bash
-conda create -n trafficsa python=3.8 -y
+# 使用 conda
+conda create -n trafficsa python=3.12 -y
 conda activate trafficsa
-pip install -r requirements.txt
-streamlit run app.py
+
+# 或使用 venv
+python -m venv venv
+source venv/bin/activate   # Linux/macOS
+# venv\Scripts\activate    # Windows
 ```
 3. 安装依赖:
-(1) 基本安装(必需依赖)
-```bash
-pip install streamlit pandas numpy matplotlib plotly scikit-learn statsmodels scipy
-```
-(2) 完整安装(包含所有可选依赖)
-```bash
-pip install -r requirements.txt
-```
-(3) 或者手动安装可选依赖
-```bash
-pip install streamlit-autorefresh openpyxl xlrd cryptography
-```
-(4) 运行应用:
-```bash
-streamlit run app.py
-```
-## 依赖项
-列于 `requirements.txt`:
-```txt
-streamlit>=1.20.0
-pandas>=1.3.0
-numpy>=1.21.0
-matplotlib>=3.4.0
-plotly>=5.0.0
-scikit-learn>=1.0.0
-statsmodels>=0.13.0
-scipy>=1.7.0
-streamlit-autorefresh>=0.1.5
-python-dateutil>=2.8.2
-pytz>=2021.3
-openpyxl>=3.0.9
-xlrd>=2.0.1
-cryptography>=3.4.7
-openai>=2.0.0
-```
-## 配置参数
-- **数据文件**:上传事故数据(`accident_file`)和策略数据(`strategy_file`),格式为 Excel;事故热点分析会直接复用事故数据,无需额外上传。
-- **环境变量**(可选):
-  - `LOG_LEVEL=DEBUG`:启用详细日志
-  - 示例:`export LOG_LEVEL=DEBUG`(Linux/macOS)或 `set LOG_LEVEL=DEBUG`(Windows)
-## 示例数据
-`sample/` 目录提供了脱敏示例数据,便于快速体验:
-- `sample/事故/*.xlsx`:按年份划分的事故记录
-- `sample/交通策略/*.xlsx`:策略发布记录
-使用前建议复制到临时位置再进行编辑。
-## 输入输出格式
-### 输入
-- **事故数据 Excel**:需包含 `事故时间`、`所在街道`、`事故类型` 列
-- **策略数据 Excel**:需包含 `发布时间`、`交通策略类型` 列
-### 输出
-- **预测结果**:CSV 文件(例如 `arima_forecast.csv`)
-- **图表**:HTML 文件(例如 `overview_series.html`)
-- **策略推荐**:文本文件(`recommendation.txt`)
-## 调用示例
-运行 Streamlit 应用:
+```bash
+pip install -r requirements.txt
+```
+4. 运行应用:
 ```bash
 streamlit run app.py
 ```
-访问 http://localhost:8501,上传数据文件并交互分析。
-## 常见问题排查
-**问题**:`ModuleNotFoundError: No module named 'streamlit'`
-**解决**:运行 `pip install -r requirements.txt` 或检查 Python 环境
-**问题**:数据加载失败
-**解决**:确保 Excel 文件格式正确,检查列名是否匹配
-**问题**:预测模型页面点击后图表未显示
-**解决**:确认干预日期之前至少有 10 条历史记录,或缩短预测天数重新提交
-**问题**:热点分析提示“请上传事故数据”
-**解决**:侧边栏上传事故数据后点击“应用数据与筛选”,热点模块会复用相同数据集
-## 日志分析
-- **日志文件**:`logs/app.log`(需在代码中配置 logging 模块)
-- **查看日志**:`tail -f logs/app.log`
-- **常见错误**:
-  - `ValueError`:检查输入数据格式
-  - `ConnectionError`:验证网络连接或文件路径
-## 升级说明
-- **当前版本**:v1.0.0
-- **升级步骤**:
-  1. 备份数据和配置文件
-  2. 拉取最新代码:`git pull origin main`
-  3. 更新依赖:`pip install -r requirements.txt --upgrade`
-  4. 重启应用:`streamlit run app.py`
-参考 `CHANGELOG.md` 查看版本变更详情。
+### 方式二:Docker 部署
+```bash
+# 构建镜像
+docker build -t trafficsafeanalyzer .
+# 运行容器
+docker run --rm -p 8501:8501 trafficsafeanalyzer
+```
+访问 `http://localhost:8501` 即可使用。
+如需挂载本地数据目录:
+```bash
+docker run --rm -p 8501:8501 \
+  -v "$(pwd)/data:/app/data" \
+  trafficsafeanalyzer
+```
+自定义端口:
+```bash
+docker run --rm -p 8080:8501 \
+  -e STREAMLIT_SERVER_PORT=8501 \
+  trafficsafeanalyzer
+```
+## 依赖项
+### 核心依赖
+| 包名 | 版本要求 | 用途 |
+|------|----------|------|
+| streamlit | >=1.20.0 | Web 应用框架 |
+| pandas | >=1.3.0 | 数据处理 |
+| numpy | >=1.21.0 | 数值计算 |
+| matplotlib | >=3.4.0 | 静态图表 |
+| plotly | >=5.0.0 | 交互式图表 |
+| scikit-learn | >=1.0.0 | 机器学习模型 |
+| statsmodels | >=0.13.0 | 统计模型(ARIMA) |
+### 可选依赖
+| 包名 | 用途 |
+|------|------|
+| scipy | 统计检验(t-test、Mann-Whitney U) |
+| streamlit-autorefresh | 页面自动刷新 |
+| openpyxl / xlrd | Excel 文件读写 |
+| openai | AI 分析(兼容 DeepSeek API) |
+| jieba | 中文分词 |
+| cryptography | 安全加密 |
+## 使用说明
+### 数据格式要求
+**事故数据 Excel**
+| 必需列 | 说明 |
+|--------|------|
+| 事故时间 | 事故发生时间 |
+| 所在街道 | 事故地点 |
+| 事故类型 | 事故分类 |
+可选列:`region`(区域)、严重程度等
+**策略数据 Excel**
+| 必需列 | 说明 |
+|--------|------|
+| 发布时间 | 策略发布日期 |
+| 交通策略类型 | 策略分类 |
+### 基本操作流程
+1. 启动应用后在左侧边栏上传事故数据和策略数据(Excel 格式)
+2. 设置全局筛选器:区域、时间范围、策略类型
+3. 点击"应用数据与筛选"按钮加载数据
+4. 在顶部标签页切换不同功能模块进行分析
+### AI 分析配置
+系统使用 DeepSeek API 进行 AI 智能分析:
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| API Key | 预填示例密钥 | 可在侧边栏替换为自有密钥 |
+| Base URL | `https://api.deepseek.com` | DeepSeek API 地址 |
+AI 分析功能可生成:
+- 核心指标洞察
+- 策略绩效评估
+- 短期/中期/长期优化建议
+### 输出文件
+| 类型 | 文件名示例 | 说明 |
+|------|------------|------|
+| 预测结果 | `arima_forecast.csv` | ARIMA 模型预测数据 |
+| 模型评估 | `model_evaluation.csv` | 各模型指标对比 |
+| 异常检测 | `anomalies.csv` | 异常日期列表 |
+| 策略对比 | `strategy_compare.csv` | 策略效果对比表 |
+| 交互图表 | `simulation.html` | Plotly 图表导出 |
+## 配置参数
+### 环境变量
+| 变量名 | 说明 | 默认值 |
+|--------|------|--------|
+| `LOG_LEVEL` | 日志级别 | INFO |
+| `STREAMLIT_SERVER_PORT` | 服务端口 | 8501 |
+| `STREAMLIT_SERVER_HEADLESS` | 无头模式 | true(Docker 中) |
+### 模型参数
+配置文件:`config/settings.py`
+```python
+# ARIMA 参数搜索范围
+ARIMA_P = range(0, 4)
+ARIMA_D = range(0, 2)
+ARIMA_Q = range(0, 4)
+# 预测与评估
+DEFAULT_HORIZON_PREDICT = 30   # 默认预测天数
+DEFAULT_HORIZON_EVAL = 14      # 默认评估窗口
+MIN_PRE_DAYS = 5               # 最小历史数据天数
+MAX_PRE_DAYS = 120             # 最大历史数据天数
+# 异常检测
+ANOMALY_N_ESTIMATORS = 50      # Isolation Forest 估计器数量
+ANOMALY_CONTAMINATION = 0.10   # 预期异常比例
+```
+## 常见问题
+| 问题 | 解决方案 |
+|------|----------|
+| `ModuleNotFoundError` | 运行 `pip install -r requirements.txt` |
+| 数据加载失败 | 检查 Excel 文件格式,确保包含必需列名 |
+| 预测图表未显示 | 确保干预日期前至少有 10 条历史数据 |
+| AI 分析无响应 | 检查 API Key 有效性及网络连接 |
+| 热点分析提示无数据 | 先上传事故数据并点击"应用数据与筛选" |
+## 更新日志
+参见 [CHANGELOG.md](CHANGELOG.md)
+**当前版本**:v1.3.0
+### v1.3.0 主要更新
+- 集成 DeepSeek AI 分析功能(流式输出)
+- 新增事故热点分析模块
+- 优化预测模型性能
+- 支持 Docker 容器化部署
+- 改进数据可视化交互体验
+- 修复多标签页导航状态问题
+## 升级指南
+```bash
+# 备份现有数据
+cp -r data data_backup
+# 拉取最新代码
+git pull origin main
+# 更新依赖
+pip install -r requirements.txt --upgrade
+# 重启应用
+streamlit run app.py
+```
 ## 许可证
-MIT License - 详见 LICENSE 文件。
-[![GitHub license](https://img.shields.io/github/license/tongnian0613/repo)](https://github.com/tongnian0613/TrafficSafeAnalyzer/LICENSE)
-[![Build Status](https://img.shields.io/travis/username/repo)](https://travis-ci.org/tongnian0613/repo)
+MIT License - 详见 [LICENSE](LICENSE)
+## 贡献
+欢迎提交 Issue 和 Pull Request。
+---
+[![GitHub license](https://img.shields.io/github/license/tongnian0613/TrafficSafeAnalyzer)](https://github.com/tongnian0613/TrafficSafeAnalyzer/blob/main/LICENSE)

View File

@@ -1,7 +1,7 @@
 from __future__ import annotations
 from datetime import datetime
-from typing import Iterable
+from typing import Iterable, Optional
 import numpy as np
 import pandas as pd
@@ -211,11 +211,24 @@ def generate_hotspot_strategies(
     return strategies
-def serialise_datetime_columns(df: pd.DataFrame, columns: Iterable[str]) -> pd.DataFrame:
+def serialise_datetime_columns(df: pd.DataFrame, columns: Optional[Iterable[str]] = None) -> pd.DataFrame:
     result = df.copy()
+    if columns is None:
+        columns = result.columns
     for column in columns:
-        if column in result.columns and pd.api.types.is_datetime64_any_dtype(result[column]):
-            result[column] = result[column].dt.strftime("%Y-%m-%d %H:%M:%S")
+        if column not in result.columns:
+            continue
+        series = result[column]
+        if pd.api.types.is_datetime64_any_dtype(series):
+            result[column] = series.dt.strftime("%Y-%m-%d %H:%M:%S")
+        else:
+            has_timestamp = series.map(lambda value: isinstance(value, (datetime, pd.Timestamp))).any()
+            if has_timestamp:
+                result[column] = series.map(
+                    lambda value: value.strftime("%Y-%m-%d %H:%M:%S")
+                    if isinstance(value, (datetime, pd.Timestamp))
+                    else value
+                )
     return result
@@ -224,4 +237,3 @@ def _mode_fallback(series: pd.Series) -> str:
         return ""
     mode = series.mode()
     return str(mode.iloc[0]) if not mode.empty else str(series.iloc[0])
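With the new default, callers can omit `columns=` entirely. A small illustration (the DataFrame below is made up, and the import path assumes the function stays in `services/hotspot.py`):
```python
import pandas as pd
from services.hotspot import serialise_datetime_columns  # assumed module path

df = pd.DataFrame({
    "事故时间": pd.to_datetime(["2025-01-01 08:00", "2025-01-02 09:30"]),
    "备注": ["正常", pd.Timestamp("2025-01-02 09:30")],  # object column mixing text and a Timestamp
})
out = serialise_datetime_columns(df)  # columns= can now be omitted
# Both the datetime64 column and the Timestamp hidden inside the object column
# come back as "%Y-%m-%d %H:%M:%S" strings; non-timestamp values are left untouched.
```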

View File

@@ -154,10 +154,7 @@ def render_hotspot(accident_records, accident_source_name: str | None) -> None:
     )
     with download_cols[1]:
-        serializable = serialise_datetime_columns(
-            top_hotspots.reset_index(),
-            columns=[col for col in top_hotspots.columns if "time" in col or "date" in col],
-        )
+        serializable = serialise_datetime_columns(top_hotspots.reset_index())
         report_payload = {
             "analysis_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
             "time_window": time_window,
@@ -186,4 +183,3 @@ def render_hotspot(accident_records, accident_source_name: str | None) -> None:
     preview_cols = ["事故时间", "所在街道", "事故类型", "事故具体地点", "道路类型"]
     preview_df = hotspot_data[preview_cols].copy()
     st.dataframe(preview_df.head(10), use_container_width=True)