From 1a6b4e203b3fdfe915d1675f23dc86597fcb1ead Mon Sep 17 00:00:00 2001
From: jongjae <whdwo798@naver.com>
Date: Thu, 28 May 2026 20:39:37 +0900
Subject: [PATCH] [2026-05-28] Update operational docs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
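
Reviewer note: the docs in this patch state that training features exclude the future/outcome columns `ret_*`, `mfe_*`, `mae_*`, `price_*`, `pnl`, and `exit_price`. A minimal sketch of that filter follows. It is an illustration only, not the code in `scripts/train_ai_model.py`: the helper names and the sample column names are hypothetical, and only the six patterns come from the docs.

```python
from fnmatch import fnmatchcase

# Patterns copied from the docs in this patch; everything else is illustrative.
LEAKAGE_PATTERNS = ("ret_*", "mfe_*", "mae_*", "price_*", "pnl", "exit_price")


def is_leakage_column(name: str) -> bool:
    """Return True if the column name matches any future/outcome pattern."""
    return any(fnmatchcase(name, pattern) for pattern in LEAKAGE_PATTERNS)


def select_feature_columns(columns: list[str], label_columns: set[str]) -> list[str]:
    """Keep columns that are neither labels nor leakage columns (hypothetical helper)."""
    return [
        column
        for column in columns
        if column not in label_columns and not is_leakage_column(column)
    ]


if __name__ == "__main__":
    # Hypothetical column names, not the real dataset schema.
    sample_columns = [
        "volume_ratio", "gap_pct", "ret_60s", "mfe_300s", "price_600s",
        "pnl", "exit_price", "label_win", "label_stop_loss",
    ]
    print(select_feature_columns(sample_columns, {"label_win", "label_stop_loss"}))
```

Matching on name patterns rather than a fixed column list means a newly added post-entry column such as a further `ret_*` horizon is excluded without a code change, provided it follows the same naming.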
 AGENTS.md                                     | 24 +++++++----
 CLAUDE.md                                     | 24 +++++++----
 README.md                                     | 10 ++++-
 reports/daily/2026-05-28.md                   | 30 ++++++++++++-
 reports/implementation_log.md                 | 42 +++++++++++++++++++
 .../proposals/2026-05-28_strategy_proposal.md | 28 ++++++++-----
 6 files changed, 129 insertions(+), 29 deletions(-)
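
Reviewer note: the sector handling described in this patch (sector field from the ranking row when present, then the `ticker_sectors` cache, then a conservative name-based hint) can be sketched as below. This is an assumption-laden illustration, not the code in `app/main.py`: the `sector` row key, the hint table, the function names, and the placeholder tickers are all hypothetical. Only the 대우건설 to 건설업 inference is mentioned in the proposal's validation notes.

```python
from typing import Optional

# Hypothetical hint table: substring of the stock name -> sector label.
NAME_SECTOR_HINTS = {"건설": "건설업"}


def resolve_sector(
    ticker: str,
    name: str,
    ranking_row: dict,
    ticker_sectors: dict,
) -> Optional[str]:
    """Resolve a sector: ranking row first, then cache, then name hint."""
    sector = ranking_row.get("sector")  # assumed field name
    if sector:
        ticker_sectors[ticker] = sector  # remember it for rows without the field
        return sector
    if ticker in ticker_sectors:
        return ticker_sectors[ticker]
    for keyword, hinted_sector in NAME_SECTOR_HINTS.items():
        if keyword in name:
            return hinted_sector  # a hint is a guess, so it is not cached
    return None


def is_avoided(sector: Optional[str], avoid_sectors: set) -> bool:
    """An unknown sector is not treated as avoided in this sketch."""
    return sector is not None and sector in avoid_sectors


if __name__ == "__main__":
    cache: dict = {}
    # "000000" is a placeholder ticker, not a real code.
    sector = resolve_sector("000000", "대우건설", {}, cache)
    print(sector, is_avoided(sector, {"건설업"}))
```

Keeping name-based hints out of the cache is one way to stay conservative: a later ranking row that does carry a sector field always wins over a guess.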

diff --git a/AGENTS.md b/AGENTS.md
index ed4f999..c219b93 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -11,33 +11,41 @@
 - Parameter changes must be written to `reports/proposals/YYYY-MM-DD_strategy_proposal.md` with evidence and manual approval required.
 - `FORCE_EXIT = "14:50"` remains immutable.
 
-## Current implementation status - 2026-05-27
+## Current implementation status - 2026-05-28
 
 - Mode: paper trading / dry-run focused. Real-cash trading is not approved yet.
+- Approved strategy change applied: `ENTRY_START = "09:20"`.
+- `FORCE_EXIT = "14:50"` remains unchanged.
+- `avoid_sectors` runtime bug fixed: `main.py` now passes sector context into `check_entry()`.
+- Sector handling now keeps a `ticker_sectors` cache when available and uses name-based hints for known avoid-sector cases.
 - Data layer: `entry_snapshots` and `post_entry_snapshots` are active for training data.
 - Post-entry sampling: 60s, 180s, 300s, and 600s after successful entry.
 - Training data export: `scripts/export_training_dataset.py`.
 - External data collection:
-  - Daily market features: `scripts/collect_daily_features.py`.
-  - KIS minute bars: `scripts/collect_minute_data.py`.
-  - External dataset builder: `scripts/build_external_training_dataset.py`.
+  - Daily market features: `scripts/collect_daily_features.py` with KIS fallback when pykrx fails.
+  - KIS minute bars: `scripts/collect_minute_data.py`, default ETF/ETN exclusion, multi-window collection from 09:30 to 14:00.
+  - External dataset builder: `scripts/build_external_training_dataset.py`, using prior daily OHLCV for breakout targets.
 - ML engine:
   - Training: `scripts/train_ai_model.py`.
   - Model output: `models/scalping_model.joblib`.
   - Metrics output: `models/scalping_model.metrics.json`.
   - Runtime loader: `app/ml/predictor.py`.
+- Latest training run: 2026-05-28 20:24, 3,156 rows total (`external_training_dataset.csv` 3,146 + bot dataset 10).
+- Latest metrics: `label_stop_loss` ROC-AUC 0.851, `label_win` ROC-AUC 0.719.
+- Training features exclude future/outcome leakage columns such as `ret_*`, `mfe_*`, `mae_*`, `price_*`, `pnl`, and `exit_price`.
 - AI runtime mode: observation only. If a model exists, entry-time scores are logged and saved to `entry_snapshots`; they do not block or resize trades.
-- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`.
+- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`; the pipeline was end-to-end verified on 2026-05-28.
 - Dependency install:
   - `requirements.txt` includes `app/requirements.txt`.
   - `scripts/install_dependencies.ps1` installs from `vendor/wheels` when available.
   - `scripts/download_dependencies.ps1` builds the local wheelhouse.
 
-## Current operational risks - 2026-05-27
+## Current operational risks - 2026-05-28
 
-- KIS minute-bar endpoint must be verified with live response logs.
-- Early ML models may be meaningless until enough labeled rows exist.
+- Model is still observation-only and is dominated by external pretraining rows; bot-trade labels are only 10 rows.
+- KIS minute-bar collection is verified for same-day windows, but historical depth and ticker coverage remain limited.
 - External minute data is pretraining data, not actual bot-trade data.
+- pykrx daily feature collection can fail for same-day data; KIS fallback is active.
 - Real-cash mode still needs stronger fill, partial-fill, unfilled-order, cancel/replace, and recovery logic.
 - Existing logs and older docs contain encoding damage; new operational notes should stay ASCII unless the file encoding is intentionally cleaned.
 
diff --git a/CLAUDE.md b/CLAUDE.md
index d8e46d3..0c6979a 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -11,33 +11,41 @@
 - Parameter changes must be written to `reports/proposals/YYYY-MM-DD_strategy_proposal.md` with evidence and manual approval required.
 - `FORCE_EXIT = "14:50"` remains immutable.
 
-## Current implementation status - 2026-05-27
+## Current implementation status - 2026-05-28
 
 - Mode: paper trading / dry-run focused. Real-cash trading is not approved yet.
+- Approved strategy change applied: `ENTRY_START = "09:20"`.
+- `FORCE_EXIT = "14:50"` remains unchanged.
+- `avoid_sectors` runtime bug fixed: `main.py` now passes sector context into `check_entry()`.
+- Sector handling now keeps a `ticker_sectors` cache when available and uses name-based hints for known avoid-sector cases.
 - Data layer: `entry_snapshots` and `post_entry_snapshots` are active for training data.
 - Post-entry sampling: 60s, 180s, 300s, and 600s after successful entry.
 - Training data export: `scripts/export_training_dataset.py`.
 - External data collection:
-  - Daily market features: `scripts/collect_daily_features.py`.
-  - KIS minute bars: `scripts/collect_minute_data.py`.
-  - External dataset builder: `scripts/build_external_training_dataset.py`.
+  - Daily market features: `scripts/collect_daily_features.py` with KIS fallback when pykrx fails.
+  - KIS minute bars: `scripts/collect_minute_data.py`, default ETF/ETN exclusion, multi-window collection from 09:30 to 14:00.
+  - External dataset builder: `scripts/build_external_training_dataset.py`, using prior daily OHLCV for breakout targets.
 - ML engine:
   - Training: `scripts/train_ai_model.py`.
   - Model output: `models/scalping_model.joblib`.
   - Metrics output: `models/scalping_model.metrics.json`.
   - Runtime loader: `app/ml/predictor.py`.
+- Latest training run: 2026-05-28 20:24, 3,156 rows total (`external_training_dataset.csv` 3,146 + bot dataset 10).
+- Latest metrics: `label_stop_loss` ROC-AUC 0.851, `label_win` ROC-AUC 0.719.
+- Training features exclude future/outcome leakage columns such as `ret_*`, `mfe_*`, `mae_*`, `price_*`, `pnl`, and `exit_price`.
 - AI runtime mode: observation only. If a model exists, entry-time scores are logged and saved to `entry_snapshots`; they do not block or resize trades.
-- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`.
+- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`; the pipeline was end-to-end verified on 2026-05-28.
 - Dependency install:
   - `requirements.txt` includes `app/requirements.txt`.
   - `scripts/install_dependencies.ps1` installs from `vendor/wheels` when available.
   - `scripts/download_dependencies.ps1` builds the local wheelhouse.
 
-## Current operational risks - 2026-05-27
+## Current operational risks - 2026-05-28
 
-- KIS minute-bar endpoint must be verified with live response logs.
-- Early ML models may be meaningless until enough labeled rows exist.
+- Model is still observation-only and is dominated by external pretraining rows; bot-trade labels are only 10 rows.
+- KIS minute-bar collection is verified for same-day windows, but historical depth and ticker coverage remain limited.
 - External minute data is pretraining data, not actual bot-trade data.
+- pykrx daily feature collection can fail for same-day data; KIS fallback is active.
 - Real-cash mode still needs stronger fill, partial-fill, unfilled-order, cancel/replace, and recovery logic.
 - Existing logs and older docs contain encoding damage; new operational notes should stay ASCII unless the file encoding is intentionally cleaned.
 
diff --git a/README.md b/README.md
index f2ec0ab..67a9cc3 100644
--- a/README.md
+++ b/README.md
@@ -47,19 +47,27 @@ docker-compose --profile emergency up kill-switch
 # or
 python kill_switch/kill.py
 ```
-# StockBot Current Status - 2026-05-27
+# StockBot Current Status - 2026-05-28
 
 This project is currently a paper-trading scalping bot with an AI training
 pipeline in observation mode.
 
 Active:
 - Windows Task Scheduler operation for morning, midday, evening, watchdog, and training jobs.
+- Approved `ENTRY_START = "09:20"` after the 2026-05-28 evening review.
+- `FORCE_EXIT = "14:50"` remains unchanged.
+- `avoid_sectors` filtering is active in runtime entry checks.
 - Entry snapshots for model training.
 - Post-entry snapshots at 1m, 3m, 5m, and 10m.
 - Bot-data export to `data/training_dataset.csv`.
 - External daily/minute data collection for pretraining.
+- External daily collection falls back to KIS when pykrx same-day data fails.
+- KIS minute collection excludes ETF/ETN by default and collects multiple windows from 09:30 to 14:00.
 - RandomForest-based training engine.
 - Optional AI entry scoring when `models/scalping_model.joblib` exists.
+- Latest verified training run: 2026-05-28 20:24.
+- Latest training rows: 3,156 total, including 3,146 external pretraining rows and 10 bot-trade rows.
+- Latest metrics: `label_stop_loss` ROC-AUC 0.851, `label_win` ROC-AUC 0.719.
 
 Not active yet:
 - AI does not block buys.
diff --git a/reports/daily/2026-05-28.md b/reports/daily/2026-05-28.md
index 97fdf2f..257a368 100644
--- a/reports/daily/2026-05-28.md
+++ b/reports/daily/2026-05-28.md
@@ -100,6 +100,32 @@ After midday, only the two existing holdings (흥아해운, SFA반도체) were closed.
 
 ## Operating notes for tomorrow
 
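-- ENTRY_START proposal (09:20) awaiting approval → if not approved, keep the current 09:15
-- Sector filter bug fix awaiting approval → if not fixed, avoid_sectors stays inactive
+- ENTRY_START proposal (09:20) approved and applied.
+- Sector filter bug fix approved and applied. Sector context is now passed to `check_entry()`, so `avoid_sectors` takes effect.
 - Morning signal confidence is low: start in the L3-B 0.3× state (if the previous day's consecutive stop-losses have not recovered)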
+
+---
+
+## Follow-up changes applied - 2026-05-28
+
+The following changes were applied after user approval.
+
+- `app/config.py`: `ENTRY_START` 09:15 → 09:20.
+- `app/main.py`: added the `ticker_sectors` cache and passed `sector` to `check_entry()`.
+- `app/main.py`: added conservative, stock-name-based avoid-sector hints for when a KIS ranking row has no sector field.
+- Confirmed `FORCE_EXIT = "14:50"` is unchanged.
+- Repaired the external-data AI pretraining pipeline and verified it end-to-end.
+- Removed future/outcome leakage columns from the training features: `price_*`, `ret_*`, `mfe_*`, `mae_*`, `pnl`, `exit_price`.
+- Deleted the unused `scripts/_send_midday_discord.py`.
+
+Training results:
+- `data/external_training_dataset.csv`: 3,146 rows.
+- `data/training_dataset.csv`: 10 rows.
+- `models/scalping_model.joblib`: generated.
+- `models/scalping_model.metrics.json`: generated 2026-05-28 20:24.
+- `label_stop_loss`: ROC-AUC 0.851, accuracy 0.750.
+- `label_win`: ROC-AUC 0.719, accuracy 0.635.
+
+Caveats:
+- AI remains observation-only.
+- The current model is dominated by external pretraining rows, with only 10 actual bot rows, so it is not used for entry blocking or position sizing.
diff --git a/reports/implementation_log.md b/reports/implementation_log.md
index d84e4bd..4ebc1fb 100644
--- a/reports/implementation_log.md
+++ b/reports/implementation_log.md
@@ -1,5 +1,47 @@
 # Implementation Log
 
+## 2026-05-28
+
+- Applied the approved 2026-05-28 strategy update:
+  - `ENTRY_START` changed from `"09:15"` to `"09:20"`.
+  - `FORCE_EXIT = "14:50"` was verified unchanged.
+- Fixed the `avoid_sectors` runtime bug:
+  - `app/main.py` now passes `sector` into `VolatilityBreakout.check_entry()`.
+  - Added `ticker_sectors` cache support from ranking rows when sector fields exist.
+  - Added conservative name-based avoid-sector hints for cases such as construction names when no sector field is available.
+- Repaired the external-data pretraining path:
+  - `scripts/collect_daily_features.py` now falls back to KIS daily OHLCV when pykrx fails.
+  - `scripts/collect_minute_data.py` excludes ETF/ETN by default and collects multiple intraday windows from 09:30 to 14:00.
+  - `scripts/build_external_training_dataset.py` now uses prior daily OHLCV rows for breakout targets instead of same-day OHLCV.
+  - `scripts/run_training_pipeline.ps1` builds external rows with `--all-minutes` for pretraining.
+- Removed model leakage:
+  - Excluded future/outcome columns from training features: `price_*`, `ret_*`, `mfe_*`, `mae_*`, `pnl`, and `exit_price`.
+- Fixed PowerShell training pipeline execution:
+  - Replaced `$Args` parameter usage with `$StepArgs` to avoid PowerShell automatic-variable collision.
+  - Prevented non-empty stderr output from aborting the script before the required exit-code handling runs.
+  - Normalized Python step logging to UTF-8 append.
+- Removed unused helper:
+  - `scripts/_send_midday_discord.py`.
+
+Validation performed:
+- `python -m compileall app scripts` passed.
+- Manual external daily collection passed through KIS fallback.
+- Manual KIS minute collection saved 11 regular-stock CSV files for 2026-05-28.
+- `data/external_training_dataset.csv` generated 3,146 rows.
+- `data/training_dataset.csv` generated 10 bot-trade rows.
+- `python scripts/train_ai_model.py` generated `models/scalping_model.joblib` and metrics.
+- `powershell -ExecutionPolicy Bypass -File scripts\run_training_pipeline.ps1` passed end-to-end.
+
+Latest training metrics:
+- `label_stop_loss`: rows 3,156, accuracy 0.750, precision 0.450, ROC-AUC 0.851.
+- `label_win`: rows 3,156, accuracy 0.635, precision 0.492, ROC-AUC 0.719.
+
+Open risks:
+- AI remains observation-only and must not block entries, resize trades, or override exits.
+- Training is still dominated by external pretraining rows; actual bot-labeled rows are only 10.
+- Same-day pykrx data may fail; KIS fallback is active but index rows can be empty.
+- Real-cash trading remains unapproved.
+
 ## 2026-05-27
 
 - Reviewed the stock scalping bot structure and moved it toward an AI-training-ready paper-trading platform.
diff --git a/reports/proposals/2026-05-28_strategy_proposal.md b/reports/proposals/2026-05-28_strategy_proposal.md
index 46805ce..9cdef46 100644
--- a/reports/proposals/2026-05-28_strategy_proposal.md
+++ b/reports/proposals/2026-05-28_strategy_proposal.md
@@ -1,8 +1,15 @@
 # Strategy parameter change proposal - 2026-05-28
 
-**Status: awaiting approval (manual review required)**
+**Status: approved and applied (2026-05-28)**
 **Author: Claude Evening / 2026-05-28**
 
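+Applied results:
+- Proposal 1 applied: `ENTRY_START = "09:20"`.
+- Proposal 2 applied: `app/main.py` passes `sector` when calling `check_entry()`.
+- Added the `ticker_sectors` cache and conservative stock-name-based avoid-sector hints.
+- `FORCE_EXIT = "14:50"` unchanged.
+- Validation: compile check, confirmed that 대우건설 is inferred as the 건설업 (construction) sector, and the training pipeline passed an end-to-end run.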
+
 ---
 
 ## Proposal 1: delay ENTRY_START further - 09:15 → 09:20
@@ -104,8 +111,8 @@ signal = self.strategy.check_entry(
 **Bug confirmed.** Verified at the code level. The fix takes effect immediately.
 
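 ### Risks
-- Requires a main.py change, so separate approval is needed
-- Still a gap when sector data is unavailable; ticker_sectors build logic must be added
+- The main.py change was applied after user approval.
+- Because missing sector data can leave a gap, the `ticker_sectors` cache and conservative stock-name-based hints were applied together.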
 
 ---
 
@@ -113,15 +120,16 @@ signal = self.strategy.check_entry(
 
 | Rank | Proposal | Difficulty | Apply immediately |
 |------|------|--------|-----------|
-| 1 | Sector filter bug fix | Medium (main.py) | No (approval required) |
-| 2 | ENTRY_START → 09:20 | Low (config.py) | Yes |
+| 1 | Sector filter bug fix | Medium (main.py) | Applied |
+| 2 | ENTRY_START → 09:20 | Low (config.py) | Applied |
 
-The bug fix matters more but needs approval. ENTRY_START can be applied right away.
+Both proposals were applied after user approval.
 
 ---
 
-## Review items (before approval)
+## Review items (after applying)
 
-- [ ] Decide where to populate `ticker_sectors` data (KIS stock master API or during universe fetch)
-- [ ] Choose option A vs B
-- [ ] Observe for at least 3 trading days after applying ENTRY_START 09:20
+- [x] Added the `ticker_sectors` cache.
+- [x] Passed sector to `check_entry()`.
+- [x] Applied ENTRY_START 09:20.
+- [ ] Observe for at least 3 trading days after applying ENTRY_START 09:20.