From 1a6b4e203b3fdfe915d1675f23dc86597fcb1ead Mon Sep 17 00:00:00 2001
From: jongjae
Date: Thu, 28 May 2026 20:39:37 +0900
Subject: [PATCH] [2026-05-28] Update operational docs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 AGENTS.md                                     | 24 +++++++----
 CLAUDE.md                                     | 24 +++++++----
 README.md                                     | 10 ++++-
 reports/daily/2026-05-28.md                   | 30 ++++++++++++-
 reports/implementation_log.md                 | 42 +++++++++++++++++++
 .../proposals/2026-05-28_strategy_proposal.md | 28 ++++++++-----
 6 files changed, 129 insertions(+), 29 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index ed4f999..c219b93 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -11,33 +11,41 @@
 - Parameter changes must be written to `reports/proposals/YYYY-MM-DD_strategy_proposal.md` with evidence and manual approval required.
 - `FORCE_EXIT = "14:50"` remains immutable.
 
-## Current implementation status - 2026-05-27
+## Current implementation status - 2026-05-28
 
 - Mode: paper trading / dry-run focused. Real-cash trading is not approved yet.
+- Approved strategy change applied: `ENTRY_START = "09:20"`.
+- `FORCE_EXIT = "14:50"` remains unchanged.
+- `avoid_sectors` runtime bug fixed: `main.py` now passes sector context into `check_entry()`.
+- Sector handling now keeps a `ticker_sectors` cache when available and uses name-based hints for known avoid-sector cases.
 - Data layer: `entry_snapshots` and `post_entry_snapshots` are active for training data.
 - Post-entry sampling: 60s, 180s, 300s, and 600s after successful entry.
 - Training data export: `scripts/export_training_dataset.py`.
 - External data collection:
-  - Daily market features: `scripts/collect_daily_features.py`.
-  - KIS minute bars: `scripts/collect_minute_data.py`.
-  - External dataset builder: `scripts/build_external_training_dataset.py`.
+  - Daily market features: `scripts/collect_daily_features.py` with KIS fallback when pykrx fails.
+  - KIS minute bars: `scripts/collect_minute_data.py`, default ETF/ETN exclusion, multi-window collection from 09:30 to 14:00.
+  - External dataset builder: `scripts/build_external_training_dataset.py`, using prior daily OHLCV for breakout targets.
 - ML engine:
   - Training: `scripts/train_ai_model.py`.
   - Model output: `models/scalping_model.joblib`.
   - Metrics output: `models/scalping_model.metrics.json`.
   - Runtime loader: `app/ml/predictor.py`.
+- Latest training run: 2026-05-28 20:24, 3,156 rows total (`external_training_dataset.csv` 3,146 + bot dataset 10).
+- Latest metrics: `label_stop_loss` ROC-AUC 0.851, `label_win` ROC-AUC 0.719.
+- Training features exclude future/outcome leakage columns such as `ret_*`, `mfe_*`, `mae_*`, `price_*`, `pnl`, and `exit_price`.
 - AI runtime mode: observation only. If a model exists, entry-time scores are logged and saved to `entry_snapshots`; they do not block or resize trades.
-- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`.
+- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`; the pipeline was end-to-end verified on 2026-05-28.
 - Dependency install:
   - `requirements.txt` includes `app/requirements.txt`.
   - `scripts/install_dependencies.ps1` installs from `vendor/wheels` when available.
   - `scripts/download_dependencies.ps1` builds the local wheelhouse.
 
-## Current operational risks - 2026-05-27
+## Current operational risks - 2026-05-28
 
-- KIS minute-bar endpoint must be verified with live response logs.
-- Early ML models may be meaningless until enough labeled rows exist.
+- Model is still observation-only and is dominated by external pretraining rows; bot-trade labels are only 10 rows.
+- KIS minute-bar collection is verified for same-day windows, but historical depth and ticker coverage remain limited.
 - External minute data is pretraining data, not actual bot-trade data.
+- pykrx daily feature collection can fail for same-day data; KIS fallback is active.
 - Real-cash mode still needs stronger fill, partial-fill, unfilled-order, cancel/replace, and recovery logic.
 - Existing logs and older docs contain encoding damage; new operational notes should stay ASCII unless the file encoding is intentionally cleaned.
 
diff --git a/CLAUDE.md b/CLAUDE.md
index d8e46d3..0c6979a 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -11,33 +11,41 @@
 - Parameter changes must be written to `reports/proposals/YYYY-MM-DD_strategy_proposal.md` with evidence and manual approval required.
 - `FORCE_EXIT = "14:50"` remains immutable.
 
-## Current implementation status - 2026-05-27
+## Current implementation status - 2026-05-28
 
 - Mode: paper trading / dry-run focused. Real-cash trading is not approved yet.
+- Approved strategy change applied: `ENTRY_START = "09:20"`.
+- `FORCE_EXIT = "14:50"` remains unchanged.
+- `avoid_sectors` runtime bug fixed: `main.py` now passes sector context into `check_entry()`.
+- Sector handling now keeps a `ticker_sectors` cache when available and uses name-based hints for known avoid-sector cases.
 - Data layer: `entry_snapshots` and `post_entry_snapshots` are active for training data.
 - Post-entry sampling: 60s, 180s, 300s, and 600s after successful entry.
 - Training data export: `scripts/export_training_dataset.py`.
 - External data collection:
-  - Daily market features: `scripts/collect_daily_features.py`.
-  - KIS minute bars: `scripts/collect_minute_data.py`.
-  - External dataset builder: `scripts/build_external_training_dataset.py`.
+  - Daily market features: `scripts/collect_daily_features.py` with KIS fallback when pykrx fails.
+  - KIS minute bars: `scripts/collect_minute_data.py`, default ETF/ETN exclusion, multi-window collection from 09:30 to 14:00.
+  - External dataset builder: `scripts/build_external_training_dataset.py`, using prior daily OHLCV for breakout targets.
 - ML engine:
   - Training: `scripts/train_ai_model.py`.
   - Model output: `models/scalping_model.joblib`.
   - Metrics output: `models/scalping_model.metrics.json`.
   - Runtime loader: `app/ml/predictor.py`.
+- Latest training run: 2026-05-28 20:24, 3,156 rows total (`external_training_dataset.csv` 3,146 + bot dataset 10).
+- Latest metrics: `label_stop_loss` ROC-AUC 0.851, `label_win` ROC-AUC 0.719.
+- Training features exclude future/outcome leakage columns such as `ret_*`, `mfe_*`, `mae_*`, `price_*`, `pnl`, and `exit_price`.
 - AI runtime mode: observation only. If a model exists, entry-time scores are logged and saved to `entry_snapshots`; they do not block or resize trades.
-- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`.
+- Training schedule: `StockBot_Training` runs at 16:00 on trading days via `scripts/run_training_pipeline.ps1`; the pipeline was end-to-end verified on 2026-05-28.
 - Dependency install:
   - `requirements.txt` includes `app/requirements.txt`.
   - `scripts/install_dependencies.ps1` installs from `vendor/wheels` when available.
   - `scripts/download_dependencies.ps1` builds the local wheelhouse.
 
-## Current operational risks - 2026-05-27
+## Current operational risks - 2026-05-28
 
-- KIS minute-bar endpoint must be verified with live response logs.
-- Early ML models may be meaningless until enough labeled rows exist.
+- Model is still observation-only and is dominated by external pretraining rows; bot-trade labels are only 10 rows.
+- KIS minute-bar collection is verified for same-day windows, but historical depth and ticker coverage remain limited.
 - External minute data is pretraining data, not actual bot-trade data.
+- pykrx daily feature collection can fail for same-day data; KIS fallback is active.
 - Real-cash mode still needs stronger fill, partial-fill, unfilled-order, cancel/replace, and recovery logic.
 - Existing logs and older docs contain encoding damage; new operational notes should stay ASCII unless the file encoding is intentionally cleaned.
 
diff --git a/README.md b/README.md
index f2ec0ab..67a9cc3 100644
--- a/README.md
+++ b/README.md
@@ -47,19 +47,27 @@ docker-compose --profile emergency up kill-switch
 # or
 python kill_switch/kill.py
 ```
-# StockBot Current Status - 2026-05-27
+# StockBot Current Status - 2026-05-28
 
 This project is currently a paper-trading scalping bot with an AI training pipeline in observation mode.
 
 Active:
 - Windows Task Scheduler operation for morning, midday, evening, watchdog, and training jobs.
+- Approved `ENTRY_START = "09:20"` after the 2026-05-28 evening review.
+- `FORCE_EXIT = "14:50"` remains unchanged.
+- `avoid_sectors` filtering is active in runtime entry checks.
 - Entry snapshots for model training.
 - Post-entry snapshots at 1m, 3m, 5m, and 10m.
 - Bot-data export to `data/training_dataset.csv`.
 - External daily/minute data collection for pretraining.
+- External daily collection falls back to KIS when pykrx same-day data fails.
+- KIS minute collection excludes ETF/ETN by default and collects multiple windows from 09:30 to 14:00.
 - RandomForest-based training engine.
 - Optional AI entry scoring when `models/scalping_model.joblib` exists.
+- Latest verified training run: 2026-05-28 20:24.
+- Latest training rows: 3,156 total, including 3,146 external pretraining rows and 10 bot-trade rows.
+- Latest metrics: `label_stop_loss` ROC-AUC 0.851, `label_win` ROC-AUC 0.719.
 
 Not active yet:
 - AI does not block buys.
diff --git a/reports/daily/2026-05-28.md b/reports/daily/2026-05-28.md
index 97fdf2f..257a368 100644
--- a/reports/daily/2026-05-28.md
+++ b/reports/daily/2026-05-28.md
@@ -100,6 +100,32 @@ After midday, only the two existing holdings (흥아해운, SFA반도체) were closed out.
 
 ## Operating notes for tomorrow
 
-- ENTRY_START proposal (09:20) awaiting approval → if not approved, keep the current 09:15
-- Sector filter bug fix awaiting approval → if not fixed, avoid_sectors stays inoperative
+- ENTRY_START proposal (09:20) applied after approval.
+- Sector filter bug fix applied after approval.
`avoid_sectors` is now passed to `check_entry()`.
 - Morning signal reliability is low: start in the L3-B 0.3× state (if the previous day's consecutive stop-losses have not been recovered)
+
+---
+
+## Follow-up changes applied - 2026-05-28
+
+The following changes were applied after user approval.
+
+- `app/config.py`: `ENTRY_START` 09:15 → 09:20.
+- `app/main.py`: added the `ticker_sectors` cache and passed `sector` to `check_entry()`.
+- `app/main.py`: added conservative name-based avoid-sector hints for KIS ranking rows that lack a sector field.
+- `FORCE_EXIT = "14:50"` confirmed unchanged.
+- Repaired the external-data AI pretraining pipeline and verified it end-to-end.
+- Removed future/outcome leakage columns from the training features: `price_*`, `ret_*`, `mfe_*`, `mae_*`, `pnl`, `exit_price`.
+- Deleted the unused `scripts/_send_midday_discord.py`.
+
+Training results:
+- `data/external_training_dataset.csv`: 3,146 rows.
+- `data/training_dataset.csv`: 10 rows.
+- `models/scalping_model.joblib`: generated.
+- `models/scalping_model.metrics.json`: generated 2026-05-28 20:24.
+- `label_stop_loss`: ROC-AUC 0.851, accuracy 0.750.
+- `label_win`: ROC-AUC 0.719, accuracy 0.635.
+
+Caution:
+- AI remains observation-only.
+- The current model is dominated by external pretraining rows and has only 10 actual bot rows, so it is not used to block entries or adjust position size.
diff --git a/reports/implementation_log.md b/reports/implementation_log.md
index d84e4bd..4ebc1fb 100644
--- a/reports/implementation_log.md
+++ b/reports/implementation_log.md
@@ -1,5 +1,47 @@
 # Implementation Log
 
+## 2026-05-28
+
+- Applied the approved 2026-05-28 strategy update:
+  - `ENTRY_START` changed from `"09:15"` to `"09:20"`.
+  - `FORCE_EXIT = "14:50"` was verified unchanged.
+- Fixed the `avoid_sectors` runtime bug:
+  - `app/main.py` now passes `sector` into `VolatilityBreakout.check_entry()`.
+  - Added `ticker_sectors` cache support from ranking rows when sector fields exist.
+  - Added conservative name-based avoid-sector hints for cases such as construction names when no sector field is available.
+- Repaired the external-data pretraining path:
+  - `scripts/collect_daily_features.py` now falls back to KIS daily OHLCV when pykrx fails.
+  - `scripts/collect_minute_data.py` excludes ETF/ETN by default and collects multiple intraday windows from 09:30 to 14:00.
+  - `scripts/build_external_training_dataset.py` now uses prior daily OHLCV rows for breakout targets instead of same-day OHLCV.
+  - `scripts/run_training_pipeline.ps1` builds external rows with `--all-minutes` for pretraining.
+- Removed model leakage:
+  - Excluded future/outcome columns from training features: `price_*`, `ret_*`, `mfe_*`, `mae_*`, `pnl`, and `exit_price`.
+- Fixed PowerShell training pipeline execution:
+  - Replaced `$Args` parameter usage with `$StepArgs` to avoid PowerShell automatic-variable collision.
+  - Prevented nonzero stderr output from stopping required exit-code handling.
+  - Normalized Python step logging to UTF-8 append.
+- Removed unused helper:
+  - `scripts/_send_midday_discord.py`.
+
+Validation performed:
+- `python -m compileall app scripts` passed.
+- Manual external daily collection passed through KIS fallback.
+- Manual KIS minute collection saved 11 regular-stock CSV files for 2026-05-28.
+- `data/external_training_dataset.csv` generated 3,146 rows.
+- `data/training_dataset.csv` generated 10 bot-trade rows.
+- `python scripts/train_ai_model.py` generated `models/scalping_model.joblib` and metrics.
+- `powershell -ExecutionPolicy Bypass -File scripts\run_training_pipeline.ps1` passed end-to-end.
+
+Latest training metrics:
+- `label_stop_loss`: rows 3,156, accuracy 0.750, precision 0.450, ROC-AUC 0.851.
+- `label_win`: rows 3,156, accuracy 0.635, precision 0.492, ROC-AUC 0.719.
+
+Open risks:
+- AI remains observation-only and must not block entries, resize trades, or override exits.
+- Training is still dominated by external pretraining rows; actual bot-labeled rows are only 10.
+- Same-day pykrx data may fail; KIS fallback is active but index rows can be empty.
+- Real-cash trading remains unapproved.
+
 ## 2026-05-27
 
 - Reviewed the stock scalping bot structure and moved it toward an AI-training-ready paper-trading platform.
diff --git a/reports/proposals/2026-05-28_strategy_proposal.md b/reports/proposals/2026-05-28_strategy_proposal.md
index 46805ce..9cdef46 100644
--- a/reports/proposals/2026-05-28_strategy_proposal.md
+++ b/reports/proposals/2026-05-28_strategy_proposal.md
@@ -1,8 +1,15 @@
 # Strategy parameter change proposal - 2026-05-28
 
-**Status: awaiting approval (manual review required)**
+**Status: approved and applied (2026-05-28)**
 **Author: Claude Evening / 2026-05-28**
 
+Applied results:
+- Proposal 1 applied: `ENTRY_START = "09:20"`.
+- Proposal 2 applied: `app/main.py` passes `sector` when calling `check_entry()`.
+- Added the `ticker_sectors` cache and conservative name-based avoid-sector hints.
+- `FORCE_EXIT = "14:50"` unchanged.
+- Verification: compile check, 대우건설 / construction-sector inference confirmed, training pipeline end-to-end run passed.
+
 ---
 
 ## Proposal 1: further delay ENTRY_START, 09:15 → 09:20
@@ -104,8 +111,8 @@ signal = self.strategy.check_entry(
 **Bug confirmed.** Verified at the code level. The fix takes effect immediately.
 
 ### Risks
-- Requires a main.py change, so separate approval is needed
-- Still empty when sector data is unavailable; ticker_sectors build logic must be added
+- The main.py change was applied after user approval.
+- Sector data can be empty when unavailable, so the `ticker_sectors` cache and conservative name-based hints were applied together.
 
 ---
 
@@ -113,15 +120,16 @@ signal = self.strategy.check_entry(
 
 | Priority | Proposal | Difficulty | Apply immediately |
 |------|------|--------|-----------|
-| 1 | Sector filter bug fix | Medium (main.py) | X (approval required) |
-| 2 | ENTRY_START → 09:20 | Low (config.py) | O |
+| 1 | Sector filter bug fix | Medium (main.py) | Applied |
+| 2 | ENTRY_START → 09:20 | Low (config.py) | Applied |
 
-The bug fix matters more but needs approval. ENTRY_START can be applied right away.
+Both proposals were applied after user approval.
 
 ---
 
-## Review items (before approval)
+## Review items (after applying)
 
-- [ ] Decide where to populate `ticker_sectors` data (KIS stock master API, or at universe fetch)
-- [ ] Choose option A vs B
-- [ ] Observe for at least 3 trading days after applying ENTRY_START 09:20
+- [x] Added the `ticker_sectors` cache.
+- [x] Passed sector to `check_entry()`.
+- [x] Applied ENTRY_START 09:20.
+- [ ] Observe for at least 3 trading days after applying ENTRY_START 09:20.