Compare commits
11 Commits
9a55021063...main

| Author | SHA1 | Date |
|---|---|---|
| | 83a5e843c0 | |
| | 4e02c6ce0a | |
| | c01a98abce | |
| | 68805ed80a | |
| | 711d87a998 | |
| | bce40014ad | |
| | 50a7ef119a | |
| | 4ab0e22047 | |
| | 67b8fad423 | |
| | 690887a6ec | |
| | b3f4580faf | |
13
.dockerignore
Normal file
@@ -0,0 +1,13 @@
.git/
.gitignore
__pycache__/
*.pyc
venv/
.venv/
.env
.env.*
.pytest_cache/
charts/
yahoo.html
scraper_service(works).py
scraper_service.working.backup.py
7
.gitignore
vendored
Normal file
@@ -0,0 +1,7 @@
__pycache__/
*.pyc
venv/
.venv/
.env
.env.*
.pytest_cache/
754
AGENTS.md
Normal file
@@ -0,0 +1,754 @@
# AGENTS.md

## Context
- This project exposes a Flask API that uses Playwright to scrape Yahoo Finance options chains.
- Entry point: `scraper_service.py` (launched via `runner.bat` or directly with Python).
- The scraper loads the Yahoo options page (optionally with `?date=`) and validates expirations using the YYMMDD code embedded in contract symbols.
- Option chains come from the embedded `optionChain` JSON when available, with an HTML table fallback.

## API
- Route: `GET /scrape_sync`
  - Query params:
    - `stock`: symbol (default `MSFT`).
    - `expiration|expiry|date`: epoch seconds (Yahoo date param) or a date string matching `DATE_FORMATS`.
    - `strikeLimit`: number of nearest strikes to return per side (default `25`).
  - Behavior:
    - If `strikeLimit` is greater than available strikes, all available rows are returned.
    - `pruned_calls_count` and `pruned_puts_count` report how many rows were removed beyond the limit.
    - `selected_expiration` reports the resolved expiry (epoch + label), and mismatches return an error.
- Route: `GET /profile`
  - Query params:
    - `stock`: symbol (default `MSFT`).
  - Behavior:
    - Loads `https://finance.yahoo.com/quote/<SYMBOL>/` with Playwright.
    - Pulls the embedded SvelteKit payloads (quoteSummary, quote, quoteType, ratings, recommendations).
    - Returns a pruned JSON with valuation, profitability, growth, financial strength, cashflow, ownership, analyst, earnings, and performance summaries.
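
As a rough illustration of the two routes above, the sketch below builds request URLs from the documented query parameters. The helper names `build_scrape_url` and `build_profile_url` are hypothetical, and the base URL assumes the local port used in the Testing section; nothing here is taken from `scraper_service.py` itself.

```python
from urllib.parse import urlencode

# Assumption: the service listens on 127.0.0.1:9777, as in the Testing section.
BASE_URL = "http://127.0.0.1:9777"


def build_scrape_url(stock="MSFT", expiration=None, strike_limit=25):
    """Build a GET /scrape_sync URL from the documented query parameters."""
    params = {"stock": stock, "strikeLimit": strike_limit}
    if expiration is not None:
        # The document lists `expiration`, `expiry`, and `date` as alternatives;
        # this sketch always sends `expiration`.
        params["expiration"] = expiration
    return f"{BASE_URL}/scrape_sync?{urlencode(params)}"


def build_profile_url(stock="MSFT"):
    """Build a GET /profile URL; `stock` is its only documented parameter."""
    return f"{BASE_URL}/profile?{urlencode({'stock': stock})}"


if __name__ == "__main__":
    print(build_scrape_url("AAPL", expiration="2026-01-16", strike_limit=10))
    print(build_profile_url("AAPL"))
```

The expiration may be passed either as epoch seconds or as a date string in one of the `DATE_FORMATS`; the service resolves it and echoes the result in `selected_expiration`.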

## Guard Rails
- Run local 10-cycle validation (4 stocks x 4 expiries) before any deploy or push.
- Run the same 10-cycle validation against the docker container before pushing the image.
- Do not push if any response contains `error` or if contract symbols do not contain the expected YYMMDD code.
- Keep Playwright version aligned with the docker base image (`mcr.microsoft.com/playwright/python:v1.57.0-jammy`).
- Keep the API port open after a successful deploy so it can be tested immediately.
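
The push gate described above (no `error` key, every contract symbol carrying the expected YYMMDD code) could be checked along these lines. This is a minimal sketch, not the project's test harness: `is_safe_to_push` is a hypothetical name, and the `calls`/`puts` keys are an assumption about the response shape. Only `error`, `selected_expiration.value`, and the `Contract Name` field appear in this document.

```python
import re
from datetime import datetime, timezone


def expected_expiry_code(epoch_seconds):
    """YYMMDD code for an expiration given as epoch seconds (UTC)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%y%m%d")


def contract_expiry_code(contract_name):
    """First six-digit run in a contract symbol, or None if there is none."""
    match = re.search(r"(\d{6})", contract_name or "")
    return match.group(1) if match else None


def is_safe_to_push(response):
    """Return (ok, reason) for one /scrape_sync response dict.

    Assumption: option rows live under 'calls' and 'puts'.
    """
    if "error" in response:
        return False, f"error: {response['error']}"
    expected = expected_expiry_code(response["selected_expiration"]["value"])
    for side in ("calls", "puts"):
        for row in response.get(side, []):
            name = row.get("Contract Name")
            if contract_expiry_code(name) != expected:
                return False, f"{side}: {name} does not contain {expected}"
    return True, "ok"
```

A validation loop would call `is_safe_to_push` on every response in the cycle and abort the push on the first `False`.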

## Testing
- Local server:
  - Start: `.\venv\Scripts\python.exe scraper_service.py`
  - Validate: `python scripts/test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
- Profile validation (local server):
  - Validate: `python scripts/test_profile_cycles.py --base-url http://127.0.0.1:9777/profile --runs 8`
- Docker server:
  - Start: `docker run --rm -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
  - Validate: `python scripts/test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
- Profile validation (docker server):
  - Validate: `python scripts/test_profile_cycles.py --base-url http://127.0.0.1:9777/profile --runs 8`

## Update Log (2025-12-28)
- Added `/profile` endpoint backed by SvelteKit payload parsing (quoteSummary, quote, quoteType, ratings, recommendations).
- `/profile` response trimmed to focus on valuation, profitability, growth, financial strength, cashflow, ownership, analyst, earnings, and performance summaries.
- Validation ensures quote data matches the requested symbol, with issues reported in `validation`.
- Issue encountered: existing server instance bound to port 9777 without `/profile`, resolved by restarting the service with the updated script.
- Tests executed (local):
  - `.\venv\Scripts\python.exe scripts/test_profile_cycles.py --runs 8 --timeout 180`
  - `.\venv\Scripts\python.exe scripts\test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
- Tests executed (docker):
  - `docker build -t rushabhtechie/yahoo-scraper:latest .`
  - `.\venv\Scripts\python.exe scripts\test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
  - `.\venv\Scripts\python.exe scripts\test_profile_cycles.py --base-url http://127.0.0.1:9777/profile --runs 8 --timeout 180`
- The test harness verifies:
  - Requested expiration matches `selected_expiration.value`.
  - Contract symbols include the expected YYMMDD code.
  - `total_calls`/`total_puts` match `min(strikeLimit, available)`.
  - `pruned_*_count` equals the number of rows removed.
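
The two count checks in the list above reduce to simple arithmetic. The sketch below shows one way to express them; `expected_counts` and `check_side` are hypothetical helpers, and how the real harness learns the `available` row count is not stated here, so it is passed in as a parameter.

```python
def expected_counts(available, strike_limit):
    """Expected (total, pruned) rows for one side of the chain."""
    total = min(strike_limit, available)
    return total, available - total


def check_side(response, side, available, strike_limit):
    """Compare a response's counters for `side` ('calls' or 'puts')."""
    total, pruned = expected_counts(available, strike_limit)
    return (
        response[f"total_{side}"] == total
        and response[f"pruned_{side}_count"] == pruned
    )
```

For example, with 40 available call strikes and `strikeLimit=25`, the response should report `total_calls` of 25 and `pruned_calls_count` of 15; with only 10 available, it should report 10 and 0.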

## Docker
- Build: `docker build -t rushabhtechie/yahoo-scraper:latest .`
- Run (CPU): `docker run --rm -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
- The container uses the Playwright base image with bundled browsers.

## GPU Acceleration
- GPU is auto-detected via `NVIDIA_VISIBLE_DEVICES`, `/dev/nvidia0`, or `/dev/dri` (`renderD128` or `card0`).
- Override detection (takes precedence over auto-detection):
  - Force on: `ENABLE_GPU=1`
  - Force off: `ENABLE_GPU=0`
- Docker (NVIDIA): `docker run --rm --gpus all -e ENABLE_GPU=1 -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
- Docker (AMD/Intel): `docker run --rm --device=/dev/dri --group-add video -e ENABLE_GPU=1 -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
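
The detection order above can be condensed into one function. The sketch below follows the `detect_gpu_available` logic listed in the line-by-line section, but the `env` and `path_exists` parameters are hypothetical additions so that the precedence rules can be exercised without real devices.

```python
import os

TRUTHY = ("1", "true", "yes", "on")
DEVICE_NODES = ("/dev/nvidia0", "/dev/dri/renderD128", "/dev/dri/card0")


def gpu_enabled(env=None, path_exists=os.path.exists):
    """Decide whether to launch Chromium with GPU flags."""
    env = os.environ if env is None else env

    # 1. An explicit ENABLE_GPU value always wins, on or off.
    override = env.get("ENABLE_GPU")
    if override is not None:
        return override.strip().lower() in TRUTHY

    # 2. NVIDIA container runtimes set NVIDIA_VISIBLE_DEVICES.
    visible = env.get("NVIDIA_VISIBLE_DEVICES")
    if visible and visible.lower() not in ("none", "void", "off"):
        return True

    # 3. Otherwise look for a device node.
    return any(path_exists(node) for node in DEVICE_NODES)
```

Because the override is checked first, `ENABLE_GPU=0` disables GPU flags even inside a container started with `--gpus all`.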

## Line-by-line explanation of scraper_service.py

- Line 1: Import Flask's application class, JSON response helper, and request object. Code: `from flask import Flask, jsonify, request`
- Line 2: Import Playwright's synchronous API entry point. Code: `from playwright.sync_api import sync_playwright`
- Line 3: Import the BeautifulSoup HTML parser. Code: `from bs4 import BeautifulSoup`
- Line 4: Import `datetime` and `timezone` for timestamp handling. Code: `from datetime import datetime, timezone`
- Line 5: Import the URL parsing module. Code: `import urllib.parse`
- Line 6: Import the logging module. Code: `import logging`
- Line 7: Import the JSON module used to decode embedded payloads. Code: `import json`
- Line 8: Import the regular expression module. Code: `import re`
- Line 9: Import the time module used for sleep-based polling. Code: `import time`
- Line 10: Import the os module for environment variables and path checks. Code: `import os`
- Line 11: Blank line for readability. Code: `<blank>`
- Line 12: Create the Flask application. Code: `app = Flask(__name__)`
- Line 13: Blank line for readability. Code: `<blank>`
- Line 14: Comment describing the next block. Code: `# Logging`
- Line 15: Begin configuring the root logger. Code: `logging.basicConfig(`
- Line 16: Set the log level to INFO. Code: `    level=logging.INFO,`
- Line 17: Set the log format to timestamp, level, and message. Code: `    format="%(asctime)s [%(levelname)s] %(message)s"`
- Line 18: Close the `basicConfig` call. Code: `)`
- Line 19: Set the Flask app logger to INFO as well. Code: `app.logger.setLevel(logging.INFO)`
- Line 20: Blank line for readability. Code: `<blank>`
- Line 21: Begin the tuple of accepted date formats. Code: `DATE_FORMATS = (`
- Line 22: Accept ISO dates such as `2026-01-16`. Code: `    "%Y-%m-%d",`
- Line 23: Accept slash-separated dates such as `2026/01/16`. Code: `    "%Y/%m/%d",`
- Line 24: Accept compact dates such as `20260116`. Code: `    "%Y%m%d",`
- Line 25: Accept abbreviated month names such as `Jan 16, 2026`. Code: `    "%b %d, %Y",`
- Line 26: Accept full month names such as `January 16, 2026`. Code: `    "%B %d, %Y",`
- Line 27: Close the tuple. Code: `)`
- Line 28: Blank line for readability. Code: `<blank>`
- Line 29: Name the environment variable that overrides GPU detection. Code: `GPU_ACCEL_ENV = "ENABLE_GPU"`
- Line 30: Blank line for readability. Code: `<blank>`
- Line 31: Blank line for readability. Code: `<blank>`
- Line 32: Define the parse_env_flag function. Code: `def parse_env_flag(value, default=False):`
- Line 33: Check whether the value is unset. Code: `    if value is None:`
- Line 34: Return the caller's default for an unset value. Code: `        return default`
- Line 35: Treat `1`, `true`, `yes`, and `on` (trimmed, case-insensitive) as true and anything else as false. Code: `    return str(value).strip().lower() in ("1", "true", "yes", "on")`
- Line 36: Blank line for readability. Code: `<blank>`
- Line 37: Blank line for readability. Code: `<blank>`
- Line 38: Define the detect_gpu_available function. Code: `def detect_gpu_available():`
- Line 39: Read the `ENABLE_GPU` override. Code: `    env_value = os.getenv(GPU_ACCEL_ENV)`
- Line 40: Check whether the override is set; if so it takes precedence over auto-detection. Code: `    if env_value is not None:`
- Line 41: Return the parsed override. Code: `        return parse_env_flag(env_value, default=False)`
- Line 42: Blank line for readability. Code: `<blank>`
- Line 43: Read `NVIDIA_VISIBLE_DEVICES`. Code: `    nvidia_visible = os.getenv("NVIDIA_VISIBLE_DEVICES")`
- Line 44: Check that it is set and is not `none`, `void`, or `off`. Code: `    if nvidia_visible and nvidia_visible.lower() not in ("none", "void", "off"):`
- Line 45: Report a GPU as available. Code: `        return True`
- Line 46: Blank line for readability. Code: `<blank>`
- Line 47: Check for the NVIDIA device node. Code: `    if os.path.exists("/dev/nvidia0"):`
- Line 48: Report a GPU as available. Code: `        return True`
- Line 49: Blank line for readability. Code: `<blank>`
- Line 50: Check for a DRI render or card node. Code: `    if os.path.exists("/dev/dri/renderD128") or os.path.exists("/dev/dri/card0"):`
- Line 51: Report a GPU as available. Code: `        return True`
- Line 52: Blank line for readability. Code: `<blank>`
- Line 53: Report that no GPU was detected. Code: `    return False`
- Line 54: Blank line for readability. Code: `<blank>`
- Line 55: Blank line for readability. Code: `<blank>`
- Line 56: Define the chromium_launch_args function. Code: `def chromium_launch_args():`
- Line 57: Check whether no GPU is available. Code: `    if not detect_gpu_available():`
- Line 58: Return no extra launch arguments. Code: `        return []`
- Line 59: Blank line for readability. Code: `<blank>`
- Line 60: Check whether the host is Windows. Code: `    if os.name == "nt":`
- Line 61: On Windows, pass only the GPU enable flag. Code: `        return ["--enable-gpu"]`
- Line 62: Blank line for readability. Code: `<blank>`
- Line 63: Begin the argument list for other platforms. Code: `    return [`
- Line 64: Pass the GPU enable flag. Code: `        "--enable-gpu",`
- Line 65: Ignore Chromium's GPU blocklist. Code: `        "--ignore-gpu-blocklist",`
- Line 66: Disable the software rasterizer. Code: `        "--disable-software-rasterizer",`
- Line 67: Select EGL as the GL implementation. Code: `        "--use-gl=egl",`
- Line 68: Enable zero-copy. Code: `        "--enable-zero-copy",`
- Line 69: Enable GPU rasterization. Code: `        "--enable-gpu-rasterization",`
- Line 70: Close the argument list. Code: `    ]`
- Line 71: Blank line for readability. Code: `<blank>`
- Line 72: Blank line for readability. Code: `<blank>`
- Line 73: Define the parse_date function. Code: `def parse_date(value):`
- Line 74: Try each accepted format in order. Code: `    for fmt in DATE_FORMATS:`
- Line 75: Begin the parse attempt. Code: `        try:`
- Line 76: Return the parsed date for the first format that matches. Code: `            return datetime.strptime(value, fmt).date()`
- Line 77: Catch a format mismatch. Code: `        except ValueError:`
- Line 78: Move on to the next format. Code: `            continue`
- Line 79: Return `None` when no format matches. Code: `    return None`
- Line 80: Blank line for readability. Code: `<blank>`
- Line 81: Blank line for readability. Code: `<blank>`
- Line 82: Define the normalize_label function. Code: `def normalize_label(value):`
- Line 83: Trim the label, collapse internal whitespace, and lowercase it. Code: `    return " ".join(value.strip().split()).lower()`
- Line 84: Blank line for readability. Code: `<blank>`
- Line 85: Blank line for readability. Code: `<blank>`
- Line 86: Define the format_expiration_label function. Code: `def format_expiration_label(timestamp):`
- Line 87: Begin the conversion attempt. Code: `    try:`
- Line 88: Format the epoch timestamp as a UTC `YYYY-MM-DD` label. Code: `        return datetime.utcfromtimestamp(timestamp).strftime("%Y-%m-%d")`
- Line 89: Catch any conversion failure. Code: `    except Exception:`
- Line 90: Fall back to the timestamp as a string. Code: `        return str(timestamp)`
- Line 91: Blank line for readability. Code: `<blank>`
- Line 92: Blank line for readability. Code: `<blank>`
- Line 93: Define the format_percent function. Code: `def format_percent(value):`
- Line 94: Check for a missing value. Code: `    if value is None:`
- Line 95: Return `None` for a missing value. Code: `        return None`
- Line 96: Begin the formatting attempt. Code: `    try:`
- Line 97: Multiply by 100 and format with two decimals and a `%` sign. Code: `        return f"{value * 100:.2f}%"`
- Line 98: Catch any formatting failure, such as a non-numeric value. Code: `    except Exception:`
- Line 99: Return `None` on failure. Code: `        return None`
- Line 100: Blank line for readability. Code: `<blank>`
- Line 101: Blank line for readability. Code: `<blank>`
- Line 102: Define the extract_raw_value function. Code: `def extract_raw_value(value):`
- Line 103: Check whether the value is a dict. Code: `    if isinstance(value, dict):`
- Line 104: Return the dict's `raw` entry. Code: `        return value.get("raw")`
- Line 105: Otherwise return the value unchanged. Code: `    return value`
- Line 106: Blank line for readability. Code: `<blank>`
- Line 107: Blank line for readability. Code: `<blank>`
- Line 108: Define the extract_fmt_value function. Code: `def extract_fmt_value(value):`
- Line 109: Check whether the value is a dict. Code: `    if isinstance(value, dict):`
- Line 110: Return the dict's `fmt` entry. Code: `        return value.get("fmt")`
- Line 111: Otherwise return `None`. Code: `    return None`
- Line 112: Blank line for readability. Code: `<blank>`
- Line 113: Blank line for readability. Code: `<blank>`
- Line 114: Define the format_percent_value function. Code: `def format_percent_value(value):`
- Line 115: Look up the preformatted string, if any. Code: `    fmt = extract_fmt_value(value)`
- Line 116: Check whether a preformatted string exists. Code: `    if fmt is not None:`
- Line 117: Return the preformatted string. Code: `        return fmt`
- Line 118: Otherwise format the raw value as a percentage. Code: `    return format_percent(extract_raw_value(value))`
- Line 119: Blank line for readability. Code: `<blank>`
- Line 120: Blank line for readability. Code: `<blank>`
- Line 121: Define the format_last_trade_date function. Code: `def format_last_trade_date(timestamp):`
- Line 122: Unwrap the raw timestamp. Code: `    timestamp = extract_raw_value(timestamp)`
- Line 123: Check for a missing or zero timestamp. Code: `    if not timestamp:`
- Line 124: Return `None` for a missing timestamp. Code: `        return None`
- Line 125: Begin the conversion attempt. Code: `    try:`
- Line 126: Format the timestamp in the server's local time zone and append a literal ` EST` suffix. Code: `        return datetime.fromtimestamp(timestamp).strftime("%m/%d/%Y %I:%M %p") + " EST"`
- Line 127: Catch any conversion failure. Code: `    except Exception:`
- Line 128: Return `None` on failure. Code: `        return None`
- Line 129: Blank line for readability. Code: `<blank>`
- Line 130: Blank line for readability. Code: `<blank>`
- Line 131: Define the extract_option_chain_from_html function. Code: `def extract_option_chain_from_html(html):`
- Line 132: Check for empty HTML. Code: `    if not html:`
- Line 133: Return `None` for empty HTML. Code: `        return None`
- Line 134: Blank line for readability. Code: `<blank>`
- Line 135: Set the search token that marks the start of a JSON-encoded `body` string. Code: `    token = "\"body\":\""`
- Line 136: Start scanning from the beginning of the HTML. Code: `    start = 0`
- Line 137: Loop over every occurrence of the token. Code: `    while True:`
- Line 138: Find the next occurrence. Code: `        idx = html.find(token, start)`
- Line 139: Check whether no occurrence remains. Code: `        if idx == -1:`
- Line 140: Stop scanning. Code: `            break`
- Line 141: Position the cursor just past the token. Code: `        i = idx + len(token)`
- Line 142: Track whether the previous character was a backslash. Code: `        escaped = False`
- Line 143: Collect the characters of the string literal. Code: `        raw_chars = []`
- Line 144: Walk the HTML until the string literal ends. Code: `        while i < len(html):`
- Line 145: Read the current character. Code: `            ch = html[i]`
- Line 146: Check whether this character is escaped. Code: `            if escaped:`
- Line 147: Keep the escaped character. Code: `                raw_chars.append(ch)`
- Line 148: Clear the escape state. Code: `                escaped = False`
- Line 149: Otherwise handle an unescaped character. Code: `            else:`
- Line 150: Check for a backslash. Code: `                if ch == "\\":`
- Line 151: Keep the backslash. Code: `                    raw_chars.append(ch)`
- Line 152: Mark the next character as escaped. Code: `                    escaped = True`
- Line 153: Check for the closing, unescaped quote. Code: `                elif ch == "\"":`
- Line 154: Stop at the end of the string literal. Code: `                    break`
- Line 155: Otherwise handle an ordinary character. Code: `                else:`
- Line 156: Keep the character. Code: `                    raw_chars.append(ch)`
- Line 157: Advance the cursor. Code: `            i += 1`
- Line 158: Join the collected characters. Code: `        raw = "".join(raw_chars)`
- Line 159: Begin decoding the string literal. Code: `        try:`
- Line 160: Decode the JSON string literal to recover the body text. Code: `            body_text = json.loads(f"\"{raw}\"")`
- Line 161: Catch a malformed literal. Code: `        except json.JSONDecodeError:`
- Line 162: Resume the search after this token. Code: `            start = idx + len(token)`
- Line 163: Try the next occurrence. Code: `            continue`
- Line 164: Check whether the body lacks `optionChain`. Code: `        if "optionChain" not in body_text:`
- Line 165: Resume the search after this token. Code: `            start = idx + len(token)`
- Line 166: Try the next occurrence. Code: `            continue`
- Line 167: Begin parsing the body. Code: `        try:`
- Line 168: Parse the body text as JSON. Code: `            payload = json.loads(body_text)`
- Line 169: Catch invalid JSON. Code: `        except json.JSONDecodeError:`
- Line 170: Resume the search after this token. Code: `            start = idx + len(token)`
- Line 171: Try the next occurrence. Code: `            continue`
- Line 172: Read the `optionChain` object. Code: `        option_chain = payload.get("optionChain")`
- Line 173: Check that it has a non-empty `result`. Code: `        if option_chain and option_chain.get("result"):`
- Line 174: Return the option chain. Code: `            return option_chain`
- Line 175: Blank line for readability. Code: `<blank>`
- Line 176: Otherwise resume the search after this token. Code: `        start = idx + len(token)`
- Line 177: Blank line for readability. Code: `<blank>`
- Line 178: Return `None` when no usable chain is found. Code: `    return None`
- Line 179: Blank line for readability. Code: `<blank>`
- Line 180: Blank line for readability. Code: `<blank>`
- Line 181: Define the extract_expiration_dates_from_chain function. Code: `def extract_expiration_dates_from_chain(chain):`
- Line 182: Check for a missing chain. Code: `    if not chain:`
- Line 183: Return an empty list. Code: `        return []`
- Line 184: Blank line for readability. Code: `<blank>`
- Line 185: Read the `result` list. Code: `    result = chain.get("result", [])`
- Line 186: Check for an empty result. Code: `    if not result:`
- Line 187: Return an empty list. Code: `        return []`
- Line 188: Return the first result's `expirationDates`, or an empty list. Code: `    return result[0].get("expirationDates", []) or []`
- Line 189: Blank line for readability. Code: `<blank>`
- Line 190: Blank line for readability. Code: `<blank>`
- Line 191: Define the normalize_chain_rows function. Code: `def normalize_chain_rows(rows):`
- Line 192: Start the output list. Code: `    normalized = []`
- Line 193: Iterate over the rows, treating `None` as empty. Code: `    for row in rows or []:`
- Line 194: Begin appending a normalized row. Code: `        normalized.append(`
- Line 195: Open the row dict. Code: `            {`
- Line 196: Store the contract symbol. Code: `                "Contract Name": row.get("contractSymbol"),`
- Line 197: Begin the formatted last-trade timestamp. Code: `                "Last Trade Date (EST)": format_last_trade_date(`
- Line 198: Pass the row's `lastTradeDate`. Code: `                    row.get("lastTradeDate")`
- Line 199: Close the call. Code: `                ),`
- Line 200: Store the raw strike price. Code: `                "Strike": extract_raw_value(row.get("strike")),`
- Line 201: Store the raw last price. Code: `                "Last Price": extract_raw_value(row.get("lastPrice")),`
- Line 202: Store the raw bid. Code: `                "Bid": extract_raw_value(row.get("bid")),`
- Line 203: Store the raw ask. Code: `                "Ask": extract_raw_value(row.get("ask")),`
- Line 204: Store the raw price change. Code: `                "Change": extract_raw_value(row.get("change")),`
- Line 205: Store the percent change as a formatted string. Code: `                "% Change": format_percent_value(row.get("percentChange")),`
- Line 206: Store the raw volume. Code: `                "Volume": extract_raw_value(row.get("volume")),`
- Line 207: Store the raw open interest. Code: `                "Open Interest": extract_raw_value(row.get("openInterest")),`
- Line 208: Begin the formatted implied volatility. Code: `                "Implied Volatility": format_percent_value(`
- Line 209: Pass the row's `impliedVolatility`. Code: `                    row.get("impliedVolatility")`
- Line 210: Close the call. Code: `                ),`
- Line 211: Close the row dict. Code: `            }`
- Line 212: Close the `append` call. Code: `        )`
- Line 213: Return the normalized rows. Code: `    return normalized`
- Line 214: Blank line for readability. Code: `<blank>`
- Line 215: Blank line for readability. Code: `<blank>`
- Line 216: Define the build_rows_from_chain function. Code: `def build_rows_from_chain(chain):`
- Line 217: Read the `result` list, tolerating a missing chain. Code: `    result = chain.get("result", []) if chain else []`
- Line 218: Check for an empty result. Code: `    if not result:`
- Line 219: Return empty call and put lists. Code: `        return [], []`
- Line 220: Read the first result's `options` list. Code: `    options = result[0].get("options", [])`
- Line 221: Check for an empty options list. Code: `    if not options:`
- Line 222: Return empty call and put lists. Code: `        return [], []`
- Line 223: Use the first options entry. Code: `    option = options[0]`
- Line 224: Begin the returned `(calls, puts)` tuple. Code: `    return (`
- Line 225: Normalize the calls. Code: `        normalize_chain_rows(option.get("calls")),`
- Line 226: Normalize the puts. Code: `        normalize_chain_rows(option.get("puts")),`
- Line 227: Close the tuple. Code: `    )`
- Line 228: Blank line for readability. Code: `<blank>`
- Line 229: Blank line for readability. Code: `<blank>`
- Line 230: Define the extract_contract_expiry_code function. Code: `def extract_contract_expiry_code(contract_name):`
- Line 231: Check for a missing contract name. Code: `    if not contract_name:`
- Line 232: Return `None` for a missing name. Code: `        return None`
- Line 233: Find the first run of six digits in the symbol. Code: `    match = re.search(r"(\d{6})", contract_name)`
- Line 234: Return the matched YYMMDD code, or `None` if there is no match. Code: `    return match.group(1) if match else None`
- Line 235: Blank line for readability. Code: `<blank>`
- Line 236: Blank line for readability. Code: `<blank>`
- Line 237: Define the expected_expiry_code function. Code: `def expected_expiry_code(timestamp):`
- Line 238: Check for a missing timestamp. Code: `    if not timestamp:`
- Line 239: Return `None` for a missing timestamp. Code: `        return None`
- Line 240: Begin the conversion attempt. Code: `    try:`
- Line 241: Format the epoch timestamp as a UTC `YYMMDD` code. Code: `        return datetime.utcfromtimestamp(timestamp).strftime("%y%m%d")`
- Line 242: Catch any conversion failure. Code: `    except Exception:`
- Line 243: Return `None` on failure. Code: `        return None`
- Line 244: Blank line for readability. Code: `<blank>`
- Line 245: Blank line for readability. Code: `<blank>`
- Line 246: Define the extract_expiration_dates_from_html function. Code: `def extract_expiration_dates_from_html(html):`
- Line 247: Check for empty HTML. Code: `    if not html:`
- Line 248: Return an empty list. Code: `        return []`
- Line 249: Blank line for readability. Code: `<blank>`
- Line 250: Begin the tuple of regex patterns to try. Code: `    patterns = (`
- Line 251: Match the backslash-escaped form of the `expirationDates` array. Code: `        r'\\"expirationDates\\":\[(.*?)\]',`
- Line 252: Match the unescaped form of the `expirationDates` array. Code: `        r'"expirationDates":\[(.*?)\]',`
- Line 253: Close the tuple. Code: `    )`
- Line 254: Initialize the match to `None`. Code: `    match = None`
- Line 255: Try each pattern in order. Code: `    for pattern in patterns:`
- Line 256: Search the HTML, letting `.` match newlines. Code: `        match = re.search(pattern, html, re.DOTALL)`
- Line 257: Check whether the pattern matched. Code: `        if match:`
- Line 258: Stop at the first match. Code: `            break`
- Line 259: Check whether no pattern matched. Code: `    if not match:`
- Line 260: Return an empty list. Code: `        return []`
- Line 261: Blank line for readability. Code: `<blank>`
- Line 262: Take the text between the brackets. Code: `    raw = match.group(1)`
- Line 263: Start the output list. Code: `    values = []`
- Line 264: Split the text on commas. Code: `    for part in raw.split(","):`
- Line 265: Trim surrounding whitespace. Code: `        part = part.strip()`
- Line 266: Keep only all-digit entries. Code: `        if part.isdigit():`
- Line 267: Begin the integer conversion. Code: `            try:`
- Line 268: Append the integer value. Code: `                values.append(int(part))`
- Line 269: Catch any conversion failure. Code: `            except Exception:`
- Line 270: Skip the entry. Code: `                continue`
- Line 271: Return the collected timestamps. Code: `    return values`
- Line 272: Blank line for readability. Code: `<blank>`
- Line 273: Blank line for readability. Code: `<blank>`
- Line 274: Define the build_expiration_options function. Code: `def build_expiration_options(expiration_dates):`
- Line 275: Start the output list. Code: `    options = []`
- Line 276: Iterate over the timestamps, treating `None` as empty. Code: `    for value in expiration_dates or []:`
- Line 277: Begin the integer conversion. Code: `        try:`
- Line 278: Coerce the value to an integer. Code: `            value_int = int(value)`
- Line 279: Catch any conversion failure. Code: `        except Exception:`
- Line 280: Skip the invalid value. Code: `            continue`
- Line 281: Blank line for readability. Code: `<blank>`
- Line 282: Build the `YYYY-MM-DD` label. Code: `        label = format_expiration_label(value_int)`
- Line 283: Begin the date conversion. Code: `        try:`
- Line 284: Convert the timestamp to a UTC date. Code: `            date_value = datetime.utcfromtimestamp(value_int).date()`
- Line 285: Catch any conversion failure. Code: `        except Exception:`
- Line 286: Leave the date unset. Code: `            date_value = None`
- Line 287: Blank line for readability. Code: `<blank>`
- Line 288: Append the option with its value, label, and date. Code: `        options.append({"value": value_int, "label": label, "date": date_value})`
- Line 289: Return the options sorted by timestamp. Code: `    return sorted(options, key=lambda x: x["value"])`
- Line 290: Blank line for readability. Code: `<blank>`
- Line 291: Blank line for readability. Code: `<blank>`
- Line 292: Define the resolve_expiration function. Code: `def resolve_expiration(expiration, options):`
- Line 293: Check for a missing request. Code: `    if not expiration:`
- Line 294: Return `(None, None)`. Code: `        return None, None`
- Line 295: Blank line for readability. Code: `<blank>`
- Line 296: Trim the request. Code: `    raw = expiration.strip()`
- Line 297: Check whether it is empty after trimming. Code: `    if not raw:`
- Line 298: Return `(None, None)`. Code: `        return None, None`
- Line 299: Blank line for readability. Code: `<blank>`
- Line 300: Check whether the request is all digits, that is, epoch seconds. Code: `    if raw.isdigit():`
- Line 301: Convert the request to an integer. Code: `        value = int(raw)`
- Line 302: Check whether known expirations are available. Code: `        if options:`
- Line 303: Search the known expirations. Code: `            for opt in options:`
- Line 304: Check for an exact timestamp match. Code: `                if opt.get("value") == value:`
- Line 305: Return the matching value and label. Code: `                    return value, opt.get("label")`
- Line 306: Return `(None, None)` when the timestamp is not a known expiration. Code: `            return None, None`
- Line 307: With no known expirations, accept the timestamp and derive its label. Code: `        return value, format_expiration_label(value)`
- Line 308: Blank line for readability. Code: `<blank>`
- Line 309: Try to parse the request as a date string. Code: `    requested_date = parse_date(raw)`
- Line 310: Check whether parsing succeeded. Code: `    if requested_date:`
- Line 311: Search the known expirations. Code: `        for opt in options:`
- Line 312: Check for a matching date. Code: `            if opt.get("date") == requested_date:`
- Line 313: Return the matching value and label. Code: `                return opt.get("value"), opt.get("label")`
- Line 314: Return `(None, None)` when no expiration falls on that date. Code: `        return None, None`
- Line 315: Blank line for readability. Code: `<blank>`
- Line 316: Normalize the request for label comparison. Code: `    normalized = normalize_label(raw)`
- Line 317: Search the known expirations. Code: `    for opt in options:`
- Line 318: Compare normalized labels. Code: `        if normalize_label(opt.get("label", "")) == normalized:`
- Line 319: Return the matching value and label. Code: `            return opt.get("value"), opt.get("label")`
- Line 320: Blank line for readability. Code: `<blank>`
- Line 321: Return `(None, None)` when nothing matches. Code: `    return None, None`
- Line 322: Blank line for readability. Code: `<blank>`
- Line 323: Blank line for readability. Code: `<blank>`
|
||||
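
The resolution order walked through in lines 296-321 (epoch seconds, then calendar date, then label text) can be sketched as below. This is a simplified illustration, not the file's actual code: the `DATE_FORMATS` values and the bodies of `parse_date` and `normalize_label` are assumptions, since their definitions are not shown in this part of the walkthrough, and the sketch omits the branch that accepts an epoch when no options are available.

```python
from datetime import date, datetime

# Hypothetical formats; the real DATE_FORMATS list is not shown in this section.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%b %d, %Y")


def parse_date(text):
    """Return a date for the first matching format, else None (assumed behavior)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def normalize_label(label):
    """Hypothetical normalizer: lowercase and keep only letters and digits."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def resolve_expiration(raw, options):
    """Resolve user input to (epoch, label), or (None, None) if nothing matches."""
    raw = (raw or "").strip()
    if not raw:
        return None, None
    if raw.isdigit():  # 1. epoch seconds
        value = int(raw)
        for opt in options:
            if opt["value"] == value:
                return value, opt["label"]
        return None, None
    requested = parse_date(raw)  # 2. calendar date
    if requested:
        for opt in options:
            if opt["date"] == requested:
                return opt["value"], opt["label"]
        return None, None
    wanted = normalize_label(raw)  # 3. label text
    for opt in options:
        if normalize_label(opt["label"]) == wanted:
            return opt["value"], opt["label"]
    return None, None


options = [
    {"value": 1768521600, "label": "Jan 16, 2026", "date": date(2026, 1, 16)},
]
print(resolve_expiration("2026-01-16", options))
```

Trying the cheapest and least ambiguous interpretation first (a bare integer) keeps free-form label matching as the last resort.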
- Line 324: Define the wait_for_tables function. Code: `def wait_for_tables(page):`
|
||||
- Line 325: Execute the statement as written. Code: ` try:`
|
||||
- Line 326: Execute the statement as written. Code: ` page.wait_for_selector(`
|
||||
- Line 327: Execute the statement as written. Code: ` "section[data-testid='options-list-table'] table",`
|
||||
- Line 328: Execute the statement as written. Code: ` timeout=30000,`
|
||||
- Line 329: Execute the statement as written. Code: ` )`
|
||||
- Line 330: Execute the statement as written. Code: ` except Exception:`
|
||||
- Line 331: If the targeted selector fails, fall back to waiting up to 30 seconds for any table. Code: ` page.wait_for_selector("table", timeout=30000)`
|
||||
- Line 332: Blank line for readability. Code: `<blank>`
|
||||
- Line 333: Poll up to 30 times, one second apart. Code: ` for _ in range(30): # 30 * 1s = 30 seconds`
|
||||
- Line 334: Execute the statement as written. Code: ` tables = page.query_selector_all(`
|
||||
- Line 335: Execute the statement as written. Code: ` "section[data-testid='options-list-table'] table"`
|
||||
- Line 336: Execute the statement as written. Code: ` )`
|
||||
- Line 337: Stop polling once the targeted section contains at least two tables (calls and puts). Code: ` if len(tables) >= 2:`
|
||||
- Line 338: Execute the statement as written. Code: ` return tables`
|
||||
- Line 339: Execute the statement as written. Code: ` tables = page.query_selector_all("table")`
|
||||
- Line 340: Execute the statement as written. Code: ` if len(tables) >= 2:`
|
||||
- Line 341: Execute the statement as written. Code: ` return tables`
|
||||
- Line 342: Execute the statement as written. Code: ` time.sleep(1)`
|
||||
- Line 343: Give up and return an empty list after 30 unsuccessful attempts. Code: ` return []`
|
||||
- Line 344: Blank line for readability. Code: `<blank>`
|
||||
- Line 345: Blank line for readability. Code: `<blank>`
|
||||
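
The polling loop in `wait_for_tables` (lines 333-343) follows a general retry pattern that can be shown without a browser. The sketch below is only an illustration: `poll_until`, `fetch`, and `is_ready` are hypothetical names, and the simulated snapshots stand in for repeated `page.query_selector_all(...)` calls.

```python
import time


def poll_until(fetch, is_ready, attempts=30, delay=1.0):
    """Call fetch() until is_ready(result) holds; return [] if it never does."""
    for _ in range(attempts):
        result = fetch()
        if is_ready(result):
            return result
        time.sleep(delay)
    return []


# Simulated page: the second table only appears on the third poll.
snapshots = iter([[], ["calls"], ["calls", "puts"]])
tables = poll_until(lambda: next(snapshots), lambda t: len(t) >= 2, delay=0)
print(tables)
```

Returning an empty list on timeout, rather than raising, lets the caller decide how to report the failure, as the `len(tables) < 2` check does on line 552.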
- Line 346: Define the parse_strike_limit function. Code: `def parse_strike_limit(value, default=25):`
|
||||
- Line 347: Execute the statement as written. Code: ` if value is None:`
|
||||
- Line 348: Execute the statement as written. Code: ` return default`
|
||||
- Line 349: Execute the statement as written. Code: ` try:`
|
||||
- Line 350: Execute the statement as written. Code: ` limit = int(value)`
|
||||
- Line 351: Execute the statement as written. Code: ` except (TypeError, ValueError):`
|
||||
- Line 352: Execute the statement as written. Code: ` return default`
|
||||
- Line 353: Return the limit if it is positive; otherwise use the default. Code: ` return limit if limit > 0 else default`
|
||||
- Line 354: Blank line for readability. Code: `<blank>`
|
||||
- Line 355: Blank line for readability. Code: `<blank>`
|
||||
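
For reference, the `parse_strike_limit` logic described on lines 346-353 is reassembled below from the walkthrough, with a few sample inputs. The sample values are illustrative only.

```python
def parse_strike_limit(value, default=25):
    """Parse the strikeLimit query parameter, falling back to the default."""
    if value is None:
        return default
    try:
        limit = int(value)
    except (TypeError, ValueError):
        return default
    # Zero and negative limits are treated as invalid.
    return limit if limit > 0 else default


for raw in (None, "10", "abc", "0", "-5"):
    print(repr(raw), "->", parse_strike_limit(raw))
```

Invalid input never produces an error response here; it silently falls back to the default of 25.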
- Line 356: Define the scrape_yahoo_options function. Code: `def scrape_yahoo_options(symbol, expiration=None, strike_limit=25):`
|
||||
- Line 357: Define the parse_table function. Code: ` def parse_table(table_html, side):`
|
||||
- Line 358: Execute the statement as written. Code: ` if not table_html:`
|
||||
- Line 359: Execute the statement as written. Code: ` app.logger.warning("No %s table HTML for %s", side, symbol)`
|
||||
- Line 360: Execute the statement as written. Code: ` return []`
|
||||
- Line 361: Blank line for readability. Code: `<blank>`
|
||||
- Line 362: Execute the statement as written. Code: ` soup = BeautifulSoup(table_html, "html.parser")`
|
||||
- Line 363: Blank line for readability. Code: `<blank>`
|
||||
- Line 364: Execute the statement as written. Code: ` headers = [th.get_text(strip=True) for th in soup.select("thead th")]`
|
||||
- Line 365: Execute the statement as written. Code: ` rows = soup.select("tbody tr")`
|
||||
- Line 366: Blank line for readability. Code: `<blank>`
|
||||
- Line 367: Execute the statement as written. Code: ` parsed = []`
|
||||
- Line 368: Execute the statement as written. Code: ` for r in rows:`
|
||||
- Line 369: Execute the statement as written. Code: ` tds = r.find_all("td")`
|
||||
- Line 370: Skip rows whose cell count does not match the header count. Code: ` if len(tds) != len(headers):`
|
||||
- Line 371: Execute the statement as written. Code: ` continue`
|
||||
- Line 372: Blank line for readability. Code: `<blank>`
|
||||
- Line 373: Execute the statement as written. Code: ` item = {}`
|
||||
- Line 374: Execute the statement as written. Code: ` for i, c in enumerate(tds):`
|
||||
- Line 375: Execute the statement as written. Code: ` key = headers[i]`
|
||||
- Line 376: Execute the statement as written. Code: ` val = c.get_text(" ", strip=True)`
|
||||
- Line 377: Blank line for readability. Code: `<blank>`
|
||||
- Line 378: Comment describing the next block. Code: ` # Convert numeric fields`
|
||||
- Line 379: Parse price-like columns as floats. Code: ` if key in ["Strike", "Last Price", "Bid", "Ask", "Change"]:`
|
||||
- Line 380: Execute the statement as written. Code: ` try:`
|
||||
- Line 381: Execute the statement as written. Code: ` val = float(val.replace(",", ""))`
|
||||
- Line 382: Execute the statement as written. Code: ` except Exception:`
|
||||
- Line 383: Execute the statement as written. Code: ` val = None`
|
||||
- Line 384: Parse count columns as integers. Code: ` elif key in ["Volume", "Open Interest"]:`
|
||||
- Line 385: Execute the statement as written. Code: ` try:`
|
||||
- Line 386: Execute the statement as written. Code: ` val = int(val.replace(",", ""))`
|
||||
- Line 387: Execute the statement as written. Code: ` except Exception:`
|
||||
- Line 388: Execute the statement as written. Code: ` val = None`
|
||||
- Line 389: In all other columns, map placeholder dashes and empty cells to `None`. Code: ` elif val in ["-", ""]:`
|
||||
- Line 390: Execute the statement as written. Code: ` val = None`
|
||||
- Line 391: Blank line for readability. Code: `<blank>`
|
||||
- Line 392: Execute the statement as written. Code: ` item[key] = val`
|
||||
- Line 393: Blank line for readability. Code: `<blank>`
|
||||
- Line 394: Execute the statement as written. Code: ` parsed.append(item)`
|
||||
- Line 395: Blank line for readability. Code: `<blank>`
|
||||
- Line 396: Execute the statement as written. Code: ` app.logger.info("Parsed %d %s rows", len(parsed), side)`
|
||||
- Line 397: Execute the statement as written. Code: ` return parsed`
|
||||
- Line 398: Blank line for readability. Code: `<blank>`
|
||||
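
The per-cell conversion inside `parse_table` (lines 378-390) can be isolated into a small helper for illustration. `convert_cell`, `FLOAT_COLUMNS`, and `INT_COLUMNS` are hypothetical names introduced for this sketch; the column lists are the ones quoted in the walkthrough. Unlike the original, which catches `Exception`, the sketch catches only `ValueError`.

```python
FLOAT_COLUMNS = {"Strike", "Last Price", "Bid", "Ask", "Change"}
INT_COLUMNS = {"Volume", "Open Interest"}


def convert_cell(header, text):
    """Convert one table cell to float, int, str, or None based on its column."""
    if header in FLOAT_COLUMNS:
        try:
            return float(text.replace(",", ""))
        except ValueError:
            return None  # e.g. "-" for a missing quote
    if header in INT_COLUMNS:
        try:
            return int(text.replace(",", ""))
        except ValueError:
            return None
    # Any other column: keep the text, but map placeholders to None.
    return None if text in ("-", "") else text


row = {"Strike": "1,250.50", "Bid": "-", "Volume": "12,345", "Implied Volatility": "-"}
print({key: convert_cell(key, val) for key, val in row.items()})
```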
- Line 399: Define the read_option_chain function. Code: ` def read_option_chain(page):`
|
||||
- Line 400: Execute the statement as written. Code: ` html = page.content()`
|
||||
- Line 401: Execute the statement as written. Code: ` option_chain = extract_option_chain_from_html(html)`
|
||||
- Line 402: Prefer expiration dates taken from the embedded option chain JSON when it was found. Code: ` if option_chain:`
|
||||
- Line 403: Execute the statement as written. Code: ` expiration_dates = extract_expiration_dates_from_chain(option_chain)`
|
||||
- Line 404: Execute the statement as written. Code: ` else:`
|
||||
- Line 405: Otherwise, fall back to expiration dates extracted from the page HTML. Code: ` expiration_dates = extract_expiration_dates_from_html(html)`
|
||||
- Line 406: Execute the statement as written. Code: ` return option_chain, expiration_dates`
|
||||
- Line 407: Blank line for readability. Code: `<blank>`
|
||||
- Line 408: Define the has_expected_expiry function. Code: ` def has_expected_expiry(options, expected_code):`
|
||||
- Line 409: Execute the statement as written. Code: ` if not expected_code:`
|
||||
- Line 410: Execute the statement as written. Code: ` return False`
|
||||
- Line 411: Execute the statement as written. Code: ` for row in options or []:`
|
||||
- Line 412: Execute the statement as written. Code: ` name = row.get("Contract Name")`
|
||||
- Line 413: Execute the statement as written. Code: ` if extract_contract_expiry_code(name) == expected_code:`
|
||||
- Line 414: Execute the statement as written. Code: ` return True`
|
||||
- Line 415: Execute the statement as written. Code: ` return False`
|
||||
- Line 416: Blank line for readability. Code: `<blank>`
|
||||
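
The expiry check in `has_expected_expiry` (lines 408-415) relies on the YYMMDD code embedded in each contract symbol. The sketch below shows one way this could work. The bodies of `extract_contract_expiry_code` and `expected_expiry_code` are guesses, since they are not shown in this part of the walkthrough, and the regular expression assumes OCC-style symbols (root, YYMMDD, `C` or `P`, eight-digit strike).

```python
import re
from datetime import datetime, timezone

# Assumed symbol layout: ROOT + YYMMDD + C/P + 8-digit strike, e.g. MSFT260116C00400000.
_CONTRACT_RE = re.compile(r"(\d{6})[CP]\d{8}$")


def extract_contract_expiry_code(name):
    """Return the YYMMDD part of a contract symbol, or None if it does not match."""
    match = _CONTRACT_RE.search(name or "")
    return match.group(1) if match else None


def expected_expiry_code(epoch):
    """Format an epoch (seconds, UTC) as YYMMDD; None for a missing epoch."""
    if not epoch:
        return None
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%y%m%d")


def has_expected_expiry(rows, expected_code):
    """True if at least one row's contract symbol carries the expected code."""
    if not expected_code:
        return False
    return any(
        extract_contract_expiry_code(row.get("Contract Name")) == expected_code
        for row in rows or []
    )


epoch = int(datetime(2026, 1, 16, tzinfo=timezone.utc).timestamp())
rows = [{"Contract Name": "MSFT260116C00400000"}]
print(expected_expiry_code(epoch), has_expected_expiry(rows, expected_expiry_code(epoch)))
```

Checking the symbols guards against the case where Yahoo ignores the `date` parameter and serves a different expiration than the one requested.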
- Line 417: URL-encode the symbol; `safe=""` means `/` is escaped as well. Code: ` encoded = urllib.parse.quote(symbol, safe="")`
|
||||
- Line 418: Execute the statement as written. Code: ` base_url = f"https://finance.yahoo.com/quote/{encoded}/options/"`
|
||||
- Line 419: Execute the statement as written. Code: ` requested_expiration = expiration.strip() if expiration else None`
|
||||
- Line 420: Execute the statement as written. Code: ` if not requested_expiration:`
|
||||
- Line 421: Execute the statement as written. Code: ` requested_expiration = None`
|
||||
- Line 422: Execute the statement as written. Code: ` url = base_url`
|
||||
- Line 423: Blank line for readability. Code: `<blank>`
|
||||
- Line 424: Execute the statement as written. Code: ` app.logger.info(`
|
||||
- Line 425: Execute the statement as written. Code: ` "Starting scrape for symbol=%s expiration=%s url=%s",`
|
||||
- Line 426: Execute the statement as written. Code: ` symbol,`
|
||||
- Line 427: Execute the statement as written. Code: ` requested_expiration,`
|
||||
- Line 428: Execute the statement as written. Code: ` base_url,`
|
||||
- Line 429: Execute the statement as written. Code: ` )`
|
||||
- Line 430: Blank line for readability. Code: `<blank>`
|
||||
- Line 431: Execute the statement as written. Code: ` calls_html = None`
|
||||
- Line 432: Execute the statement as written. Code: ` puts_html = None`
|
||||
- Line 433: Execute the statement as written. Code: ` calls_full = []`
|
||||
- Line 434: Execute the statement as written. Code: ` puts_full = []`
|
||||
- Line 435: Execute the statement as written. Code: ` price = None`
|
||||
- Line 436: Execute the statement as written. Code: ` selected_expiration_value = None`
|
||||
- Line 437: Execute the statement as written. Code: ` selected_expiration_label = None`
|
||||
- Line 438: Execute the statement as written. Code: ` expiration_options = []`
|
||||
- Line 439: Execute the statement as written. Code: ` target_date = None`
|
||||
- Line 440: Execute the statement as written. Code: ` fallback_to_base = False`
|
||||
- Line 441: Blank line for readability. Code: `<blank>`
|
||||
- Line 442: Execute the statement as written. Code: ` with sync_playwright() as p:`
|
||||
- Line 443: Execute the statement as written. Code: ` launch_args = chromium_launch_args()`
|
||||
- Line 444: Execute the statement as written. Code: ` if launch_args:`
|
||||
- Line 445: Execute the statement as written. Code: ` app.logger.info("GPU acceleration enabled")`
|
||||
- Line 446: Execute the statement as written. Code: ` else:`
|
||||
- Line 447: Execute the statement as written. Code: ` app.logger.info("GPU acceleration disabled")`
|
||||
- Line 448: Launch headless Chromium with the computed launch arguments. Code: ` browser = p.chromium.launch(headless=True, args=launch_args)`
|
||||
- Line 449: Execute the statement as written. Code: ` page = browser.new_page()`
|
||||
- Line 450: Execute the statement as written. Code: ` page.set_extra_http_headers(`
|
||||
- Line 451: Execute the statement as written. Code: ` {`
|
||||
- Line 452: Execute the statement as written. Code: ` "User-Agent": (`
|
||||
- Line 453: Execute the statement as written. Code: ` "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "`
|
||||
- Line 454: Execute the statement as written. Code: ` "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36"`
|
||||
- Line 455: Execute the statement as written. Code: ` )`
|
||||
- Line 456: Execute the statement as written. Code: ` }`
|
||||
- Line 457: Execute the statement as written. Code: ` )`
|
||||
- Line 458: Set the page's default timeout to 60 seconds. Code: ` page.set_default_timeout(60000)`
|
||||
- Line 459: Blank line for readability. Code: `<blank>`
|
||||
- Line 460: Execute the statement as written. Code: ` try:`
|
||||
- Line 461: Execute the statement as written. Code: ` if requested_expiration:`
|
||||
- Line 462: Execute the statement as written. Code: ` if requested_expiration.isdigit():`
|
||||
- Line 463: Execute the statement as written. Code: ` target_date = int(requested_expiration)`
|
||||
- Line 464: Execute the statement as written. Code: ` selected_expiration_value = target_date`
|
||||
- Line 465: Execute the statement as written. Code: ` selected_expiration_label = format_expiration_label(target_date)`
|
||||
- Line 466: Execute the statement as written. Code: ` else:`
|
||||
- Line 467: Execute the statement as written. Code: ` parsed_date = parse_date(requested_expiration)`
|
||||
- Line 468: Execute the statement as written. Code: ` if parsed_date:`
|
||||
- Line 469: Begin converting the parsed date to epoch seconds at midnight UTC (the expression spans lines 469-476). Code: ` target_date = int(`
|
||||
- Line 470: Execute the statement as written. Code: ` datetime(`
|
||||
- Line 471: Execute the statement as written. Code: ` parsed_date.year,`
|
||||
- Line 472: Execute the statement as written. Code: ` parsed_date.month,`
|
||||
- Line 473: Execute the statement as written. Code: ` parsed_date.day,`
|
||||
- Line 474: Execute the statement as written. Code: ` tzinfo=timezone.utc,`
|
||||
- Line 475: Execute the statement as written. Code: ` ).timestamp()`
|
||||
- Line 476: Execute the statement as written. Code: ` )`
|
||||
- Line 477: Execute the statement as written. Code: ` selected_expiration_value = target_date`
|
||||
- Line 478: Execute the statement as written. Code: ` selected_expiration_label = format_expiration_label(target_date)`
|
||||
- Line 479: Execute the statement as written. Code: ` else:`
|
||||
- Line 480: The input is neither an epoch nor a parseable date, so load the base page first and resolve it against the listed expirations. Code: ` fallback_to_base = True`
|
||||
- Line 481: Blank line for readability. Code: `<blank>`
|
||||
- Line 482: Execute the statement as written. Code: ` if target_date:`
|
||||
- Line 483: Append the epoch as Yahoo's `date` query parameter. Code: ` url = f"{base_url}?date={target_date}"`
|
||||
- Line 484: Blank line for readability. Code: `<blank>`
|
||||
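
The URL construction spread over lines 417-418, 469-476, and 483 can be condensed into one helper for illustration. `build_options_url` is a hypothetical name; the quoting, the midnight-UTC conversion, and the `?date=` suffix follow the walkthrough.

```python
import urllib.parse
from datetime import date, datetime, timezone


def build_options_url(symbol, expiry=None):
    """Build the Yahoo options URL, optionally pinned to an expiration date."""
    # safe="" also escapes "/" so that a symbol cannot alter the URL path.
    encoded = urllib.parse.quote(symbol, safe="")
    base_url = f"https://finance.yahoo.com/quote/{encoded}/options/"
    if expiry is None:
        return base_url
    # Yahoo's date parameter is epoch seconds at midnight UTC.
    epoch = int(
        datetime(expiry.year, expiry.month, expiry.day, tzinfo=timezone.utc).timestamp()
    )
    return f"{base_url}?date={epoch}"


print(build_options_url("MSFT", date(2026, 1, 16)))
```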
- Line 485: Navigate to the URL, waiting only for `domcontentloaded` (60-second timeout). Code: ` page.goto(url, wait_until="domcontentloaded", timeout=60000)`
|
||||
- Line 486: Execute the statement as written. Code: ` app.logger.info("Page loaded (domcontentloaded) for %s", symbol)`
|
||||
- Line 487: Blank line for readability. Code: `<blank>`
|
||||
- Line 488: Execute the statement as written. Code: ` option_chain, expiration_dates = read_option_chain(page)`
|
||||
- Line 489: Execute the statement as written. Code: ` app.logger.info("Option chain found: %s", bool(option_chain))`
|
||||
- Line 490: Execute the statement as written. Code: ` expiration_options = build_expiration_options(expiration_dates)`
|
||||
- Line 491: Blank line for readability. Code: `<blank>`
|
||||
- Line 492: Execute the statement as written. Code: ` if fallback_to_base:`
|
||||
- Line 493: Execute the statement as written. Code: ` resolved_value, resolved_label = resolve_expiration(`
|
||||
- Line 494: Execute the statement as written. Code: ` requested_expiration, expiration_options`
|
||||
- Line 495: Execute the statement as written. Code: ` )`
|
||||
- Line 496: Execute the statement as written. Code: ` if resolved_value is None:`
|
||||
- Line 497: Execute the statement as written. Code: ` return {`
|
||||
- Line 498: Execute the statement as written. Code: ` "error": "Requested expiration not available",`
|
||||
- Line 499: Execute the statement as written. Code: ` "stock": symbol,`
|
||||
- Line 500: Execute the statement as written. Code: ` "requested_expiration": requested_expiration,`
|
||||
- Line 501: Execute the statement as written. Code: ` "available_expirations": [`
|
||||
- Line 502: Execute the statement as written. Code: ` {"label": opt.get("label"), "value": opt.get("value")}`
|
||||
- Line 503: Execute the statement as written. Code: ` for opt in expiration_options`
|
||||
- Line 504: Execute the statement as written. Code: ` ],`
|
||||
- Line 505: Execute the statement as written. Code: ` }`
|
||||
- Line 506: Blank line for readability. Code: `<blank>`
|
||||
- Line 507: Execute the statement as written. Code: ` target_date = resolved_value`
|
||||
- Line 508: Execute the statement as written. Code: ` selected_expiration_value = resolved_value`
|
||||
- Line 509: Execute the statement as written. Code: ` selected_expiration_label = resolved_label or format_expiration_label(`
|
||||
- Line 510: Execute the statement as written. Code: ` resolved_value`
|
||||
- Line 511: Execute the statement as written. Code: ` )`
|
||||
- Line 512: Execute the statement as written. Code: ` url = f"{base_url}?date={resolved_value}"`
|
||||
- Line 513: Execute the statement as written. Code: ` page.goto(url, wait_until="domcontentloaded", timeout=60000)`
|
||||
- Line 514: Execute the statement as written. Code: ` app.logger.info("Page loaded (domcontentloaded) for %s", symbol)`
|
||||
- Line 515: Blank line for readability. Code: `<blank>`
|
||||
- Line 516: Execute the statement as written. Code: ` option_chain, expiration_dates = read_option_chain(page)`
|
||||
- Line 517: Execute the statement as written. Code: ` expiration_options = build_expiration_options(expiration_dates)`
|
||||
- Line 518: Blank line for readability. Code: `<blank>`
|
||||
- Line 519: Execute the statement as written. Code: ` if target_date and expiration_options:`
|
||||
- Line 520: Execute the statement as written. Code: ` matched = None`
|
||||
- Line 521: Execute the statement as written. Code: ` for opt in expiration_options:`
|
||||
- Line 522: Execute the statement as written. Code: ` if opt.get("value") == target_date:`
|
||||
- Line 523: Execute the statement as written. Code: ` matched = opt`
|
||||
- Line 524: Execute the statement as written. Code: ` break`
|
||||
- Line 525: If the requested epoch is not among the listed expirations, return an error that includes the available ones. Code: ` if not matched:`
|
||||
- Line 526: Execute the statement as written. Code: ` return {`
|
||||
- Line 527: Execute the statement as written. Code: ` "error": "Requested expiration not available",`
|
||||
- Line 528: Execute the statement as written. Code: ` "stock": symbol,`
|
||||
- Line 529: Execute the statement as written. Code: ` "requested_expiration": requested_expiration,`
|
||||
- Line 530: Execute the statement as written. Code: ` "available_expirations": [`
|
||||
- Line 531: Execute the statement as written. Code: ` {"label": opt.get("label"), "value": opt.get("value")}`
|
||||
- Line 532: Execute the statement as written. Code: ` for opt in expiration_options`
|
||||
- Line 533: Execute the statement as written. Code: ` ],`
|
||||
- Line 534: Execute the statement as written. Code: ` }`
|
||||
- Line 535: Execute the statement as written. Code: ` selected_expiration_value = matched.get("value")`
|
||||
- Line 536: Execute the statement as written. Code: ` selected_expiration_label = matched.get("label")`
|
||||
- Line 537: Execute the statement as written. Code: ` elif expiration_options and not target_date:`
|
||||
- Line 538: With no expiration requested, report the first listed expiration as the selected one. Code: ` selected_expiration_value = expiration_options[0].get("value")`
|
||||
- Line 539: Execute the statement as written. Code: ` selected_expiration_label = expiration_options[0].get("label")`
|
||||
- Line 540: Blank line for readability. Code: `<blank>`
|
||||
- Line 541: Build call and put rows from the embedded option chain JSON. Code: ` calls_full, puts_full = build_rows_from_chain(option_chain)`
|
||||
- Line 542: Execute the statement as written. Code: ` app.logger.info(`
|
||||
- Line 543: Execute the statement as written. Code: ` "Option chain rows: calls=%d puts=%d",`
|
||||
- Line 544: Execute the statement as written. Code: ` len(calls_full),`
|
||||
- Line 545: Execute the statement as written. Code: ` len(puts_full),`
|
||||
- Line 546: Execute the statement as written. Code: ` )`
|
||||
- Line 547: Blank line for readability. Code: `<blank>`
|
||||
- Line 548: If the JSON yielded no rows, fall back to the rendered HTML tables. Code: ` if not calls_full and not puts_full:`
|
||||
- Line 549: Execute the statement as written. Code: ` app.logger.info("Waiting for options tables...")`
|
||||
- Line 550: Blank line for readability. Code: `<blank>`
|
||||
- Line 551: Execute the statement as written. Code: ` tables = wait_for_tables(page)`
|
||||
- Line 552: Execute the statement as written. Code: ` if len(tables) < 2:`
|
||||
- Line 553: Execute the statement as written. Code: ` app.logger.error(`
|
||||
- Line 554: Execute the statement as written. Code: ` "Only %d tables found; expected 2. HTML may have changed.",`
|
||||
- Line 555: Execute the statement as written. Code: ` len(tables),`
|
||||
- Line 556: Execute the statement as written. Code: ` )`
|
||||
- Line 557: Execute the statement as written. Code: ` return {"error": "Could not locate options tables", "stock": symbol}`
|
||||
- Line 558: Blank line for readability. Code: `<blank>`
|
||||
- Line 559: Execute the statement as written. Code: ` app.logger.info("Found %d tables. Extracting Calls & Puts.", len(tables))`
|
||||
- Line 560: Blank line for readability. Code: `<blank>`
|
||||
- Line 561: Capture the first table's outer HTML as the calls table. Code: ` calls_html = tables[0].evaluate("el => el.outerHTML")`
|
||||
- Line 562: Capture the second table's outer HTML as the puts table. Code: ` puts_html = tables[1].evaluate("el => el.outerHTML")`
|
||||
- Line 563: Blank line for readability. Code: `<blank>`
|
||||
- Line 564: Comment describing the next block. Code: ` # --- Extract current price ---`
|
||||
- Line 565: Execute the statement as written. Code: ` try:`
|
||||
- Line 566: Comment describing the next block. Code: ` # Primary selector`
|
||||
- Line 567: Execute the statement as written. Code: ` price_text = page.locator(`
|
||||
- Line 568: Execute the statement as written. Code: ` "fin-streamer[data-field='regularMarketPrice']"`
|
||||
- Line 569: Execute the statement as written. Code: ` ).inner_text()`
|
||||
- Line 570: Execute the statement as written. Code: ` price = float(price_text.replace(",", ""))`
|
||||
- Line 571: Execute the statement as written. Code: ` except Exception:`
|
||||
- Line 572: Execute the statement as written. Code: ` try:`
|
||||
- Line 573: Comment describing the next block. Code: ` # Fallback`
|
||||
- Line 574: Execute the statement as written. Code: ` price_text = page.locator("span[data-testid='qsp-price']").inner_text()`
|
||||
- Line 575: Execute the statement as written. Code: ` price = float(price_text.replace(",", ""))`
|
||||
- Line 576: Execute the statement as written. Code: ` except Exception as e:`
|
||||
- Line 577: Execute the statement as written. Code: ` app.logger.warning("Failed to extract price for %s: %s", symbol, e)`
|
||||
- Line 578: Blank line for readability. Code: `<blank>`
|
||||
- Line 579: Execute the statement as written. Code: ` app.logger.info("Current price for %s = %s", symbol, price)`
|
||||
- Line 580: Execute the statement as written. Code: ` finally:`
|
||||
- Line 581: Always close the browser, including on errors and early returns. Code: ` browser.close()`
|
||||
- Line 582: Blank line for readability. Code: `<blank>`
|
||||
- Line 583: Parse the captured HTML tables only when the JSON path produced no rows. Code: ` if not calls_full and not puts_full and calls_html and puts_html:`
|
||||
- Line 584: Execute the statement as written. Code: ` calls_full = parse_table(calls_html, "calls")`
|
||||
- Line 585: Execute the statement as written. Code: ` puts_full = parse_table(puts_html, "puts")`
|
||||
- Line 586: Blank line for readability. Code: `<blank>`
|
||||
- Line 587: Derive the expected YYMMDD expiry code from the target epoch. Code: ` expected_code = expected_expiry_code(target_date)`
|
||||
- Line 588: Execute the statement as written. Code: ` if expected_code:`
|
||||
- Line 589: Reject the result if neither calls nor puts contain a contract with the expected expiry code. Code: ` if not has_expected_expiry(calls_full, expected_code) and not has_expected_expiry(`
|
||||
- Line 590: Execute the statement as written. Code: ` puts_full, expected_code`
|
||||
- Line 591: Execute the statement as written. Code: ` ):`
|
||||
- Line 592: Execute the statement as written. Code: ` return {`
|
||||
- Line 593: Execute the statement as written. Code: ` "error": "Options chain does not match requested expiration",`
|
||||
- Line 594: Execute the statement as written. Code: ` "stock": symbol,`
|
||||
- Line 595: Execute the statement as written. Code: ` "requested_expiration": requested_expiration,`
|
||||
- Line 596: Execute the statement as written. Code: ` "expected_expiration_code": expected_code,`
|
||||
- Line 597: Execute the statement as written. Code: ` "selected_expiration": {`
|
||||
- Line 598: Execute the statement as written. Code: ` "value": selected_expiration_value,`
|
||||
- Line 599: Execute the statement as written. Code: ` "label": selected_expiration_label,`
|
||||
- Line 600: Execute the statement as written. Code: ` },`
|
||||
- Line 601: Execute the statement as written. Code: ` }`
|
||||
- Line 602: Blank line for readability. Code: `<blank>`
|
||||
- Line 603: Comment describing the next block. Code: ` # ----------------------------------------------------------------------`
|
||||
- Line 604: Comment describing the next block. Code: ` # Pruning logic`
|
||||
- Line 605: Comment describing the next block. Code: ` # ----------------------------------------------------------------------`
|
||||
- Line 606: Define the prune_nearest function. Code: ` def prune_nearest(options, price_value, limit=25, side=""):`
|
||||
- Line 607: Without a current price there is no reference point, so skip pruning. Code: ` if price_value is None:`
|
||||
- Line 608: Execute the statement as written. Code: ` return options, 0`
|
||||
- Line 609: Blank line for readability. Code: `<blank>`
|
||||
- Line 610: Keep only rows with a numeric strike. Code: ` numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]`
|
||||
- Line 611: Blank line for readability. Code: `<blank>`
|
||||
- Line 612: Execute the statement as written. Code: ` if len(numeric) <= limit:`
|
||||
- Line 613: Execute the statement as written. Code: ` return numeric, 0`
|
||||
- Line 614: Blank line for readability. Code: `<blank>`
|
||||
- Line 615: Sort rows by the absolute distance between strike and current price. Code: ` sorted_opts = sorted(numeric, key=lambda x: abs(x["Strike"] - price_value))`
|
||||
- Line 616: Keep the `limit` nearest strikes. Code: ` pruned = sorted_opts[:limit]`
|
||||
- Line 617: Count removed rows against the original list, so rows dropped for a non-numeric strike are included in the count. Code: ` pruned_count = len(options) - len(pruned)`
|
||||
- Line 618: Execute the statement as written. Code: ` return pruned, pruned_count`
|
||||
- Line 619: Blank line for readability. Code: `<blank>`
|
||||
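
The pruning described on lines 606-618 is reassembled below as a standalone function with sample data. The logic follows the walkthrough; the unused `side` parameter is dropped, and the sample chain and price are invented for illustration.

```python
def prune_nearest(options, price, limit=25):
    """Keep the `limit` rows whose strike is closest to `price`."""
    if price is None:
        return options, 0  # no reference point, so nothing is pruned
    numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]
    if len(numeric) <= limit:
        return numeric, 0
    nearest = sorted(numeric, key=lambda o: abs(o["Strike"] - price))[:limit]
    # Counted against the original list, so non-numeric rows are included.
    return nearest, len(options) - len(nearest)


chain = [{"Strike": s} for s in (90.0, 95.0, 100.0, 105.0, 110.0)]
kept, removed = prune_nearest(chain, price=101.0, limit=3)
print(sorted(o["Strike"] for o in kept), removed)
```

Note that the returned rows are ordered by distance from the price, not by strike, so a caller that needs ascending strikes has to sort them again.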
- Line 620: Execute the statement as written. Code: ` calls, pruned_calls = prune_nearest(`
|
||||
- Line 621: Execute the statement as written. Code: ` calls_full,`
|
||||
- Line 622: Execute the statement as written. Code: ` price,`
|
||||
- Line 623: Execute the statement as written. Code: ` limit=strike_limit,`
|
||||
- Line 624: Execute the statement as written. Code: ` side="calls",`
|
||||
- Line 625: Execute the statement as written. Code: ` )`
|
||||
- Line 626: Execute the statement as written. Code: ` puts, pruned_puts = prune_nearest(`
|
||||
- Line 627: Execute the statement as written. Code: ` puts_full,`
|
||||
- Line 628: Execute the statement as written. Code: ` price,`
|
||||
- Line 629: Execute the statement as written. Code: ` limit=strike_limit,`
|
||||
- Line 630: Execute the statement as written. Code: ` side="puts",`
|
||||
- Line 631: Execute the statement as written. Code: ` )`
|
||||
- Line 632: Blank line for readability. Code: `<blank>`
|
||||
- Line 633: Define the strike_range function. Code: ` def strike_range(opts):`
|
||||
- Line 634: Execute the statement as written. Code: ` strikes = [o["Strike"] for o in opts if isinstance(o.get("Strike"), (int, float))]`
|
||||
- Line 635: Execute the statement as written. Code: ` return [min(strikes), max(strikes)] if strikes else [None, None]`
|
||||
- Line 636: Blank line for readability. Code: `<blank>`
|
||||
- Line 637: Execute the statement as written. Code: ` return {`
|
||||
- Line 638: Execute the statement as written. Code: ` "stock": symbol,`
|
||||
- Line 639: Execute the statement as written. Code: ` "url": url,`
|
||||
- Line 640: Execute the statement as written. Code: ` "requested_expiration": requested_expiration,`
|
||||
- Line 641: Execute the statement as written. Code: ` "selected_expiration": {`
|
||||
- Line 642: Execute the statement as written. Code: ` "value": selected_expiration_value,`
|
||||
- Line 643: Execute the statement as written. Code: ` "label": selected_expiration_label,`
|
||||
- Line 644: Execute the statement as written. Code: ` },`
|
||||
- Line 645: Execute the statement as written. Code: ` "current_price": price,`
|
||||
- Line 646: Execute the statement as written. Code: ` "calls": calls,`
|
||||
- Line 647: Execute the statement as written. Code: ` "puts": puts,`
|
||||
- Line 648: Execute the statement as written. Code: ` "calls_strike_range": strike_range(calls),`
|
||||
- Line 649: Execute the statement as written. Code: ` "puts_strike_range": strike_range(puts),`
|
||||
- Line 650: Execute the statement as written. Code: ` "total_calls": len(calls),`
|
||||
- Line 651: Execute the statement as written. Code: ` "total_puts": len(puts),`
|
||||
- Line 652: Execute the statement as written. Code: ` "pruned_calls_count": pruned_calls,`
|
||||
- Line 653: Execute the statement as written. Code: ` "pruned_puts_count": pruned_puts,`
|
||||
- Line 654: Execute the statement as written. Code: ` }`
|
||||
- Line 655: Blank line for readability. Code: `<blank>`
|
||||
- Line 656: Blank line for readability. Code: `<blank>`
|
||||
- Line 657: Attach a decorator to the next function. Code: `@app.route("/scrape_sync")`
|
||||
- Line 658: Define the scrape_sync function. Code: `def scrape_sync():`
|
||||
- Line 659: Read the `stock` query parameter, defaulting to `MSFT`. Code: ` symbol = request.args.get("stock", "MSFT")`
|
||||
- Line 660: Accept `expiration`, `expiry`, or `date` as aliases, checked in that order (lines 660-664). Code: ` expiration = (`
|
||||
- Line 661: Execute the statement as written. Code: ` request.args.get("expiration")`
|
||||
- Line 662: Execute the statement as written. Code: ` or request.args.get("expiry")`
|
||||
- Line 663: Execute the statement as written. Code: ` or request.args.get("date")`
|
||||
- Line 664: Execute the statement as written. Code: ` )`
|
||||
- Line 665: Parse `strikeLimit`, falling back to 25 when it is missing or invalid. Code: ` strike_limit = parse_strike_limit(request.args.get("strikeLimit"), default=25)`
|
||||
- Line 666: Execute the statement as written. Code: ` app.logger.info(`
|
||||
- Line 667: Execute the statement as written. Code: ` "Received /scrape_sync request for symbol=%s expiration=%s strike_limit=%s",`
|
||||
- Line 668: Execute the statement as written. Code: ` symbol,`
|
||||
- Line 669: Execute the statement as written. Code: ` expiration,`
|
||||
- Line 670: Execute the statement as written. Code: ` strike_limit,`
|
||||
- Line 671: Execute the statement as written. Code: ` )`
|
||||
- Line 672: Run the scrape and return the result as JSON. Code: ` return jsonify(scrape_yahoo_options(symbol, expiration, strike_limit))`
|
||||
- Line 673: Blank line for readability. Code: `<blank>`
|
||||
- Line 674: Blank line for readability. Code: `<blank>`
|
||||
- Line 675: Run the Flask development server when executed as a script. Code: `if __name__ == "__main__":`
|
||||
- Line 676: Listen on all interfaces on port 9777. Code: ` app.run(host="0.0.0.0", port=9777)`
|
||||
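
A client for the `/scrape_sync` route only needs to assemble the query string described on lines 659-665. The sketch below builds such a URL without sending a request. `build_request_url` is a hypothetical helper, and the base URL assumes the service is running locally on port 9777, as configured on line 676.

```python
import urllib.parse


def build_request_url(stock, expiration=None, strike_limit=None,
                      base_url="http://127.0.0.1:9777/scrape_sync"):
    """Assemble a /scrape_sync URL, omitting parameters that were not given."""
    params = {"stock": stock}
    if expiration is not None:
        params["expiration"] = expiration  # "expiry" and "date" are accepted aliases
    if strike_limit is not None:
        params["strikeLimit"] = strike_limit
    return f"{base_url}?{urllib.parse.urlencode(params)}"


print(build_request_url("MSFT", expiration="2026-01-16", strike_limit=10))
```

Omitted parameters fall back to the server-side defaults: `MSFT` for the symbol, the first listed expiration, and 25 strikes per side.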
13
Dockerfile
Normal file
@@ -0,0 +1,13 @@
FROM mcr.microsoft.com/playwright/python:v1.57.0-jammy

WORKDIR /app

ENV PYTHONUNBUFFERED=1

COPY scraper_service.py /app/scraper_service.py

RUN python -m pip install --no-cache-dir flask beautifulsoup4 playwright==1.57.0

EXPOSE 9777

CMD ["python", "scraper_service.py"]
1415
scraper_service.py
File diff suppressed because it is too large
Load Diff
199
scripts/test_cycles.py
Normal file
@@ -0,0 +1,199 @@
import argparse
import datetime
import json
import sys
import time
import urllib.parse
import urllib.request

DEFAULT_STOCKS = ["AAPL", "AMZN", "MSFT", "TSLA"]
DEFAULT_CYCLES = [None, 5, 10, 25, 50, 75, 100, 150, 200, 500]


def http_get(base_url, params, timeout):
    query = urllib.parse.urlencode(params)
    url = f"{base_url}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def expected_code_from_epoch(epoch):
    return datetime.datetime.utcfromtimestamp(epoch).strftime("%y%m%d")


def all_contracts_match(opts, expected_code):
    for opt in opts:
        name = opt.get("Contract Name") or ""
        if expected_code not in name:
            return False
    return True


def parse_list(value, default):
    if not value:
        return default
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_cycles(value):
    if not value:
        return DEFAULT_CYCLES
    cycles = []
    for item in value.split(","):
        token = item.strip().lower()
        if not token or token in ("default", "none"):
            cycles.append(None)
            continue
        try:
            cycles.append(int(token))
        except ValueError:
            raise ValueError(f"Invalid strikeLimit value: {item}")
    return cycles


def main():
    parser = argparse.ArgumentParser(description="Yahoo options scraper test cycles")
    parser.add_argument(
        "--base-url",
        default="http://127.0.0.1:9777/scrape_sync",
        help="Base URL for /scrape_sync",
    )
    parser.add_argument(
        "--stocks",
        default=",".join(DEFAULT_STOCKS),
        help="Comma-separated stock symbols",
    )
    parser.add_argument(
        "--strike-limits",
        default="default,5,10,25,50,75,100,150,200,500",
        help="Comma-separated strike limits (use 'default' for the API default)",
    )
    parser.add_argument(
        "--baseline-limit",
        type=int,
        default=5000,
        help="Large strikeLimit used to capture all available strikes",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=180,
        help="Request timeout in seconds",
    )
    parser.add_argument(
        "--sleep",
        type=float,
        default=0.2,
        help="Sleep between requests",
    )
    args = parser.parse_args()

    stocks = parse_list(args.stocks, DEFAULT_STOCKS)
    cycles = parse_cycles(args.strike_limits)

    print("Fetching expiration lists...")
    expirations = {}
    for stock in stocks:
        data = http_get(args.base_url, {"stock": stock, "expiration": "invalid"}, args.timeout)
        if "available_expirations" not in data:
            print(f"ERROR: missing available_expirations for {stock}: {data}")
            sys.exit(1)
        values = [opt.get("value") for opt in data["available_expirations"] if opt.get("value")]
        if len(values) < 4:
            print(f"ERROR: not enough expirations for {stock}: {values}")
            sys.exit(1)
        expirations[stock] = values[:4]
        print(f" {stock}: {expirations[stock]}")
        time.sleep(args.sleep)

    print("\nBuilding baseline counts (strikeLimit=%d)..." % args.baseline_limit)
    baseline_counts = {}
    for stock, exp_list in expirations.items():
        for exp in exp_list:
            data = http_get(
                args.base_url,
                {"stock": stock, "expiration": exp, "strikeLimit": args.baseline_limit},
                args.timeout,
            )
            if "error" in data:
                print(f"ERROR: baseline error for {stock} {exp}: {data}")
                sys.exit(1)
            calls_count = data.get("total_calls")
            puts_count = data.get("total_puts")
            if calls_count is None or puts_count is None:
                print(f"ERROR: baseline missing counts for {stock} {exp}: {data}")
                sys.exit(1)
            expected_code = expected_code_from_epoch(exp)
            if not all_contracts_match(data.get("calls", []), expected_code):
                print(f"ERROR: baseline calls mismatch for {stock} {exp}")
                sys.exit(1)
            if not all_contracts_match(data.get("puts", []), expected_code):
                print(f"ERROR: baseline puts mismatch for {stock} {exp}")
                sys.exit(1)
            baseline_counts[(stock, exp)] = (calls_count, puts_count)
            print(f" {stock} {exp}: calls={calls_count} puts={puts_count}")
            time.sleep(args.sleep)

    print("\nRunning %d cycles of API tests..." % len(cycles))
    for idx, strike_limit in enumerate(cycles, start=1):
        print(f"Cycle {idx}/{len(cycles)} (strikeLimit={strike_limit})")
        for stock, exp_list in expirations.items():
            for exp in exp_list:
                params = {"stock": stock, "expiration": exp}
                if strike_limit is not None:
                    params["strikeLimit"] = strike_limit
                data = http_get(args.base_url, params, args.timeout)
                if "error" in data:
                    print(f"ERROR: {stock} {exp} -> {data}")
                    sys.exit(1)
                selected_val = data.get("selected_expiration", {}).get("value")
                if selected_val != exp:
                    print(
                        f"ERROR: selected expiration mismatch for {stock} {exp}: {selected_val}"
                    )
                    sys.exit(1)
                expected_code = expected_code_from_epoch(exp)
                if not all_contracts_match(data.get("calls", []), expected_code):
                    print(f"ERROR: calls expiry mismatch for {stock} {exp}")
                    sys.exit(1)
                if not all_contracts_match(data.get("puts", []), expected_code):
                    print(f"ERROR: puts expiry mismatch for {stock} {exp}")
                    sys.exit(1)
                available_calls, available_puts = baseline_counts[(stock, exp)]
                expected_limit = strike_limit if strike_limit is not None else 25
                expected_calls = min(expected_limit, available_calls)
                expected_puts = min(expected_limit, available_puts)
                if data.get("total_calls") != expected_calls:
                    print(
                        f"ERROR: call count mismatch for {stock} {exp}: "
                        f"got {data.get('total_calls')} expected {expected_calls}"
                    )
                    sys.exit(1)
                if data.get("total_puts") != expected_puts:
                    print(
                        f"ERROR: put count mismatch for {stock} {exp}: "
                        f"got {data.get('total_puts')} expected {expected_puts}"
                    )
                    sys.exit(1)
                expected_pruned_calls = max(0, available_calls - expected_calls)
                expected_pruned_puts = max(0, available_puts - expected_puts)
                if data.get("pruned_calls_count") != expected_pruned_calls:
                    print(
                        f"ERROR: pruned calls mismatch for {stock} {exp}: "
                        f"got {data.get('pruned_calls_count')} expected {expected_pruned_calls}"
                    )
                    sys.exit(1)
                if data.get("pruned_puts_count") != expected_pruned_puts:
                    print(
                        f"ERROR: pruned puts mismatch for {stock} {exp}: "
                        f"got {data.get('pruned_puts_count')} expected {expected_pruned_puts}"
                    )
                    sys.exit(1)
                time.sleep(args.sleep)
        print(f"Cycle {idx} OK")

    print("\nAll cycles completed successfully.")


if __name__ == "__main__":
    main()
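The two checks that `scripts/test_cycles.py` repeats for every request can be tried without a running service. The sketch below is an illustration only: the helper names `expected_code` and `expected_counts` and the sample contract symbol are assumptions, not part of the commit. It uses the timezone-aware `datetime.fromtimestamp(..., tz=timezone.utc)` form rather than `utcfromtimestamp`, which is deprecated since Python 3.12.

```python
import datetime


def expected_code(epoch):
    """YYMMDD code (UTC) that every contract symbol for this expiry should contain."""
    moment = datetime.datetime.fromtimestamp(epoch, tz=datetime.timezone.utc)
    return moment.strftime("%y%m%d")


def expected_counts(available, strike_limit, default_limit=25):
    """Return (rows_returned, rows_pruned) for one side of the chain.

    `available` is the baseline row count; `strike_limit` of None means
    the request omitted strikeLimit and the API default applies.
    """
    limit = default_limit if strike_limit is None else strike_limit
    returned = min(limit, available)
    pruned = max(0, available - returned)
    return returned, pruned


code = expected_code(1768521600)          # 2026-01-16 00:00:00 UTC
print(code)                               # 260116
print(code in "AAPL260116C00150000")      # True (hypothetical contract symbol)
print(expected_counts(40, None))          # (25, 15)
print(expected_counts(10, 25))            # (10, 0)
```

With 40 available rows and no `strikeLimit`, the test expects 25 rows back and a pruned count of 15. With only 10 available rows, all 10 come back and nothing is pruned.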
145
scripts/test_profile_cycles.py
Normal file
@@ -0,0 +1,145 @@
import argparse
import json
import sys
import time
import urllib.parse
import urllib.request

DEFAULT_SYMBOLS = ["AAPL", "AMZN", "MSFT", "TSLA"]

REQUIRED_SECTIONS = [
    "key_metrics",
    "valuation",
    "profitability",
    "growth",
    "financial_strength",
    "cashflow",
    "ownership",
    "analyst",
    "earnings",
    "performance",
]

REQUIRED_KEY_METRICS = [
    "previous_close",
    "open",
    "bid",
    "ask",
    "beta",
    "eps_trailing",
    "dividend_rate",
    "current_price",
]


def http_get(base_url, params, timeout):
    query = urllib.parse.urlencode(params)
    url = f"{base_url}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def parse_list(value, default):
    if not value:
        return default
    return [item.strip() for item in value.split(",") if item.strip()]


def build_signature(data):
    return {
        "key_metrics_keys": sorted(data.get("key_metrics", {}).keys()),
        "valuation_keys": sorted(data.get("valuation", {}).keys()),
        "profitability_keys": sorted(data.get("profitability", {}).keys()),
        "growth_keys": sorted(data.get("growth", {}).keys()),
        "financial_strength_keys": sorted(data.get("financial_strength", {}).keys()),
        "cashflow_keys": sorted(data.get("cashflow", {}).keys()),
        "ownership_keys": sorted(data.get("ownership", {}).keys()),
        "analyst_keys": sorted(data.get("analyst", {}).keys()),
        "earnings_keys": sorted(data.get("earnings", {}).keys()),
        "performance_keys": sorted(data.get("performance", {}).keys()),
    }


def validate_payload(symbol, data):
    if "error" in data:
        return f"API error for {symbol}: {data}"
    if data.get("stock", "").upper() != symbol.upper():
        return f"Symbol mismatch: expected {symbol} got {data.get('stock')}"
    validation = data.get("validation", {})
    if validation.get("symbol_match") is not True:
        return f"Validation symbol_match failed for {symbol}: {validation}"
    if validation.get("issues"):
        return f"Validation issues for {symbol}: {validation}"

    for section in REQUIRED_SECTIONS:
        if section not in data:
            return f"Missing section {section} for {symbol}"

    key_metrics = data.get("key_metrics", {})
    for field in REQUIRED_KEY_METRICS:
        if field not in key_metrics:
            return f"Missing key metric {field} for {symbol}"

    return None


def main():
    parser = argparse.ArgumentParser(description="Yahoo profile scraper test cycles")
    parser.add_argument(
        "--base-url",
        default="http://127.0.0.1:9777/profile",
        help="Base URL for /profile",
    )
    parser.add_argument(
        "--symbols",
        default=",".join(DEFAULT_SYMBOLS),
        help="Comma-separated stock symbols",
    )
    parser.add_argument(
        "--runs",
        type=int,
        default=8,
        help="Number of validation runs per symbol",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=180,
        help="Request timeout in seconds",
    )
    parser.add_argument(
        "--sleep",
        type=float,
        default=0.2,
        help="Sleep between requests",
    )
    args = parser.parse_args()

    symbols = parse_list(args.symbols, DEFAULT_SYMBOLS)
    signatures = {}

    print(f"Running {args.runs} profile cycles for: {', '.join(symbols)}")
    for run in range(1, args.runs + 1):
        print(f"Cycle {run}/{args.runs}")
        for symbol in symbols:
            data = http_get(args.base_url, {"stock": symbol}, args.timeout)
            error = validate_payload(symbol, data)
            if error:
                print(f"ERROR: {error}")
                sys.exit(1)
            signature = build_signature(data)
            if symbol not in signatures:
                signatures[symbol] = signature
            elif signatures[symbol] != signature:
                print(f"ERROR: Signature changed for {symbol}")
                print(f"Baseline: {signatures[symbol]}")
                print(f"Current: {signature}")
                sys.exit(1)
            time.sleep(args.sleep)
        print(f"Cycle {run} OK")

    print("\nAll profile cycles completed successfully.")


if __name__ == "__main__":
    main()
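The stability check in `scripts/test_profile_cycles.py` compares the set of keys in each section across runs, not the values. The sketch below is a compact illustration of that idea: the two-argument `build_signature` and the sample payloads are made up for the example and are not the script's own code or real API output.

```python
def build_signature(data, sections):
    """Map each section to its sorted key list, ignoring the values."""
    return {f"{name}_keys": sorted(data.get(name, {}).keys()) for name in sections}


sections = ["key_metrics", "valuation"]

# Made-up payloads: same keys with different values, then one with a key missing.
first = {"key_metrics": {"open": 1.0, "bid": 2.0}, "valuation": {"pe": 30}}
second = {"key_metrics": {"bid": 2.5, "open": 1.1}, "valuation": {"pe": 31}}
third = {"key_metrics": {"open": 1.0}, "valuation": {"pe": 30}}

baseline = build_signature(first, sections)
print(baseline == build_signature(second, sections))  # True: values and key order differ only
print(baseline == build_signature(third, sections))   # False: "bid" disappeared
```

Comparing key sets rather than values lets prices move between runs while still flagging a scrape that silently drops a field.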