Compare commits

...

11 Commits

7 changed files with 2486 additions and 60 deletions

13
.dockerignore Normal file

@@ -0,0 +1,13 @@
.git/
.gitignore
__pycache__/
*.pyc
venv/
.venv/
.env
.env.*
.pytest_cache/
charts/
yahoo.html
scraper_service(works).py
scraper_service.working.backup.py

7
.gitignore vendored Normal file

@@ -0,0 +1,7 @@
__pycache__/
*.pyc
venv/
.venv/
.env
.env.*
.pytest_cache/

754
AGENTS.md Normal file

@@ -0,0 +1,754 @@
# AGENTS.md
## Context
- This project exposes a Flask API that uses Playwright to scrape Yahoo Finance options chains.
- Entry point: `scraper_service.py` (launched via `runner.bat` or directly with Python).
- The scraper loads the Yahoo options page (optionally with `?date=`) and validates expirations using the YYMMDD code embedded in contract symbols.
- Option chains come from the embedded `optionChain` JSON when available, with an HTML table fallback.
## API
- Route: `GET /scrape_sync`
  - Query params:
    - `stock`: symbol (default `MSFT`).
    - `expiration|expiry|date`: epoch seconds (Yahoo date param) or a date string matching `DATE_FORMATS`.
    - `strikeLimit`: number of nearest strikes to return per side (default `25`).
  - Behavior:
    - If `strikeLimit` is greater than available strikes, all available rows are returned.
    - `pruned_calls_count` and `pruned_puts_count` report how many rows were removed beyond the limit.
    - `selected_expiration` reports the resolved expiry (epoch + label), and mismatches return an error.
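
The `strikeLimit` behavior described above can be sketched as follows. This is a minimal illustration, not the service's code: `prune_nearest_strikes` and `reference_price` are hypothetical names, and the sketch assumes "nearest" means closest to a reference price such as the underlying's last price, which the notes above do not state.

```python
def prune_nearest_strikes(rows, reference_price, strike_limit=25):
    """Keep the strike_limit rows whose strike is closest to reference_price.

    Returns (kept_rows, pruned_count). Assumes each row has a numeric "Strike".
    """
    ranked = sorted(rows, key=lambda row: abs(row["Strike"] - reference_price))
    # Restore ascending strike order for the rows that survive the cut.
    kept = sorted(ranked[:strike_limit], key=lambda row: row["Strike"])
    return kept, len(rows) - len(kept)


rows = [{"Strike": strike} for strike in (90, 95, 100, 105, 110)]
kept, pruned = prune_nearest_strikes(rows, reference_price=101, strike_limit=3)
print([row["Strike"] for row in kept], pruned)
```

When the limit exceeds the number of available rows, the slice keeps everything and the pruned count is zero, which matches the behavior listed above.
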
- Route: `GET /profile`
  - Query params:
    - `stock`: symbol (default `MSFT`).
  - Behavior:
    - Loads `https://finance.yahoo.com/quote/<SYMBOL>/` with Playwright.
    - Pulls the embedded SvelteKit payloads (quoteSummary, quote, quoteType, ratings, recommendations).
    - Returns a pruned JSON with valuation, profitability, growth, financial strength, cashflow, ownership, analyst, earnings, and performance summaries.
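
As a rough illustration of the two routes, the snippet below only builds request URLs; it does not contact the service. The helper names `scrape_sync_url` and `profile_url` are hypothetical, and the base address assumes the local port 9777 used in the Testing section.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:9777"  # assumed local address (see Testing)


def scrape_sync_url(stock="MSFT", expiration=None, strike_limit=25):
    """Build a /scrape_sync URL; expiration may be epoch seconds or a date string."""
    params = {"stock": stock, "strikeLimit": strike_limit}
    if expiration is not None:
        params["expiration"] = expiration
    return f"{BASE_URL}/scrape_sync?{urlencode(params)}"


def profile_url(stock="MSFT"):
    """Build a /profile URL for the given symbol."""
    return f"{BASE_URL}/profile?{urlencode({'stock': stock})}"


print(scrape_sync_url("AAPL", "2025-12-19", 10))
print(profile_url("AAPL"))
```

`urlencode` percent-encodes date strings that contain spaces or commas, such as `Dec 19, 2025`, so they arrive intact as a single query value.
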
## Guard Rails
- Run local 10-cycle validation (4 stocks x 4 expiries) before any deploy or push.
- Run the same 10-cycle validation against the docker container before pushing the image.
- Do not push if any response contains `error` or if contract symbols do not contain the expected YYMMDD code.
- Keep Playwright version aligned with the docker base image (`mcr.microsoft.com/playwright/python:v1.57.0-jammy`).
- Keep the API port open after a successful deploy so it can be tested immediately.
## Testing
- Local server:
  - Start: `.\venv\Scripts\python.exe scraper_service.py`
  - Validate: `python scripts/test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
- Profile validation (local server):
  - Validate: `python scripts/test_profile_cycles.py --base-url http://127.0.0.1:9777/profile --runs 8`
- Docker server:
  - Start: `docker run --rm -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
  - Validate: `python scripts/test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
- Profile validation (docker server):
  - Validate: `python scripts/test_profile_cycles.py --base-url http://127.0.0.1:9777/profile --runs 8`
## Update Log (2025-12-28)
- Added `/profile` endpoint backed by SvelteKit payload parsing (quoteSummary, quote, quoteType, ratings, recommendations).
- `/profile` response trimmed to focus on valuation, profitability, growth, financial strength, cashflow, ownership, analyst, earnings, and performance summaries.
- Validation ensures quote data matches the requested symbol, with issues reported in `validation`.
- Issue encountered: an existing server instance was still bound to port 9777 and predated the `/profile` route; restarting the service with the updated script resolved it.
- Tests executed (local):
  - `.\venv\Scripts\python.exe scripts/test_profile_cycles.py --runs 8 --timeout 180`
  - `.\venv\Scripts\python.exe scripts\test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
- Tests executed (docker):
  - `docker build -t rushabhtechie/yahoo-scraper:latest .`
  - `.\venv\Scripts\python.exe scripts\test_cycles.py --base-url http://127.0.0.1:9777/scrape_sync`
  - `.\venv\Scripts\python.exe scripts\test_profile_cycles.py --base-url http://127.0.0.1:9777/profile --runs 8 --timeout 180`
- The test harness verifies:
  - Requested expiration matches `selected_expiration.value`.
  - Contract symbols include the expected YYMMDD code.
  - `total_calls`/`total_puts` match `min(strikeLimit, available)`.
  - `pruned_*_count` equals the number of rows removed.
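
The four harness checks can be sketched as a single function. This is an illustration under assumptions, not the contents of `scripts/test_cycles.py`: `check_response` is a hypothetical name, the `calls`/`puts` response keys are assumed, and the number of available rows is passed in because the notes do not say how the harness obtains it.

```python
from datetime import datetime, timezone


def check_response(data, requested_epoch, strike_limit, available_calls, available_puts):
    """Return a list of problems found in one response; empty means it passed."""
    if "error" in data:
        return [f"error: {data['error']}"]

    problems = []
    selected = data.get("selected_expiration") or {}
    if selected.get("value") != requested_epoch:
        problems.append("selected_expiration.value does not match the request")

    # YYMMDD code expected inside every contract symbol, derived in UTC.
    code = datetime.fromtimestamp(requested_epoch, tz=timezone.utc).strftime("%y%m%d")
    for side, available in (("calls", available_calls), ("puts", available_puts)):
        for row in data.get(side, []):  # "calls"/"puts" keys are assumed
            if code not in (row.get("Contract Name") or ""):
                problems.append(f"{side}: symbol lacks expiry code {code}")
        expected = min(strike_limit, available)
        if data.get(f"total_{side}") != expected:
            problems.append(f"total_{side} should be {expected}")
        if data.get(f"pruned_{side}_count") != available - expected:
            problems.append(f"pruned_{side}_count should be {available - expected}")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a multi-cycle run report every mismatch in one pass.
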
## Docker
- Build: `docker build -t rushabhtechie/yahoo-scraper:latest .`
- Run (CPU): `docker run --rm -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
- The container uses the Playwright base image with bundled browsers.
## GPU Acceleration
- GPU is auto-detected via `NVIDIA_VISIBLE_DEVICES`, `/dev/nvidia0`, or `/dev/dri`.
- Override detection:
  - Force on: `ENABLE_GPU=1`
  - Force off: `ENABLE_GPU=0`
- Docker (NVIDIA): `docker run --rm --gpus all -e ENABLE_GPU=1 -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
- Docker (AMD/Intel): `docker run --rm --device=/dev/dri --group-add video -e ENABLE_GPU=1 -p 9777:9777 rushabhtechie/yahoo-scraper:latest`
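
To make the override semantics concrete, the sketch below reuses `parse_env_flag` as it appears in `scraper_service.py` and adds a hypothetical helper, `gpu_override`, that takes an environment mapping so it can be exercised without touching the real environment. Device probing is deliberately left out.

```python
def parse_env_flag(value, default=False):
    """Interpret common truthy spellings; anything else that is set is False."""
    if value is None:
        return default
    return str(value).strip().lower() in ("1", "true", "yes", "on")


def gpu_override(environ):
    """Return True or False when ENABLE_GPU is set, or None to use auto-detection."""
    value = environ.get("ENABLE_GPU")
    return None if value is None else parse_env_flag(value)


for env in ({"ENABLE_GPU": "1"}, {"ENABLE_GPU": "0"}, {"ENABLE_GPU": " Yes "}, {}):
    print(env, gpu_override(env))
```

Because any set value outside the truthy list parses as False, `ENABLE_GPU=0` is not special: an unrecognized value such as `ENABLE_GPU=2` also forces GPU off rather than falling back to auto-detection.
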
## Line-by-line explanation of scraper_service.py
- Line 1: Import symbols or modules. Code: `from flask import Flask, jsonify, request`
- Line 2: Import symbols or modules. Code: `from playwright.sync_api import sync_playwright`
- Line 3: Import symbols or modules. Code: `from bs4 import BeautifulSoup`
- Line 4: Import symbols or modules. Code: `from datetime import datetime, timezone`
- Line 5: Import symbols or modules. Code: `import urllib.parse`
- Line 6: Import symbols or modules. Code: `import logging`
- Line 7: Import symbols or modules. Code: `import json`
- Line 8: Import symbols or modules. Code: `import re`
- Line 9: Import symbols or modules. Code: `import time`
- Line 10: Import symbols or modules. Code: `import os`
- Line 11: Blank line for readability. Code: `<blank>`
- Line 12: Execute the statement as written. Code: `app = Flask(__name__)`
- Line 13: Blank line for readability. Code: `<blank>`
- Line 14: Comment describing the next block. Code: `# Logging`
- Line 15: Execute the statement as written. Code: `logging.basicConfig(`
- Line 16: Execute the statement as written. Code: ` level=logging.INFO,`
- Line 17: Execute the statement as written. Code: ` format="%(asctime)s [%(levelname)s] %(message)s"`
- Line 18: Execute the statement as written. Code: `)`
- Line 19: Execute the statement as written. Code: `app.logger.setLevel(logging.INFO)`
- Line 20: Blank line for readability. Code: `<blank>`
- Line 21: Execute the statement as written. Code: `DATE_FORMATS = (`
- Line 22: Execute the statement as written. Code: ` "%Y-%m-%d",`
- Line 23: Execute the statement as written. Code: ` "%Y/%m/%d",`
- Line 24: Execute the statement as written. Code: ` "%Y%m%d",`
- Line 25: Execute the statement as written. Code: ` "%b %d, %Y",`
- Line 26: Execute the statement as written. Code: ` "%B %d, %Y",`
- Line 27: Execute the statement as written. Code: `)`
- Line 28: Blank line for readability. Code: `<blank>`
- Line 29: Execute the statement as written. Code: `GPU_ACCEL_ENV = "ENABLE_GPU"`
- Line 30: Blank line for readability. Code: `<blank>`
- Line 31: Blank line for readability. Code: `<blank>`
- Line 32: Define the parse_env_flag function. Code: `def parse_env_flag(value, default=False):`
- Line 33: Execute the statement as written. Code: ` if value is None:`
- Line 34: Execute the statement as written. Code: ` return default`
- Line 35: Return True only when the trimmed, lowercased value is "1", "true", "yes", or "on"; any other set value counts as False. Code: ` return str(value).strip().lower() in ("1", "true", "yes", "on")`
- Line 36: Blank line for readability. Code: `<blank>`
- Line 37: Blank line for readability. Code: `<blank>`
- Line 38: Define the detect_gpu_available function. Code: `def detect_gpu_available():`
- Line 39: Read the `ENABLE_GPU` override from the environment. Code: ` env_value = os.getenv(GPU_ACCEL_ENV)`
- Line 40: Check whether the override is set; if so, it takes precedence over auto-detection. Code: ` if env_value is not None:`
- Line 41: Return the parsed override, so a non-truthy value such as `0` forces GPU off. Code: ` return parse_env_flag(env_value, default=False)`
- Line 42: Blank line for readability. Code: `<blank>`
- Line 43: Read the `NVIDIA_VISIBLE_DEVICES` environment variable. Code: ` nvidia_visible = os.getenv("NVIDIA_VISIBLE_DEVICES")`
- Line 44: Treat any non-empty value other than "none", "void", or "off" as a visible NVIDIA GPU. Code: ` if nvidia_visible and nvidia_visible.lower() not in ("none", "void", "off"):`
- Line 45: Report a GPU as available. Code: ` return True`
- Line 46: Blank line for readability. Code: `<blank>`
- Line 47: Check for the first NVIDIA device node. Code: ` if os.path.exists("/dev/nvidia0"):`
- Line 48: Report a GPU as available. Code: ` return True`
- Line 49: Blank line for readability. Code: `<blank>`
- Line 50: Check for a DRI render node or card, as exposed for AMD/Intel GPUs. Code: ` if os.path.exists("/dev/dri/renderD128") or os.path.exists("/dev/dri/card0"):`
- Line 51: Report a GPU as available. Code: ` return True`
- Line 52: Blank line for readability. Code: `<blank>`
- Line 53: Report that no GPU was detected. Code: ` return False`
- Line 54: Blank line for readability. Code: `<blank>`
- Line 55: Blank line for readability. Code: `<blank>`
- Line 56: Define the chromium_launch_args function. Code: `def chromium_launch_args():`
- Line 57: Execute the statement as written. Code: ` if not detect_gpu_available():`
- Line 58: Execute the statement as written. Code: ` return []`
- Line 59: Blank line for readability. Code: `<blank>`
- Line 60: Execute the statement as written. Code: ` if os.name == "nt":`
- Line 61: Execute the statement as written. Code: ` return ["--enable-gpu"]`
- Line 62: Blank line for readability. Code: `<blank>`
- Line 63: Execute the statement as written. Code: ` return [`
- Line 64: Execute the statement as written. Code: ` "--enable-gpu",`
- Line 65: Execute the statement as written. Code: ` "--ignore-gpu-blocklist",`
- Line 66: Execute the statement as written. Code: ` "--disable-software-rasterizer",`
- Line 67: Execute the statement as written. Code: ` "--use-gl=egl",`
- Line 68: Execute the statement as written. Code: ` "--enable-zero-copy",`
- Line 69: Execute the statement as written. Code: ` "--enable-gpu-rasterization",`
- Line 70: Execute the statement as written. Code: ` ]`
- Line 71: Blank line for readability. Code: `<blank>`
- Line 72: Blank line for readability. Code: `<blank>`
- Line 73: Define the parse_date function. Code: `def parse_date(value):`
- Line 74: Execute the statement as written. Code: ` for fmt in DATE_FORMATS:`
- Line 75: Execute the statement as written. Code: ` try:`
- Line 76: Execute the statement as written. Code: ` return datetime.strptime(value, fmt).date()`
- Line 77: Execute the statement as written. Code: ` except ValueError:`
- Line 78: Execute the statement as written. Code: ` continue`
- Line 79: Execute the statement as written. Code: ` return None`
- Line 80: Blank line for readability. Code: `<blank>`
- Line 81: Blank line for readability. Code: `<blank>`
- Line 82: Define the normalize_label function. Code: `def normalize_label(value):`
- Line 83: Execute the statement as written. Code: ` return " ".join(value.strip().split()).lower()`
- Line 84: Blank line for readability. Code: `<blank>`
- Line 85: Blank line for readability. Code: `<blank>`
- Line 86: Define the format_expiration_label function. Code: `def format_expiration_label(timestamp):`
- Line 87: Execute the statement as written. Code: ` try:`
- Line 88: Execute the statement as written. Code: ` return datetime.utcfromtimestamp(timestamp).strftime("%Y-%m-%d")`
- Line 89: Execute the statement as written. Code: ` except Exception:`
- Line 90: Execute the statement as written. Code: ` return str(timestamp)`
- Line 91: Blank line for readability. Code: `<blank>`
- Line 92: Blank line for readability. Code: `<blank>`
- Line 93: Define the format_percent function. Code: `def format_percent(value):`
- Line 94: Execute the statement as written. Code: ` if value is None:`
- Line 95: Execute the statement as written. Code: ` return None`
- Line 96: Execute the statement as written. Code: ` try:`
- Line 97: Execute the statement as written. Code: ` return f"{value * 100:.2f}%"`
- Line 98: Execute the statement as written. Code: ` except Exception:`
- Line 99: Execute the statement as written. Code: ` return None`
- Line 100: Blank line for readability. Code: `<blank>`
- Line 101: Blank line for readability. Code: `<blank>`
- Line 102: Define the extract_raw_value function. Code: `def extract_raw_value(value):`
- Line 103: Execute the statement as written. Code: ` if isinstance(value, dict):`
- Line 104: Execute the statement as written. Code: ` return value.get("raw")`
- Line 105: Execute the statement as written. Code: ` return value`
- Line 106: Blank line for readability. Code: `<blank>`
- Line 107: Blank line for readability. Code: `<blank>`
- Line 108: Define the extract_fmt_value function. Code: `def extract_fmt_value(value):`
- Line 109: Execute the statement as written. Code: ` if isinstance(value, dict):`
- Line 110: Execute the statement as written. Code: ` return value.get("fmt")`
- Line 111: Execute the statement as written. Code: ` return None`
- Line 112: Blank line for readability. Code: `<blank>`
- Line 113: Blank line for readability. Code: `<blank>`
- Line 114: Define the format_percent_value function. Code: `def format_percent_value(value):`
- Line 115: Execute the statement as written. Code: ` fmt = extract_fmt_value(value)`
- Line 116: Execute the statement as written. Code: ` if fmt is not None:`
- Line 117: Execute the statement as written. Code: ` return fmt`
- Line 118: Execute the statement as written. Code: ` return format_percent(extract_raw_value(value))`
- Line 119: Blank line for readability. Code: `<blank>`
- Line 120: Blank line for readability. Code: `<blank>`
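
Judging by the helpers listed above, payload fields arrive either as plain values or as dicts carrying `raw` and `fmt` keys. The snippet below restates those four helpers in runnable form to show how each input shape is handled; the sample values are made up for illustration.

```python
def extract_raw_value(value):
    """Return the "raw" entry of a dict field, or the value itself."""
    return value.get("raw") if isinstance(value, dict) else value


def extract_fmt_value(value):
    """Return the preformatted "fmt" entry of a dict field, if any."""
    return value.get("fmt") if isinstance(value, dict) else None


def format_percent(value):
    """Render a fraction such as 0.5 as "50.00%"; None or bad input gives None."""
    if value is None:
        return None
    try:
        return f"{value * 100:.2f}%"
    except Exception:
        return None


def format_percent_value(value):
    """Prefer the preformatted string, else format the raw number."""
    fmt = extract_fmt_value(value)
    return fmt if fmt is not None else format_percent(extract_raw_value(value))


print(format_percent_value({"raw": 0.1234, "fmt": "12.34%"}))
print(format_percent_value({"raw": 0.5}))
print(format_percent_value(None))
```
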
- Line 121: Define the format_last_trade_date function. Code: `def format_last_trade_date(timestamp):`
- Line 122: Execute the statement as written. Code: ` timestamp = extract_raw_value(timestamp)`
- Line 123: Execute the statement as written. Code: ` if not timestamp:`
- Line 124: Execute the statement as written. Code: ` return None`
- Line 125: Execute the statement as written. Code: ` try:`
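- Line 126: Format the timestamp in the host's local timezone and append a literal " EST" suffix; the label is accurate only when the host clock is on US Eastern Standard Time. Code: ` return datetime.fromtimestamp(timestamp).strftime("%m/%d/%Y %I:%M %p") + " EST"`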
- Line 127: Execute the statement as written. Code: ` except Exception:`
- Line 128: Execute the statement as written. Code: ` return None`
- Line 129: Blank line for readability. Code: `<blank>`
- Line 130: Blank line for readability. Code: `<blank>`
- Line 131: Define the extract_option_chain_from_html function. Code: `def extract_option_chain_from_html(html):`
- Line 132: Execute the statement as written. Code: ` if not html:`
- Line 133: Execute the statement as written. Code: ` return None`
- Line 134: Blank line for readability. Code: `<blank>`
- Line 135: Set the search token `"body":"`, the prefix of an embedded JSON string value. Code: ` token = "\"body\":\""`
- Line 136: Execute the statement as written. Code: ` start = 0`
- Line 137: Execute the statement as written. Code: ` while True:`
- Line 138: Execute the statement as written. Code: ` idx = html.find(token, start)`
- Line 139: Execute the statement as written. Code: ` if idx == -1:`
- Line 140: Execute the statement as written. Code: ` break`
- Line 141: Execute the statement as written. Code: ` i = idx + len(token)`
- Line 142: Execute the statement as written. Code: ` escaped = False`
- Line 143: Execute the statement as written. Code: ` raw_chars = []`
- Line 144: Execute the statement as written. Code: ` while i < len(html):`
- Line 145: Execute the statement as written. Code: ` ch = html[i]`
- Line 146: Execute the statement as written. Code: ` if escaped:`
- Line 147: Execute the statement as written. Code: ` raw_chars.append(ch)`
- Line 148: Execute the statement as written. Code: ` escaped = False`
- Line 149: Execute the statement as written. Code: ` else:`
- Line 150: Execute the statement as written. Code: ` if ch == "\\":`
- Line 151: Execute the statement as written. Code: ` raw_chars.append(ch)`
- Line 152: Execute the statement as written. Code: ` escaped = True`
- Line 153: Execute the statement as written. Code: ` elif ch == "\"":`
- Line 154: Execute the statement as written. Code: ` break`
- Line 155: Execute the statement as written. Code: ` else:`
- Line 156: Execute the statement as written. Code: ` raw_chars.append(ch)`
- Line 157: Execute the statement as written. Code: ` i += 1`
- Line 158: Execute the statement as written. Code: ` raw = "".join(raw_chars)`
- Line 159: Execute the statement as written. Code: ` try:`
- Line 160: Re-wrap the captured characters in quotes and decode them as a JSON string literal, which resolves the escape sequences. Code: ` body_text = json.loads(f"\"{raw}\"")`
- Line 161: Execute the statement as written. Code: ` except json.JSONDecodeError:`
- Line 162: Execute the statement as written. Code: ` start = idx + len(token)`
- Line 163: Execute the statement as written. Code: ` continue`
- Line 164: Execute the statement as written. Code: ` if "optionChain" not in body_text:`
- Line 165: Execute the statement as written. Code: ` start = idx + len(token)`
- Line 166: Execute the statement as written. Code: ` continue`
- Line 167: Execute the statement as written. Code: ` try:`
- Line 168: Parse the unescaped body text as a JSON document. Code: ` payload = json.loads(body_text)`
- Line 169: Execute the statement as written. Code: ` except json.JSONDecodeError:`
- Line 170: Execute the statement as written. Code: ` start = idx + len(token)`
- Line 171: Execute the statement as written. Code: ` continue`
- Line 172: Execute the statement as written. Code: ` option_chain = payload.get("optionChain")`
- Line 173: Execute the statement as written. Code: ` if option_chain and option_chain.get("result"):`
- Line 174: Execute the statement as written. Code: ` return option_chain`
- Line 175: Blank line for readability. Code: `<blank>`
- Line 176: Execute the statement as written. Code: ` start = idx + len(token)`
- Line 177: Blank line for readability. Code: `<blank>`
- Line 178: Execute the statement as written. Code: ` return None`
- Line 179: Blank line for readability. Code: `<blank>`
- Line 180: Blank line for readability. Code: `<blank>`
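
The function above scans for `"body":"` tokens and decodes a JSON document that is itself stored as a JSON string. The sketch below shows the same double-decoding idea in compact form. It is not the original implementation: it swaps the hand-written character loop for `json.JSONDecoder.raw_decode`, and `find_option_chain` is a hypothetical name.

```python
import json

TOKEN = '"body":'


def find_option_chain(html):
    """Return the first optionChain dict with a non-empty result, else None."""
    decoder = json.JSONDecoder()
    start = 0
    while True:
        idx = html.find(TOKEN, start)
        if idx == -1:
            return None
        start = idx + len(TOKEN)
        try:
            # First decode: the JSON string literal that follows the token.
            body_text, _ = decoder.raw_decode(html, start)
        except json.JSONDecodeError:
            continue
        if not isinstance(body_text, str) or "optionChain" not in body_text:
            continue
        try:
            # Second decode: the JSON document carried inside that string.
            payload = json.loads(body_text)
        except json.JSONDecodeError:
            continue
        chain = payload.get("optionChain") if isinstance(payload, dict) else None
        if chain and chain.get("result"):
            return chain


inner = json.dumps({"optionChain": {"result": [{"expirationDates": [1766102400]}]}})
html = '<script>{"body":' + json.dumps(inner) + "}</script>"
print(find_option_chain(html))
```

Candidates that fail either decode, or that lack a usable `optionChain`, are skipped so the scan can continue with the next token, mirroring the `continue` branches in the listing.
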
- Line 181: Define the extract_expiration_dates_from_chain function. Code: `def extract_expiration_dates_from_chain(chain):`
- Line 182: Execute the statement as written. Code: ` if not chain:`
- Line 183: Execute the statement as written. Code: ` return []`
- Line 184: Blank line for readability. Code: `<blank>`
- Line 185: Execute the statement as written. Code: ` result = chain.get("result", [])`
- Line 186: Execute the statement as written. Code: ` if not result:`
- Line 187: Execute the statement as written. Code: ` return []`
- Line 188: Execute the statement as written. Code: ` return result[0].get("expirationDates", []) or []`
- Line 189: Blank line for readability. Code: `<blank>`
- Line 190: Blank line for readability. Code: `<blank>`
- Line 191: Define the normalize_chain_rows function. Code: `def normalize_chain_rows(rows):`
- Line 192: Execute the statement as written. Code: ` normalized = []`
- Line 193: Execute the statement as written. Code: ` for row in rows or []:`
- Line 194: Execute the statement as written. Code: ` normalized.append(`
- Line 195: Execute the statement as written. Code: ` {`
- Line 196: Execute the statement as written. Code: ` "Contract Name": row.get("contractSymbol"),`
- Line 197: Execute the statement as written. Code: ` "Last Trade Date (EST)": format_last_trade_date(`
- Line 198: Execute the statement as written. Code: ` row.get("lastTradeDate")`
- Line 199: Execute the statement as written. Code: ` ),`
- Line 200: Execute the statement as written. Code: ` "Strike": extract_raw_value(row.get("strike")),`
- Line 201: Execute the statement as written. Code: ` "Last Price": extract_raw_value(row.get("lastPrice")),`
- Line 202: Execute the statement as written. Code: ` "Bid": extract_raw_value(row.get("bid")),`
- Line 203: Execute the statement as written. Code: ` "Ask": extract_raw_value(row.get("ask")),`
- Line 204: Execute the statement as written. Code: ` "Change": extract_raw_value(row.get("change")),`
- Line 205: Execute the statement as written. Code: ` "% Change": format_percent_value(row.get("percentChange")),`
- Line 206: Execute the statement as written. Code: ` "Volume": extract_raw_value(row.get("volume")),`
- Line 207: Execute the statement as written. Code: ` "Open Interest": extract_raw_value(row.get("openInterest")),`
- Line 208: Execute the statement as written. Code: ` "Implied Volatility": format_percent_value(`
- Line 209: Execute the statement as written. Code: ` row.get("impliedVolatility")`
- Line 210: Execute the statement as written. Code: ` ),`
- Line 211: Execute the statement as written. Code: ` }`
- Line 212: Execute the statement as written. Code: ` )`
- Line 213: Execute the statement as written. Code: ` return normalized`
- Line 214: Blank line for readability. Code: `<blank>`
- Line 215: Blank line for readability. Code: `<blank>`
- Line 216: Define the build_rows_from_chain function. Code: `def build_rows_from_chain(chain):`
- Line 217: Execute the statement as written. Code: ` result = chain.get("result", []) if chain else []`
- Line 218: Execute the statement as written. Code: ` if not result:`
- Line 219: Execute the statement as written. Code: ` return [], []`
- Line 220: Execute the statement as written. Code: ` options = result[0].get("options", [])`
- Line 221: Execute the statement as written. Code: ` if not options:`
- Line 222: Execute the statement as written. Code: ` return [], []`
- Line 223: Execute the statement as written. Code: ` option = options[0]`
- Line 224: Execute the statement as written. Code: ` return (`
- Line 225: Execute the statement as written. Code: ` normalize_chain_rows(option.get("calls")),`
- Line 226: Execute the statement as written. Code: ` normalize_chain_rows(option.get("puts")),`
- Line 227: Execute the statement as written. Code: ` )`
- Line 228: Blank line for readability. Code: `<blank>`
- Line 229: Blank line for readability. Code: `<blank>`
- Line 230: Define the extract_contract_expiry_code function. Code: `def extract_contract_expiry_code(contract_name):`
- Line 231: Execute the statement as written. Code: ` if not contract_name:`
- Line 232: Execute the statement as written. Code: ` return None`
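- Line 233: Find the first run of six consecutive digits in the contract name, which is expected to be the YYMMDD expiry code. Code: ` match = re.search(r"(\d{6})", contract_name)`
- Line 234: Return the matched code, or None when the name has no six-digit run. Code: ` return match.group(1) if match else None`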
- Line 235: Blank line for readability. Code: `<blank>`
- Line 236: Blank line for readability. Code: `<blank>`
- Line 237: Define the expected_expiry_code function. Code: `def expected_expiry_code(timestamp):`
- Line 238: Execute the statement as written. Code: ` if not timestamp:`
- Line 239: Execute the statement as written. Code: ` return None`
- Line 240: Execute the statement as written. Code: ` try:`
- Line 241: Format the epoch timestamp as a UTC date in YYMMDD form. Code: ` return datetime.utcfromtimestamp(timestamp).strftime("%y%m%d")`
- Line 242: Execute the statement as written. Code: ` except Exception:`
- Line 243: Execute the statement as written. Code: ` return None`
- Line 244: Blank line for readability. Code: `<blank>`
- Line 245: Blank line for readability. Code: `<blank>`
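
The two helpers above underpin the YYMMDD guard rail: one pulls the code out of a contract symbol, the other derives the expected code from the expiration epoch. A minimal sketch follows. The function `contract_expiry_code` is a renamed stand-in, the sample symbol is illustrative, and `datetime.fromtimestamp(..., tz=timezone.utc)` replaces `utcfromtimestamp`, which is deprecated as of Python 3.12.

```python
import re
from datetime import datetime, timezone


def contract_expiry_code(contract_name):
    """Return the first six-digit run in the symbol, or None."""
    match = re.search(r"(\d{6})", contract_name or "")
    return match.group(1) if match else None


def expected_expiry_code(timestamp):
    """Render epoch seconds as a UTC date in YYMMDD form."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%y%m%d")


symbol = "MSFT251219C00500000"  # illustrative contract symbol
epoch = 1766102400              # 2025-12-19 00:00:00 UTC
print(contract_expiry_code(symbol), expected_expiry_code(epoch))
```

Because the regex takes the first six-digit run, the check relies on the ticker portion of the symbol not containing six consecutive digits of its own.
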
- Line 246: Define the extract_expiration_dates_from_html function. Code: `def extract_expiration_dates_from_html(html):`
- Line 247: Execute the statement as written. Code: ` if not html:`
- Line 248: Execute the statement as written. Code: ` return []`
- Line 249: Blank line for readability. Code: `<blank>`
- Line 250: Execute the statement as written. Code: ` patterns = (`
- Line 251: Execute the statement as written. Code: ` r'\\"expirationDates\\":\[(.*?)\]',`
- Line 252: Execute the statement as written. Code: ` r'"expirationDates":\[(.*?)\]',`
- Line 253: Execute the statement as written. Code: ` )`
- Line 254: Execute the statement as written. Code: ` match = None`
- Line 255: Execute the statement as written. Code: ` for pattern in patterns:`
- Line 256: Execute the statement as written. Code: ` match = re.search(pattern, html, re.DOTALL)`
- Line 257: Execute the statement as written. Code: ` if match:`
- Line 258: Execute the statement as written. Code: ` break`
- Line 259: Execute the statement as written. Code: ` if not match:`
- Line 260: Execute the statement as written. Code: ` return []`
- Line 261: Blank line for readability. Code: `<blank>`
- Line 262: Execute the statement as written. Code: ` raw = match.group(1)`
- Line 263: Execute the statement as written. Code: ` values = []`
- Line 264: Execute the statement as written. Code: ` for part in raw.split(","):`
- Line 265: Execute the statement as written. Code: ` part = part.strip()`
- Line 266: Execute the statement as written. Code: ` if part.isdigit():`
- Line 267: Execute the statement as written. Code: ` try:`
- Line 268: Execute the statement as written. Code: ` values.append(int(part))`
- Line 269: Execute the statement as written. Code: ` except Exception:`
- Line 270: Execute the statement as written. Code: ` continue`
- Line 271: Execute the statement as written. Code: ` return values`
- Line 272: Blank line for readability. Code: `<blank>`
- Line 273: Blank line for readability. Code: `<blank>`
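
The regex fallback above tries an escaped pattern first, for JSON embedded inside a string, and then a plain one. The condensed sketch below shows both cases; `expiration_dates_from_html` is a hypothetical name and the HTML fragments are invented for illustration.

```python
import re

PATTERNS = (
    r'\\"expirationDates\\":\[(.*?)\]',  # JSON embedded in a string (escaped quotes)
    r'"expirationDates":\[(.*?)\]',      # plain JSON
)


def expiration_dates_from_html(html):
    """Return the epoch values of the first expirationDates array found."""
    for pattern in PATTERNS:
        match = re.search(pattern, html or "", re.DOTALL)
        if match:
            parts = (part.strip() for part in match.group(1).split(","))
            # isdecimal() keeps only tokens that int() can parse.
            return [int(part) for part in parts if part.isdecimal()]
    return []


escaped = r'{\"expirationDates\":[1766102400, 1766707200],\"x\":1}'
plain = '{"expirationDates":[1766102400]}'
print(expiration_dates_from_html(escaped))
print(expiration_dates_from_html(plain))
```
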
- Line 274: Define the build_expiration_options function. Code: `def build_expiration_options(expiration_dates):`
- Line 275: Execute the statement as written. Code: ` options = []`
- Line 276: Execute the statement as written. Code: ` for value in expiration_dates or []:`
- Line 277: Execute the statement as written. Code: ` try:`
- Line 278: Execute the statement as written. Code: ` value_int = int(value)`
- Line 279: Execute the statement as written. Code: ` except Exception:`
- Line 280: Execute the statement as written. Code: ` continue`
- Line 281: Blank line for readability. Code: `<blank>`
- Line 282: Execute the statement as written. Code: ` label = format_expiration_label(value_int)`
- Line 283: Execute the statement as written. Code: ` try:`
- Line 284: Execute the statement as written. Code: ` date_value = datetime.utcfromtimestamp(value_int).date()`
- Line 285: Execute the statement as written. Code: ` except Exception:`
- Line 286: Execute the statement as written. Code: ` date_value = None`
- Line 287: Blank line for readability. Code: `<blank>`
- Line 288: Execute the statement as written. Code: ` options.append({"value": value_int, "label": label, "date": date_value})`
- Line 289: Execute the statement as written. Code: ` return sorted(options, key=lambda x: x["value"])`
- Line 290: Blank line for readability. Code: `<blank>`
- Line 291: Blank line for readability. Code: `<blank>`
- Line 292: Define the resolve_expiration function. Code: `def resolve_expiration(expiration, options):`
- Line 293: Execute the statement as written. Code: ` if not expiration:`
- Line 294: Execute the statement as written. Code: ` return None, None`
- Line 295: Blank line for readability. Code: `<blank>`
- Line 296: Execute the statement as written. Code: ` raw = expiration.strip()`
- Line 297: Execute the statement as written. Code: ` if not raw:`
- Line 298: Execute the statement as written. Code: ` return None, None`
- Line 299: Blank line for readability. Code: `<blank>`
- Line 300: Treat an all-digit input as epoch seconds. Code: ` if raw.isdigit():`
- Line 301: Convert the input to an integer. Code: ` value = int(raw)`
- Line 302: When known expirations are available, validate against them. Code: ` if options:`
- Line 303: Iterate over the known expirations. Code: ` for opt in options:`
- Line 304: Compare each known epoch with the requested one. Code: ` if opt.get("value") == value:`
- Line 305: Return the epoch with its label on a match. Code: ` return value, opt.get("label")`
- Line 306: Return no match when the requested epoch is not among the known expirations. Code: ` return None, None`
- Line 307: With no known expirations to check against, accept the epoch and derive its label. Code: ` return value, format_expiration_label(value)`
- Line 308: Blank line for readability. Code: `<blank>`
- Line 309: Execute the statement as written. Code: ` requested_date = parse_date(raw)`
- Line 310: Execute the statement as written. Code: ` if requested_date:`
- Line 311: Execute the statement as written. Code: ` for opt in options:`
- Line 312: Execute the statement as written. Code: ` if opt.get("date") == requested_date:`
- Line 313: Execute the statement as written. Code: ` return opt.get("value"), opt.get("label")`
- Line 314: Execute the statement as written. Code: ` return None, None`
- Line 315: Blank line for readability. Code: `<blank>`
- Line 316: Execute the statement as written. Code: ` normalized = normalize_label(raw)`
- Line 317: Execute the statement as written. Code: ` for opt in options:`
- Line 318: Execute the statement as written. Code: ` if normalize_label(opt.get("label", "")) == normalized:`
- Line 319: Execute the statement as written. Code: ` return opt.get("value"), opt.get("label")`
- Line 320: Blank line for readability. Code: `<blank>`
- Line 321: Execute the statement as written. Code: ` return None, None`
- Line 322: Blank line for readability. Code: `<blank>`
- Line 323: Blank line for readability. Code: `<blank>`
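
The resolution order above (epoch, then parsed date, then label) can be condensed into a short sketch. It is a simplification rather than the original: `build_options` and `resolve` are hypothetical names, `resolve` returns only the epoch instead of an (epoch, label) pair, and timezone-aware datetimes replace `utcfromtimestamp`.

```python
from datetime import datetime, timezone

DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d", "%b %d, %Y", "%B %d, %Y")


def parse_date(value):
    """Try each accepted format in turn; None when nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    return None


def build_options(epochs):
    """Build sorted {value, label, date} entries from epoch seconds (UTC)."""
    options = []
    for epoch in sorted(epochs):
        moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
        options.append({"value": epoch, "label": moment.strftime("%Y-%m-%d"), "date": moment.date()})
    return options


def resolve(expiration, options):
    """Resolve user input to a known expiration epoch, or None."""
    raw = (expiration or "").strip()
    if not raw:
        return None
    if raw.isdigit():  # all-digit input is treated as epoch seconds
        value = int(raw)
        if not options:
            return value
        return value if any(opt["value"] == value for opt in options) else None
    requested = parse_date(raw)
    if requested:
        return next((opt["value"] for opt in options if opt["date"] == requested), None)
    label = " ".join(raw.split()).lower()
    return next((opt["value"] for opt in options if opt["label"].lower() == label), None)


options = build_options([1766707200, 1766102400])
print(resolve("2025-12-19", options), resolve("Dec 26, 2025", options))
```

One consequence of the epoch-first ordering, visible in both the listing and this sketch: an all-digit date such as `20251219` takes the epoch branch, so the `%Y%m%d` format is never reached through this path.
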
- Line 324: Define the wait_for_tables function. Code: `def wait_for_tables(page):`
- Line 325: Execute the statement as written. Code: ` try:`
- Line 326: Execute the statement as written. Code: ` page.wait_for_selector(`
- Line 327: Execute the statement as written. Code: ` "section[data-testid='options-list-table'] table",`
- Line 328: Execute the statement as written. Code: ` timeout=30000,`
- Line 329: Execute the statement as written. Code: ` )`
- Line 330: Execute the statement as written. Code: ` except Exception:`
- Line 331: Execute the statement as written. Code: ` page.wait_for_selector("table", timeout=30000)`
- Line 332: Blank line for readability. Code: `<blank>`
- Line 333: Poll up to 30 times, sleeping one second between attempts (about 30 seconds in total). Code: ` for _ in range(30): # 30 * 1s = 30 seconds`
- Line 334: Execute the statement as written. Code: ` tables = page.query_selector_all(`
- Line 335: Execute the statement as written. Code: ` "section[data-testid='options-list-table'] table"`
- Line 336: Execute the statement as written. Code: ` )`
- Line 337: Execute the statement as written. Code: ` if len(tables) >= 2:`
- Line 338: Execute the statement as written. Code: ` return tables`
- Line 339: Execute the statement as written. Code: ` tables = page.query_selector_all("table")`
- Line 340: Execute the statement as written. Code: ` if len(tables) >= 2:`
- Line 341: Execute the statement as written. Code: ` return tables`
- Line 342: Execute the statement as written. Code: ` time.sleep(1)`
- Line 343: Execute the statement as written. Code: ` return []`
- Line 344: Blank line for readability. Code: `<blank>`
- Line 345: Blank line for readability. Code: `<blank>`
- Line 346: Define the parse_strike_limit function. Code: `def parse_strike_limit(value, default=25):`
- Line 347: Execute the statement as written. Code: ` if value is None:`
- Line 348: Execute the statement as written. Code: ` return default`
- Line 349: Execute the statement as written. Code: ` try:`
- Line 350: Execute the statement as written. Code: ` limit = int(value)`
- Line 351: Execute the statement as written. Code: ` except (TypeError, ValueError):`
- Line 352: Execute the statement as written. Code: ` return default`
- Line 353: Return the parsed limit, falling back to the default for zero or negative values. Code: ` return limit if limit > 0 else default`
- Line 354: Blank line for readability. Code: `<blank>`
- Line 355: Blank line for readability. Code: `<blank>`
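
A quick way to see how `parse_strike_limit` treats unusual input is to run it over a few values. The function body below is copied from the listing above; the sample inputs are illustrative.

```python
def parse_strike_limit(value, default=25):
    """Parse a positive integer limit; fall back to the default otherwise."""
    if value is None:
        return default
    try:
        limit = int(value)
    except (TypeError, ValueError):
        return default
    return limit if limit > 0 else default


for sample in ("10", None, "abc", "0", "-3", "7.5"):
    print(repr(sample), parse_strike_limit(sample))
```

Note that a decimal string such as `"7.5"` is not truncated: `int("7.5")` raises `ValueError`, so the default is returned.
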
- Line 356: Define the scrape_yahoo_options function. Code: `def scrape_yahoo_options(symbol, expiration=None, strike_limit=25):`
- Line 357: Define the parse_table function. Code: ` def parse_table(table_html, side):`
- Line 358: Check for missing table HTML. Code: ` if not table_html:`
- Line 359: Log a warning naming the side and symbol. Code: ` app.logger.warning("No %s table HTML for %s", side, symbol)`
- Line 360: Return an empty list. Code: ` return []`
- Line 361: Blank line for readability. Code: `<blank>`
- Line 362: Parse the table HTML with BeautifulSoup. Code: ` soup = BeautifulSoup(table_html, "html.parser")`
- Line 363: Blank line for readability. Code: `<blank>`
- Line 364: Collect the text of each `thead th` header cell. Code: ` headers = [th.get_text(strip=True) for th in soup.select("thead th")]`
- Line 365: Select every body row. Code: ` rows = soup.select("tbody tr")`
- Line 366: Blank line for readability. Code: `<blank>`
- Line 367: Start the list of parsed rows. Code: ` parsed = []`
- Line 368: Loop over the body rows. Code: ` for r in rows:`
- Line 369: Collect the row's `td` cells. Code: ` tds = r.find_all("td")`
- Line 370: Check whether the cell count differs from the header count. Code: ` if len(tds) != len(headers):`
- Line 371: Skip the malformed row. Code: ` continue`
- Line 372: Blank line for readability. Code: `<blank>`
- Line 373: Start an empty dict for the row. Code: ` item = {}`
- Line 374: Loop over the cells with their index. Code: ` for i, c in enumerate(tds):`
- Line 375: Look up the header for this column. Code: ` key = headers[i]`
- Line 376: Read the cell text, joining fragments with spaces. Code: ` val = c.get_text(" ", strip=True)`
- Line 377: Blank line for readability. Code: `<blank>`
- Line 378: Comment describing the next block. Code: ` # Convert numeric fields`
- Line 379: Check whether the column holds a float value (strike or price). Code: ` if key in ["Strike", "Last Price", "Bid", "Ask", "Change"]:`
- Line 380: Start a guarded float conversion. Code: ` try:`
- Line 381: Strip thousands separators and convert to float. Code: ` val = float(val.replace(",", ""))`
- Line 382: Catch any conversion failure. Code: ` except Exception:`
- Line 383: Store `None` for an unparseable value. Code: ` val = None`
- Line 384: Otherwise check whether the column holds an integer count. Code: ` elif key in ["Volume", "Open Interest"]:`
- Line 385: Start a guarded integer conversion. Code: ` try:`
- Line 386: Strip thousands separators and convert to int. Code: ` val = int(val.replace(",", ""))`
- Line 387: Catch any conversion failure. Code: ` except Exception:`
- Line 388: Store `None` for an unparseable value. Code: ` val = None`
- Line 389: Otherwise check for a dash or empty placeholder. Code: ` elif val in ["-", ""]:`
- Line 390: Store `None` for a placeholder value. Code: ` val = None`
- Line 391: Blank line for readability. Code: `<blank>`
- Line 392: Save the value under its header key. Code: ` item[key] = val`
- Line 393: Blank line for readability. Code: `<blank>`
- Line 394: Append the finished row. Code: ` parsed.append(item)`
- Line 395: Blank line for readability. Code: `<blank>`
- Line 396: Log the number of rows parsed for this side. Code: ` app.logger.info("Parsed %d %s rows", len(parsed), side)`
- Line 397: Return the parsed rows. Code: ` return parsed`
- Line 398: Blank line for readability. Code: `<blank>`
- Line 399: Define the read_option_chain function. Code: ` def read_option_chain(page):`
- Line 400: Read the rendered page HTML. Code: ` html = page.content()`
- Line 401: Extract the embedded option chain from the HTML. Code: ` option_chain = extract_option_chain_from_html(html)`
- Line 402: Check whether a chain was found. Code: ` if option_chain:`
- Line 403: Read the expiration dates from the chain. Code: ` expiration_dates = extract_expiration_dates_from_chain(option_chain)`
- Line 404: Otherwise handle the missing chain. Code: ` else:`
- Line 405: Read the expiration dates from the raw HTML instead. Code: ` expiration_dates = extract_expiration_dates_from_html(html)`
- Line 406: Return the chain together with the expiration dates. Code: ` return option_chain, expiration_dates`
- Line 407: Blank line for readability. Code: `<blank>`
- Line 408: Define the has_expected_expiry function. Code: ` def has_expected_expiry(options, expected_code):`
- Line 409: Check for a missing expected code. Code: ` if not expected_code:`
- Line 410: Return `False` because nothing can match. Code: ` return False`
- Line 411: Loop over the option rows, treating `None` as an empty list. Code: ` for row in options or []:`
- Line 412: Read the row's contract name. Code: ` name = row.get("Contract Name")`
- Line 413: Compare the expiry code embedded in the contract name with the expected code. Code: ` if extract_contract_expiry_code(name) == expected_code:`
- Line 414: Return `True` on the first match. Code: ` return True`
- Line 415: Return `False` when no row matches. Code: ` return False`
- Line 416: Blank line for readability. Code: `<blank>`
- Line 417: URL-encode the symbol, escaping every reserved character. Code: ` encoded = urllib.parse.quote(symbol, safe="")`
- Line 418: Build the base Yahoo options URL for the symbol. Code: ` base_url = f"https://finance.yahoo.com/quote/{encoded}/options/"`
- Line 419: Strip whitespace from the requested expiration, if one was given. Code: ` requested_expiration = expiration.strip() if expiration else None`
- Line 420: Check whether the stripped value is empty. Code: ` if not requested_expiration:`
- Line 421: Normalize an empty expiration to `None`. Code: ` requested_expiration = None`
- Line 422: Start with the base URL. Code: ` url = base_url`
- Line 423: Blank line for readability. Code: `<blank>`
- Line 424: Begin the start-of-scrape log call. Code: ` app.logger.info(`
- Line 425: Provide the log format string. Code: ` "Starting scrape for symbol=%s expiration=%s url=%s",`
- Line 426: Pass the symbol. Code: ` symbol,`
- Line 427: Pass the requested expiration. Code: ` requested_expiration,`
- Line 428: Pass the base URL. Code: ` base_url,`
- Line 429: Close the log call. Code: ` )`
- Line 430: Blank line for readability. Code: `<blank>`
- Line 431: Initialize the calls table HTML. Code: ` calls_html = None`
- Line 432: Initialize the puts table HTML. Code: ` puts_html = None`
- Line 433: Initialize the full list of call rows. Code: ` calls_full = []`
- Line 434: Initialize the full list of put rows. Code: ` puts_full = []`
- Line 435: Initialize the current price. Code: ` price = None`
- Line 436: Initialize the selected expiration epoch. Code: ` selected_expiration_value = None`
- Line 437: Initialize the selected expiration label. Code: ` selected_expiration_label = None`
- Line 438: Initialize the list of available expirations. Code: ` expiration_options = []`
- Line 439: Initialize the target expiration epoch. Code: ` target_date = None`
- Line 440: Initialize the flag that defers expiration resolution to the base page. Code: ` fallback_to_base = False`
- Line 441: Blank line for readability. Code: `<blank>`
- Line 442: Start Playwright in a context manager. Code: ` with sync_playwright() as p:`
- Line 443: Get the Chromium launch arguments. Code: ` launch_args = chromium_launch_args()`
- Line 444: Check whether any launch arguments were returned. Code: ` if launch_args:`
- Line 445: Log that GPU acceleration is enabled. Code: ` app.logger.info("GPU acceleration enabled")`
- Line 446: Otherwise handle the empty argument list. Code: ` else:`
- Line 447: Log that GPU acceleration is disabled. Code: ` app.logger.info("GPU acceleration disabled")`
- Line 448: Launch headless Chromium with the launch arguments. Code: ` browser = p.chromium.launch(headless=True, args=launch_args)`
- Line 449: Open a new page. Code: ` page = browser.new_page()`
- Line 450: Begin setting extra HTTP headers. Code: ` page.set_extra_http_headers(`
- Line 451: Open the headers dict. Code: ` {`
- Line 452: Begin the `User-Agent` value. Code: ` "User-Agent": (`
- Line 453: First part of the desktop Chrome user-agent string. Code: ` "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "`
- Line 454: Second part of the user-agent string. Code: ` "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36"`
- Line 455: Close the `User-Agent` value. Code: ` )`
- Line 456: Close the headers dict. Code: ` }`
- Line 457: Close the headers call. Code: ` )`
- Line 458: Set the default page timeout to 60 seconds. Code: ` page.set_default_timeout(60000)`
- Line 459: Blank line for readability. Code: `<blank>`
- Line 460: Start the block whose `finally` clause closes the browser. Code: ` try:`
- Line 461: Check whether an expiration was requested. Code: ` if requested_expiration:`
- Line 462: Check whether the request is all digits (epoch seconds). Code: ` if requested_expiration.isdigit():`
- Line 463: Convert the epoch string to an integer. Code: ` target_date = int(requested_expiration)`
- Line 464: Record the epoch as the selected expiration value. Code: ` selected_expiration_value = target_date`
- Line 465: Build the display label for the epoch. Code: ` selected_expiration_label = format_expiration_label(target_date)`
- Line 466: Otherwise treat the request as a date string. Code: ` else:`
- Line 467: Parse the date string. Code: ` parsed_date = parse_date(requested_expiration)`
- Line 468: Check whether parsing succeeded. Code: ` if parsed_date:`
- Line 469: Begin converting the date to epoch seconds. Code: ` target_date = int(`
- Line 470: Begin building a midnight datetime for that date. Code: ` datetime(`
- Line 471: Pass the year. Code: ` parsed_date.year,`
- Line 472: Pass the month. Code: ` parsed_date.month,`
- Line 473: Pass the day. Code: ` parsed_date.day,`
- Line 474: Pin the datetime to UTC. Code: ` tzinfo=timezone.utc,`
- Line 475: Convert the datetime to a POSIX timestamp. Code: ` ).timestamp()`
- Line 476: Close the integer conversion. Code: ` )`
- Line 477: Record the epoch as the selected expiration value. Code: ` selected_expiration_value = target_date`
- Line 478: Build the display label for the epoch. Code: ` selected_expiration_label = format_expiration_label(target_date)`
- Line 479: Otherwise handle an unparseable date string. Code: ` else:`
- Line 480: Flag that the expiration must be resolved from the base page. Code: ` fallback_to_base = True`
- Line 481: Blank line for readability. Code: `<blank>`
- Line 482: Check whether a target epoch is known. Code: ` if target_date:`
- Line 483: Append the `date` query parameter to the URL. Code: ` url = f"{base_url}?date={target_date}"`
- Line 484: Blank line for readability. Code: `<blank>`
- Line 485: Navigate to the URL and wait for `domcontentloaded`, for up to 60 seconds. Code: ` page.goto(url, wait_until="domcontentloaded", timeout=60000)`
- Line 486: Log that the page loaded. Code: ` app.logger.info("Page loaded (domcontentloaded) for %s", symbol)`
- Line 487: Blank line for readability. Code: `<blank>`
- Line 488: Read the option chain and expiration dates from the page. Code: ` option_chain, expiration_dates = read_option_chain(page)`
- Line 489: Log whether an option chain was found. Code: ` app.logger.info("Option chain found: %s", bool(option_chain))`
- Line 490: Build the list of expiration options. Code: ` expiration_options = build_expiration_options(expiration_dates)`
- Line 491: Blank line for readability. Code: `<blank>`
- Line 492: Check whether the expiration still needs resolving. Code: ` if fallback_to_base:`
- Line 493: Begin resolving the requested expiration against the available options. Code: ` resolved_value, resolved_label = resolve_expiration(`
- Line 494: Pass the requested string and the available options. Code: ` requested_expiration, expiration_options`
- Line 495: Close the call. Code: ` )`
- Line 496: Check whether resolution failed. Code: ` if resolved_value is None:`
- Line 497: Begin the error response. Code: ` return {`
- Line 498: Set the error message. Code: ` "error": "Requested expiration not available",`
- Line 499: Include the symbol. Code: ` "stock": symbol,`
- Line 500: Include the requested expiration. Code: ` "requested_expiration": requested_expiration,`
- Line 501: Begin the list of available expirations. Code: ` "available_expirations": [`
- Line 502: Emit each option's label and value. Code: ` {"label": opt.get("label"), "value": opt.get("value")}`
- Line 503: Iterate over the expiration options. Code: ` for opt in expiration_options`
- Line 504: Close the list. Code: ` ],`
- Line 505: Close the error response. Code: ` }`
- Line 506: Blank line for readability. Code: `<blank>`
- Line 507: Use the resolved epoch as the target date. Code: ` target_date = resolved_value`
- Line 508: Record the resolved epoch as the selected value. Code: ` selected_expiration_value = resolved_value`
- Line 509: Use the resolved label, or begin formatting one from the epoch. Code: ` selected_expiration_label = resolved_label or format_expiration_label(`
- Line 510: Pass the resolved epoch to the formatter. Code: ` resolved_value`
- Line 511: Close the formatter call. Code: ` )`
- Line 512: Rebuild the URL with the resolved `date` parameter. Code: ` url = f"{base_url}?date={resolved_value}"`
- Line 513: Navigate to the expiration-specific URL. Code: ` page.goto(url, wait_until="domcontentloaded", timeout=60000)`
- Line 514: Log that the page loaded. Code: ` app.logger.info("Page loaded (domcontentloaded) for %s", symbol)`
- Line 515: Blank line for readability. Code: `<blank>`
- Line 516: Re-read the option chain and expiration dates. Code: ` option_chain, expiration_dates = read_option_chain(page)`
- Line 517: Rebuild the expiration options. Code: ` expiration_options = build_expiration_options(expiration_dates)`
- Line 518: Blank line for readability. Code: `<blank>`
- Line 519: Check whether a target epoch and expiration options are both present. Code: ` if target_date and expiration_options:`
- Line 520: Initialize the match. Code: ` matched = None`
- Line 521: Loop over the expiration options. Code: ` for opt in expiration_options:`
- Line 522: Compare the option's epoch with the target. Code: ` if opt.get("value") == target_date:`
- Line 523: Remember the matching option. Code: ` matched = opt`
- Line 524: Stop searching. Code: ` break`
- Line 525: Check whether no option matched. Code: ` if not matched:`
- Line 526: Begin the error response. Code: ` return {`
- Line 527: Set the error message. Code: ` "error": "Requested expiration not available",`
- Line 528: Include the symbol. Code: ` "stock": symbol,`
- Line 529: Include the requested expiration. Code: ` "requested_expiration": requested_expiration,`
- Line 530: Begin the list of available expirations. Code: ` "available_expirations": [`
- Line 531: Emit each option's label and value. Code: ` {"label": opt.get("label"), "value": opt.get("value")}`
- Line 532: Iterate over the expiration options. Code: ` for opt in expiration_options`
- Line 533: Close the list. Code: ` ],`
- Line 534: Close the error response. Code: ` }`
- Line 535: Record the matched epoch. Code: ` selected_expiration_value = matched.get("value")`
- Line 536: Record the matched label. Code: ` selected_expiration_label = matched.get("label")`
- Line 537: Otherwise handle the case where options exist but no expiration was targeted. Code: ` elif expiration_options and not target_date:`
- Line 538: Default to the first option's epoch. Code: ` selected_expiration_value = expiration_options[0].get("value")`
- Line 539: Default to the first option's label. Code: ` selected_expiration_label = expiration_options[0].get("label")`
- Line 540: Blank line for readability. Code: `<blank>`
- Line 541: Build call and put rows from the option chain. Code: ` calls_full, puts_full = build_rows_from_chain(option_chain)`
- Line 542: Begin the row-count log call. Code: ` app.logger.info(`
- Line 543: Provide the log format string. Code: ` "Option chain rows: calls=%d puts=%d",`
- Line 544: Pass the number of call rows. Code: ` len(calls_full),`
- Line 545: Pass the number of put rows. Code: ` len(puts_full),`
- Line 546: Close the log call. Code: ` )`
- Line 547: Blank line for readability. Code: `<blank>`
- Line 548: Check whether the chain produced no rows, which triggers the HTML table fallback. Code: ` if not calls_full and not puts_full:`
- Line 549: Log that the scraper is waiting for the options tables. Code: ` app.logger.info("Waiting for options tables...")`
- Line 550: Blank line for readability. Code: `<blank>`
- Line 551: Wait for the calls and puts tables to render. Code: ` tables = wait_for_tables(page)`
- Line 552: Check whether fewer than two tables were found. Code: ` if len(tables) < 2:`
- Line 553: Begin the error log call. Code: ` app.logger.error(`
- Line 554: Provide the log format string. Code: ` "Only %d tables found; expected 2. HTML may have changed.",`
- Line 555: Pass the table count. Code: ` len(tables),`
- Line 556: Close the log call. Code: ` )`
- Line 557: Return an error response. Code: ` return {"error": "Could not locate options tables", "stock": symbol}`
- Line 558: Blank line for readability. Code: `<blank>`
- Line 559: Log the number of tables found. Code: ` app.logger.info("Found %d tables. Extracting Calls & Puts.", len(tables))`
- Line 560: Blank line for readability. Code: `<blank>`
- Line 561: Capture the outer HTML of the first table (calls). Code: ` calls_html = tables[0].evaluate("el => el.outerHTML")`
- Line 562: Capture the outer HTML of the second table (puts). Code: ` puts_html = tables[1].evaluate("el => el.outerHTML")`
- Line 563: Blank line for readability. Code: `<blank>`
- Line 564: Comment describing the next block. Code: ` # --- Extract current price ---`
- Line 565: Start the guarded price lookup. Code: ` try:`
- Line 566: Comment describing the next block. Code: ` # Primary selector`
- Line 567: Begin locating the primary price element. Code: ` price_text = page.locator(`
- Line 568: Select the `fin-streamer` element carrying `regularMarketPrice`. Code: ` "fin-streamer[data-field='regularMarketPrice']"`
- Line 569: Read the element's inner text. Code: ` ).inner_text()`
- Line 570: Strip thousands separators and convert to float. Code: ` price = float(price_text.replace(",", ""))`
- Line 571: Catch any failure of the primary selector. Code: ` except Exception:`
- Line 572: Start the guarded fallback lookup. Code: ` try:`
- Line 573: Comment describing the next block. Code: ` # Fallback`
- Line 574: Read the inner text of the `qsp-price` span. Code: ` price_text = page.locator("span[data-testid='qsp-price']").inner_text()`
- Line 575: Strip thousands separators and convert to float. Code: ` price = float(price_text.replace(",", ""))`
- Line 576: Catch any failure of the fallback selector. Code: ` except Exception as e:`
- Line 577: Log a warning; the price stays `None`. Code: ` app.logger.warning("Failed to extract price for %s: %s", symbol, e)`
- Line 578: Blank line for readability. Code: `<blank>`
- Line 579: Log the current price. Code: ` app.logger.info("Current price for %s = %s", symbol, price)`
- Line 580: Begin the cleanup clause that always runs. Code: ` finally:`
- Line 581: Close the browser. Code: ` browser.close()`
- Line 582: Blank line for readability. Code: `<blank>`
- Line 583: Check whether the chain yielded no rows but both table HTML strings were captured. Code: ` if not calls_full and not puts_full and calls_html and puts_html:`
- Line 584: Parse the calls table. Code: ` calls_full = parse_table(calls_html, "calls")`
- Line 585: Parse the puts table. Code: ` puts_full = parse_table(puts_html, "puts")`
- Line 586: Blank line for readability. Code: `<blank>`
- Line 587: Derive the expected YYMMDD code from the target epoch. Code: ` expected_code = expected_expiry_code(target_date)`
- Line 588: Check whether a code is available. Code: ` if expected_code:`
- Line 589: Check whether neither calls nor puts contain the expected code (condition continues on the next line). Code: ` if not has_expected_expiry(calls_full, expected_code) and not has_expected_expiry(`
- Line 590: Pass the put rows and the expected code. Code: ` puts_full, expected_code`
- Line 591: Close the condition. Code: ` ):`
- Line 592: Begin the mismatch error response. Code: ` return {`
- Line 593: Set the error message. Code: ` "error": "Options chain does not match requested expiration",`
- Line 594: Include the symbol. Code: ` "stock": symbol,`
- Line 595: Include the requested expiration. Code: ` "requested_expiration": requested_expiration,`
- Line 596: Include the expected YYMMDD code. Code: ` "expected_expiration_code": expected_code,`
- Line 597: Begin the selected expiration object. Code: ` "selected_expiration": {`
- Line 598: Include the selected epoch. Code: ` "value": selected_expiration_value,`
- Line 599: Include the selected label. Code: ` "label": selected_expiration_label,`
- Line 600: Close the selected expiration object. Code: ` },`
- Line 601: Close the error response. Code: ` }`
- Line 602: Blank line for readability. Code: `<blank>`
- Line 603: Comment describing the next block. Code: ` # ----------------------------------------------------------------------`
- Line 604: Comment describing the next block. Code: ` # Pruning logic`
- Line 605: Comment describing the next block. Code: ` # ----------------------------------------------------------------------`
- Line 606: Define the prune_nearest function. Code: ` def prune_nearest(options, price_value, limit=25, side=""):`
- Line 607: Check for a missing price. Code: ` if price_value is None:`
- Line 608: Return the rows unpruned with a zero count. Code: ` return options, 0`
- Line 609: Blank line for readability. Code: `<blank>`
- Line 610: Keep only rows with a numeric strike. Code: ` numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]`
- Line 611: Blank line for readability. Code: `<blank>`
- Line 612: Check whether the rows already fit within the limit. Code: ` if len(numeric) <= limit:`
- Line 613: Return the numeric rows with a zero count. Code: ` return numeric, 0`
- Line 614: Blank line for readability. Code: `<blank>`
- Line 615: Sort the rows by the distance between strike and price. Code: ` sorted_opts = sorted(numeric, key=lambda x: abs(x["Strike"] - price_value))`
- Line 616: Keep the nearest `limit` rows. Code: ` pruned = sorted_opts[:limit]`
- Line 617: Count the removed rows relative to the original list. Code: ` pruned_count = len(options) - len(pruned)`
- Line 618: Return the kept rows and the removed count. Code: ` return pruned, pruned_count`
- Line 619: Blank line for readability. Code: `<blank>`
- Line 620: Begin pruning the calls. Code: ` calls, pruned_calls = prune_nearest(`
- Line 621: Pass the full calls list. Code: ` calls_full,`
- Line 622: Pass the current price. Code: ` price,`
- Line 623: Pass the strike limit. Code: ` limit=strike_limit,`
- Line 624: Label the side as calls. Code: ` side="calls",`
- Line 625: Close the call. Code: ` )`
- Line 626: Begin pruning the puts. Code: ` puts, pruned_puts = prune_nearest(`
- Line 627: Pass the full puts list. Code: ` puts_full,`
- Line 628: Pass the current price. Code: ` price,`
- Line 629: Pass the strike limit. Code: ` limit=strike_limit,`
- Line 630: Label the side as puts. Code: ` side="puts",`
- Line 631: Close the call. Code: ` )`
- Line 632: Blank line for readability. Code: `<blank>`
- Line 633: Define the strike_range function. Code: ` def strike_range(opts):`
- Line 634: Collect the numeric strikes. Code: ` strikes = [o["Strike"] for o in opts if isinstance(o.get("Strike"), (int, float))]`
- Line 635: Return `[min, max]`, or `[None, None]` when there are no strikes. Code: ` return [min(strikes), max(strikes)] if strikes else [None, None]`
- Line 636: Blank line for readability. Code: `<blank>`
- Line 637: Begin the success response. Code: ` return {`
- Line 638: Include the symbol. Code: ` "stock": symbol,`
- Line 639: Include the URL that was scraped. Code: ` "url": url,`
- Line 640: Include the requested expiration. Code: ` "requested_expiration": requested_expiration,`
- Line 641: Begin the selected expiration object. Code: ` "selected_expiration": {`
- Line 642: Include the selected epoch. Code: ` "value": selected_expiration_value,`
- Line 643: Include the selected label. Code: ` "label": selected_expiration_label,`
- Line 644: Close the selected expiration object. Code: ` },`
- Line 645: Include the current price. Code: ` "current_price": price,`
- Line 646: Include the pruned calls. Code: ` "calls": calls,`
- Line 647: Include the pruned puts. Code: ` "puts": puts,`
- Line 648: Include the min and max strike of the returned calls. Code: ` "calls_strike_range": strike_range(calls),`
- Line 649: Include the min and max strike of the returned puts. Code: ` "puts_strike_range": strike_range(puts),`
- Line 650: Include the number of returned calls. Code: ` "total_calls": len(calls),`
- Line 651: Include the number of returned puts. Code: ` "total_puts": len(puts),`
- Line 652: Include the number of calls removed by pruning. Code: ` "pruned_calls_count": pruned_calls,`
- Line 653: Include the number of puts removed by pruning. Code: ` "pruned_puts_count": pruned_puts,`
- Line 654: Close the success response. Code: ` }`
- Line 655: Blank line for readability. Code: `<blank>`
- Line 656: Blank line for readability. Code: `<blank>`
- Line 657: Attach a decorator to the next function. Code: `@app.route("/scrape_sync")`
- Line 658: Define the scrape_sync function. Code: `def scrape_sync():`
- Line 659: Read the `stock` query parameter, defaulting to `MSFT`. Code: ` symbol = request.args.get("stock", "MSFT")`
- Line 660: Begin reading the expiration from its aliases. Code: ` expiration = (`
- Line 661: Try `expiration` first. Code: ` request.args.get("expiration")`
- Line 662: Then try `expiry`. Code: ` or request.args.get("expiry")`
- Line 663: Then try `date`. Code: ` or request.args.get("date")`
- Line 664: Close the expression. Code: ` )`
- Line 665: Parse `strikeLimit`, defaulting to 25. Code: ` strike_limit = parse_strike_limit(request.args.get("strikeLimit"), default=25)`
- Line 666: Begin the request log call. Code: ` app.logger.info(`
- Line 667: Provide the log format string. Code: ` "Received /scrape_sync request for symbol=%s expiration=%s strike_limit=%s",`
- Line 668: Pass the symbol. Code: ` symbol,`
- Line 669: Pass the expiration. Code: ` expiration,`
- Line 670: Pass the strike limit. Code: ` strike_limit,`
- Line 671: Close the log call. Code: ` )`
- Line 672: Run the scrape and return the result as JSON. Code: ` return jsonify(scrape_yahoo_options(symbol, expiration, strike_limit))`
- Line 673: Blank line for readability. Code: `<blank>`
- Line 674: Blank line for readability. Code: `<blank>`
- Line 675: Run the Flask development server when executed as a script. Code: `if __name__ == "__main__":`
- Line 676: Start the server on all interfaces, port 9777. Code: ` app.run(host="0.0.0.0", port=9777)`
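
To illustrate the pruning step walked through above (lines 606-618), here is a minimal standalone sketch. It mirrors the `prune_nearest` logic from the listing but drops the unused `side` argument; the sample rows and the price are made-up values for illustration, not real Yahoo data.

```python
def prune_nearest(options, price_value, limit=25):
    """Keep the `limit` rows whose strike is closest to `price_value`."""
    if price_value is None:
        # Without a price there is nothing to rank against.
        return options, 0
    numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]
    if len(numeric) <= limit:
        return numeric, 0
    ranked = sorted(numeric, key=lambda o: abs(o["Strike"] - price_value))
    kept = ranked[:limit]
    # The count is relative to the original list, as in the listing.
    return kept, len(options) - len(kept)


# Hypothetical sample data: five strikes around a price of 101.
rows = [{"Strike": s} for s in (90.0, 95.0, 100.0, 105.0, 110.0)]
kept, pruned_count = prune_nearest(rows, price_value=101.0, limit=3)
print(sorted(r["Strike"] for r in kept), pruned_count)
# prints: [95.0, 100.0, 105.0] 2
```

Because the removed count is computed against the original list, any rows with a non-numeric strike are included in it whenever pruning actually happens.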

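The expiration handling above (lines 469-476 and 587) depends on two conversions: a calendar date to UTC-midnight epoch seconds, and an epoch back to the YYMMDD code found in contract symbols. The sketch below shows both under stated assumptions: `date_to_epoch` and `epoch_to_code` are hypothetical helper names, and the code format is assumed to match what `scripts/test_cycles.py` checks rather than the service's own `expected_expiry_code`, which is not shown here.

```python
from datetime import datetime, timezone


def date_to_epoch(year, month, day):
    """Return epoch seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def epoch_to_code(epoch):
    """Return the YYMMDD code for an epoch, interpreted in UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%y%m%d")


epoch = date_to_epoch(2026, 1, 16)  # arbitrary example date
print(epoch, epoch_to_code(epoch))
# prints: 1768521600 260116
```

Pinning both directions to UTC keeps the round trip stable regardless of the host's local timezone.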
13
Dockerfile Normal file

@@ -0,0 +1,13 @@
FROM mcr.microsoft.com/playwright/python:v1.57.0-jammy
WORKDIR /app
ENV PYTHONUNBUFFERED=1
COPY scraper_service.py /app/scraper_service.py
RUN python -m pip install --no-cache-dir flask beautifulsoup4 playwright==1.57.0
EXPOSE 9777
CMD ["python", "scraper_service.py"]

File diff suppressed because it is too large

199
scripts/test_cycles.py Normal file

@@ -0,0 +1,199 @@
import argparse
import datetime
import json
import sys
import time
import urllib.parse
import urllib.request

DEFAULT_STOCKS = ["AAPL", "AMZN", "MSFT", "TSLA"]
DEFAULT_CYCLES = [None, 5, 10, 25, 50, 75, 100, 150, 200, 500]


def http_get(base_url, params, timeout):
    query = urllib.parse.urlencode(params)
    url = f"{base_url}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def expected_code_from_epoch(epoch):
    return datetime.datetime.utcfromtimestamp(epoch).strftime("%y%m%d")


def all_contracts_match(opts, expected_code):
    for opt in opts:
        name = opt.get("Contract Name") or ""
        if expected_code not in name:
            return False
    return True


def parse_list(value, default):
    if not value:
        return default
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_cycles(value):
    if not value:
        return DEFAULT_CYCLES
    cycles = []
    for item in value.split(","):
        token = item.strip().lower()
        if not token or token in ("default", "none"):
            cycles.append(None)
            continue
        try:
            cycles.append(int(token))
        except ValueError:
            raise ValueError(f"Invalid strikeLimit value: {item}")
    return cycles


def main():
    parser = argparse.ArgumentParser(description="Yahoo options scraper test cycles")
    parser.add_argument(
        "--base-url",
        default="http://127.0.0.1:9777/scrape_sync",
        help="Base URL for /scrape_sync",
    )
    parser.add_argument(
        "--stocks",
        default=",".join(DEFAULT_STOCKS),
        help="Comma-separated stock symbols",
    )
    parser.add_argument(
        "--strike-limits",
        default="default,5,10,25,50,75,100,150,200,500",
        help="Comma-separated strike limits (use 'default' for the API default)",
    )
    parser.add_argument(
        "--baseline-limit",
        type=int,
        default=5000,
        help="Large strikeLimit used to capture all available strikes",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=180,
        help="Request timeout in seconds",
    )
    parser.add_argument(
        "--sleep",
        type=float,
        default=0.2,
        help="Sleep between requests",
    )
    args = parser.parse_args()

    stocks = parse_list(args.stocks, DEFAULT_STOCKS)
    cycles = parse_cycles(args.strike_limits)

    print("Fetching expiration lists...")
    expirations = {}
    for stock in stocks:
        data = http_get(args.base_url, {"stock": stock, "expiration": "invalid"}, args.timeout)
        if "available_expirations" not in data:
            print(f"ERROR: missing available_expirations for {stock}: {data}")
            sys.exit(1)
        values = [opt.get("value") for opt in data["available_expirations"] if opt.get("value")]
        if len(values) < 4:
            print(f"ERROR: not enough expirations for {stock}: {values}")
            sys.exit(1)
        expirations[stock] = values[:4]
        print(f" {stock}: {expirations[stock]}")
        time.sleep(args.sleep)

    print("\nBuilding baseline counts (strikeLimit=%d)..." % args.baseline_limit)
    baseline_counts = {}
    for stock, exp_list in expirations.items():
        for exp in exp_list:
            data = http_get(
                args.base_url,
                {"stock": stock, "expiration": exp, "strikeLimit": args.baseline_limit},
                args.timeout,
            )
            if "error" in data:
                print(f"ERROR: baseline error for {stock} {exp}: {data}")
                sys.exit(1)
            calls_count = data.get("total_calls")
            puts_count = data.get("total_puts")
            if calls_count is None or puts_count is None:
                print(f"ERROR: baseline missing counts for {stock} {exp}: {data}")
                sys.exit(1)
            expected_code = expected_code_from_epoch(exp)
            if not all_contracts_match(data.get("calls", []), expected_code):
                print(f"ERROR: baseline calls mismatch for {stock} {exp}")
                sys.exit(1)
            if not all_contracts_match(data.get("puts", []), expected_code):
                print(f"ERROR: baseline puts mismatch for {stock} {exp}")
                sys.exit(1)
            baseline_counts[(stock, exp)] = (calls_count, puts_count)
            print(f" {stock} {exp}: calls={calls_count} puts={puts_count}")
            time.sleep(args.sleep)

    print("\nRunning %d cycles of API tests..." % len(cycles))
    for idx, strike_limit in enumerate(cycles, start=1):
        print(f"Cycle {idx}/{len(cycles)} (strikeLimit={strike_limit})")
        for stock, exp_list in expirations.items():
            for exp in exp_list:
                params = {"stock": stock, "expiration": exp}
                if strike_limit is not None:
                    params["strikeLimit"] = strike_limit
                data = http_get(args.base_url, params, args.timeout)
                if "error" in data:
                    print(f"ERROR: {stock} {exp} -> {data}")
                    sys.exit(1)
                selected_val = data.get("selected_expiration", {}).get("value")
                if selected_val != exp:
                    print(
                        f"ERROR: selected expiration mismatch for {stock} {exp}: {selected_val}"
                    )
                    sys.exit(1)
                expected_code = expected_code_from_epoch(exp)
                if not all_contracts_match(data.get("calls", []), expected_code):
                    print(f"ERROR: calls expiry mismatch for {stock} {exp}")
                    sys.exit(1)
                if not all_contracts_match(data.get("puts", []), expected_code):
                    print(f"ERROR: puts expiry mismatch for {stock} {exp}")
                    sys.exit(1)
                available_calls, available_puts = baseline_counts[(stock, exp)]
                expected_limit = strike_limit if strike_limit is not None else 25
                expected_calls = min(expected_limit, available_calls)
                expected_puts = min(expected_limit, available_puts)
                if data.get("total_calls") != expected_calls:
                    print(
                        f"ERROR: call count mismatch for {stock} {exp}: "
                        f"got {data.get('total_calls')} expected {expected_calls}"
                    )
                    sys.exit(1)
                if data.get("total_puts") != expected_puts:
                    print(
                        f"ERROR: put count mismatch for {stock} {exp}: "
                        f"got {data.get('total_puts')} expected {expected_puts}"
                    )
                    sys.exit(1)
                expected_pruned_calls = max(0, available_calls - expected_calls)
                expected_pruned_puts = max(0, available_puts - expected_puts)
                if data.get("pruned_calls_count") != expected_pruned_calls:
                    print(
                        f"ERROR: pruned calls mismatch for {stock} {exp}: "
                        f"got {data.get('pruned_calls_count')} expected {expected_pruned_calls}"
                    )
                    sys.exit(1)
                if data.get("pruned_puts_count") != expected_pruned_puts:
                    print(
                        f"ERROR: pruned puts mismatch for {stock} {exp}: "
                        f"got {data.get('pruned_puts_count')} expected {expected_pruned_puts}"
                    )
                    sys.exit(1)
                time.sleep(args.sleep)
        print(f"Cycle {idx} OK")

    print("\nAll cycles completed successfully.")


if __name__ == "__main__":
    main()


@@ -0,0 +1,145 @@
import argparse
import json
import sys
import time
import urllib.parse
import urllib.request

DEFAULT_SYMBOLS = ["AAPL", "AMZN", "MSFT", "TSLA"]

REQUIRED_SECTIONS = [
    "key_metrics",
    "valuation",
    "profitability",
    "growth",
    "financial_strength",
    "cashflow",
    "ownership",
    "analyst",
    "earnings",
    "performance",
]

REQUIRED_KEY_METRICS = [
    "previous_close",
    "open",
    "bid",
    "ask",
    "beta",
    "eps_trailing",
    "dividend_rate",
    "current_price",
]


def http_get(base_url, params, timeout):
    """GET base_url with the given query params and return the decoded JSON body."""
    query = urllib.parse.urlencode(params)
    url = f"{base_url}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def parse_list(value, default):
    """Split a comma-separated string into a list; return default when value is empty."""
    if not value:
        return default
    return [item.strip() for item in value.split(",") if item.strip()]


def build_signature(data):
    """Return the sorted key names of each section, used to detect schema drift between runs."""
    return {
        f"{section}_keys": sorted(data.get(section, {}).keys())
        for section in REQUIRED_SECTIONS
    }


def validate_payload(symbol, data):
    """Return an error message if the /profile payload is invalid, otherwise None."""
    if "error" in data:
        return f"API error for {symbol}: {data}"
    # Guard against a missing or null "stock" field before calling .upper().
    if (data.get("stock") or "").upper() != symbol.upper():
        return f"Symbol mismatch: expected {symbol} got {data.get('stock')}"
    validation = data.get("validation", {})
    if validation.get("symbol_match") is not True:
        return f"Validation symbol_match failed for {symbol}: {validation}"
    if validation.get("issues"):
        return f"Validation issues for {symbol}: {validation}"
    for section in REQUIRED_SECTIONS:
        if section not in data:
            return f"Missing section {section} for {symbol}"
    key_metrics = data.get("key_metrics", {})
    for field in REQUIRED_KEY_METRICS:
        if field not in key_metrics:
            return f"Missing key metric {field} for {symbol}"
    return None


def main():
    parser = argparse.ArgumentParser(description="Yahoo profile scraper test cycles")
    parser.add_argument(
        "--base-url",
        default="http://127.0.0.1:9777/profile",
        help="Base URL for /profile",
    )
    parser.add_argument(
        "--symbols",
        default=",".join(DEFAULT_SYMBOLS),
        help="Comma-separated stock symbols",
    )
    parser.add_argument(
        "--runs",
        type=int,
        default=8,
        help="Number of validation runs per symbol",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=180,
        help="Request timeout in seconds",
    )
    parser.add_argument(
        "--sleep",
        type=float,
        default=0.2,
        help="Sleep between requests",
    )
    args = parser.parse_args()
    symbols = parse_list(args.symbols, DEFAULT_SYMBOLS)
    signatures = {}
    print(f"Running {args.runs} profile cycles for: {', '.join(symbols)}")
    for run in range(1, args.runs + 1):
        print(f"Cycle {run}/{args.runs}")
        for symbol in symbols:
            data = http_get(args.base_url, {"stock": symbol}, args.timeout)
            error = validate_payload(symbol, data)
            if error:
                print(f"ERROR: {error}")
                sys.exit(1)
            signature = build_signature(data)
            if symbol not in signatures:
                signatures[symbol] = signature
            elif signatures[symbol] != signature:
                print(f"ERROR: Signature changed for {symbol}")
                print(f"Baseline: {signatures[symbol]}")
                print(f"Current: {signature}")
                sys.exit(1)
            time.sleep(args.sleep)
        print(f"Cycle {run} OK")
    print("\nAll profile cycles completed successfully.")


if __name__ == "__main__":
    main()