# AGENTS.md
## Context
- This project exposes a Flask API that uses Playwright to scrape Yahoo Finance options chains.
- Entry point: `scraper_service.py` (launched via `runner.bat` or directly with Python).
- API route: `GET /scrape_sync` with `stock`, optional `expiration|expiry|date`, and `strikeLimit` parameters.
- Expiration inputs: epoch seconds (Yahoo date param) or date strings supported by `DATE_FORMATS`.
- `strikeLimit` defaults to 25 and controls the number of nearest strikes returned per side. Per `parse_strike_limit`, missing, non-numeric, or non-positive values fall back to the default.
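
A minimal sketch of how the two expiration input forms could be normalized to the epoch seconds used for Yahoo's `date` parameter, mirroring the digit check and UTC-midnight conversion described in the line-by-line section. The helper name `to_epoch_seconds` is hypothetical and is not part of `scraper_service.py`; the real service also has a label-matching fallback that this sketch omits.

```python
from datetime import datetime, timezone

# Same formats as DATE_FORMATS in scraper_service.py.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d", "%b %d, %Y", "%B %d, %Y")


def to_epoch_seconds(expiration):
    """Hypothetical helper: return epoch seconds for an expiration input, or None."""
    raw = (expiration or "").strip()
    if not raw:
        return None
    if raw.isdigit():
        # All-digit input is taken as epoch seconds already.
        return int(raw)
    for fmt in DATE_FORMATS:
        try:
            day = datetime.strptime(raw, fmt).date()
        except ValueError:
            continue
        # Midnight UTC of the requested calendar day.
        return int(
            datetime(day.year, day.month, day.day, tzinfo=timezone.utc).timestamp()
        )
    return None


print(to_epoch_seconds("2025-01-17"))
print(to_epoch_seconds("1737072000"))
```

Because the digit check runs first, an all-digit date such as `20250117` takes the epoch branch in this sketch rather than being parsed with `%Y%m%d`.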
## Docker
- Build: `docker build -t <image>:latest .`
- Run: `docker run --rm -p 9777:9777 <image>:latest`
- The container uses the Playwright base image with bundled browsers.
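
Once the container is running, the service should be reachable on the published port. The sketch below only builds the request URL; the host `localhost` and the helper name `build_scrape_url` are assumptions, while the port and route come from the `docker run` mapping and the Context section above. The commented lines show one way to send the request with the standard library.

```python
from urllib.parse import urlencode


def build_scrape_url(stock, expiration=None, strike_limit=None,
                     host="localhost", port=9777):
    """Hypothetical helper: build a GET /scrape_sync URL for the service."""
    params = {"stock": stock}
    if expiration is not None:
        params["expiration"] = expiration
    if strike_limit is not None:
        params["strikeLimit"] = strike_limit
    # urlencode percent-encodes symbols such as "^SPX".
    return f"http://{host}:{port}/scrape_sync?{urlencode(params)}"


url = build_scrape_url("AAPL", expiration="2025-01-17", strike_limit=10)
print(url)

# To call the running container (requires the service to be up):
# import json, urllib.request
# with urllib.request.urlopen(url, timeout=120) as resp:
#     data = json.load(resp)
```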
## Line-by-line explanation of scraper_service.py
- Line 1: Import symbols from flask. Code: `from flask import Flask, jsonify, request`
- Line 2: Import symbols from playwright.sync_api. Code: `from playwright.sync_api import sync_playwright`
- Line 3: Import symbols from bs4. Code: `from bs4 import BeautifulSoup`
- Line 4: Import symbols from datetime. Code: `from datetime import datetime, timezone`
- Line 5: Import module urllib.parse. Code: `import urllib.parse`
- Line 6: Import module logging. Code: `import logging`
- Line 7: Import module json. Code: `import json`
- Line 8: Import module re. Code: `import re`
- Line 9: Import module time. Code: `import time`
- Line 10: Blank line for readability. Code: `<blank>`
- Line 11: Create the Flask application instance. Code: `app = Flask(__name__)`
- Line 12: Blank line for readability. Code: `<blank>`
- Line 13: Comment describing the next block. Code: `# Logging`
- Line 14: Configure logging defaults. Code: `logging.basicConfig(`
- Line 15: Set the root log level to INFO. Code: `level=logging.INFO,`
- Line 16: Set the log record format (timestamp, level, message). Code: `format="%(asctime)s [%(levelname)s] %(message)s"`
- Line 17: Close the current block or container. Code: `)`
- Line 18: Set the Flask logger level. Code: `app.logger.setLevel(logging.INFO)`
- Line 19: Blank line for readability. Code: `<blank>`
- Line 20: Define accepted expiration date string formats. Code: `DATE_FORMATS = (`
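- Line 21: Accept ISO-style dates such as `2025-01-17`. Code: `"%Y-%m-%d",`
- Line 22: Accept slash-separated dates such as `2025/01/17`. Code: `"%Y/%m/%d",`
- Line 23: Accept compact dates such as `20250117`. All-digit input is treated as epoch seconds before date parsing is attempted (see Lines 256 and 413), so this format is not reached through those paths. Code: `"%Y%m%d",`
- Line 24: Accept abbreviated month names such as `Jan 17, 2025`. Code: `"%b %d, %Y",`
- Line 25: Accept full month names such as `January 17, 2025`. Code: `"%B %d, %Y",`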
- Line 26: Close the current block or container. Code: `)`
- Line 27: Blank line for readability. Code: `<blank>`
- Line 28: Blank line for readability. Code: `<blank>`
- Line 29: Define the parse_date function. Code: `def parse_date(value):`
- Line 30: Try each accepted format in order. Code: `for fmt in DATE_FORMATS:`
- Line 31: Start a try block for error handling. Code: `try:`
- Line 32: Return the parsed date for the first format that matches. Code: `return datetime.strptime(value, fmt).date()`
- Line 33: Handle exceptions for the preceding try block. Code: `except ValueError:`
- Line 34: Move on to the next format. Code: `continue`
- Line 35: Return `None` when no format matches. Code: `return None`
- Line 36: Blank line for readability. Code: `<blank>`
- Line 37: Blank line for readability. Code: `<blank>`
- Line 38: Define the normalize_label function. Code: `def normalize_label(value):`
- Line 39: Trim the label, collapse whitespace runs to single spaces, and lowercase it. Code: `return " ".join(value.strip().split()).lower()`
- Line 40: Blank line for readability. Code: `<blank>`
- Line 41: Blank line for readability. Code: `<blank>`
- Line 42: Define the format_expiration_label function. Code: `def format_expiration_label(timestamp):`
- Line 43: Start a try block for error handling. Code: `try:`
- Line 44: Format the epoch timestamp as a UTC `YYYY-MM-DD` string. Code: `return datetime.utcfromtimestamp(timestamp).strftime("%Y-%m-%d")`
- Line 45: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 46: Fall back to the string form of the input if conversion fails. Code: `return str(timestamp)`
- Line 47: Blank line for readability. Code: `<blank>`
- Line 48: Blank line for readability. Code: `<blank>`
- Line 49: Define the format_percent function. Code: `def format_percent(value):`
- Line 50: Conditional branch. Code: `if value is None:`
- Line 51: Return a value to the caller. Code: `return None`
- Line 52: Start a try block for error handling. Code: `try:`
- Line 53: Multiply the fraction by 100 and format it with two decimals and a `%` sign. Code: `return f"{value * 100:.2f}%"`
- Line 54: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 55: Return a value to the caller. Code: `return None`
- Line 56: Blank line for readability. Code: `<blank>`
- Line 57: Blank line for readability. Code: `<blank>`
- Line 58: Define the extract_raw_value function. Code: `def extract_raw_value(value):`
- Line 59: Check whether the value is a dict with `raw`/`fmt` entries. Code: `if isinstance(value, dict):`
- Line 60: Return its `raw` entry (`None` if absent). Code: `return value.get("raw")`
- Line 61: Return non-dict values unchanged. Code: `return value`
- Line 62: Blank line for readability. Code: `<blank>`
- Line 63: Blank line for readability. Code: `<blank>`
- Line 64: Define the extract_fmt_value function. Code: `def extract_fmt_value(value):`
- Line 65: Check whether the value is a dict with `raw`/`fmt` entries. Code: `if isinstance(value, dict):`
- Line 66: Return its `fmt` entry (`None` if absent). Code: `return value.get("fmt")`
- Line 67: Return `None` for non-dict values. Code: `return None`
- Line 68: Blank line for readability. Code: `<blank>`
- Line 69: Blank line for readability. Code: `<blank>`
- Line 70: Define the format_percent_value function. Code: `def format_percent_value(value):`
- Line 71: Look for a preformatted string on the value. Code: `fmt = extract_fmt_value(value)`
- Line 72: Check whether a preformatted string exists. Code: `if fmt is not None:`
- Line 73: Return the preformatted string as-is. Code: `return fmt`
- Line 74: Otherwise format the raw value as a percentage. Code: `return format_percent(extract_raw_value(value))`
- Line 75: Blank line for readability. Code: `<blank>`
- Line 76: Blank line for readability. Code: `<blank>`
- Line 77: Define the format_last_trade_date function. Code: `def format_last_trade_date(timestamp):`
- Line 78: Unwrap the raw timestamp if the field is a dict. Code: `timestamp = extract_raw_value(timestamp)`
- Line 79: Check for a missing or zero timestamp. Code: `if not timestamp:`
- Line 80: Return `None` when there is no timestamp. Code: `return None`
- Line 81: Start a try block for error handling. Code: `try:`
- Line 82: Format the timestamp in the host's local time zone and append a literal ` EST` suffix. The suffix is not derived from the actual time zone, so it is accurate only when the host runs on Eastern time. Code: `return datetime.fromtimestamp(timestamp).strftime("%m/%d/%Y %I:%M %p") + " EST"`
- Line 83: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 84: Return `None` if the timestamp cannot be formatted. Code: `return None`
- Line 85: Blank line for readability. Code: `<blank>`
- Line 86: Blank line for readability. Code: `<blank>`
- Line 87: Define the extract_option_chain_from_html function. Code: `def extract_option_chain_from_html(html):`
- Line 88: Conditional branch. Code: `if not html:`
- Line 89: Return a value to the caller. Code: `return None`
- Line 90: Blank line for readability. Code: `<blank>`
- Line 91: Define the marker that precedes an embedded JSON string body. Code: `token = "\"body\":\""`
- Line 92: Start searching from the beginning of the HTML. Code: `start = 0`
- Line 93: Loop until no further marker is found. Code: `while True:`
- Line 94: Find the next marker at or after the search position. Code: `idx = html.find(token, start)`
- Line 95: Check whether a marker was found. Code: `if idx == -1:`
- Line 96: Stop scanning when no marker remains. Code: `break`
- Line 97: Point at the first character after the marker. Code: `i = idx + len(token)`
- Line 98: Track whether the previous character was a backslash. Code: `escaped = False`
- Line 99: Collect the characters of the string literal. Code: `raw_chars = []`
- Line 100: Walk the HTML until the closing quote or the end of input. Code: `while i < len(html):`
- Line 101: Read the current character. Code: `ch = html[i]`
- Line 102: Handle a character that follows a backslash. Code: `if escaped:`
- Line 103: Keep the escaped character verbatim. Code: `raw_chars.append(ch)`
- Line 104: Clear the escape flag. Code: `escaped = False`
- Line 105: Fallback branch. Code: `else:`
- Line 106: Detect a backslash that starts an escape sequence. Code: `if ch == "\\":`
- Line 107: Keep the backslash so the JSON decoder can interpret it later. Code: `raw_chars.append(ch)`
- Line 108: Set the escape flag. Code: `escaped = True`
- Line 109: Detect the unescaped quote that ends the string. Code: `elif ch == "\"":`
- Line 110: Stop collecting characters. Code: `break`
- Line 111: Fallback branch. Code: `else:`
- Line 112: Keep an ordinary character. Code: `raw_chars.append(ch)`
- Line 113: Advance to the next character. Code: `i += 1`
- Line 114: Join the collected characters into the raw string literal. Code: `raw = "".join(raw_chars)`
- Line 115: Start a try block for error handling. Code: `try:`
- Line 116: Decode the raw literal as a JSON string to unescape it. Code: `body_text = json.loads(f"\"{raw}\"")`
- Line 117: Handle exceptions for the preceding try block. Code: `except json.JSONDecodeError:`
- Line 118: Resume the search just past this marker. Code: `start = idx + len(token)`
- Line 119: Skip to the next marker. Code: `continue`
- Line 120: Skip bodies that do not mention an option chain. Code: `if "optionChain" not in body_text:`
- Line 121: Resume the search just past this marker. Code: `start = idx + len(token)`
- Line 122: Skip to the next marker. Code: `continue`
- Line 123: Start a try block for error handling. Code: `try:`
- Line 124: Parse the unescaped body as JSON. Code: `payload = json.loads(body_text)`
- Line 125: Handle exceptions for the preceding try block. Code: `except json.JSONDecodeError:`
- Line 126: Resume the search just past this marker. Code: `start = idx + len(token)`
- Line 127: Skip to the next marker. Code: `continue`
- Line 128: Read the `optionChain` object from the payload. Code: `option_chain = payload.get("optionChain")`
- Line 129: Accept the chain only if it has a non-empty `result`. Code: `if option_chain and option_chain.get("result"):`
- Line 130: Return the option chain. Code: `return option_chain`
- Line 131: Blank line for readability. Code: `<blank>`
- Line 132: Otherwise resume the search just past this marker. Code: `start = idx + len(token)`
- Line 133: Blank line for readability. Code: `<blank>`
- Line 134: Return `None` when no usable option chain was found. Code: `return None`
- Line 135: Blank line for readability. Code: `<blank>`
- Line 136: Blank line for readability. Code: `<blank>`
- Line 137: Define the extract_expiration_dates_from_chain function. Code: `def extract_expiration_dates_from_chain(chain):`
- Line 138: Conditional branch. Code: `if not chain:`
- Line 139: Return a value to the caller. Code: `return []`
- Line 140: Blank line for readability. Code: `<blank>`
- Line 141: Read the `result` list from the chain. Code: `result = chain.get("result", [])`
- Line 142: Check for an empty result. Code: `if not result:`
- Line 143: Return an empty list. Code: `return []`
- Line 144: Return the first result's `expirationDates`, or an empty list if it is missing or null. Code: `return result[0].get("expirationDates", []) or []`
- Line 145: Blank line for readability. Code: `<blank>`
- Line 146: Blank line for readability. Code: `<blank>`
- Line 147: Define the normalize_chain_rows function. Code: `def normalize_chain_rows(rows):`
- Line 148: Initialize the list of normalized rows. Code: `normalized = []`
- Line 149: Loop over the rows, treating `None` as empty. Code: `for row in rows or []:`
- Line 150: Append one normalized row. Code: `normalized.append(`
- Line 151: Open the row dictionary. Code: `{`
- Line 152: Map `contractSymbol` to the "Contract Name" column. Code: `"Contract Name": row.get("contractSymbol"),`
- Line 153: Format the last trade timestamp for display. Code: `"Last Trade Date (EST)": format_last_trade_date(`
- Line 154: Pass the `lastTradeDate` field. Code: `row.get("lastTradeDate")`
- Line 155: Close the current block or container. Code: `),`
- Line 156: Take the raw strike value. Code: `"Strike": extract_raw_value(row.get("strike")),`
- Line 157: Take the raw last price. Code: `"Last Price": extract_raw_value(row.get("lastPrice")),`
- Line 158: Take the raw bid. Code: `"Bid": extract_raw_value(row.get("bid")),`
- Line 159: Take the raw ask. Code: `"Ask": extract_raw_value(row.get("ask")),`
- Line 160: Take the raw price change. Code: `"Change": extract_raw_value(row.get("change")),`
- Line 161: Format the percent change, preferring the preformatted string. Code: `"% Change": format_percent_value(row.get("percentChange")),`
- Line 162: Take the raw volume. Code: `"Volume": extract_raw_value(row.get("volume")),`
- Line 163: Take the raw open interest. Code: `"Open Interest": extract_raw_value(row.get("openInterest")),`
- Line 164: Format the implied volatility as a percentage. Code: `"Implied Volatility": format_percent_value(`
- Line 165: Pass the `impliedVolatility` field. Code: `row.get("impliedVolatility")`
- Line 166: Close the current block or container. Code: `),`
- Line 167: Close the current block or container. Code: `}`
- Line 168: Close the current block or container. Code: `)`
- Line 169: Return the normalized rows. Code: `return normalized`
- Line 170: Blank line for readability. Code: `<blank>`
- Line 171: Blank line for readability. Code: `<blank>`
- Line 172: Define the build_rows_from_chain function. Code: `def build_rows_from_chain(chain):`
- Line 173: Read the `result` list, or use an empty list when the chain is missing. Code: `result = chain.get("result", []) if chain else []`
- Line 174: Check for an empty result. Code: `if not result:`
- Line 175: Return empty call and put lists. Code: `return [], []`
- Line 176: Read the `options` list from the first result. Code: `options = result[0].get("options", [])`
- Line 177: Check for an empty options list. Code: `if not options:`
- Line 178: Return empty call and put lists. Code: `return [], []`
- Line 179: Use the first options entry. Code: `option = options[0]`
- Line 180: Return a `(calls, puts)` tuple. Code: `return (`
- Line 181: Normalize the call rows. Code: `normalize_chain_rows(option.get("calls")),`
- Line 182: Normalize the put rows. Code: `normalize_chain_rows(option.get("puts")),`
- Line 183: Close the current block or container. Code: `)`
- Line 184: Blank line for readability. Code: `<blank>`
- Line 185: Blank line for readability. Code: `<blank>`
- Line 186: Define the extract_contract_expiry_code function. Code: `def extract_contract_expiry_code(contract_name):`
- Line 187: Conditional branch. Code: `if not contract_name:`
- Line 188: Return a value to the caller. Code: `return None`
- Line 189: Search for the first run of six digits, expected to be the `YYMMDD` expiry code in the contract symbol. Code: `match = re.search(r"(\d{6})", contract_name)`
- Line 190: Return the matched code, or `None` if there is no match. Code: `return match.group(1) if match else None`
- Line 191: Blank line for readability. Code: `<blank>`
- Line 192: Blank line for readability. Code: `<blank>`
- Line 193: Define the expected_expiry_code function. Code: `def expected_expiry_code(timestamp):`
- Line 194: Conditional branch. Code: `if not timestamp:`
- Line 195: Return a value to the caller. Code: `return None`
- Line 196: Start a try block for error handling. Code: `try:`
- Line 197: Format the epoch timestamp as a UTC `YYMMDD` code. Code: `return datetime.utcfromtimestamp(timestamp).strftime("%y%m%d")`
- Line 198: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 199: Return a value to the caller. Code: `return None`
- Line 200: Blank line for readability. Code: `<blank>`
- Line 201: Blank line for readability. Code: `<blank>`
- Line 202: Define the extract_expiration_dates_from_html function. Code: `def extract_expiration_dates_from_html(html):`
- Line 203: Conditional branch. Code: `if not html:`
- Line 204: Return a value to the caller. Code: `return []`
- Line 205: Blank line for readability. Code: `<blank>`
- Line 206: Define the regex patterns to try. Code: `patterns = (`
- Line 207: Match the JSON-escaped form, where the quotes are preceded by backslashes. Code: `r'\\"expirationDates\\":\[(.*?)\]',`
- Line 208: Match the plain, unescaped form. Code: `r'"expirationDates":\[(.*?)\]',`
- Line 209: Close the current block or container. Code: `)`
- Line 210: Start with no match. Code: `match = None`
- Line 211: Try each pattern in order. Code: `for pattern in patterns:`
- Line 212: Search the HTML, letting `.` span newlines. Code: `match = re.search(pattern, html, re.DOTALL)`
- Line 213: Check whether the pattern matched. Code: `if match:`
- Line 214: Stop at the first pattern that matches. Code: `break`
- Line 215: Check whether no pattern matched. Code: `if not match:`
- Line 216: Return an empty list. Code: `return []`
- Line 217: Blank line for readability. Code: `<blank>`
- Line 218: Take the text captured between the brackets. Code: `raw = match.group(1)`
- Line 219: Initialize the list of timestamps. Code: `values = []`
- Line 220: Split the captured text on commas. Code: `for part in raw.split(","):`
- Line 221: Trim surrounding whitespace. Code: `part = part.strip()`
- Line 222: Keep only all-digit entries. Code: `if part.isdigit():`
- Line 223: Start a try block for error handling. Code: `try:`
- Line 224: Append the timestamp as an integer. Code: `values.append(int(part))`
- Line 225: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 226: Skip entries that fail to convert. Code: `continue`
- Line 227: Return the collected timestamps. Code: `return values`
- Line 228: Blank line for readability. Code: `<blank>`
- Line 229: Blank line for readability. Code: `<blank>`
- Line 230: Define the build_expiration_options function. Code: `def build_expiration_options(expiration_dates):`
- Line 231: Initialize the list of expiration options. Code: `options = []`
- Line 232: Loop over the timestamps, treating `None` as empty. Code: `for value in expiration_dates or []:`
- Line 233: Start a try block for error handling. Code: `try:`
- Line 234: Coerce the value to an integer. Code: `value_int = int(value)`
- Line 235: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 236: Skip values that cannot be converted. Code: `continue`
- Line 237: Blank line for readability. Code: `<blank>`
- Line 238: Build the `YYYY-MM-DD` label. Code: `label = format_expiration_label(value_int)`
- Line 239: Start a try block for error handling. Code: `try:`
- Line 240: Compute the UTC calendar date, used later to match date inputs. Code: `date_value = datetime.utcfromtimestamp(value_int).date()`
- Line 241: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 242: Leave the date unset if conversion fails. Code: `date_value = None`
- Line 243: Blank line for readability. Code: `<blank>`
- Line 244: Append the option with its value, label, and date. Code: `options.append({"value": value_int, "label": label, "date": date_value})`
- Line 245: Return the options sorted by timestamp. Code: `return sorted(options, key=lambda x: x["value"])`
- Line 246: Blank line for readability. Code: `<blank>`
- Line 247: Blank line for readability. Code: `<blank>`
- Line 248: Define the resolve_expiration function. Code: `def resolve_expiration(expiration, options):`
- Line 249: Check for a missing expiration. Code: `if not expiration:`
- Line 250: Return an unresolved `(None, None)` pair. Code: `return None, None`
- Line 251: Blank line for readability. Code: `<blank>`
- Line 252: Trim the input. Code: `raw = expiration.strip()`
- Line 253: Check for an empty string after trimming. Code: `if not raw:`
- Line 254: Return an unresolved pair. Code: `return None, None`
- Line 255: Blank line for readability. Code: `<blank>`
- Line 256: Treat all-digit input as epoch seconds. Code: `if raw.isdigit():`
- Line 257: Convert the input to an integer. Code: `value = int(raw)`
- Line 258: When options are known, require an exact match. Code: `if options:`
- Line 259: Loop over the available options. Code: `for opt in options:`
- Line 260: Compare the option's timestamp with the requested value. Code: `if opt.get("value") == value:`
- Line 261: Return the matching value and label. Code: `return value, opt.get("label")`
- Line 262: Return an unresolved pair when no option matches. Code: `return None, None`
- Line 263: With no options to check against, accept the value and derive its label. Code: `return value, format_expiration_label(value)`
- Line 264: Blank line for readability. Code: `<blank>`
- Line 265: Try to parse the input as a date string. Code: `requested_date = parse_date(raw)`
- Line 266: Check whether parsing succeeded. Code: `if requested_date:`
- Line 267: Loop over the available options. Code: `for opt in options:`
- Line 268: Compare the option's UTC date with the requested date. Code: `if opt.get("date") == requested_date:`
- Line 269: Return the matching value and label. Code: `return opt.get("value"), opt.get("label")`
- Line 270: Return an unresolved pair when no option falls on that date. Code: `return None, None`
- Line 271: Blank line for readability. Code: `<blank>`
- Line 272: Normalize the input for label comparison. Code: `normalized = normalize_label(raw)`
- Line 273: Loop over the available options. Code: `for opt in options:`
- Line 274: Compare normalized labels. Code: `if normalize_label(opt.get("label", "")) == normalized:`
- Line 275: Return the matching value and label. Code: `return opt.get("value"), opt.get("label")`
- Line 276: Blank line for readability. Code: `<blank>`
- Line 277: Return an unresolved pair when nothing matches. Code: `return None, None`
- Line 278: Blank line for readability. Code: `<blank>`
- Line 279: Blank line for readability. Code: `<blank>`
- Line 280: Define the wait_for_tables function. Code: `def wait_for_tables(page):`
- Line 281: Start a try block for error handling. Code: `try:`
- Line 282: Wait for a table inside the options section. Code: `page.wait_for_selector(`
- Line 283: Selector for tables in the options list section. Code: `"section[data-testid='options-list-table'] table",`
- Line 284: Wait up to 30 seconds. Code: `timeout=30000,`
- Line 285: Close the current block or container. Code: `)`
- Line 286: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 287: Fall back to waiting for any table, again for up to 30 seconds. Code: `page.wait_for_selector("table", timeout=30000)`
- Line 288: Blank line for readability. Code: `<blank>`
- Line 289: Poll up to 30 times, once per second. Code: `for _ in range(30): # 30 * 1s = 30 seconds`
- Line 290: Collect option tables from the options section. Code: `tables = page.query_selector_all(`
- Line 291: Selector for tables in the options list section. Code: `"section[data-testid='options-list-table'] table"`
- Line 292: Close the current block or container. Code: `)`
- Line 293: Check for at least two tables (expected to be calls and puts). Code: `if len(tables) >= 2:`
- Line 294: Return the tables. Code: `return tables`
- Line 295: Fall back to all tables on the page. Code: `tables = page.query_selector_all("table")`
- Line 296: Check for at least two tables. Code: `if len(tables) >= 2:`
- Line 297: Return the tables. Code: `return tables`
- Line 298: Sleep one second before polling again. Code: `time.sleep(1)`
- Line 299: Return an empty list if two tables never appear. Code: `return []`
- Line 300: Blank line for readability. Code: `<blank>`
- Line 301: Blank line for readability. Code: `<blank>`
- Line 302: Define the parse_strike_limit function. Code: `def parse_strike_limit(value, default=25):`
- Line 303: Check for a missing value. Code: `if value is None:`
- Line 304: Return the default. Code: `return default`
- Line 305: Start a try block for error handling. Code: `try:`
- Line 306: Convert the value to an integer. Code: `limit = int(value)`
- Line 307: Handle exceptions for the preceding try block. Code: `except (TypeError, ValueError):`
- Line 308: Return the default for non-numeric input. Code: `return default`
- Line 309: Return the limit if it is positive; otherwise return the default. Code: `return limit if limit > 0 else default`
- Line 310: Blank line for readability. Code: `<blank>`
- Line 311: Blank line for readability. Code: `<blank>`
- Line 312: Define the scrape_yahoo_options function. Code: `def scrape_yahoo_options(symbol, expiration=None, strike_limit=25):`
- Line 313: Define the parse_table function. Code: `def parse_table(table_html, side):`
- Line 314: Conditional branch. Code: `if not table_html:`
- Line 315: Emit or configure a log message. Code: `app.logger.warning("No %s table HTML for %s", side, symbol)`
- Line 316: Return a value to the caller. Code: `return []`
- Line 317: Blank line for readability. Code: `<blank>`
- Line 318: Execute the statement as written. Code: `soup = BeautifulSoup(table_html, "html.parser")`
- Line 319: Blank line for readability. Code: `<blank>`
- Line 320: Extract header labels from the table. Code: `headers = [th.get_text(strip=True) for th in soup.select("thead th")]`
- Line 321: Collect table rows for parsing. Code: `rows = soup.select("tbody tr")`
- Line 322: Blank line for readability. Code: `<blank>`
- Line 323: Initialize the parsed rows list. Code: `parsed = []`
- Line 324: Loop over items. Code: `for r in rows:`
- Line 325: Collect table cells for the current row. Code: `tds = r.find_all("td")`
- Line 326: Check whether the cell count matches the header count. Code: `if len(tds) != len(headers):`
- Line 327: Skip rows that do not line up with the headers. Code: `continue`
- Line 328: Blank line for readability. Code: `<blank>`
- Line 329: Initialize a row dictionary. Code: `item = {}`
- Line 330: Loop over items. Code: `for i, c in enumerate(tds):`
- Line 331: Read the header name for the current column. Code: `key = headers[i]`
- Line 332: Read or convert the cell value. Code: `val = c.get_text(" ", strip=True)`
- Line 333: Blank line for readability. Code: `<blank>`
- Line 334: Comment describing the next block. Code: `# Convert numeric fields`
- Line 335: Check for float-valued columns. Code: `if key in ["Strike", "Last Price", "Bid", "Ask", "Change"]:`
- Line 336: Start a try block for error handling. Code: `try:`
- Line 337: Strip thousands separators and convert to a float. Code: `val = float(val.replace(",", ""))`
- Line 338: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 339: Use `None` when conversion fails. Code: `val = None`
- Line 340: Check for integer-valued columns. Code: `elif key in ["Volume", "Open Interest"]:`
- Line 341: Start a try block for error handling. Code: `try:`
- Line 342: Strip thousands separators and convert to an integer. Code: `val = int(val.replace(",", ""))`
- Line 343: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 344: Use `None` when conversion fails. Code: `val = None`
- Line 345: Check for placeholder values in the remaining columns. Code: `elif val in ["-", ""]:`
- Line 346: Treat placeholders as missing. Code: `val = None`
- Line 347: Blank line for readability. Code: `<blank>`
- Line 348: Store the value under its header name. Code: `item[key] = val`
- Line 349: Blank line for readability. Code: `<blank>`
- Line 350: Append the completed row. Code: `parsed.append(item)`
- Line 351: Blank line for readability. Code: `<blank>`
- Line 352: Emit or configure a log message. Code: `app.logger.info("Parsed %d %s rows", len(parsed), side)`
- Line 353: Return a value to the caller. Code: `return parsed`
- Line 354: Blank line for readability. Code: `<blank>`
- Line 355: Define the read_option_chain function. Code: `def read_option_chain(page):`
- Line 356: Capture the page HTML content. Code: `html = page.content()`
- Line 357: Try to extract the embedded option chain JSON. Code: `option_chain = extract_option_chain_from_html(html)`
- Line 358: Check whether a chain was found. Code: `if option_chain:`
- Line 359: Read expiration timestamps from the parsed chain. Code: `expiration_dates = extract_expiration_dates_from_chain(option_chain)`
- Line 360: Fallback branch. Code: `else:`
- Line 361: Fall back to extracting expiration timestamps from the raw HTML with regexes. Code: `expiration_dates = extract_expiration_dates_from_html(html)`
- Line 362: Return the chain and the expiration timestamps. Code: `return option_chain, expiration_dates`
- Line 363: Blank line for readability. Code: `<blank>`
- Line 364: Define the has_expected_expiry function. Code: `def has_expected_expiry(options, expected_code):`
- Line 365: Check for a missing expected code. Code: `if not expected_code:`
- Line 366: Return `False` when there is nothing to compare against. Code: `return False`
- Line 367: Loop over the rows, treating `None` as empty. Code: `for row in options or []:`
- Line 368: Read the contract name. Code: `name = row.get("Contract Name")`
- Line 369: Compare the contract's expiry code with the expected code. Code: `if extract_contract_expiry_code(name) == expected_code:`
- Line 370: Return `True` on the first match. Code: `return True`
- Line 371: Return `False` when no row matches. Code: `return False`
- Line 372: Blank line for readability. Code: `<blank>`
- Line 373: URL-encode the stock symbol. Code: `encoded = urllib.parse.quote(symbol, safe="")`
- Line 374: Build the base Yahoo Finance options URL. Code: `base_url = f"https://finance.yahoo.com/quote/{encoded}/options/"`
- Line 375: Normalize the expiration input string. Code: `requested_expiration = expiration.strip() if expiration else None`
- Line 376: Check for an empty string after trimming. Code: `if not requested_expiration:`
- Line 377: Treat an empty string as no expiration. Code: `requested_expiration = None`
- Line 378: Set the URL to load. Code: `url = base_url`
- Line 379: Blank line for readability. Code: `<blank>`
- Line 380: Log the start of the scrape. Code: `app.logger.info(`
- Line 381: Message template. Code: `"Starting scrape for symbol=%s expiration=%s url=%s",`
- Line 382: Log the symbol. Code: `symbol,`
- Line 383: Log the requested expiration. Code: `requested_expiration,`
- Line 384: Log the base URL (the `date` query parameter has not been added yet). Code: `base_url,`
- Line 385: Close the current block or container. Code: `)`
- Line 386: Blank line for readability. Code: `<blank>`
- Line 387: Initialize storage for the calls table HTML. Code: `calls_html = None`
- Line 388: Initialize storage for the puts table HTML. Code: `puts_html = None`
- Line 389: Initialize the list of parsed call rows. Code: `calls_full = []`
- Line 390: Initialize the list of parsed put rows. Code: `puts_full = []`
- Line 391: Initialize or assign the current price. Code: `price = None`
- Line 392: Track the resolved expiration metadata. Code: `selected_expiration_value = None`
- Line 393: Track the resolved expiration metadata. Code: `selected_expiration_label = None`
- Line 394: Prepare or update the list of available expirations. Code: `expiration_options = []`
- Line 395: Track the resolved expiration epoch timestamp. Code: `target_date = None`
- Line 396: Track whether a base-page lookup is needed. Code: `fallback_to_base = False`
- Line 397: Blank line for readability. Code: `<blank>`
- Line 398: Enter a context manager block. Code: `with sync_playwright() as p:`
- Line 399: Launch a Playwright browser instance. Code: `browser = p.chromium.launch(headless=True)`
- Line 400: Create a new Playwright page. Code: `page = browser.new_page()`
- Line 401: Set extra HTTP headers for requests made by the page. Code: `page.set_extra_http_headers(`
- Line 402: Open the headers dictionary. Code: `{`
- Line 403: Set a desktop Chrome User-Agent. Code: `"User-Agent": (`
- Line 404: First part of the User-Agent string. Code: `"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "`
- Line 405: Second part of the User-Agent string (adjacent literals are concatenated). Code: `"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36"`
- Line 406: Close the current block or container. Code: `)`
- Line 407: Close the current block or container. Code: `}`
- Line 408: Close the current block or container. Code: `)`
- Line 409: Set the default timeout for page operations to 60 seconds. Code: `page.set_default_timeout(60000)`
- Line 410: Blank line for readability. Code: `<blank>`
- Line 411: Start a try block for error handling. Code: `try:`
- Line 412: Conditional branch. Code: `if requested_expiration:`
- Line 413: Conditional branch. Code: `if requested_expiration.isdigit():`
- Line 414: Track the resolved expiration epoch timestamp. Code: `target_date = int(requested_expiration)`
- Line 415: Track the resolved expiration metadata. Code: `selected_expiration_value = target_date`
- Line 416: Track the resolved expiration metadata. Code: `selected_expiration_label = format_expiration_label(target_date)`
- Line 417: Otherwise, treat the requested expiration as a date string. Code: `else:`
- Line 418: Parse the date string with `parse_date`. Code: `parsed_date = parse_date(requested_expiration)`
- Line 419: Continue only if parsing produced a date. Code: `if parsed_date:`
- Line 420: Track the resolved expiration epoch timestamp (begin the integer conversion). Code: `target_date = int(`
- Line 421: Begin building a `datetime` for the parsed date. Code: `datetime(`
- Line 422: Pass the parsed year. Code: `parsed_date.year,`
- Line 423: Pass the parsed month. Code: `parsed_date.month,`
- Line 424: Pass the parsed day. Code: `parsed_date.day,`
- Line 425: Mark the datetime as UTC; with no time given, it represents midnight UTC. Code: `tzinfo=timezone.utc,`
- Line 426: Convert the datetime to epoch seconds. Code: `).timestamp()`
- Line 427: Close the `int(` conversion. Code: `)`
- Line 428: Track the resolved expiration metadata. Code: `selected_expiration_value = target_date`
- Line 429: Track the resolved expiration metadata. Code: `selected_expiration_label = format_expiration_label(target_date)`
- Line 430: Otherwise, the date string could not be parsed. Code: `else:`
- Line 431: Track whether a base-page lookup is needed. Code: `fallback_to_base = True`
- Line 432: Blank line for readability. Code: `<blank>`
- Line 433: If an expiration epoch was resolved, load the page for that date. Code: `if target_date:`
- Line 434: Set the URL to load, passing the epoch as the `date` query parameter. Code: `url = f"{base_url}?date={target_date}"`
- Line 435: Blank line for readability. Code: `<blank>`
- Line 436: Navigate the Playwright page to the target URL, waiting for `domcontentloaded` with a 60-second timeout. Code: `page.goto(url, wait_until="domcontentloaded", timeout=60000)`
- Line 437: Log that the page has loaded. Code: `app.logger.info("Page loaded (domcontentloaded) for %s", symbol)`
- Line 438: Blank line for readability. Code: `<blank>`
- Line 439: Read the option chain and the available expiration dates from the page. Code: `option_chain, expiration_dates = read_option_chain(page)`
- Line 440: Log whether an option chain was found. Code: `app.logger.info("Option chain found: %s", bool(option_chain))`
- Line 441: Prepare or update the list of available expirations. Code: `expiration_options = build_expiration_options(expiration_dates)`
- Line 442: Blank line for readability. Code: `<blank>`
- Line 443: If the requested expiration could not be parsed, resolve it against the expirations listed on the base page. Code: `if fallback_to_base:`
- Line 444: Begin resolving the requested expiration to a value and label. Code: `resolved_value, resolved_label = resolve_expiration(`
- Line 445: Pass the raw request string and the available expiration options. Code: `requested_expiration, expiration_options`
- Line 446: Close the `resolve_expiration` call. Code: `)`
- Line 447: Check whether no expiration could be resolved. Code: `if resolved_value is None:`
- Line 448: Return an error payload to the caller. Code: `return {`
- Line 449: Error message field. Code: `"error": "Requested expiration not available",`
- Line 450: Echo the stock symbol. Code: `"stock": symbol,`
- Line 451: Echo the requested expiration. Code: `"requested_expiration": requested_expiration,`
- Line 452: Begin the list of available expirations. Code: `"available_expirations": [`
- Line 453: Emit each option's label and value. Code: `{"label": opt.get("label"), "value": opt.get("value")}`
- Line 454: Iterate over the expiration options (list comprehension clause). Code: `for opt in expiration_options`
- Line 455: Close the list of available expirations. Code: `],`
- Line 456: Close the error payload. Code: `}`
- Line 457: Blank line for readability. Code: `<blank>`
- Line 458: Track the resolved expiration epoch timestamp. Code: `target_date = resolved_value`
- Line 459: Track the resolved expiration metadata. Code: `selected_expiration_value = resolved_value`
- Line 460: Track the resolved expiration label, falling back to a formatted label when none was resolved. Code: `selected_expiration_label = resolved_label or format_expiration_label(`
- Line 461: Pass the resolved epoch value to `format_expiration_label`. Code: `resolved_value`
- Line 462: Close the `format_expiration_label` call. Code: `)`
- Line 463: Set the URL to load. Code: `url = f"{base_url}?date={resolved_value}"`
- Line 464: Navigate the Playwright page to the target URL. Code: `page.goto(url, wait_until="domcontentloaded", timeout=60000)`
- Line 465: Log that the page has loaded. Code: `app.logger.info("Page loaded (domcontentloaded) for %s", symbol)`
- Line 466: Blank line for readability. Code: `<blank>`
- Line 467: Re-read the option chain and expiration dates from the reloaded page. Code: `option_chain, expiration_dates = read_option_chain(page)`
- Line 468: Prepare or update the list of available expirations. Code: `expiration_options = build_expiration_options(expiration_dates)`
- Line 469: Blank line for readability. Code: `<blank>`
- Line 470: If a target date is set and expirations are available, verify that the date is among them. Code: `if target_date and expiration_options:`
- Line 471: Initialize the matched option to `None`. Code: `matched = None`
- Line 472: Loop over the expiration options. Code: `for opt in expiration_options:`
- Line 473: Check whether this option's value equals the target date. Code: `if opt.get("value") == target_date:`
- Line 474: Record the matching option. Code: `matched = opt`
- Line 475: Stop searching after the first match. Code: `break`
- Line 476: Check whether no option matched. Code: `if not matched:`
- Line 477: Return an error payload to the caller. Code: `return {`
- Line 478: Error message field. Code: `"error": "Requested expiration not available",`
- Line 479: Echo the stock symbol. Code: `"stock": symbol,`
- Line 480: Echo the requested expiration. Code: `"requested_expiration": requested_expiration,`
- Line 481: Begin the list of available expirations. Code: `"available_expirations": [`
- Line 482: Emit each option's label and value. Code: `{"label": opt.get("label"), "value": opt.get("value")}`
- Line 483: Iterate over the expiration options (list comprehension clause). Code: `for opt in expiration_options`
- Line 484: Close the list of available expirations. Code: `],`
- Line 485: Close the error payload. Code: `}`
- Line 486: Use the matched option's value as the selected expiration. Code: `selected_expiration_value = matched.get("value")`
- Line 487: Use the matched option's label as the selected expiration label. Code: `selected_expiration_label = matched.get("label")`
- Line 488: Otherwise, if expirations exist but no target date is set, default to the first listed expiration. Code: `elif expiration_options and not target_date:`
- Line 489: Track the resolved expiration metadata. Code: `selected_expiration_value = expiration_options[0].get("value")`
- Line 490: Track the resolved expiration metadata. Code: `selected_expiration_label = expiration_options[0].get("label")`
- Line 491: Blank line for readability. Code: `<blank>`
- Line 492: Build the full calls and puts row lists from the option chain. Code: `calls_full, puts_full = build_rows_from_chain(option_chain)`
- Line 493: Begin logging the row counts. Code: `app.logger.info(`
- Line 494: Log message format string. Code: `"Option chain rows: calls=%d puts=%d",`
- Line 495: Number of call rows. Code: `len(calls_full),`
- Line 496: Number of put rows. Code: `len(puts_full),`
- Line 497: Close the logging call. Code: `)`
- Line 498: Blank line for readability. Code: `<blank>`
- Line 499: If the chain produced no rows, fall back to scraping the HTML tables. Code: `if not calls_full and not puts_full:`
- Line 500: Log that the scraper is waiting for the tables. Code: `app.logger.info("Waiting for options tables...")`
- Line 501: Blank line for readability. Code: `<blank>`
- Line 502: Collect option tables from the page. Code: `tables = wait_for_tables(page)`
- Line 503: Check whether fewer than two tables were found. Code: `if len(tables) < 2:`
- Line 504: Begin logging an error. Code: `app.logger.error(`
- Line 505: Error message format string. Code: `"Only %d tables found; expected 2. HTML may have changed.",`
- Line 506: Number of tables found. Code: `len(tables),`
- Line 507: Close the logging call. Code: `)`
- Line 508: Return a value to the caller. Code: `return {"error": "Could not locate options tables", "stock": symbol}`
- Line 509: Blank line for readability. Code: `<blank>`
- Line 510: Emit or configure a log message. Code: `app.logger.info("Found %d tables. Extracting Calls & Puts.", len(tables))`
- Line 511: Blank line for readability. Code: `<blank>`
- Line 512: Capture the outer HTML of the first table (calls). Code: `calls_html = tables[0].evaluate("el => el.outerHTML")`
- Line 513: Capture the outer HTML of the second table (puts). Code: `puts_html = tables[1].evaluate("el => el.outerHTML")`
- Line 514: Blank line for readability. Code: `<blank>`
- Line 515: Comment describing the next block. Code: `# --- Extract current price ---`
- Line 516: Start a try block for error handling. Code: `try:`
- Line 517: Comment describing the next block. Code: `# Primary selector`
- Line 518: Read the current price text from the page. Code: `price_text = page.locator(`
- Line 519: CSS selector for the `fin-streamer` element whose `data-field` is `regularMarketPrice`. Code: `"fin-streamer[data-field='regularMarketPrice']"`
- Line 520: Read the element's inner text. Code: `).inner_text()`
- Line 521: Initialize or assign the current price. Code: `price = float(price_text.replace(",", ""))`
- Line 522: Handle exceptions for the preceding try block. Code: `except Exception:`
- Line 523: Start a try block for error handling. Code: `try:`
- Line 524: Comment describing the next block. Code: `# Fallback`
- Line 525: Read the current price text from the page. Code: `price_text = page.locator("span[data-testid='qsp-price']").inner_text()`
- Line 526: Initialize or assign the current price. Code: `price = float(price_text.replace(",", ""))`
- Line 527: Handle exceptions for the preceding try block. Code: `except Exception as e:`
- Line 528: Emit or configure a log message. Code: `app.logger.warning("Failed to extract price for %s: %s", symbol, e)`
- Line 529: Blank line for readability. Code: `<blank>`
- Line 530: Emit or configure a log message. Code: `app.logger.info("Current price for %s = %s", symbol, price)`
- Line 531: Start the `finally` clause, which runs whether or not an error occurred. Code: `finally:`
- Line 532: Close the browser. Code: `browser.close()`
- Line 533: Blank line for readability. Code: `<blank>`
- Line 534: If the chain produced no rows but table HTML was captured, parse the tables instead. Code: `if not calls_full and not puts_full and calls_html and puts_html:`
- Line 535: Parse the full calls and puts tables. Code: `calls_full = parse_table(calls_html, "calls")`
- Line 536: Parse the full calls and puts tables. Code: `puts_full = parse_table(puts_html, "puts")`
- Line 537: Blank line for readability. Code: `<blank>`
- Line 538: Compute the expected expiry code for the target date. Code: `expected_code = expected_expiry_code(target_date)`
- Line 539: Validate only when an expected code is available. Code: `if expected_code:`
- Line 540: Check whether neither the calls nor the puts contain the expected expiry code (the condition continues on the next lines). Code: `if not has_expected_expiry(calls_full, expected_code) and not has_expected_expiry(`
- Line 541: Arguments to the second `has_expected_expiry` call. Code: `puts_full, expected_code`
- Line 542: Close the call and end the condition. Code: `):`
- Line 543: Return an error payload to the caller. Code: `return {`
- Line 544: Error message field. Code: `"error": "Options chain does not match requested expiration",`
- Line 545: Echo the stock symbol. Code: `"stock": symbol,`
- Line 546: Echo the requested expiration. Code: `"requested_expiration": requested_expiration,`
- Line 547: Include the expiry code that was expected. Code: `"expected_expiration_code": expected_code,`
- Line 548: Begin the selected expiration object. Code: `"selected_expiration": {`
- Line 549: Selected expiration epoch value. Code: `"value": selected_expiration_value,`
- Line 550: Selected expiration label. Code: `"label": selected_expiration_label,`
- Line 551: Close the `selected_expiration` object. Code: `},`
- Line 552: Close the error payload. Code: `}`
- Line 553: Blank line for readability. Code: `<blank>`
- Line 554: Comment describing the next block. Code: `# ----------------------------------------------------------------------`
- Line 555: Comment describing the next block. Code: `# Pruning logic`
- Line 556: Comment describing the next block. Code: `# ----------------------------------------------------------------------`
- Line 557: Define the prune_nearest function. Code: `def prune_nearest(options, price_value, limit=25, side=""):`
- Line 558: If no price is available, skip pruning. Code: `if price_value is None:`
- Line 559: Return the options unchanged with a pruned count of 0. Code: `return options, 0`
- Line 560: Blank line for readability. Code: `<blank>`
- Line 561: Filter options to numeric strike entries. Code: `numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]`
- Line 562: Blank line for readability. Code: `<blank>`
- Line 563: If the numeric entries already fit within the limit, no pruning is needed. Code: `if len(numeric) <= limit:`
- Line 564: Return the numeric entries with a pruned count of 0. Code: `return numeric, 0`
- Line 565: Blank line for readability. Code: `<blank>`
- Line 566: Sort options by distance to current price. Code: `sorted_opts = sorted(numeric, key=lambda x: abs(x["Strike"] - price_value))`
- Line 567: Keep the closest strike entries. Code: `pruned = sorted_opts[:limit]`
- Line 568: Compute how many rows were dropped relative to the original list; this also counts any non-numeric strikes filtered out on line 561. Code: `pruned_count = len(options) - len(pruned)`
- Line 569: Return a value to the caller. Code: `return pruned, pruned_count`
- Line 570: Blank line for readability. Code: `<blank>`
- Line 571: Apply pruning to calls. Code: `calls, pruned_calls = prune_nearest(`
- Line 572: Pass the full list of calls. Code: `calls_full,`
- Line 573: Pass the current price. Code: `price,`
- Line 574: Pass the requested strike limit. Code: `limit=strike_limit,`
- Line 575: Pass the side label for calls. Code: `side="calls",`
- Line 576: Close the `prune_nearest` call. Code: `)`
- Line 577: Apply pruning to puts. Code: `puts, pruned_puts = prune_nearest(`
- Line 578: Pass the full list of puts. Code: `puts_full,`
- Line 579: Pass the current price. Code: `price,`
- Line 580: Pass the requested strike limit. Code: `limit=strike_limit,`
- Line 581: Pass the side label for puts. Code: `side="puts",`
- Line 582: Close the `prune_nearest` call. Code: `)`
- Line 583: Blank line for readability. Code: `<blank>`
- Line 584: Define the strike_range function. Code: `def strike_range(opts):`
- Line 585: Collect strike prices from the option list. Code: `strikes = [o["Strike"] for o in opts if isinstance(o.get("Strike"), (int, float))]`
- Line 586: Return `[min, max]` of the strikes, or `[None, None]` when there are none. Code: `return [min(strikes), max(strikes)] if strikes else [None, None]`
- Line 587: Blank line for readability. Code: `<blank>`
- Line 588: Return the final result payload to the caller. Code: `return {`
- Line 589: Stock symbol. Code: `"stock": symbol,`
- Line 590: URL that was scraped. Code: `"url": url,`
- Line 591: Expiration as requested by the caller. Code: `"requested_expiration": requested_expiration,`
- Line 592: Begin the selected expiration object. Code: `"selected_expiration": {`
- Line 593: Selected expiration epoch value. Code: `"value": selected_expiration_value,`
- Line 594: Selected expiration label. Code: `"label": selected_expiration_label,`
- Line 595: Close the `selected_expiration` object. Code: `},`
- Line 596: Current price of the underlying. Code: `"current_price": price,`
- Line 597: Calls remaining after pruning. Code: `"calls": calls,`
- Line 598: Puts remaining after pruning. Code: `"puts": puts,`
- Line 599: Minimum and maximum strike among the returned calls. Code: `"calls_strike_range": strike_range(calls),`
- Line 600: Minimum and maximum strike among the returned puts. Code: `"puts_strike_range": strike_range(puts),`
- Line 601: Number of calls returned after pruning. Code: `"total_calls": len(calls),`
- Line 602: Number of puts returned after pruning. Code: `"total_puts": len(puts),`
- Line 603: Number of call rows pruned. Code: `"pruned_calls_count": pruned_calls,`
- Line 604: Number of put rows pruned. Code: `"pruned_puts_count": pruned_puts,`
- Line 605: Close the result payload. Code: `}`
- Line 606: Blank line for readability. Code: `<blank>`
- Line 607: Blank line for readability. Code: `<blank>`
- Line 608: Attach the route decorator to the handler. Code: `@app.route("/scrape_sync")`
- Line 609: Define the scrape_sync function. Code: `def scrape_sync():`
- Line 610: Read the stock symbol parameter, defaulting to `MSFT`. Code: `symbol = request.args.get("stock", "MSFT")`
- Line 611: Read the expiration parameters from the request. Code: `expiration = (`
- Line 612: Try the `expiration` parameter first. Code: `request.args.get("expiration")`
- Line 613: Fall back to the `expiry` parameter. Code: `or request.args.get("expiry")`
- Line 614: Fall back to the `date` parameter. Code: `or request.args.get("date")`
- Line 615: Close the parenthesized expression. Code: `)`
- Line 616: Read or default the strikeLimit parameter. Code: `strike_limit = parse_strike_limit(request.args.get("strikeLimit"), default=25)`
- Line 617: Begin logging the incoming request. Code: `app.logger.info(`
- Line 618: Log message format string. Code: `"Received /scrape_sync request for symbol=%s expiration=%s strike_limit=%s",`
- Line 619: Symbol argument for the log message. Code: `symbol,`
- Line 620: Expiration argument for the log message. Code: `expiration,`
- Line 621: Strike limit argument for the log message. Code: `strike_limit,`
- Line 622: Close the logging call. Code: `)`
- Line 623: Return a value to the caller. Code: `return jsonify(scrape_yahoo_options(symbol, expiration, strike_limit))`
- Line 624: Blank line for readability. Code: `<blank>`
- Line 625: Blank line for readability. Code: `<blank>`
- Line 626: Run the following only when the file is executed directly rather than imported. Code: `if __name__ == "__main__":`
- Line 627: Run the Flask development server on all interfaces at port 9777. Code: `app.run(host="0.0.0.0", port=9777)`
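
The pruning steps described for lines 557 to 569 can be tried in isolation. The following is a minimal sketch reconstructed from those lines, not a copy of the file: it omits the `side` parameter (which the listed function body does not use), and the sample rows and price are hypothetical.

```python
def prune_nearest(options, price_value, limit=25):
    """Keep the `limit` options whose strike is closest to `price_value`."""
    # Without a price there is nothing to measure distance against.
    if price_value is None:
        return options, 0

    # Ignore rows whose strike is missing or not a number.
    numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]

    # Nothing to prune if the numeric rows already fit within the limit.
    if len(numeric) <= limit:
        return numeric, 0

    # Sort by absolute distance from the current price and keep the closest.
    sorted_opts = sorted(numeric, key=lambda o: abs(o["Strike"] - price_value))
    pruned = sorted_opts[:limit]
    return pruned, len(options) - len(pruned)


# Hypothetical sample rows; real rows carry more fields than "Strike".
sample = [{"Strike": s} for s in (90, 95, 100, 105, 110)]
kept, dropped = prune_nearest(sample, price_value=101.0, limit=3)
kept_strikes = sorted(o["Strike"] for o in kept)
print(kept_strikes, dropped)
```

With a price of 101.0 and a limit of 3, the strikes at distances 1, 4, and 6 are kept (100, 105, 95) and the remaining two rows are counted as pruned. Note that the result is ordered by distance from the price, not by strike, so a caller that needs strike order would have to sort again.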