AGENTS.md
Context
- This project exposes a Flask API that uses Playwright to scrape Yahoo Finance options chains.
- Entry point: scraper_service.py (launched via runner.bat or directly with Python).
- API route: GET /scrape_sync with stock and optional expiration|expiry|date parameters.
- Expiration inputs: epoch seconds (Yahoo date param) or date strings supported by DATE_FORMATS.
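The expiration handling described above can be sketched as follows. This is a minimal illustration rather than the service's own code: the helper name to_yahoo_date_param is hypothetical, and the format tuple is copied from DATE_FORMATS in scraper_service.py. It assumes, as the scraper appears to, that all-digit input is checked first and that a date string maps to midnight UTC of that day.

```python
from datetime import datetime, timezone

# Copied from DATE_FORMATS in scraper_service.py.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d", "%b %d, %Y", "%B %d, %Y")


def to_yahoo_date_param(value):
    """Hypothetical helper: map an expiration input to epoch seconds, or None."""
    raw = value.strip() if value else ""
    if not raw:
        return None
    # All-digit input is taken as epoch seconds before any date parsing.
    if raw.isdigit():
        return int(raw)
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt).date()
        except ValueError:
            continue  # try the next accepted format
        # Midnight UTC of the parsed calendar day.
        midnight = datetime(parsed.year, parsed.month, parsed.day, tzinfo=timezone.utc)
        return int(midnight.timestamp())
    return None
```

Under this ordering an all-digit string such as 20250117 is read as epoch seconds rather than matched against %Y%m%d, so compact dates are safer to send with separators (for example 2025-01-17).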
Docker
- Build: docker build -t <image>:latest .
- Run: docker run --rm -p 9777:9777 <image>:latest
- The container uses the Playwright base image with bundled browsers.
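Once the container is running, a client could call the API along these lines. This is a sketch under stated assumptions: the host localhost, the helper names build_scrape_url and fetch_options, and the 120-second client timeout are all illustrative, and the response is assumed to be JSON. Only the port (9777), the route (/scrape_sync), and the parameter names come from this document.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the container port is published on the local machine.
BASE_URL = "http://localhost:9777"


def build_scrape_url(stock, expiration=None, base_url=BASE_URL):
    """Hypothetical helper: build the GET /scrape_sync URL for a symbol."""
    params = {"stock": stock}
    if expiration:
        # expiry and date are documented as alternative names for this parameter.
        params["expiration"] = expiration
    return f"{base_url}/scrape_sync?{urllib.parse.urlencode(params)}"


def fetch_options(stock, expiration=None, timeout=120):
    """Hypothetical helper: call the service and decode its (assumed) JSON body."""
    url = build_scrape_url(stock, expiration)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, build_scrape_url("AAPL", "2025-01-17") yields http://localhost:9777/scrape_sync?stock=AAPL&expiration=2025-01-17. A generous client timeout is used because the scraper drives a real browser and waits on page loads.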
Line-by-line explanation of scraper_service.py
- Line 1: Import symbols from flask. Code:
from flask import Flask, jsonify, request - Line 2: Import symbols from playwright.sync_api. Code:
from playwright.sync_api import sync_playwright - Line 3: Import symbols from bs4. Code:
from bs4 import BeautifulSoup - Line 4: Import symbols from datetime. Code:
from datetime import datetime, timezone - Line 5: Import module urllib.parse. Code:
import urllib.parse - Line 6: Import module logging. Code:
import logging - Line 7: Import module json. Code:
import json - Line 8: Import module re. Code:
import re - Line 9: Import module time. Code:
import time - Line 10: Blank line for readability. Code:
<blank> - Line 11: Create the Flask application instance. Code:
app = Flask(__name__) - Line 12: Blank line for readability. Code:
<blank> - Line 13: Comment describing the next block. Code:
# Logging - Line 14: Configure logging defaults. Code:
logging.basicConfig( - Line 15: Set the root log level to INFO. Code:
level=logging.INFO, - Line 16: Set the log format to timestamp, level, and message. Code:
format="%(asctime)s [%(levelname)s] %(message)s" - Line 17: Close the current block or container. Code:
) - Line 18: Set the Flask logger level. Code:
app.logger.setLevel(logging.INFO) - Line 19: Blank line for readability. Code:
<blank> - Line 20: Define accepted expiration date string formats. Code:
DATE_FORMATS = ( - Line 21: Accept ISO-style dates such as 2025-01-17. Code:
"%Y-%m-%d", - Line 22: Accept slash-separated dates such as 2025/01/17. Code:
"%Y/%m/%d", - Line 23: Accept compact dates such as 20250117. Code:
"%Y%m%d", - Line 24: Accept abbreviated month names such as Jan 17, 2025. Code:
"%b %d, %Y", - Line 25: Accept full month names such as January 17, 2025. Code:
"%B %d, %Y", - Line 26: Close the current block or container. Code:
) - Line 27: Blank line for readability. Code:
<blank> - Line 28: Blank line for readability. Code:
<blank> - Line 29: Define the parse_date function. Code:
def parse_date(value): - Line 30: Try each accepted format in turn. Code:
for fmt in DATE_FORMATS: - Line 31: Start a try block for error handling. Code:
try: - Line 32: Return the parsed date for the first format that matches. Code:
return datetime.strptime(value, fmt).date() - Line 33: Handle the format mismatch raised by strptime. Code:
except ValueError: - Line 34: Skip to the next format. Code:
continue - Line 35: Return None when no format matches. Code:
return None - Line 36: Blank line for readability. Code:
<blank> - Line 37: Blank line for readability. Code:
<blank> - Line 38: Define the normalize_label function. Code:
def normalize_label(value): - Line 39: Trim the label, collapse runs of whitespace, and lowercase it. Code:
return " ".join(value.strip().split()).lower() - Line 40: Blank line for readability. Code:
<blank> - Line 41: Blank line for readability. Code:
<blank> - Line 42: Define the format_expiration_label function. Code:
def format_expiration_label(timestamp): - Line 43: Start a try block for error handling. Code:
try: - Line 44: Return a value to the caller. Code:
return datetime.utcfromtimestamp(timestamp).strftime("%Y-%m-%d") - Line 45: Handle exceptions for the preceding try block. Code:
except Exception: - Line 46: Return a value to the caller. Code:
return str(timestamp) - Line 47: Blank line for readability. Code:
<blank> - Line 48: Blank line for readability. Code:
<blank> - Line 49: Define the format_percent function. Code:
def format_percent(value): - Line 50: Conditional branch. Code:
if value is None: - Line 51: Return a value to the caller. Code:
return None - Line 52: Start a try block for error handling. Code:
try: - Line 53: Return a value to the caller. Code:
return f"{value * 100:.2f}%" - Line 54: Handle exceptions for the preceding try block. Code:
except Exception: - Line 55: Return a value to the caller. Code:
return None - Line 56: Blank line for readability. Code:
<blank> - Line 57: Blank line for readability. Code:
<blank> - Line 58: Define the extract_raw_value function. Code:
def extract_raw_value(value): - Line 59: Conditional branch. Code:
if isinstance(value, dict): - Line 60: Return a value to the caller. Code:
return value.get("raw") - Line 61: Return a value to the caller. Code:
return value - Line 62: Blank line for readability. Code:
<blank> - Line 63: Blank line for readability. Code:
<blank> - Line 64: Define the extract_fmt_value function. Code:
def extract_fmt_value(value): - Line 65: Conditional branch. Code:
if isinstance(value, dict): - Line 66: Return a value to the caller. Code:
return value.get("fmt") - Line 67: Return a value to the caller. Code:
return None - Line 68: Blank line for readability. Code:
<blank> - Line 69: Blank line for readability. Code:
<blank> - Line 70: Define the format_percent_value function. Code:
def format_percent_value(value): - Line 71: Read the preformatted fmt string, if the value carries one. Code:
fmt = extract_fmt_value(value) - Line 72: Prefer the preformatted string when it is present. Code:
if fmt is not None: - Line 73: Return a value to the caller. Code:
return fmt - Line 74: Otherwise format the raw value as a percentage. Code:
return format_percent(extract_raw_value(value)) - Line 75: Blank line for readability. Code:
<blank> - Line 76: Blank line for readability. Code:
<blank> - Line 77: Define the format_last_trade_date function. Code:
def format_last_trade_date(timestamp): - Line 78: Unwrap a raw/fmt dictionary to its raw epoch value. Code:
timestamp = extract_raw_value(timestamp) - Line 79: Treat a missing or zero timestamp as absent. Code:
if not timestamp: - Line 80: Return a value to the caller. Code:
return None - Line 81: Start a try block for error handling. Code:
try: - Line 82: Format the timestamp in the host's local time zone and append a fixed " EST" suffix, which is accurate only when the host runs on US Eastern standard time. Code:
return datetime.fromtimestamp(timestamp).strftime("%m/%d/%Y %I:%M %p") + " EST" - Line 83: Handle exceptions for the preceding try block. Code:
except Exception: - Line 84: Return a value to the caller. Code:
return None - Line 85: Blank line for readability. Code:
<blank> - Line 86: Blank line for readability. Code:
<blank> - Line 87: Define the extract_option_chain_from_html function. Code:
def extract_option_chain_from_html(html): - Line 88: Conditional branch. Code:
if not html: - Line 89: Return a value to the caller. Code:
return None - Line 90: Blank line for readability. Code:
<blank> - Line 91: Set the marker that precedes an embedded JSON string body. Code:
token = "\"body\":\"" - Line 92: Start searching from the beginning of the HTML. Code:
start = 0 - Line 93: Scan every occurrence of the marker. Code:
while True: - Line 94: Find the next marker at or after the current offset. Code:
idx = html.find(token, start) - Line 95: Check whether any marker remains. Code:
if idx == -1: - Line 96: Leave the scan loop. Code:
break - Line 97: Point at the first character after the marker. Code:
i = idx + len(token) - Line 98: Track whether the previous character was a backslash. Code:
escaped = False - Line 99: Collect the characters of the string literal. Code:
raw_chars = [] - Line 100: Walk the HTML until the string literal ends. Code:
while i < len(html): - Line 101: Read the current character. Code:
ch = html[i] - Line 102: Check whether this character follows a backslash. Code:
if escaped: - Line 103: Keep the escaped character as-is. Code:
raw_chars.append(ch) - Line 104: Clear the escape flag. Code:
escaped = False - Line 105: Fallback branch. Code:
else: - Line 106: Check for a backslash that starts an escape sequence. Code:
if ch == "\\": - Line 107: Keep the backslash so the JSON decoder can process it later. Code:
raw_chars.append(ch) - Line 108: Mark the next character as escaped. Code:
escaped = True - Line 109: Check for the unescaped quote that closes the string. Code:
elif ch == "\"": - Line 110: Stop collecting characters. Code:
break - Line 111: Fallback branch. Code:
else: - Line 112: Keep an ordinary character. Code:
raw_chars.append(ch) - Line 113: Advance to the next character. Code:
i += 1 - Line 114: Join the collected characters into the raw string literal. Code:
raw = "".join(raw_chars) - Line 115: Start a try block for error handling. Code:
try: - Line 116: Decode the literal as a JSON string to unescape it. Code:
body_text = json.loads(f"\"{raw}\"") - Line 117: Handle exceptions for the preceding try block. Code:
except json.JSONDecodeError: - Line 118: Resume the search just past this marker. Code:
start = idx + len(token) - Line 119: Try the next marker. Code:
continue - Line 120: Skip bodies that do not mention optionChain. Code:
if "optionChain" not in body_text: - Line 121: Resume the search just past this marker. Code:
start = idx + len(token) - Line 122: Try the next marker. Code:
continue - Line 123: Start a try block for error handling. Code:
try: - Line 124: Parse the unescaped body as JSON. Code:
payload = json.loads(body_text) - Line 125: Handle exceptions for the preceding try block. Code:
except json.JSONDecodeError: - Line 126: Resume the search just past this marker. Code:
start = idx + len(token) - Line 127: Try the next marker. Code:
continue - Line 128: Read the optionChain object from the payload. Code:
option_chain = payload.get("optionChain") - Line 129: Accept the chain only if it has a non-empty result list. Code:
if option_chain and option_chain.get("result"): - Line 130: Return a value to the caller. Code:
return option_chain - Line 131: Blank line for readability. Code:
<blank> - Line 132: Resume the search just past this marker. Code:
start = idx + len(token) - Line 133: Blank line for readability. Code:
<blank> - Line 134: Return None when no usable option chain was found. Code:
return None - Line 135: Blank line for readability. Code:
<blank> - Line 136: Blank line for readability. Code:
<blank> - Line 137: Define the extract_expiration_dates_from_chain function. Code:
def extract_expiration_dates_from_chain(chain): - Line 138: Conditional branch. Code:
if not chain: - Line 139: Return a value to the caller. Code:
return [] - Line 140: Blank line for readability. Code:
<blank> - Line 141: Execute the statement as written. Code:
result = chain.get("result", []) - Line 142: Conditional branch. Code:
if not result: - Line 143: Return a value to the caller. Code:
return [] - Line 144: Return a value to the caller. Code:
return result[0].get("expirationDates", []) or [] - Line 145: Blank line for readability. Code:
<blank> - Line 146: Blank line for readability. Code:
<blank> - Line 147: Define the normalize_chain_rows function. Code:
def normalize_chain_rows(rows): - Line 148: Execute the statement as written. Code:
normalized = [] - Line 149: Loop over items. Code:
for row in rows or []: - Line 150: Execute the statement as written. Code:
normalized.append( - Line 151: Execute the statement as written. Code:
{ - Line 152: Execute the statement as written. Code:
"Contract Name": row.get("contractSymbol"), - Line 153: Execute the statement as written. Code:
"Last Trade Date (EST)": format_last_trade_date( - Line 154: Execute the statement as written. Code:
row.get("lastTradeDate") - Line 155: Close the current block or container. Code:
), - Line 156: Execute the statement as written. Code:
"Strike": extract_raw_value(row.get("strike")), - Line 157: Execute the statement as written. Code:
"Last Price": extract_raw_value(row.get("lastPrice")), - Line 158: Execute the statement as written. Code:
"Bid": extract_raw_value(row.get("bid")), - Line 159: Execute the statement as written. Code:
"Ask": extract_raw_value(row.get("ask")), - Line 160: Execute the statement as written. Code:
"Change": extract_raw_value(row.get("change")), - Line 161: Execute the statement as written. Code:
"% Change": format_percent_value(row.get("percentChange")), - Line 162: Execute the statement as written. Code:
"Volume": extract_raw_value(row.get("volume")), - Line 163: Execute the statement as written. Code:
"Open Interest": extract_raw_value(row.get("openInterest")), - Line 164: Execute the statement as written. Code:
"Implied Volatility": format_percent_value( - Line 165: Execute the statement as written. Code:
row.get("impliedVolatility") - Line 166: Close the current block or container. Code:
), - Line 167: Close the current block or container. Code:
} - Line 168: Close the current block or container. Code:
) - Line 169: Return a value to the caller. Code:
return normalized - Line 170: Blank line for readability. Code:
<blank> - Line 171: Blank line for readability. Code:
<blank> - Line 172: Define the build_rows_from_chain function. Code:
def build_rows_from_chain(chain): - Line 173: Execute the statement as written. Code:
result = chain.get("result", []) if chain else [] - Line 174: Conditional branch. Code:
if not result: - Line 175: Return a value to the caller. Code:
return [], [] - Line 176: Execute the statement as written. Code:
options = result[0].get("options", []) - Line 177: Conditional branch. Code:
if not options: - Line 178: Return a value to the caller. Code:
return [], [] - Line 179: Execute the statement as written. Code:
option = options[0] - Line 180: Return a value to the caller. Code:
return ( - Line 181: Execute the statement as written. Code:
normalize_chain_rows(option.get("calls")), - Line 182: Execute the statement as written. Code:
normalize_chain_rows(option.get("puts")), - Line 183: Close the current block or container. Code:
) - Line 184: Blank line for readability. Code:
<blank> - Line 185: Blank line for readability. Code:
<blank> - Line 186: Define the extract_contract_expiry_code function. Code:
def extract_contract_expiry_code(contract_name): - Line 187: Conditional branch. Code:
if not contract_name: - Line 188: Return a value to the caller. Code:
return None - Line 189: Find the first run of six digits in the contract symbol (the YYMMDD expiry code). Code:
match = re.search(r"(\d{6})", contract_name) - Line 190: Return the matched digits, or None when there is no match. Code:
return match.group(1) if match else None - Line 191: Blank line for readability. Code:
<blank> - Line 192: Blank line for readability. Code:
<blank> - Line 193: Define the expected_expiry_code function. Code:
def expected_expiry_code(timestamp): - Line 194: Conditional branch. Code:
if not timestamp: - Line 195: Return a value to the caller. Code:
return None - Line 196: Start a try block for error handling. Code:
try: - Line 197: Return a value to the caller. Code:
return datetime.utcfromtimestamp(timestamp).strftime("%y%m%d") - Line 198: Handle exceptions for the preceding try block. Code:
except Exception: - Line 199: Return a value to the caller. Code:
return None - Line 200: Blank line for readability. Code:
<blank> - Line 201: Blank line for readability. Code:
<blank> - Line 202: Define the extract_expiration_dates_from_html function. Code:
def extract_expiration_dates_from_html(html): - Line 203: Conditional branch. Code:
if not html: - Line 204: Return a value to the caller. Code:
return [] - Line 205: Blank line for readability. Code:
<blank> - Line 206: Execute the statement as written. Code:
patterns = ( - Line 207: Execute the statement as written. Code:
r'\\"expirationDates\\":\[(.*?)\]', - Line 208: Execute the statement as written. Code:
r'"expirationDates":\[(.*?)\]', - Line 209: Close the current block or container. Code:
) - Line 210: Execute the statement as written. Code:
match = None - Line 211: Loop over items. Code:
for pattern in patterns: - Line 212: Execute the statement as written. Code:
match = re.search(pattern, html, re.DOTALL) - Line 213: Conditional branch. Code:
if match: - Line 214: Execute the statement as written. Code:
break - Line 215: Conditional branch. Code:
if not match: - Line 216: Return a value to the caller. Code:
return [] - Line 217: Blank line for readability. Code:
<blank> - Line 218: Execute the statement as written. Code:
raw = match.group(1) - Line 219: Execute the statement as written. Code:
values = [] - Line 220: Loop over items. Code:
for part in raw.split(","): - Line 221: Execute the statement as written. Code:
part = part.strip() - Line 222: Conditional branch. Code:
if part.isdigit(): - Line 223: Start a try block for error handling. Code:
try: - Line 224: Execute the statement as written. Code:
values.append(int(part)) - Line 225: Handle exceptions for the preceding try block. Code:
except Exception: - Line 226: Execute the statement as written. Code:
continue - Line 227: Return a value to the caller. Code:
return values - Line 228: Blank line for readability. Code:
<blank> - Line 229: Blank line for readability. Code:
<blank> - Line 230: Define the build_expiration_options function. Code:
def build_expiration_options(expiration_dates): - Line 231: Execute the statement as written. Code:
options = [] - Line 232: Loop over items. Code:
for value in expiration_dates or []: - Line 233: Start a try block for error handling. Code:
try: - Line 234: Execute the statement as written. Code:
value_int = int(value) - Line 235: Handle exceptions for the preceding try block. Code:
except Exception: - Line 236: Execute the statement as written. Code:
continue - Line 237: Blank line for readability. Code:
<blank> - Line 238: Execute the statement as written. Code:
label = format_expiration_label(value_int) - Line 239: Start a try block for error handling. Code:
try: - Line 240: Execute the statement as written. Code:
date_value = datetime.utcfromtimestamp(value_int).date() - Line 241: Handle exceptions for the preceding try block. Code:
except Exception: - Line 242: Execute the statement as written. Code:
date_value = None - Line 243: Blank line for readability. Code:
<blank> - Line 244: Execute the statement as written. Code:
options.append({"value": value_int, "label": label, "date": date_value}) - Line 245: Return a value to the caller. Code:
return sorted(options, key=lambda x: x["value"]) - Line 246: Blank line for readability. Code:
<blank> - Line 247: Blank line for readability. Code:
<blank> - Line 248: Define the resolve_expiration function. Code:
def resolve_expiration(expiration, options): - Line 249: Treat a missing expiration as unresolved. Code:
if not expiration: - Line 250: Return a value to the caller. Code:
return None, None - Line 251: Blank line for readability. Code:
<blank> - Line 252: Trim surrounding whitespace from the input. Code:
raw = expiration.strip() - Line 253: Treat a whitespace-only input as unresolved. Code:
if not raw: - Line 254: Return a value to the caller. Code:
return None, None - Line 255: Blank line for readability. Code:
<blank> - Line 256: Handle an all-digit input as epoch seconds. Code:
if raw.isdigit(): - Line 257: Convert the input to an integer timestamp. Code:
value = int(raw) - Line 258: Validate against the known expirations when any are available. Code:
if options: - Line 259: Loop over the available expirations. Code:
for opt in options: - Line 260: Look for an exact timestamp match. Code:
if opt.get("value") == value: - Line 261: Return the matching timestamp and its label. Code:
return value, opt.get("label") - Line 262: Report no match when the timestamp is not listed. Code:
return None, None - Line 263: Accept the timestamp unvalidated when no expirations are known. Code:
return value, format_expiration_label(value) - Line 264: Blank line for readability. Code:
<blank> - Line 265: Try to parse the input as a date string. Code:
requested_date = parse_date(raw) - Line 266: Handle an input that parsed as a date. Code:
if requested_date: - Line 267: Loop over the available expirations. Code:
for opt in options: - Line 268: Compare calendar dates. Code:
if opt.get("date") == requested_date: - Line 269: Return the matching timestamp and its label. Code:
return opt.get("value"), opt.get("label") - Line 270: Report no match when the date is not listed. Code:
return None, None - Line 271: Blank line for readability. Code:
<blank> - Line 272: Normalize the input for a label comparison. Code:
normalized = normalize_label(raw) - Line 273: Loop over the available expirations. Code:
for opt in options: - Line 274: Compare normalized labels. Code:
if normalize_label(opt.get("label", "")) == normalized: - Line 275: Return the matching timestamp and its label. Code:
return opt.get("value"), opt.get("label") - Line 276: Blank line for readability. Code:
<blank> - Line 277: Report no match. Code:
return None, None - Line 278: Blank line for readability. Code:
<blank> - Line 279: Blank line for readability. Code:
<blank> - Line 280: Define the wait_for_tables function. Code:
def wait_for_tables(page): - Line 281: Start a try block for error handling. Code:
try: - Line 282: Wait for a table inside the options-list section. Code:
page.wait_for_selector( - Line 283: Select tables inside the options-list section. Code:
"section[data-testid='options-list-table'] table", - Line 284: Allow up to 30 seconds (30000 ms). Code:
timeout=30000, - Line 285: Close the current block or container. Code:
) - Line 286: Handle exceptions for the preceding try block. Code:
except Exception: - Line 287: Fall back to waiting for any table on the page. Code:
page.wait_for_selector("table", timeout=30000) - Line 288: Blank line for readability. Code:
<blank> - Line 289: Poll once per second for up to 30 seconds. Code:
for _ in range(30): # 30 * 1s = 30 seconds - Line 290: Collect option tables from the page. Code:
tables = page.query_selector_all( - Line 291: Select tables inside the options-list section. Code:
"section[data-testid='options-list-table'] table" - Line 292: Close the current block or container. Code:
) - Line 293: Check whether at least two tables are present. Code:
if len(tables) >= 2: - Line 294: Return a value to the caller. Code:
return tables - Line 295: Fall back to any tables on the page. Code:
tables = page.query_selector_all("table") - Line 296: Check whether at least two tables are present. Code:
if len(tables) >= 2: - Line 297: Return a value to the caller. Code:
return tables - Line 298: Wait one second before polling again. Code:
time.sleep(1) - Line 299: Return an empty list if the tables never appear. Code:
return [] - Line 300: Blank line for readability. Code:
<blank> - Line 301: Blank line for readability. Code:
<blank> - Line 302: Define the scrape_yahoo_options function. Code:
def scrape_yahoo_options(symbol, expiration=None): - Line 303: Define the parse_table function. Code:
def parse_table(table_html, side): - Line 304: Conditional branch. Code:
if not table_html: - Line 305: Emit or configure a log message. Code:
app.logger.warning("No %s table HTML for %s", side, symbol) - Line 306: Return a value to the caller. Code:
return [] - Line 307: Blank line for readability. Code:
<blank> - Line 308: Execute the statement as written. Code:
soup = BeautifulSoup(table_html, "html.parser") - Line 309: Blank line for readability. Code:
<blank> - Line 310: Extract header labels from the table. Code:
headers = [th.get_text(strip=True) for th in soup.select("thead th")] - Line 311: Collect table rows for parsing. Code:
rows = soup.select("tbody tr") - Line 312: Blank line for readability. Code:
<blank> - Line 313: Initialize the parsed rows list. Code:
parsed = [] - Line 314: Loop over items. Code:
for r in rows: - Line 315: Collect table cells for the current row. Code:
tds = r.find_all("td") - Line 316: Conditional branch. Code:
if len(tds) != len(headers): - Line 317: Execute the statement as written. Code:
continue - Line 318: Blank line for readability. Code:
<blank> - Line 319: Initialize a row dictionary. Code:
item = {} - Line 320: Loop over items. Code:
for i, c in enumerate(tds): - Line 321: Read the header name for the current column. Code:
key = headers[i] - Line 322: Read or convert the cell value. Code:
val = c.get_text(" ", strip=True) - Line 323: Blank line for readability. Code:
<blank> - Line 324: Comment describing the next block. Code:
# Convert numeric fields - Line 325: Conditional branch. Code:
if key in ["Strike", "Last Price", "Bid", "Ask", "Change"]: - Line 326: Start a try block for error handling. Code:
try: - Line 327: Read or convert the cell value. Code:
val = float(val.replace(",", "")) - Line 328: Handle exceptions for the preceding try block. Code:
except Exception: - Line 329: Read or convert the cell value. Code:
val = None - Line 330: Alternative conditional branch. Code:
elif key in ["Volume", "Open Interest"]: - Line 331: Start a try block for error handling. Code:
try: - Line 332: Read or convert the cell value. Code:
val = int(val.replace(",", "")) - Line 333: Handle exceptions for the preceding try block. Code:
except Exception: - Line 334: Read or convert the cell value. Code:
val = None - Line 335: Alternative conditional branch. Code:
elif val in ["-", ""]: - Line 336: Read or convert the cell value. Code:
val = None - Line 337: Blank line for readability. Code:
<blank> - Line 338: Execute the statement as written. Code:
item[key] = val - Line 339: Blank line for readability. Code:
<blank> - Line 340: Execute the statement as written. Code:
parsed.append(item) - Line 341: Blank line for readability. Code:
<blank> - Line 342: Emit or configure a log message. Code:
app.logger.info("Parsed %d %s rows", len(parsed), side) - Line 343: Return a value to the caller. Code:
return parsed - Line 344: Blank line for readability. Code:
<blank> - Line 345: Define the read_option_chain function. Code:
def read_option_chain(page): - Line 346: Capture the page HTML content. Code:
html = page.content() - Line 347: Try to extract the embedded option chain JSON from the HTML. Code:
option_chain = extract_option_chain_from_html(html) - Line 348: Check whether an option chain was found. Code:
if option_chain: - Line 349: Read the expiration timestamps from the parsed option chain. Code:
expiration_dates = extract_expiration_dates_from_chain(option_chain) - Line 350: Fallback branch. Code:
else: - Line 351: Fall back to a regex search of the raw HTML for expiration timestamps. Code:
expiration_dates = extract_expiration_dates_from_html(html) - Line 352: Return a value to the caller. Code:
return option_chain, expiration_dates - Line 353: Blank line for readability. Code:
<blank> - Line 354: Define the has_expected_expiry function. Code:
def has_expected_expiry(options, expected_code): - Line 355: Conditional branch. Code:
if not expected_code: - Line 356: Return a value to the caller. Code:
return False - Line 357: Loop over items. Code:
for row in options or []: - Line 358: Execute the statement as written. Code:
name = row.get("Contract Name") - Line 359: Conditional branch. Code:
if extract_contract_expiry_code(name) == expected_code: - Line 360: Return a value to the caller. Code:
return True - Line 361: Return a value to the caller. Code:
return False - Line 362: Blank line for readability. Code:
<blank> - Line 363: URL-encode the stock symbol. Code:
encoded = urllib.parse.quote(symbol, safe="") - Line 364: Build the base Yahoo Finance options URL. Code:
base_url = f"https://finance.yahoo.com/quote/{encoded}/options/" - Line 365: Normalize the expiration input string. Code:
requested_expiration = expiration.strip() if expiration else None - Line 366: Conditional branch. Code:
if not requested_expiration: - Line 367: Normalize the expiration input string. Code:
requested_expiration = None - Line 368: Set the URL to load. Code:
url = base_url - Line 369: Blank line for readability. Code:
<blank> - Line 370: Emit or configure a log message. Code:
app.logger.info( - Line 371: Execute the statement as written. Code:
"Starting scrape for symbol=%s expiration=%s url=%s", - Line 372: Execute the statement as written. Code:
symbol, - Line 373: Execute the statement as written. Code:
requested_expiration, - Line 374: Execute the statement as written. Code:
base_url, - Line 375: Close the current block or container. Code:
) - Line 376: Blank line for readability. Code:
<blank> - Line 377: Reserve storage for options table HTML. Code:
calls_html = None - Line 378: Reserve storage for options table HTML. Code:
puts_html = None - Line 379: Initialize the list of parsed call rows. Code:
calls_full = [] - Line 380: Initialize the list of parsed put rows. Code:
puts_full = [] - Line 381: Initialize or assign the current price. Code:
price = None - Line 382: Track the resolved expiration metadata. Code:
selected_expiration_value = None - Line 383: Track the resolved expiration metadata. Code:
selected_expiration_label = None - Line 384: Prepare or update the list of available expirations. Code:
expiration_options = [] - Line 385: Track the resolved expiration epoch timestamp. Code:
target_date = None - Line 386: Track whether a base-page lookup is needed. Code:
fallback_to_base = False - Line 387: Blank line for readability. Code:
<blank> - Line 388: Enter a context manager block. Code:
with sync_playwright() as p: - Line 389: Launch a Playwright browser instance. Code:
browser = p.chromium.launch(headless=True) - Line 390: Create a new Playwright page. Code:
page = browser.new_page() - Line 391: Interact with the Playwright page. Code:
page.set_extra_http_headers( - Line 392: Execute the statement as written. Code:
{ - Line 393: Execute the statement as written. Code:
"User-Agent": ( - Line 394: Execute the statement as written. Code:
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) " - Line 395: Execute the statement as written. Code:
"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36" - Line 396: Close the current block or container. Code:
) - Line 397: Close the current block or container. Code:
} - Line 398: Close the current block or container. Code:
) - Line 399: Interact with the Playwright page. Code:
page.set_default_timeout(60000) - Line 400: Blank line for readability. Code:
<blank> - Line 401: Start a try block for error handling. Code:
try: - Line 402: Conditional branch. Code:
if requested_expiration: - Line 403: Conditional branch. Code:
if requested_expiration.isdigit(): - Line 404: Track the resolved expiration epoch timestamp. Code:
target_date = int(requested_expiration) - Line 405: Track the resolved expiration metadata. Code:
selected_expiration_value = target_date - Line 406: Track the resolved expiration metadata. Code:
selected_expiration_label = format_expiration_label(target_date) - Line 407: Fallback branch. Code:
else: - Line 408: Execute the statement as written. Code:
parsed_date = parse_date(requested_expiration) - Line 409: Conditional branch. Code:
if parsed_date: - Line 410: Track the resolved expiration epoch timestamp. Code:
target_date = int( - Line 411: Execute the statement as written. Code:
datetime( - Line 412: Execute the statement as written. Code:
parsed_date.year, - Line 413: Execute the statement as written. Code:
parsed_date.month, - Line 414: Execute the statement as written. Code:
parsed_date.day, - Line 415: Execute the statement as written. Code:
tzinfo=timezone.utc, - Line 416: Execute the statement as written. Code:
).timestamp() - Line 417: Close the current block or container. Code:
) - Line 418: Track the resolved expiration metadata. Code:
selected_expiration_value = target_date - Line 419: Track the resolved expiration metadata. Code:
selected_expiration_label = format_expiration_label(target_date) - Line 420: Fallback branch. Code:
else: - Line 421: Track whether a base-page lookup is needed. Code:
fallback_to_base = True - Line 422: Blank line for readability. Code:
<blank> - Line 423: Conditional branch. Code:
if target_date: - Line 424: Set the URL to load. Code:
url = f"{base_url}?date={target_date}" - Line 425: Blank line for readability. Code:
<blank> - Line 426: Navigate the Playwright page to the target URL. Code:
page.goto(url, wait_until="domcontentloaded", timeout=60000) - Line 427: Emit or configure a log message. Code:
app.logger.info("Page loaded (domcontentloaded) for %s", symbol) - Line 428: Blank line for readability. Code:
<blank> - Line 429: Execute the statement as written. Code:
option_chain, expiration_dates = read_option_chain(page) - Line 430: Emit or configure a log message. Code:
app.logger.info("Option chain found: %s", bool(option_chain)) - Line 431: Prepare or update the list of available expirations. Code:
expiration_options = build_expiration_options(expiration_dates) - Line 432: Blank line for readability. Code:
<blank> - Line 433: If the expiration must be resolved against the base page. Code:
if fallback_to_base: - Line 434: Resolve the requested expiration against the available options. Code:
resolved_value, resolved_label = resolve_expiration( - Line 435: Pass the requested expiration and the available options. Code:
requested_expiration, expiration_options - Line 436: Close the current block or container. Code:
) - Line 437: Check whether resolution failed. Code:
if resolved_value is None: - Line 438: Return an error payload to the caller. Code:
return { - Line 439: Set the error message. Code:
"error": "Requested expiration not available", - Line 440: Include the stock symbol. Code:
"stock": symbol, - Line 441: Echo the requested expiration. Code:
"requested_expiration": requested_expiration, - Line 442: Start the list of available expirations. Code:
"available_expirations": [ - Line 443: Emit each option's label and value. Code:
{"label": opt.get("label"), "value": opt.get("value")} - Line 444: Loop over items. Code:
for opt in expiration_options - Line 445: Close the current block or container. Code:
], - Line 446: Close the current block or container. Code:
} - Line 447: Blank line for readability. Code:
<blank> - Line 448: Track the resolved expiration epoch timestamp. Code:
target_date = resolved_value - Line 449: Track the resolved expiration metadata. Code:
selected_expiration_value = resolved_value - Line 450: Track the resolved expiration metadata. Code:
selected_expiration_label = resolved_label or format_expiration_label( - Line 451: Pass the resolved value to the label formatter (used only when no label was resolved). Code:
resolved_value - Line 452: Close the current block or container. Code:
) - Line 453: Set the URL to load. Code:
url = f"{base_url}?date={resolved_value}" - Line 454: Navigate the Playwright page to the target URL. Code:
page.goto(url, wait_until="domcontentloaded", timeout=60000) - Line 455: Emit or configure a log message. Code:
app.logger.info("Page loaded (domcontentloaded) for %s", symbol) - Line 456: Blank line for readability. Code:
<blank> - Line 457: Re-read the option chain from the expiration-specific page. Code:
option_chain, expiration_dates = read_option_chain(page) - Line 458: Prepare or update the list of available expirations. Code:
expiration_options = build_expiration_options(expiration_dates) - Line 459: Blank line for readability. Code:
<blank> - Line 460: If a target date is set and expirations are available, confirm the target is listed. Code:
if target_date and expiration_options: - Line 461: Initialize the match to None. Code:
matched = None - Line 462: Loop over the available expirations. Code:
for opt in expiration_options: - Line 463: Check whether this expiration equals the target date. Code:
if opt.get("value") == target_date: - Line 464: Record the matching expiration. Code:
matched = opt - Line 465: Stop at the first match. Code:
break - Line 466: Check whether no expiration matched. Code:
if not matched: - Line 467: Return an error payload to the caller. Code:
return { - Line 468: Set the error message. Code:
"error": "Requested expiration not available", - Line 469: Include the stock symbol. Code:
"stock": symbol, - Line 470: Echo the requested expiration. Code:
"requested_expiration": requested_expiration, - Line 471: Start the list of available expirations. Code:
"available_expirations": [ - Line 472: Emit each option's label and value. Code:
{"label": opt.get("label"), "value": opt.get("value")} - Line 473: Loop over items. Code:
for opt in expiration_options - Line 474: Close the current block or container. Code:
], - Line 475: Close the current block or container. Code:
} - Line 476: Track the resolved expiration metadata. Code:
selected_expiration_value = matched.get("value") - Line 477: Track the resolved expiration metadata. Code:
selected_expiration_label = matched.get("label") - Line 478: Otherwise, if no target date was set, default to the first available expiration. Code:
elif expiration_options and not target_date: - Line 479: Track the resolved expiration metadata. Code:
selected_expiration_value = expiration_options[0].get("value") - Line 480: Track the resolved expiration metadata. Code:
selected_expiration_label = expiration_options[0].get("label") - Line 481: Blank line for readability. Code:
<blank> - Line 482: Build call and put rows from the option chain data. Code:
calls_full, puts_full = build_rows_from_chain(option_chain) - Line 483: Log the number of call and put rows. Code:
app.logger.info( - Line 484: Log message format string. Code:
"Option chain rows: calls=%d puts=%d", - Line 485: Number of call rows. Code:
len(calls_full), - Line 486: Number of put rows. Code:
len(puts_full), - Line 487: Close the current block or container. Code:
) - Line 488: Blank line for readability. Code:
<blank> - Line 489: If the chain data produced no rows, fall back to the rendered HTML tables. Code:
if not calls_full and not puts_full: - Line 490: Emit or configure a log message. Code:
app.logger.info("Waiting for options tables...") - Line 491: Blank line for readability. Code:
<blank> - Line 492: Collect option tables from the page. Code:
tables = wait_for_tables(page) - Line 493: Check whether fewer than two tables were found. Code:
if len(tables) < 2: - Line 494: Log an error about the missing tables. Code:
app.logger.error( - Line 495: Log message format string. Code:
"Only %d tables found; expected 2. HTML may have changed.", - Line 496: Number of tables found. Code:
len(tables), - Line 497: Close the current block or container. Code:
) - Line 498: Return a value to the caller. Code:
return {"error": "Could not locate options tables", "stock": symbol} - Line 499: Blank line for readability. Code:
<blank> - Line 500: Emit or configure a log message. Code:
app.logger.info("Found %d tables. Extracting Calls & Puts.", len(tables)) - Line 501: Blank line for readability. Code:
<blank> - Line 502: Capture the outer HTML of the first table (calls). Code:
calls_html = tables[0].evaluate("el => el.outerHTML") - Line 503: Capture the outer HTML of the second table (puts). Code:
puts_html = tables[1].evaluate("el => el.outerHTML") - Line 504: Blank line for readability. Code:
<blank> - Line 505: Comment describing the next block. Code:
# --- Extract current price --- - Line 506: Start a try block for error handling. Code:
try: - Line 507: Comment describing the next block. Code:
# Primary selector - Line 508: Read the current price text from the page. Code:
price_text = page.locator( - Line 509: CSS selector for the regularMarketPrice element. Code:
"fin-streamer[data-field='regularMarketPrice']" - Line 510: Read the element's inner text. Code:
).inner_text() - Line 511: Strip thousands separators and convert the price to a float. Code:
price = float(price_text.replace(",", "")) - Line 512: Handle exceptions for the preceding try block. Code:
except Exception: - Line 513: Start a try block for error handling. Code:
try: - Line 514: Comment describing the next block. Code:
# Fallback - Line 515: Read the current price text from the page. Code:
price_text = page.locator("span[data-testid='qsp-price']").inner_text() - Line 516: Initialize or assign the current price. Code:
price = float(price_text.replace(",", "")) - Line 517: Handle exceptions for the preceding try block. Code:
except Exception as e: - Line 518: Emit or configure a log message. Code:
app.logger.warning("Failed to extract price for %s: %s", symbol, e) - Line 519: Blank line for readability. Code:
<blank> - Line 520: Emit or configure a log message. Code:
app.logger.info("Current price for %s = %s", symbol, price) - Line 521: Start the cleanup block, which runs whether or not an error occurred. Code:
finally: - Line 522: Close the browser. Code:
browser.close() - Line 523: Blank line for readability. Code:
<blank> - Line 524: If the chain data produced no rows but table HTML was captured, parse the tables instead. Code:
if not calls_full and not puts_full and calls_html and puts_html: - Line 525: Parse the calls table. Code:
calls_full = parse_table(calls_html, "calls") - Line 526: Parse the puts table. Code:
puts_full = parse_table(puts_html, "puts") - Line 527: Blank line for readability. Code:
<blank> - Line 528: Compute the expiry code expected for the target date. Code:
expected_code = expected_expiry_code(target_date) - Line 529: Validate the rows only when an expected code is available. Code:
if expected_code: - Line 530: Check whether neither calls nor puts contain the expected expiry code. Code:
if not has_expected_expiry(calls_full, expected_code) and not has_expected_expiry( - Line 531: Arguments for the puts check. Code:
puts_full, expected_code - Line 532: Close the current block or container. Code:
): - Line 533: Return an error payload to the caller. Code:
return { - Line 534: Set the error message. Code:
"error": "Options chain does not match requested expiration", - Line 535: Include the stock symbol. Code:
"stock": symbol, - Line 536: Echo the requested expiration. Code:
"requested_expiration": requested_expiration, - Line 537: Include the expected expiry code. Code:
"expected_expiration_code": expected_code, - Line 538: Start the selected expiration object. Code:
"selected_expiration": { - Line 539: Selected expiration value. Code:
"value": selected_expiration_value, - Line 540: Selected expiration label. Code:
"label": selected_expiration_label, - Line 541: Close the current block or container. Code:
}, - Line 542: Close the current block or container. Code:
} - Line 543: Blank line for readability. Code:
<blank> - Line 544: Comment describing the next block. Code:
# ---------------------------------------------------------------------- - Line 545: Comment describing the next block. Code:
# Pruning logic - Line 546: Comment describing the next block. Code:
# ---------------------------------------------------------------------- - Line 547: Define the prune_nearest function. Code:
def prune_nearest(options, price_value, limit=26, side=""): - Line 548: Skip pruning when no price is available. Code:
if price_value is None: - Line 549: Return the options unchanged with a pruned count of 0. Code:
return options, 0 - Line 550: Blank line for readability. Code:
<blank> - Line 551: Filter options to numeric strike entries. Code:
numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))] - Line 552: Blank line for readability. Code:
<blank> - Line 553: Check whether the numeric entries already fit within the limit. Code:
if len(numeric) <= limit: - Line 554: Return the numeric entries with a pruned count of 0. Code:
return numeric, 0 - Line 555: Blank line for readability. Code:
<blank> - Line 556: Sort options by distance to current price. Code:
sorted_opts = sorted(numeric, key=lambda x: abs(x["Strike"] - price_value)) - Line 557: Keep the closest strike entries, up to the limit. Code:
pruned = sorted_opts[:limit] - Line 558: Compute how many rows were dropped from the original list, including any non-numeric strikes. Code:
pruned_count = len(options) - len(pruned) - Line 559: Return the pruned list and the pruned count. Code:
return pruned, pruned_count - Line 560: Blank line for readability. Code:
<blank> - Line 561: Apply pruning to calls. Code:
calls, pruned_calls = prune_nearest(calls_full, price, side="calls") - Line 562: Apply pruning to puts. Code:
puts, pruned_puts = prune_nearest(puts_full, price, side="puts") - Line 563: Blank line for readability. Code:
<blank> - Line 564: Define the strike_range function. Code:
def strike_range(opts): - Line 565: Collect strike prices from the option list. Code:
strikes = [o["Strike"] for o in opts if isinstance(o.get("Strike"), (int, float))] - Line 566: Return [min, max] of the strikes, or [None, None] when there are none. Code:
return [min(strikes), max(strikes)] if strikes else [None, None] - Line 567: Blank line for readability. Code:
<blank> - Line 568: Return the success payload to the caller. Code:
return { - Line 569: Stock symbol. Code:
"stock": symbol, - Line 570: URL that was loaded. Code:
"url": url, - Line 571: Expiration as requested by the caller. Code:
"requested_expiration": requested_expiration, - Line 572: Start the selected expiration object. Code:
"selected_expiration": { - Line 573: Selected expiration value. Code:
"value": selected_expiration_value, - Line 574: Selected expiration label. Code:
"label": selected_expiration_label, - Line 575: Close the current block or container. Code:
}, - Line 576: Current price. Code:
"current_price": price, - Line 577: Calls kept after pruning. Code:
"calls": calls, - Line 578: Puts kept after pruning. Code:
"puts": puts, - Line 579: Minimum and maximum strike among the returned calls. Code:
"calls_strike_range": strike_range(calls), - Line 580: Minimum and maximum strike among the returned puts. Code:
"puts_strike_range": strike_range(puts), - Line 581: Number of calls returned after pruning. Code:
"total_calls": len(calls), - Line 582: Number of puts returned after pruning. Code:
"total_puts": len(puts), - Line 583: Number of calls removed by pruning. Code:
"pruned_calls_count": pruned_calls, - Line 584: Number of puts removed by pruning. Code:
"pruned_puts_count": pruned_puts, - Line 585: Close the current block or container. Code:
} - Line 586: Blank line for readability. Code:
<blank> - Line 587: Blank line for readability. Code:
<blank> - Line 588: Attach the route decorator to the handler. Code:
@app.route("/scrape_sync") - Line 589: Define the scrape_sync function. Code:
def scrape_sync(): - Line 590: Read the stock query parameter, defaulting to MSFT. Code:
symbol = request.args.get("stock", "MSFT") - Line 591: Read the expiration from the first of three accepted query parameters. Code:
expiration = ( - Line 592: Prefer the expiration parameter. Code:
request.args.get("expiration") - Line 593: Otherwise fall back to expiry. Code:
or request.args.get("expiry") - Line 594: Otherwise fall back to date. Code:
or request.args.get("date") - Line 595: Close the current block or container. Code:
) - Line 596: Emit or configure a log message. Code:
app.logger.info( - Line 597: Log message format string. Code:
"Received /scrape_sync request for symbol=%s expiration=%s", - Line 598: Symbol argument for the log message. Code:
symbol, - Line 599: Expiration argument for the log message. Code:
expiration, - Line 600: Close the current block or container. Code:
) - Line 601: Run the scraper and return its result as JSON. Code:
return jsonify(scrape_yahoo_options(symbol, expiration)) - Line 602: Blank line for readability. Code:
<blank> - Line 603: Blank line for readability. Code:
<blank> - Line 604: Run the server only when the file is executed as a script. Code:
if __name__ == "__main__": - Line 605: Run the Flask development server on all interfaces, port 9777. Code:
app.run(host="0.0.0.0", port=9777)
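
Pruning example (illustrative sketch)
The pruning logic described in lines 547-566 can be tried in isolation. The sketch below restates prune_nearest and strike_range as standalone functions, under the assumption that each option row is a dict with a "Strike" key. The sample rows and the price of 101.0 are hypothetical, and the side argument is omitted because the lines shown above never read it.

```python
def prune_nearest(options, price_value, limit=26):
    """Keep the `limit` rows whose numeric strikes are closest to price_value."""
    if price_value is None:
        # No price to measure distance against, so nothing is pruned.
        return options, 0

    # Rows with non-numeric strikes (for example "-") are dropped here.
    numeric = [o for o in options if isinstance(o.get("Strike"), (int, float))]
    if len(numeric) <= limit:
        return numeric, 0

    sorted_opts = sorted(numeric, key=lambda x: abs(x["Strike"] - price_value))
    pruned = sorted_opts[:limit]
    # The count is taken against the original list, so it also includes
    # any non-numeric rows that were filtered out above.
    return pruned, len(options) - len(pruned)


def strike_range(opts):
    """Return [min, max] of the numeric strikes, or [None, None] if empty."""
    strikes = [o["Strike"] for o in opts if isinstance(o.get("Strike"), (int, float))]
    return [min(strikes), max(strikes)] if strikes else [None, None]


# Hypothetical sample data: six numeric strikes plus one non-numeric row.
sample = [{"Strike": s} for s in (90, 95, 100, 105, 110, 115)] + [{"Strike": "-"}]

kept, dropped = prune_nearest(sample, price_value=101.0, limit=3)
print(sorted(o["Strike"] for o in kept))  # [95, 100, 105]
print(dropped)                            # 4
print(strike_range(kept))                 # [95, 105]
```

Note that when the numeric rows already fit within the limit, the function reports a pruned count of 0 even if non-numeric rows were filtered out, so callers that need an exact count may want to compare list lengths themselves.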