- Fix incorrect API method names (enterprise_value, ttm_ebitda, annual_eps_yoy_growth don't exist)
- Add Decimal TypeError common issue section
- Add Statement object access pattern (must call .df() not property)
- Add comprehensive API reference with verified method names
- Add reference to defeatbeta-mapping.org for yfinance migration
- Add notebook runner instructions
- Add direnv note for .envrc with layout uv
- Remove reference to non-existent references/defeatbeta-api.org
AGENTS.md — Trading Research Environment
Instructions for AI agents working in this project.
Project Purpose
Quantitative trading research and learning environment. Primary data source is defeatbeta-api — a Python library that queries Yahoo Finance data stored as Parquet files on HuggingFace, accessed via DuckDB.
Using defeatbeta-api
Persistent cache (required for all scripts/notebooks)
The library's default cache lives in /tmp and is wiped on every reboot.
A pre-warmed persistent cache lives at ~/.cache/defeatbeta/.
Every script or notebook must start with:
from persistent_cache import enable_persistent_cache
enable_persistent_cache()
from defeatbeta_api.data.ticker import Ticker
from defeatbeta_api.data.tickers import Tickers
This redirects DuckDB's httpfs block cache to ~/.cache/defeatbeta/ so queries
are served from disk after the first warm-up, with no repeated network fetches.
Cache warmup (first run or after clearing)
Run once to download all 15 parquet tables (~3-4 GB):
uv run python warmup_cache.py
Resumes if interrupted. Progress recorded in ~/.cache/defeatbeta/warmup_done.json.
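To check how far a warm-up has progressed, the progress file can be inspected directly. The following is a minimal sketch: the helper name read_warmup_progress is hypothetical, and because the schema of warmup_done.json is not documented here, it only assumes the file is valid JSON and returns whatever it contains.

```python
import json
from pathlib import Path

# Path taken from the text above; the file's schema is an unknown here.
WARMUP_FILE = Path.home() / ".cache" / "defeatbeta" / "warmup_done.json"


def read_warmup_progress(path: Path = WARMUP_FILE):
    """Return the parsed progress file, or None if it is missing or unreadable."""
    try:
        return json.loads(path.read_text())
    except (OSError, ValueError):
        # OSError: file missing or unreadable; ValueError: not valid JSON.
        return None


if __name__ == "__main__":
    progress = read_warmup_progress()
    if progress is None:
        print("No warm-up progress found; run: uv run python warmup_cache.py")
    else:
        print(json.dumps(progress, indent=2))
```

A None result simply means the warm-up has not recorded anything yet, so rerunning warmup_cache.py is the safe next step.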
Fully offline mode
uv run python download_data.py # downloads to data/parquet/
from offline import enable_offline
enable_offline("data/parquet")
from defeatbeta_api.data.ticker import Ticker
Cache vs Offline
| | persistent_cache.py | offline.py |
|---|---|---|
| Network on first use | Yes (warms cache) | No (pre-downloaded) |
| Persists across reboots | Yes | Yes |
| Data freshness | Auto-invalidated | Manual re-download |
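A script that should work in either mode could pick one at startup. This is only a sketch under stated assumptions: pick_mode and enable_data_source are hypothetical helpers, and the check assumes download_data.py writes files ending in .parquet directly under data/parquet/, which is not confirmed here. The enable_offline and enable_persistent_cache calls are the project functions described above.

```python
from pathlib import Path


def pick_mode(parquet_dir: str = "data/parquet") -> str:
    """Return "offline" if local parquet files exist, else "cache"."""
    directory = Path(parquet_dir)
    # Assumption: downloaded tables sit directly in this directory as *.parquet.
    has_parquet = directory.is_dir() and any(directory.glob("*.parquet"))
    return "offline" if has_parquet else "cache"


def enable_data_source(parquet_dir: str = "data/parquet") -> str:
    """Enable the selected mode; call this before importing Ticker."""
    mode = pick_mode(parquet_dir)
    if mode == "offline":
        from offline import enable_offline  # project module

        enable_offline(parquet_dir)
    else:
        from persistent_cache import enable_persistent_cache  # project module

        enable_persistent_cache()
    return mode
```

The project imports are deferred into the function so that pick_mode can be used on its own, for example in a quick environment check.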
Key Scripts
| File | Purpose |
|---|---|
| persistent_cache.py | Redirect cache to ~/.cache/defeatbeta/ |
| offline.py | Patch library for local parquet files |
| warmup_cache.py | Download all tables (resumable) |
| download_data.py | Download to data/parquet/ |
Common Issues
Decimal TypeError
DefeatBeta returns Decimal types, not float. Always convert before arithmetic:
# Wrong - raises TypeError
value / 1e9
# Correct
float(value) / 1e9
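The failure can be reproduced with the standard library alone. In this sketch the number is an illustrative stand-in, not real data, and the second option (staying in Decimal) is an alternative that this document does not otherwise prescribe.

```python
from decimal import Decimal

# Stand-in for a value returned by the library (illustrative number only).
value = Decimal("2500000000")

# Mixing Decimal with float raises TypeError.
try:
    value / 1e9
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Option 1: convert to float first (fine for display and plotting).
as_float = float(value) / 1e9

# Option 2: stay in Decimal when exact decimal arithmetic matters.
as_decimal = value / Decimal("1000000000")

print(mixed_ok, as_float, as_decimal)
```

Dividing a Decimal by an int also works, so the error only appears once a float such as 1e9 enters the expression.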
Statement Object Access
Financial statements return Statement objects, not DataFrames directly:
stmt = t.quarterly_income_statement()
df = stmt.df() # Get as DataFrame
df = stmt.data() # Alternative access
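Because some methods return DataFrames and others return Statement objects, a small normalizing helper can keep calling code uniform. This is a sketch: as_dataframe is a hypothetical helper, and FakeStatement is a stand-in used only so the example runs without the library installed.

```python
def as_dataframe(result):
    """Return result.df() for Statement-like objects, else result unchanged."""
    df_method = getattr(result, "df", None)
    return df_method() if callable(df_method) else result


class FakeStatement:
    """Hypothetical stand-in for a Statement; not part of defeatbeta-api."""

    def df(self):
        return "frame"  # a real Statement would return a DataFrame here


print(as_dataframe(FakeStatement()))    # goes through .df()
print(as_dataframe("already a frame"))  # passed through unchanged
```

With the real library, the same helper would be called as as_dataframe(t.quarterly_income_statement()).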
defeatbeta API Quick Reference
Comprehensive mapping: docs/defeatbeta_mapping.org
API docs: references/defeatbeta-api/doc/README.md
t = Ticker("AAPL")
# Info
t.info() # company profile (DataFrame)
t.news() # News object
t.earning_call_transcripts() # Transcripts object
t.sec_filing() # SEC filings
# Price & basic finance
t.price() # historical OHLCV (DataFrame)
t.beta() # 5Y monthly beta
t.dividends() # dividend history
t.splits() # stock split history
# Financial statements (return Statement objects)
t.quarterly_income_statement().df()
t.annual_balance_sheet().df()
t.quarterly_cash_flow().df()
# TTM metrics (all return DataFrames)
t.ttm_eps() # trailing EPS
t.ttm_pe() # trailing P/E
t.ttm_revenue() # trailing revenue
t.ttm_fcf() # trailing free cash flow
# Valuation
t.market_capitalization() # historical market cap
t.ps_ratio() # price/sales ratio
t.pb_ratio() # price/book ratio
t.peg_ratio() # PEG ratio
t.wacc() # weighted avg cost of capital
# Profitability (historical DataFrames)
t.roe() # return on equity
t.roa() # return on assets
t.roic() # return on invested capital
t.quarterly_gross_margin()
t.quarterly_net_margin()
t.quarterly_operating_margin()
t.quarterly_ebitda_margin()
# Growth (YoY growth rates)
t.quarterly_revenue_yoy_growth()
t.quarterly_eps_yoy_growth()
t.annual_revenue_yoy_growth()
t.quarterly_ebitda_yoy_growth()
# Revenue breakdown (unique to DefeatBeta!)
t.revenue_by_segment() # by product segment
t.revenue_by_geography() # by region
t.revenue_by_product() # detailed product
# Earnings transcripts (unique to DefeatBeta!)
transcripts = t.earning_call_transcripts()
transcripts.get_transcripts_list() # list all quarters
transcripts.get_transcript(2025, 4) # specific quarter
# DCF valuation (unique to DefeatBeta!)
result = t.dcf() # returns dict with Excel file path
# Multi-ticker (parallel queries)
ts = Tickers(["AAPL", "NVDA", "MSFT"])
ts.info()
ts.annual_income_statement() # {"AAPL": Statement, ...}
Available Ticker Symbols
from defeatbeta_api.data.company_meta import CompanyMeta
meta = CompanyMeta()
symbols = meta.get_all_tickers() # List[str]
companies = meta.get_all_companies_info() # symbol, name, cik, currency
Interactive Notebook
Run the tutorial notebook:
./run_notebook.sh
# or
uv run jupyter notebook defeatbeta_tutorial.ipynb
Environment
- Python managed via uv; always run scripts with uv run python <script>
- Notebook runners: marimo or jupyterlab
- Dependencies in pyproject.toml
- .envrc uses direnv with layout uv for automatic venv activation