# AGENTS.md: Trading Research Environment

Instructions for AI agents working in this project.

## Project Purpose

Quantitative trading research and learning environment. The primary data source is
**defeatbeta-api**, a Python library that queries Yahoo Finance data stored as
Parquet files on HuggingFace, accessed via DuckDB.

## Using defeatbeta-api

### Persistent cache (required for all scripts/notebooks)

The library's default cache lives in `/tmp` and is wiped on every reboot.
A pre-warmed persistent cache lives at `~/.cache/defeatbeta/`.

**Every script or notebook must start with:**

```python
from persistent_cache import enable_persistent_cache
enable_persistent_cache()

from defeatbeta_api.data.ticker import Ticker
from defeatbeta_api.data.tickers import Tickers
```

This redirects DuckDB's httpfs block cache to `~/.cache/defeatbeta/` so queries
are served from disk after the first warm-up, with no repeated network fetches.

### Cache warmup (first run or after clearing)

Run once to download all 15 Parquet tables (~3-4 GB):

```bash
uv run python warmup_cache.py
```

The script resumes if interrupted. Progress is recorded in `~/.cache/defeatbeta/warmup_done.json`.

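
The resume behaviour described above can be pictured with a minimal sketch. This is not the contents of `warmup_cache.py`: the function name `run_resumable`, the `fetch` callback, and the JSON layout (a plain list of finished table names) are assumptions made purely for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def run_resumable(
    tables: Iterable[str],
    fetch: Callable[[str], None],
    progress_file: Path,
) -> List[str]:
    """Fetch each table once, recording finished names so a rerun skips them.

    Hypothetical helper: the real warmup script may store progress differently.
    """
    done = set(json.loads(progress_file.read_text())) if progress_file.exists() else set()
    fetched = []
    for name in tables:
        if name in done:
            continue  # already finished in an earlier run
        fetch(name)  # may raise; everything finished so far is already on disk
        done.add(name)
        fetched.append(name)
        progress_file.parent.mkdir(parents=True, exist_ok=True)
        progress_file.write_text(json.dumps(sorted(done)))
    return fetched
```

Writing the progress file after every table, rather than once at the end, is what lets an interrupted run pick up where it stopped.
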
### Fully offline mode

Download the Parquet files once:

```bash
uv run python download_data.py   # downloads to data/parquet/
```

Then point the library at the local copy:

```python
from offline import enable_offline
enable_offline("data/parquet")

from defeatbeta_api.data.ticker import Ticker
```

### Cache vs Offline

| Aspect | `persistent_cache.py` | `offline.py` |
|---|---|---|
| Network on first use | Yes (warms cache) | No (pre-downloaded) |
| Persists across reboots | Yes | Yes |
| Data freshness | Auto-invalidated | Manual re-download |

## Key Scripts

| File | Purpose |
|---|---|
| `persistent_cache.py` | Redirect cache to `~/.cache/defeatbeta/` |
| `offline.py` | Patch library for local parquet files |
| `warmup_cache.py` | Download all tables (resumable) |
| `download_data.py` | Download to `data/parquet/` |

## Common Issues

### Decimal TypeError

DefeatBeta returns `Decimal` values, not `float`. Mixing a `Decimal` with a `float` raises `TypeError`, so always convert before arithmetic:

```python
# Wrong: raises TypeError (Decimal / float)
value / 1e9

# Correct
float(value) / 1e9
```

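
A self-contained sketch of the failure and the available fixes, using only the standard library. The figure below is a made-up stand-in for a value returned by the library, not real data.

```python
from decimal import Decimal

# Illustrative value only; stands in for a Decimal returned by the library.
value = Decimal("394328000000")

try:
    value / 1e9  # Decimal and float cannot be mixed
    mixed_ok = True
except TypeError:
    mixed_ok = False

billions_float = float(value) / 1e9      # convert first, then use float arithmetic
billions_exact = value / Decimal("1e9")  # or stay in Decimal for exact arithmetic
billions_int = value / 1_000_000_000     # Decimal and int can be combined directly
```

Converting to `float` is usually the simplest choice for plotting and ratios; staying in `Decimal` avoids binary rounding when exact figures matter.
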
### Statement Object Access

Financial statement methods return `Statement` objects, not DataFrames. Call `.df()` to get a DataFrame:

```python
# t is a Ticker instance, e.g. Ticker("AAPL")
stmt = t.quarterly_income_statement()
df = stmt.df()      # Get as DataFrame
df = stmt.data()    # Alternative access
```

## defeatbeta API Quick Reference

- Comprehensive mapping: `docs/defeatbeta_mapping.org`
- API docs: `references/defeatbeta-api/doc/README.md`

```python
from defeatbeta_api.data.ticker import Ticker
from defeatbeta_api.data.tickers import Tickers

t = Ticker("AAPL")

# Info
t.info()                          # company profile (DataFrame)
t.news()                          # News object
t.earning_call_transcripts()      # Transcripts object
t.sec_filing()                    # SEC filings

# Price & basic finance
t.price()                         # historical OHLCV (DataFrame)
t.beta()                          # 5Y monthly beta
t.dividends()                     # dividend history
t.splits()                        # stock split history

# Financial statements (return Statement objects)
t.quarterly_income_statement().df()
t.annual_balance_sheet().df()
t.quarterly_cash_flow().df()

# TTM metrics (all return DataFrames)
t.ttm_eps()                       # trailing EPS
t.ttm_pe()                        # trailing P/E
t.ttm_revenue()                   # trailing revenue
t.ttm_fcf()                       # trailing free cash flow

# Valuation
t.market_capitalization()         # historical market cap
t.ps_ratio()                      # price/sales ratio
t.pb_ratio()                      # price/book ratio
t.peg_ratio()                     # PEG ratio
t.wacc()                          # weighted avg cost of capital

# Profitability (historical DataFrames)
t.roe()
t.roa()
t.roic()
t.quarterly_gross_margin()
t.quarterly_net_margin()
t.quarterly_operating_margin()
t.quarterly_ebitda_margin()

# Growth (YoY growth rates)
t.quarterly_revenue_yoy_growth()
t.quarterly_eps_yoy_growth()
t.annual_revenue_yoy_growth()
t.quarterly_ebitda_yoy_growth()

# Revenue breakdown (unique to DefeatBeta!)
t.revenue_by_segment()            # by product segment
t.revenue_by_geography()          # by region
t.revenue_by_product()            # detailed product

# Earnings transcripts (unique to DefeatBeta!)
transcripts = t.earning_call_transcripts()
transcripts.get_transcripts_list()          # list all quarters
transcripts.get_transcript(2025, 4)         # specific quarter

# DCF valuation (unique to DefeatBeta!)
result = t.dcf()                  # returns dict with Excel file path

# Multi-ticker (parallel queries)
ts = Tickers(["AAPL", "NVDA", "MSFT"])
ts.info()
ts.annual_income_statement()      # {"AAPL": Statement, ...}
```

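
The multi-ticker statement calls are annotated above as returning a `{symbol: Statement}` mapping. Assuming that shape, a small helper can unwrap every entry with `.df()` in one step. `statements_to_frames` and `FakeStatement` are hypothetical names introduced here; the stand-in class exists only so the sketch runs without the library or a network connection.

```python
from typing import Any, Dict, Mapping


def statements_to_frames(statements: Mapping[str, Any]) -> Dict[str, Any]:
    """Call .df() on every Statement in a {symbol: Statement} mapping."""
    return {symbol: stmt.df() for symbol, stmt in statements.items()}


class FakeStatement:
    """Stand-in for a Statement object (assumption: only .df() is needed)."""

    def __init__(self, rows):
        self._rows = rows

    def df(self):
        return self._rows


frames = statements_to_frames(
    {"AAPL": FakeStatement([1, 2]), "NVDA": FakeStatement([3])}
)
```

With the real library the call would presumably look like `statements_to_frames(ts.annual_income_statement())`, provided the return value is the mapping described in the comment above.
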
## Available Ticker Symbols

```python
from defeatbeta_api.data.company_meta import CompanyMeta

meta = CompanyMeta()
symbols = meta.get_all_tickers()           # List[str]
companies = meta.get_all_companies_info()  # symbol, name, cik, currency
```

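
Checking a watchlist against the symbol list before querying can catch typos early. The sketch below is one way to do that; `split_known` is a hypothetical helper, and it assumes symbols are stored upper-case, which the source does not state.

```python
from typing import Iterable, List, Tuple


def split_known(
    requested: Iterable[str], available: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (known, unknown) symbols, preserving the requested order.

    Assumption: the available symbols are upper-case strings.
    """
    universe = {s.upper() for s in available}
    known, unknown = [], []
    for symbol in requested:
        normalized = symbol.upper()
        (known if normalized in universe else unknown).append(normalized)
    return known, unknown
```

In a script this might be used as `known, unknown = split_known(["AAPL", "NVDA"], meta.get_all_tickers())`.
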
## Interactive Notebook

Run the tutorial notebook:

```bash
./run_notebook.sh
# or
uv run jupyter notebook defeatbeta_tutorial.ipynb
```

## Environment

- Python is managed via `uv`; always run scripts with `uv run python <script>`
- Notebook runners: `marimo` or `jupyterlab`
- Dependencies are declared in `pyproject.toml`
- `.envrc` uses `direnv` with `layout uv` for automatic venv activation