docs: update AGENTS.md with verified API reference and common issues

- Fix incorrect API method names (enterprise_value, ttm_ebitda, annual_eps_yoy_growth don't exist)
- Add Decimal TypeError common issue section
- Add Statement object access pattern (must call .df() not property)
- Add comprehensive API reference with verified method names
- Add reference to defeatbeta-mapping.org for yfinance migration
- Add notebook runner instructions
- Add direnv note for .envrc with layout uv
- Remove reference to non-existent references/defeatbeta-api.org
2026-04-26 02:15:17 +08:00
parent 4e5af95272
commit b71a8e77b0
+188
@@ -0,0 +1,188 @@
# AGENTS.md — Trading Research Environment
Instructions for AI agents working in this project.
## Project Purpose
This is a quantitative trading research and learning environment. The primary data
source is **defeatbeta-api**, a Python library that queries Yahoo Finance data stored
as Parquet files on HuggingFace, accessed via DuckDB.
## Using defeatbeta-api
### Persistent cache (required for all scripts/notebooks)
The library's default cache lives in `/tmp` and is wiped on every reboot.
A pre-warmed persistent cache lives at `~/.cache/defeatbeta/`.
**Every script or notebook must start with:**
```python
from persistent_cache import enable_persistent_cache
enable_persistent_cache()
from defeatbeta_api.data.ticker import Ticker
from defeatbeta_api.data.tickers import Tickers
```
This redirects DuckDB's httpfs block cache to `~/.cache/defeatbeta/` so queries
are served from disk after the first warm-up, with no repeated network fetches.
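To confirm that the warm cache is actually on disk, one option is to measure the cache directory. The helper below is a minimal sketch using only the standard library. `cache_size_bytes` is a hypothetical name, not part of `persistent_cache.py` or defeatbeta-api, and it makes no assumption about how files are laid out inside the directory.

```python
from pathlib import Path


def cache_size_bytes(cache_dir: Path) -> int:
    """Return the total size of all regular files under cache_dir (0 if missing)."""
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    cache_dir = Path.home() / ".cache" / "defeatbeta"
    print(f"{cache_dir}: {cache_size_bytes(cache_dir) / 1e9:.2f} GB")
```

A size near zero after running a few queries would suggest the redirect was not applied before the library was imported.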
### Cache warmup (first run or after clearing)
Run once to download all 15 parquet tables (~3-4 GB):
```bash
uv run python warmup_cache.py
```
The script resumes where it left off if interrupted. Progress is recorded in `~/.cache/defeatbeta/warmup_done.json`.
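The progress file can be inspected without rerunning the warmup. The following is a rough sketch: the schema of `warmup_done.json` is not documented here, so the code only parses the file and prints it as-is. `read_warmup_progress` is a hypothetical helper name.

```python
import json
from pathlib import Path


def read_warmup_progress(path: Path):
    """Return the parsed progress file, or None if it is missing or not valid JSON."""
    try:
        return json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None


if __name__ == "__main__":
    progress_file = Path.home() / ".cache" / "defeatbeta" / "warmup_done.json"
    progress = read_warmup_progress(progress_file)
    if progress is None:
        print("No warmup progress recorded yet")
    else:
        # The schema is an unknown here, so show the raw content only.
        print(json.dumps(progress, indent=2))
```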
### Fully offline mode
```bash
uv run python download_data.py # downloads to data/parquet/
```
```python
from offline import enable_offline
enable_offline("data/parquet")
from defeatbeta_api.data.ticker import Ticker
```
### Cache vs Offline
| | `persistent_cache.py` | `offline.py` |
|---|---|---|
| Network on first use | Yes (warms cache) | No (pre-downloaded) |
| Persists across reboots | Yes | Yes |
| Data freshness | Auto-invalidated | Manual re-download |
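A script that should work in either mode could pick one at startup based on whether local Parquet files exist. This is only a sketch: `choose_mode` is a hypothetical helper, and it assumes the downloaded files carry a `.parquet` extension somewhere under the given directory.

```python
from pathlib import Path


def choose_mode(parquet_dir: Path) -> str:
    """Return "offline" if parquet_dir contains any .parquet file, else "cache"."""
    if parquet_dir.is_dir() and any(parquet_dir.rglob("*.parquet")):
        return "offline"
    return "cache"


if __name__ == "__main__":
    print(choose_mode(Path("data/parquet")))
```

On `"offline"` the script would call `enable_offline("data/parquet")`, otherwise `enable_persistent_cache()`, in both cases before importing `Ticker`.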
## Key Scripts
| File | Purpose |
|---|---|
| `persistent_cache.py` | Redirect cache to `~/.cache/defeatbeta/` |
| `offline.py` | Patch library for local parquet files |
| `warmup_cache.py` | Download all tables (resumable) |
| `download_data.py` | Download to `data/parquet/` |
## Common Issues
### Decimal TypeError
DefeatBeta returns numeric values as `decimal.Decimal`, not `float`. Python refuses to mix `Decimal` and `float` in arithmetic, so always convert first:
```python
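from decimal import Decimal

value = Decimal("2500000000")  # illustrative value, not fetched from the API
# Wrong: Decimal / float raises TypeError
# value / 1e9
# Correct: convert to float first
float(value) / 1e9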
```
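The failure and the fix can be reproduced without touching the API. The sketch below uses a made-up `Decimal` figure and a hypothetical `to_float` helper that also passes `None` through, which may be useful for missing values.

```python
from decimal import Decimal


def to_float(value):
    """Convert a Decimal (or other numeric value) to float; pass None through."""
    return None if value is None else float(value)


revenue = Decimal("394328000000")  # illustrative figure, not fetched from the API

try:
    revenue / 1e9  # Decimal / float is not supported
except TypeError as exc:
    print(f"TypeError: {exc}")

# After conversion the division works as ordinary float arithmetic.
print(to_float(revenue) / 1e9)
```

For a whole DataFrame column, `df["some_column"].astype(float)` should have the same effect, assuming the column holds `Decimal` objects (the column name is a placeholder).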
### Statement Object Access
Financial statements return `Statement` objects, not DataFrames directly:
```python
stmt = t.quarterly_income_statement()
df = stmt.df() # Get as DataFrame
df = stmt.data() # Alternative access
```
## defeatbeta API Quick Reference
- Comprehensive mapping: `docs/defeatbeta_mapping.org`
- API docs: `references/defeatbeta-api/doc/README.md`
```python
t = Ticker("AAPL")
# Info
t.info() # company profile (DataFrame)
t.news() # News object
t.earning_call_transcripts() # Transcripts object
t.sec_filing() # SEC filings
# Price & basic finance
t.price() # historical OHLCV (DataFrame)
t.beta() # 5Y monthly beta
t.dividends() # dividend history
t.splits() # stock split history
# Financial statements (return Statement objects)
t.quarterly_income_statement().df()
t.annual_balance_sheet().df()
t.quarterly_cash_flow().df()
# TTM metrics (all return DataFrames)
t.ttm_eps() # trailing EPS
t.ttm_pe() # trailing P/E
t.ttm_revenue() # trailing revenue
t.ttm_fcf() # trailing free cash flow
# Valuation
t.market_capitalization() # historical market cap
t.ps_ratio() # price/sales ratio
t.pb_ratio() # price/book ratio
t.peg_ratio() # PEG ratio
t.wacc() # weighted avg cost of capital
# Profitability (historical DataFrames)
t.roe()
t.roa()
t.roic()
t.quarterly_gross_margin()
t.quarterly_net_margin()
t.quarterly_operating_margin()
t.quarterly_ebitda_margin()
# Growth (YoY growth rates)
t.quarterly_revenue_yoy_growth()
t.quarterly_eps_yoy_growth()
t.annual_revenue_yoy_growth()
t.quarterly_ebitda_yoy_growth()
# Revenue breakdown (unique to DefeatBeta!)
t.revenue_by_segment() # by product segment
t.revenue_by_geography() # by region
t.revenue_by_product() # detailed product
# Earnings transcripts (unique to DefeatBeta!)
transcripts = t.earning_call_transcripts()
transcripts.get_transcripts_list() # list all quarters
transcripts.get_transcript(2025, 4) # specific quarter
# DCF valuation (unique to DefeatBeta!)
result = t.dcf() # returns dict with Excel file path
# Multi-ticker (parallel queries)
ts = Tickers(["AAPL", "NVDA", "MSFT"])
ts.info()
ts.annual_income_statement() # {"AAPL": Statement, ...}
```
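Because the multi-ticker statement calls return a mapping of symbol to `Statement`, each value still needs `.df()` before it can be used as a DataFrame. The sketch below shows that unpacking step with a stand-in class so it runs offline. `statements_to_frames` and `FakeStatement` are hypothetical names, and the stand-in returns plain lists rather than real DataFrames.

```python
def statements_to_frames(statements: dict) -> dict:
    """Call .df() on every Statement in a {symbol: Statement} mapping."""
    return {symbol: stmt.df() for symbol, stmt in statements.items()}


class FakeStatement:
    """Stand-in for the library's Statement so the sketch runs without network access."""

    def __init__(self, rows):
        self._rows = rows

    def df(self):
        return self._rows


frames = statements_to_frames(
    {"AAPL": FakeStatement([1, 2]), "NVDA": FakeStatement([3])}
)
print(sorted(frames))
```

With the real library the call would be `statements_to_frames(ts.annual_income_statement())`, assuming the return value is the mapping shown in the comment above.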
## Available Ticker Symbols
```python
from defeatbeta_api.data.company_meta import CompanyMeta
meta = CompanyMeta()
symbols = meta.get_all_tickers() # List[str]
companies = meta.get_all_companies_info() # symbol, name, cik, currency
```
## Interactive Notebook
Run the tutorial notebook:
```bash
./run_notebook.sh
# or
uv run jupyter notebook defeatbeta_tutorial.ipynb
```
## Environment
- Python managed via `uv` — always run scripts with `uv run python <script>`
- Notebook runners: `marimo` or `jupyterlab`
- Dependencies in `pyproject.toml`
- `.envrc` uses `direnv` with `layout uv` for automatic venv activation