docs: add API references, mapping corrections, and verification script

- Add yfinance.org and defeatbeta-api.org reference docs
- Fix defeatbeta_mapping.org: deprecated yfinance property names (quarterly_financials→quarterly_income_stmt, financials→income_stmt), longName vs longBusinessSummary conceptual mismatch, cashflow note typo
- Add Mapping Limitations section with live verification results (AAPL): DuckDB 1.4.3 incompatibility, format differences, coverage gaps
- Add docs/test_mapping.py as runnable mapping verification script
- Add offline.py, persistent_cache.py, download_data.py, warmup_cache.py for offline/cached defeatbeta usage
- Add aapl_yfinance.py exploration script and quant.py scaffold
- Add .envrc (uv layout) and update pyproject.toml + uv.lock

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,203 @@
#+TITLE: defeatbeta-api Reference
#+AUTHOR: Wong Ding Feng
#+DATE: 2026-04-25

* How Data Retrieval Works

** NOT a full download

The library uses *DuckDB* with the ~cache_httpfs~ extension to query *remote Parquet
files on HuggingFace* (dataset ~defeatbeta/yahoo-finance-data~). Every query runs SQL
directly against the remote files:

#+begin_src sql
SELECT * FROM 'https://huggingface.co/.../stock_prices.parquet' WHERE symbol = 'AAPL'
#+end_src

Parquet's columnar layout plus DuckDB's *predicate pushdown* means that only the row
groups that can contain your ticker are fetched, via HTTP range requests, rather than
the full 3-4 GB file.
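
The pruning idea can be illustrated with a toy model. This is a minimal sketch, not DuckDB's implementation: ~RowGroup~, ~groups_to_fetch~, ~range_header~, and the byte offsets are hypothetical names and numbers invented for illustration. It assumes each row group carries min/max statistics for ~symbol~, which is what lets a reader skip groups without downloading them.

#+begin_src python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Toy stand-in for one Parquet row group and its footer statistics."""
    min_symbol: str   # smallest symbol stored in the group
    max_symbol: str   # largest symbol stored in the group
    byte_offset: int  # where the group starts in the file
    byte_length: int  # how many bytes it occupies


def groups_to_fetch(groups, symbol):
    """Keep only row groups whose [min, max] range could contain `symbol`."""
    return [g for g in groups if g.min_symbol <= symbol <= g.max_symbol]


def range_header(group):
    """Build the HTTP Range header value that would fetch just this group."""
    end = group.byte_offset + group.byte_length - 1
    return f"bytes={group.byte_offset}-{end}"


# Hypothetical layout: a file sorted by symbol and split into three row groups.
groups = [
    RowGroup("A", "AAON", 0, 1_000),
    RowGroup("AAP", "ABT", 1_000, 1_000),
    RowGroup("ABUS", "ZTS", 2_000, 1_000),
]

needed = groups_to_fetch(groups, "AAPL")
headers = [range_header(g) for g in needed]
print(headers)
#+end_src

Only the middle group survives the filter here, so a single range request covers the query. If the file were not clustered by symbol, more groups would overlap the ticker and more bytes would be fetched.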

** On-disk cache

- Default cache size is 1 GB, stored at =~/.defeatbeta/cache/=
- Stores fetched blocks so repeated queries are fast
- On startup, checks ~spec.json~ on HuggingFace and clears the cache if the dataset has been updated
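
The startup check can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: ~refresh_cache~ is a hypothetical helper, the ~update_time~ key is an invented field name (the real layout of ~spec.json~ is not documented here), and the dates are placeholders. The remote spec is passed in as a dict so the sketch runs without network access.

#+begin_src python
import json
import shutil
import tempfile
from pathlib import Path


def refresh_cache(cache_dir: Path, remote_spec: dict) -> bool:
    """Clear `cache_dir` when the remote spec differs from the saved copy.

    Returns True if the cache was cleared. "update_time" is a hypothetical
    field name, not taken from the real spec.json.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    marker = cache_dir / "spec.json"
    local_spec = json.loads(marker.read_text()) if marker.exists() else {}
    stale = local_spec.get("update_time") != remote_spec.get("update_time")
    if stale:
        shutil.rmtree(cache_dir)          # drop every cached block
        cache_dir.mkdir(parents=True)
    marker.write_text(json.dumps(remote_spec))  # remember what we saw
    return stale


demo_dir = Path(tempfile.mkdtemp()) / "cache"
first = refresh_cache(demo_dir, {"update_time": "2026-04-01"})   # no marker yet
(demo_dir / "block.bin").write_bytes(b"cached bytes")
second = refresh_cache(demo_dir, {"update_time": "2026-04-01"})  # unchanged
kept = (demo_dir / "block.bin").exists()
third = refresh_cache(demo_dir, {"update_time": "2026-04-20"})   # dataset updated
gone = not (demo_dir / "block.bin").exists()
print(first, second, kept, third, gone)
#+end_src

Clearing the whole directory is the simplest policy: it trades some re-downloading after each dataset update for never serving blocks from an older snapshot.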

* Getting All Available Tickers

#+begin_src python
from defeatbeta_api.data.company_meta import CompanyMeta

meta = CompanyMeta()
all_tickers = meta.get_all_tickers()           # List[str]
all_companies = meta.get_all_companies_info()  # List[dict]: symbol, name, cik, currency
#+end_src

This reads ~company_tickers.json~ from HuggingFace, a small JSON file, not the large Parquet files.
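
Once the company list is in memory it is plain Python data, so filtering needs no further queries. A small sketch, assuming each dict has the ~symbol~ and ~currency~ keys listed above; ~symbols_by_currency~ is a hypothetical helper and the rows are made-up placeholders, not real dataset contents.

#+begin_src python
from collections import defaultdict
from typing import Dict, List


def symbols_by_currency(companies: List[dict]) -> Dict[str, List[str]]:
    """Group ticker symbols by their reporting currency."""
    grouped = defaultdict(list)
    for company in companies:
        grouped[company["currency"]].append(company["symbol"])
    return dict(grouped)


# Placeholder rows shaped like the documented keys (values are invented).
sample = [
    {"symbol": "AAA", "name": "Example A", "cik": "0000000001", "currency": "USD"},
    {"symbol": "BBB", "name": "Example B", "cik": "0000000002", "currency": "EUR"},
    {"symbol": "CCC", "name": "Example C", "cik": "0000000003", "currency": "USD"},
]

by_currency = symbols_by_currency(sample)
print(by_currency)
#+end_src

With the real API, ~sample~ would be replaced by the result of ~meta.get_all_companies_info()~.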

* Single Ticker API: ~Ticker("AAPL")~

#+begin_src python
from defeatbeta_api.data.ticker import Ticker
t = Ticker("AAPL")
#+end_src

** Company Info

| Method                       | Returns       | What it gives                                           |
|------------------------------+---------------+---------------------------------------------------------|
| ~info()~                     | DataFrame     | Profile: name, sector, industry, description, headcount |
| ~officers()~                 | DataFrame     | Executive officers                                      |
| ~sec_filing()~               | DataFrame     | SEC filings list                                        |
| ~news()~                     | ~News~ object | Latest news articles                                    |
| ~earning_call_transcripts()~ | ~Transcripts~ | Earnings call transcripts                               |
| ~calendar()~                 | DataFrame     | Upcoming earnings dates                                 |

** Prices & Basic Finance

| Method                               | Returns   | What it gives                |
|--------------------------------------+-----------+------------------------------|
| ~price()~                            | DataFrame | Historical OHLCV prices      |
| ~splits()~                           | DataFrame | Stock split events           |
| ~dividends()~                        | DataFrame | Dividend payment history     |
| ~shares()~                           | DataFrame | Shares outstanding over time |
| ~beta(period="5y", benchmark="SPY")~ | DataFrame | Calculated beta vs benchmark |
| ~currency(symbol)~                   | DataFrame | Exchange rate history        |
| ~ttm_eps()~                          | DataFrame | Trailing 12-month EPS        |

** Financial Statements

| Method                         | Returns     | What it gives           |
|--------------------------------+-------------+-------------------------|
| ~quarterly_income_statement()~ | ~Statement~ | Quarterly P&L           |
| ~annual_income_statement()~    | ~Statement~ | Annual P&L              |
| ~quarterly_balance_sheet()~    | ~Statement~ | Quarterly balance sheet |
| ~annual_balance_sheet()~       | ~Statement~ | Annual balance sheet    |
| ~quarterly_cash_flow()~        | ~Statement~ | Quarterly cash flow     |
| ~annual_cash_flow()~           | ~Statement~ | Annual cash flow        |

** TTM Aggregates

| Method                                 | Returns   | What it gives                    |
|----------------------------------------+-----------+----------------------------------|
| ~ttm_revenue()~                        | DataFrame | Trailing 12-month revenue        |
| ~ttm_fcf()~                            | DataFrame | Trailing 12-month free cash flow |
| ~ttm_ebitda()~                         | DataFrame | Trailing 12-month EBITDA         |
| ~ttm_net_income_common_stockholders()~ | DataFrame | Trailing 12-month net income     |
| ~ttm_pe()~                             | DataFrame | Trailing P/E (price / ttm_eps)   |
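
As a worked example of the ~price / ttm_eps~ relationship, the sketch below computes a trailing P/E by hand. It assumes the usual convention that TTM EPS is the sum of the four most recent quarterly EPS figures; whether the library computes it exactly this way is not confirmed here. ~trailing_eps~ and ~trailing_pe~ are hypothetical helpers and the numbers are made up.

#+begin_src python
def trailing_eps(quarterly_eps):
    """Sum the four most recent quarterly EPS values (list is oldest first)."""
    if len(quarterly_eps) < 4:
        raise ValueError("need at least four quarters of EPS")
    return sum(quarterly_eps[-4:])


def trailing_pe(price, quarterly_eps):
    """Return price / TTM EPS, or None when TTM EPS is not positive."""
    eps = trailing_eps(quarterly_eps)
    if eps <= 0:
        return None  # P/E is not meaningful for zero or negative earnings
    return price / eps


# Made-up inputs: five quarters of EPS, oldest first, and a share price.
eps_history = [1.0, 1.5, 2.0, 2.5, 3.0]
ttm = trailing_eps(eps_history)       # uses the last four quarters only
pe = trailing_pe(180.0, eps_history)
print(ttm, pe)
#+end_src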

** Revenue Breakdown

| Method                   | Returns   | What it gives               |
|--------------------------+-----------+-----------------------------|
| ~revenue_by_segment()~   | DataFrame | Revenue by business segment |
| ~revenue_by_geography()~ | DataFrame | Revenue by region           |
| ~revenue_by_product()~   | DataFrame | Revenue by product line     |

** Valuation Multiples

| Method                    | Returns   | What it gives                |
|---------------------------+-----------+------------------------------|
| ~market_capitalization()~ | DataFrame | Historical market cap        |
| ~ps_ratio()~              | DataFrame | Price/Sales ratio            |
| ~pb_ratio()~              | DataFrame | Price/Book ratio             |
| ~peg_ratio()~             | DataFrame | PEG ratio                    |
| ~enterprise_value()~      | DataFrame | Enterprise value             |
| ~enterprise_to_revenue()~ | DataFrame | EV/Revenue                   |
| ~enterprise_to_ebitda()~  | DataFrame | EV/EBITDA                    |
| ~debt_to_equity()~        | DataFrame | D/E ratio                    |
| ~net_debt_ttm()~          | DataFrame | Net debt (TTM)               |
| ~wacc()~                  | DataFrame | Weighted avg cost of capital |

** Profitability Returns

| Method                | Returns   | What it gives                      |
|-----------------------+-----------+------------------------------------|
| ~roe()~               | DataFrame | Return on equity                   |
| ~roa()~               | DataFrame | Return on assets                   |
| ~roic()~              | DataFrame | Return on invested capital         |
| ~roce()~              | DataFrame | Return on capital employed         |
| ~equity_multiplier()~ | DataFrame | Financial leverage (assets/equity) |
| ~asset_turnover()~    | DataFrame | Revenue/assets efficiency          |
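
Several of these ratios are linked by the standard DuPont identity: ROE = net margin × asset turnover × equity multiplier. The sketch below checks that identity on made-up numbers. It uses point-in-time balances for simplicity; the library may use averaged balances or TTM figures, so its values need not match this arithmetic. ~dupont_roe~ is a hypothetical helper.

#+begin_src python
def dupont_roe(net_income, revenue, total_assets, equity):
    """Decompose ROE into its three DuPont factors and return their product."""
    net_margin = net_income / revenue          # profit per unit of revenue
    asset_turnover = revenue / total_assets    # revenue per unit of assets
    equity_multiplier = total_assets / equity  # assets per unit of equity
    return net_margin * asset_turnover * equity_multiplier


# Made-up figures, all in the same currency unit.
roe_via_dupont = dupont_roe(net_income=10.0, revenue=100.0,
                            total_assets=200.0, equity=50.0)
roe_direct = 10.0 / 50.0  # net income / equity
print(roe_via_dupont, roe_direct)
#+end_src

Revenue and total assets cancel in the product, which is why both routes agree; the decomposition is useful because it shows whether a given ROE comes from margin, efficiency, or leverage.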

** Margins

| Method                         | Returns   | What it gives      |
|--------------------------------+-----------+--------------------|
| ~quarterly_gross_margin()~     | DataFrame | Gross margin %     |
| ~annual_gross_margin()~        | DataFrame | Gross margin %     |
| ~quarterly_operating_margin()~ | DataFrame | Operating margin % |
| ~annual_operating_margin()~    | DataFrame | Operating margin % |
| ~quarterly_net_margin()~       | DataFrame | Net margin %       |
| ~annual_net_margin()~          | DataFrame | Net margin %       |
| ~quarterly_ebitda_margin()~    | DataFrame | EBITDA margin %    |
| ~annual_ebitda_margin()~       | DataFrame | EBITDA margin %    |
| ~quarterly_fcf_margin()~       | DataFrame | FCF margin %       |
| ~annual_fcf_margin()~          | DataFrame | FCF margin %       |

** YoY Growth

| Method                                    | Returns   | What it gives       |
|-------------------------------------------+-----------+---------------------|
| ~quarterly_revenue_yoy_growth()~          | DataFrame | Revenue growth %    |
| ~annual_revenue_yoy_growth()~             | DataFrame | Revenue growth %    |
| ~quarterly_operating_income_yoy_growth()~ | DataFrame | Op. income growth % |
| ~annual_operating_income_yoy_growth()~    | DataFrame | Op. income growth % |
| ~quarterly_ebitda_yoy_growth()~           | DataFrame | EBITDA growth %     |
| ~annual_ebitda_yoy_growth()~              | DataFrame | EBITDA growth %     |
| ~quarterly_net_income_yoy_growth()~       | DataFrame | Net income growth % |
| ~annual_net_income_yoy_growth()~          | DataFrame | Net income growth % |
| ~quarterly_fcf_yoy_growth()~              | DataFrame | FCF growth %        |
| ~annual_fcf_yoy_growth()~                 | DataFrame | FCF growth %        |
| ~quarterly_eps_yoy_growth()~              | DataFrame | EPS growth %        |
| ~quarterly_ttm_eps_yoy_growth()~          | DataFrame | TTM EPS growth %    |
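
For reference, year-over-year growth compares a period with the same period one year earlier, so a quarterly series is offset by four periods and an annual series by one. The sketch below is a generic illustration on made-up numbers; ~yoy_growth~ is a hypothetical helper, and details such as dividing by the absolute value of the base are assumptions, not the library's documented behaviour.

#+begin_src python
def yoy_growth(values, periods_per_year):
    """Return growth vs. the same period a year earlier, as fractions.

    `values` is ordered oldest first. Entries without a year-ago base
    (or with a zero base) are reported as None.
    """
    growth = []
    for i, current in enumerate(values):
        if i < periods_per_year:
            growth.append(None)  # no year-ago value to compare against
            continue
        base = values[i - periods_per_year]
        growth.append(None if base == 0 else (current - base) / abs(base))
    return growth


# Made-up quarterly revenue, oldest first; 4 periods per year.
quarterly_revenue = [100.0, 110.0, 120.0, 130.0, 125.0, 121.0]
growth = yoy_growth(quarterly_revenue, periods_per_year=4)
print(growth)
#+end_src

Multiply by 100 to express the result as a percentage.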

** Industry Comparisons

Uses the ticker's own industry to benchmark against peers.

| Method                               | Returns   | What it gives            |
|--------------------------------------+-----------+--------------------------|
| ~industry_ttm_pe()~                  | DataFrame | Avg P/E across industry  |
| ~industry_ps_ratio()~                | DataFrame | Industry P/S             |
| ~industry_pb_ratio()~                | DataFrame | Industry P/B             |
| ~industry_roe()~                     | DataFrame | Industry ROE             |
| ~industry_roa()~                     | DataFrame | Industry ROA             |
| ~industry_roic()~                    | DataFrame | Industry ROIC            |
| ~industry_equity_multiplier()~       | DataFrame | Industry leverage        |
| ~industry_asset_turnover()~          | DataFrame | Industry efficiency      |
| ~industry_quarterly_gross_margin()~  | DataFrame | Industry gross margin %  |
| ~industry_quarterly_ebitda_margin()~ | DataFrame | Industry EBITDA margin % |
| ~industry_quarterly_net_margin()~    | DataFrame | Industry net margin %    |

** DCF / Advanced

| Method                        | Returns | What it gives                          |
|-------------------------------+---------+----------------------------------------|
| ~dcf_data()~                  | dict    | All raw inputs for a DCF model         |
| ~dcf()~                       | dict    | Full DCF valuation + exports ~.xlsx~   |
| ~download_data_performance()~ | str     | Timing summary of data fetch durations |
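
To make the term concrete, a textbook discounted cash flow sums the present value of projected free cash flows and adds a terminal value. The sketch below is a generic illustration only: it does not describe how ~dcf()~ is implemented or which inputs ~dcf_data()~ returns. ~dcf_value~ is a hypothetical function, the Gordon-growth terminal value is one common choice among several, and the numbers are made up.

#+begin_src python
def dcf_value(projected_fcf, discount_rate, terminal_growth):
    """Present value of projected cash flows plus a Gordon-growth terminal value."""
    if discount_rate <= terminal_growth:
        raise ValueError("discount_rate must exceed terminal_growth")
    value = 0.0
    # Discount each projected cash flow back to today (year 1, 2, ...).
    for year, cash in enumerate(projected_fcf, start=1):
        value += cash / (1 + discount_rate) ** year
    # Terminal value at the end of the projection, then discounted to today.
    terminal = projected_fcf[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    value += terminal / (1 + discount_rate) ** len(projected_fcf)
    return value


# Sanity check: a flat 100 per year forever at 10% is a perpetuity worth 1000.
flat_value = dcf_value([100.0, 100.0], discount_rate=0.10, terminal_growth=0.0)
print(round(flat_value, 6))
#+end_src

In practice the discount rate would typically be a cost of capital such as the figure returned by ~wacc()~, but that pairing is an assumption here.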

* Multi-Ticker API: ~Tickers(["AAPL", "NVDA"])~

#+begin_src python
from defeatbeta_api.data.tickers import Tickers
t = Tickers(["AAPL", "NVDA"])                 # default parallelism
t = Tickers(["AAPL", "NVDA"], max_workers=2)  # or limit parallelism
#+end_src

Wraps all ~Ticker~ methods, running them in *parallel threads*.

- Methods returning simple data → *combined DataFrame* (all tickers in one table)
- Methods returning complex objects (statements, news, transcripts) → ~{symbol: result}~ dict

Method names are the same as on ~Ticker~. Industry comparison methods run once per
unique industry represented in the list.
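
The fan-out pattern described above can be sketched with the standard library. This is an illustration of the general technique, not the actual ~Tickers~ implementation: ~fan_out~, ~combine~, and ~fake_price~ are hypothetical names, and plain lists of dicts stand in for DataFrames so the example has no dependencies.

#+begin_src python
from concurrent.futures import ThreadPoolExecutor


def fan_out(symbols, fetch, max_workers=None):
    """Call `fetch(symbol)` for every symbol in parallel threads.

    Returns {symbol: result}; pool.map keeps results in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, symbols))
    return dict(zip(symbols, results))


def combine(rows_by_symbol):
    """Flatten per-symbol row lists into one table tagged with the symbol."""
    return [
        {"symbol": symbol, **row}
        for symbol, rows in rows_by_symbol.items()
        for row in rows
    ]


def fake_price(symbol):
    """Hypothetical stand-in for a per-ticker fetch; returns made-up rows."""
    return [{"day": 1, "close": float(len(symbol))}]


per_symbol = fan_out(["AAPL", "NVDA"], fake_price, max_workers=2)  # dict form
combined = combine(per_symbol)                                     # table form
print(combined)
#+end_src

Threads suit this workload because each call spends most of its time waiting on network I/O; ~max_workers~ caps how many requests are in flight at once.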

#+begin_src python
t.info()                      # → DataFrame (combined)
t.price()                     # → DataFrame (combined)
t.annual_income_statement()   # → {'AAPL': Statement(...), 'NVDA': Statement(...)}
t.news()                      # → {'AAPL': News(...), 'NVDA': News(...)}
t.earning_call_transcripts()  # → {'AAPL': Transcripts(...), 'NVDA': Transcripts(...)}
t.industry_roe()              # → DataFrame (one row per unique industry)
#+end_src