How It Works
Company Vectors
How each company in the screener is represented as a set of ranked financial and structural scores across 43 dimensions.
Before any news is scored, every company in the screener gets profiled into a company vector — a flat set of numbers that describes what kind of company it is across 43 dimensions grouped into 8 clusters.
Where the data comes from
Company vectors are built from financial data fetched from the Financial Modeling Prep (FMP) API. For each company, the system fetches:
- Income statements (last 8 quarters)
- Balance sheets
- Cash flow statements
- Key financial metrics and ratios
- Stock quote and 52-week high/low
- Analyst estimates
- Institutional holder records
- Company profile (sector, industry, country, description)
Computing raw dimension values
The dimension calculator takes that raw financial data and converts it into a raw numeric value for each of the 43 dimensions. Some examples of how dimensions are computed:
- Debt burden: total debt divided by EBITDA, computed directly from balance sheet and income statement data
- Earnings quality: free cash flow divided by net income, from cash flow and income statements
- Inflation sensitivity: the complement of gross margin (1 minus gross profit divided by revenue), so thinner margins give a higher value
- Price momentum: current stock price divided by the 52-week high
- Sector dimensions: a one-hot encoding based on the company's reported GICS sector (1.0 for its sector, 0.0 for all others)
- China revenue exposure: derived from the company's headquarters country and a keyword scan of its description
When data is insufficient or unavailable for a dimension, the raw value is recorded as None.
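The ratio-style dimensions above can be sketched in a few lines of Python. This is a minimal illustration, not the screener's actual dimension calculator: the function names, argument names, and the `safe_ratio` helper are hypothetical, and the source does not say how edge cases such as negative EBITDA are handled, so this sketch only guards against missing inputs and a zero denominator.

```python
from typing import Dict, List, Optional


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None if an input is missing or the denominator is zero."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


def debt_burden(total_debt: Optional[float], ebitda: Optional[float]) -> Optional[float]:
    # Total debt divided by EBITDA.
    return safe_ratio(total_debt, ebitda)


def earnings_quality(free_cash_flow: Optional[float], net_income: Optional[float]) -> Optional[float]:
    # Free cash flow divided by net income.
    return safe_ratio(free_cash_flow, net_income)


def inflation_sensitivity(gross_profit: Optional[float], revenue: Optional[float]) -> Optional[float]:
    # 1 minus gross margin; stays None when the margin cannot be computed.
    margin = safe_ratio(gross_profit, revenue)
    return None if margin is None else 1.0 - margin


def price_momentum(price: Optional[float], year_high: Optional[float]) -> Optional[float]:
    # Current price divided by the 52-week high.
    return safe_ratio(price, year_high)


def sector_one_hot(company_sector: str, sectors: List[str]) -> Dict[str, float]:
    # 1.0 for the company's own sector, 0.0 for every other sector.
    return {s: 1.0 if s == company_sector else 0.0 for s in sectors}
```

Returning None from every helper keeps "no data" distinct from a genuine value of zero, so the missing case can be handled separately in a later step.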
Rank normalisation
Raw values are not stored directly. Instead, each dimension is rank-normalised across the entire universe of companies being scored together.
For a given dimension, all companies with valid data are ranked from lowest to highest, and the rank is divided by the total number of valid companies to produce a score between 0 and 1. Companies with missing data for a dimension receive a neutral score of 0.5.
This means scores are always relative, not absolute:
- A score of 1.0 means highest in the screened universe for that dimension
- A score of 0.5 means either the median of the universe or missing data; the score alone does not distinguish the two
- A score of 0.0 means lowest in the universe
Ties are resolved with fractional ranking (each tied company gets the average of the tied positions), so companies with identical values receive the same score.
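The normalisation described above can be sketched as follows. This is an illustrative version under stated assumptions, not the screener's code: the function name `rank_normalise` and the `NEUTRAL_SCORE` constant are hypothetical, and ranks are assumed to start at 1. With that assumption the lowest company scores 1/N rather than exactly 0.0, so the real implementation may offset or rescale ranks differently.

```python
from typing import Dict, Optional

NEUTRAL_SCORE = 0.5  # assigned to companies with no data for the dimension


def rank_normalise(raw: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Rank-normalise one dimension across a universe of companies.

    `raw` maps a company identifier to its raw value, or None if unavailable.
    """
    # Only companies with valid data take part in the ranking.
    valid = sorted(
        ((name, value) for name, value in raw.items() if value is not None),
        key=lambda item: item[1],
    )
    n = len(valid)

    # Start everyone at the neutral score; valid companies are overwritten below.
    scores = {name: NEUTRAL_SCORE for name in raw}

    i = 0
    while i < n:
        # Find the run of companies tied with the value at position i.
        j = i
        while j + 1 < n and valid[j + 1][1] == valid[i][1]:
            j += 1
        # Fractional ranking: average of the 1-based positions i+1 .. j+1.
        average_rank = ((i + 1) + (j + 1)) / 2
        for name, _ in valid[i : j + 1]:
            scores[name] = average_rank / n
        i = j + 1
    return scores


example = rank_normalise({"A": 1.0, "B": 3.0, "C": 3.0, "D": None, "E": 10.0})
print(example)
```

In the example, four companies have valid data. B and C tie for positions 2 and 3, so both receive 2.5 / 4 = 0.625, while D, which has no data, receives the neutral 0.5.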
Storage and caching
Once computed, company vectors are written to the database and cached to disk with a 24-hour TTL. On the next run, the cache is checked before any fresh FMP API calls are made, which avoids redundant data fetching and keeps runs fast for large universes.
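A cache-first lookup with a 24-hour TTL might look like the sketch below. The on-disk layout (one JSON file per ticker with a `saved_at` timestamp), the function names, and the `build` callback standing in for the fetch-and-compute step are all assumptions made for illustration; the source does not describe the actual cache format, and the database write is omitted.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, as stated above


def load_cached_vector(
    cache_dir: Path, ticker: str, now: Optional[float] = None
) -> Optional[Dict[str, float]]:
    """Return the cached vector for `ticker`, or None if absent or older than the TTL."""
    path = cache_dir / f"{ticker}.json"
    if not path.exists():
        return None
    entry = json.loads(path.read_text())
    now = time.time() if now is None else now
    if now - entry["saved_at"] > CACHE_TTL_SECONDS:
        return None  # expired: the caller should rebuild
    return entry["vector"]


def save_vector(
    cache_dir: Path, ticker: str, vector: Dict[str, float], now: Optional[float] = None
) -> None:
    """Write the vector to disk together with the time it was saved."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = {"saved_at": time.time() if now is None else now, "vector": vector}
    (cache_dir / f"{ticker}.json").write_text(json.dumps(entry))


def get_vector(
    cache_dir: Path, ticker: str, build: Callable[[str], Dict[str, float]]
) -> Dict[str, float]:
    """Check the cache first; call `build` (the expensive path) only on a miss."""
    cached = load_cached_vector(cache_dir, ticker)
    if cached is not None:
        return cached
    vector = build(ticker)
    save_vector(cache_dir, ticker, vector)
    return vector
```

Passing `now` explicitly makes the expiry logic easy to exercise without waiting a day, and keeping `build` as a callback means the API calls happen only when the cache cannot answer.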