What Is the Difference Between LLM-Based and Estimated Values?
Learn how estimated and LLM-based financial values are generated on North Data and why both may appear with an asterisk (*).
What are estimated values?
If a company does not publish certain financial KPIs, North Data may estimate them using available financial information from annual reports and other company data. These estimates are calculated using non-linear regression analysis, taking into account factors such as balance sheet figures, number of employees, and industry-specific parameters.
For more details on how revenue is estimated, please refer to our Help Center article.
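As a rough illustration of the idea only, the sketch below fits a power-law curve to a handful of comparable companies and uses it to estimate revenue for a company that did not publish it. This is not North Data's actual model: the single predictor (number of employees), the power-law form, the log-space least-squares fit used as a simple stand-in for non-linear regression, and all figures are assumptions made up for this example.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on log-transformed data."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope and intercept of the straight line in log-log space.
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def estimate_revenue(employees, a, b):
    """Apply the fitted curve to a company with no published revenue."""
    return a * employees ** b


# Synthetic peer companies (hypothetical data): employees and revenue in EUR million.
peer_employees = [4, 100, 10_000]
peer_revenue_meur = [2.0, 10.0, 100.0]

a, b = fit_power_law(peer_employees, peer_revenue_meur)
estimate = estimate_revenue(400, a, b)
print(f"Estimated revenue: EUR {estimate:.1f} million*")
```

A real estimate would draw on many more companies and on several inputs at once (balance sheet figures, industry parameters), which is why the resulting value is flagged as an estimate rather than shown as a reported figure.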
What are LLM-based values?
LLM-based values are extracted automatically from financial reports using Large Language Models (LLMs).
Instead of calculating or forecasting a number, the LLM scans the narrative (free-text) sections of annual reports and financial statements and extracts financial figures that are explicitly stated there.
This approach allows North Data to identify additional financial figures such as:
- Revenue
- Pension provisions
- Number of employees
How does this differ from traditional NLP (Natural Language Processing)?
In some cases, financial figures appear in narrative sections (e.g. management reports or notes) rather than in structured tables such as the balance sheet or income statement.
While classical rule-based NLP may not reliably detect these values, the LLM can identify and extract financial figures even when they are embedded in free text.
For example, if a company mentions its revenue in the explanatory notes of an annual report, the LLM can detect and extract that exact number, even if it does not appear in a clearly structured table.
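To make the limitation of rigid rules concrete, here is a minimal sketch of a rule-based extractor that only recognizes a table-style row. The pattern, the function name, and both sample texts are hypothetical and chosen for illustration; they do not reflect how North Data processes reports.

```python
import re

# Hypothetical rule: a line that starts with the label "Revenue",
# followed only by whitespace and a number (a typical table row).
TABLE_ROW = re.compile(r"^Revenue\s+([\d.,]+)\s*$", re.MULTILINE)


def extract_revenue_rule_based(text):
    """Return the revenue figure as a string, or None if the rule does not match."""
    match = TABLE_ROW.search(text)
    return match.group(1) if match else None


# Made-up sample texts.
table_text = "Revenue        12,400,000\nPersonnel expenses   3,100,000"
narrative_text = (
    "In the financial year, sales revenues rose slightly "
    "to around EUR 12.4 million."
)

from_table = extract_revenue_rule_based(table_text)
from_narrative = extract_revenue_rule_based(narrative_text)
print(from_table, from_narrative)
```

The rule finds the figure in the table row but returns nothing for the sentence, because the wording, number format, and position all differ from what the pattern expects. Covering every such variation with hand-written rules quickly becomes impractical, which is the gap the LLM-based extraction is meant to close.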
Why are both marked with an asterisk (*)?
Both estimated and LLM-based values are marked with an asterisk (*) because they are considered uncertain: they were determined indirectly rather than taken from a structured, published figure.
This means the value was either:
- Statistically calculated, or
- Automatically extracted from free text.
Hovering the mouse pointer over the asterisk (*) displays additional information about how the value was determined.
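For readers who work with such values programmatically, the marking logic can be pictured roughly as follows. This is a sketch under stated assumptions: the `Source` enum, the `FinancialValue` class, and the tooltip wording are hypothetical names invented for this example and are not part of any North Data interface.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """How a value was determined (hypothetical categories)."""
    REPORTED = "taken directly from the published financial statements"
    ESTIMATED = "statistically calculated (estimated value)"
    LLM_EXTRACTED = "automatically extracted from free text (LLM-based value)"


@dataclass(frozen=True)
class FinancialValue:
    label: str
    amount: float
    source: Source

    def display(self) -> str:
        # Only indirectly determined values carry the asterisk.
        marker = "" if self.source is Source.REPORTED else "*"
        return f"{self.label}: {self.amount:,.0f}{marker}"

    def tooltip(self) -> str:
        # Text a hover hint might show for this value.
        return f"This value was {self.source.value}."


revenue = FinancialValue("Revenue", 12_400_000, Source.LLM_EXTRACTED)
equity = FinancialValue("Equity", 3_000_000, Source.REPORTED)
print(revenue.display())
print(revenue.tooltip())
print(equity.display())
```

Keeping the origin of each value next to the number means the asterisk and the hover text are both derived from the same piece of information.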