FinReflectKG: Turning giant SEC filings into a Smart Financial Knowledge Graph
Imagine instantly seeing how a company’s revenue, expenses, and risk factors connect to teams, products, or regulatory requirements. That’s the promise behind FinReflectKG: an open, large-scale knowledge graph (KG) built from the 10-K filings of all S&P 100 companies. It’s not just about collecting data—it’s about organizing it in a way that machines can reason with, while staying trustworthy and transparent for humans.
In this post, we’ll unpack what FinReflectKG is, why it matters for enthusiasts and practitioners alike, and what makes its approach both practical and exciting.
Why a knowledge graph for finance?

Financial documents, especially SEC 10-K filings, are a goldmine of structured and semi-structured information. They describe how a company makes money, where risks come from, how different departments relate to each other, and how regulators view the business. But turning those PDFs and tables into something a computer can reason about is tricky:
- The data is heterogeneous: text, tables, footnotes, and cross-references.
- The same idea can appear in many different wordings (synonyms, abbreviations, co-references).
- Regulatory and business semantics matter: accuracy isn't just nice to have, it's essential.

Enter knowledge graphs. A KG represents entities (like a department, product line, or risk factor) as nodes and their relationships as edges. It makes multi-hop questions possible (e.g., "Which products contributed most to revenue while increasing R&D spend in the last year?") and supports advanced analytics like network insights and predictive modeling.
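To make the node-and-edge idea concrete, here's a toy sketch in Python using networkx. Every entity and relation name in it is invented for illustration; this isn't FinReflectKG's actual schema, just a picture of how a multi-hop question becomes a graph traversal.

```python
import networkx as nx

# Toy KG: nodes are entities, edges carry a relation label.
# All names here are hypothetical, for illustration only.
kg = nx.MultiDiGraph()
kg.add_edge("AcmeCorp", "CloudPlatform", relation="offers_product")
kg.add_edge("CloudPlatform", "SubscriptionRevenue", relation="contributes_to")
kg.add_edge("AcmeCorp", "R&D_Spend", relation="reports_expense")
kg.add_edge("SupplyChainRisk", "CloudPlatform", relation="affects")

# A two-hop question ("which products feed which revenue lines?")
# becomes: follow offers_product edges, then contributes_to edges.
for _, product, d1 in kg.out_edges("AcmeCorp", data=True):
    if d1["relation"] != "offers_product":
        continue
    for _, metric, d2 in kg.out_edges(product, data=True):
        if d2["relation"] == "contributes_to":
            print(f"{product} -> {metric}")  # CloudPlatform -> SubscriptionRevenue
```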
FinReflectKG isn’t just a one-off dataset. It’s an open-source, scalable framework designed to extract, normalize, and evaluate financial relationships from a complete set of filings, with an eye toward transparency and reproducibility.
What is FinReflectKG, in plain terms?

- An open, large-scale financial KG dataset built from the SEC 10-K filings of all S&P 100 companies for 2024.
- A robust pipeline that combines:
  - Intelligent document parsing (how to read the reports).
  - Table-aware chunking (how to slice tables so the system can understand them).
  - Schema-guided iterative extraction (pulling out entities and relations in a structured way that matches a defined data model; a sketch of one possible schema follows this list).
  - Reflection-driven feedback (an "inner quality check" loop that helps the system refine its own work).
- A multi-faceted evaluation setup that uses:
  - Rule-based checks (explicit policies the extraction must follow).
  - Statistical validation (consistency and coverage checks).
  - LLM-as-Judge assessments (leaning on two big AI strengths: language understanding and self-evaluation).
- Three extraction modes to balance speed, accuracy, and reliability:
  - Single-pass: quick extraction in one go.
  - Multi-pass: iterative refinement across passes.
  - Reflection-agent-based: the most sophisticated mode, where an AI agent reflects, self-corrects, and re-weights outputs to maximize quality.
- A clear, practical takeaway: a high-quality, thoroughly evaluated dataset and a generalizable KG construction framework that researchers and practitioners can reuse and extend.

If you're curious about the dataset, it's publicly available (the authors provide a link to a readme with details).
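To give a feel for what "schema-guided" means here, the sketch below shows one plausible shape such a schema could take: a whitelist of entity types plus typed relations the extractor must conform to. The type names are illustrative guesses on my part, not the schema the authors actually define.

```python
# One plausible shape for an extraction schema: allowed entity types
# plus typed (head, relation, tail) patterns. These names are
# illustrative guesses, not the authors' actual schema.
SCHEMA = {
    "entity_types": [
        "Company", "Product", "Segment", "RiskFactor",
        "Regulation", "FinancialMetric",
    ],
    "relation_types": [
        ("Company", "offers", "Product"),
        ("Product", "contributes_to", "FinancialMetric"),
        ("RiskFactor", "affects", "Segment"),
        ("Regulation", "applies_to", "Company"),
    ],
}
```

Constraining extraction to a fixed vocabulary like this is what keeps the resulting graph consistent and makes it easy to evaluate against explicit rules later.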
The three extraction modes, in simple terms

- Single-pass: Do the extraction once. It's fast, but may miss edge cases, cross-reference issues, or nuanced relationships that require more context.
- Multi-pass: Run several rounds. Each pass refines entities and relations, fixing gaps and inconsistencies discovered in earlier rounds. This is more reliable than a single pass but takes more time.
- Reflection-agent-based: The system uses a reflective loop (sketched in code below). After an initial pass, it analyzes its own outputs, asks itself guided questions, and re-extracts or re-labels data where needed. This mode balances efficiency with high quality and, according to the study, delivers the best overall results on rule compliance, precision, coverage, and relevance as judged by AI-driven evaluation.

The core idea is simple: the more the system can think about its own results before finalizing them, the more trustworthy the KG becomes, especially in a domain where precision matters a lot.
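Here's a minimal sketch of what that reflective loop can look like in code, assuming a generic `llm(prompt) -> str` helper. The prompts, the NO_ISSUES sentinel, and the round limit are all invented for illustration; the paper's actual agent design may differ.

```python
def extract_with_reflection(chunk: str, llm, max_rounds: int = 3) -> str:
    """Draft -> critique -> revise loop over a single document chunk."""
    triples = llm(f"Extract (head, relation, tail) triples from:\n{chunk}")
    for _ in range(max_rounds):
        critique = llm(
            "Review these triples for missing relations, wrong entity types, "
            "or unsupported claims. Reply NO_ISSUES if none.\n\n"
            f"Triples:\n{triples}\n\nSource:\n{chunk}"
        )
        if "NO_ISSUES" in critique:
            break  # the agent judged its own draft acceptable
        triples = llm(
            f"Revise the triples to address these issues:\n{critique}\n\n"
            f"Triples:\n{triples}\n\nSource:\n{chunk}"
        )
    return triples
```

The important design point is that critique and revision are separate model calls: the system evaluates a draft before rewriting it, rather than trusting its first answer.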
How FinReflectKG builds the knowledge graph

Think of it as a carefully choreographed dance between humans and machines:
- Intelligent document parsing: The system reads the 10-K filings, capturing not just the text but also the structure, sections, and embedded tables.
- Table-aware chunking: Financial statements are table-heavy. The approach slices up tables in a way that preserves relationships (e.g., linking a line item to its narrative discussion).
- Schema-guided iterative extraction: There's a predefined schema (think: the kinds of entities and relations we expect), and extraction is guided to fill that schema consistently.
- Reflection-driven feedback loop: An internal quality check that uses self-reflection to improve extraction quality across iterations.

This combination aims to produce a KG that is both rich in semantic relationships and reliable enough for downstream tasks like search, multi-hop Q&A, or graph-powered analytics.
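Stitched together, the stages compose roughly like the sketch below. The function names and the Triple type are placeholders of mine, and each body is left as a stub since the real implementations are substantial; the point is only how the stages chain.

```python
from dataclasses import dataclass

@dataclass
class Triple:
    head: str
    relation: str
    tail: str

def parse_document(filing_html: str) -> list[str]:
    """Stage 1: split a 10-K into sections, keeping structure and tables."""
    ...

def chunk_tables(sections: list[str]) -> list[str]:
    """Stage 2: table-aware chunking that keeps line items with their narrative."""
    ...

def extract_triples(chunk: str, schema: dict) -> list[Triple]:
    """Stage 3: schema-guided extraction of entities and relations."""
    ...

def reflect_and_refine(draft: list[Triple], chunk: str) -> list[Triple]:
    """Stage 4: reflection loop that critiques and corrects the draft."""
    ...

def build_kg(filing_html: str, schema: dict) -> list[Triple]:
    """Run all four stages over one filing and collect the triples."""
    triples: list[Triple] = []
    for chunk in chunk_tables(parse_document(filing_html)):
        draft = extract_triples(chunk, schema)
        triples.extend(reflect_and_refine(draft, chunk))
    return triples
```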
How the team evaluates quality

Quality in financial KGs matters a lot. The team uses a three-pronged evaluation:
- Rule-based compliance (CheckRules): A set of explicit policies the extraction should satisfy. The reflection-based mode achieved 64.8% compliance across all rules, indicating solid alignment with the predefined standards.
- Statistical validation: Checks like coverage (how much of the domain is represented) and diversity (how varied the semantic content is).
- LLM-as-Judge assessments: Large language models act as a judge to compare outputs across modes (single-pass, multi-pass, reflection) on precision, comprehensiveness, and relevance. In these evaluations, the reflection-based approach tended to win on balance and quality.

In short: while faster methods may be tempting, the reflection-based method consistently delivered the strongest overall performance on both objective rules and AI-based, human-like judgments.
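For a flavor of what rule-based compliance checking might involve, here's a toy checker. The three rules below are invented examples; the paper's CheckRules policies are its own, and the 64.8% figure refers to compliance with those, not with these.

```python
from dataclasses import dataclass

@dataclass
class Triple:
    head: str
    relation: str
    tail: str

ALLOWED_RELATIONS = {"offers_product", "contributes_to", "affects"}

# Invented example rules; not the paper's actual CheckRules policies.
RULES = [
    ("relation is in the schema", lambda t: t.relation in ALLOWED_RELATIONS),
    ("no self-loops", lambda t: t.head != t.tail),
    ("entities are non-empty", lambda t: bool(t.head and t.tail)),
]

def compliance_rate(triples: list[Triple]) -> float:
    """Fraction of (triple, rule) checks that pass."""
    checks = [rule(t) for t in triples for _, rule in RULES]
    return sum(checks) / len(checks) if checks else 0.0

triples = [
    Triple("AcmeCorp", "offers_product", "CloudPlatform"),  # passes all rules
    Triple("AcmeCorp", "merged_with", "AcmeCorp"),          # fails two rules
]
print(f"compliance: {compliance_rate(triples):.1%}")  # 66.7%
```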
Why this matters for finance fans and practitioners

- Open, reproducible resource: The dataset and the extraction framework are released to foster transparency and reproducibility in financial AI research. That means you can reproduce results, test your own ideas, or build on top of the dataset.
- Rich, actionable knowledge: A high-quality KG provides a structured map of financial knowledge across a whole sector, opening doors to advanced search, questions that require chaining multiple facts, and network-level analytics (like spotting how risk factors propagate through a corporate structure).
- Flexible workflows: The three extraction modes let teams tailor the process to their needs, whether they're prioritizing speed for a quick experiment or reliability for production-grade insight.
- Real-world applications: Beyond simple lookups, the KG enables multi-hop Q&A, integration of signals from news, and graph-powered predictive models. It's the kind of foundation that can support smarter dashboards, anomaly detection, and scenario analysis for investors and risk managers.

Practical implications and takeaways

- If you're building financial AI tools, consider a reflection-driven extraction approach. The study suggests that self-reflection loops can meaningfully improve data quality when dealing with complex, regulation-heavy sources.
- For researchers, the open dataset is a valuable benchmark. It provides a realistic financial domain with rich semantics and a transparent evaluation framework for comparing new methods.
- For practitioners in finance teams, a well-constructed KG can power faster, more reliable insights, like cross-linking a risk narrative in the text with precise numeric tables, or tracing how a policy change could affect multiple product lines.

If you're curious about the nuts and bolts, the pipeline highlights a practical recipe:

- Start with robust document parsing, especially for table-rich filings.
- Use a schema-guided approach to keep extraction consistent and easily evaluable.
- Add a reflection loop to catch edge cases and refine the results before finalizing the KG.

Conclusion

Finance thrives on clarity: knowing where money comes from, where it goes, and how different parts of a company connect. FinReflectKG takes a big step toward that clarity by turning sprawling SEC filings into a structured, navigable knowledge graph. With three extraction modes, a reflection-driven feedback loop, and a thoughtful, multi-faceted evaluation regime, it balances speed, accuracy, and reliability in a domain where both humans and machines stand to gain from better information architecture.
If you’re excited about data-driven finance, this work offers both a valuable resource (the dataset) and a practical blueprint (the extraction framework) for building smarter, more trustworthy financial AI tools. Practical takeaways: consider reflective, schema-guided extraction when dealing with complex, document-heavy domains; leverage open benchmarks to validate new ideas; and keep a focus on transparent evaluation to earn trust in AI-powered finance.