A New Way to Think About SEC Filings

For decades, we've organized information using files. The metaphor of a file - a container that holds related information - has served us well since the early days of computing. When you think about an SEC filing like a 10-K annual report, you might imagine it as a collection of files: the main document, exhibits, financial statements, and various attachments. This mirrors how we used to handle physical documents, where a company's annual report would be a stack of paper documents filed away in manila folders.

The EDGAR system, the SEC's electronic filing database, maintains this file-centric view. When a company submits a filing, it uploads multiple files - HTML documents, XBRL data, PDFs, and other attachments. As developers, we often find ourselves writing code to download these files, parse their contents, and extract the information we need. This approach, while familiar, creates unnecessary complexity. We spend more time dealing with file formats, parsing logic, and content extraction than focusing on the actual information we care about.

The Shift to Data Objects

What if we stopped thinking about SEC filings as collections of files and instead viewed them as data objects? A data object represents the actual information a company is disclosing to the public, independent of how it's stored or formatted. For a 10-K filing, the data object encapsulates everything material about the company's annual performance and position - financial statements, business description, risk factors, management's discussion and analysis - as structured data.

This shift in abstraction has several practical implications:

Focus on Content, Not Format: Instead of writing code to handle different file formats and parse various document structures, we can work directly with the information we care about. The data object provides a clean interface to the filing's contents.
Simplified Analysis: When filings are represented as data objects, it becomes much easier to compare information across different companies and time periods. The messy details of how each company formats their files become irrelevant.
Better Data Integration: Data objects can be easily transformed into whatever format your analysis requires - databases, spreadsheets, visualizations, or reports. The abstraction creates a clean separation between the information itself and how it's presented.
Future-Proof: As filing requirements and formats evolve, the data object model can remain stable. Changes in how companies submit their filings don't need to affect how we work with the information.
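
To make these points concrete, here is a minimal sketch of a data object in plain Python. It is not EdgarTools code: the AnnualSummary class, its fields, and the numbers are all made up for illustration. It shows two of the ideas above, comparing companies through the same interface and exporting to another format (CSV here) without touching any filing files.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class AnnualSummary:
    """Hypothetical data object: the disclosed facts, not the files behind them."""

    company: str
    fiscal_year: int
    revenue: float
    net_income: float

    @property
    def net_margin(self) -> float:
        # Derived value computed from the structured fields.
        return self.net_income / self.revenue


def to_csv(summaries) -> str:
    """Serialize data objects to CSV text, one row per object."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(AnnualSummary)],
        lineterminator="\n",
    )
    writer.writeheader()
    for summary in summaries:
        writer.writerow(asdict(summary))
    return buffer.getvalue()


# Made-up numbers, for illustration only.
summaries = [
    AnnualSummary("Alpha Corp", 2023, 200.0, 50.0),
    AnnualSummary("Beta Inc", 2023, 400.0, 40.0),
]

# Comparison across companies uses the same attribute on every object.
best = max(summaries, key=lambda s: s.net_margin)
print(best.company)
print(to_csv(summaries))
```

Because the comparison and the export depend only on the object's fields, neither would need to change if the underlying filing format did.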

Practical Impact

Consider a common task: analyzing the risk factors section across multiple companies' 10-K filings. With a file-based approach, you'd need to:

Download the main filing document for each company
Parse the HTML or text content
Locate the risk factors section
Extract and clean the text
Structure the information for analysis

With a data object approach, you simply access the risk_factors attribute of each company's 10-K data object. The messy details of file handling and parsing are abstracted away, letting you focus on the actual analysis.
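
The contrast can be sketched with the standard library alone. In the example below, the download step is replaced by a small inline HTML sample, and the parsing is deliberately crude, since real 10-K documents are far messier. The extract_risk_factors function and the AnnualReportSketch class are hypothetical names for this illustration, not part of edgartools.

```python
import html
import re
from dataclasses import dataclass

# Step 1 stand-in: a made-up, heavily simplified filing document.
SAMPLE_HTML = """
<html><body>
<p><b>Item 1. Business</b></p>
<p>We design widgets.</p>
<p><b>Item 1A. Risk Factors</b></p>
<p>Demand for widgets may decline.</p>
<p>We depend on a single supplier.</p>
<p><b>Item 1B. Unresolved Staff Comments</b></p>
<p>None.</p>
</body></html>
"""


def extract_risk_factors(raw_html: str) -> str:
    """File-based approach: parse, locate the section, then clean the text."""
    # Step 2: parse, here by crudely replacing tags and decoding entities.
    text = html.unescape(re.sub(r"<[^>]+>", "\n", raw_html))
    # Step 3: locate the text between the Item 1A and Item 1B headings.
    match = re.search(
        r"Item\s+1A\.\s*Risk Factors(.*?)Item\s+1B\.", text, re.S | re.I
    )
    if match is None:
        return ""
    # Step 4: clean up the blank lines left behind by the removed tags.
    lines = [line.strip() for line in match.group(1).splitlines()]
    return " ".join(line for line in lines if line)


@dataclass
class AnnualReportSketch:
    """Step 5: the structured result, exposed as a plain attribute."""

    company: str
    risk_factors: str

    @classmethod
    def from_html(cls, company: str, raw_html: str) -> "AnnualReportSketch":
        # All file handling and parsing is hidden behind this constructor.
        return cls(company, extract_risk_factors(raw_html))


report = AnnualReportSketch.from_html("Widget Co", SAMPLE_HTML)
print(report.risk_factors)
```

Everything in extract_risk_factors is work the caller no longer sees. Code that consumes the object reads only report.risk_factors.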

This is the core insight behind edgartools - treating SEC filings as data objects rather than collections of files. By raising the level of abstraction, we can build more powerful tools for financial analysis while writing less code.

Data Objects in EdgarTools

Everything in EdgarTools is built as a data object that holds data and presents it to you. The Filing class is a partial exception: it starts out as the primary way to access the files contained in an SEC filing, but it has a function called data_object (aliased as obj) that turns the important information in the filing into a data object.

Calling it creates a new object that represents the information the company intended to release to the public. Which kind of object you get depends on the form type. Behind the scenes, the library downloads all the required files and parses them to reconstruct the semantic business information as a data object in which you can easily view what you need.

Form	Data Object	Description
10-K	`TenK`	Annual report
10-Q	`TenQ`	Quarterly report
8-K	`EightK`	Current report
MA-I	`MunicipalAdvisorForm`	Municipal advisor initial filing
Form 144	`Form144`	Notice of proposed sale of securities
C, C-U, C-AR, C-TR	`FormC`	Form C Crowdfunding Offering
D	`FormD`	Form D Offering
3, 4, 5	`Ownership`	Ownership reports
13F-HR	`ThirteenF`	13F Holdings Report
NPORT-P	`FundReport`	Fund Report
EFFECT	`Effect`	Notice of Effectiveness
Any other filing with XBRL	`XBRLData` or `XBRLInstance`	Container for XBRL data
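
One way to picture this form-dependent behavior is as a lookup from form type to class. The sketch below is an assumption about the general pattern, not the library's actual implementation. Every class name ending in Sketch is hypothetical, only a few rows of the table are included, and returning None for an unmapped form is a choice made for this example.

```python
from dataclasses import dataclass


class DataObjectSketch:
    """Hypothetical base class: wraps the filing it was built from."""

    def __init__(self, filing):
        self.filing = filing


class TenKSketch(DataObjectSketch):
    description = "Annual report"


class TenQSketch(DataObjectSketch):
    description = "Quarterly report"


class OwnershipSketch(DataObjectSketch):
    description = "Ownership reports"


# A few rows of the table above, expressed as a lookup from form type to class.
FORM_TO_DATA_OBJECT = {
    "10-K": TenKSketch,
    "10-Q": TenQSketch,
    "3": OwnershipSketch,
    "4": OwnershipSketch,
    "5": OwnershipSketch,
}


@dataclass
class FilingSketch:
    """Hypothetical stand-in for a filing: a form type and an identifier."""

    form: str
    accession_no: str

    def obj(self):
        # Assumption: forms without a mapping yield None in this sketch.
        data_object_class = FORM_TO_DATA_OBJECT.get(self.form)
        return data_object_class(self) if data_object_class else None


annual = FilingSketch(form="10-K", accession_no="demo-0001").obj()
insider = FilingSketch(form="4", accession_no="demo-0002").obj()
unmapped = FilingSketch(form="S-1", accession_no="demo-0003").obj()

print(type(annual).__name__, "-", annual.description)
print(type(insider).__name__, "-", insider.description)
print(unmapped)
```

The caller always writes the same obj() call. The form type alone decides which kind of data object comes back.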

Let's see this in practice with a 10-K filing. For a 10-K filing, you can simply call .obj() and get a TenK data object. From this you can access the financials or the individual items of the 10-K filing.

from edgar import Company

c = Company("NVDA")
f = c.latest("10-K")
# Create a TenK data object
tenk = f.obj()  # or f.data_object()

You can get the income statement from the TenK data object:

tenk.financials.income

Looking Forward

The shift from files to data objects represents a broader trend in software development: moving from storage-centric to semantic-centric models. Just as databases helped us move beyond thinking about data as files to thinking about it as relations and tables, data objects help us focus on what information means rather than how it's stored.

For SEC filings, this abstraction shift is particularly powerful because it aligns with how analysts and investors actually think about company disclosures. They care about the information being disclosed, not the files that contain it. By modeling filings as data objects, we create tools that better match how people think about and use this information.