Object: Document Objects
- Purpose:
This module provides the implementation for the project’s
Document object(s). These objects contain a document’s page contents as well as any metadata associated with the document.
Important
The
Document class is a docp-based implementation of LangChain’s Document object, written to reduce library dependencies and to give us the flexibility to configure the object as needed. However, this object must be (and remain) compatible with LangChain’s text splitters and Chroma objects, as it is passed directly into them.
- Platform:
Linux/Windows | Python 3.11+
- Developer:
J Berendt
- Email:
- Comments:
n/a
- class Document(page_content: str, *, metadata: dict = None)
Bases:
object
Object used to store a document’s content and metadata.
- Parameters:
page_content (str) – A single string containing a page’s text content.
metadata (dict, optional) – Any metadata to be associated with the document. Defaults to None.
- property metadata: dict
Accessor to a document’s metadata.
- property page_content: str
Accessor to a document’s page contents as a single string.
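As an illustration of the interface documented above, the following is a minimal sketch, not the project's actual source. It assumes the two values are held in private attributes and exposed through read-only properties, and that a metadata argument of None is stored as an empty dict; both are assumptions made for this example. The split_on_blank_lines helper is hypothetical and stands in for any consumer that relies only on the page_content and metadata attributes.

```python
class Document:
    """Illustrative stand-in for the documented Document class (not the real source)."""

    def __init__(self, page_content: str, *, metadata: dict = None):
        self._page_content = page_content
        # Assumption: a missing metadata argument is stored as an empty dict.
        self._metadata = metadata if metadata is not None else {}

    @property
    def metadata(self) -> dict:
        """Accessor to a document's metadata."""
        return self._metadata

    @property
    def page_content(self) -> str:
        """Accessor to a document's page contents as a single string."""
        return self._page_content


def split_on_blank_lines(doc):
    """Hypothetical consumer: reads only .page_content and .metadata."""
    return [
        Document(chunk, metadata=dict(doc.metadata))
        for chunk in doc.page_content.split("\n\n")
        if chunk.strip()
    ]


doc = Document(
    "First paragraph.\n\nSecond paragraph.",
    metadata={"source": "example.pdf"},
)
chunks = split_on_blank_lines(doc)
```

Because metadata is keyword-only in the documented signature, a call such as Document("text", {"source": "x"}) raises a TypeError; the metadata must be passed by name.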