OfficeIMO

API Reference

Class

ReaderChunk

Namespace OfficeIMO.Reader
Assembly OfficeIMO.Reader
Modifiers sealed

A normalized extraction chunk produced by DocumentReader.

Inheritance

  • Object
  • ReaderChunk

Properties

public String Id { get; set; }

Stable, ASCII-only identifier for this chunk.

public ReaderInputKind Kind { get; set; }

The kind of input that produced this chunk.

public ReaderLocation Location { get; set; }

Source location information for citations and debugging.

public String SourceId { get; set; }

Stable identifier for the source document. For file-based reads this is deterministic for a given normalized path.

public String SourceHash { get; set; }

Optional content hash for the source document (for incremental upserts).

public String ChunkHash { get; set; }

Optional content hash for this chunk (for incremental upserts).
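The following is a minimal sketch of how a chunk hash could drive an incremental upsert. It is not part of OfficeIMO: the `IncrementalUpsert` class, the `SelectChanged` method, and the `storedHashes` dictionary (a caller-maintained map from chunk id to the last hash written to the store) are hypothetical names introduced for illustration. The method is generic and takes selector delegates, so it compiles without the library and reads only documented properties when used with ReaderChunk.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of OfficeIMO.
public static class IncrementalUpsert
{
    // Returns the items that need to be written again: those whose hash is
    // missing from storedHashes or differs from the stored value. Items with
    // a null or empty hash are always returned, because the hash is optional
    // and an absent hash cannot prove that the content is unchanged.
    // Assumption: getId never returns null (a null key would throw).
    public static List<T> SelectChanged<T>(
        IEnumerable<T> chunks,
        Func<T, string> getId,
        Func<T, string> getHash,
        IReadOnlyDictionary<string, string> storedHashes)
    {
        var changed = new List<T>();
        foreach (var chunk in chunks)
        {
            string hash = getHash(chunk);
            bool unchanged = !string.IsNullOrEmpty(hash)
                && storedHashes.TryGetValue(getId(chunk), out var previous)
                && string.Equals(previous, hash, StringComparison.Ordinal);
            if (!unchanged)
            {
                changed.Add(chunk);
            }
        }
        return changed;
    }
}
```

With a sequence of ReaderChunk instances, the call would look like `IncrementalUpsert.SelectChanged(chunks, c => c.Id, c => c.ChunkHash, storedHashes)`. The same pattern could be applied one level up with SourceId and SourceHash to skip unchanged documents entirely.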

public Nullable<DateTime> SourceLastWriteUtc { get; set; }

Optional source last-write timestamp (UTC) when available.

public Nullable<Int64> SourceLengthBytes { get; set; }

Optional source length in bytes when available.

public Nullable<Int32> TokenEstimate { get; set; }

Estimated token count (best-effort heuristic) for prompt budgeting.
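As an illustration of prompt budgeting, the sketch below keeps chunks in order until a token budget would be exceeded. It is not part of OfficeIMO: `PromptBudget`, `TakeWithinBudget`, and the `missingEstimate` parameter are hypothetical names, and charging a fixed cost to chunks without an estimate is an assumption made here, since the property is nullable. Because the estimate is a best-effort heuristic, it may be sensible to leave some headroom in `maxTokens`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of OfficeIMO.
public static class PromptBudget
{
    // Takes items in their original order while the running total of
    // estimated tokens stays within maxTokens, and stops at the first item
    // that does not fit. Items whose estimate is null are charged
    // missingEstimate tokens (an assumption of this sketch).
    public static List<T> TakeWithinBudget<T>(
        IEnumerable<T> chunks,
        Func<T, int?> getTokenEstimate,
        int maxTokens,
        int missingEstimate = 0)
    {
        var selected = new List<T>();
        long total = 0;
        foreach (var chunk in chunks)
        {
            int cost = getTokenEstimate(chunk) ?? missingEstimate;
            if (total + cost > maxTokens)
            {
                break; // Keep the selection contiguous.
            }
            total += cost;
            selected.Add(chunk);
        }
        return selected;
    }
}
```

With ReaderChunk instances, the call would look like `PromptBudget.TakeWithinBudget(chunks, c => c.TokenEstimate, maxTokens: 4000, missingEstimate: 200)`, where both numbers are placeholder values.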

public String Text { get; set; }

Plain text representation of the chunk.

public String Markdown { get; set; }

Optional Markdown representation of the chunk.

public IReadOnlyList<ReaderTable> Tables { get; set; }

Optional structured tables extracted from this chunk.

public IReadOnlyList<ReaderVisual> Visuals { get; set; }

Optional structured visual fence metadata extracted from this chunk.

public IReadOnlyList<String> Warnings { get; set; }

Optional warnings about truncation or unsupported content.