ADR-0012: Metadata Semantics and Customization for ProcessedItems

Date: February 8, 2026
Category: Data Architecture
Tags: metadata, data-model

Context

  • ProcessedItems are created during the clarification step (ADR-0008) and assigned to a bucket (decision type).
  • Each ProcessedItem requires metadata to describe its nature (what kind of item), execution constraints (where/when it can be done), and searchable attributes (what it is about).
  • Users need stable, user-modifiable sets of Categories and Contexts, but customization (rename, removal) must not orphan existing items or cause data corruption.
  • We need a clear database schema that links ProcessedItems to metadata definitions while handling user modifications gracefully.
  • Search, filtering, and automation rely on consistent metadata interpretation across items and time.

Decision

Each ProcessedItem carries a 3-tuple of metadata: Category × Contexts × Tags.

  • Category — optional, at most one per item: intrinsic item type that describes structure/handling (e.g., task, project, reference). Defined in a user-customizable categories table with defaults supplied at database initialization.
  • Contexts — optional set, zero or more: externally imposed, binary execution constraints from a stable, user-customizable set (e.g., @home, @office, @computer). Defined in a user-customizable contexts table with defaults supplied at database initialization. Each ProcessedItem-to-Context binding is explicit in a junction table.
  • Tags — optional, many-per-item: free-form descriptive metadata for retrieval/grouping only. Tags are always fully user-defined and unrestricted; they do not affect execution flow.

Metadata Definitions (User-Customizable)

Categories table:

  • id (PK): Unique identifier.
  • name (unique): Human-readable name (e.g., "task", "project", "reference").
  • is_default: Boolean flag indicating if this category is a built-in default. Used to distinguish user-created vs. app-supplied definitions.
  • created_at, updated_at: Timestamps.

Contexts table:

  • id (PK): Unique identifier.
  • name (unique): Human-readable name (e.g., "@home", "@computer").
  • is_default: Boolean flag indicating if this context is a built-in default.
  • created_at, updated_at: Timestamps.
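
As a rough illustration of the two definition tables, the sketch below creates them in an in-memory SQLite database. The column types, the CHECK constraint, the timestamp defaults, and the add_category helper are assumptions made for the example; exact types are deferred to the schema migration ADR.

```python
import sqlite3

# Hypothetical DDL sketch. Column types, defaults, and the CHECK constraint
# are assumptions; the ADR defers exact types to a schema migration ADR.
DEFINITION_TABLES_DDL = """
CREATE TABLE categories (
    id         INTEGER PRIMARY KEY,
    name       TEXT    NOT NULL UNIQUE,
    is_default INTEGER NOT NULL DEFAULT 0 CHECK (is_default IN (0, 1)),
    created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE contexts (
    id         INTEGER PRIMARY KEY,
    name       TEXT    NOT NULL UNIQUE,
    is_default INTEGER NOT NULL DEFAULT 0 CHECK (is_default IN (0, 1)),
    created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def add_category(conn, name):
    """Insert a user-defined category; return False if the name is taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO categories (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on name rejected a duplicate.
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(DEFINITION_TABLES_DDL)
first_insert = add_category(conn, "task")
second_insert = add_category(conn, "task")
```

Here the second insert is refused by the unique constraint on name, which is what keeps definition names unambiguous for users.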

Default Metadata Sets

The application ships with opinionated defaults:

Default Categories:

  • task (executable, single-step item)
  • project (multi-step, coordinated group of tasks)
  • idea (future-facing capture, not yet actionable)
  • decision (choice pending resolution)
  • problem (something to solve)
  • commitment (promise or obligation)

Default Contexts:

  • @home (available when at home)
  • @office (available when at office/workplace)
  • @computer (requires computer/laptop access)
  • @phone (executable on mobile device)
  • @with-person (requires collaboration or presence of specific person)

These defaults are marked is_default = true at initialization and are guidance, not prescription. Users may rename, remove, or add custom definitions.
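
A minimal sketch of seeding these defaults at database initialization follows. The seed_defaults function and the reduced table layout are hypothetical; only the default names and the is_default flag come from this ADR.

```python
import sqlite3

DEFAULT_CATEGORIES = ["task", "project", "idea", "decision", "problem", "commitment"]
DEFAULT_CONTEXTS = ["@home", "@office", "@computer", "@phone", "@with-person"]


def seed_defaults(conn):
    """Create the definition tables if needed and insert the shipped defaults.

    Intended to run once at initialization (an assumption of this sketch).
    INSERT OR IGNORE only guards against re-inserting an existing name.
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS categories (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            is_default INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS contexts (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            is_default INTEGER NOT NULL DEFAULT 0
        );
    """)
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO categories (name, is_default) VALUES (?, 1)",
            [(name,) for name in DEFAULT_CATEGORIES],
        )
        conn.executemany(
            "INSERT OR IGNORE INTO contexts (name, is_default) VALUES (?, 1)",
            [(name,) for name in DEFAULT_CONTEXTS],
        )


conn = sqlite3.connect(":memory:")
seed_defaults(conn)

# A user-defined entry keeps is_default = 0 (false).
with conn:
    conn.execute("INSERT INTO contexts (name) VALUES ('@errands')")

category_count = conn.execute("SELECT COUNT(*) FROM categories").fetchone()[0]
default_context_count = conn.execute(
    "SELECT COUNT(*) FROM contexts WHERE is_default = 1"
).fetchone()[0]
```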

ProcessedItem Metadata Binding

processed_items_categories junction/bridge:

  • processed_item_id (FK → processed_items)
  • category_id (FK → categories)
  • Database constraint: at most one row per processed_item_id (for example, a unique constraint on that column), enforcing "at most one category".

processed_items_contexts junction/bridge:

  • processed_item_id (FK → processed_items)
  • context_id (FK → contexts)
  • Database constraint: no duplicate (processed_item_id, context_id) pairs.

Tags are stored inline as a denormalized comma-separated or JSON string in the processed_items.tags column (a separate junction table may replace this if future filtering requires it). Tags are fully user-defined with no validation.
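
The binding rules above can be sketched with SQLite as follows. This is one possible encoding, not the final schema: using processed_item_id as the primary key of processed_items_categories, the composite primary key on processed_items_contexts, the JSON tag format, and the violates_constraint helper are all assumptions of the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE processed_items (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    tags  TEXT NOT NULL DEFAULT '[]'      -- inline JSON array of free-form tags
);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE contexts   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE processed_items_categories (
    -- Primary key on the item id allows at most one category row per item.
    processed_item_id INTEGER PRIMARY KEY REFERENCES processed_items(id),
    category_id       INTEGER NOT NULL REFERENCES categories(id)
);
CREATE TABLE processed_items_contexts (
    processed_item_id INTEGER NOT NULL REFERENCES processed_items(id),
    context_id        INTEGER NOT NULL REFERENCES contexts(id),
    -- Composite key forbids duplicate (item, context) pairs.
    PRIMARY KEY (processed_item_id, context_id)
);
INSERT INTO categories (id, name) VALUES (1, 'task'), (2, 'project');
INSERT INTO contexts (id, name) VALUES (1, '@home');
""")


def violates_constraint(conn, sql, params):
    """Run one statement; return True if the database rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


with conn:
    conn.execute(
        "INSERT INTO processed_items (id, title, tags) VALUES (1, 'Fix sink', ?)",
        (json.dumps(["urgent", "personal"]),),
    )

ADD_CATEGORY = "INSERT INTO processed_items_categories VALUES (?, ?)"
ADD_CONTEXT = "INSERT INTO processed_items_contexts VALUES (?, ?)"

first_category_rejected = violates_constraint(conn, ADD_CATEGORY, (1, 1))
second_category_rejected = violates_constraint(conn, ADD_CATEGORY, (1, 2))
first_context_rejected = violates_constraint(conn, ADD_CONTEXT, (1, 1))
duplicate_context_rejected = violates_constraint(conn, ADD_CONTEXT, (1, 1))
unknown_context_rejected = violates_constraint(conn, ADD_CONTEXT, (1, 99))
stored_tags = json.loads(
    conn.execute("SELECT tags FROM processed_items WHERE id = 1").fetchone()[0]
)
```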

User Modification Behavior

Renaming a Category or Context:

  • Update the name column in the respective table.
  • All ProcessedItems referencing that definition automatically show the new name, because bindings store the definition's id rather than its name.
  • No orphaning; existing bindings remain valid.
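
To illustrate why a rename needs no changes to the bindings, here is a small sketch. The rename_category and category_name_of helpers and the sample row are hypothetical, and the tables are reduced to the columns the example needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE processed_items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE processed_items_categories (
    processed_item_id INTEGER PRIMARY KEY REFERENCES processed_items(id),
    category_id       INTEGER NOT NULL REFERENCES categories(id)
);
INSERT INTO categories (id, name) VALUES (1, 'task');
INSERT INTO processed_items (id, title) VALUES (10, 'Renew passport');
INSERT INTO processed_items_categories VALUES (10, 1);
""")


def category_name_of(conn, item_id):
    """Resolve an item's category name through the junction table."""
    row = conn.execute(
        """
        SELECT c.name
        FROM processed_items_categories AS pic
        JOIN categories AS c ON c.id = pic.category_id
        WHERE pic.processed_item_id = ?
        """,
        (item_id,),
    ).fetchone()
    return row[0] if row else None


def rename_category(conn, category_id, new_name):
    """Rename touches only the definition row; bindings reference the id."""
    with conn:
        conn.execute(
            "UPDATE categories SET name = ? WHERE id = ?", (new_name, category_id)
        )


name_before = category_name_of(conn, 10)
rename_category(conn, 1, "action")
name_after = category_name_of(conn, 10)
binding_rows = conn.execute(
    "SELECT processed_item_id, category_id FROM processed_items_categories"
).fetchall()
```

The junction row is untouched by the rename; only the joined name changes.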

Removing a Category:

  • If is_default = true, deletion is discouraged but allowed. UI should warn users before allowing default category removal.
  • Delete rows from processed_items_categories where category_id matches the removed category.
  • Affected ProcessedItems revert to having no category (the processed_items_categories junction row is removed, not the ProcessedItem itself).
  • Rationale: ProcessedItems are immutable work artifacts; removing metadata should not delete them.

Removing a Context:

  • If is_default = true, deletion is discouraged but allowed. UI should warn users before allowing default context removal.
  • Delete rows from processed_items_contexts where context_id matches the removed context.
  • Affected ProcessedItems no longer have that context constraint; they remain otherwise unchanged.
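
The two removal procedures above might be sketched as follows. The remove_category and remove_context helpers are hypothetical, and returning the number of affected bindings is an assumption added so a caller could show how many items will be affected.

```python
import sqlite3

SCHEMA = """
CREATE TABLE processed_items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE contexts   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE processed_items_categories (
    processed_item_id INTEGER PRIMARY KEY REFERENCES processed_items(id),
    category_id       INTEGER NOT NULL REFERENCES categories(id)
);
CREATE TABLE processed_items_contexts (
    processed_item_id INTEGER NOT NULL REFERENCES processed_items(id),
    context_id        INTEGER NOT NULL REFERENCES contexts(id),
    PRIMARY KEY (processed_item_id, context_id)
);
"""


def remove_category(conn, category_id):
    """Delete bindings, then the definition, in one transaction.

    Returns the number of items that lost the category.
    """
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_items_categories WHERE category_id = ?",
            (category_id,),
        )
        conn.execute("DELETE FROM categories WHERE id = ?", (category_id,))
    return cur.rowcount


def remove_context(conn, context_id):
    """Same pattern for contexts; ProcessedItems themselves are never deleted."""
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_items_contexts WHERE context_id = ?",
            (context_id,),
        )
        conn.execute("DELETE FROM contexts WHERE id = ?", (context_id,))
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.executescript("""
INSERT INTO processed_items (id, title)
    VALUES (1, 'Sketch logo'), (2, 'Draft pitch'), (3, 'Book venue');
INSERT INTO categories (id, name) VALUES (1, 'idea');
INSERT INTO contexts (id, name) VALUES (1, '@office');
INSERT INTO processed_items_categories VALUES (1, 1), (2, 1);
INSERT INTO processed_items_contexts VALUES (1, 1), (3, 1);
""")

items_losing_category = remove_category(conn, 1)
items_losing_context = remove_context(conn, 1)
items_remaining = conn.execute("SELECT COUNT(*) FROM processed_items").fetchone()[0]
```

Deleting the bindings before the definition keeps the sequence valid even when foreign key enforcement is on.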

Adding or Customizing Categories and Contexts:

  • New user-defined entries are added to the respective tables with is_default = false.
  • No impact on existing ProcessedItems unless explicitly assigned.

Semantic Rules

  • Cardinality: Category is optional and single-valued per item; Contexts are optional and multi-valued.
  • Execution Logic: Contexts directly inform execution decisions (e.g., filtering "what can I do right now" by available contexts). Tags do not.
  • Stability and Trustworthiness: Categories and Contexts should be small, stable sets (e.g., 3–10 entries each). Users learn and rely on them; frequent changes reduce effectiveness.
  • Tags vs. Contexts: Contexts represent externally imposed, binary conditions (I am/am not @home). Tags are flexible metadata for retrieval and grouping (e.g., "urgent", "personal", "learning").
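
The execution rule can be illustrated with a small filter. This ADR does not say how multiple contexts on one item combine, so the sketch assumes an item is actionable only when all of its contexts are currently available, and that an item with no contexts is always actionable. The actionable_items function and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def actionable_items(item_contexts: Dict[str, Set[str]], available: Set[str]) -> List[str]:
    """Return items whose required contexts are all currently available.

    Assumption: contexts on one item combine with AND. Tags play no part here.
    """
    return sorted(
        title
        for title, required in item_contexts.items()
        if required <= available  # subset test; the empty set always passes
    )


item_contexts = {
    "Water plants": {"@home"},
    "File expense report": {"@office", "@computer"},
    "Call plumber": {"@home", "@phone"},
    "Think about holiday": set(),
}

doable_now = actionable_items(item_contexts, available={"@home", "@phone"})
```

With @home and @phone available, the office item is filtered out and the item without contexts remains.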

Scope boundaries:

  • This ADR defines metadata semantics, cardinality constraints, and user customization mechanics.
  • Database schema details (exact column types, indices) are deferred to schema migration ADRs.
  • UI/UX surfaces for managing metadata are out of scope; only behavior is specified.
  • Sync/export of metadata (if implemented later) must preserve these semantics and handle customization conflicts.

What is explicitly not covered:

  • Specific UI widgets or interaction design for metadata assignment.
  • Automation rules that combine metadata dimensions.
  • Migration of legacy data from other systems.

Rationale

  • Clear data model for ProcessedItems: Each item lands in a processing bucket (ADR-0008) and must be described. Metadata (Category, Contexts, Tags) describes that item. States denote lifecycle, are orthogonal to item type, and should be handled separately.
  • Stable, trustworthy foundation for execution: Contexts represent real, binary conditions that users will learn and rely on. A small, stable set (3–10 entries) is more useful and maintainable than unbounded free-form metadata for execution decisions.
  • User control without data loss: Allowing users to rename, add, or remove Categories and Contexts must not delete their work. Junction tables with foreign keys ensure that a change to a metadata definition automatically propagates to all items; deleting a definition removes only the definition and its bindings, never the ProcessedItems themselves.
  • Separating retrieval from execution: Tags are free-form and used for search and grouping; Contexts are structured and used for execution decisions. This separation prevents tag pollution (e.g., using "high-priority" as both a tag and a context) and keeps execution logic predictable.
  • Database integrity: Explicit junction tables (processed_items_categories, processed_items_contexts) make cardinality constraints enforceable via unique/primary keys, preventing accidental duplicate assignments or multiple categories per item.
  • Defaults + customization: Shipping with opinionated defaults (marked is_default = true) reduces initial configuration friction. Users can customize freely while the system continues to make reasonable assumptions about new items and filtering.

Consequences

Positive effects:

  • Consistent filtering and retrieval: Queries like "show me all tasks with @home context" are reliable and performant; the junction tables make it straightforward to filter by any combination of metadata.
  • Graceful metadata customization: Users can rename Categories and Contexts without orphaning data. Removing a definition, even one still assigned to items, does not delete or corrupt ProcessedItems.
  • Clear ProcessedItem identity: Each item has an explicit, queryable semantic type (category) and execution constraints (contexts). Downstream features (UI lists, automation rules, reflection) have a stable contract.
  • Reduced initial cognitive load: Shipping with sensible defaults means users do not start from a blank slate; they can onboard with the suggested categories and contexts, then customize as they learn their own workflow.
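
A query such as "show me all tasks with @home context" could be expressed along these lines. The items_with helper, the table aliases, and the sample rows are illustrative assumptions, and the tables are reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE processed_items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE contexts   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE processed_items_categories (
    processed_item_id INTEGER PRIMARY KEY REFERENCES processed_items(id),
    category_id       INTEGER NOT NULL REFERENCES categories(id)
);
CREATE TABLE processed_items_contexts (
    processed_item_id INTEGER NOT NULL REFERENCES processed_items(id),
    context_id        INTEGER NOT NULL REFERENCES contexts(id),
    PRIMARY KEY (processed_item_id, context_id)
);
INSERT INTO categories (id, name) VALUES (1, 'task'), (2, 'project');
INSERT INTO contexts (id, name) VALUES (1, '@home'), (2, '@office'), (3, '@phone');
INSERT INTO processed_items (id, title) VALUES
    (1, 'Water plants'), (2, 'Plan garden'), (3, 'File report'), (4, 'Call plumber');
INSERT INTO processed_items_categories VALUES (1, 1), (2, 2), (3, 1), (4, 1);
INSERT INTO processed_items_contexts VALUES (1, 1), (2, 1), (3, 2), (4, 1), (4, 3);
""")


def items_with(conn, category, context):
    """Titles of items bound to the given category and the given context."""
    rows = conn.execute(
        """
        SELECT item.title
        FROM processed_items AS item
        JOIN processed_items_categories AS ic ON ic.processed_item_id = item.id
        JOIN categories AS cat ON cat.id = ic.category_id
        JOIN processed_items_contexts AS ix ON ix.processed_item_id = item.id
        JOIN contexts AS ctx ON ctx.id = ix.context_id
        WHERE cat.name = ? AND ctx.name = ?
        ORDER BY item.title
        """,
        (category, context),
    ).fetchall()
    return [title for (title,) in rows]


home_tasks = items_with(conn, "task", "@home")
```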

Known downsides:

  • Junction table overhead: Storing metadata via junction tables requires more schema and multi-table queries than denormalized storage. Mitigated by using database views or application-level caching for common queries.
  • Modification safety vs. flexibility trade-off: Discouraging removal of default categories/contexts may feel overly cautious to some users. Addressed by allowing deletion but warning in the UI.
  • Edge cases in customization: Removing a category that is still assigned to items is mechanically straightforward, but UX messaging must be clear. For example, a removal dialog should show "5 items will lose this category" to confirm the user's intent.
  • Tag storage format choice: Tags are denormalized (string or JSON) rather than in a separate table. A future ADR may revisit this if filtering by individual tags becomes critical.

Follow-up work:

  • Schema migration ADR to define exact column types, indexes, and constraints.
  • UI guidelines for safely presenting metadata customization (add, rename, remove dialogs).
  • Query helper/DAO patterns to encapsulate common filters (e.g., "tasks with available contexts").
  • Sync/export specification to preserve metadata definitions and bindings if cloud sync is added.

Alternatives Considered

  • Multiple Categories per item (many-to-many): Rejected. One category per item is intentional and simpler. The processed_items_categories table is kept, but a uniqueness constraint on processed_item_id limits each item to at most one category.
  • Storing Contexts inline as a list/array column: Rejected — explicit junction table makes cardinality clear, enables efficient queries, and is easier to migrate if schema changes.
  • Embedding metadata definitions in code constants: Rejected — users need to customize. Runtime database tables allow users to add, rename, or remove without recompiling or app updates.
  • Tags as a separate table with junction rows: Deferred — tags are currently free-form and unindexed. A future ADR may add a tags table if filtering by individual tags becomes a core use case.
  • Merging Categories and Contexts into a single "metadata" dimension: Rejected — they have different semantics and cardinality (single vs. multi-valued) and different stability requirements. Conflating them would muddy execution logic.

Notes

  • Default Removal Safety: The is_default flag allows the UI to warn users before deleting defaults, but deletion is allowed. This respects user autonomy while providing guardrails.
  • Category vs. Context Confusion: In onboarding or UI copy, emphasize: Categories answer "what is this?"; Contexts answer "when/where can I do this?".
  • Performance Consideration: Queries like "items with category X and context Y" look up junction rows by category_id and context_id. For large datasets (10k+ items), consider application-level caching of metadata definitions and indexes that lead with the definition id, such as (category_id, processed_item_id) and (context_id, processed_item_id).
  • Soft Deletes: A future phase may implement soft deletion of metadata (adding a deleted_at timestamp) to support undo or audit trails.
  • Revisit Conditions: This ADR should be revisited if:
    • Users consistently customize metadata in ways the defaults don't support (signals a new default should be added).
    • Queries become slow due to junction table scans (triggers caching or indexing ADR).
    • Sync/offline scenarios require handling of conflicting metadata definitions across devices.
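
For the performance note above, one possible indexing sketch is shown below. It assumes the junction tables are keyed by processed_item_id, which already serves lookups by item, and adds indexes that lead with the definition id to serve lookups by category or context. The index names are hypothetical, and whether these indexes are needed should be confirmed by measurement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE processed_items_categories (
    processed_item_id INTEGER PRIMARY KEY,
    category_id       INTEGER NOT NULL
);
CREATE TABLE processed_items_contexts (
    processed_item_id INTEGER NOT NULL,
    context_id        INTEGER NOT NULL,
    PRIMARY KEY (processed_item_id, context_id)
);
-- Hypothetical index names. Each index leads with the definition id so that
-- "all items in category X" or "all items with context Y" avoids a full scan.
CREATE INDEX idx_item_categories_by_category
    ON processed_items_categories (category_id);
CREATE INDEX idx_item_contexts_by_context
    ON processed_items_contexts (context_id, processed_item_id);
""")

index_names = sorted(
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx%'"
    )
)
```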