n8n Smart Memory node: Conversational Memory Management System
Overview
Smart Memory is a sophisticated memory management system designed for conversational AI applications, particularly chatbots. It provides structured storage and retrieval of conversation histories, enabling language models to maintain context across multiple interactions while optimizing memory usage through intelligent windowing and organization strategies.
Conceptual Architecture
Core Purpose
The system addresses a fundamental challenge in conversational AI: maintaining coherent, contextually aware conversations across multiple messages while managing memory constraints. It transforms raw conversational data into a structured format optimized for language model consumption, handling the complexity of different chat types (private conversations vs. group discussions) and maintaining relevant metadata.
High-Level Design Philosophy
The architecture follows these key principles:
- Chat-Centric Organization: Memory is organized by conversation (chat) rather than by user, reflecting natural conversation boundaries
- Adaptive Structure: Different chat types (private, group, supergroup, channel) receive appropriate memory structures
- Sliding Window Strategy: Maintains recent context while preventing unbounded memory growth
- Storage Abstraction: Separates memory logic from storage implementation, enabling different backends
- LLM-Ready Output: Structures data in formats that language models can directly consume
How Smart Memory Functions
Memory Structure Design
The system organizes conversational memory into a hierarchical structure (a TypeScript sketch follows the outline below):
Chat-Level Information:
- Unique chat identifier (for memory retrieval)
- Chat type classification (private, group, supergroup, channel)
- Chat name/title (for context)
User Information Strategy:
- Private Chats: User information stored once at the chat level (avoiding redundancy since only two participants exist)
- Group Chats: User information attached to each message (necessary since multiple users participate)
Message Collection:
- Ordered sequence of conversation turns
- Each message contains:
- Role identifier (user or assistant)
- Message content
- Optional timestamp (when temporal context matters)
- Optional user information (for group contexts)
- Optional reply context (tracking conversational threads)
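The hierarchy above can be modeled as a small set of TypeScript interfaces. These names (ChatMemory, MemoryMessage, and so on) are illustrative assumptions for this documentation, not the node's actual internal schema:

```typescript
type ChatType = 'private' | 'group' | 'supergroup' | 'channel';

interface UserInfo {
  displayName: string;        // e.g. composed from first_name + last_name
  username?: string;
}

interface ReplyContext {
  user?: UserInfo;            // who was replied to
  content: string;            // what was said
}

interface MemoryMessage {
  role: string;               // e.g. 'user' or 'assistant'; identifiers are configurable
  content: string;
  timestamp?: string;         // only when temporal tracking is enabled
  user?: UserInfo;            // only in group contexts
  replyTo?: ReplyContext;     // only for replies
}

interface ChatMemory {
  chatId: string;             // unique identifier, used as the storage key
  chatType: ChatType;
  chatName?: string;
  user?: UserInfo;            // private chats only: stored once at chat level
  messages: MemoryMessage[];  // ordered, windowed conversation turns
}
```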
Processing Pipeline
When new conversational data arrives, the system executes the following sequence:
1. Input Validation Phase
The system validates incoming data to ensure:
- Chat identifier exists and is valid
- Message content is present and non-empty
- User information is complete (when required)
- Data types match expected formats
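A minimal sketch of this fail-fast boundary validation, assuming a standalone helper (the function name and error wording are hypothetical):

```typescript
// Hypothetical fail-fast validation; the node's actual checks may differ.
function validateInput(chatId: unknown, content: unknown): void {
  if (typeof chatId !== 'string' || chatId.trim() === '') {
    throw new Error('Smart Memory: chat identifier is missing or invalid');
  }
  if (typeof content !== 'string' || content.trim() === '') {
    throw new Error('Smart Memory: message content is missing or empty');
  }
}
```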
2. Memory Retrieval Phase
Based on the chat identifier, the system:
- Queries the storage layer for existing conversation memory
- Returns null if this is the first interaction for this chat
- Otherwise, returns the complete memory structure
3. Message Processing Phase
For New Conversations:
- Creates a fresh memory structure
- Initializes chat metadata (ID, type, name)
- For private chats: extracts and stores user information at the chat level
- Prepares an empty message collection
For Existing Conversations:
- Retrieves current memory state
- Validates consistency of chat metadata
- Prepares to append the new message
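Using the illustrative types from the earlier sketch, the new-versus-existing branch might look like this (the storage shape and helper name are assumptions):

```typescript
// Sketch of the new-vs-existing branch, using the types sketched earlier.
async function loadOrCreateMemory(
  storage: { get(chatId: string): Promise<ChatMemory | null> },
  chatId: string,
  chatType: ChatType,
  chatName?: string,
  user?: UserInfo,
): Promise<ChatMemory> {
  const existing = await storage.get(chatId);
  if (existing) return existing; // existing conversation: append to this structure
  return {
    chatId,
    chatType,
    chatName,
    // Private chats store user info once at the chat level.
    user: chatType === 'private' ? user : undefined,
    messages: [], // empty collection, ready for the first message
  };
}
```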
4. Message Construction Phase
The system builds a structured message object:
Role Assignment:
- Determines if the message is from the user or the assistant
- Uses configurable role identifiers (allowing customization)
Content Extraction:
- Extracts text content from various possible fields
- Handles different message types (text, captions, bot responses)
- Preserves original content without modification
Metadata Attachment:
- Conditionally adds timestamp (if temporal tracking is enabled)
- For group chats: attaches user identification (display name, username)
- For replies: captures reply context (who was replied to, what was said)
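A sketch of this conditional assembly (option names are hypothetical):

```typescript
// Builds a message, attaching optional metadata only when enabled/relevant.
function buildMessage(
  role: string,
  content: string,
  user: UserInfo | undefined,
  replyTo: ReplyContext | undefined,
  opts: { includeTimestamp: boolean; isGroupChat: boolean },
): MemoryMessage {
  return {
    role,
    content, // preserved as-is, without modification
    ...(opts.includeTimestamp && { timestamp: new Date().toISOString() }),
    ...(opts.isGroupChat && user && { user }), // group chats only
    ...(replyTo && { replyTo }),
  };
}
```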
5. Reply Context Processing
When messages are replies to previous messages:
- Extracts information about the original message
- Identifies the user being replied to
- Captures the content of the replied-to message
- Structures this as contextual metadata attached to the new message
This enables language models to understand conversational threading without requiring the entire conversation history to be repeatedly processed.
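For example, with a Telegram-style payload, extraction might look like the following sketch (the reply_to_message shape here is an assumption for illustration):

```typescript
interface IncomingMessage {
  text?: string;
  caption?: string;
  reply_to_message?: {
    text?: string;
    caption?: string;
    from?: { first_name?: string; last_name?: string; username?: string };
  };
}

// Extracts reply context, or undefined when the message is not a reply.
function extractReplyContext(msg: IncomingMessage): ReplyContext | undefined {
  const replied = msg.reply_to_message;
  if (!replied) return undefined;
  const from = replied.from;
  return {
    user: from && {
      displayName: [from.first_name, from.last_name].filter(Boolean).join(' '),
      username: from.username,
    },
    content: replied.text ?? replied.caption ?? '',
  };
}
```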
6. Memory Windowing Phase
After adding the new message, the system applies a sliding window strategy:
Window Size Enforcement:
- Maintains only the most recent N messages
- Older messages are automatically pruned
- Window size is configurable per use case
Rationale:
- Prevents unbounded memory growth
- Focuses language model attention on recent context
- Reduces token consumption in LLM API calls
- Balances context retention with computational efficiency
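The pruning step itself is small; conceptually it reduces to keeping the last N entries, with windowSize as the configurable limit:

```typescript
// Keep only the most recent `windowSize` messages.
function applyWindow(messages: MemoryMessage[], windowSize: number): MemoryMessage[] {
  return messages.length > windowSize ? messages.slice(-windowSize) : messages;
}
```

Because slice with a negative index counts from the end of the array, pruning is a single linear-time copy of the retained messages.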
7. Storage Persistence Phase
The updated memory structure is persisted to the storage backend:
- Uses the chat identifier as the storage key
- Completely replaces previous memory state
- Storage operation is atomic (succeeds or fails completely)
8. Output Formatting Phase
The system prepares output optimized for downstream consumption:
- Structures data for direct LLM prompt injection
- Optionally estimates token count (for API planning)
- Removes empty/null fields (reducing payload size)
- Maintains consistent JSON structure
Operation Modes
The system supports three primary operations:
Insert Operation
Adds new messages to conversation memory with two modes:
Append Mode:
- Adds the new message to the existing message sequence
- Preserves all previous messages (within the window)
- Standard operation for ongoing conversations
Replace Mode:
- Discards existing memory completely
- Creates fresh memory with just the new message
- Useful for conversation resets or context clearing
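Conceptually, the two modes differ only in whether the existing message sequence is kept. A sketch, reusing the windowing helper above (names are illustrative):

```typescript
function insertMessage(
  memory: ChatMemory,
  message: MemoryMessage,
  mode: 'append' | 'replace',
  windowSize: number,
): ChatMemory {
  const messages =
    mode === 'replace'
      ? [message] // discard history, start fresh with just the new message
      : applyWindow([...memory.messages, message], windowSize);
  return { ...memory, messages };
}
```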
Get Operation
Retrieves the current memory state for a conversation:
- Returns complete memory structure
- Includes all messages within the window
- Provides metadata for context understanding
Clear Operation
Removes all memory for a conversation:
- Deletes the entire memory structure
- Used for privacy compliance or session termination
- Irreversible operation
Field Mapping and Expression Evaluation
The system handles diverse input formats through configurable field mapping:
Dynamic Field Resolution:
- Maps generic field identifiers to actual data locations
- Evaluates simple expressions to extract nested data
- Constructs composite fields (e.g., full name from first_name + last_name)
Example Transformations:
- Combining separate name fields into a single display name
- Extracting optional fields with fallback defaults
- Handling missing data gracefully
This abstraction allows the system to work with various messaging platforms without hardcoding platform-specific structures.
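A sketch of the composite-field case from the examples above, assuming Telegram-style field names (first_name, last_name, username):

```typescript
interface RawUser {
  first_name?: string;
  last_name?: string;
  username?: string;
}

// Composes a display name with graceful fallbacks for missing data.
function mapUser(raw: RawUser): UserInfo {
  const displayName =
    [raw.first_name, raw.last_name].filter(Boolean).join(' ') ||
    raw.username ||   // fall back to the handle when name fields are absent
    'Unknown user';   // last-resort default
  return { displayName, username: raw.username };
}
```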
Chat Type Differentiation
The system optimizes memory structure based on conversation context:
Private Conversations:
- Single user information stored at chat level
- No need to identify message authors (only two participants)
- More compact memory structure
- Reduced redundancy
Group Conversations:
- User information attached to each message
- Essential for multi-participant context
- Enables "who said what" tracking
- Supports natural language references ("as John mentioned...")
This adaptive approach reduces memory overhead while maintaining necessary context.
Token Estimation
For LLM API planning, the system provides token estimation:
Estimation Method:
- Converts entire memory structure to JSON string
- Applies a heuristic ratio (approximately 4 characters per token)
- Returns a conservative estimate
Use Cases:
- Pre-flight checks before LLM API calls
- Monitoring memory size growth
- Implementing dynamic windowing based on token limits
- Cost estimation for API usage
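The heuristic itself fits in a few lines (a sketch of the documented 4-characters-per-token ratio):

```typescript
// Roughly 4 characters per token; Math.ceil rounds up to keep the
// estimate conservative.
function estimateTokens(memory: ChatMemory): number {
  return Math.ceil(JSON.stringify(memory).length / 4);
}
```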
Output Cleaning
Before returning memory structures, the system removes noise:
Recursive Cleaning Process:
- Removes undefined and null values
- Eliminates empty strings
- Filters empty arrays and objects
- Processes nested structures recursively
Benefits:
- Reduces payload size
- Simplifies downstream processing
- Prevents language models from processing empty fields
- Creates cleaner prompt context
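One way to implement such a recursive cleaner (a sketch; the node's actual implementation may differ in details, such as how it treats false or 0):

```typescript
// Recursively strips undefined/null, empty strings, and empty
// arrays/objects; returns undefined when nothing remains.
function clean(value: unknown): unknown {
  if (Array.isArray(value)) {
    const items = value.map(clean).filter((v) => v !== undefined);
    return items.length > 0 ? items : undefined;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      .map(([k, v]) => [k, clean(v)] as const)
      .filter(([, v]) => v !== undefined);
    return entries.length > 0 ? Object.fromEntries(entries) : undefined;
  }
  return value === null || value === '' ? undefined : value;
}
```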
Integration Pattern
The system operates as a stateful service within workflow environments:
Typical Flow:
- Conversational input arrives (user message or bot response)
- Smart Memory processes and stores the message
- System retrieves current memory state
- Memory is formatted into LLM prompt
- LLM generates response using full context
- Bot response is stored back into memory
- Cycle repeats for next interaction
Session Management:
- Each conversation has a unique identifier (chat ID)
- Memory persists across workflow executions
- Storage backend handles persistence layer
- Memory lifecycle is independent of workflow lifecycle
Configuration Philosophy
The system is designed to be configuration-driven rather than code-driven:
Configurable Aspects:
- Memory window size (how much history to retain)
- Storage backend selection (where to persist memory)
- Field mapping rules (how to extract data)
- Feature toggles (timestamps, reply context, token estimation)
- Role identifiers (how to label participants)
This approach enables adapting the system to different use cases without code modifications.
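Taken together, the configurable aspects might be modeled as a single options object. The names and defaults here are hypothetical, not the node's actual parameters:

```typescript
interface SmartMemoryConfig {
  windowSize: number;                         // how much history to retain
  storage: 'in-memory' | 'database';          // where to persist memory
  includeTimestamps: boolean;                 // feature toggle: temporal context
  includeReplyContext: boolean;               // feature toggle: threading
  estimateTokens: boolean;                    // feature toggle: API planning
  roles: { user: string; assistant: string }; // participant labels
}

const exampleConfig: SmartMemoryConfig = {
  windowSize: 15,
  storage: 'in-memory',
  includeTimestamps: false,
  includeReplyContext: true,
  estimateTokens: false,
  roles: { user: 'user', assistant: 'assistant' },
};
```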
Memory Consistency and Isolation
Isolation Guarantees:
- Each chat maintains independent memory
- No cross-conversation contamination
- Concurrent conversations are safely isolated
Consistency Model:
- Last-write-wins for memory updates
- No transaction support (appropriate for conversational context)
- Storage layer responsible for atomicity
Use Case Scenarios
Scenario 1: Customer Support Bot
A user initiates a support conversation:
- First message creates new memory with user context
- Subsequent questions append to conversation history
- Support agent (bot) responses are marked with assistant role
- A sliding window retains the most recent 15 messages
- When resolved, memory can be cleared
Scenario 2: Group Discussion Bot
Multiple users discuss in a group chat:
- Each message includes user identification
- Bot can reference who said what
- Reply context maintains conversation threads
- Window keeps last 20 messages across all users
- Bot maintains coherent multi-user context
Scenario 3: Long-Running Personal Assistant
A user holds an extended conversation with an AI assistant:
- Private chat structure used (compact format)
- Large window size (30-40 messages) for deeper context
- Timestamps enabled for temporal reasoning
- Token estimation prevents API limit issues
- Memory persists across days/weeks
Implementation-Agnostic Concepts
Storage Abstraction Layer
The system defines a storage interface that any backend can implement:
Required Operations:
- Retrieve memory by chat identifier
- Save/update memory for chat identifier
- Delete memory for chat identifier
Supported Backends:
- In-memory storage (volatile, for development/testing)
- Persistent storage solutions (databases, etc.)
The core logic is entirely independent of storage implementation, enabling deployment flexibility.
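The three required operations map naturally onto a small interface, with the volatile in-memory backend as one possible implementation (a sketch; names are illustrative):

```typescript
interface MemoryStorage {
  get(chatId: string): Promise<ChatMemory | null>;
  save(chatId: string, memory: ChatMemory): Promise<void>;
  delete(chatId: string): Promise<void>;
}

// Volatile backend suitable for development and testing.
class InMemoryStorage implements MemoryStorage {
  private store = new Map<string, ChatMemory>();

  async get(chatId: string): Promise<ChatMemory | null> {
    return this.store.get(chatId) ?? null;
  }
  async save(chatId: string, memory: ChatMemory): Promise<void> {
    this.store.set(chatId, memory); // last-write-wins, matching the consistency model
  }
  async delete(chatId: string): Promise<void> {
    this.store.delete(chatId);
  }
}
```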
Validation Strategy
Input validation occurs at boundaries:
- Validates chat identifiers are non-empty
- Ensures message content exists
- Verifies window size is within acceptable range
- Type-checks critical fields
This defensive approach prevents corrupted memory states.
Error Handling Philosophy
The system follows a fail-fast approach:
- Invalid input throws errors immediately
- Storage failures propagate to caller
- No silent data corruption
- Errors include descriptive messages
This makes debugging and monitoring straightforward.
Design Rationale
Why Sliding Windows?
Language models have token limits and attention limitations. By maintaining a sliding window, the system:
- Keeps memory bounded and predictable
- Focuses on recent, relevant context
- Prevents token limit violations
- Improves response quality through focused context
Why Chat-Centric Storage?
Organizing by conversation rather than user:
- Reflects natural conversation boundaries
- Simplifies retrieval (one key lookup)
- Supports both individual and group contexts
- Aligns with how messaging platforms organize data
Why Different Structures for Chat Types?
Private and group chats have fundamentally different characteristics:
- Private: Two participants, no ambiguity about speakers
- Group: Multiple participants, must track "who said what"
Adapting the structure optimizes both memory efficiency and context clarity.
Why Optional Fields?
Not all use cases need all features:
- Timestamps add overhead but enable temporal reasoning
- Reply context increases complexity but improves coherence
- Token estimation has computational cost but aids planning
Optional fields allow tuning the system to specific requirements.
Summary
Smart Memory turns conversational context management into a tractable, well-structured problem through:
- Structured Memory: Organizing conversations into LLM-ready formats
- Intelligent Windowing: Balancing context retention with resource constraints
- Adaptive Architecture: Optimizing structure based on conversation type
- Storage Abstraction: Enabling flexible deployment options
- Configuration-Driven: Adapting to diverse use cases without code changes
The system serves as an intermediary layer between raw conversational data and language model consumption, handling the complexity of context management so that developers can focus on building conversational experiences rather than managing conversation state.