n8n Smart Memory node: Conversational Memory Management System
Overview
Smart Memory is a sophisticated memory management system designed for conversational AI applications, particularly chatbots. It provides structured storage and retrieval of conversation histories, enabling language models to maintain context across multiple interactions while optimizing memory usage through intelligent windowing and organization strategies.
Conceptual Architecture
Core Purpose
The system addresses a fundamental challenge in conversational AI: maintaining coherent, contextually aware conversations across multiple messages while managing memory constraints. It transforms raw conversational data into a structured format optimized for language model consumption, handling the complexity of different chat types (private conversations vs. group discussions) and maintaining relevant metadata.
High-Level Design Philosophy
The architecture follows these key principles:
- Chat-Centric Organization: Memory is organized by conversation (chat) rather than by user, reflecting natural conversation boundaries
- Adaptive Structure: Different chat types (private, group, supergroup, channel) receive appropriate memory structures
- Sliding Window Strategy: Maintains recent context while preventing unbounded memory growth
- Storage Abstraction: Separates memory logic from storage implementation, enabling different backends
- LLM-Ready Output: Structures data in formats that language models can directly consume
How Smart Memory Functions
Memory Structure Design
The system organizes conversational memory into a hierarchical structure (a TypeScript sketch follows the outline below):
Chat-Level Information:
- Unique chat identifier (for memory retrieval)
- Chat type classification (private, group, supergroup, channel)
- Chat name/title (for context)
User Information Strategy:
- Private Chats: User information stored once at the chat level (avoiding redundancy since only two participants exist)
- Group Chats: User information attached to each message (necessary since multiple users participate)
Message Collection:
- Ordered sequence of conversation turns
- Each message contains:
- Role identifier (user or assistant)
- Message content
- Optional timestamp (when temporal context matters)
- Optional user information (for group contexts)
- Optional reply context (tracking conversational threads)
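The hierarchy above can be modeled as a small set of TypeScript interfaces. These names (ChatMemory, MemoryMessage, and so on) are illustrative assumptions for this documentation, not the node's actual internal schema:

```typescript
type ChatType = 'private' | 'group' | 'supergroup' | 'channel';

interface UserInfo {
  displayName: string;        // e.g. composed from first_name + last_name
  username?: string;
}

interface ReplyContext {
  user?: UserInfo;            // who was replied to
  content: string;            // what was said
}

interface MemoryMessage {
  role: string;               // e.g. 'user' or 'assistant'; identifiers are configurable
  content: string;
  timestamp?: string;         // only when temporal tracking is enabled
  user?: UserInfo;            // only in group contexts
  replyTo?: ReplyContext;     // only for replies
}

interface ChatMemory {
  chatId: string;             // unique identifier, used as the storage key
  chatType: ChatType;
  chatName?: string;
  user?: UserInfo;            // private chats only: stored once at chat level
  messages: MemoryMessage[];  // ordered, windowed conversation turns
}
```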
Processing Pipeline
When new conversational data arrives, the system executes the following sequence:
1. Input Validation Phase
The system validates incoming data to ensure:
- Chat identifier exists and is valid
- Message content is present and non-empty
- User information is complete (when required)
- Data types match expected formats
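A minimal sketch of this fail-fast boundary validation, assuming a standalone helper (the function name and error wording are hypothetical):

```typescript
// Hypothetical fail-fast validation; the node's actual checks may differ.
function validateInput(chatId: unknown, content: unknown): void {
  if (typeof chatId !== 'string' || chatId.trim() === '') {
    throw new Error('Smart Memory: chat identifier is missing or invalid');
  }
  if (typeof content !== 'string' || content.trim() === '') {
    throw new Error('Smart Memory: message content is missing or empty');
  }
}
```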
2. Memory Retrieval Phase
Based on the chat identifier, the system:
- Queries the storage layer for existing conversation memory
- Returns null if this is the first interaction for this chat
- Otherwise, returns the complete memory structure
3. Message Processing Phase
For New Conversations:
- Creates a fresh memory structure
- Initializes chat metadata (ID, type, name)
- For private chats: extracts and stores user information at the chat level
- Prepares an empty message collection
For Existing Conversations:
- Retrieves current memory state
- Validates consistency of chat metadata
- Prepares to append the new message
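Using the illustrative types from the earlier sketch, the new-versus-existing branch might look like this (the storage shape and helper name are assumptions):

```typescript
// Sketch of the new-vs-existing branch, using the types sketched earlier.
async function loadOrCreateMemory(
  storage: { get(chatId: string): Promise<ChatMemory | null> },
  chatId: string,
  chatType: ChatType,
  chatName?: string,
  user?: UserInfo,
): Promise<ChatMemory> {
  const existing = await storage.get(chatId);
  if (existing) return existing; // existing conversation: append to this structure
  return {
    chatId,
    chatType,
    chatName,
    // Private chats store user info once at the chat level.
    user: chatType === 'private' ? user : undefined,
    messages: [], // empty collection, ready for the first message
  };
}
```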
4. Message Construction Phase
The system builds a structured message object:
Role Assignment:
- Determines if the message is from the user or the assistant
- Uses configurable role identifiers (allowing customization)
Content Extraction:
- Extracts text content from various possible fields
- Handles different message types (text, captions, bot responses)
- Preserves original content without modification
Metadata Attachment:
- Conditionally adds timestamp (if temporal tracking is enabled)
- For group chats: attaches user identification (display name, username)
- For replies: captures reply context (who was replied to, what was said)
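A sketch of this conditional assembly (option names are hypothetical):

```typescript
// Builds a message, attaching optional metadata only when enabled/relevant.
function buildMessage(
  role: string,
  content: string,
  user: UserInfo | undefined,
  replyTo: ReplyContext | undefined,
  opts: { includeTimestamp: boolean; isGroupChat: boolean },
): MemoryMessage {
  return {
    role,
    content, // preserved as-is, without modification
    ...(opts.includeTimestamp && { timestamp: new Date().toISOString() }),
    ...(opts.isGroupChat && user && { user }), // group chats only
    ...(replyTo && { replyTo }),
  };
}
```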
5. Reply Context Processing
When messages are replies to previous messages:
- Extracts information about the original message
- Identifies the user being replied to
- Captures the content of the replied-to message
- Structures this as contextual metadata attached to the new message
This enables language models to understand conversational threading without requiring the entire conversation history to be repeatedly processed.
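For example, with a Telegram-style payload, extraction might look like the following sketch (the reply_to_message shape here is an assumption for illustration):

```typescript
interface IncomingMessage {
  text?: string;
  caption?: string;
  reply_to_message?: {
    text?: string;
    caption?: string;
    from?: { first_name?: string; last_name?: string; username?: string };
  };
}

// Extracts reply context, or undefined when the message is not a reply.
function extractReplyContext(msg: IncomingMessage): ReplyContext | undefined {
  const replied = msg.reply_to_message;
  if (!replied) return undefined;
  const from = replied.from;
  return {
    user: from && {
      displayName: [from.first_name, from.last_name].filter(Boolean).join(' '),
      username: from.username,
    },
    content: replied.text ?? replied.caption ?? '',
  };
}
```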
6. Memory Windowing Phase
After adding the new message, the system applies a sliding window strategy:
Window Size Enforcement:
- Maintains only the most recent N messages
- Older messages are automatically pruned
- Window size is configurable per use case
Rationale:
- Prevents unbounded memory growth
- Focuses language model attention on recent context
- Reduces token consumption in LLM API calls
- Balances context retention with computational efficiency
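The pruning step itself is small; conceptually it reduces to keeping the last N entries, with windowSize as the configurable limit:

```typescript
// Keep only the most recent `windowSize` messages.
function applyWindow(messages: MemoryMessage[], windowSize: number): MemoryMessage[] {
  return messages.length > windowSize ? messages.slice(-windowSize) : messages;
}
```

Because slice with a negative index counts from the end of the array, pruning is a single linear-time copy of the retained messages.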
7. Storage Persistence Phase
The updated memory structure is persisted to the storage backend:
- Uses the chat identifier as the storage key
- Completely replaces previous memory state
- Storage operation is atomic (succeeds or fails completely)
8. Output Formatting Phase
The system prepares output optimized for downstream consumption:
- Structures data for direct LLM prompt injection
- Optionally estimates token count (for API planning)
- Removes empty/null fields (reducing payload size)
- Maintains consistent JSON structure
Operation Modes
The system supports three primary operations:
Insert Operation
Adds new messages to conversation memory with two modes:
Append Mode:
- Adds the new message to the existing message sequence
- Preserves all previous messages (within the window)
- Standard operation for ongoing conversations
Replace Mode:
- Discards existing memory completely
- Creates fresh memory with just the new message
- Useful for conversation resets or context clearing
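Conceptually, the two modes differ only in whether the existing message sequence is kept. A sketch, reusing the windowing helper above (names are illustrative):

```typescript
function insertMessage(
  memory: ChatMemory,
  message: MemoryMessage,
  mode: 'append' | 'replace',
  windowSize: number,
): ChatMemory {
  const messages =
    mode === 'replace'
      ? [message] // discard history, start fresh with just the new message
      : applyWindow([...memory.messages, message], windowSize);
  return { ...memory, messages };
}
```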
Get Operation
Retrieves the current memory state for a conversation:
- Returns complete memory structure
- Includes all messages within the window
- Provides metadata for context understanding
Clear Operation
Removes all memory for a conversation:
- Deletes the entire memory structure
- Used for privacy compliance or session termination
- Irreversible operation
Field Mapping and Expression Evaluation
The system handles diverse input formats through configurable field mapping:
Dynamic Field Resolution:
- Maps generic field identifiers to actual data locations
- Evaluates simple expressions to extract nested data
- Constructs composite fields (e.g., full name from first_name + last_name)
Example Transformations:
- Combining separate name fields into a single display name
- Extracting optional fields with fallback defaults
- Handling missing data gracefully
This abstraction allows the system to work with various messaging platforms without hardcoding platform-specific structures.
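A sketch of the composite-field case from the examples above, assuming Telegram-style field names (first_name, last_name, username):

```typescript
interface RawUser {
  first_name?: string;
  last_name?: string;
  username?: string;
}

// Composes a display name with graceful fallbacks for missing data.
function mapUser(raw: RawUser): UserInfo {
  const displayName =
    [raw.first_name, raw.last_name].filter(Boolean).join(' ') ||
    raw.username ||   // fall back to the handle when name fields are absent
    'Unknown user';   // last-resort default
  return { displayName, username: raw.username };
}
```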
Chat Type Differentiation
The system optimizes memory structure based on conversation context:
Private Conversations:
- Single user information stored at chat level
- No need to identify message authors (only two participants)
- More compact memory structure
- Reduced redundancy
Group Conversations:
- User information attached to each message
- Essential for multi-participant context
- Enables "who said what" tracking
- Supports natural language references ("as John mentioned...")
This adaptive approach reduces memory overhead while maintaining necessary context.
Token Estimation
For LLM API planning, the system provides token estimation:
Estimation Method:
- Converts entire memory structure to JSON string
- Applies a heuristic ratio (approximately 4 characters per token)
- Returns a conservative estimate
Use Cases:
- Pre-flight checks before LLM API calls
- Monitoring memory size growth
- Implementing dynamic windowing based on token limits
- Cost estimation for API usage
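The heuristic itself fits in a few lines (a sketch of the documented 4-characters-per-token ratio):

```typescript
// Roughly 4 characters per token; Math.ceil rounds up to keep the
// estimate conservative.
function estimateTokens(memory: ChatMemory): number {
  return Math.ceil(JSON.stringify(memory).length / 4);
}
```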
Output Cleaning
Before returning memory structures, the system removes noise:
Recursive Cleaning Process:
- Removes undefined and null values
- Eliminates empty strings
- Filters empty arrays and objects
- Processes nested structures recursively
Benefits:
- Reduces payload size
- Simplifies downstream processing
- Prevents language models from processing empty fields
- Creates cleaner prompt context
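One way to implement such a recursive cleaner (a sketch; the node's actual implementation may differ in details, such as how it treats false or 0):

```typescript
// Recursively strips undefined/null, empty strings, and empty
// arrays/objects; returns undefined when nothing remains.
function clean(value: unknown): unknown {
  if (Array.isArray(value)) {
    const items = value.map(clean).filter((v) => v !== undefined);
    return items.length > 0 ? items : undefined;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      .map(([k, v]) => [k, clean(v)] as const)
      .filter(([, v]) => v !== undefined);
    return entries.length > 0 ? Object.fromEntries(entries) : undefined;
  }
  return value === null || value === '' ? undefined : value;
}
```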
Integration Pattern
The system operates as a stateful service within workflow environments:
Typical Flow:
- Conversational input arrives (user message or bot response)
- Smart Memory processes and stores the message
- System retrieves current memory state
- Memory is formatted into LLM prompt
- LLM generates response using full context
- Bot response is stored back into memory
- Cycle repeats for next interaction
Session Management:
- Each conversation has a unique identifier (chat ID)
- Memory persists across workflow executions
- Storage backend handles persistence layer
- Memory lifecycle is independent of workflow lifecycle
Configuration Philosophy
The system is designed to be configuration-driven rather than code-driven:
Configurable Aspects:
- Memory window size (how much history to retain)
- Storage backend selection (where to persist memory)
- Field mapping rules (how to extract data)
- Feature toggles (timestamps, reply context, token estimation)
- Role identifiers (how to label participants)
This approach enables adapting the system to different use cases without code modifications.
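Taken together, the configurable aspects might be modeled as a single options object. The names and defaults here are hypothetical, not the node's actual parameters:

```typescript
interface SmartMemoryConfig {
  windowSize: number;                         // how much history to retain
  storage: 'in-memory' | 'database';          // where to persist memory
  includeTimestamps: boolean;                 // feature toggle: temporal context
  includeReplyContext: boolean;               // feature toggle: threading
  estimateTokens: boolean;                    // feature toggle: API planning
  roles: { user: string; assistant: string }; // participant labels
}

const exampleConfig: SmartMemoryConfig = {
  windowSize: 15,
  storage: 'in-memory',
  includeTimestamps: false,
  includeReplyContext: true,
  estimateTokens: false,
  roles: { user: 'user', assistant: 'assistant' },
};
```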
Memory Consistency and Isolation
Isolation Guarantees:
- Each chat maintains independent memory
- No cross-conversation contamination
- Concurrent conversations are safely isolated
Consistency Model:
- Last-write-wins for memory updates
- No transaction support (appropriate for conversational context)
- Storage layer responsible for atomicity
Use Case Scenarios
Scenario 1: Customer Support Bot
A user initiates a support conversation:
- First message creates new memory with user context
- Subsequent questions append to conversation history
- Support agent (bot) responses are marked with assistant role
- A sliding window retains the most recent 15 messages
- When resolved, memory can be cleared
Scenario 2: Group Discussion Bot
Multiple users discuss in a group chat:
- Each message includes user identification
- Bot can reference who said what
- Reply context maintains conversation threads
- Window keeps last 20 messages across all users
- Bot maintains coherent multi-user context
Scenario 3: Long-Running Personal Assistant
A user holds an extended conversation with an AI assistant:
- Private chat structure used (compact format)
- Large window size (30-40 messages) for deeper context
- Timestamps enabled for temporal reasoning
- Token estimation prevents API limit issues
- Memory persists across days/weeks
Implementation-Agnostic Concepts
Storage Abstraction Layer
The system defines a storage interface that any backend can implement:
Required Operations:
- Retrieve memory by chat identifier
- Save/update memory for chat identifier
- Delete memory for chat identifier
Supported Backends:
- In-memory storage (volatile, for development/testing)
- Persistent storage solutions (databases, etc.)
The core logic is entirely independent of storage implementation, enabling deployment flexibility.
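The three required operations map naturally onto a small interface, with the volatile in-memory backend as one possible implementation (a sketch; names are illustrative):

```typescript
interface MemoryStorage {
  get(chatId: string): Promise<ChatMemory | null>;
  save(chatId: string, memory: ChatMemory): Promise<void>;
  delete(chatId: string): Promise<void>;
}

// Volatile backend suitable for development and testing.
class InMemoryStorage implements MemoryStorage {
  private store = new Map<string, ChatMemory>();

  async get(chatId: string): Promise<ChatMemory | null> {
    return this.store.get(chatId) ?? null;
  }
  async save(chatId: string, memory: ChatMemory): Promise<void> {
    this.store.set(chatId, memory); // last-write-wins, matching the consistency model
  }
  async delete(chatId: string): Promise<void> {
    this.store.delete(chatId);
  }
}
```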
Validation Strategy
Input validation occurs at boundaries:
- Validates chat identifiers are non-empty
- Ensures message content exists
- Verifies window size is within acceptable range
- Type-checks critical fields
This defensive approach prevents corrupted memory states.
Error Handling Philosophy
The system follows a fail-fast approach:
- Invalid input throws errors immediately
- Storage failures propagate to caller
- No silent data corruption
- Errors include descriptive messages
This makes debugging and monitoring straightforward.
Design Rationale
Why Sliding Windows?
Language models have token limits and attention limitations. By maintaining a sliding window, the system:
- Keeps memory bounded and predictable
- Focuses on recent, relevant context
- Prevents token limit violations
- Improves response quality through focused context
Why Chat-Centric Storage?
Organizing by conversation rather than user:
- Reflects natural conversation boundaries
- Simplifies retrieval (one key lookup)
- Supports both individual and group contexts
- Aligns with how messaging platforms organize data
Why Different Structures for Chat Types?
Private and group chats have fundamentally different characteristics:
- Private: Two participants, no ambiguity about speakers
- Group: Multiple participants, must track "who said what"
Adapting the structure optimizes both memory efficiency and context clarity.
Why Optional Fields?
Not all use cases need all features:
- Timestamps add overhead but enable temporal reasoning
- Reply context increases complexity but improves coherence
- Token estimation has computational cost but aids planning
Optional fields allow tuning the system to specific requirements.
Summary
Smart Memory turns conversational context management into a tractable, well-structured problem through:
- Structured Memory: Organizing conversations into LLM-ready formats
- Intelligent Windowing: Balancing context retention with resource constraints
- Adaptive Architecture: Optimizing structure based on conversation type
- Storage Abstraction: Enabling flexible deployment options
- Configuration-Driven: Adapting to diverse use cases without code changes
The system serves as an intermediary layer between raw conversational data and language model consumption, handling the complexity of context management so that developers can focus on building conversational experiences rather than managing conversation state.