Managing Large Values in Redis Without Consuming Excessive Memory

Redis is a high-performance in-memory data store that excels in speed and simplicity. However, when dealing with large values, especially in scenarios where memory is limited, it is important to implement strategies that manage memory usage effectively while maintaining performance.

This blog explores practical methods to handle large values in Redis without exhausting your memory resources.

In this blog post:

  1. Why do large values pose a challenge in Redis?
  2. What is considered a large value in Redis?
  3. Strategies to manage large values in Redis
  4. Redis Monitoring with Atatus

Why do large values pose a challenge in Redis?

Redis stores all data in memory, which makes it incredibly fast. However, this design also means that large values can quickly consume available memory, leading to performance degradation or system instability. For scenarios like log storage or temporary data caching, it becomes crucial to balance memory usage with application requirements.

Key Challenges:

  • Memory Limits: Large values can lead to memory exhaustion, especially in systems with limited RAM.
  • Fragmentation: Storing large contiguous blocks of data can cause memory fragmentation.
  • Durability: Ensuring data persistence while managing large values can be tricky.

What is considered a large value in Redis?

Generally, values in Redis are considered "large" when they exceed 10KB in size. However, this is contextual and depends on:

  1. Your total memory capacity
  2. Number of keys you are storing
  3. Access patterns
  4. Overall application requirements

Examples of Large Values:

(i). JSON Document:

{
    "orderId": "12345",
    "customer": {
        "id": "C789",
        "name": "John Doe",
        "address": {...},
        "orderHistory": [...],
    },
    "items": [
        {
            "productId": "P1",
            "name": "Product 1",
            "description": "Long product description...",
            "specifications": {...},
            "reviews": [
                // Hundreds of customer reviews
                {...},
                {...}
            ]
        },
        // Many more items
    ],
    "tracking": {
        "history": [
            // Detailed tracking history with timestamps
            {...},
            {...}
        ]
    }
}

Size: Could be 50KB-500KB depending on the content.

(ii). Image Data:

# Base64 encoded image string
image_data = "/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHR..."
# Typically 100KB-2MB for a medium resolution image

(iii). Session Data:

session_data = {
    "user_id": "12345",
    "cart_items": [...],  # Large array of items
    "browsing_history": [...],  # Hundreds of previously viewed items
    "recommendations": [...],  # Personalized product recommendations
    "preferences": {...},  # User settings and preferences
    "activity_log": [...],  # User activity tracking
}
# Could be 20KB-100KB depending on user activity

(iv). Log Entries:

log_entry = {
    "timestamp": "2024-12-19T10:00:00Z",
    "application": "ecommerce-platform",
    "environment": "production",
    "request": {
        "headers": {...},
        "body": "...",  # Large request payload
        "params": {...}
    },
    "response": {
        "status": 200,
        "headers": {...},
        "body": "..."  # Large response payload
    },
    "performance_metrics": {...},
    "stack_trace": "...",
    "related_events": [...]
}
# Could range from 30KB-200KB per entry

(v). Cached API Response:

{
    "data": {
        "products": [
            // Hundreds of products with detailed information
            {
                "id": "1",
                "name": "Product",
                "description": "...",
                "categories": [...],
                "variants": [...],
                "images": [...],
                "pricing": {...},
                "inventory": {...},
                "metadata": {...}
            }
        ],
        "pagination": {...},
        "filters": {...},
        "metadata": {...}
    }
}

Size: Could be 100KB-1MB for large product catalogs

Strategies to manage large values in Redis

(i). Splitting data into smaller chunks

Instead of storing a single large value, split it into smaller chunks. This helps Redis manage memory more efficiently and reduces the risk of fragmentation.

Implementation:

  • Divide large values into fixed-size chunks (e.g., 1 MB each).
  • Use a naming pattern for the chunks.

Example:

SET log:123:chunk:1 "chunk data 1"
SET log:123:chunk:2 "chunk data 2"
HSET log:123:metadata total_chunks 2

When retrieving the data:

HGETALL log:123:metadata
GET log:123:chunk:1
GET log:123:chunk:2

Example (Python):

import redis

class RedisChunker:
    def __init__(self, redis_client, chunk_size=1024*1024):  # 1MB chunks
        self.redis = redis_client
        self.chunk_size = chunk_size

    def store_chunked(self, key, data):
        chunks = [data[i:i + self.chunk_size]
                 for i in range(0, len(data), self.chunk_size)]

        # Store metadata
        self.redis.hset(f"{key}:metadata", mapping={
            "total_chunks": len(chunks),
            "size": len(data)
        })

        # Store chunks
        for i, chunk in enumerate(chunks, 1):
            self.redis.set(f"{key}:chunk:{i}", chunk)

    def retrieve_chunked(self, key):
        # Get metadata
        metadata = self.redis.hgetall(f"{key}:metadata")
        if not metadata:
            return None

        # Retrieve and combine chunks
        chunks = []
        for i in range(1, int(metadata[b'total_chunks']) + 1):
            chunk = self.redis.get(f"{key}:chunk:{i}")
            chunks.append(chunk)

        return b''.join(chunks)

# Usage example
redis_client = redis.Redis()  # local Redis on the default port
chunker = RedisChunker(redis_client)
large_data = "..." * 1024 * 1024  # Large string (~3 MB)
chunker.store_chunked("my_large_value", large_data.encode())

(ii). Use compression

Compressing large values before storing them in Redis can significantly reduce memory usage.

Example (Python):

import zlib

# Compress data before storing
compressed_value = zlib.compress(large_value.encode())
redis_client.set("log:123", compressed_value)

# Decompress when retrieving
compressed_value = redis_client.get("log:123")
original_value = zlib.decompress(compressed_value).decode()

Compression algorithms like zlib, gzip, or brotli offer high compression ratios, saving 50% to 90% of memory depending on the data.
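To see what compression actually buys you for your own payloads, you can compare sizes before writing to Redis. A minimal sketch, reusing the large_value variable from the example above:

import zlib

original = large_value.encode()
compressed = zlib.compress(original)

savings = 1 - len(compressed) / len(original)
print(f"{len(original)} -> {len(compressed)} bytes ({savings:.0%} saved)")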

(iii). Use Redis streams

Redis Streams are ideal for log-like data. They are memory-efficient and support high-throughput ingestion.

Commands:

  • Add logs to a stream:
XADD logs MAXLEN ~ 10000 * message "Log message here"
  • Read logs:
XRANGE logs - +

Streams automatically trim older entries using the MAXLEN parameter, which helps in memory management.

Example (Python):

from datetime import datetime
import json

class RedisStreamManager:
    def __init__(self, redis_client, stream_name, max_len=10000):
        self.redis = redis_client
        self.stream = stream_name
        self.max_len = max_len

    def add_entry(self, data: dict):
        # Add timestamp if not present
        if 'timestamp' not in data:
            data['timestamp'] = datetime.utcnow().isoformat()

        # Convert to string format
        entry = {
            k: json.dumps(v) if isinstance(v, (dict, list)) else str(v)
            for k, v in data.items()
        }

        # Add to stream with automatic trimming
        return self.redis.xadd(
            self.stream,
            entry,
            maxlen=self.max_len,
            approximate=True
        )

    def read_range(self, start='-', end='+', count=None):
        entries = self.redis.xrange(self.stream, start, end, count)

        # Parse entries
        result = []
        for entry_id, entry_data in entries:
            parsed_data = {
                k.decode(): self._parse_value(v.decode())
                for k, v in entry_data.items()
            }
            result.append((entry_id.decode(), parsed_data))

        return result

    def _parse_value(self, value: str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value

# Usage example
stream_manager = RedisStreamManager(redis_client, "app_logs")
stream_manager.add_entry({
    "level": "INFO",
    "message": "User login",
    "metadata": {"user_id": 123, "ip": "192.168.1.1"}
})

(iv). Offload large data to external storage

For large values that aren’t frequently accessed, use Redis as a metadata store and move the actual data to external storage (e.g., databases, object stores).

Example:

  • Store metadata in Redis:
HSET log:123 metadata '{"size": "100MB", "location": "s3://mybucket/log-123"}'
  • Fetch the actual value from S3 or another backend when needed.
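As an illustration, here is a minimal sketch of the read path in Python, assuming the metadata layout from the HSET example above and data offloaded to S3 via boto3 (the bucket and object names are hypothetical):

import json

import boto3
import redis

redis_client = redis.Redis()
s3 = boto3.client("s3")

def fetch_offloaded(key):
    # Read the pointer that Redis keeps for this entry
    raw = redis_client.hget(key, "metadata")
    if raw is None:
        return None
    metadata = json.loads(raw)

    # "location" is expected to look like "s3://mybucket/log-123"
    bucket, _, object_key = metadata["location"][len("s3://"):].partition("/")

    # Fetch the actual payload from object storage only when it is needed
    response = s3.get_object(Bucket=bucket, Key=object_key)
    return response["Body"].read()

# data = fetch_offloaded("log:123")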

(v). Use binary encoding

Binary encoding formats like Protocol Buffers or MessagePack can reduce the size of structured data stored in Redis.

Example:

  • Original JSON: {"key1": "value1", "key2": "value2"} (about 36 bytes as a string).
  • The same object encoded with MessagePack: about 25 bytes.

This approach works well for numeric or structured data.
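A minimal sketch of the idea with the msgpack package (the key name config:123 is just an example):

import json

import msgpack
import redis

redis_client = redis.Redis()

payload = {"key1": "value1", "key2": "value2"}

# Compare the JSON string with the MessagePack encoding of the same object
as_json = json.dumps(payload).encode()
as_msgpack = msgpack.packb(payload)
print(len(as_json), len(as_msgpack))  # the binary form is noticeably smaller

# Store the binary form and decode it again on read
redis_client.set("config:123", as_msgpack)
restored = msgpack.unpackb(redis_client.get("config:123"), raw=False)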

(vi). Enable disk persistence

Redis offers persistence options to ensure data durability. Keep in mind that persistence is primarily for backup and recovery: the dataset still lives entirely in RAM, so enabling it does not reduce memory usage.

Persistence Modes:

  • RDB (Snapshotting): Periodically saves the entire dataset to disk.
save 60 10000  # Snapshot if at least 10,000 keys changed within 60 seconds.
  • AOF (Append-Only File): Logs every write operation to disk.
appendonly yes
appendfsync everysec

(vii). Use a Redis cluster

If memory limits are a concern, scale horizontally by deploying a Redis Cluster. A cluster distributes keys across multiple nodes, increasing the total available memory.

Example: Use hash tags (the part of the key wrapped in curly braces) so that related keys, such as the chunks of one large value, hash to the same slot and are stored on the same node, as shown in the sketch below.
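A small sketch of that key naming with redis-py (in a real cluster you would connect through redis.cluster.RedisCluster; a plain client is used here only to show the pattern):

import redis

redis_client = redis.Redis()

# Only the part inside {} is hashed, so every chunk of log:123 maps to the
# same hash slot and therefore lives on the same cluster node.
redis_client.set("{log:123}:chunk:1", "chunk data 1")
redis_client.set("{log:123}:chunk:2", "chunk data 2")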

(viii). Set key expiration

For logs or temporary data, set expiration times to automatically delete old values:

SET log:123 "large value" EX 3600  # Expires in 1 hour
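The same thing from redis-py, including adding a TTL to keys that already exist (for example, the chunk keys created earlier):

import redis

redis_client = redis.Redis()

# Set the value and its TTL in a single call
redis_client.set("log:123", "large value", ex=3600)  # expires in 1 hour

# Or attach a TTL to keys that already exist, such as previously stored chunks
for i in range(1, 3):
    redis_client.expire(f"log:123:chunk:{i}", 3600)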

(ix). Monitor memory usage

Regularly monitor Redis memory to identify bottlenecks and optimize key usage:

  • Check memory usage for a specific key:
MEMORY USAGE key_name
  • Inspect overall memory stats:
INFO memory
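The same checks are available from redis-py, which makes them easy to automate; a quick sketch:

import redis

redis_client = redis.Redis()

# Approximate memory consumed by a single key, in bytes
print(redis_client.memory_usage("log:123"))

# A subset of the INFO memory statistics
info = redis_client.info("memory")
print(info["used_memory_human"], info["mem_fragmentation_ratio"])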

(x). Optimize memory settings

Compact Memory: Periodically run the following command to ask the allocator to release unused memory back to the operating system (effective on jemalloc builds):

MEMORY PURGE

Limit Memory: Prevent Redis from exhausting system memory by setting a limit and an eviction policy:

maxmemory 1536mb  # ~1.5 GB
maxmemory-policy noeviction
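Both settings can also be applied at runtime without a restart; a minimal sketch with redis-py (note that MEMORY PURGE is only effective on jemalloc builds of Redis):

import redis

redis_client = redis.Redis()

# Apply a memory cap and eviction policy on a running instance
redis_client.config_set("maxmemory", "1536mb")
redis_client.config_set("maxmemory-policy", "noeviction")

# Ask the allocator to release unused pages back to the operating system
redis_client.memory_purge()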

Managing large values in Redis can be simplified with the right techniques. Chunking data, compressing values, using Redis Streams, or moving some of the data to external storage are all effective ways to handle large values without overloading memory. Regular monitoring and tuning of the Redis configuration further improve performance and reliability. With these strategies in place, Redis stays efficient and dependable even when your system handles large amounts of data.

Redis Monitoring with Atatus

Atatus Redis Monitoring is a best-practice solution for ensuring the health, performance, and stability of your Redis infrastructure. Atatus is a monitoring and analytics platform that provides tools for observing a wide range of services and systems, including Redis.

Atatus Redis Monitoring offers:

  • Problem Diagnostics - Troubleshoot issues quickly by drilling down into detailed Redis metrics.
  • Latency Analysis - Pinpoint delays in data retrieval and updates, ensuring optimal responsiveness of your Redis operations.
  • Throughput Insights - Monitor request and command rates to fine-tune Redis for maximum throughput, preventing performance bottlenecks.
  • Memory Utilization - Keep track of memory usage trends to prevent out-of-memory crashes and optimize data storage.
  • Capacity Planning - Make informed decisions about scaling resources based on usage patterns.
  • Connection Tracking - Track connections in real-time to manage resource allocation and ensure efficient network usage.

Adopting Atatus helps you keep your data store in check while improving the performance and reliability of your Redis deployment.

Get a 14-day free trial of Atatus today!


Pavithra Parthiban

A technical content writer specializing in monitoring and observability tools, adept at making complex concepts easy to understand.