DynamoDB - Sorting Records By Date And Pagination With Redis Indexes


Have you ever tried to sort a collection of DynamoDB records by date? It is kind of a pain, especially if you want to do pagination with it. Last week I was busy over engineering the Schematical.com daily writing functionality which is stored on DynamoDB and wanted it all to be sorted nicely by date WITHOUT needing to resort to a scan. Additionally since I have barely missed a day all year the posts were building up and pagination was needed to ensure the pages loaded smoothly.

DynamoDB is an incredible technology because it comes out of the box with Sharding but that feature is a double edged sword as it limits what types of operations you can do across large datasets like sorting. You can still do some of that with things like a scan operation or AWS Glue’s DynamoDB connection but those are both slow and cost a lot of money.

My solution was to create my own sorted index. It occurred to me I could just store the sorted list in another DynamoDB record but I decided to test out an AWS ElastiCache Redis Serverless Instance. You only get charged for the compute power you use when you run a query which is awesome. The drawback is that they charge around $0.25 per hour as opposed to DynamoDB’s $0.25 per months. So it is drastically more expensive to store data in Redis than DynamoDB.

So how do we keep our Redis costs down? By only storing the record’s ID in redis, not the full record, like the post body which makes up the bulk of the record. Furthermore for pagination we store the Record Id in a sorted list using the ZADD and ZRANGE.

Here is the code I use to putPosts into Redis and DynamoDB:

const putPost = async (post: any) => {
    // console.log("post", post);
    let publicDate = post.PublicDate;
    const redis = getRedis();
    const exists = await redis.zscore(REDIS_POSTS_KEY, post.PostId);
    let mode = 'NX';
    if (exists) {
        mode = 'XX';
    }
    const redisPost = {
        PostId: post.PostId,
        // Body: post.Body,
        Title: post.Title,
        Tags: post.Tags,
        Description: post.Description,
        PublicDate: publicDate,
        ImageSRC: post.ImageSRC
    }
    const publicDateDt = new Date(post.PublicDate);
    const score = `${publicDateDt.getFullYear()}${publicDateDt.getMonth()}${publicDateDt.getDate()}`
    await redis.zadd(REDIS_POSTS_KEY, mode, score, JSON.stringify(redisPost));

    const commandPayload = {
        TableName: SchematicalComPost,
        Item: {
            PostId: post.PostId,
            Body: post.Body,
            Title: post.Title,
            Tags: post.Tags,
            Description: post.Description,
            PublicDate: publicDate,
            ImageSRC: post.ImageSRC
        }
    };
    console.log("commandPayload: ", commandPayload);
    const command = new PutCommand(commandPayload);

    const response: PutCommandOutput = await docClient.send(command);
    console.log("response:", response);
    return response.Attributes;
}

Notice it changes the mode based on whether or not that key exists in my list.

And here is my full paginated code to retrieve the code in a paginated fashion:

const getPosts = async (options?: { all?: boolean, page?: number, pageSize?: number, populateDynamo?: boolean }): Promise<any> => {
    const redis = getRedis();
    let start;
    let end;
    let posts;
    if (!options?.all) {
        const today = new Date();
        const year = today.getFullYear();
        const month = ('0' + (today.getMonth() + 1)).slice(-2); // Adding 1 because getMonth() returns zero-based month index
        const day = ('0' + today.getDate()).slice(-2);

        const score = `${year}${month}${day}`;
        const latestPostArr = await redis.zrange(REDIS_POSTS_KEY, score, 1, 'BYSCORE', 'REV', 'LIMIT', 0, 1);//, 'WITHSCORES');
        const startPos = await redis.zrank(REDIS_POSTS_KEY, latestPostArr[0]);
        start = (startPos || 0) - ((options?.page || 0) * PAGE_SIZE);
        end = start - (options?.pageSize || PAGE_SIZE);
        posts = await redis.zrange(REDIS_POSTS_KEY, end, start - 1); //,);//, 'WITHSCORES')
    } else {

        start = ((options?.page || 0) * PAGE_SIZE);
        end = start + (options?.pageSize || PAGE_SIZE);
        posts = await redis.zrange(REDIS_POSTS_KEY, start, end, 'REV'); //,);//, 'WITHSCORES')
    }

    posts = posts.map((post) => JSON.parse(post)).reverse();

    if (options?.populateDynamo) {
        const commandOptions = {
            RequestItems: {
                SchematicalComPost: {
                    Keys: posts.map((post) => ({
                        PostId: post.PostId, // {S: post.PostId},
                        PublicDate: post.PublicDate, // {S: post.PublicDate}
                    }))
                }
            }
        };
        const command = new BatchGetCommand(commandOptions);

        const results = await docClient.send(command);
        if(!results?.Responses?.SchematicalComPost){
            throw new Error("Missing `results?.Responses`");
        }
        posts = results.Responses.SchematicalComPost.map((post, i) => {
            if(!results?.Responses?.SchematicalComPost){
                throw new Error("Missing `results?.Responses?.SchematicalComPost`");
            }
            return (post || null)
        });
    }

    return posts;
}

If you are at all curious about the Terraform scripts I used to setup the infrastructure you can check out OpenSource Terraform Scripts.

Now would I use serverless redis with one of my larger clients that does $100k or more in revenue an hour? No, they have consistent enough traffic that it makes sense to use a provisioned elasticache cluster. I would( and do) still use Redis Elasticache Indexes with my clients just provisioned over serverless.

Additionally you can see I used a single Redis Serverless Instance for both the dev and the production environment. I would NOT do that for my client’s website, but for my tiny little website I figured I would keep it contained. If you are interested in engaging my services as a Web Application Architect check out my Group Coaching Program or 1 on 1 Consulting at schematical.com and check out my FREE eBook 20 Things You Can DoTo Save Money On Your Amazon Web Services Bill Today

~Cheers


Buy me a coffee if you found this helpful.

© Copright 2024 Schematical
17 S. Fairchild St. FL 7, Madison, WI 53703
DISCLAIMER: The information provided is for general informational purposes only. Posts and other information may not be updated to account for changes in the law and should not be considered advice. None of the articles or posts on this website are intended to create an attorney-client relationship. You should consult with legal and/or financial advisors for legal and tax advice tailored to your specific circumstances.