NEW: DynamoDB Streams Filtering in Serverless Framework

Pawel Zubkiewicz @pzubkiewicz
AWS Serverless Hero ⚡ Serverless evangelist & AWS Cloud Architect. Runs Serverless Poland 🇵🇱 website & community.

Published: Dec 6 '21

In this article, you will learn how to use the recently released Streams Filtering functionality for DynamoDB and Lambda.

We will go deeper than a basic sample of DynamoDB event action filtering: you will learn how to combine it with your business logic. I will be using a DynamoDB single-table design setup for that.

What's new?

If you haven't heard, just before #reInvent2021 AWS dropped this huge update.

What's changed?

Before the update

Every action made in a DynamoDB table (INSERT, MODIFY, REMOVE) triggered an event that was sent over DynamoDB Streams to a Lambda function. Regardless of the action type, a Lambda function was always invoked. That had two repercussions:

  • You had to implement filter logic inside your Lambda code (if conditions) before executing your business logic (e.g. filter INSERT actions to send a welcome email whenever a new User was added to the table).
  • You paid for every Lambda run, even though in most cases you were interested only in some events.

That situation was amplified in single-table design, where you store multiple entity types in a single table, so in reality you have many INSERTs with subtypes (e.g. new user, new address, new order, etc.).

After the update

Now, you can filter out events that are not relevant to your business logic. By defining filter criteria, you control which events can invoke a Lambda function. Filtering evaluates events based on values that are in the message.

This solves the above-mentioned problems:

  • Logic evaluation is pushed to AWS (no more ifs in Lambda code).
  • No more needless Lambda executions.

All of that thanks to a small JSON snippet defining filter criteria.
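
For instance, sticking with the welcome-email example above, the filter criteria could look roughly like the sketch below. It uses the serverless.yml filterPatterns syntax covered later in this article, and assumes the entity type is kept in a Type attribute, as in my examples:

          filterPatterns:
            - eventName: [INSERT]
              dynamodb:
                NewImage:
                  Type:
                    S: [User]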

Refactoring to Streams Filtering

Since you're reading this article, it's safe to assume that, like me, you're already using DynamoDB Streams to invoke your Lambda functions.

Therefore, let me take you through the refactoring process. It's a simplified version of the code that I use in production.

In my DynamoDB table, I store two types of entities: Order and Invoice. My business logic requires me to do something only when an Invoice is modified.
(Figure: business logic conditions)
As you can see, it's just a single case out of six (two entity types × three action types). Imagine what happens when you have more types in your table, and your business logic requires you to perform other actions as well.

Old event filtering

Let's start with those ugly if statements that I had before the update, when I had to filter events manually.

My Lambda handler started by executing the parseEvent method:

// Extracts action flags, entity-type flags, and item images from the stream record
const parseEvent = (event) => {
  const e = event.Records[0] // batch size = 1
  const isInsert = e.eventName === 'INSERT'
  const isModify = e.eventName === 'MODIFY'

  // Entity type is stored in the Type attribute (single-table design)
  const isOrder = e.dynamodb.NewImage?.Type?.S === 'Order'
  const isInvoice = e.dynamodb.NewImage?.Type?.S === 'Invoice'

  const newItemData = e.dynamodb.NewImage
  const oldItemData = e.dynamodb.OldImage

  return {
    isInsert, isModify, isOrder, isInvoice, newItemData, oldItemData
  }
}

Next, I had to evaluate the condition in my handler:

const {
  isInsert, isModify, isOrder, isInvoice, newItemData, oldItemData
} = parseEvent(event)

if (isModify && isInvoice) {
  // perform business logic
  // uses newItemData & oldItemData values
}

New event filtering

The new functionality allows us to significantly simplify that code by pushing condition evaluation to AWS.

Just to recap, my business logic requires me to let in only MODIFY events performed on Invoice entities. Fortunately, I keep a Type attribute on my entities in the DynamoDB table (thanks Alex 🤝).

The DynamoDB event structure is well-defined, so basically all I need to do is make sure that:

  • eventName equals MODIFY, and
  • dynamodb.NewImage.Type.S equals Invoice.
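
For reference, these paths point into each DynamoDB Streams record (the entries of event.Records in the handler above). Trimmed down to just the relevant fields, a single record looks roughly like this (the values are illustrative):

{
  "eventName": "MODIFY",
  "dynamodb": {
    "NewImage": {
      "Type": { "S": "Invoice" }
    },
    "OldImage": {
      "Type": { "S": "Invoice" }
    }
  }
}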

Those conditions are defined in the filterPatterns section of the Lambda configuration. Below is a snippet from a Serverless Framework serverless.yml config file. Support for filterPatterns was introduced in version 2.68.0, so make sure you are using that version or newer.

    functionName:
      handler: src/functionName/function.handler
      # other properties
      events:
      - stream:
          type: dynamodb
          arn: !GetAtt DynamoDbTable.StreamArn
          maximumRetryAttempts: 1
          batchSize: 1
          filterPatterns:
            - eventName: [MODIFY]
              dynamodb:
                NewImage:
                  Type:
                    S: [Invoice]

And that's all you need to do to filter your DynamoDB Stream.

Amazing, isn't it?

Gotchas

Bear in mind that there can be several filters on a single event source. In such a case, each filter works independently of the others. Simply put, there is OR, not AND, logic between them.

I learned that the hard way by mistakenly creating two filters:

          filterPatterns:
            - eventName: [MODIFY]
            - dynamodb:
                NewImage:
                  Type:
                    S: [Invoice]

by adding a - in front of dynamodb:. It resulted in the wrong filter:

{
  "filters": [
    {
      "pattern": "{\"eventName\":[\"MODIFY\"]}"
    },
    {
      "pattern": "{\"dynamodb\":{\"NewImage\":{\"Type\":{\"S\":[\"Invoice\"]}}}}"
    }
  ]
}

That one catches all MODIFY actions OR anything that has Invoice as the Type in the NewImage object, so DynamoDB INSERT actions as well!

The correct filter:

{
  "filters": [
    {
      "pattern": "{\"eventName\":[\"MODIFY\"],\"dynamodb\":{\"NewImage\":{\"Type\":{\"S\":[\"Invoice\"]}}}}"
    }
  ]
}

You can view the filter in the Lambda console, under the Configuration -> Triggers section.

Global Tables

As kolektiv mentioned in the comments below, this functionality does not work with Global Tables.

One more catch, you can't use filtering with global tables, your filter will not be evaluated and your function will not be called. Confirmed with aws support.

Thanks for pointing that out.

How much does it cost?

Nothing.

There is no information about any additional pricing. Also, Jeremy Daly confirmed that during re:Invent 2021.

In reality, this functionality saves you money on maintenance, because it's easier to write & debug Lambda code, and on operations, as functions are executed only in response to business-relevant events.

Low coupling

Before the update, people implemented event filtering logic in a single Lambda function and thus suffered from high coupling (unless they utilized some kind of dispatcher pattern).

Now, we can have several independent Lambda functions, each with its filter criteria, attached to the same DynamoDB Stream. That results in lower coupling between code that handles different event types. This will be very much appreciated by all single-table design practitioners.
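
As a rough sketch of that setup (the function names here are hypothetical; the filterPatterns syntax is the same as in the earlier snippet), two consumers of the same stream could be configured like this:

    onInvoiceModified:
      handler: src/onInvoiceModified/function.handler
      events:
      - stream:
          type: dynamodb
          arn: !GetAtt DynamoDbTable.StreamArn
          batchSize: 1
          filterPatterns:
            - eventName: [MODIFY]
              dynamodb:
                NewImage:
                  Type:
                    S: [Invoice]

    onUserCreated:
      handler: src/onUserCreated/function.handler
      events:
      - stream:
          type: dynamodb
          arn: !GetAtt DynamoDbTable.StreamArn
          batchSize: 1
          filterPatterns:
            - eventName: [INSERT]
              dynamodb:
                NewImage:
                  Type:
                    S: [User]

Each function receives only the records that match its own filter, so neither needs any type-checking ifs.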

Update

I forgot to mention that you can do more than just evaluate a string-equals condition in the filter. There are more possibilities, delivered by several comparison operators.

Here is a table borrowed from the AWS Docs (if it's not OK to include it here, please let me know):

Comparison operator | Example | Rule syntax
Null | UserID is null | "UserID": [ null ]
Empty | LastName is empty | "LastName": [""]
Equals | Name is "Alice" | "Name": [ "Alice" ]
And | Location is "New York" and Day is "Monday" | "Location": [ "New York" ], "Day": ["Monday"]
Or | PaymentType is "Credit" or "Debit" | "PaymentType": [ "Credit", "Debit"]
Not | Weather is anything but "Raining" | "Weather": [ { "anything-but": [ "Raining" ] } ]
Numeric (equals) | Price is 100 | "Price": [ { "numeric": [ "=", 100 ] } ]
Numeric (range) | Price is more than 10, and less than or equal to 20 | "Price": [ { "numeric": [ ">", 10, "<=", 20 ] } ]
Exists | ProductName exists | "ProductName": [ { "exists": true } ]
Does not exist | ProductName does not exist | "ProductName": [ { "exists": false } ]
Begins with | Region is in the US | "Region": [ { "prefix": "us-" } ]
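
For example, the Or operator is simply a list of allowed values. Reusing the entity types from my table, the sketch below would let through MODIFY events for both Invoices and Orders:

          filterPatterns:
            - eventName: [MODIFY]
              dynamodb:
                NewImage:
                  Type:
                    S: [Invoice, Order]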

Summary

I hope this short article convinced you to refactor your Lambda functions that are invoked by DynamoDB Streams. It's really simple and makes a huge difference in terms of code clarity and costs.

Comments (11 total)

  • kolektiv, Dec 7, 2021

    One more catch, you can't use filtering with global tables, your filter will not be evaluated and your function will not be called. Confirmed with aws support.

    • Kirk Kirkconnell, Dec 14, 2021

      Are you saying the two cannot be used at the same time or that you cannot use the event filter to filter Global Tables traffic from source to destination regions? These are two VERY different things.

      • kolektiv, Dec 17, 2021

        You cannot use stream filtering on a Global Table. Your Global Table will continue replicate/sync but your stream filter will not be evaluated and your trigger will not fire.

        • simi-obs, Dec 24, 2021

          I am sorry but this answer IMO is very misleading. You did not actually answer Kirk's question properly. He is correct when he says you cannot use the event filter to filter Global Tables traffic from source to destination regions

          But you can actually use this feature (of event filtering for lambdas). I confirmed with AWS. Here is the link: repost.aws/questions/QUgOGCJJhAStm...

          • kolektiv, Dec 24, 2021

            this is the reply from AWS support:

            -->
            That said, DynamoDB streams capture any modification to a DynamoDB table for example an insert,update or delete. We can attach a trigger to the stream, specifically a lambda function. This lambda function will be invoked every time a modification is made to the table. There is no option to filter this action on only certain items, the reason for this is that the streams are required to keep the replica table in the different region in sync with the base table.

            We can however add logic to our trigger function to discard any items that do not contain the required/desired tag/value. However the function will still be triggered if the item updated/inserted or deleted does not contain the value/tag you want to filter on.
            <--

            So you can see according to AWS, on a global table your Lambda still gets called and ignores the filter.

            • Leeroy Hannigan, Dec 25, 2021

              This is not correct information. The filtering is on the Event Source Mapping on the Lambda side which is completely decoupled from DynamoDB Stream. Event filtering works regardless, as Global Table replication system is completely separate from your Lambda trigger.

              On a side-note, try this filter @koletiv

              {
                "filters": [
                  {
                    "pattern": "{\"dynamodb\":{\"NewImage\":{\"region\":{\"S\":[\"us-west-2\"]}}}}"
                  }
                ]
              }
              
  • Vince Fulco (It / It's), Dec 8, 2021

    Great article. Thank you.

    One typo, "Therefor, let me take you through the refactoring process."-->"Therefore, let me take you through the refactoring process."

  • Dimsum Chen, Jan 28, 2022

    Can filtering be used to compare the NewImage value of an attribute with the OldImage value of an attribute?
