I'm working on a serverless stack.
I have a delimited CSV file with 5-6 million lines. I want to import this into AWS DynamoDB.
Is there some way to do this without burning my monthly salary down the toilet? How can I solve this on the cheap?
The third-party data dump is updated constantly, and I only want to import the new lines into the DB. I think the most efficient way would be to dump the data from my DynamoDB table, find a key, and compare it against the key in the new third-party dump (this key needs to be extracted from a column by regex), then import only the new lines. Do you have a recommendation (a big data framework or an AWS service) that could help with this? I'm also open to a shell or Go script that does the same thing; a rough sketch of what I have in mind is below.
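Something like this Go sketch is what I mean for the diff step (the column index, file names, and key regex are placeholders and assume the key can be pulled from the first column of a comma-delimited file):

```go
// diffdump.go: print only the lines from the new third-party dump whose key
// is not already present in the previous dump. Both files are assumed to be
// comma-delimited, and the key is pulled out of the first column with a
// regex -- the column index and the pattern are placeholders.
package main

import (
	"encoding/csv"
	"io"
	"log"
	"os"
	"regexp"
)

// keyPattern is a stand-in; replace it with whatever actually identifies your key.
var keyPattern = regexp.MustCompile(`[A-Z0-9-]+`)

// loadKeys reads a dump and returns the set of keys it contains.
func loadKeys(path string, col int) (map[string]struct{}, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	keys := make(map[string]struct{})
	r := csv.NewReader(f)
	for {
		rec, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		if k := keyPattern.FindString(rec[col]); k != "" {
			keys[k] = struct{}{}
		}
	}
	return keys, nil
}

func main() {
	if len(os.Args) != 3 {
		log.Fatalf("usage: %s <previous-dump.csv> <new-dump.csv>", os.Args[0])
	}

	// Keys already in DynamoDB, taken from the previous dump.
	seen, err := loadKeys(os.Args[1], 0)
	if err != nil {
		log.Fatal(err)
	}

	f, err := os.Open(os.Args[2])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	w := csv.NewWriter(os.Stdout)
	r := csv.NewReader(f)
	for {
		rec, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		k := keyPattern.FindString(rec[0])
		if _, ok := seen[k]; k != "" && !ok {
			// New key: keep this line for import.
			w.Write(rec)
		}
	}
	w.Flush()
	if err := w.Error(); err != nil {
		log.Fatal(err)
	}
}
```

With 5-6 million rows the key set should still fit comfortably in memory, and the filtered output could then be written to DynamoDB with BatchWriteItem in chunks of 25.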
Thank you!
This is a fun problem. Can I ask some questions? How often would you need to dump data into DDB, and what is the SLA between a new revision showing up and when the data needs to be in DDB?
Even if it's updated a few times a day, you could set the WCU to 1,000 units for an hour (~3.6 million writes) for about $0.65. The naive solution seems worth implementing.
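Bumping the provisioned throughput for the load window is a single UpdateTable call with the AWS SDK for Go v2; the table name and capacity values here are placeholders, and you'd run the same call again afterwards to scale back down (or do the equivalent with `aws dynamodb update-table` from the CLI):

```go
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

func main() {
	cfg, err := config.LoadDefaultConfig(context.TODO())
	if err != nil {
		log.Fatal(err)
	}
	client := dynamodb.NewFromConfig(cfg)

	// Raise write capacity just before the bulk load; a second call with a
	// lower value scales it back down once the import finishes.
	_, err = client.UpdateTable(context.TODO(), &dynamodb.UpdateTableInput{
		TableName: aws.String("my-import-table"), // placeholder table name
		ProvisionedThroughput: &types.ProvisionedThroughput{
			ReadCapacityUnits:  aws.Int64(5),    // the call requires RCU too; set it to whatever the table already uses
			WriteCapacityUnits: aws.Int64(1000), // ~1,000 writes/sec for the load window
		},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```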