...

  1. Leave the data in STD: $15,185

  2. Move current unlinked data (>= 128 KB) to Infrequent Access*: $14,394

    1. Using the unlinked data >= 128KB: (680T - 91U) in standard + 91U in IA

    2. Consider an avg of 28,808/month unlinked files for moving data to IA (PUTs cost)

  3. Eventually move unlinked data to Glacier Deep Archive*: $13,323

    1. Using the unlinked data >= 128KB: (680T - 91U) in standard + 91U in Glacier Deep Archive

    2. Consider an avg of 28,808/month unlinked files for moving data to IA (PUTs cost)

    3. Consider the avg of 28,808/month for lifecycle transitions to Glacier and Glacier Deep Archive

  4. Move everything to INT**: $11,775

    1. Consider 44% in INT-Standard (300H/680T)

    2. Consider 56% in INT-IA ((680T-300H)/680T)

    3. Includes the 44M objects monitoring fee

  5. Move to INT + unlinked eventually in INT-deep archive**: $10,734

    1. Assumes that unlinked data is part of the cold data

    2. Consider 44% in INT-Standard (300H/680T)

    3. Consider 43% in INT-IA ((680T - 300H - 91U)/680T)

    4. Consider 13% in INT-Deep Archive (91U/680T)

    5. Includes the 44M objects monitoring fee

    6. Assumes that we have 28,808 tags per month (GET/PUT request + tag costs)

  6. Move to INT + delete unlinked data***: $10,544

    1. Assumes that unlinked data is part of the cold data

    2. Consider 50% in INT-Standard (300H/(680T - 91U))

    3. Consider 50% in INT-IA

    4. Includes the 44M - 8M objects monitoring fee, no tags, no lifecycle transitions
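The tier percentages used in options 4 to 6 follow directly from the three volumes quoted in the list (680T total, 300H hot, 91U unlinked). The following is a minimal sketch of that arithmetic, assuming the three figures share the same unit; the function and key names are hypothetical and only serve the illustration.

```python
# Volumes taken from the list above; assumed to be expressed in the same unit.
TOTAL = 680     # T: all data
HOT = 300       # H: data expected to stay in the frequent-access tier
UNLINKED = 91   # U: unlinked data >= 128 KB


def share(part: float, whole: float) -> float:
    """Return part/whole as a percentage with one decimal."""
    return round(100 * part / whole, 1)


def tier_split(total: float, hot: float, unlinked: float) -> dict:
    """Percentage of data per Intelligent-Tiering tier for options 4, 5 and 6."""
    return {
        # Option 4: everything in INT, cold data ends up in INT-IA.
        "option_4": {
            "INT-Standard": share(hot, total),
            "INT-IA": share(total - hot, total),
        },
        # Option 5: unlinked data is carved out of the cold data into deep archive.
        "option_5": {
            "INT-Standard": share(hot, total),
            "INT-IA": share(total - hot - unlinked, total),
            "INT-Deep Archive": share(unlinked, total),
        },
        # Option 6: unlinked data is deleted, so the total shrinks.
        "option_6": {
            "INT-Standard": share(hot, total - unlinked),
            "INT-IA": share(total - unlinked - hot, total - unlinked),
        },
    }


if __name__ == "__main__":
    for option, tiers in tier_split(TOTAL, HOT, UNLINKED).items():
        print(option, tiers)
```

The results match the rounded values in the list: 44.1% and 55.9% for option 4, and 44.1%, 42.5% and 13.4% for option 5. For option 6 the split comes out at 50.9% and 49.1%, which the list approximates as 50% each.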

...

** Note that this does not include the initial fee of about $400 for moving all 44M objects to INT (through a lifecycle transition); once an object is in INT there are no fees for further transitions. Additionally, in order to detect UNLINKED file handles we rely heavily on a few additional services: Kinesis Firehose to stream data to S3, Athena for querying, Step Functions for orchestration and EventBridge for periodic triggering. The most expensive of these is Kinesis Firehose, since we push a lot of data through it. According to the billing for the last few months, however, this adds up to only a few tens of dollars, so we excluded it from the estimates (see the cost explorer for Kinesis).

*** Note that a lot of data may become unlinked in a single month (e.g. a large amount of data is copied to another bucket). We would pay the overhead of this data for a few months while it moves to cheaper tiers and is eventually archived into deep archive. Given that this is not a common use case (e.g. it might happen once or twice every few years), the complexities around how we store file handles, and the current limitations of S3, we do not plan to support a “speed up” of the archival. The user can still request the deletion of their data through the appropriate APIs.

It is clear that just moving everything to INT is cost effective for our use case, where access patterns are unknown, since we have data that is rarely accessed even while still linked (e.g. older projects, older versions of entities, older tables, messages etc). On top of that, archiving unlinked data might be worth it even though the cost savings are not comparable to those from just using the INT storage class: on average, around 1 TiB of data gets unlinked each month.
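To put numbers on this comparison, the savings of each option against the STD baseline can be computed from the estimates listed earlier. This is a small sketch using only those dollar figures as given; the dictionary keys and the function name are hypothetical labels.

```python
# Estimates copied from the numbered options in this page (in dollars).
ESTIMATES = {
    "1_std": 15185,
    "2_unlinked_to_ia": 14394,
    "3_unlinked_to_deep_archive": 13323,
    "4_all_to_int": 11775,
    "5_int_plus_deep_archive": 10734,
    "6_int_plus_delete": 10544,
}


def savings_vs_baseline(estimates: dict, baseline_key: str) -> dict:
    """Map each option to (absolute saving, saving as % of the baseline)."""
    base = estimates[baseline_key]
    return {
        name: (base - cost, round(100 * (base - cost) / base, 1))
        for name, cost in estimates.items()
        if name != baseline_key
    }


if __name__ == "__main__":
    for name, (saved, pct) in savings_vs_baseline(ESTIMATES, "1_std").items():
        print(f"{name}: saves ${saved:,} ({pct}%)")
    # Extra saving from archiving unlinked data on top of plain INT.
    extra = ESTIMATES["4_all_to_int"] - ESTIMATES["5_int_plus_deep_archive"]
    print(f"archiving on top of INT: ${extra:,}")
```

With these figures, moving everything to INT saves $3,410 (about 22.5%) against the baseline, while archiving unlinked data on top of INT saves a further $1,041, which is consistent with the conclusion above.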

...