...

  • Why can we make “copies” of file handles in the first place? Should we disallow it for S3 or GC buckets? Who is using this? Would removing it make things simpler? → This appears to be used heavily by scientists (based on DW data), so it is unrealistic to remove the feature or to deduplicate the copies.

  • I do not know when the last access date is updated for the INT class, nor do I have a quick way to test it. Does tagging the object reset the date and move the object back to INT-Standard? If so, we might have a weird lifecycle where an uploaded object goes: INT-Standard → (Day 30) INT-IA → UNLINKED → (Day 60, tagging) INT-Standard → (Day 90) INT-IA → (Day 150) INT-Archive → (Day 240) INT-DeepArchive. In that case we would pay the INT-Standard rate again for a month, although in the long run the cost would be amortized since objects eventually move to the archive tier. → ✅ I verified that tagging does NOT push the object back to INT-Standard. In my own bucket I have objects in INT-IA; I tagged those objects from the console (which also fetches the tags), and the metrics reported a day later showed no change in the count of objects in INT-IA. This is great news, as the lifecycle is now clear and more cost effective: INT-Standard → (Day 30) INT-IA → UNLINKED → (Day 60, tagging) → (Day 90) INT-Archive → (Day 180) INT-DeepArchive.

  • Should we delete the copies of a file handle if at least one is AVAILABLE? If so, and all copies are UNLINKED, should we keep just one around? Which one? → No, copies effectively make the ownership a 1-to-N relationship.

  • It is not clear whether lifecycle transitions generate S3 notifications. For example, we might want a notification when a temporary object is automatically expired so that we can remove the file handle record. This did not appear to be possible a few years ago, but the reference in the documentation has since disappeared; we will have to test this or contact support.

  • I have not yet started thinking about GC or other types of file handles. The vast majority of unlinked data is in prod, though.
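
To make the two timelines from the tagging question easier to compare, here is a minimal sketch that replays them. It is only a toy model: the thresholds (30, 90 and 180 days without access) are taken from the timeline quoted above and depend on how the bucket is configured, and `tier_for` and `timeline` are hypothetical helper names, not part of any AWS API.

```python
# Days without access after which an object sits in each tier.
# Assumption: values copied from the timeline in the note above.
TIERS = [
    (180, "INT-DeepArchive"),
    (90, "INT-Archive"),
    (30, "INT-IA"),
    (0, "INT-Standard"),
]


def tier_for(days_since_last_access: int) -> str:
    """Return the tier for an object not accessed for the given number of days."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    for threshold, tier in TIERS:
        if days_since_last_access >= threshold:
            return tier
    raise AssertionError("unreachable: the last threshold is 0")


def timeline(total_days: int, access_days=()):
    """Return the (day, tier) transitions over total_days.

    access_days lists the days on which the last access date is reset.
    Tagging is NOT included there, per the verification described above.
    """
    resets = set(access_days)
    last_access = 0
    previous = None
    transitions = []
    for day in range(total_days + 1):
        if day in resets:
            last_access = day
        tier = tier_for(day - last_access)
        if tier != previous:
            transitions.append((day, tier))
            previous = tier
    return transitions


# Verified behaviour: tagging on day 60 does not reset the clock.
print(timeline(240))
# Feared behaviour: tagging on day 60 counted as an access.
print(timeline(240, access_days=[60]))
```

The second call reproduces the "weird" lifecycle from the question (back to INT-Standard on day 60, INT-DeepArchive only on day 240), while the first matches the verified one.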

...