...

dt | count | size
2020-11-05-00-00 | 41625517 | 704212857157112 (640.4TB)
2020-11-08-00-00 | 41654050 | 708573173690177 (644.4TB)

The inventory also reports whether a file was uploaded as multipart, which would tell us how many objects were not uploaded through the standard Synapse upload API:

...

2020-11-08-00-00 | 5712315 | 531513092323401 (483.4TB)
2020-11-05-00-00 | 5705476 | 527196059092709 (479.4TB)
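
As a rough illustration of the aggregation behind these figures, the following is a minimal Python sketch rather than the query actually used. It assumes the inventory rows have already been loaded as dictionaries, and the key names `dt`, `size` and `is_multipart_uploaded` are hypothetical stand-ins for whatever the inventory table exposes.

```python
from collections import defaultdict


def summarize_multipart(rows):
    """Count multipart objects and sum their sizes per inventory snapshot.

    `rows` is an iterable of dicts with the (assumed) keys
    `dt`, `size` and `is_multipart_uploaded`.
    """
    totals = defaultdict(lambda: {"count": 0, "size": 0})
    for row in rows:
        # Inventory exports typically carry the flag as text, so normalize it.
        if str(row["is_multipart_uploaded"]).strip().lower() == "true":
            entry = totals[row["dt"]]
            entry["count"] += 1
            entry["size"] += int(row["size"])
    return dict(totals)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"dt": "2020-11-05-00-00", "size": "100", "is_multipart_uploaded": "true"},
    {"dt": "2020-11-05-00-00", "size": "50", "is_multipart_uploaded": "false"},
    {"dt": "2020-11-08-00-00", "size": "200", "is_multipart_uploaded": "true"},
    {"dt": "2020-11-08-00-00", "size": "300", "is_multipart_uploaded": "true"},
]
summary = summarize_multipart(sample_rows)
```

Keeping the grouping keyed by snapshot date makes it easy to compare two inventory runs side by side, as in the rows above.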

This result is surprising: only 5.7M objects appear to be multipart uploads, yet we have an order of magnitude more than that. What is going on?

On further analysis, we checked a few of those files and found that they were in fact normal multipart uploads in the DB, with their respective file handles. The reason for this inconsistency is that we encrypted the S3 bucket back in 2019, most likely using a PUT copy of each object onto itself. This belief is reinforced by the fact that the modified dates on those objects seem to be consistent with the timeline of the encryption, while the original upload in Synapse happened earlier. If the Python API was used, most likely all the objects smaller than a certain size were “copied” over without multipart.
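
To make the size-threshold hypothesis concrete, here is a small sketch of the decision a managed copy would make per object. It assumes the copy was done with boto3's managed transfer and its default `multipart_threshold` of 8 MiB. Neither the tool nor the threshold actually used in 2019 is confirmed here, and the function names are hypothetical.

```python
# Assumption: boto3's TransferConfig default multipart_threshold (8 MiB).
ASSUMED_MULTIPART_THRESHOLD = 8 * 1024 * 1024


def copied_as_multipart(size_bytes, threshold=ASSUMED_MULTIPART_THRESHOLD):
    """Return True if a managed copy would use a multipart copy for this size.

    Objects below the threshold are copied with a single PUT copy, so the
    resulting object is no longer flagged as multipart in the inventory.
    """
    return size_bytes >= threshold


def count_lost_multipart_flags(sizes, threshold=ASSUMED_MULTIPART_THRESHOLD):
    """Count objects that would lose the multipart flag after the copy."""
    return sum(1 for size in sizes if not copied_as_multipart(size, threshold))


# Made-up object sizes in bytes: 1 MiB, 5 MiB, 8 MiB, 100 MiB.
example_sizes = [1 * 1024**2, 5 * 1024**2, 8 * 1024**2, 100 * 1024**2]
lost = count_lost_multipart_flags(example_sizes)
```

Under this assumption, only objects at or above the threshold would still be reported as multipart after the encryption copy, which is consistent with the small multipart count in the inventory.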

Synapse File Handles

...