improve download speeds to be comparable to AWS
Description
Our data sizes are increasing, specifically for whole genome sequencing (500GB-1TB per file). Our download speeds are significantly slower than those achieved with the AWS clients; my tests show downloads at least 4x slower. In cases where we have data in external S3 buckets, our users are circumventing Synapse to get improved download speeds, and are (rightfully) complaining that data that does reside in Synapse S3 storage is slow to download when they know what kinds of speeds S3 can offer.
We have made significant improvements to data upload, and it has been indicated that similar improvements to download are likely possible as well, though not trivial.
Environment
Activity
Please assist us with the release of the client by looking at the plots I provided. If they look good, please 'Close Issue', or let me know and I can close it.
Could you provide a plot?
Final test on t3.xlarge instance
file | 1mb_syn | 1mb_aws | 2mb_syn | 2mb_aws | 4mb_syn | 4mb_aws | 8mb_syn | 8mb_aws | 16mb_syn | 16mb_aws | 32mb_syn | 32mb_aws | 64mb_syn | 64mb_aws | 128mb_syn | 128mb_aws | 256mb_syn | 256mb_aws | 512mb_syn | 512mb_aws | 1024mb_syn | 1024mb_aws | 2048mb_syn | 2048mb_aws | 4096mb_syn | 4096mb_aws |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.53 | 0.85 | 0.65 | 0.75 | 0.51 | 1.03 | 0.71 | 0.67 | 0.59 | 0.81 | 0.95 | 1.17 | 1.11 | 1.15 | 1.49 | 1.61 | 3.25 | 2.96 | 4.54 | 4.37 | 8.27 | 7.74 | 15.16 | 14.93 | 32.58 | 29.57 |
1 | 0.49 | 0.86 | 0.57 | 0.77 | 0.65 | 0.73 | 0.72 | 0.68 | 0.96 | 0.98 | 0.86 | 0.95 | 1.03 | 1.19 | 1.58 | 1.49 | 3.21 | 2.61 | 4.11 | 4.38 | 7.91 | 7.91 | 16.30 | 15.40 | 31.69 | 29.92 |
2 | 0.54 | 0.75 | 0.56 | 0.78 | 0.57 | 0.72 | 1.04 | 1.24 | 0.62 | 1.06 | 0.88 | 1.52 | 1.53 | 1.14 | 1.54 | 1.72 | 3.17 | 2.38 | 5.04 | 4.33 | 9.14 | 7.75 | 16.96 | 14.97 | 31.68 | 29.68 |
3 | 0.48 | 0.78 | 0.52 | 0.60 | 0.82 | 0.79 | 0.70 | 1.02 | 0.88 | 0.88 | 0.85 | 1.22 | 1.08 | 1.34 | 1.66 | 1.82 | 3.00 | 2.66 | 5.64 | 4.35 | 7.77 | 7.97 | 15.64 | 15.12 | 31.60 | 32.15 |
4 | 0.53 | 0.76 | 0.46 | 0.70 | 0.65 | 0.77 | 0.87 | 0.85 | 0.80 | 0.94 | 0.82 | 1.21 | 1.53 | 1.29 | 1.46 | 1.46 | 2.68 | 2.59 | 4.62 | 4.28 | 8.22 | 7.78 | 16.64 | 15.30 | 29.28 | 29.95 |
average | 0.51 | 0.80 | 0.55 | 0.72 | 0.64 | 0.81 | 0.81 | 0.89 | 0.77 | 0.93 | 0.87 | 1.21 | 1.26 | 1.22 | 1.55 | 1.62 | 3.06 | 2.64 | 4.79 | 4.34 | 8.26 | 7.83 | 16.14 | 15.15 | 31.37 | 30.25 |
Looks good to me! There is a very slight drop-off, but it's basically comparable.
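For reference, a comparison like the one in the table above can be scripted roughly as sketched below: time the same file through the Synapse client and through boto3 and record the wall-clock duration. This is a sketch only; the Synapse ID, bucket, and key names are placeholders, not the actual benchmark setup.

```python
# Hypothetical timing harness for comparing Synapse vs. direct S3 downloads.
# The Synapse ID, bucket, and key below are placeholders, not real entities.
import time

import boto3
import synapseclient

syn = synapseclient.Synapse()
syn.login()  # assumes credentials are already configured locally

s3 = boto3.client("s3")


def time_synapse_download(syn_id, dest_dir):
    """Time a download through the Synapse client, in seconds."""
    start = time.time()
    syn.get(syn_id, downloadLocation=dest_dir, ifcollision="overwrite.local")
    return time.time() - start


def time_aws_download(bucket, key, dest_path):
    """Time a direct S3 download through boto3, in seconds."""
    start = time.time()
    s3.download_file(bucket, key, dest_path)
    return time.time() - start


# Example usage with placeholder identifiers:
# print(time_synapse_download("syn00000000", "/tmp/bench"))
# print(time_aws_download("example-bucket", "bench/1024mb.bin", "/tmp/1024mb_aws.bin"))
```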
It doesn't look like we can speed up the MD5 calculation by feeding it byte chunks while the file is still downloading, so I'm going to move on.
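For context, the idea being ruled out is incremental hashing: updating the digest as each chunk is written instead of re-reading the whole file afterwards. A minimal sketch of that idea is below; note it only helps if chunks can be fed to the hash in file order, and the chunk iterator here is a stand-in for however the downloader delivers bytes, not the client's actual code.

```python
# Minimal sketch of the incremental-MD5 idea: maintain a running digest while
# writing ordered chunks, rather than hashing the finished file in a second pass.
import hashlib


def download_with_running_md5(chunks, dest_path):
    """Write ordered byte chunks to dest_path while maintaining an MD5 digest."""
    md5 = hashlib.md5()
    with open(dest_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            md5.update(chunk)
    return md5.hexdigest()
```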
My next step is working on a responsiveness issue: if the user signals to stop the download via KeyboardInterrupt (Ctrl+C), the workers continue doing work until whatever is left on the queue is exhausted. This can be problematic because it may be another 30 seconds or so before the download progress bar stops moving, leading users to think the program did not respond. If they continue to spam Ctrl+C, the program will not clean up threads properly, leaving orphaned threads that consume resources.
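One common pattern for this kind of problem is cooperative cancellation: workers poll a shared stop flag, so a Ctrl+C takes effect within one chunk instead of after the whole queue drains. The sketch below illustrates that pattern only; the names (download_chunk, chunk_request) are placeholders and this is not the client's actual implementation.

```python
# Sketch of cooperative cancellation: on Ctrl+C the main thread sets a shared
# event, workers stop pulling new chunks, and the remaining queue is discarded
# so threads exit promptly instead of finishing all queued work.
import queue
import threading
import time

stop_event = threading.Event()


def download_chunk(chunk_request):
    """Placeholder for the real per-chunk transfer work."""
    time.sleep(0.1)


def worker(chunk_queue):
    """Pull chunk requests until the queue is exhausted or a stop is signaled."""
    while not stop_event.is_set():
        try:
            chunk_request = chunk_queue.get(timeout=0.5)
        except queue.Empty:
            return  # everything was enqueued up front, so empty means done
        download_chunk(chunk_request)


def run_download(chunk_requests, num_workers=8):
    chunk_queue = queue.Queue()
    for request in chunk_requests:
        chunk_queue.put(request)
    threads = [threading.Thread(target=worker, args=(chunk_queue,), daemon=True)
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    try:
        while not chunk_queue.empty():
            time.sleep(0.2)            # keep the main thread responsive to Ctrl+C
    except KeyboardInterrupt:
        stop_event.set()               # workers stop picking up new chunks
        while not chunk_queue.empty(): # discard whatever is still queued
            try:
                chunk_queue.get_nowait()
            except queue.Empty:
                break
    for t in threads:
        t.join(timeout=2)              # give in-flight chunks a moment to finish


if __name__ == "__main__":
    run_download(range(100))
```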
Tom, it'd be helpful if you formatted the table better, i.e. limit the precision of the numbers.