review contributed download improvements/methods


A collaborator with Gates is investigating how to download from Synapse in various ways. He has written a CLI that implements download in a few ways. Would you like to evaluate this with me?




Kenneth Daily
February 25, 2020, 9:31 PM

From Patrick today:

I don't recall the exact numbers but I believe the current code (heavily using asyncio) was the fastest, or at least on par with multiple threads.
I decided to go with asyncio over threads since it greatly simplifies the code and fits the use case (heavy IO).
This package can also use a file view (--with-view), which speeds it up a lot on huge projects. The file view is used to build a cache of all the dataFileHandleIds.
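
As a rough illustration of the file view cache mentioned above, here is a minimal sketch. It assumes the view rows have already been fetched and are available as dictionaries with `id` and `dataFileHandleId` keys. The function name `build_file_handle_cache` and the sample rows are hypothetical and are not taken from the package.

```python
def build_file_handle_cache(view_rows):
    """Map each entity id to its dataFileHandleId (hypothetical helper).

    Rows without a file handle are skipped.
    """
    cache = {}
    for row in view_rows:
        handle_id = row.get("dataFileHandleId")
        if handle_id is not None:
            cache[row["id"]] = handle_id
    return cache


# Made-up rows standing in for the result of a single file view query.
rows = [
    {"id": "syn101", "dataFileHandleId": "9001"},
    {"id": "syn102", "dataFileHandleId": "9002"},
    {"id": "syn103", "dataFileHandleId": None},
]
cache = build_file_handle_cache(rows)
```

The point of the cache is that one view query replaces a per-file lookup, which is presumably where the gain on huge projects comes from.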

I built some pure async API methods that really helped too, and wrapped some of the existing methods to make them async.
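
One common way to wrap an existing blocking method so it can be awaited is to run it in a thread pool via `run_in_executor` and cap concurrency with a semaphore. The sketch below is only an illustration of that general pattern. `blocking_download`, `download_async`, and `download_all` are hypothetical names, and the sleep stands in for real network IO; this is not the package's actual code.

```python
import asyncio
import time


def blocking_download(file_id):
    """Hypothetical stand-in for an existing synchronous download call."""
    time.sleep(0.01)  # simulate network IO
    return f"{file_id}.dat"


async def download_async(file_id, semaphore):
    """Await the blocking call from the default thread pool, so the event loop stays free."""
    async with semaphore:
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, blocking_download, file_id)


async def download_all(file_ids, max_concurrent=10):
    """Download every file, with at most max_concurrent downloads in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [download_async(fid, semaphore) for fid in file_ids]
    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*tasks)


results = asyncio.run(download_all(["syn1", "syn2", "syn3"]))
```

The semaphore limit is an assumption added here to keep a project with tens of thousands of files from opening too many requests at once.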

We have been using this package daily in one of our production environments without any issues so it should be pretty solid. This code was rushed a bit though and I'm sure there are some improvements that could be made.

I'd be happy to review and chat about this with you and/or your new engineer, just let me know.

Kenneth Daily
February 25, 2020, 9:06 PM

A heads up: Patrick (the code's author) has made some changes, and the package no longer exposes some of the utilities it previously had. I have a fork of the repository that is still at the state it was in when we were discussing the speed issues:

Bruce Hoff
February 25, 2020, 8:59 PM

I think we should assign this to Jordan Kiang once he's on board. For now I will assign it to myself.

Kenneth Daily
October 4, 2019, 10:21 PM

Update from the contributor:

The real speed improvements we've seen are on huge Projects. The one we are testing has 56,000+ files and around 80 GB of data.

So far this class is the winner (it uses the entity view, thanks for the tip!).

Also using asyncio appears to make a big difference in speed.

Robert Minneker
October 4, 2019, 5:03 PM

Thanks for reaching out, I would be happy to look into this. I ran some tests of the spccore multithreaded download I wrote on Wednesday, so it would be interesting to compare speed among other things.


