Support range requests on synapse gets

Description

A user is requesting support for range requests on synapse gets to avoid having to retrieve a whole file.

https://www.synapse.org/#!Synapse:syn5637528/discussion/threadId=6961

We make use of range requests internally in the implementation of multithreaded download, so this should not be too difficult (at least for S3 downloads), although there may be some interactions with the Synapse cache.
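
For reference, a minimal sketch of what a client-side range request could look like. It uses only the Python standard library and is not the synapseclient API. The function names (range_header, split_ranges, fetch_range) are hypothetical, and the url argument is assumed to be a pre-signed download URL for a server that honours Range headers (as S3 does).

```python
import urllib.request


def range_header(start: int, end: int) -> dict:
    """Build an HTTP Range header for the inclusive byte span [start, end]."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return {"Range": f"bytes={start}-{end}"}


def split_ranges(total_size: int, parts: int) -> list:
    """Divide total_size bytes into at most `parts` contiguous inclusive ranges."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    chunk = -(-total_size // parts)  # ceiling division
    return [
        (start, min(start + chunk, total_size) - 1)
        for start in range(0, total_size, chunk)
    ]


def fetch_range(url: str, start: int, end: int) -> bytes:
    """Fetch one byte range. `url` is an assumed pre-signed download URL."""
    request = urllib.request.Request(url, headers=range_header(start, end))
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the server honoured the Range header;
        # a plain 200 means it ignored the header and is sending the whole file.
        if response.status != 206:
            raise RuntimeError(f"range not honoured (HTTP {response.status})")
        return response.read()
```

Each worker would call fetch_range with one of the spans returned by split_ranges. An even split is only an illustration: for a BAM file, the useful offsets would more likely come from the file's index than from dividing the byte count evenly.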

Environment

None

Activity

Bruce Hoff
April 17, 2020, 9:32 PM

, yes you did the right thing by creating a Jira issue.

Jordan Kiang
April 17, 2020, 8:53 PM

made me aware of this request on the forum. I’m not sure how we typically handle random outside feature requests. I don’t necessarily think it’s high priority or anything, but figured it should be logged?

Bruce Hoff
April 17, 2020, 8:38 PM

thanks for the context.

Also, I think 's assessment of the LOE is spot on.

Ryan Luce
April 17, 2020, 8:09 PM
Edited

- from the request on the forum - “Is it possible to use the synapseclient to do a HTTP range request to get a portion of a file? For example, if there is a large bam file that I want to different worker nodes to operate on different regions of the bam, I do not want each worker to have to download the entire bam file.”

Information on what a bam file is from https://software.broadinstitute.org/software/igv/BAM

“A BAM file (.bam) is the binary version of a SAM file. A SAM file (.sam) is a tab-delimited text file that contains sequence alignment data. These formats are described on the SAM Tools web site: http://samtools.github.io/hts-specs/.

BAM, rather than SAM, is the recommended format for IGV. Starting with IGV 2.0.11, IUPAC ambiguity codes in BAM files are supported.”

If there haven’t been other requests for this type of feature, I don’t think it’s something we should prioritize. I’m going to ask folks on Slack.

 

Bruce Hoff
April 17, 2020, 8:06 PM

, I've never heard such a request. Did the requester give additional context, e.g. what sort of file they are hoping to read part of, and how big it is? If we decide to do this, we should make sure that the range request is performant, i.e. that the time to download doesn't increase as the 'range' goes deeper into the file.

Assignee

Bruce Hoff

Reporter

Jordan Kiang

Labels

None

Validator

Bruce Hoff

Development Area

Synapse Core Infrastructure

Release Version History

None

Slack Channel

None

Components

Priority

Minor