While Synapse provides physical storage for files (using Amazon S3), not all data in Synapse is stored in Synapse-controlled locations; you can configure your own custom storage locations as well. For example, data files can physically reside in your own S3 bucket, on an SFTP server, or on a local file server accessed through a proxy server. Creating a custom storage location gives you greater ownership and control of your files, especially when you have a large amount of data or when additional restrictions need to be set on the data.
...
Setting Up an External AWS S3 Bucket
There are two ways to set up an external AWS S3 bucket:
Setup with AWS Console: manual setup using the AWS Console.
Setup with AWS Cloudformation: automated setup using AWS CloudFormation.
To begin, follow the documentation on the Amazon Web Services (AWS) site to create a bucket. Buckets do not need to be located in the US. View AWS Bucket Instructions.
Make the following adjustments to customize the bucket to work with Synapse:
When the AWS instructions prompt you to Create a Bucket - Select a Bucket Name and Region, use a unique name, for example thisisthenameofmybucket.
Select the newly created bucket and click the Permissions tab.
Select the Bucket Policy button and copy one of the policies below (read-only or read-write permissions). Change the Resource from "synapse-share.yourcompany.com" to the name of your new bucket (in both places) and ensure that the Principal is "AWS":"325565585839". This is Synapse's account number.
...
To allow authorized Synapse users to upload data to your bucket, set read-write permissions on the bucket so that Synapse can upload and retrieve files:
```json
{
  "Statement": [
    {
      "Action": [ "s3:ListBucket*", "s3:GetBucketLocation" ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::thisisthenameofmybucket",
      "Principal": { "AWS": "325565585839" }
    },
    {
      "Action": [ "s3:*Object*", "s3:*MultipartUpload*" ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::thisisthenameofmybucket/*",
      "Principal": { "AWS": "325565585839" }
    }
  ]
}
```
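Since the only bucket-specific parts of the policy are the two Resource ARNs, it can help to generate the JSON programmatically and paste the output into the Bucket Policy editor. A minimal Python sketch (the helper name is ours; the account number and actions come from the policy above):

```python
import json

SYNAPSE_ACCOUNT = "325565585839"  # Synapse's AWS account number, as in the policy above

def read_write_policy(bucket: str) -> str:
    """Build the read-write bucket policy, substituting the bucket
    name into both Resource ARNs."""
    return json.dumps({
        "Statement": [
            {
                "Action": ["s3:ListBucket*", "s3:GetBucketLocation"],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Principal": {"AWS": SYNAPSE_ACCOUNT},
            },
            {
                "Action": ["s3:*Object*", "s3:*MultipartUpload*"],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Principal": {"AWS": SYNAPSE_ACCOUNT},
            },
        ]
    }, indent=2)

print(read_write_policy("thisisthenameofmybucket"))
```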
For read-write permissions, you also need to create an object that proves to the Synapse service that you own this bucket. To do so, create a file named owner.txt containing a line-separated list of user identifiers that are allowed to register and upload to the bucket. Valid user identifiers are a numeric Synapse user ID or the numeric ID of a team you are a member of.
The ID of the user or team can be obtained by navigating to the user profile or the team page; the ID is the numeric value shown in the browser URL bar after the Profile: or Team: prefix.
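For example, owner.txt can be written with a couple of lines of Python; the IDs below are hypothetical placeholders, so substitute your own user or team IDs:

```python
# Write owner.txt: one Synapse user or team ID per line.
# The IDs below are hypothetical placeholders -- replace with your own.
allowed_ids = ["1234567", "7654321"]

with open("owner.txt", "w") as f:
    f.write("\n".join(allowed_ids) + "\n")
```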
...
You can upload the file with the Amazon Web Console or the AWS command line client.
Web
Navigate to your bucket on the Amazon Console and select Upload to upload your text file.
...
Command line
```shell
# copy your owner.txt file to your S3 bucket
aws s3 cp owner.txt s3://nameofmybucket/nameofmyfolder
```
Web
...
...
Read-only permissions
If you do not want to allow authorized Synapse users to upload data to your bucket, but want to provide read access instead, change the permissions to read-only:
```json
{
  "Statement": [
    {
      "Action": [ "s3:ListBucket*", "s3:GetBucketLocation" ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::synapse-share.yourcompany.com",
      "Principal": { "AWS": "325565585839" }
    },
    {
      "Action": [ "s3:GetObject*", "s3:*MultipartUpload*" ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::synapse-share.yourcompany.com/*",
      "Principal": { "AWS": "325565585839" }
    }
  ]
}
```
...
Enable cross-origin resource sharing (CORS)
In Permissions, click CORS configuration. In the CORS configuration editor, edit the configuration so that Synapse is included in the AllowedOrigin tag. An example CORS configuration that would allow this is:
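As a sketch, such a rule can also be expressed as a boto3 `put_bucket_cors` payload. The wildcard origin and the method list here are illustrative assumptions, not Synapse's official recommendation; only the requirement that Synapse be covered by AllowedOrigin comes from the text above:

```python
# Illustrative CORS rule covering Synapse's web origin.
# The origin and method list are assumptions -- consult the Synapse
# documentation for the officially recommended configuration.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["*"],  # or restrict to "https://www.synapse.org"
            "AllowedMethods": ["GET", "PUT", "POST", "HEAD"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }
    ]
}

# Applying it would look like this (requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_cors(
#     Bucket="thisisthenameofmybucket", CORSConfiguration=cors_configuration)
```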
...
Setup with AWS Cloudformation
For convenience, AWS CloudFormation can be used to provision a custom AWS S3 bucket for use with Synapse. This approach results in the same bucket as described in Setup with AWS Console.
...
```r
# Set the storage location
library(synapser)
library(rjson)
synLogin()
projectId <- 'syn12345'
destination <- list(uploadType='S3',
                    concreteType='org.sagebionetworks.repo.model.project.ExternalS3StorageLocationSetting',
                    bucket='nameofyourbucket')
destination <- synRestPOST('/storageLocation', body=toJSON(destination))
projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId
projectDestination <- synRestPOST('/projectSettings', body=toJSON(projectDestination))
```
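The same two REST calls can be made from the Synapse Python client. A minimal sketch of the request body (the project ID and bucket name are placeholders; the commented calls require `synapseclient` and a login):

```python
import json

# Request body for the /storageLocation REST call
# (placeholders: replace the project ID and bucket name with your own).
project_id = "syn12345"
destination = {
    "uploadType": "S3",
    "concreteType": "org.sagebionetworks.repo.model.project.ExternalS3StorageLocationSetting",
    "bucket": "nameofyourbucket",
}
print(json.dumps(destination, indent=2))

# With the Synapse Python client this would be submitted as (requires login):
# import synapseclient
# syn = synapseclient.login()
# destination = syn.restPOST("/storageLocation", body=json.dumps(destination))
# project_destination = {
#     "concreteType": "org.sagebionetworks.repo.model.project.UploadDestinationListSetting",
#     "settingsType": "upload",
#     "locations": [destination["storageLocationId"]],
#     "projectId": project_id,
# }
# syn.restPOST("/projectSettings", body=json.dumps(project_destination))
```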
Web
Navigate to your Project/Folder -> Tools -> Change Storage Location. In the resulting pop-up, select the Amazon S3 Bucket option and fill in the relevant information, where Bucket is the name of your external bucket, Base Key is the name of the folder in your bucket to upload to, and Banner is a short description such as who owns the storage location:
...
To add files to Synapse that are already in your bucket, see Adding Files in Your S3 Bucket to Synapse below.
Command line
```shell
# copy your owner.txt file to your Google Cloud bucket
gsutil cp owner.txt gs://nameofmybucket/nameofmyfolder
```
Web
Navigate to your bucket on the Google Cloud Console and select the Upload files button to upload your text file into the folder where you want your data.
Make sure to enable cross-origin resource sharing (CORS)
Follow the instructions for [Setting CORS on a bucket](https://cloud.google.com/storage/docs/configuring-cors). You may have to install the gsutil application.
...
```r
# Set the storage location
library(synapser)
library(rjson)
synLogin()
projectId <- 'syn12345'
destination <- list(uploadType='GOOGLECLOUDSTORAGE',
                    concreteType='org.sagebionetworks.repo.model.project.ExternalGoogleCloudStorageLocationSetting',
                    bucket='nameofyourbucket',
                    baseKey='nameOfSubfolderInBucket') # optional, only necessary if using a subfolder in your bucket
destination <- synRestPOST('/storageLocation', body=toJSON(destination))
projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId
projectDestination <- synRestPOST('/projectSettings', body=toJSON(projectDestination))
```
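The Google Cloud payload differs from the S3 one only in the uploadType, the concreteType, and the optional baseKey. As a sketch in Python (bucket and subfolder names are placeholders):

```python
import json

# Storage-location request body for an external Google Cloud bucket
# (placeholders: replace bucket and baseKey with your own values).
destination = {
    "uploadType": "GOOGLECLOUDSTORAGE",
    "concreteType": "org.sagebionetworks.repo.model.project.ExternalGoogleCloudStorageLocationSetting",
    "bucket": "nameofyourbucket",
    "baseKey": "nameOfSubfolderInBucket",  # optional; only needed for a subfolder
}
print(json.dumps(destination, indent=2))

# As with S3, this body would be POSTed to /storageLocation (requires login),
# followed by the same /projectSettings call to attach it to the project.
```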
Web
Navigate to your Project/Folder -> Tools -> Change Storage Location. In the resulting pop-up, select the Google Cloud Storage Bucket option and fill in the relevant information, where Bucket is the name of your external bucket, Base Key is the name of the folder in your bucket to upload to, and Banner is a short description such as who owns the storage location.
...