Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The above example shows required parameters:

  • Department - An A department tag. Can be any arbitarty arbitrary text.

  • Project - An A project tag. Can be any arbitarty arbitrary text.

  • OwnerEmail - An A bucket owner tag. A valid email.

  • SynapseUserName - The Synapse account user name. Note: Department, Project, OwnerEmail are only used to tag the bucket and can be arbitrary.

...

After executing the cloudformation command, view the AWS cloudformation dashboard to verify whether the bucket was provisioned successfully.

Set S3 Bucket as Upload Location

By default, your Project/Folder uses Synapse storageyour Project/Folder uses Synapse’s default S3 storage location. You can use the external bucket configured above via our the web or programmatic clients or web client.

Python

...

Web

Navigate to your Project/Folder -> Tools -> Change Storage Location. In the resulting pop-up, select the Amazon S3 Bucket option and fill in the relevant information, where Bucket is the name of your external bucket, Base Key is the name of the folder in your bucket to upload to, and Banner is a short description such as who owns the storage location:

...

Python

Code Block
languagepy
# Set storage location
import synapseclient
import json
syn = synapseclient.login()
PROJECT = 'syn12345'

destination = {'uploadType':'S3',
               'concreteType':'org.sagebionetworks.repo.model.project.ExternalS3StorageLocationSetting',
               'bucket':'nameofyourbucket'}
destination = syn.restPOST('/storageLocation', body=json.dumps(destination))

project_destination ={'concreteType': 'org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                      'settingsType': 'upload'}
project_destination['locations'] = [destination['storageLocationId']]
project_destination['projectId'] = PROJECT

project_destination = syn.restPOST('/projectSettings', body = json.dumps(project_destination))

R

Code Block
languager
#set storage location
library(synapser)
library(rjson)
synLogin()
projectId <- 'syn12345'

destination <- list(uploadType='S3',
                    concreteType='org.sagebionetworks.repo.model.project.ExternalS3StorageLocationSetting',
                    bucket='nameofyourbucket')
destination <- synRestPOST('/storageLocation', body=toJSON(destination))

projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId

projectDestination <- synRestPOST('/projectSettings', body=toJSON(projectDestination))
Web

Navigate to your Project/Folder -> Tools -> Change Storage Location. In the resulting pop-up, select the Amazon S3 Bucket option and fill in the relevant information, where Bucket is the name of your external bucket, Base Key is the name of the folder in your bucket to upload to, and Banner is a short description such as who owns the storage location:

...

)

Adding Files in Your S3 Bucket to Synapse

...

If the bucket is read-only or you already have content in the bucket, you will have to add representations of the files in Synapse programmatically. This is done using a FileHandle, which is a Synapse representation of the file.

Python

Code Block
languagepy
# create filehandle
fileHandle = {'concreteType': 'org.sagebionetworks.repo.model.file.S3FileHandle',
              'fileName'    : 'nameOfFile.csv',
              'contentSize' : "sizeInBytes",
              'contentType' : 'text/csv',
              'contentMd5' :  'md5',
              'bucketName' : destination['bucket'],
              'key' : 's3ObjectKey',
              'storageLocationId': destination['storageLocationId']}
fileHandle = syn.restPOST('/externalFileHandle/s3', json.dumps(fileHandle), endpoint=syn.fileHandleEndpoint)

f = synapseclient.File(parentId=PROJECT, dataFileHandleId = fileHandle['id'])

f = syn.store(f)
R
Code Block
languager
# create filehandle
fileHandle <- list(concreteType='org.sagebionetworks.repo.model.file.S3FileHandle',
                   fileName    = 'nameOfFile.csv',
                   contentSize = 'sizeInBytes',
                   contentType = 'text/csv',
                   contentMd5 =  'md5',
                   storageLocationId = destination$storageLocationId,
                   bucketName = destination$bucket,
                   key ='s3ObjectKey')
fileHandle <- synRestPOST('/externalFileHandle/s3', body=toJSON(fileHandle), endpoint = 'https://file-prod.prod.sagebase.org/file/v1')

f <- File(dataFileHandleId=fileHandle$id, parentId=projectId)

f <- synStore(f)

Please see See the REST docs for more information on setting external storage location settings using our REST API.

...

  • Select the newly created bucket and click the Permissions tab.

  • Select the Add members button and enter the member synapse-svc-prod@uplifted-crow-246820.iam.gserviceaccount.com. This is Synapse’s service account. Give the account the permissions “Storage Storage Legacy Bucket Reader” Reader and “Storage Storage Object Viewer” Viewer for read permission. To allow Synapse to upload files, additionally grant the “Storage Storage Legacy Bucket Writer” Writer permission.

For read-write permissions, you also need to create an object that proves to the Synapse service that you own this bucket. This can be done by creating a file named owner.txt that contains a line separated list of user identifiers that are allowed to register the bucket and uploading it to your bucket. Valid user identifiers are: a Synapse user id ID or the id ID of a team the user is that you are a member of.

The id ID of the user or the team can be obtained by navigating to the user profile or to the team page, the id ID is the numeric value shown in the browser URL bar after the Profile: or Team: prefixes:

...

To add files to Synapse that are already in your bucket, see Adding Files in Your S3 Bucket to Synapse below.

Command line

Code Block
# copy your owner.txt file to your s3 bucket
gsutil cp owner.txt gs://nameofmybucket/nameofmyfolder

Web

Navigate to your bucket on the Google Cloud Console and select the Upload files button to upload your text file into the folder where you want your data.

...

Enable cross-origin resource sharing (CORS)

Follow the instructions for [Setting CORS on a bucket](https://cloud.google. com/storage/docs/configuring-cors . You may have to install the gsutil application.

...

By default, your Project uses the Synapse default storage location. You can use the external bucket configured above via our programmatic clients or web client.

Python

Code Block
# Set storage location
import synapseclient
import json
syn = synapseclient.login()
PROJECT = 'syn12345'

destination = {'uploadType':'GOOGLECLOUDSTORAGE', 
               'concreteType':'org.sagebionetworks.repo.model.project.ExternalGoogleCloudStorageLocationSetting',
               'bucket':'nameofyourbucket',
               'baseKey': 'nameOfSubfolderInBucket' # optional, only necessary if using a subfolder in your bucket
               }
destination = syn.restPOST('/storageLocation', body=json.dumps(destination))

project_destination ={'concreteType': 'org.sagebionetworks.repo.model.project.UploadDestinationListSetting', 
                      'settingsType': 'upload'}
project_destination['locations'] = [destination['storageLocationId']]
project_destination['projectId'] = PROJECT

project_destination = syn.restPOST('/projectSettings', body = json.dumps(project_destination))

R

Code Block
#set storage location
library(synapser)
library(rjson)
synLogin()
projectId <- 'syn12345'

destination <- list(uploadType='GOOGLECLOUDSTORAGE', 
                    concreteType='org.sagebionetworks.repo.model.project.ExternalGoogleCloudStorageLocationSetting',
                    bucket='nameofyourbucket',
                    baseKey='nameOfSubfolderInBucket' # optional, only necessary if using a subfolder in your bucket
               }
)
destination <- synRestPOST('/storageLocation', body=toJSON(destination))

projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting', 
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId

projectDestination <- synRestPOST('/projectSettings', body=toJSON(projectDestination))

Web

Navigate to your Project/Folder -> Tools -> Change Storage Location. In the resulting pop-up, select the Google Cloud Storage Bucket option and fill in the relevant information, where Bucket is the name of your external bucket, Base Key is the name of the folder in your bucket to upload to, and Banner is a short description such as who owns the storage location.

...

If the bucket is read-only or you already have content in the bucket, you will have to add representations of the files in Synapse programmatically. This is done using a FileHandle, which is a Synapse representation of the file.

Python

Code Block
languagepy
externalFileToAdd = 'googleCloudObjectKey' # put the key for the file to add here

# create filehandle
fileHandle = {'concreteType': 'org.sagebionetworks.repo.model.file.GoogleCloudFileHandle', 
              'fileName'    : 'nameOfFile.csv',
              'contentSize' : "sizeInBytes",
              'contentType' : 'text/csv',
              'contentMd5' :  'md5',
              'bucketName' : destination['bucket'],
              'key' : externalFileToAdd,
              'storageLocationId': destination['storageLocationId']}
fileHandle = syn.restPOST('/externalFileHandle/googleCloud', json.dumps(fileHandle), endpoint=syn.fileHandleEndpoint)
f = synapseclient.File(parentId=PROJECT, dataFileHandleId = fileHandle['id'])
f = syn.store(f)

R

Code Block
externalFileToAdd <- 'googleCloudObjectKey' # put the key for the file to add here

# create filehandle
fileHandle <- list(concreteType='org.sagebionetworks.repo.model.file.GoogleCloudFileHandle', 
                   fileName    = 'nameOfFile.csv',
                   contentSize = 'sizeInBytes',
                   contentType = 'text/csv',
                   contentMd5 =  'md5',
                   storageLocationId = destination$storageLocationId,
                   bucketName = destination$bucket,
                   key = externalFileToAdd)
fileHandle <- synRestPOST('/externalFileHandle/googleCloud', body=toJSON(fileHandle), endpoint = 'https://file-prod.prod.sagebase.org/file/v1')
f <- File(dataFileHandleId=fileHandle$id, parentId=projectId)
f <- synStore(f)

...

To setup an SFTP as a storage location, the settings on the Project need the Project need to be changed, specifically the storageLocation needs to be set. This is best done using either R or Python but has alpha support in the web browser. Customize the code below to set the storage location as your SFTP server:

Python

Code Block
import synapseclient
import json
syn = synapseclient.login()

destination = { "uploadType":"SFTP",
    "concreteType":"org.sagebionetworks.repo.model.project.ExternalStorageLocationSetting",
    "description":"My SFTP upload location",
    "supportsSubfolders":True,
    "url":"sftp://your-sftp-server.com",
    "banner":"A descriptive banner, tada!"}

destination = syn.restPOST('/storageLocation', body=json.dumps(destination))

project_destination = {"concreteType":"org.sagebionetworks.repo.model.project.UploadDestinationListSetting",
    "settingsType":"upload"}
project_destination['projectId'] = PROJECT
project_destination['locations'] = [destination['storageLocationId']]

project_destination = syn.restPOST('/projectSettings', body = json.dumps(project_destination))

R

Code Block
library(synapseClient)
synapseLogin()
projectId <- 'syn12345'

destination <- list(uploadType='SFTP',
                    concreteType='org.sagebionetworks.repo.model.project.ExternalStorageLocationSetting',
                    description='My SFTP upload location',
                    supportsSubfolders=TRUE,
                    url='https://your-sftp-server.com',
                    banner='A descriptive banner, tada!')

destination <- synRestPOST('/storageLocation', body=destination)

projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId

projectDestination <- synRestPOST('/projectSettings', body = projectDestination)

Using a Proxy to Access a Local File Server or SFTP Server

For files stored outside of Amazon, an additional proxy is needed to validate the pre-signed URL and then proxy the requested file contents. View more information here about the process as well as about creating a local proxy or a SFTP proxy.

Set Project Settings for a Local Proxy

You must have a key (“your_secret_key”) to allow Synapse to interact with the filesystem.

Python

Code Block
languagepy
import synapseclient
import json
syn = synapseclient.login()
PROJECT = 'syn12345'

destination = {"uploadType":"PROXYLOCAL",
               "secretKey":"your_secret_key",
               "proxyUrl":"https://your-proxy.prod.sagebase.org",
               "concreteType":"org.sagebionetworks.repo.model.project.ProxyStorageLocationSettings"}
destination = syn.restPOST('/storageLocation', body=json.dumps(destination))

project_destination ={"concreteType": "org.sagebionetworks.repo.model.project.UploadDestinationListSetting",
                      "settingsType": "upload"}
project_destination['locations'] = [destination['storageLocationId']]
project_destination['projectId'] = PROJECT

project_destination = syn.restPOST('/projectSettings', body = json.dumps(project_destination))

R

Code Block
languager
library(synapser)
synLogin()
projectId <- 'syn12345'

destination <- list(uploadType='PROXYLOCAL',
                    secretKey='your_secret_key',
                    proxyUrl='https://your-proxy.prod.sagebase.org',
                    concreteType='org.sagebionetworks.repo.model.project.ProxyStorageLocationSettings')
destination <- synRestPOST('/storageLocation', body=toJSON(destination))

projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId

projectDestination <- synRestPOST('/projectSettings', body=toJSON(projectDestination))

Set Project Settings for a SFTP Proxy

You must have a key (“your_secret_key”) to allow Synapse to interact with the filesystem.

Python
Code Block
import synapseclient
import json
syn = synapseclient.login()
PROJECT = 'syn12345'

destination = {"uploadType":"SFTP",
               "secretKey":"your_secret_key",
               "proxyUrl":"https://your-proxy.prod.sagebase.org",
               "concreteType":"org.sagebionetworks.repo.model.project.ProxyStorageLocationSettings"}
destination = syn.restPOST('/storageLocation', body=json.dumps(destination))

project_destination ={"concreteType": "org.sagebionetworks.repo.model.project.UploadDestinationListSetting",
                      "settingsType": "upload"}
project_destination['locations'] = [destination['storageLocationId']]
project_destination['projectId'] = PROJECT

project_destination = syn.restPOST('/projectSettings', body = json.dumps(project_destination))
R
Code Block
library(synapser)
synLogin()
projectId <- 'syn12345'

destination <- list(uploadType='SFTP',
                    secretKey='your_secret_key',
                    proxyUrl='https://your-proxy.prod.sagebase.org',
                    concreteType='org.sagebionetworks.repo.model.project.ProxyStorageLocationSettings')
destination <- synRestPOST('/storageLocation', body=toJSON(destination))

projectDestination <- list(concreteType='org.sagebionetworks.repo.model.project.UploadDestinationListSetting',
                           settingsType='upload')
projectDestination$locations <- list(destination$storageLocationId)
projectDestination$projectId <- projectId

projectDestination <- synRestPOST('/projectSettings', body=toJSON(projectDestination))

...