...

Code Block
titleR code
myFile <- synStore(File("/path/to/file", parentId="syn123"), 
                   used=list('syn445865', 'http://www.google.com'), 
                   activityName="Updated dbgap ids")

Example Table Interaction

User creates a new Table:

Ideally there would be some way of passing a csv/tsv file to create a table:

Code Block
languagepy
table = syn.createTableFromCSV(Table('path/to/file', parent='syn1231')) 

but as a first pass it may make more sense to create an interface that mimics the REST API, which we can later build on to create a convenience function for this expanded functionality.

The construction of a Table consists of creating column models by performing a POST on /column which returns a ColumnModel and then creating the entity by posting a TableEntity to /entity/.  The Table entity should contain a list of column ids obtainable from the list of columnModels.

 

Code Block
languagepy
#Create a local representation of a table
table = synapseclient.Table(name='foobar', parent='syn123', columns=[{'columnType': 'int', 'name': 'age'}, {'columnType': 'string', 'name': 'gender', 'enumValues': ['m', 'f']}])
table = syn.store(table)
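
The REST sequence described above (POST each column model to /column, then POST an entity that lists the returned column ids) could be sketched roughly as follows. This is only an illustration under stated assumptions: `rest_post` and `fake_rest_post` are hypothetical stand-ins for the real web service call, and the body field names `parentId` and `columnIds` are assumed rather than taken from the API definition.

```python
import itertools


def create_table(rest_post, name, parent_id, columns):
    """Create column models, then a table entity that references them.

    rest_post is a caller-supplied function (hypothetical) taking (uri, body)
    and returning the decoded JSON response as a dict.
    """
    column_ids = []
    for column in columns:
        # POST /column returns the stored ColumnModel, including its id.
        column_model = rest_post('/column', column)
        column_ids.append(column_model['id'])
    # The entity carries the list of column ids, not the models themselves.
    # The field names 'parentId' and 'columnIds' are assumptions.
    entity = {'name': name, 'parentId': parent_id, 'columnIds': column_ids}
    return rest_post('/entity', entity)


_ids = itertools.count(1)


def fake_rest_post(uri, body):
    """Stand-in for the server: echoes the body back with a fresh id."""
    return dict(body, id=str(next(_ids)))


table = create_table(
    fake_rest_post, 'foobar', 'syn123',
    [{'columnType': 'int', 'name': 'age'},
     {'columnType': 'string', 'name': 'gender', 'enumValues': ['m', 'f']}])
```

A convenience function such as the CSV-based constructor could later be layered on top of a helper like this.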

We might want to eventually make convenience functions for storing columnModels locally or an easier way of representing them.

User Requests data from a table using a query

Code Block
languagepy
#A query always returns a Table object that the user can extract the data from
table = syn.queryTable('syn12312', 'select *')
df = table.as_df() #Returns the data as a pandas DataFrame
array = table.as_matrix() #Returns a 2D numpy array
lists = table.as_dict() # Returns a dict of lists (this would be a generic python solution that is not dependent on external libraries)
 
#User can now make modifications or additions to the table, but storing the changes/additions will require adding a df/array/lists back to the table
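
The library-independent "dict of lists" form mentioned for as_dict() might look like the sketch below. The helper names `rows_to_dict` and `dict_to_rows` are hypothetical and only illustrate the conversion between row-oriented query results and column-oriented lists.

```python
def rows_to_dict(headers, rows):
    """Convert a header list and a list of rows into a dict of column lists."""
    columns = {header: [] for header in headers}
    for row in rows:
        for header, value in zip(headers, row):
            columns[header].append(value)
    return columns


def dict_to_rows(headers, columns):
    """Inverse conversion: rebuild the rows in the order given by headers."""
    return [list(row) for row in zip(*(columns[h] for h in headers))]


lists = rows_to_dict(['age', 'gender'], [[34, 'm'], [29, 'f']])
rows = dict_to_rows(['age', 'gender'], lists)
```

The round trip matters because storing changes requires handing the modified data back to the table.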

User Adds Data to existing Table

Code Block
languagepy
#Assume we have a local df extracted from a table above
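df.loc[len(df)] = row_of_values #Append a row; row_of_values is a list with one value per column
df.iloc[1,1] = 'bar' #.ix has been removed from pandas; .iloc selects by position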
 
table.from_df(df)
table = syn.store(table)

 

 

Command Set

We conceptually divide the client commands into three levels: (1) Common functions, (2) Advanced functions and (3) low-level Web API functions.  The first collection of commands captures the majority of functionality of interest to users. The second collection rounds out the functionality with less frequently used functions.  The third set comprises simple, low-level wrappers around the Synapse web service interface.  By including this third set, users can access web services in advance of having specialized commands in the analytic clients.

 

Command

comments

R Syntax

Python Syntax

Command Line Syntax

1 – Common functions

    

Create a Synapse file handle in memory, specifying the path to the file in the local file system, the name in Synapse, and the Folder in Synapse.  This step 'stages' a file to be sent to Synapse. Additional parameters (...) are interpreted as properties or annotations, in the manner of synSet(), defined below.  If 'synapseStore' is TRUE then file is uploaded to S3, else only the file location is saved.

Note:  synapseStore=T is not allowed if "path" is a URL rather than a local file path.

The specified file doesn't move or get copied.

File(path, parentId, synapseStore=T, ...)

 

example:

File(path="/path/to/file", parentId="syn101")

File(path, parentId, synapseStore=True, **kwargs)

 

example:

File('/foo/baz/bar.txt', 'syn123')

NA

Create a Synapse file handle in memory which will hold a serialized version of one or more in-memory objects.  Additional parameters (...) are interpreted as properties or annotations, in the manner of synSet(), defined below. If 'synapseStore' is TRUE then the file is uploaded to S3, else only the file location is saved.

The object is not serialized at this time. 

(We are hoping people will like calling the object a File, even though it's a collection of in-memory objects.)

File(parentId)

 

example:

file<-File(parentId="syn101")

file<-addObject(file, obj)

 

Will not be implemented in python.

NA

Create a Synapse Record in memory, specifying the name and the Folder in Synapse.  This step 'stages' a Record to be sent to Synapse. 

Additional parameters (...) are interpreted as properties or annotations, in the manner of synSet(), defined below.

 

Files aren't moved or copied.

TODO:  How do you specify file annotations (as distinct from Strings)?  Shall we introduce in-memory wrappers around files and urls to help distinguish them?

Record(name=NULL, parentId="syn101", ...)

example:
Record(name="foo", parentId="syn101")

Record(name="foo", parentId="syn101", **kwargs)

 

Create a Folder or Project in memory. Name and parentId are optional.

 

Folder(name=NULL, parentId=NULL, ...)

Project(name=NULL, ...)

example:
Folder(name="foo", parentId="syn101")

Folder(name="foo", parentId="syn101", **kwargs)

Project(name="foo", **kwargs)

 

Set an entity's attribute (property or annotation) in memory.  The client first checks properties, then goes to annotations. Setting the value to NULL deletes it in R; using the del operator deletes it in Python.

TODO:  we want to include files and (for R) in-memory objects

synAnnot(entity, name)<-value

entity.parentId="syn101"

synapse update id --parentId syn101

Gets an entity's attribute value (property or annotation) from the object already in memory.

 

synAnnot(entity, name); returns NULL if undefined

entity.name; throws exception if value is undefined
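
The "properties first, then annotations" resolution described for setting and getting attributes could be sketched in Python as below. This is a toy class, not the client's implementation: the `_property_names` tuple is a hypothetical stand-in for whatever schema the real client would consult.

```python
class Entity(object):
    """Toy entity that routes attribute access to properties, then annotations."""

    # Hypothetical set of property names; a real client would take these
    # from the entity schema.
    _property_names = ('name', 'parentId', 'description')

    def __init__(self, **kwargs):
        # Bypass __setattr__ while creating the two backing dicts.
        object.__setattr__(self, 'properties', {})
        object.__setattr__(self, 'annotations', {})
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Known property names go to properties, everything else to annotations.
        if key in self._property_names:
            self.properties[key] = value
        else:
            self.annotations[key] = value

    def __getattr__(self, key):
        # Only called when normal lookup fails; check properties first.
        for store in (self.properties, self.annotations):
            if key in store:
                return store[key]
        raise AttributeError(key)

    def __delattr__(self, key):
        for store in (self.properties, self.annotations):
            if key in store:
                del store[key]
                return
        raise AttributeError(key)


entity = Entity(name='foo', parentId='syn101', tissue='liver')
entity.parentId = 'syn102'   # updates a property
del entity.tissue            # deletes an annotation
```

Reading an undefined attribute raises AttributeError here, matching the "throws exception if value is undefined" behaviour listed for Python.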

 

Create or update an entity (File, Folder, etc.) in Synapse.  May also specify (1) the list of entities 'used' to generate this one, (2) the list of entities 'executed' to generate this one, (3) the name of the generation activity, (4) the description of the generation activity, (5) whether a name collision in an attempted 'create' should become an 'update', (6) whether to 'force' a new version to be created, and (7) whether the data is restricted (which will put a download 'lock' on the data and contact the Synapse Access and Compliance team for review).

TODO:  Give some examples.

synStore(entity, used=NULL, executed=NULL, activityName=NULL, activityDescription=NULL, createOrUpdate=T, forceVersion=T, isRestricted=F)

 

 

synStore(entity, activity=NULL, createOrUpdate=T, forceVersion=T, isRestricted=F)

synapse.store(entity, used, executed, activityName=None, activityDescription=None, createOrUpdate=True, forceVersion=True, isRestricted=False)

 

synapse.store(entity, activity, createOrUpdate=True, forceVersion=True, isRestricted=False)

synapse create --name NAME --parentid PARENTID --description DESCRIPTION

--type TYPE

--file PATH

--update=T/F

--forceVersion=T/F

 

--annotations={foo=bar, bar=foo}

Delete an object from Synapse.  In the case of entities, move to the trash can.

 

synDelete(id)

synDelete(object)

synapse.delete(synid), synapse.delete(entity), synapse.delete(wiki), synapse.delete(evaluation), synapse.delete(activity) ##TODO, synapse.delete(submission)

 

Get an entity (file, folder, etc.) from the Synapse server, with its attributes (properties, annotations) and, optionally, with its associated file(s).  ifcollision is one of "keep.both", "keep.local", or "overwrite.local", telling the system what to do if a different file is found at the given local file location.

'download' and 'load' are ignored for objects other than Files.  If a downloadLocation is not provided a default location is used.  Collisions with existing files are handled according to the 'ifcollision' parameter.  Note, 'downloadLocation' must be a directory.

synGet(id, version, downloadFile=T, downloadLocation=NULL, ifcollision="keep.both", load=F)

synapse.get(id, version, downloadFile=True, downloadLocation=None, ifcollision="keep.both")

synapse get ID -v NUMBER
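
One possible reading of the three 'ifcollision' values is sketched below. Two simplifications are assumptions of this sketch: any existing file at the target path is treated as a collision (no content comparison is made), and "keep.both" is implemented by adding a numeric suffix, which the description above does not specify. The function name `resolve_download_path` is hypothetical.

```python
import os


def resolve_download_path(download_location, file_name, ifcollision='keep.both'):
    """Return (path, should_write) for a download into download_location."""
    path = os.path.join(download_location, file_name)
    if not os.path.exists(path):
        return path, True
    if ifcollision == 'overwrite.local':
        return path, True
    if ifcollision == 'keep.local':
        return path, False
    if ifcollision == 'keep.both':
        # Assumption: disambiguate with "(1)", "(2)", ... before the extension.
        base, ext = os.path.splitext(file_name)
        n = 1
        while True:
            candidate = os.path.join(
                download_location, '%s(%d)%s' % (base, n, ext))
            if not os.path.exists(candidate):
                return candidate, True
            n += 1
    raise ValueError('unknown ifcollision value: %r' % ifcollision)
```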

Get the downloaded location of the file associated with a File object.

If synGet was called with downloadFile=FALSE, getFilePath() returns NULL.

getFilePath(file)

getFileURL(file)

file.path

file.url

TODO

Open the web browser to the page for this entity.

 

onWeb(entityId) / onWeb(entity)

synapse.onweb(entityId) / synapse.onweb(entity)

synapse onweb id

log-in

If fields are omitted, then values are retrieved from the configuration file or from cached API keys.

synapseLogin(username = "", password = "", sessionToken = "", apiKey = "", rememberMe = FALSE)

synapseLogin()

synapse.login(email=None, password=None, sessionToken=None, apiKey=None, rememberMe=False, silent=False)

synapse.login()

synapse login -u USER -p PASSWORD

log-out

localOnly=T: delete any local copies of sessionToken or apiKey

localOnly=F: (1) if the client has a sessionToken, call "DELETE /session"; (2) do the localOnly part

synapseLogout(localOnly=F)

synapse.logout(local=False, clearCache=False)

synapse logout

invalidate API key

invalidate API key

invalidateAPIKey()

invalidateAPIKey()

 

2 – Advanced functions

    

Execute query

TODO:  pagination, e.g. the function returns an iterator. Look at current implementation in R client.

synQuery(queryString)

synapse.query(queryString)

synapse query
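
The pagination TODO for queries could be addressed with a generator along the lines of the sketch below. `fetch_page` is a hypothetical caller-supplied function that returns one page of results for a given limit and offset; the zero-based offset is an assumption of this sketch.

```python
def paged_results(fetch_page, limit=100):
    """Yield results one at a time, fetching a new page only when needed.

    fetch_page is a caller-supplied function (hypothetical) taking
    (limit, offset) and returning a list of at most `limit` results.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        for result in page:
            yield result
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < limit:
            return
        offset += len(page)


# Stand-in for the server, used only to exercise the generator.
_rows = list(range(7))


def fake_fetch(limit, offset):
    return _rows[offset:offset + limit]


results = list(paged_results(fake_fetch, limit=3))
```

Because it is a generator, a caller that stops iterating early never triggers the remaining page requests.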

Find the Entities having attached file(s) which have the given md5.

Returns an EntityHeader list.

synMD5Query(md5)

synapse.md5Query(md5)

NA

we talked about this, but is it needed?

 

synGetEntity()

  

we talked about this, but is it needed?

 

synStoreEntity()

  

Retrieve the wiki for an object (Entity or Evaluation)

 

synGetWiki(owner)

synGetWiki(owner, id)

synapse.getWiki(owner, subpageId)

Examples:

synapse.getWiki(entity)

synapse.getWiki(evaluation, 2342)

 

Retrieve wiki headers of evaluation or entity

 

synGetWikiHeaders(owner)

synapse.getWikiHeaders(owner)

where owner is an evaluation or entity

 

Wiki construction

 

WikiPage(owner, title, markdown, attachments)

WikiPage(owner, title, markdown, attachments, parentWikiId)

'attachments' is a list of local file paths

Wiki(owner, title, markdown, attachmentFileHandleIds, parentWikiId=None)

 
  

synStore(wiki)

synapse.store(Wiki)

 
  

synGetAnnotations()

synapse.getAnnotations(entity/entityId)

 
  

synSetAnnotations()

synapse.setAnnotations(entity/entityId, annotations)

 
  

synGetProperties()

NA

NA

Access properties, throwing exception if property is not defined.

 

synSetProperties()

NA

NA

  

synGetAnnotation()

  
  

synSetAnnotation()

  

Access property, throwing exception if property is not defined.

 

synGetProperty()

NA

NA

Access property, throwing exception if property is not defined. Setting to NULL deletes.

 

synSetProperty()

NA

NA

Create an Activity (provenance object) in memory.

 

Activity(name, description, used, executed)

Activity(name, description, used, executed)

NA

Set the list of entities/urls 'used' (not 'executed') by an Activity.

 

used(activity)<-referenceList

activity$used<-referenceList

  

Set the list of entities/urls 'executed' (not 'used') by an Activity.

 

executed(activity)<-referenceList

activity$executed<-referenceList

  

Get the list of entities/urls 'used' (not 'executed') by an Activity.

 

used(activity)

activity$used

  

Get the list of entities/urls 'executed' (not 'used') by an Activity.

 

executed(activity)

activity$executed

  

Create or update the Activity in Synapse

 

synStore(activity)

synapse.store(Activity)

NA
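
The in-memory Activity with separate 'used' and 'executed' lists might be sketched as below. The class is illustrative only: it assumes references are given as strings (Synapse ids or URLs), and the shape of the dict produced by the hypothetical `to_body` method, including the `wasExecuted`, `targetId` and `url` keys, is an assumption rather than the documented request body.

```python
class Activity(object):
    """Toy provenance object holding separate 'used' and 'executed' lists."""

    def __init__(self, name=None, description=None, used=None, executed=None):
        self.name = name
        self.description = description
        self.used = list(used or [])
        self.executed = list(executed or [])

    def to_body(self):
        """Flatten both lists into one list of flagged references.

        The key names below are assumptions made for this sketch.
        """
        def reference(target, was_executed):
            key = 'url' if target.startswith('http') else 'targetId'
            return {key: target, 'wasExecuted': was_executed}

        references = [reference(t, False) for t in self.used]
        references += [reference(t, True) for t in self.executed]
        return {'name': self.name,
                'description': self.description,
                'used': references}


activity = Activity(name='Updated dbgap ids',
                    used=['syn445865', 'http://www.google.com'],
                    executed=['syn999'])  # 'syn999' is a made-up id
body = activity.to_body()
```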

Get the Activity which generated the given entity.

 

synGetActivity(entity) / synGetActivity(entityId)

synapse.getActivity(entity/entityId)

NA

Empty trash can

    

Restore from trash can

    

Run code, capturing output, code and provenance relationship.

 

synapseExecute(executable, args, resultParentId, codeParentId, resultEntityProperties = NULL,  resultEntityName=NULL, replChar=".")

synapse.execute(executable, args, resultParentId, codeParentId, resultEntityProperties = None,  resultEntityName=None, replChar=".")

NA

Create evaluation object

 

Evaluation(name, description, status)

Evaluation(name, description, status, contentSource)

NA

Retrieve an Evaluation.

 

synGetEvaluation(evaluationId)

  

Submit for evaluation

 

submit(evaluation, entity, submissionName, teamName)

synapse.submit(evaluation, entity, name=None, teamName=None)

synapse submitEvaluation

Adds participant to evaluation, userId is optional

  

synapse.addEvaluationParticipant(evaluation, userId=None)

 

Get the participants in an evaluation

 

synGetParticipants(evaluationId,limit,offset)

  

Returns an iterator of submissions

 

synGetSubmissions(evaluationId, myown=F, status, limit, offset)

synapse.getSubmissions(evaluation, status=None)

 

Get specific submission

 

synGetSubmission(id, downloadFile=T, downloadLocation=NULL, ifcollision="keep.both", load=F)

synapse.getSubmission(id, downloadFile=True, downloadLocation=None, ifcollision="keep.both")

 

Get status of submission

 

synGetSubmissionStatus(id)

synGetSubmissionStatus(submission)

synapse.getSubmissionStatus(submission)

 

Get a user profile (own or other's)

When retrieving own profile, all fields are returned.  When retrieving other's profile, only public fields are returned.

synGetUserProfile()

synGetUserProfile(principalId)

  

3 – Web API Level functions

    

Execute GET request

See details below.

synRestGET(uri, endpoint)

synapse.restGET(uri, endpoint=None)*

 

Execute POST request

See details below.

synRestPOST(uri, body, endpoint)

synapse.restPOST(uri, body, endpoint=None)*

 

Execute PUT request

See details below.

synRestPUT(uri, body, endpoint)

synapse.restPUT(uri, body, endpoint=None)*

 

Execute DELETE request

See details below.

synRestDELETE(uri, endpoint)

synapse.restDELETE(uri, endpoint=None)*

 

Get the current set of web service endpoints.

 

synGetEndpoints()

  

Set the web service endpoints.

If no arguments are passed, then reset to the default endpoints.

synSetEndpoints(repo, auth, file, portal)

synSetEndpoints()

  
     

*The endpoint defaults to repoEndpoint. It would be useful to be able to pass arbitrary named arguments that are simply passed on to the underlying HTTP library.  For Python, for example, the stream and file parameters could be useful to pass along to the file handle requests for GET and PUT. 

Endpoints

At the time of this writing, there are three endpoints for web service calls in our production system:

...

These are used to call the web APIs linked below.

Web APIs

The URIs, request bodies and request methods are defined by the Synapse Web APIs.  The URIs omit the endpoints given above, e.g. to retrieve entity metadata the endpoint would be "https://repo-prod.prod.sagebase.org/repo/v1" while the URI might be "/entity/syn123456".  The web APIs define request and response bodies in terms of JSON objects.  In the analytic clients these are expressed as named lists or nested named lists, e.g. in R the JSON object {"foo":"bar", "bas":"bah"} is passed in as list(foo="bar", bas="bah").
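
In Python, the same conventions (endpoint plus URI, a dict standing in for the JSON object, and extra named arguments handed through untouched) could be sketched as follows. `build_request` is a hypothetical helper that only assembles the pieces of a call without sending anything; the keys of the returned dict are assumptions of this sketch.

```python
import json

# Default endpoint, taken from the example above.
REPO_ENDPOINT = 'https://repo-prod.prod.sagebase.org/repo/v1'


def build_request(method, uri, body=None, endpoint=None, **kwargs):
    """Assemble the pieces of a web service call without sending it."""
    endpoint = endpoint or REPO_ENDPOINT
    request = {'method': method, 'url': endpoint.rstrip('/') + uri}
    if body is not None:
        # A dict plays the role of the R named list and is sent as JSON.
        request['data'] = json.dumps(body)
    # Any extra named arguments are handed on unchanged.
    request.update(kwargs)
    return request


request = build_request('PUT', '/entity/syn123456',
                        body={'foo': 'bar', 'bas': 'bah'}, stream=True)
```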

The Web APIs are defined here:

Synapse REST APIs

 

Common Configuration File

Upon client initialization, the client searches for a configuration file in a standard place.  Specifically, it looks for an INI-formatted '~/.synapseConfig' file.  Parsers for this format are available for both R and Python.  

The following can be specified in the configuration file:

  • Username, password, session token, or API key

  • File cache location (should be private to the user)

  • Endpoints for each of the Synapse services

 

Code Block
languagebash
titleExample
firstline1
[authentication]
username = example@user.com
password = samplePassword
sessionToken = 1234567890asdfghjkl
apikey = Some+API+key+retrieved+from+either+the+web+portal+or+via+a+REST+GET+call+to+/secretKey==

[cache]
location = ~/.synapseCache

[endpoints]
repoEndpoint = https://repo-prod.prod.sagebase.org/repo/v1
authEndpoint = https://auth-prod.prod.sagebase.org/auth/v1
fileHandleEndpoint = https://file-prod.prod.sagebase.org/file/v1
portalEndpoint = https://synapse.org/
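
For Python, a file in this format can be read with the standard library's configparser module. The sketch below parses an abbreviated copy of the example from a string so that it is self-contained; the helper name `read_config` is hypothetical, and a real client would presumably read the '~/.synapseConfig' file instead.

```python
import configparser

EXAMPLE = """
[authentication]
username = example@user.com
password = samplePassword

[cache]
location = ~/.synapseCache

[endpoints]
repoEndpoint = https://repo-prod.prod.sagebase.org/repo/v1
"""


def read_config(text):
    """Parse INI-formatted configuration text."""
    # Interpolation is disabled so that '%' in keys or passwords is kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


config = read_config(EXAMPLE)
username = config.get('authentication', 'username')
repo_endpoint = config.get('endpoints', 'repoEndpoint')
# Options that were omitted fall back to a default instead of raising.
api_key = config.get('authentication', 'apikey', fallback=None)
```

Option names are matched case-insensitively by configparser, so 'repoEndpoint' and 'repoendpoint' refer to the same setting.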

 

Appendix:  Current implementation of the file cache in the R Client:

  • files are cached (metadata used to be cached in entity.json)

  • cache is mix of read/write

  • each entity version has a location within the cache that is based on its URI (e.g. .synapseCache/proddata.sagebase.org/<entityId>/<locationId>/version/<version>)

    • files.json specifies what resides within the archive
    • <fileName> file which R Client currently assumes to be a zip (this is immutable by convention until storeEntity is called)  (TODO:  What happens when it is not a zip archive?)
    • <fileName>_unpacked directory within which all unzipped content lives
      • this subdirectory is writable (by convention)
      • re-stores file if not an archive (both as <fileName> and <fileName>_unpacked/<fileName>)
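
The cache layout listed above could be expressed as a small path helper, sketched here in Python purely to make the directory structure concrete. The function name `cache_paths` and its parameters are hypothetical; the R client's actual code is not shown in this document.

```python
import os


def cache_paths(cache_root, host, entity_id, location_id, version, file_name):
    """Return the cache paths for one entity version, following the layout above."""
    base = os.path.join(cache_root, host, str(entity_id), str(location_id),
                        'version', str(version))
    return {
        'base': base,
        # files.json records what resides within the archive.
        'files_json': os.path.join(base, 'files.json'),
        # The downloaded file itself (immutable by convention).
        'file': os.path.join(base, file_name),
        # Writable directory holding the unzipped content.
        'unpacked': os.path.join(base, file_name + '_unpacked'),
    }


paths = cache_paths('.synapseCache', 'proddata.sagebase.org',
                    'syn123', 456, 1, 'data.zip')
```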

...