The analytical clients provide client-side caching of Synapse files, to avoid unnecessarily retrieving large or numerous files that have already been downloaded. By default, downloaded files are put in the system default download folder (e.g. ~/Downloads/ on a Mac). Alternatively, a file may be downloaded to a folder specified by the user. When synStore() is called to create an entity whose file is in an external cache location, the local copy of the file is not moved. It serves as a "cached copy" in its current location, as described below.

When synGet() is called, the client first retrieves the File metadata, including the MD5 hash. The client forms the download location from the cache location (default or specified by the user) and the file name. If a local copy of the file does not exist, the file is downloaded. If the file is present, the client computes the MD5 hash of the local copy to determine whether it differs from the version in Synapse. If it does, the client must, based on user choice, either (1) create a new, unique file name, (2) confirm the overwrite, or (3) keep the original file. (In the last case, a subsequent synStore() would overwrite the Synapse version with the local copy.)

When synStore() is called to update an entity, the MD5 hash of the file is recomputed. The file is uploaded if and only if the newly computed MD5 hash differs from that of the previously retrieved entity.
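The MD5 comparison described above can be sketched in Python as follows. This is only an illustration, not the clients' actual code: the function names `file_md5` and `needs_upload` are hypothetical, and the stored hash is assumed to be available as a hexadecimal string taken from the previously retrieved entity.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path, stored_md5):
    """True if and only if the local file's MD5 differs from the stored hash.

    `stored_md5` is assumed to be the hex digest recorded for the previously
    retrieved entity; the comparison ignores letter case.
    """
    return file_md5(path) != stored_md5.lower()
```

Reading in chunks keeps memory use flat for large files, which is the situation the cache is meant to help with.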

The effect of this architecture is that repeated uploads or downloads of an entity's file are avoided when the local copy is not modified. This strategy does not avoid repeated downloads of an entity's file to a variety of local folders. We feel that the potential efficiency gains of avoiding such downloads are outweighed by the complexity of tracking multiple, mutable copies of a file.

When a file is uploaded or downloaded, the client keeps track of its location along with the information needed to determine whether it is later changed. Specifically, the client maintains a "Cache Map" whose keys are Synapse FileHandle IDs and whose values are lists of local file locations. Each file location has (1) a path on the local file system and (2) a 'last modified' time. The map is used as follows:

 

Case: synStore (defined below) is called to upload a new file to Synapse.

Action: An entry is made in the Cache Map.
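As a rough illustration of what such an entry might look like, the sketch below models the Cache Map as a Python dict keyed by FileHandle ID. For brevity, each value is a dict from local path to 'last modified' time, standing in for the list of file locations. The function names and the FileHandle ID strings used with them are hypothetical, and the file system's modification time (os.path.getmtime) is assumed to serve as the 'last modified' time.

```python
import os


def record_location(cache_map, file_handle_id, path):
    """Add (or refresh) the Cache Map entry for `path` under `file_handle_id`.

    cache_map has the shape {file_handle_id: {absolute_path: last_modified}}.
    """
    path = os.path.abspath(path)
    cache_map.setdefault(file_handle_id, {})[path] = os.path.getmtime(path)


def is_unmodified(cache_map, file_handle_id, path):
    """True if `path` is recorded for this FileHandle ID, still exists,
    and its 'last modified' time equals the recorded one."""
    path = os.path.abspath(path)
    recorded = cache_map.get(file_handle_id, {}).get(path)
    if recorded is None or not os.path.exists(path):
        return False
    return os.path.getmtime(path) == recorded
```

After an upload, a call such as `record_location(cache_map, "1234", "/path/to/file")` would create the entry, where "1234" is a placeholder for the FileHandle ID returned by Synapse.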

 

Case: synStore is called for a File object whose file has already been uploaded to Synapse.

Action: The File object contains the path to the local copy of the file and the FileHandle ID. The 'last modified' time recorded in the Cache Map is compared to the file's current 'last modified' time. If the timestamps are the same, no upload occurs. Otherwise, the file is uploaded (generating a new FileHandle ID) and the Cache Map entry is updated with the new FileHandle ID and timestamp.
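The upload decision for this case might look like the following sketch. It assumes the Cache Map is held as a dict of the form {FileHandle ID: {path: last modified}}, and that `upload` is a caller-supplied function (hypothetical, not part of any client API) which sends the file to Synapse and returns the new FileHandle ID.

```python
import os


def store_file(cache_map, file_handle_id, path, upload):
    """Return the FileHandle ID the File should carry after a store.

    cache_map: {file_handle_id: {absolute_path: last_modified}}
    upload:    hypothetical callable(path) -> new FileHandle ID
    """
    path = os.path.abspath(path)
    current = os.path.getmtime(path)
    recorded = cache_map.get(file_handle_id, {}).get(path)
    if recorded == current:
        return file_handle_id  # timestamps match: no upload

    new_id = upload(path)
    # Drop the stale entry, then record the path under the new FileHandle ID.
    old = cache_map.get(file_handle_id)
    if old is not None:
        old.pop(path, None)
        if not old:
            del cache_map[file_handle_id]
    cache_map.setdefault(new_id, {})[path] = current
    return new_id
```

Because a file with no Cache Map entry never matches, the same function also covers the first upload of a new file.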

 

Case: synGet is called for a File object which has not been downloaded.

Action: The File metadata are retrieved, including the FileHandle ID. Since there is no entry in the Cache Map, the file is downloaded and an entry is made in the Cache Map.

 

Case: synGet is called for a File object which has been downloaded previously with a different target location.

Action: The File metadata are retrieved, including the FileHandle ID. An entry is found in the Cache Map for the given FileHandle ID, but not for the given location. If any currently downloaded file in the Cache Map for the FileHandle ID has an unchanged 'last modified' timestamp, it is copied to the new location; otherwise, the file is downloaded from Synapse to the new location. Either way, a new Cache Map entry is created for the new file.
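A minimal sketch of this copy-or-download step is shown below. It again assumes a Cache Map of the form {FileHandle ID: {path: last modified}}; `download` is a hypothetical caller-supplied function that fetches the file from Synapse to the given path, and the returned strings are only there to make the chosen branch visible.

```python
import os
import shutil


def fetch_to_new_location(cache_map, file_handle_id, target_path, download):
    """Place the file at `target_path`, copying an unmodified cached copy
    when one exists and downloading otherwise. Returns "copied" or "downloaded".

    download: hypothetical callable(file_handle_id, target_path)
    """
    target_path = os.path.abspath(target_path)
    locations = cache_map.setdefault(file_handle_id, {})

    # Look for any recorded copy whose 'last modified' time is unchanged.
    source = None
    for path, recorded in locations.items():
        if os.path.exists(path) and os.path.getmtime(path) == recorded:
            source = path
            break

    if source is not None:
        shutil.copy2(source, target_path)
        action = "copied"
    else:
        download(file_handle_id, target_path)
        action = "downloaded"

    # Either way, the new location gets its own Cache Map entry.
    locations[target_path] = os.path.getmtime(target_path)
    return action
```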

 

Case: synGet is called for a File object which has been downloaded previously with the same target location.

Action: The File metadata are retrieved, including the FileHandle ID. An entry is found in the Cache Map for the given FileHandle ID and location, and its 'last modified' timestamp is retrieved. If this timestamp matches the current 'last modified' timestamp of the file, no download occurs. If the local file is missing, the file is downloaded. Otherwise, the action depends on the "ifcollision" mode specified for synGet:

(1) ifcollision=overwrite.local:  The file is downloaded to the target location and the Cache Map entry is updated with a new timestamp;

(2) ifcollision=keep.local:  No download occurs.  The File references the locally modified file at the given location;

(3) ifcollision=keep.both:  The file is downloaded to the target location, but given a modified local file name.  A second entry for the FileHandle ID is made in the Cache Map.
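The three modes can be sketched as a single decision function. This is an illustrative sketch under several assumptions: the Cache Map has the form {FileHandle ID: {path: last modified}}, `download` is a hypothetical caller-supplied function, and the "(1)", "(2)", ... suffix used for keep.both is just one possible naming scheme, since the text above does not specify how the modified file name is formed.

```python
import os


def unique_path(path):
    """Return a sibling path such as 'name(1).ext' that does not exist yet
    (assumed naming scheme, for illustration only)."""
    root, ext = os.path.splitext(path)
    n = 1
    while os.path.exists(f"{root}({n}){ext}"):
        n += 1
    return f"{root}({n}){ext}"


def resolve_get(cache_map, file_handle_id, target_path, download, ifcollision):
    """Return the local path the File should reference after a get.

    download: hypothetical callable(file_handle_id, path)
    """
    target_path = os.path.abspath(target_path)
    locations = cache_map.setdefault(file_handle_id, {})

    def fetch(path):
        download(file_handle_id, path)
        locations[path] = os.path.getmtime(path)  # record the new timestamp
        return path

    if not os.path.exists(target_path):
        return fetch(target_path)  # local file is missing
    if locations.get(target_path) == os.path.getmtime(target_path):
        return target_path  # unchanged: no download
    if ifcollision == "overwrite.local":
        return fetch(target_path)
    if ifcollision == "keep.local":
        return target_path  # reference the locally modified file
    if ifcollision == "keep.both":
        return fetch(unique_path(target_path))  # second Cache Map entry
    raise ValueError(f"unknown ifcollision mode: {ifcollision!r}")
```

The missing-file check comes first here only so that the timestamp lookup never runs on a path that does not exist; the outcomes are the same as in the description above.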

 

File Usage Examples

In each example, we have a project in which the File will reside:

...