...

Case

Action

synStore is called to upload a new file to Synapse.

A new entry is made in the Cache Map.

synStore is called to upload a File object which has already been uploaded to Synapse (i.e. same name, Synapse ID, file path, parent, etc.).

  1. The 'last modified' time recorded in the File Cache is compared to the current 'last modified' time of the file.
  2. If the timestamps are the same, no upload occurs.
  3. Otherwise, the file is uploaded (generating a new FileHandle ID).
  4. A new Cache Map entry is created with the FileHandle ID and timestamp.

The old entry is left in place, since some other in-memory File object may reference the same local file.
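
The timestamp comparison above can be sketched as follows. This is a minimal Python illustration, not the client's actual implementation: the function name `needs_upload` and the representation of the cached time as epoch seconds are assumptions made for the example.

```python
import os


def needs_upload(path, cached_mtime):
    """Return True if the local file must be (re-)uploaded.

    cached_mtime is the 'last modified' time recorded for this path
    (epoch seconds, hypothetical representation), or None when the
    file has no cache entry yet.
    """
    if cached_mtime is None:
        # New file: nothing recorded, so it has to be uploaded.
        return True
    # Unchanged timestamp means the uploaded copy is still current.
    return os.path.getmtime(path) != cached_mtime
```

When this returns True, the caller would upload the file and add a new entry with the new FileHandle ID and timestamp, leaving the old entry untouched.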

synGet is called for a File object which has not been downloaded locally.
  1. The File metadata are retrieved, including the FileHandle ID.
  2. Since there is no entry in the Cache Map, the file is downloaded and an entry is made in the Cache Map.
synGet is called for a File object which has been downloaded locally with a different target location.
  1. The File metadata are retrieved, including the FileHandle ID.  
  2. An entry is found in the Cache Map for the given FileHandle ID, but not for the given location. 
  3. If any currently downloaded file in the Cache Map for the FileHandle ID has an unchanged 'last modified' timestamp, it is copied to the new location.
  4. Otherwise, the file is downloaded from Synapse to the new location.  
  5. A new Cache Map entry is created for the new file.  

We do NOT make the new File object point to the cached file, since unexpected behavior would result when multiple File objects modify the same on-disk file.
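
A rough Python sketch of the copy-or-download steps above, under stated assumptions: `cache_entries` is a hypothetical in-memory dict mapping local paths to recorded modification times (epoch seconds) for one FileHandle ID, and `download` is a hypothetical callable standing in for the actual transfer from Synapse.

```python
import os
import shutil


def fetch_to_location(cache_entries, target_path, download):
    """Place the file at target_path, preferring an unmodified cached copy.

    Returns "copied" or "downloaded" and records the new file in
    cache_entries.
    """
    for path, recorded in cache_entries.items():
        # A cached copy is usable only if it still exists and its
        # 'last modified' time matches what was recorded.
        if os.path.exists(path) and os.path.getmtime(path) == recorded:
            shutil.copyfile(path, target_path)
            action = "copied"
            break
    else:
        # No unmodified copy found: fall back to downloading.
        download(target_path)
        action = "downloaded"
    # The new file gets its own entry; existing entries are untouched.
    cache_entries[target_path] = os.path.getmtime(target_path)
    return action
```

Copying rather than sharing the cached path keeps each File object bound to its own on-disk file, which matches the note above.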

synGet is called for a File object which has been downloaded locally with the same target location.
  1. The File metadata are retrieved, including the FileHandle ID.   
  2. An entry is found in the Cache Map for the given FileHandle ID and location.  
  3. If the cached timestamp matches the 'last modified' timestamp for the file, no download occurs. 
  4. If the local file is *missing*, then the file is downloaded.  
  5. Otherwise, the action depends on the "ifcollision" mode specified for synGet:  
    1. "overwriteLocal":  The file is downloaded to the target location and the Cache Map entry is updated with a new timestamp.
    2. "keepLocal":  No download occurs.  The File references the locally modified file at the given location.
    3. "keepBoth":  The file is downloaded to the target location, but given a modified local file name.  A second entry for the FileHandle ID is made in the Cache Map.
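
The same-location case can be sketched in Python as below. This is an illustration only: `resolve_same_location` and `unique_name` are hypothetical helper names, the cached time is assumed to be epoch seconds, and the sketch checks for a missing file before comparing timestamps so that it never reads the timestamp of a file that does not exist.

```python
import os


def unique_name(path):
    """Return a non-colliding variant, e.g. file.txt -> file(1).txt."""
    root, ext = os.path.splitext(path)
    n = 1
    while os.path.exists(f"{root}({n}){ext}"):
        n += 1
    return f"{root}({n}){ext}"


def resolve_same_location(path, cached_mtime, ifcollision):
    """Return (action, download_path); action is "download" or "none"."""
    if not os.path.exists(path):
        return ("download", path)          # local file is missing
    if os.path.getmtime(path) == cached_mtime:
        return ("none", path)              # local copy is unmodified
    # The local file was modified since it was cached: apply the mode.
    if ifcollision == "overwriteLocal":
        return ("download", path)
    if ifcollision == "keepLocal":
        return ("none", path)
    if ifcollision == "keepBoth":
        return ("download", unique_name(path))
    raise ValueError(f"unknown ifcollision mode: {ifcollision!r}")
```

After a download, the caller would update the Cache Map: a refreshed timestamp for "overwriteLocal", or a second entry under the same FileHandle ID for "keepBoth".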

...

Cache Location

When a file is downloaded, specifying the file location is optional.  By default, the file is placed in a 'cache folder', along with a Cache Map file.

The organization of the file cache is:

CACHE_ROOT/[Intermediate Folder]/[File Handle ID]/[File Name]

where:

  • CACHE_ROOT is user configurable and defaults to ~/.synapseCache
  • [Intermediate Folder] is the [File Handle ID] mod 1000.  This extra level is to reduce fan-out when the number of downloaded files is much greater than 1000.
  • [File Handle ID] is the S3 file ID used to upload/download the file
  • [File Name] is the file name given by the file handle.  If there is a collision and "ifcollision" is "keepBoth", then the name is modified by appending a number (e.g. file.txt may become file(1).txt)
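
As a worked illustration of the layout above, the following Python sketch builds the cache path for a file. The function name `cache_path` and its parameters are hypothetical; only the default root and the mod-1000 rule come from the description above.

```python
import os


def cache_path(file_handle_id, file_name, cache_root="~/.synapseCache"):
    """Build CACHE_ROOT/[Intermediate Folder]/[File Handle ID]/[File Name]."""
    # The intermediate folder is the file handle ID modulo 1000.
    intermediate = str(int(file_handle_id) % 1000)
    return os.path.join(
        os.path.expanduser(cache_root),  # resolve a leading "~"
        intermediate,
        str(file_handle_id),
        file_name,
    )
```

For example, on a POSIX system `cache_path(1234567, "file.txt", "/tmp/cache")` gives `/tmp/cache/567/1234567/file.txt`.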

Cache Map Design

Cache Entry

There is a Cache Map file for each Synapse FileHandle ID that has been downloaded or uploaded.  The file has the path:

CACHE_ROOT/[Intermediate Folder]/[File Handle ID]/.cacheMap

That is, it is located in the cache folder (above) at the same level as [File Name].

The file contains the location and last-modified timestamp of each downloaded or uploaded file.  The data is stored in a JSON map whose keys are file paths and whose values are timestamps, e.g.

Code Block
{
  "/path/to/file.txt": "2013-03-14T15:09:26.000Z",
  "/alt/folder/file.txt": "2013-04-06T15:36:41.000Z"
}

 

Note: the timestamp is in ISO 8601 format, in the UTC (aka "Zulu") time zone.
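
A minimal Python sketch of writing such an entry, assuming the timestamp is the file's modification time rendered with millisecond precision as in the example above. The helper names `iso_mtime` and `record_entry` are hypothetical and not part of any Synapse client.

```python
import json
import os
from datetime import datetime, timezone


def iso_mtime(path):
    """Return the file's mtime as an ISO 8601 UTC string, e.g. ...26.000Z."""
    dt = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    millis = dt.microsecond // 1000
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def record_entry(cache_map_file, path):
    """Add or update the entry for path in the given cache map file."""
    entries = {}
    if os.path.exists(cache_map_file):
        with open(cache_map_file) as f:
            entries = json.load(f)
    entries[path] = iso_mtime(path)
    with open(cache_map_file, "w") as f:
        json.dump(entries, f, indent=2)
    return entries
```

Existing entries for other paths are preserved, so several local copies of the same FileHandle ID can be tracked side by side.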

...