...

An annotation system is a form of structured data classification. With annotations, structure is imposed by first defining the data categories of interest. Each file is then assigned a value for each category. Annotations work best for data discovery when the categories are well defined and understood by both data providers and data consumers. With such a system, data consumers can formulate queries over the data categories to find data of interest.
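As a rough illustration of querying by category, the sketch below filters a small in-memory list of files by key/value pairs. It is not the Synapse API: the file names, the `assay` and `species` categories, and the `find_by_annotations` helper are all hypothetical and exist only for this example.

```python
# Hypothetical in-memory data; the categories (assay, species) are assumptions.
files = [
    {"name": "sample_a.csv", "annotations": {"assay": "rnaSeq", "species": "human"}},
    {"name": "sample_b.csv", "annotations": {"assay": "rnaSeq", "species": "mouse"}},
    {"name": "sample_c.csv", "annotations": {"assay": "wgs", "species": "human"}},
]


def find_by_annotations(files, **criteria):
    """Return the names of files whose annotations match every key/value pair."""
    return [
        f["name"]
        for f in files
        if all(f["annotations"].get(key) == value for key, value in criteria.items())
    ]


# A consumer who understands the categories can combine them in one query.
matches = find_by_annotations(files, assay="rnaSeq", species="human")
```

The query only works because providers and consumers agree on the category names and their allowed values, which is the point made above.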

Tagging

Tagging involves adding one or more short descriptive strings (tags) to a data file. Unlike annotations, tags are a form of unstructured classification since no categories are defined. Instead, values are added to each file without reference to predefined categories. The lack of structure is both a strength and a weakness for data providers and consumers. Since there is no structure, data providers can add tags at their own discretion. However, there are also no guidelines to help providers add useful tags, so tag quality can be inconsistent across data providers and over time. To discover data of interest, data consumers only need to provide one or more tag values, which is simpler than building a filter with key/value pairs. However, it is not possible to find data by category since no categories were defined. Because it is difficult to add valuable tags consistently, it can also be difficult to find data of interest.
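The trade-off can be sketched with a small in-memory example. This is not a Synapse service: the file names, the tag values, and the `find_by_tags` helper are hypothetical. Searching needs only tag values, but a file tagged with a different spelling is silently missed.

```python
# Hypothetical tagged files; two providers spelled the same concept differently.
tagged_files = {
    "sample_a.csv": {"rna-seq", "human", "batch1"},
    "sample_b.csv": {"RNAseq", "mouse"},
    "sample_c.csv": {"human", "genome"},
}


def find_by_tags(tagged_files, *tags):
    """Return the sorted names of files that carry every requested tag."""
    wanted = set(tags)
    return sorted(
        name for name, file_tags in tagged_files.items() if wanted <= file_tags
    )


# Simple to ask: just supply tag values, no categories needed.
human_files = find_by_tags(tagged_files, "human")

# Inconsistent tagging hides data: sample_b.csv used "RNAseq" and is not found.
rna_files = find_by_tags(tagged_files, "rna-seq")
```

Here the search for `rna-seq` returns only `sample_a.csv`, even though `sample_b.csv` describes the same kind of data under a different tag.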

 

Currently, Synapse does not have services for project organizers to formally define a project's organization. For example, there are no services to define a project's annotation “schema” or file hierarchy. Instead, project organizers maintain a project's organization by first attempting to communicate with data providers (through wikis, emails, etc.). When this does not work, organizers rely on tools/features to help find and correct erroneous data. The current set of tools/features is lacking:

...