Last updated on 2023-09-15

This page describes the workflow required to build, edit, and update the data model for MODEL-AD.

Summary

Data modeling at Sage requires two in-house tools: Schematic and the Data Curator App (DCA).

Schematic

Summary

SCHEMATIC is an acronym for Schema Engine for Manifest Ingress and Curation. The Python-based tool is a schema-based metadata ingress ecosystem intended to streamline biomedical dataset annotation, metadata validation, and submission to a data repository for various data contributors.

...

https://github.com/adknowledgeportal/data-models

Sage Data Models for Reference

Recommendations

  • Draw a diagram. A diagram is a useful reference when developing the model.

  • Start small with a basic skeleton and then build.

  • Use schematic in dev mode to convert the model to JSON-LD regularly and check for errors.
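Between conversions, a quick structural check on the model CSV can catch simple mistakes early. The following is a minimal sketch, not part of schematic itself. It assumes the model is kept as a CSV with `Attribute`, `DependsOn`, and `Parent` columns; the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def check_model_rows(rows):
    """Return a list of human-readable problems found in data model rows."""
    problems = []
    names = [row["Attribute"].strip() for row in rows]
    defined = set(names)

    # Report attribute names that are defined more than once.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate attribute: {name}")

    # Every comma-separated DependsOn entry should be a defined attribute.
    for row in rows:
        attr = row["Attribute"].strip()
        depends_on = (row.get("DependsOn") or "").split(",")
        for dep in filter(None, (d.strip() for d in depends_on)):
            if dep not in defined:
                problems.append(
                    f"{attr}: DependsOn references undefined attribute {dep}"
                )
    return problems


# Hypothetical model with two deliberate errors: a repeated attribute
# and a dependency ("tissue") that is never defined.
SAMPLE = """Attribute,Description,DependsOn,Parent
Biospecimen,Template for specimen metadata,"specimenID, tissue",DataType
specimenID,Unique specimen identifier,,DataProperty
specimenID,Repeated by mistake,,DataProperty
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
problems = check_model_rows(rows)
for problem in problems:
    print(problem)
```

A check like this complements the JSON-LD conversion rather than replacing it, since the conversion applies schematic's own rules.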

...

Data models are formatted in JSON-LD (JavaScript Object Notation for Linked Data). JSON-LD is supported by http://schema.org, which helps dataset discoverability in search engines such as Dataset Search.
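To make the format concrete, here is a simplified sketch of how one attribute might look as a JSON-LD node. The `bts` prefix and the set of keys are assumptions modeled on schematic-generated files; real output carries additional keys and may differ between versions. The attribute `specimenID` is used only for illustration.

```python
import json


def make_class_node(label, description, parent_label):
    """Build a simplified JSON-LD node for one data model attribute."""
    return {
        "@id": f"bts:{label}",
        "@type": "rdfs:Class",
        "rdfs:comment": description,
        "rdfs:label": label,
        # The parent class links this attribute into the model hierarchy.
        "rdfs:subClassOf": [{"@id": f"bts:{parent_label}"}],
    }


document = {
    # The context maps each prefix used in the graph to a full IRI.
    "@context": {
        "bts": "http://schema.biothings.io/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    },
    "@graph": [
        make_class_node(
            "specimenID", "Unique specimen identifier", "DataProperty"
        ),
    ],
}

text = json.dumps(document, indent=2)
print(text)
```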

Guide to Developing Data Models in JSON-LD

...

The vocabulary should be relevant to the type of data that you are modeling.

...

Ontology Resources

Metadata Dictionary

AD Knowledge Portal Metadata Dictionary

https://sagebio.shinyapps.io/amp-ad-metadata-dictionary/

Data Curator App

http://dca.app.sagebionetworks.org

https://dca-dev.app.sagebionetworks.org

https://github.com/adknowledgeportal/data_curator

https://github.com/adknowledgeportal/data-models

https://sagebionetworks.jira.com/wiki/spaces/SCHEM/pages/2458648589/Setting+up+a+DCC+Asset+Store#How-do-I-Structure-My-DCC-Synapse-Project-to-Work-with-the-Data-Curator-App%3F

Projects

Folder Structure

https://dca-docs.scrollhelp.site/DCA/Working-version/Project-Agnostic/organize-your-data-upload#OrganizeyourDataUpload-FlattenedDataLayoutExample

.
├── biospecimen_experiment_1
│   └── manifest1.csv
├── biospecimen_experiment_2
│   └── manifestA.csv
├── single_cell_RNAseq_batch_1
│   ├── manifestX.csv
│   ├── fileA.txt
│   ├── fileB.txt
│   ├── fileC.txt
│   └── fileD.txt
└── single_cell_RNAseq_batch_2
    ├── manifestY.csv
    └── file1.txt
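A local copy of the layout can be sanity-checked before upload. This is a minimal sketch under two assumptions drawn from the example tree rather than from any stated rule: each top-level folder holds exactly one `manifest*.csv`, and folders are not nested. The folder and file names created in the demo are taken from the example above.

```python
import tempfile
from pathlib import Path


def find_layout_problems(project_root):
    """Return a list of problems with a flattened dataset folder layout."""
    problems = []
    folders = sorted(p for p in Path(project_root).iterdir() if p.is_dir())
    for folder in folders:
        # Assumption: one manifest per dataset folder, named manifest*.csv.
        manifests = list(folder.glob("manifest*.csv"))
        if len(manifests) != 1:
            problems.append(
                f"{folder.name}: expected 1 manifest, found {len(manifests)}"
            )
        # Assumption: a flattened layout has no folders inside folders.
        nested = sorted(p.name for p in folder.iterdir() if p.is_dir())
        if nested:
            problems.append(f"{folder.name}: nested folders {nested}")
    return problems


# Demo on a throwaway directory: one well-formed folder and one
# folder that is missing its manifest.
with tempfile.TemporaryDirectory() as root:
    good = Path(root, "biospecimen_experiment_1")
    good.mkdir()
    (good / "manifest1.csv").write_text("Filename\n")

    bad = Path(root, "single_cell_RNAseq_batch_1")
    bad.mkdir()
    (bad / "fileA.txt").write_text("")

    layout_problems = find_layout_problems(root)

print(layout_problems)
```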

Study Content

/wiki/spaces/AKP/pages/1057882353

  • Study Description in wiki

  • Methods description in each data folder

/wiki/spaces/EPD1/pages/2900819969

AMP-AD

https://github.com/adknowledgeportal/test-data-model/blob/main/model-ad/model-ad.data.model.jsonld

https://github.com/adknowledgeportal/data_curator
https://github.com/adknowledgeportal/test-data-model
AD Portal DCA Test Project - Table
https://github.com/adknowledgeportal/data-models/blob/main/README.md#editing-data-models

AD data model → modular

repo:

branch: test-split-csvs

folders:

modules/

..biospecimen/

..mouse/
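With the model split into per-module CSVs, the modules need to be recombined into a single model CSV before conversion. The sketch below shows one way this could be done; it is not the script used in the repo. The per-module file names and the output name `model.csv` are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def merge_modules(modules_dir, output_csv):
    """Concatenate every CSV under modules_dir into one CSV; return row count."""
    rows, columns = [], []
    for path in sorted(Path(modules_dir).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Keep the union of all headers, in first-seen order.
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            rows.extend(reader)
    with Path(output_csv).open("w", newline="", encoding="utf-8") as handle:
        # restval fills columns that a given module does not define.
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Demo with two hypothetical module files mirroring the folders above.
with tempfile.TemporaryDirectory() as root:
    modules = Path(root, "modules")
    (modules / "biospecimen").mkdir(parents=True)
    (modules / "mouse").mkdir()
    (modules / "biospecimen" / "specimenID.csv").write_text(
        "Attribute,Parent\nspecimenID,DataProperty\n"
    )
    (modules / "mouse" / "genotype.csv").write_text(
        "Attribute,Description,Parent\ngenotype,Mouse genotype,DataProperty\n"
    )

    merged_path = Path(root, "model.csv")
    row_count = merge_modules(modules, merged_path)
    with merged_path.open(newline="", encoding="utf-8") as handle:
        merged_rows = list(csv.DictReader(handle))

print(row_count, [row["Attribute"] for row in merged_rows])
```

Writing the merged file outside `modules/` keeps it from being picked up as a module on the next run.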

Jira: ADM-836

Term = Attribute in the data model where Parent = DataProperty
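The rule above can be applied directly to a model CSV to list its terms. A minimal sketch, assuming `Attribute` and `Parent` column headers; the sample rows are hypothetical.

```python
import csv
import io


def list_terms(rows):
    """Return the attributes whose Parent is DataProperty."""
    return [
        row["Attribute"].strip()
        for row in rows
        if (row.get("Parent") or "").strip() == "DataProperty"
    ]


# Hypothetical rows: one template-level entry and two terms.
SAMPLE = """Attribute,Description,Parent
Biospecimen,Template for specimen metadata,DataType
specimenID,Unique specimen identifier,DataProperty
tissue,Tissue of origin,DataProperty
"""

terms = list_terms(csv.DictReader(io.StringIO(SAMPLE)))
print(terms)
```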

test-split-csvs branch

MODEL-AD

ELITE

Annotate study folder with contentType = 'dataset'
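The annotation can also be set programmatically. This sketch assumes a logged-in `synapseclient.Synapse` object that provides `get_annotations` and `set_annotations`; newer client versions may offer other ways to do the same thing. To keep the example runnable without credentials, the demo uses a small stand-in class, `FakeSynapse`, and a placeholder folder ID, both hypothetical.

```python
def mark_folder_as_dataset(syn, folder_id):
    """Set contentType = 'dataset' on a folder, keeping other annotations."""
    annotations = syn.get_annotations(folder_id)
    annotations["contentType"] = "dataset"
    return syn.set_annotations(annotations)


class FakeSynapse:
    """Hypothetical stand-in for a Synapse client, for demonstration only."""

    def __init__(self):
        self.stored = {"study": "ExampleStudy"}

    def get_annotations(self, entity):
        # Return a copy so changes only persist through set_annotations.
        return dict(self.stored)

    def set_annotations(self, annotations):
        self.stored = dict(annotations)
        return self.stored


fake = FakeSynapse()
result = mark_folder_as_dataset(fake, "syn0000000")  # placeholder ID
print(result)
```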

Flattened file structure

Create Project

Maintain File permission access easily

Top level: assay folders

All data files of one type in assay folder

These assay folder names will be displayed

data_folder/

Schematic configuration needed: config.yml

master_fileview: 'synID'

which refers to this:

Fileview - Files and Folders https://www.synapse.org/#!Synapse:syn51753858/tables/

https://github.com/Sage-Bionetworks/data_curator_config

needs to point to this fileview and the data model

fork repo

edit dca-template-config.json

add MODEL-AD folder and edit configuration as needed

send a pull request

ADKP example

Fileview DCA Asset View that DCA uses

folder contentType = ‘dataset’

One project for all of AD

Templates

https://drive.google.com/drive/folders/1M90FJX2seyb1s-QzKIHRrSCDuLC97NJO

...

https://dca-docs.scrollhelp.site/DCA/Working-version/ELITE/validate-and-submit-your-metadata

...


Resources

...

https://portal.includedcc.org/dashboard

https://docs.google.com/spreadsheets/d/1w6zDfz3_yrCjjrqfpXBGNmd0LZL4B03gr1KfzJtk5Cs/edit#gid=674286209

...

https://linkml.io/schemasheets/#examples

https://linkml.io/linkml/intro/tutorial.html

Glossary

Template

Manifest - metadata table submitted for dataset

Data Model

https://docs.google.com/document/d/1nZGLRKW5LXpY-LBrtrgs4MyO-fb0kDDeouEOvW36xo0/edit#heading=h.o7ihd22lafi

...

https://github.com/Sage-Bionetworks/1kD-model

...

/wiki/spaces/SCHEM/pages/2473623559

...

OWL Tutorial

...

https://schema.org/

...

https://learnxinyminutes.com/docs/yaml

...

http://vowl.visualdataweb.org/webvowl.html

...

...

https://webprotege.stanford.edu/#projects/cb219a51-dd90-4921-bec4-c836bd96f680/edit/Properties?selection=ObjectProperty(%3Chttp://example.com/BallpointPenOntology/hasCharacteristic%3E)
