Last updated on 2023-09-15

This page describes the workflow required to build, edit, and update the data model for MODEL-AD.

Summary

Data Modeling at Sage requires using two in-house tools: Schematic and the Data Curator App (DCA).

Schematic

SCHEMATIC is an acronym for Schema Engine for Manifest Ingress and Curation. This Python-based tool is a schema-based metadata ingress ecosystem intended to streamline biomedical dataset annotation, metadata validation, and submission to a data repository for various data contributors.

...

https://github.com/adknowledgeportal/data-models

Sage Data Models for Reference

Recommendations

  • Draw a diagram. A diagram is a useful reference when developing the model.

  • Start small with a basic skeleton and then build.

  • Use schematic in dev mode to convert the model to JSON-LD regularly to check for errors.

...

schematic schema convert model.csv
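
After each conversion, a quick structural check of the output can catch problems early. The sketch below is only an illustration: the output file name `model.jsonld` and the assumption that the converted model keeps its terms in a top-level `@graph` list of nodes with `@id` keys are not stated on this page, and `check_jsonld` is a hypothetical helper, not part of schematic.

```python
import json
from pathlib import Path


def check_jsonld(path):
    """Return a list of problems found in a converted JSON-LD data model.

    Assumes (not confirmed by this page) that terms live in a top-level
    '@graph' list and that each term has a unique '@id'.
    """
    try:
        doc = json.loads(Path(path).read_text(encoding="utf-8"))
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]

    graph = doc.get("@graph") if isinstance(doc, dict) else None
    if not isinstance(graph, list):
        return ["missing top-level '@graph' list (assumed layout)"]

    problems = []
    seen = set()
    for node in graph:
        node_id = node.get("@id") if isinstance(node, dict) else None
        if node_id is None:
            problems.append("node without '@id'")
        elif node_id in seen:
            problems.append(f"duplicate '@id': {node_id}")
        else:
            seen.add(node_id)
    return problems


if __name__ == "__main__":
    # 'model.jsonld' is an assumed output name; adjust to the real file.
    if Path("model.jsonld").exists():
        for problem in check_jsonld("model.jsonld"):
            print(problem)
```

An empty list means the file parsed and no duplicate or missing identifiers were found; it does not replace schematic's own validation.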

What is JSON-LD?

Data models are formatted in JSON-LD (JavaScript Object Notation for Linked Data). One reason to use JSON-LD in schematic is its support by http://schema.org, which helps dataset discoverability in search engines like Dataset Search.
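
To make the format concrete, here is a minimal sketch of a JSON-LD record that uses the schema.org `Dataset` type. It is generic schema.org markup, not the layout schematic produces for a data model, and the name, description, and URL values are placeholders invented for the example.

```python
import json


def dataset_jsonld(name, description, url):
    """Build a minimal schema.org Dataset description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org/",  # vocabulary the keys below come from
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }


# Placeholder values, for illustration only.
record = dataset_jsonld(
    "Example study",
    "Placeholder description of an example dataset.",
    "https://example.org/dataset/1",
)
print(json.dumps(record, indent=2))
```

The `@context` key is what turns plain JSON into linked data: it tells a consumer which vocabulary defines `name`, `description`, and `url`.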

Guide to Developing Data Models in JSON-LD

...

The vocabulary should be relevant to the type of data that you are modeling.

Metadata Dictionary

AD Knowledge Portal Metadata Dictionary

...

Ontology Resources

...

Metadata Dictionary

AD Portal DCA Test Project

Annotate study folder with contentType = 'dataset'

Upload Data

Knowledge Portal Metadata Dictionary

https://sagebio.shinyapps.io/amp-ad-metadata-dictionary/

https://dca-docs.scrollhelp.site/DCA/Working-version/Project-Agnostic/organize-your-data-upload#OrganizeyourDataUpload-FlattenedDataLayoutExample

Data Curator App

http://dca.app.sagebionetworks.org

https://dca-dev.app.sagebionetworks.org

http://dca-docs.scrollhelp.site/DCA/Working-version/Project-Agnostic/uploading-data

https://dca-docs.scrollhelp.site/DCA/Working-version/ELITE/validate-and-submit-your-metadata

https://github.com/adknowledgeportal/data_curator

https://github.com/adknowledgeportal/data-models

https://sagebionetworks.jira.com/wiki/spaces/SCHEM/pages/2458648589/Setting+up+a+DCC+Asset+Store#How-do-I-Structure-My-DCC-Synapse-Project-to-Work-with-the-Data-Curator-App%3F

Resources

https://ontofox.hegroup.org/

https://linkml.io/linkml/intro/tutorial.html

https://docs.google.com/spreadsheets/d/1vDdcqt3Lgehyq1iCnlF1H9JZi63pLj-u/edit#gid=1939820452

https://portal.includedcc.org/dashboard

https://linkml.io/schemasheets/#examples

Projects

Folder Structure

https://dca-docs.scrollhelp.site/DCA/Working-version/Project-Agnostic/organize-your-data-upload#OrganizeyourDataUpload-FlattenedDataLayoutExample

https://docs.google.com/spreadsheets/d/1w6zDfz3_yrCjjrqfpXBGNmd0LZL4B03gr1KfzJtk5Cs/edit#gid=674286209

https://docs.google.com/presentation/d/129pSx58qDm7Y1OQmSSHKDq6tsoD3pW_gDRNXiX2rd0w/edit#slide=id.g4d21a8c2ba_0_11

.
├── biospecimen_experiment_1
│   └── manifest1.csv
├── biospecimen_experiment_2
│   └── manifestA.csv
├── single_cell_RNAseq_batch_1
│   ├── manifestX.csv
│   ├── fileA.txt
│   ├── fileB.txt
│   ├── fileC.txt
│   └── fileD.txt
└── single_cell_RNAseq_batch_2
    ├── manifestY.csv
    └── file1.txt
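
A local copy of a folder tree like the one above can be checked before upload. The following is a minimal sketch under two assumptions taken from the example rather than from any stated rule: each dataset folder holds exactly one file matching `manifest*.csv`, and dataset folders contain no subfolders. The helper name `check_flat_layout` is hypothetical.

```python
from pathlib import Path


def check_flat_layout(root):
    """Report dataset folders that break the assumed flattened layout.

    Assumed rules (from the example tree, not an official specification):
    one 'manifest*.csv' per dataset folder, and no nested folders.
    """
    problems = []
    folders = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for folder in folders:
        manifests = sorted(f.name for f in folder.glob("manifest*.csv"))
        if len(manifests) != 1:
            problems.append(
                f"{folder.name}: expected 1 manifest, found {len(manifests)}"
            )
        nested = sorted(p.name for p in folder.iterdir() if p.is_dir())
        if nested:
            problems.append(f"{folder.name}: nested folders {nested}")
    return problems
```

Calling `check_flat_layout(".")` from the top of the tree returns an empty list when every dataset folder follows the assumed rules.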

Study Content

/wiki/spaces/SCHEM/pages/2453176326

/wiki/spaces/AKP/pages/1057882353

  • Study Description in wiki

  • Methods description in each data folder

/wiki/spaces/SCHEM/pages/2458419217

/wiki/spaces/EPD1/pages/2900819969

Glossary

Template

Manifest - metadata table submitted for dataset

AMP-AD

https://github.com/adknowledgeportal/test-data-model/blob/main/model-ad/model-ad.data.model.jsonld

https://github.com/adknowledgeportal/data_curator

https://github.com/adknowledgeportal/test-data-model

https://github.com/adknowledgeportal/data-models/blob/main/README.md#editing-data-models

AD data model → modular

repo:

branch: test-split-csvs

folders:

modules/

..biospecimen/

..mouse/
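
If the model is split into per-module CSVs, they need to be joined back into one CSV before conversion. The sketch below shows one possible way to do that; the layout `modules/<module>/<file>.csv`, the requirement that all module files share one header, and the function name `combine_modules` are assumptions for illustration, not a description of how the repository actually assembles the model.

```python
import csv
from pathlib import Path


def combine_modules(modules_dir, out_csv):
    """Concatenate every modules/<module>/*.csv into a single model CSV.

    Assumes all module files share the same header row. Returns the number
    of data rows written.
    """
    header = None
    rows = []
    for path in sorted(Path(modules_dir).glob("*/*.csv")):
        with path.open(newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh)
            if header is None:
                header = reader.fieldnames
            elif reader.fieldnames != header:
                raise ValueError(f"{path}: header differs from first module")
            rows.extend(reader)
    if header is None:
        raise ValueError("no module CSVs found")

    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=header)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Failing on a header mismatch, rather than silently merging columns, keeps a stray column in one module from spreading into the combined model.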

Jira issue: ADM-836

Term = Attribute in the data model where Parent = DataProperty
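
The definition above can be applied directly to the model CSV. This is a minimal sketch that assumes the CSV has columns named `Attribute` and `Parent`, as the definition implies; the sample rows are made up for the example.

```python
import csv
import io


def list_terms(model_csv_text):
    """Return every Attribute whose Parent column equals 'DataProperty'."""
    reader = csv.DictReader(io.StringIO(model_csv_text))
    return [
        row["Attribute"]
        for row in reader
        if (row.get("Parent") or "").strip() == "DataProperty"
    ]


# Made-up sample rows, for illustration only.
sample = (
    "Attribute,Parent\n"
    "specimenID,DataProperty\n"
    "Biospecimen,DataType\n"
    "age,DataProperty\n"
)
print(list_terms(sample))
```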

test-split-csvs branch

MODEL-AD

ELITE

Annotate study folder with contentType = 'dataset'
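
The annotation can be set in the Synapse web interface or programmatically. The sketch below assumes a logged-in `synapseclient.Synapse` object from a client version that provides `get_annotations` and `set_annotations`; the helper name `mark_as_dataset` and the folder ID in the usage line are hypothetical.

```python
def mark_as_dataset(syn, folder_id):
    """Set contentType = 'dataset' on a Synapse folder, keeping other annotations.

    `syn` is assumed to be a logged-in synapseclient.Synapse instance.
    """
    annotations = syn.get_annotations(folder_id)  # existing annotations
    annotations["contentType"] = "dataset"        # add or overwrite the key
    return syn.set_annotations(annotations)       # store them back on the folder
```

Usage would look like `mark_as_dataset(syn, "syn00000000")`, where `syn00000000` stands in for the real study folder ID.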

Flattened file structure

Create Project

Maintain File permission access easily

Top level: assay folders

All data files of one type in assay folder

These assay folder names will be displayed

data_folder/

Schematic configuration needed: config.yml

master_file view ‘synID’

which refers to this:

Fileview - Files and Folders

https://www.synapse.org/#!Synapse:syn36759435/tables/

https://www.synapse.org/#!Synapse:syn51753858/tables/

Add CSV + JSONLD to GitHub (test-data-model)

https://github.com/adknowledgeportal/test-data-model

https://github.com/adknowledgeportal/data_curator/blob/18dc00723f2e95a98525ff695401ac67e7785475/schematic_config.yml#L31

https://github.com/Sage-Bionetworks/data_curator/blob/18dc00723f2e95a98525ff695401ac67e7785475/schematic_config.yml#L31

Data Model Validation Rules

/wiki/spaces/SCHEM/pages/2645262364

...

extract individual and specimen ID from filenames

http://regex101.com
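
As an illustration of the extraction step, the sketch below assumes a hypothetical filename convention of `<individualID>_<specimenID>_<rest>`; the real convention is not given on this page, so the pattern must be adapted and can be tested on regex101 first.

```python
import re

# Hypothetical convention: the first two underscore-separated fields are
# the individual ID and the specimen ID.
FILENAME_PATTERN = re.compile(r"^(?P<individualID>[^_]+)_(?P<specimenID>[^_]+)_")


def extract_ids(filename):
    """Return the two IDs as a dict, or None if the name does not fit."""
    match = FILENAME_PATTERN.match(filename)
    return match.groupdict() if match else None


print(extract_ids("M123_S45_R1.fastq.gz"))
print(extract_ids("notes.txt"))
```

Returning `None` for non-matching names makes it easy to list the files that need manual attention instead of guessing their IDs.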

...

_config

needs to point to this fileview and the data model

fork repo

edit dca-template-config.json

add MODEL-AD folder and edit configuration as needed, then send a pull request

ADKP example

Fileview: the DCA Asset View that DCA uses

folder contentType = ‘dataset’

One project for all of AD

Templates

https://drive.google.com/drive/folders/1M90FJX2seyb1s-QzKIHRrSCDuLC97NJO

/wiki/spaces/SCHEM/pages/2473623559

OWL Tutorial

https://schema.org/

https://dca-docs.scrollhelp.site/DCA/Working-version/Project-Agnostic/organize-your-data-upload#OrganizeyourDataUpload-FlattenedDataLayoutExample

https://dca-docs.scrollhelp.site/DCA/Working-version/Project-Agnostic/uploading-data

https://dca-docs.scrollhelp.site/DCA/Working-version/ELITE/validate-and-submit-your-metadata


Resources

https://linkml.io/schemasheets/#examples

https://linkml.io/linkml/intro/tutorial.html

https://docs.google.com/document/d/1nZGLRKW5LXpY-LBrtrgs4MyO-fb0kDDeouEOvW36xo0/edit#heading=h.o7ihd22lafi

https://learnxinyminutes.com/docs/yaml/

https://webprotege.stanford.edu/#projects/cb219a51-dd90-4921-bec4-c836bd96f680/edit/Properties?selection=ObjectProperty(%3Chttp://example.com/BallpointPenOntology/hasCharacteristic%3E)

“Manifest Templates”

...

Glossary

Template

Manifest - metadata table submitted for dataset