Demonstration of simulating cases and performing SPARQL queries

This page demonstrates how to run the demonstration query.

First we run the simulate command of PDXIntegrator to produce an RDF file with “randomized” cases. By default, 5 random cases are produced.

$ java -jar target/PdxIntegrator.jar simulate

This will produce a file called simulatedCases.rdf in the current working directory. This file will use the RDF/XML format, but the program will also emit the same RDF data in the Turtle format. For instance,

@prefix PDXNET: <http://pdxnetwork/pdxmodel#> .
@prefix NCIT:  <http://purl.obolibrary.org/obo/NCIT#> .
@prefix UBERON: <http://purl.obolibrary.org/obo/UBERON#> .

PDXNET:PAT-1  PDXNET:age_group  "0-4 years" ;
    PDXNET:consent       PDXNET:consent_YES ;
    PDXNET:ethnicity     "Sephardic" ;
    PDXNET:gender        PDXNET:female ;
    PDXNET:hasDiagnosis  NCIT:C130038 ;
    PDXNET:hasTumor      PDXNET:TUMOR-PAT-1 ;
    PDXNET:patient_id    "PAT-1" .

PDXNET:TUMOR-PAT-1  PDXNET:hasSubmitterTumorId
            "TUMOR-PAT-1" ;
    PDXNET:stage                NCIT:C19251 ;
    PDXNET:tissueOfOrigin       UBERON:35975 ;
    PDXNET:tumorCategory        NCIT:C3352 ;
    PDXNET:tumorGrade           NCIT:C121173 ;
    PDXNET:tumorHistology       NCIT:C130038 .

PDXNET:PAT-0  PDXNET:age_group  "15-19 years" ;
    PDXNET:consent       PDXNET:consent_NO ;
    PDXNET:ethnicity     "hispanic or latino" ;
    PDXNET:gender        PDXNET:male ;
    PDXNET:hasDiagnosis  NCIT:C7326 ;
    PDXNET:hasTumor      PDXNET:TUMOR-PAT-0 ;
    PDXNET:patient_id    "PAT-0" .

PDXNET:TUMOR-PAT-0  PDXNET:hasSubmitterTumorId
            "TUMOR-PAT-0" ;
    PDXNET:stage                NCIT:C19251 ;
    PDXNET:tissueOfOrigin       UBERON:4146 ;
    PDXNET:tumorCategory        NCIT:C8509 ;
    PDXNET:tumorGrade           NCIT:C48934 ;
    PDXNET:tumorHistology       NCIT:C7326 .

SPARQL Queries

We will use the corresponding RDF/XML file to perform demonstration SQPARL queries. For this, we use the query command, which produces output like this.

PREFIX pdxnet: <http://pdxnetwork/pdxmodel_>
PREFIX ncit: <http://purl.obolibrary.org/obo/NCIT_>
SELECT ?patient_id ?consent ?diagnosis
WHERE {
  ?x pdxnet:patient_id ?patient_id .
  ?x pdxnet:consent ?consent .
  ?x pdxnet:hasDiagnosis ?diagnosis .
  }
LIMIT 5
Lock : main
Lock : main
----------------------------------------------------------
| patient_id | consent                      | diagnosis  |
==========================================================
| "PAT-846"  | pdxnet:consent_NO            | ncit:C5235 |
| "PAT-1256" | pdxnet:consent_ACADEMIC_ONLY | ncit:C4887 |
| "PAT-127"  | pdxnet:consent_NO            | ncit:C5656 |
| "PAT-179"  | pdxnet:consent_YES           | ncit:C7811 |
| "PAT-1477" | pdxnet:consent_ACADEMIC_ONLY | ncit:C7965 |
----------------------------------------------------------

###########  Next Query  ###########  Next Query

PREFIX pdxnet: <http://pdxnetwork/pdxmodel_>
PREFIX ncit: <http://purl.obolibrary.org/obo/NCIT_>
PREFIX uberon: <http://purl.obolibrary.org/obo/UBERON_>
SELECT ?patient_id ?currentTreatmentDrug ?diagnosis
WHERE {
  ?x pdxnet:patient_id ?patient_id .
  ?x pdxnet:currentTreatmentDrug ?currentTreatmentDrug .
  ?x pdxnet:gender pdxnet:female .
  ?x pdxnet:hasDiagnosis ?diagnosis .
  }
LIMIT 5
Lock : main
Lock : main
--------------------------------------------------------------------------
| patient_id | currentTreatmentDrug                         | diagnosis  |
==========================================================================
| "PAT-846"  | "Goserelin[DB00014;65807-02-5]"              | ncit:C5235 |
| "PAT-1256" | "Sargramostim[DB00020;123774-72-1]"          | ncit:C4887 |
| "PAT-1477" | "Peginterferon alfa-2a[DB00008;198153-51-4]" | ncit:C7965 |
| "PAT-1770" | "Cetuximab[DB00002;205923-56-4]"             | ncit:C7061 |
| "PAT-1676" | "Cetuximab[DB00002;205923-56-4]"             | ncit:C8834 |
--------------------------------------------------------------------------

###########  Next Query  ###########  Next Query

PREFIX pdxnet: <http://pdxnetwork/pdxmodel_>
PREFIX ncit: <http://purl.obolibrary.org/obo/NCIT_>
PREFIX uberon: <http://purl.obolibrary.org/obo/UBERON_>
SELECT ?patient_id ?currentTreatmentDrug ?diagnosis ?age_lowerrange ?age_upperrange
WHERE {
  ?x pdxnet:patient_id ?patient_id .
  ?x pdxnet:currentTreatmentDrug ?currentTreatmentDrug .
  ?x pdxnet:gender pdxnet:female .
  ?x pdxnet:hasDiagnosis ?diagnosis .
  ?x pdxnet:ageBinLowerRange ?age_lowerrange .
  ?x pdxnet:ageBinUpperRange ?age_upperrange .
  FILTER (?age_lowerrange > 55) .
  }
LIMIT 5
Lock : main
Lock : main
-----------------------------------------------------------------------------------------------------------
| patient_id | currentTreatmentDrug                       | diagnosis   | age_lowerrange | age_upperrange |
===========================================================================================================
| "PAT-1256" | "Sargramostim[DB00020;123774-72-1]"        | ncit:C4887  | 75             | 79             |
| "PAT-1770" | "Cetuximab[DB00002;205923-56-4]"           | ncit:C7061  | 105            | 109            |
| "PAT-75"   | "Denileukin diftitox[DB00004;173146-27-5]" | ncit:C5631  | 75             | 79             |
| "PAT-1765" | "Pegfilgrastim[DB00019;208265-92-3]"       | ncit:C7964  | 80             | 84             |
| "PAT-851"  | "Leuprolide[DB00007;53714-56-0]"           | ncit:C27754 | 65             | 69             |
-----------------------------------------------------------------------------------------------------------

Development plans

Currently, there are prototype versions of all modules but one. We will go through the entire PDX-MI ontology specification in this document : https://docs.google.com/document/d/1M81y8wbT5gegUe35RZwS92bvHLYJrVPhaFnnkECgbto/edit and will implement RDF patterns, and will test the ability to query the data with SPARQL. Once this is mature and tested, we will adapt the code to provide ETL and Q/C functionalities.

Visualization

This is a nice tool for visualizing RDF graphs: http://visgraph3.org/