
Help with Sparql queries


  • Registered Users Posts: 695 ✭✭✭DaSilva


    Hey degzs, I just know a little about SPARQL and RDF as I only used them briefly, but maybe I can get you started.

    The query you have there can be run on the CSO's own SPARQL endpoint.
    If you go to http://data.cso.ie/sparql you can enter your SPARQL query in the Query Text textarea, choose JSON as the Results Format and hit Run Query to execute your SPARQL.
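    If you would rather script it than use the web form, here is a rough sketch of the same request as a plain HTTP GET URL. `query` and `default-graph-uri` are the standard SPARQL protocol parameter names; the `format` parameter is my assumption about what this endpoint accepts, and `build_sparql_url` is just a helper name I made up.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ENDPOINT = "http://data.cso.ie/sparql"  # the endpoint mentioned above


def build_sparql_url(query, graph_iri=None, endpoint=ENDPOINT):
    """Build a GET URL for a SPARQL query (hypothetical helper)."""
    params = {"query": query}
    if graph_iri:
        # Same role as the "Default Data Set Name (Graph IRI)" textbox.
        params["default-graph-uri"] = graph_iri
    # Assumption: the endpoint honours a `format` parameter for JSON results.
    params["format"] = "application/sparql-results+json"
    return endpoint + "?" + urlencode(params)


url = build_sparql_url("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(url)
```

    Fetching that URL (for example with urllib.request.urlopen) should give roughly what the form gives you, assuming the endpoint behaves as described.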

    My understanding of SPARQL is limited, but the actual Query you have can be explained like this.

    The GRAPH keyword seems to be a way of specifying the data-set to query. First you give it the URI of the data and then within the braces, the actual query to run. If you put the URI after GRAPH into the textbox called Default Data Set Name (Graph IRI) you can remove it from the query, leaving you with a simpler query:
    SELECT *
    WHERE {
      ?s ?p <http://data.cso.ie/census-2011/classification/gender/both> .
    }
    GROUP BY ?areaLabel
    ORDER BY desc(?population)
    

    The data you are querying is a bunch of triples, or to think of it another way, a table with 3 columns.
    That query is requesting all the column 1 and 2 cells (?s and ?p) where the 3rd column has the value "http://data.cso.ie/census-2011/classification/gender/both".
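    To make the "table with 3 columns" picture concrete, here is a small Python sketch of what that single pattern does. The triples are made up for illustration (they are not real CSO data), and `match_object` is a name I invented.

```python
# Made-up triples for illustration only; these are not real CSO data.
BOTH = "http://data.cso.ie/census-2011/classification/gender/both"
triples = [
    ("obs1", "gender", BOTH),
    ("obs1", "population", 100),
    ("obs2", "gender", "male"),
    ("obs3", "gender", BOTH),
]


def match_object(rows, wanted):
    """Return (?s, ?p) for every row whose third column equals `wanted`."""
    return [(s, p) for (s, p, o) in rows if o == wanted]


result = match_object(triples, BOTH)
print(result)  # [('obs1', 'gender'), ('obs3', 'gender')]
```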
    The query then tries to group and order the returned data, and that is where it has problems. The result set column names (at least that's what I call them) ?areaLabel and ?population don't make sense, because nothing in the WHERE clause binds them. The names are only for you: to the computer, ORDER BY desc(?population) carries as much meaning as ORDER BY desc(?abcdefghi). If you replace ?areaLabel and ?population with either ?s or ?p, you'll get somewhere, because ?s and ?p are part of the query and so relate to actual data.
    If you bind those column names in the WHERE clause instead, the query will work. Here is a query that will return all the Dail Constituencies and their populations.
    SELECT ?areaLabel ?population
    WHERE {
     ?s ?gender <http://data.cso.ie/census-2011/classification/gender/both> .
     ?s <http://purl.org/linked-data/sdmx#refArea> ?areaLabel .
     ?s <http://data.cso.ie/census-2011/property/population> ?population
    }
    GROUP BY ?areaLabel
    ORDER BY DESC(?population)
    
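    Note: the GROUP BY clause doesn't actually have an effect on this query. The endpoint accepts it, so you don't get an error there, but a stricter SPARQL engine may reject selecting ?population when it is neither grouped nor aggregated, so it is safest to drop the clause.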
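    The way I picture the second query: the three patterns share ?s, so they act like a join on the first column, and the result is then sorted. Here is a minimal Python sketch of that idea. The observations, area names and numbers are invented, and the short predicate names stand in for the full URIs.

```python
# Made-up observations; subjects, areas and numbers are illustrative only.
GENDER_BOTH = "gender/both"
triples = [
    ("obs1", "gender", GENDER_BOTH),
    ("obs1", "refArea", "Area A"),
    ("obs1", "population", 100),
    ("obs2", "gender", "gender/male"),
    ("obs2", "refArea", "Area A"),
    ("obs2", "population", 48),
    ("obs3", "gender", GENDER_BOTH),
    ("obs3", "refArea", "Area B"),
    ("obs3", "population", 250),
]


def select_area_population(rows):
    # Pattern 1: every ?s that has GENDER_BOTH in the third column.
    subjects = {s for (s, p, o) in rows if o == GENDER_BOTH}
    # Patterns 2 and 3: area and population looked up for the same ?s.
    area = {s: o for (s, p, o) in rows if p == "refArea"}
    population = {s: o for (s, p, o) in rows if p == "population"}
    joined = [(area[s], population[s])
              for s in subjects if s in area and s in population]
    # ORDER BY DESC(?population)
    return sorted(joined, key=lambda row: row[1], reverse=True)


rows = select_area_population(triples)
print(rows)  # [('Area B', 250), ('Area A', 100)]
```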


    Here's a way to actually query the data locally if that's what you'd like.
    1. Download Jena
    2. Extract and install it as per the README file
    3. Download and extract a data set e.g. "per-county" from http://data.cso.ie/datasets/age-group-gender-population.html
    4. Run from a cmd prompt: tdbloader.bat --loc population_by_county C:\path_to\dataset.nt
    5. Direct query: tdbquery.bat --loc population_by_county "SELECT * {?s ?p ?o}"
    6. Query from a file: tdbquery.bat --loc population_by_county --query query.sparql
    7. Get JSON output by adding the "--results JSON" argument to tdbquery.bat queries
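
    Once you have JSON output, it should follow the standard SPARQL JSON results layout (a "head" with the variable names and "results" with a list of "bindings"). Here is a small sketch of flattening that into plain rows. The sample document is hand-written with made-up values, and `bindings_to_rows` is a name I invented.

```python
import json

# Hand-written sample in the SPARQL JSON results layout; values are made up.
SAMPLE = """
{
  "head": {"vars": ["areaLabel", "population"]},
  "results": {"bindings": [
    {"areaLabel": {"type": "uri", "value": "http://example.org/area/A"},
     "population": {"type": "literal", "value": "100"}},
    {"areaLabel": {"type": "uri", "value": "http://example.org/area/B"},
     "population": {"type": "literal", "value": "250"}}
  ]}
}
"""


def bindings_to_rows(text):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    data = json.loads(text)
    names = data["head"]["vars"]
    # A variable can be unbound in a given row, so skip missing names.
    return [{name: binding[name]["value"] for name in names if name in binding}
            for binding in data["results"]["bindings"]]


rows = bindings_to_rows(SAMPLE)
print(rows)
```

    Note that the values come back as strings, so a number like the population needs converting before you sort or plot it.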


  • Registered Users Posts: 263 ✭✭degzs


    Thanks, that is a great bit of help.

    Next I need to graph the data (I will try D3.js, which is a powerful visualization library).

    I will need to use the force-directed layout, which I am having problems with.

    Thanks again

