3. The Metaweb Query Language
This chapter explains the Metaweb Query Language, or MQL, which is used to express Metaweb queries. This chapter begins and ends with formal rules of MQL syntax, but the middle is an extended tutorial that teaches MQL by example. You are expected and encouraged to run queries and to experiment with your own queries, using a "query editor" program that submits your queries to Metaweb and displays the results.
This chapter teaches you to write MQL queries, but does not explain how to issue those queries to and retrieve responses from Metaweb servers: that is the topic of Chapter 4. Also, this chapter does not cover updates, or writes, to Metaweb. Updates are expressed using a variant of MQL that is covered in Chapter 5.
The Metaweb queries and responses we saw in Chapter 1 contained a lot of punctuation: curly braces, quotation marks, colons, and commas. Before we study more queries, it is important to understand this punctuation. Metaweb queries and responses use a plain-text data interchange format known as JavaScript Object Notation or, more commonly, JSON. If you are a JavaScript programmer, then this format will be familiar to you since it is a subset of the JavaScript language. [1] If you are not a JavaScript programmer, the format is easy to learn and does not require the use of the JavaScript language.
JSON is formally described in RFC 4627 (http://www.ietf.org/rfc/rfc4627.txt), and is also documented at http://json.org. The JSON website includes pointers to code, in a variety of programming languages, for serializing data structures into JSON format and for parsing JSON text into data structures. [2]
A JSON-formatted string is a serialized form of an array or object. The array or object may contain numbers, strings, other arrays and objects, and the literal values null, true, and false. These JSON values are illustrated in Figure 3.1 and explained in the sub-sections that follow:
JSON supports three literal values. null is a JSON value representing "no value". The literals true and false represent the two possible Boolean values.
A JSON number consists of an optional minus sign followed by an integer part followed by an optional decimal point and fractional part followed by an optional exponent. This format is the same as the format described for /type/float in Chapter 2. All numbers use decimal digits: octal and hexadecimal notation are not supported.
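To make the number rules concrete, here is a minimal sketch that checks a few literals with Python's standard json module. Python is used only for illustration (it is not part of Metaweb), and the helper names is_json_number and _reject are hypothetical.

```python
import json

def _reject(name):
    # Python's parser would otherwise accept NaN and Infinity,
    # which are not legal JSON numbers.
    raise ValueError(f"{name} is not a JSON number")

def is_json_number(text):
    """Return True if text parses as a single JSON number."""
    try:
        value = json.loads(text, parse_constant=_reject)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    # bool is a subclass of int in Python, so exclude true/false explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# Decimal forms are accepted; hexadecimal, octal-style leading zeros,
# and a dangling decimal point are not.
for literal in ["42", "-1.5e3", "0.25", "0x1F", "017", "1."]:
    print(literal, is_json_number(literal))
```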
A JSON string is much like a string in Java or JavaScript: zero or more Unicode characters [3] between double quotation marks. See Figure 3.2.
A backslash is special: it is an escape character and is interpreted along with the character or characters that follow:
| Escape | Character |
|---|---|
| \" | A quotation mark that does not terminate the string |
| \\ | A single backslash character that is not an escape |
| \/ | A forward slash character. Although it is legal to escape the forward slash character, it is never necessary to do so. |
| \b | The Backspace character |
| \f | The Formfeed character |
| \n | The Newline character |
| \r | The Carriage Return character |
| \t | The Tab character |
| \uXXXX | The Unicode character whose encoding is the four hexadecimal digits |
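The escapes above can be tried out with any JSON parser. The following sketch uses Python's standard json module, chosen here only as a convenient illustration; the variable names are our own.

```python
import json

# A JSON string literal, written as a Python raw string so that the
# backslashes reach the JSON parser unchanged.
encoded = r'"Tab:\t Quote:\" Backslash:\\ Slash:\/ Unicode:\u00e9"'
decoded = json.loads(encoded)
print(decoded)

# Going the other way, serialization escapes the characters that need it.
reencoded = json.dumps('say "hi"\n')
print(reencoded)
```

Note that the escaped forward slash decodes to a plain slash, which is why escaping it is never necessary.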
An array is a comma-separated list of JSON values enclosed in square brackets. See Figure 3.3.
Arrays may contain any JSON values, including objects and other arrays. The elements of a JSON array need not have the same type (though in MQL they always do). The following JSON array might be returned in response to a MQL query:
["Outlandos d'Amour", "Reggatta de Blanc", "Zenyatta Mondatta"]
A JSON array with no elements consists of just the square brackets: []. Empty arrays often appear in MQL queries.
A JSON object is named after the JavaScript object type, and is not very much like the objects of strongly-typed object-oriented programming languages. Instead, think of an object as:
- an associative array;
- a hashtable that maps strings to values;
- a dictionary; or
- an unordered set of named values.
JSON objects are written as a comma-separated list of name/value pairs, enclosed in curly braces. A name/value pair is a JSON string (the name) followed by a colon followed by any JSON value, which may include nested objects and arrays. See Figure 3.4.
Here is an example JSON object (which also happens to be a Metaweb query):
{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}
JavaScript programmers should note that JSON requires property names to appear within double quotes, even though the JavaScript language does not. Arbitrary whitespace is allowed within JSON objects and arrays, but trailing commas (after the final array element or last name/value pair) are not. An empty JSON object, with no properties at all, is simply a pair of curly braces: {}. As we'll see, empty objects are not uncommon in MQL queries.
This section is a tutorial that teaches Metaweb queries by example, and uses freebase.com as a source of interesting data to query. Follow along as you read by trying out the queries presented. To do this, you need a simple way to submit a query to freebase.com and view the result. You can do this with the Freebase query editor at http://www.freebase.com/view/queryeditor/, or you can create your own simple query editor: save the code from Example 3.1 to a local file, and view it in your web browser. Figure 3.5 shows the resulting UI, displaying a simple query and response. (Remember that during the roll-out period, freebase.com queries require cookie-based authentication. So the query editor of Example 3.1 will only work in web browsers that have previously logged on to freebase.com and have authentication credentials stored in a cookie.)
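The strictness described here is easy to observe with a JSON parser. The sketch below uses Python's standard json module for illustration only; the helper name parses is hypothetical.

```python
import json

# The example object from this section parses into a Python dictionary.
valid = '{ "type" : "/music/artist", "name" : "The Police", "album" : [] }'
query = json.loads(valid)
print(query["album"])

def parses(text):
    """Return True if text is accepted by a strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(parses('{"album": [],}'))   # trailing comma after the last pair
print(parses('{album: []}'))      # property name without double quotes
print(parses('{}'))               # the empty object
```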
Example 3.1. qedit.html: Code for a Metaweb query editor
Metaweb Query Editor : Freebase - The World's Database
Let's begin by revisiting the simple query from Chapter 1. We would like to know what albums The Police have recorded:
{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}
Once we extract the result from its envelope, we're left with the following JSON object (some of the album names are omitted here for brevity):
{
"type": "/music/artist",
"name": "The Police",
"album": [
"Outlandos d'Amour",
"Reggatta de Blanc",
"Zenyatta Mondatta",
"Ghost in the Machine",
"Synchronicity"
]
}
To query Metaweb we tell it what we already know by specifying properties and their values:
"type" : "/music/artist",
"name" : "The Police",
And then we tell it what we want to know by specifying properties without values:
"album" : []
Sending an empty array in a MQL query tells Metaweb that we'd like to have the array filled in.
Let's look one more time at the simple "albums by The Police" query and response from above. This time the query and response are presented side-by-side to emphasize that the query and response objects have the same properties, but the response object has values filled in:
| Query | Result |
|---|---|
{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}
|
{
"type": "/music/artist",
"name": "The Police",
"album": [
"Outlandos d'Amour",
"Live in Boston",
"Reggatta de Blanc",
"Zenyatta Mondatta",
"Ghost in the Machine",
"Synchronicity",
"Every Breath You Take: The Singles",
"Greatest Hits"
]
}
|
This symmetry of queries and responses is a fundamental and elegant part of MQL. We'll use this two-column query/response format throughout the chapter.
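The symmetry can also be checked mechanically. In the sketch below (Python, used for illustration only) the query is built as a dictionary, and a hypothetical helper, same_properties, confirms that a response carries exactly the same property names. The response shown is an abbreviated copy of the one above.

```python
import json

def same_properties(query, response):
    """True if two JSON objects have exactly the same property names."""
    return set(query) == set(response)

# What we know (type, name) and what we want filled in (album).
query = {"type": "/music/artist", "name": "The Police", "album": []}

# Abbreviated response: only the values differ from the query.
response = {
    "type": "/music/artist",
    "name": "The Police",
    "album": ["Outlandos d'Amour", "Reggatta de Blanc"],
}

print(json.dumps(query, indent=2))
print(same_properties(query, response))
```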
Recall that all Metaweb objects have a unique identifier in their id property. Here's how we find the id for The Police:
| Query | Result |
|---|---|
{
"type" : "/music/artist",
"name" : "The Police",
"id" : null
}
|
{
"type" : "/music/artist",
"name" : "The Police",
"id" : "#9202a8c04000641f800000000006df1b"
}
|
This query includes the same name and type as the last query. But instead of specifying an empty array of albums, it specifies a null id. The null value is our query: this is what we want Metaweb to fill in. The response looks just like the query, but the null is replaced with an id string.
Now that we know the id of the object, let's turn our query around and ask about name and type of the object with that id:
{
"id": "#9202a8c04000641f800000000006df1b",
"name" : null,
"type" : null
}
We're telling Metaweb what we have (the id) and asking for the values (name and type) that we don't have. When we submit this query, though, it doesn't work. The response envelope looks like this:
{
"status": "200 OK",
"qname": {
"status": "/mql/status/error",
"messages": [
{
"status": "/mql/status/result_error",
"info": {
"count": 2,
"result": [
"#9202a8c04000641f80000000011ae833",
"#9202a8c04000641f8000000000000565"
]
},
"path": "type",
"query": {
"error_inside": "type",
"type": null,
"id": "#9202a8c04000641f800000000006df1b",
"name": null
},
"message": "Unique query may have at most one result. Got 2",
"type": "/mql/error"
}
]
}
}
The various status properties tell us that something is wrong with the query. The messages[0] object provides details. Its message property gives us an error message, and its info object provides details to go with the message. The query.error_inside and path properties tell us that the error is associated with the type property in our query.
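An application would normally pull these details out of the envelope programmatically. The following Python sketch assumes only the envelope layout shown above; the function name describe_errors is hypothetical, and the sample envelope is a trimmed copy of the response in the text.

```python
def describe_errors(envelope, query_name="qname"):
    """Return one 'path: message' string per error reported for a query."""
    inner = envelope.get(query_name, {})
    if inner.get("status") != "/mql/status/error":
        return []
    return [
        f"{entry.get('path')}: {entry.get('message')}"
        for entry in inner.get("messages", [])
    ]

# Trimmed copy of the error envelope shown above.
envelope = {
    "status": "200 OK",
    "qname": {
        "status": "/mql/status/error",
        "messages": [{
            "status": "/mql/status/result_error",
            "info": {"count": 2},
            "path": "type",
            "message": "Unique query may have at most one result. Got 2",
            "type": "/mql/error",
        }],
    },
}

for line in describe_errors(envelope):
    print(line)
```

Note that the outer status is "200 OK" even though the query failed, so the inner status is the one to test.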
What we learn from this response is that Metaweb could not respond to our query because we asked for a single type and it found two types. Let's try the query again. Now we're requesting a single name and an array of types for this uniquely specified object. This query works:
| Query | Result |
|---|---|
{
"id":"#9202a8c04000641f800000000006df1b",
"name" : null,
"type" : []
}
|
{
"id":"#9202a8c04000641f800000000006df1b",
"name" : "The Police",
"type" : [
"/common/topic",
"/music/artist"
]
}
|
The Metaweb object we asked about has the name "The Police" and it is a member of two types: /common/topic and /music/artist. Recall from Chapter 2 that /common/topic is a very generic type. Just about every Metaweb object that represents something an end user would have an interest in is a member of this type. The lesson to draw here is that objects almost always have more than one type, and any queries on the type property should use arrays. In general, it is always safe to use [] in place of null in your queries. If there is only one result the array returned in the response will simply have a single element. When you know that there can only be one result, however, it is usually more convenient and efficient to use null.
Uniqueness errors are a common pitfall for developers crafting Metaweb queries. Recall that /type/property allows certain properties to be specified as unique. id is unique: no object can have more than one id. The name property behaves as if it is unique (but is only unique per language). As we've seen, however, the type property is not unique: objects can (and most objects do) have more than one type. If a property is not guaranteed to be unique, then you should always use square brackets when querying its value.
The id property is unique in another way. As we've seen, no object can have more than one id. More importantly, however, no two objects share the same id. If a query includes an id, you can therefore be confident that no more than one object will match, so a query like this one is correct:
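The difference between null and [] can be modeled in a few lines. This is a local simulation written in Python for illustration, not a description of how Metaweb is implemented; the function name fill_slot is hypothetical.

```python
def fill_slot(requested, values):
    """Simulate filling in a query slot from the values an object has.

    requested is [] (give me every value) or None (give me the one value).
    """
    if requested == []:
        return list(values)
    if requested is None:
        if len(values) > 1:
            raise ValueError(
                f"Unique query may have at most one result. Got {len(values)}")
        return values[0] if values else None
    raise TypeError("this sketch only handles null and []")

types = ["/common/topic", "/music/artist"]
print(fill_slot([], types))              # always safe
print(fill_slot(None, ["The Police"]))   # fine: a single value
try:
    fill_slot(None, types)               # two values for a null slot
except ValueError as error:
    print(error)
```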
{
"id": "#9202a8c04000641f800000000006df1b",
"name" : null,
"type" : []
}
Recall that an object can have only one name in any given language, and that the name property behaves like a unique property even though it is not truly unique. For this reason, it is always safe to query name with null, as we do above, rather than [].
On the other hand, the query that we started this tutorial with is risky:
{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}
This query worked for us: Freebase only knows about one musical artist named "The Police". Note, however, that there is no guarantee that this will always be the case. There is nothing to prevent someone from adding another band named "The Police" to freebase.com. If such an addition were made, our query would suddenly fail.
Depending on the design of your application, a uniqueness failure in this situation might actually be exactly what you want. If you get two results when you expected one, then perhaps the right thing to do is fail and display an error message to the user. On the other hand, you could write your query more cautiously, using square brackets, so that multiple results can be returned:
[{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}]
In this case, you should be sure to check the number of results returned and take appropriate action (such as asking the user to choose) if you get more than one.
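One way to structure that check is sketched below in Python. The function choose_result and its chooser callback are hypothetical names, and the ids in the usage lines are placeholders rather than real Metaweb ids.

```python
def choose_result(results, chooser=None):
    """Return the single match, None if there is none, or defer to chooser.

    results is the array returned for a query wrapped in square brackets.
    chooser is an optional callback (for example, one that asks the user).
    """
    if not results:
        return None
    if len(results) == 1:
        return results[0]
    if chooser is None:
        raise LookupError(f"expected one result, got {len(results)}")
    return chooser(results)

# Placeholder results standing in for two artists with the same name.
ambiguous = [{"id": "#placeholder-1"}, {"id": "#placeholder-2"}]
print(choose_result([{"id": "#placeholder-1"}]))
print(choose_result(ambiguous, chooser=lambda found: found[0]))
```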
Let's find out more about our favorite band. What are the names of the tracks on the album Synchronicity?
| Query | Result |
|---|---|
{
"type" : "/music/artist",
"name" : "The Police",
"album" : {
"name" : "Synchronicity",
"track" : []
}
}
|
{
"type" : "/music/artist",
"name" : "The Police",
"album" : {
"name" : "Synchronicity",
"track" : [
"Synchronicity I",
"Walking in Your Footsteps",
"O My God",
"Mother",
"Miss Gradenko",
"Synchronicity II",
"Every Breath You Take",
"King of Pain",
"Wrapped Around Your Finger",
"Tea in the Sahara",
"Murder by Numbers"
]
}
}
|
The interesting thing about this query is that it includes a nested query. We're asking for an array of tracks from an album named "Synchronicity" recorded by a band named "The Police".
There are other ways to obtain the same information. Here's another query that gets us the same data:
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : []
}
Rather than identifying the band first, and then querying an album recorded by that band, this query goes straight to the album, which it identifies by name and by artist. (It assumes this is enough to uniquely identify a single album and avoid uniqueness errors!)
In our queries so far, we've used null and [] to ask Metaweb to fill in a single value or an array of values. There are other ways to ask for information as well. Recall the following query:
{
"id" : "#9202a8c04000641f800000000006df1b",
"name" : null,
"type" : []
}
It asks for the name and types of a unique object. Both the name and the individual elements of the type array are returned as strings. Recall, however, that the name of an object is of /type/text and that types are of /type/type. /type/text is a value type in the Metaweb object model, but we can treat values as objects if we want to. Let's modify the query to use {} and [{}] instead of null and []. {} asks for a single value, expanded as an object, and [{}] asks for an array of values expanded into objects:
{
"id": "#9202a8c04000641f800000000006df1b",
"name" : {},
"type" : [{}]
}
This query fails with a uniqueness error. The object we're querying has more than one name. The name property behaves specially when queried with null: it returns the value of the name in the default language. It only works to query name with {} if there is only one name, with no translations. To make the query work, we ask for both the name and type with [{}]:
| Query | Result |
|---|---|
{
"id": "#9202a8c04000641f800000000006df1b",
"name" : [{}],
"type" : [{}]
}
|
{
"id":"#9202a8c04000641f800000000006df1b",
"name":[{
"lang":"/lang/fr",
"type":"/type/text",
"value":"The Police"
},{
"lang":"/lang/en",
"type":"/type/text",
"value":"The Police"
},{
"lang":"/lang/es",
"type":"/type/text",
"value":"The Police"
}],
"type":[{
"id":"/music/artist",
"name":"Musical Artist",
"type":["/type/type","/freebase/type_profile"]
},{
"id":"/common/topic",
"name":"Topic",
"type":["/type/type","/freebase/type_profile"]
}]
}
|
We learn from this query that the name of the specified object is "The Police" in each of several different languages (some languages have been omitted here). We also learn that the object is of type /common/topic and /music/artist and that these types have common names (as opposed to the formal ids that we use in queries) "Topic" and "Musical Artist".
Let's use this query technique to learn more about the tracks on the album Synchronicity. (The result is truncated for brevity.)
| Query | Result |
|---|---|
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{}]
}
|
{
"type": "/music/album",
"name": "Synchronicity",
"artist": "The Police",
"track": [{
"type": [ "/music/track" ],
"name": "Synchronicity I",
"id": "#9202a8c04000641f8000000001275dbb"
},{
"type": [ "/music/track" ],
"name": "Walking in Your Footsteps",
"id": "#9202a8c04000641f8000000001275dc2"
},{
"type": [ "/music/track" ],
"name": "O My God",
"id": "#9202a8c04000641f8000000001275dc9"
}]
}
|
This query doesn't actually tell us much about the tracks themselves. We already know the type of the tracks. The id might be useful in future queries, but it doesn't tell us anything about the track. The name is useful, but we could have obtained that without using curly braces, just by querying "track":[].
When you ask Metaweb to fill in empty curly braces for you, it returns all the properties if the value is a value type. The name property of an object is of /type/text, and querying it with {} returns all of its properties. If the property is an object type instead of a value type, then Metaweb returns only the name, type and id properties (all of which are defined by /type/object and are common to all Metaweb objects). That is, instead of using [{}], we could write out the query explicitly like this:
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{
"name" : null,
"id" : null,
"type" : []
}]
}
What if we want to know absolutely everything freebase.com knows about the tracks on Synchronicity? We write the query using a wildcard: [4]
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{"*":null}]
}
"*" is a wildcard property name. It means "all property names". (Note that it is different from [], which means "all property values".) The type /music/track defines a number of its own properties, and the expansion of the "*" wildcard also includes the universal properties defined by /type/object. Here, for example, is what freebase.com knows about the song "Walking in Your Footsteps":
{
"name":"Walking in Your Footsteps",
"type":["/music/track"],
"id":"#9202a8c04000641f8000000001275dc2",
"guid":"#9202a8c04000641f8000000001275dc2",
"creator":"/user/mwcl_musicbrainz",
"key":["a2313ee6-ccce-4ced-bc3c-af7d4b06f09f","TRACK179899"],
"permission":"/boot/all_permission",
"timestamp":"2006-12-10T00:39:58.0931Z",
"album":["Synchronicity"],
"length":[216.8],
"lyricist":[],
"lyrics":[],
"song":[],
"artist":null,
"composer":[],
"acquire_webpage":[]
}
If {} gives us too little useful information, and {"*":null} gives us more than we really need, then we must refine our query to express exactly what it is we would like to know. Here's how we ask for just the name and length of each of the tracks:
| Query | Result |
|---|---|
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{
"name":null,
"length":null
}]
}
|
{
"type": "/music/album",
"name": "Synchronicity",
"artist": "The Police",
"track": [
{"name":"Synchronicity I", "length":203.533},
{"name":"Walking in Your Footsteps",
"length":216.8},
{"name":"O My God", "length":242.226},
{"name":"Mother", "length":185.6},
{"name":"Miss Gradenko", "length":120.0},
{"name":"Synchronicity II", "length":305.066},
{"name":"Every Breath You Take",
"length":254.066},
{"name":"King of Pain", "length":299.066},
{"name":"Wrapped Around Your Finger",
"length":313.733},
{"name":"Tea in the Sahara", "length":255.44},
{"name":"Murder by Numbers", "length":273.693}
]
}
|
In this tutorial we've said that we query the value of a property p with "p":null and "expand" that value into an object with "p":{}. This is helpful terminology, but it is actually the opposite of what is really going on. Everything in Metaweb is an object (or, in the case of literal values, can be viewed as an object). When you use curly braces, objects are naturally expressed as objects. When you use null, however, objects are compressed: instead of returning the complete object, Metaweb returns only the value of the object's default property. If the object is of value type, this default property is always the value property and is expressed as a string, number, or boolean literal. If the object is not an instance of a value type, then the default property is either name or id, both of which are expressed using string literals. Object types in the /type domain use id as their default property. All other object types use name.
Default properties are not only used when you ask Metaweb to fill in a null or a [] for you. They are also used when you express the information you already have. Consider the following query:
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : []
}
This query could also be expressed more verbosely like this:
{
"type" : "/music/album",
"name" : {"value":"Synchronicity", "lang":"/lang/en"},
"artist" : {"type":"/music/artist", "name":"The Police"},
"track" : []
}
The verbose form of the query illustrates the fact that the succinct form relies on default properties. The name property is of /type/text, whose default property is value. The artist property is of type /music/artist, whose default property is name.
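The default-property rules can be summarized in a short Python sketch. This is an illustration of the rules as stated in this section, not Metaweb's implementation: the names default_property and compress are hypothetical, and ILLUSTRATIVE_VALUE_TYPES is an assumed, deliberately incomplete list of value types.

```python
# Assumption: a small, incomplete set of value types, for illustration only.
ILLUSTRATIVE_VALUE_TYPES = frozenset(
    {"/type/text", "/type/float", "/type/datetime"})

def default_property(expected_type, value_types=ILLUSTRATIVE_VALUE_TYPES):
    """Pick the default property for a property's expected type."""
    if expected_type in value_types:
        return "value"      # value types compress to their value
    if expected_type.startswith("/type/"):
        return "id"         # object types in the /type domain use id
    return "name"           # all other object types use name

def compress(obj, expected_type):
    """Reduce an expanded object to the value of its default property."""
    return obj[default_property(expected_type)]

name = {"lang": "/lang/en", "type": "/type/text", "value": "The Police"}
a_type = {"id": "/music/artist", "name": "Musical Artist"}
track = {"name": "Synchronicity I", "type": ["/music/track"]}

print(compress(name, "/type/text"))
print(compress(a_type, "/type/type"))
print(compress(track, "/music/track"))
```

The three results correspond to what the earlier queries returned for "name":null, "type":[] and "track":[].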
If you want to ask Metaweb to return a value, use one of the terms listed in Table 3.1 on the right-hand side of a property name:
Table 3.1. Asking for Values
| Term | Meaning |
|---|---|
null |
If the property is of value type, return the |
[] |
Like |
{} |
If the property is of value type, return an object that represents the value. This object will have If the property is of object type, return an object that includes its |
[{}] |
Like |
{"*":null} |
A query of this form returns an object and all of its properties. The meaning of "all of its properties" requires some explanation, however. Suppose Metaweb sees the query |
[{"*":null}] |
Like |
{"*":{}} |
A query of this form is like |
[{"*":{}}] |
Like |
Suppose we want to find the answers to the following questions: Which artists have recorded songs named Too Much Information? How long are the recordings, and on what albums were they released?
Here's a simple query to answer these questions, along with the freebase.com response:
| Query | Result |
|---|---|
[{
"type":"/music/track",
"name":"Too Much Information",
"artist":null,
"album":null,
"length":null
}]
|
[{
"type" : "/music/track",
"name" : "Too Much Information",
"artist" : "The Police",
"album" : "Message in a Box (disc 3)",
"length" : 222.733
},{
"type" : "/music/track",
"name" : "Too Much Information",
"artist" : "The Police",
"album" : "Ghost in the Machine",
"length" : 222.733
}]
|
You should have no trouble understanding this query. It requests an array of tracks with the specified name, and asks Metaweb to fill in the artist, album, and length of each track. But there are other ways to ask for this information. The above track-centric query is simple, but returns an unordered and unstructured list of tracks. If multiple artists have recorded the same song, we might like the result to be organized by artist. Here's how to write an artist-centric version of the query, along with the more structured response from freebase.com:
| Query | Result |
|---|---|
[{
"type":"/music/artist",
"name":null,
"album": [{
"name":null,
"track": [{
"name":"Too Much Information",
"length": null
}]
}]
}]
|
[{
"type" : "/music/artist",
"name" : "The Police",
"album" : [{
"name" : "Ghost in the Machine",
"track" : [{
"name" : "Too Much Information",
"length" : 222.733
}]
}, {
"name" : "Message in a Box (disc 3)",
"track" : [{
"name" : "Too Much Information",
"length" : 222.733
}]
}]
}]
|
Take a look at that query again. It involves three different objects: an album, an artist, and a track. We can't tell Metaweb anything interesting about the album (such as a name or id): just that it contains the song we're interested in. We can't tell Metaweb anything about the artist object either: just that they recorded an album that includes the song. Despite the seeming vagueness of this query, Metaweb has no trouble finding the answer we want.
At first glance, it seems as if the only information we're providing to Metaweb with this query is the track name. But notice that we also explicitly specify the type of the outermost object: we've said that we want an object of type /music/artist. This is critical, because types have properties, and properties specify the type of their values. Since we've specified that the outermost object is /music/artist, Metaweb knows that the middle object is a /music/album (because that is the type of the /music/artist/album property) and that the inner object is a /music/track (because that is the type of the /music/album/track property). [5]
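The inference described above can be traced with a small Python sketch. PROPERTY_TYPES is a hand-written stand-in for the schema, containing only the two properties named in the text, and infer_types is a hypothetical helper; Metaweb's real type inference works from its full schema.

```python
# Stand-in for the schema: property id -> type of that property's values.
PROPERTY_TYPES = {
    "/music/artist/album": "/music/album",
    "/music/album/track": "/music/track",
}

def infer_types(query, type_id, path="(root)"):
    """Map each nesting level of a query to the type implied for it."""
    found = {path: type_id}
    for prop, value in query.items():
        expected = PROPERTY_TYPES.get(f"{type_id}/{prop}")
        # Unwrap [{...}] to reach the nested query object.
        if isinstance(value, list) and value and isinstance(value[0], dict):
            value = value[0]
        if expected and isinstance(value, dict):
            found.update(infer_types(value, expected, f"{path}.{prop}"))
    return found

query = {
    "type": "/music/artist",
    "name": None,
    "album": [{
        "name": None,
        "track": [{"name": "Too Much Information", "length": None}],
    }],
}

for level, type_id in infer_types(query, query["type"]).items():
    print(level, type_id)
```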
We've answered our question about the song Too Much Information with a track-centric query and an artist-centric query. For completeness, here is the album-centric query that returns the same information:
[{
"type":"/music/album",
"name":null,
"artist":null,
"track": [{
"name":"Too Much Information",
"length": null
}]
}]
The id and name properties of every Metaweb object have special behavior that is important to understand. Some of this behavior was explained in Chapter 2, but it bears repeating here.
The critical thing about id is that it is unique: every object's id is different. For objects, such as types, that are organized into namespaces, the id is a fully-qualified name such as "/music/artist". For other objects, the id is a guid: a unique, but meaningless, string of hexadecimal digits. Note that although ids are represented with JSON strings, the id property of /type/object is of /type/id rather than /type/text or /type/rawstring.
In addition to its guarantees of uniqueness, the id property has some special behavior. Specifically, the id property cannot be constrained with pattern-matching or comparison operators, and cannot be used as a sort key. (We'll learn about operators and sorting later in this tutorial.)
The special thing about the name property is that it behaves like a unique property (you can safely query it with null instead of [], for example) but it is not truly unique. Any Metaweb object can have multiple names, but may have only one name in any given language. That is, the name property is unique on a per-language basis. When you query the name of an object, Metaweb returns its name (if it has one) in your preferred language. (The desired language is specified as a parameter to the mqlread query service, which is the topic of Chapter 4.)
To demonstrate the special behavior of the name property, we must choose a topic that has translations into other languages. Let's find the freebase.com topic named "Anarchism":
| Query | Result |
|---|---|
{
"type" : "/common/topic",
"name" : "Anarchism",
"id":null
}
|
{
"type": "/common/topic",
"name": "Anarchism",
"id": "#9202a8c04000641f8000000000003b60"
}
|
Now, let's take this object identified by id, and ask for its name:
| Query | Result |
|---|---|
{
"id":"#9202a8c04000641f8000000000003b60",
"name":null
}
|
{
"id":"#9202a8c04000641f8000000000003b60",
"name":"Anarchism"
}
|
This simply returns the English name we started with: "Anarchism". Let's ask for all names:
| Query | Result |
|---|---|
{
"id":"#9202a8c04000641f8000000000003b60",
"name":[]
}
|
{
"id":"#9202a8c04000641f8000000000003b60",
"name":["Anarchism"]
}
|
This query just returns the unique English name in an array. So let's try again and ask for all names, along with the languages in which they are encoded:
| Query | Result |
|---|---|
{
"id":"#9202a8c04000641f8000000000003b60",
"name":[{}]
}
|
{
"id" : "#9202a8c04000641f8000000000003b60",
"name" : [
{"lang":"/lang/en","type":"/type/text",
"value":"Anarchism"},
{"lang":"/lang/es","type":"/type/text",
"value":"Anarquismo"},
{"lang":"/lang/fr","type":"/type/text",
"value":"Anarchisme"},
{"lang":"/lang/it","type":"/type/text",
"value":"Anarchismo"},
{"lang":"/lang/de","type":"/type/text",
"value":"Anarchismus"}
]
}
|
Bingo! We find that this object has names in English (en), Spanish (es), French (fr), Italian (it), and German (de).
Here's how we can ask for a name of the object in a specific language other than our preferred language:
| Query | Result |
|---|---|
{
"id":"#9202a8c04000641f8000000000003b60",
"name":{
"value":null,
"lang":"/lang/fr"
}
}
|
{
"id" : "#9202a8c04000641f8000000000003b60",
"name": {
"value": "Anarchisme",
"lang": "/lang/fr"
}
}
|
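If an application needs names in several languages, it may be simpler to fetch them all once with [{}] and select locally. The Python sketch below works on the name array returned earlier for "Anarchism"; the helper name_in and its fallback to English are our own assumptions.

```python
def name_in(names, lang, fallback="/lang/en"):
    """Pick the name for lang from a [{}]-style name array."""
    by_lang = {entry["lang"]: entry["value"] for entry in names}
    return by_lang.get(lang, by_lang.get(fallback))

# The name array from the response shown earlier in this section.
names = [
    {"lang": "/lang/en", "type": "/type/text", "value": "Anarchism"},
    {"lang": "/lang/es", "type": "/type/text", "value": "Anarquismo"},
    {"lang": "/lang/fr", "type": "/type/text", "value": "Anarchisme"},
    {"lang": "/lang/it", "type": "/type/text", "value": "Anarchismo"},
    {"lang": "/lang/de", "type": "/type/text", "value": "Anarchismus"},
]

print(name_in(names, "/lang/fr"))
print(name_in(names, "/lang/ja"))   # no such name here: falls back to English
```

Because an object has at most one name per language, building a dictionary keyed on lang loses nothing.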
We know how to ask "what are the names and lengths of the tracks on the album Synchronicity by The Police?". The query looks like this:
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track":[{"name":null, "length":null}]
}
Metaweb also allows us to ask "What are the names and lengths of the long songs on the album?" The query below includes a numeric constraint on the length property, and the freebase.com response only includes the two songs on the album that are longer than 300 seconds:
| Query | Result |
|---|---|
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track":[{
"name":null,
"length":null,
"length>":300
}]
}
|
{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{
"name" : "Synchronicity II",
"length" : 305.066
}, {
"name" : "Wrapped Around Your Finger",
"length" : 313.733
}]
}
|
The line "length>":300 in the query expresses a constraint to Metaweb: it specifies that the track must be longer than 300 seconds. In addition to >, you can also use < for less-than, and <= and >= for less-than-or-equal and greater-than-or-equal. Note, however, that no spaces are allowed before or after these punctuation characters.
This constraint syntax looks quite odd at first. It is a result of the limitations of the JSON format: everything must be expressed with property names, colons, and values. We would like to be able to express a constraint like:
"length" <= 300
But that is not legal JSON syntax, so we express it instead like this:
"length<=" : 300
You can include more than one numeric constraint on the same property, restricting the value to a range. Here's how we ask for songs that are at least three minutes long, but less than four:
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track":[{
"name":null,
"length":null,
"length>=":180,
"length<":240
}]
}
If you include a constraint on a property, you must also ask Metaweb to return the value of that property. You cannot, for example, ask: "List all songs longer than 5 minutes, but don't bother to tell me exactly how long they are."
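To see how the constraint syntax decomposes, the following Python sketch splits constraint keys into a property name and an operator, enforces the rule just stated, and applies the constraints to a local list of tracks. It is a local simulation only; split_constraint and filter_tracks are hypothetical names, and the track list is a subset of the Synchronicity response shown earlier.

```python
import operator

OPERATORS = {">=": operator.ge, "<=": operator.le,
             ">": operator.gt, "<": operator.lt}

def split_constraint(key):
    """Split 'length>=' into ('length', '>='); plain names give (key, None)."""
    for symbol in (">=", "<=", ">", "<"):
        if key.endswith(symbol):
            return key[:-len(symbol)], symbol
    return key, None

def filter_tracks(tracks, track_query):
    """Apply the comparison constraints in track_query to local data."""
    constraints = []
    for key, bound in track_query.items():
        prop, symbol = split_constraint(key)
        if symbol is None:
            continue
        if prop not in track_query:
            raise ValueError(f"constraint on {prop!r} requires {prop!r} in the query")
        constraints.append((prop, OPERATORS[symbol], bound))
    return [t for t in tracks
            if all(compare(t[prop], bound) for prop, compare, bound in constraints)]

tracks = [
    {"name": "Synchronicity I", "length": 203.533},
    {"name": "O My God", "length": 242.226},
    {"name": "Miss Gradenko", "length": 120.0},
    {"name": "Synchronicity II", "length": 305.066},
    {"name": "Wrapped Around Your Finger", "length": 313.733},
]

long_songs = filter_tracks(tracks, {"name": None, "length": None, "length>": 300})
print([t["name"] for t in long_songs])
```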
Numbers are not the only type that can be constrained with these operators. Here, for example, is a query that constrains a /type/datetime property to obtain a list of albums released in 1999:
[{
"type":"/music/album",
"name":null,
"artist":null,
"release_date":null,
"release_date>=":"1999-01-01",
"release_date<=":"1999-12-31"
}]
Metaweb queries can also place constraints on textual values. To do this use the pattern matching operator ~=. [6] To try this out, let's find some short songs about love:
[{
"type":"/music/track",
"artist":null,
"name":null,
"name~=":"love",
"length":null,
"length<":120
}]
Here's a query for songs about love recorded by bands whose name begins with "The":
[{
"type":"/music/track",
"artist":null,
"artist~=":"^The",
"name":null,
"name~=":"love"
}]
Results include If You Love Somebody, Set Them Free by The Police and I'm Sick of Love by The White Stripes.
Notice that the constraint on the artist property in the query above uses the ^ character to specify that the word The must appear at the beginning of the artist's name. If you're familiar with regular expressions, this might make you think that Metaweb supports pattern matching with regular expressions. In fact, Metaweb's matching syntax is closer to that used by internet search engines. Table 3.2 summarizes MQL pattern matching syntax. Note that all searches are case-insensitive.
Table 3.2. MQL Pattern Matching Syntax
| Pattern | Matches |
|---|---|
| love | Matches any string that contains the word "love". Does not match strings containing "glove" or "lover". |
| love* | Matches any string containing a word that begins with "love", such as "love", "lover" or "lovely". Does not match "glove". |
| *love | Matches any string containing a word that ends with "love", such as "love" or "glove". |
| *love* | Matches any string that contains "love", such as "love", "glove", "lover" and "glover". |
| * | Matches any single word. |
| love you | Matches any string that contains the phrase "love you". Does not match strings that contain "glove you", "love your", "you love", "love hate you" or "loveyou". |
| ^ | Matches the beginning of a string. |
| $ | Matches the end of a string. |
| - | A hyphen or other punctuation matches an optional space. |
| \ | Use a backslash to escape any punctuation character that you want to match literally. |
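To get a feel for these rules without running queries, it can help to approximate them on the client. The following Python sketch translates a subset of the pattern syntax into a regular expression. It is an approximation written for illustration, not Metaweb's implementation: the function names are hypothetical, "word" is assumed to mean a run of letters, digits, or underscores, and the punctuation and backslash rules from the table are deliberately left out.

```python
import re

def mql_pattern_to_regex(pattern):
    """Translate a subset of the ~= pattern syntax into a compiled regex.

    Handles words, phrases, the * wildcard, and the ^ and $ anchors.
    Punctuation and backslash escapes are NOT handled in this sketch.
    """
    at_start = pattern.startswith("^")
    at_end = pattern.endswith("$")
    core = pattern[1:] if at_start else pattern
    core = core[:-1] if at_end else core

    word_regexes = []
    for word in core.split():
        if word == "*":
            word_regexes.append(r"\w+")      # any single word
        else:
            # A * inside a word stands for any run of word characters.
            literal_parts = [re.escape(part) for part in word.split("*")]
            word_regexes.append(r"\w*".join(literal_parts))

    # Words must be whole words, in order, separated by whitespace.
    body = r"(?<!\w)" + r"\s+".join(word_regexes) + r"(?!\w)"
    if at_start:
        body = r"^\s*" + body
    if at_end:
        body = body + r"\s*$"
    return re.compile(body, re.IGNORECASE)   # all searches are case-insensitive

def matches(pattern, text):
    return mql_pattern_to_regex(pattern).search(text) is not None

print(matches("love", "I'm Sick of Love"))     # True
print(matches("love", "Glove"))                # False
print(matches("love you", "love your"))        # False
print(matches("^The *$", "The Police"))        # True
print(matches("^The *$", "The Beach Boys"))    # False
```

The sketch reproduces the word-boundary behavior described in the table, but results from the real service may differ wherever its notion of a word differs from the assumption made here.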
Here's a query to find all bands whose name is two words long and begins with the word The (such as The Police and The Clash):
[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The *$"
}]
What bands have three-word names that begin with "the" and end with a plural (e.g. The Beach Boys, The Doobie Brothers)?
[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The * *s$"
}]
In addition to matching text with ~=, string constraints can also be applied with the <, >, <=, and >= operators, which compare strings in case-insensitive, Unicode-aware alphabetical order. For example, to find bands whose name begins with one of the letters A through F, use this query:
[{
"type" : "/music/artist",
"name" : null,
"name>=" : "A",
"name<" : "G"
}]
Note that it is not legal to constrain the id property with either the pattern-matching operator or the greater-than and less-than operators.
Every Metaweb query for a set of values is implicitly limited to 100 values -- to reduce resource consumption and bandwidth usage, Metaweb does not return more values than this unless you explicitly ask for more. If you ran the query above for bands whose name begins with the letters A through F, you ran up against this limit. To change the number of desired results to a larger, or a smaller, number, use the limit directive. Here, for example, is a query that returns the names of up to 2000 bands:
[{
"type":"/music/artist",
"name":null,
"limit":2000
}]
limit is not a property name: it is a reserved word in MQL. No type may have a property named "limit". Limits can be useful to prune the result tree of values you aren't really interested in. The following query, for example, asks: "What bands have names that begin with 'The' and have recorded songs longer than 8 minutes? I'm only interested in the band name, so just give me one of the long songs, not the full list."
[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The",
"track": [{
"name":null,
"length":null,
"length>":480,
"limit":1
}]
}]
Note that we use a limit of one in the above. Specifying a limit of zero means "don't limit the results: return everything you've got". Although MQL allows you to ask for an unlimited number of results, Metaweb does not guarantee that you'll always get an answer. Complicated queries with a large number of results may time out before Metaweb can complete the result.
Since the limit directive must appear within curly braces, limiting a query sometimes requires you to transform a simple query into a slightly more complex one. Consider this query to list all albums by The Police:
{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}
If we want to limit the result to five albums, we must rewrite the query as follows:
{
"type" : "/music/artist",
"name" : "The Police",
"album" : [{"name":null, "limit":5}]
}
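If queries are built programmatically, this rewrite can be automated. The Python sketch below is a hypothetical helper (the name limit_property is not part of MQL or any Metaweb library) that converts an empty-array property into the object form and attaches a limit. It assumes the property is written either as [] or as a list holding one sub-query object.

```python
import copy

def limit_property(query, prop, n):
    """Return a copy of `query` in which `prop` is asked for as a list of
    at most `n` objects. (Hypothetical helper, for illustration only.)"""
    limited = copy.deepcopy(query)       # leave the caller's query untouched
    sub = limited[prop]
    if sub == []:
        # The limit directive needs curly braces to live in, so ask for
        # the name explicitly inside a sub-query object.
        sub = [{"name": None}]
    sub[0]["limit"] = n
    limited[prop] = sub
    return limited

albums = {"type": "/music/artist", "name": "The Police", "album": []}
print(limit_property(albums, "album", 5))
```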
Use the sort directive if you'd like the Metaweb server to sort the results of your query before returning them. For example, to ask for the names of the tracks on an album in alphabetical order, sort them by name:
// Tracks on the album Synchronicity, in alphabetical order
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": [{
"name":null,
"sort":"name"
}]
}
As you can see, the sort directive simply specifies the name of the property by which the sort is to be done. To order these same tracks from shortest to longest, use "length" as the sort key:
// Tracks on the album Synchronicity, from shortest to longest
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": [{
"name":null,
"length":null,
"sort":"length"
}]
}
Note that the query above includes "length":null. If you want to use a property as a sort key, you must query that property.
To reverse this order, precede the name of the sort key by a minus sign:
// Tracks on the album Synchronicity, from longest to shortest
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": [{
"name":null,
"length":null,
"sort":"-length"
}]
}
The sorts shown above are convenient, but could easily be duplicated on the client side. That is, you could request unordered results from Metaweb and sort them yourself. One situation in which the sort directive cannot be duplicated on the client is when it interacts with the limit directive. Result sets are truncated to the specified limit after the sort is applied. Use sort and limit together in queries like this:
// What is the longest track on Synchronicity?
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": {
"name":null,
"length":null,
"sort":"-length",
"limit":1
}
}
(Note that explicitly specifying a limit of 1 means that we can safely omit the square brackets from the query.)
Sorting need not be limited to a single sort key. To specify more than one key, use an array on the right-hand side of the sort directive:
// List all tracks by The Police, sorted by album name and track name
[{
"type":"/music/track",
"artist":"The Police",
"name":null,
"album":null,
"sort":["album","name"]
}]
If your query includes sub-queries, then the properties of those sub-queries can also be used as sort keys. The query below is a variation on the one above that uses this kind of hierarchically-named sort key:
// List all tracks by The Police, on albums released before 1990,
// sorted by album name and track name
[{
"type":"/music/track",
"artist":"The Police",
"name":null,
"album":{
"name":null,
"release_date":null,
"release_date<":"1990"
},
"sort":["album.name","name"]
}]
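For comparison, here is a rough Python sketch of what sort keys mean when applied on the client to a list of result objects. It is an illustration only: the function name and the sample data are hypothetical, and it assumes every sort key was queried so that its value is present in each result. It supports a leading minus sign, multiple keys, and dotted keys such as "album.name", and it truncates only after sorting, which mirrors how sort and limit interact.

```python
def sort_results(items, sort, limit=None):
    """Order a list of result objects by MQL-style sort keys, then truncate."""
    keys = [sort] if isinstance(sort, str) else list(sort)

    def lookup(item, path):
        # Follow a dotted key such as "album.name" into nested objects.
        for part in path.split("."):
            item = item[part]
        return item

    ordered = list(items)
    # Python's sort is stable, so applying the keys from last to first
    # gives the same order as one multi-key sort.
    for key in reversed(keys):
        descending = key.startswith("-")
        path = key[1:] if descending else key
        ordered.sort(key=lambda item: lookup(item, path), reverse=descending)
    return ordered if limit is None else ordered[:limit]

# Made-up sample data, not real query results
tracks = [
    {"name": "B", "length": 200, "album": {"name": "Y"}},
    {"name": "A", "length": 300, "album": {"name": "Y"}},
    {"name": "C", "length": 100, "album": {"name": "X"}},
]
print([t["name"] for t in sort_results(tracks, ["album.name", "name"])])  # ['C', 'A', 'B']
print([t["name"] for t in sort_results(tracks, "-length", limit=1)])      # ['A']
```

A client can only reproduce sort plus limit this way if it first fetches the complete, unlimited result set, which is why the server-side combination is the more useful one.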
Here is an example that uses the sort directive in two places:
// List all albums by The Police, along with the name of their longest track.
// Order the albums from longest longest track to shortest longest track.
[{
"type":"/music/album",
"artist":"The Police",
"name":null,
"track":{
"name":null,
"length":null,
"sort":"-length",
"limit":1
},
"sort":"-track.length"
}]
If you do not include a sort directive in a query then Metaweb returns unordered results to you. In practice, with the current implementation, values are returned in the order that they were added to the database. For ordered data, such as the list of tracks on an album, this insertion order is often non-random. But there is no guarantee that this will always be the case. If you don't ask for the data to be sorted, you should treat the result as an unordered set of values rather than an ordered list. [7]
Some data, such as the tracks on an album, have a natural order. If you want results to be sorted according to this natural ordering, use "sort":"index". (Or, to reverse the natural ordering, use "sort":"-index".)
// Return the tracks on the album Synchronicity in the order that they appear
{
"type":"/music/album",
"artist":"The Police",
"name":"Synchronicity",
"track":[{
"name":null,
"index":null,
"sort":"index"
}]
}
Since we've used "index" as a sort key, we must query the value of "index" as well. index is a keyword in MQL and is not a true property of any object. It can be queried, however, and when you do this, Metaweb returns a non-negative integer. It is important to understand that the notion of order does not apply to objects in Metaweb, but to the relationships between objects. It is the link between the album "Synchronicity" and the track "Mother" that has an index of 3, not the track itself. This becomes clear when you consider the case of a track that appears on more than one album. If "Mother" also appears on an album named "Greatest Hits" it is likely to have a different index on that album. [8]
Since index is not a true property, there are a lot of things you cannot do with it. You cannot constrain the index with property names index> or index<. MQL read queries may use index as a sort key, and they may query the index with "index":null, but may not use the keyword in any other way. You cannot write "index":1 to ask for the second item in a set, for example. (The index keyword can be used in other ways in write queries, however, and we'll learn about that in Chapter 5).
The index keyword can be used in conjunction with the limit directive. Consider the following query, which asks for the last two tracks on Synchronicity:
Query:
{
"type":"/music/album",
"artist":"The Police",
"name":"Synchronicity",
"track":[{
"name":null,
"index":null,
"sort":"-index",
"limit":2
}]
}
Result:
{
"type": "/music/album",
"artist": "The Police",
"name": "Synchronicity",
"track": [{
"index": 1,
"name": "Murder by Numbers"
},{
"index": 0,
"name": "Tea in the Sahara"
}]
}
The query above correctly returns the names of the final two tracks on the album Synchronicity. Look carefully, however, at the index values it returns: the last track is given an index of 1 and the penultimate track an index of 0. This is not a bug: this query simply reveals the true nature of ordered collections in Metaweb. Metaweb does not include an absolute index for each link. The implementation is able to say whether any link is greater-than or less-than another, but it cannot tell you the absolute position of that link within the complete set of links.
The number that Metaweb returns as the value of the index property is a synthetic one, generated by Metaweb as a simple way to express the order of elements. If Metaweb returns an array holding n elements, then it generates index values for those elements that range from 0 to n-1. For example, if you ask for the last two tracks on an album, the resulting values have indexes 0 and 1. If you ask for tracks that are shorter than 2 minutes and Metaweb finds three of them, then it will assign them index values of 0, 1, and 2. If you want to know the track number for the tracks on a particular album, you must query the complete set of tracks. Then add one to the index value to get the track number. If you want to know the track numbers of the short songs, you must query the complete set of tracks, and search for the short songs yourself.
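The client-side step described above is short. The Python sketch below assumes you have already fetched the complete track list of one album with "index":null, "length":null and "sort":"index". The function name and the sample data are made up for illustration and are not real query results.

```python
def short_tracks_with_numbers(tracks, max_length):
    """Given the COMPLETE index-sorted track list of one album, return
    (track number, name) pairs for tracks shorter than max_length seconds."""
    return [
        (track["index"] + 1, track["name"])   # index is 0-based, track numbers are 1-based
        for track in tracks
        if track["length"] < max_length
    ]

# Made-up stand-in for the "track" array of a query result
complete_track_list = [
    {"index": 0, "name": "First Song", "length": 250},
    {"index": 1, "name": "Second Song", "length": 110},
    {"index": 2, "name": "Third Song", "length": 305},
    {"index": 3, "name": "Fourth Song", "length": 95},
]
print(short_tracks_with_numbers(complete_track_list, 120))
# [(2, 'Second Song'), (4, 'Fourth Song')]
```

Adding a "length<" constraint to the query itself would have returned the same two songs, but with synthetic index values of 0 and 1, so the track numbers would have been lost.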
In addition to the limit and sort directives, MQL also includes an optional directive. If part of your query is not required to match, add "optional":true to it. For example, we can use the optional directive to ask the question: "What bands have names that begin with 'The', and do they have a Greatest Hits album?" The query looks like this:
[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The",
"album":[{
"name":null,
"name~=": "greatest hits",
"optional":true
}]
}]
Without the optional directive, the query would only return bands whose name begins with The who have released a Greatest Hits album. With the optional directive, we get all bands whose name begins with The, and additionally, we get the name of any albums they have released that include the phrase "greatest hits".
Optional queries can be nested inside optional queries. The following query is an extension to the one above. It further asks for the names of tracks longer than 5 minutes, if any exist, on the Greatest Hits album, if it exists.
[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The",
"album":[{
"name":null,
"name~=": "greatest hits",
"optional":true,
"track": [{
"length":null,
"length>":300,
"name":null,
"index":null,
"sort":"index",
"optional":true
}]
}]
}]
Note that it is legal, but never necessary or useful, to add "optional":false to a query. Also, it is never useful to use the optional directive in the top-level of a query. Queries are implicitly optional at that level: if Metaweb can't find a match, it returns an empty result.
Recall from the beginning of this tutorial that most objects in Metaweb have two or more types:
Query:
{
"id":"#9202a8c04000641f800000000006df1b",
"name":null,
"type":[]
}
Result:
{
"id":"#9202a8c04000641f800000000006df1b",
"name":"The Police",
"type":[
"/common/topic",
"/music/artist"
]
}
What do you do if you want to query one property, such as a list of albums, from one type, and another property, such as a list of images, from a second type? MQL addresses this issue by allowing you to specify a fully-qualified property name that includes the name of the type to which it belongs. So here is how we ask for the albums by, and pictures of, The Police:
{
"type":"/music/artist",
"name":"The Police",
"album":[],
"/common/topic/image":[{}]
}
The first line of this query specifies that the object to be matched should be of type /music/artist. The second line specifies the name of the object. name and type are properties of /type/object, and are shared by all objects in the database. These property names (along with id, guid, key, timestamp, creator, and permission) can always be used without qualification (although you can qualify them with /type/object if you want to). Other types are not allowed to define properties whose names conflict with these.
The third line of the query asks for a property named album. This property is not defined by /type/object, but it is defined by /music/artist, and the query has already declared that the object will be an instance of that type. The fourth line asks for a property named image. This is not defined by /type/object nor by /music/artist, and so we must qualify it with the name of its type so that Metaweb can understand it.
For symmetry, and to be explicit, you can rewrite the query to fully-qualify both properties of interest:
{
"type":"/music/artist",
"name":"The Police",
"/music/artist/album":[],
"/common/topic/image":[{}]
}
If you do this, you might be tempted to drop the initial type specification, since the album property is now fully-qualified:
[{
"name":"The Police",
"/music/artist/album":[],
"/common/topic/image":[]
}]
Notice that we've put the top-level query in square brackets now. This query will return any object whose name is The Police, even if it has no album or image properties, and even if it is an instance of neither /music/artist nor /common/topic.
Note that qualified property names use / as a delimiter and nested sort keys use . as a delimiter. If your query uses qualified property names and sorts by those names, you may end up using both delimiter characters. The following query is a variation on one shown earlier, in which two of the properties have been (unnecessarily) qualified. Note the lengthy sort key:
// Police songs from albums released before 1990, sorted by album name
[{
"type":"/music/track",
"artist":"The Police",
"name":null,
"/music/track/album":{
"/type/object/name":null,
"release_date":null,
"release_date<":"1990"
},
"sort":"/music/track/album./type/object/name"
}]
We saw wildcards earlier in this tutorial in the forms {"*":null} and {"*":{}}. But they are somewhat more versatile than that. Consider the following query:
Query:
{
"id":"#9202a8c04000641f8000000002f9e349",
"*":null
}
Result:
{
"id":"#9202a8c04000641f8000000002f9e349",
"guid":"#9202a8c04000641f8000000002f9e349",
"name": "Synchronicity",
"type": ["/music/album", "/common/topic"],
"key": [
"1299f319-8ff4-44fb-8440-7fb990972864",
"RELEASE3178"
],
"creator": "/user/mwcl_musicbrainz",
"permission": "/boot/all_permission",
"timestamp": "2006-11-30T13:42:18.0194Z"
}
This query identifies a unique object by ID, and then uses a wildcard to ask for all of its properties. Since no type has been specified, the wildcard is expanded with all the properties of /type/object, and the result is as shown above.
Note that some of the properties expand to a single value, and others to arrays. Thus the syntax "*":null really means "*":null-or-[]. We could instead write the query using "*":[]. In this case, all of the properties are returned as arrays, even unique properties.
Now let's modify the query to specify a type other than the default of /type/object:
{
"type":"/music/album",
"name":"Synchronicity",
"*":null
}
In this query, the * wildcard expands differently. Since we have specified that the object is of type /music/album, Metaweb looks up the properties of that type and queries each one with a null or [], depending on whether the property is unique or not. It does this in addition to querying the common object properties shown in the query result above.
Note that if a property is explicitly listed in a query, a wildcard expansion will not overwrite it. Consider this:
{
"type":"/music/album",
"name":"Synchronicity",
"track":[{}],
"*":null
}
This query explicitly asks for an array of tracks, as objects rather than just as track names. The expansion of the wildcard would normally include "track":[], but in this case that property would conflict with the explicitly specified one and will be left out of the expansion.
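The expansion rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Metaweb's implementation: the function name is hypothetical, and the schema dictionary, which maps each property name to whether it is unique, is a made-up stand-in for what Metaweb would look up from the type.

```python
def expand_wildcard(query, schema):
    """Expand "*": None in a query. `schema` maps property name -> True if
    the property is unique (single-valued), False otherwise."""
    if "*" not in query:
        return dict(query)
    expanded = {key: value for key, value in query.items() if key != "*"}
    for prop, unique in schema.items():
        if prop not in expanded:          # explicitly listed properties win
            expanded[prop] = None if unique else []
    return expanded

# Hypothetical, abbreviated schema for illustration
album_schema = {"name": True, "type": False, "artist": True, "track": False}

query = {
    "type": "/music/album",
    "name": "Synchronicity",
    "track": [{}],
    "*": None,
}
print(expand_wildcard(query, album_schema))
# {'type': '/music/album', 'name': 'Synchronicity', 'track': [{}], 'artist': None}
```

Only "artist" is added by the expansion here, because "type", "name" and "track" already appear in the query.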
Wildcards can also be used in a second, more aggressive, form. "*":{} expands to query each property with {} or [{}] instead of null or []. Similarly, "*":[{}] expands to query each property, even unique properties, with [{}]. Let's repeat the query with which we began this section, using "*":{} instead. With this query, each of the properties of /type/object is expanded into a complete object, and the result is much longer. The long response is reproduced here in its entirety because it serves as a useful review of the structure of some of the most fundamental Metaweb data types:
Query:
{
"id":"#9202a8c04000641f8000000002f9e349",
"*":{}
}
Result:
{
"id": "#9202a8c04000641f8000000002f9e349",
"guid": {
"type": "/type/id",
"value": "#9202a8c04000641f8000000002f9e349"
},
"name": {
"lang": "/lang/en",
"type": "/type/text",
"value": "Synchronicity"
},
"type": [{
"type": ["/type/type"],
"id": "/music/album",
"name": "Record album"
},{
"type": ["/type/type"],
"id": "/common/topic",
"name": "Topic"
}],
"key": [{
"type": "/type/key",
"namespace":
"/user/metaweb/datasource/MusicBrainz",
"value":
"1299f319-8ff4-44fb-8440-7fb990972864"
}, {
"type": "/type/key",
"namespace":
"/user/metaweb/datasource/MusicBrainz/name",
"value": "RELEASE3178"
}],
"creator": {
"type": ["/type/user"],
"id": "/user/mwcl_musicbrainz",
"name": null
},
"permission": {
"type": ["/type/permission"],
"id": "/boot/all_permission",
"name": "Global Write Permission"
},
"timestamp": {
"type": "/type/datetime",
"value": "2006-11-30T13:42:18.0194Z"
}
}
MQL queries use JSON properties to express constraints. Those constraints are implicitly ANDed together. Consider:
[{
"type":"/music/artist",
"name":null,
"name~=":"^The",
"album":"Greatest Hits"
}]
This query says: tell me the names of objects which have type "/music/artist" AND which have a name that begins with "The" AND which have an album named "Greatest Hits".
Suppose we want to find the names of all bands who have an album named "Greatest Hits" AND an album named "Super Hits". We might try this query:
[{
"type":"/music/artist",
"name":null,
"album":["Greatest Hits","Super Hits"] // Invalid MQL
}]
But this is not legal MQL. And if it were, it would probably mean "find an artist who has recorded exactly two albums, named 'Greatest Hits' and 'Super Hits'". A musical artist object may have multiple album links to album objects. We want to constrain our query so that all result objects have links to two specific album names. Here's a natural way to express this query:
[{
"type":"/music/artist",
"name":null,
"album":"Greatest Hits",
"album":"Super Hits" // Invalid JSON
}]
This query makes sense in the Metaweb object model: find objects that have one "album" link to an album named "Greatest Hits" and another "album" link to an album named "Super Hits". Unfortunately, this query is not valid JSON: since it includes the same property name twice, it cannot be parsed into object form.
MQL's solution to this dilemma is to allow an arbitrary identifier and colon to prefix any property name. The prefix and colon are ignored: they serve simply as a workaround to the JSON limitation just described. With this trick we can rewrite the query above like this:
Query:
[{
"type":"/music/artist",
"name":null,
"a:album":"Greatest Hits",
"b:album":"Super Hits",
"limit":2
}]
Result:
[{
"type": "/music/artist",
"name": "Alice Cooper",
"a:album": "Greatest Hits",
"b:album": "Super Hits"
},{
"type": "/music/artist",
"name": "Dan Fogelberg",
"a:album": "Greatest Hits",
"b:album": "Super Hits"
}]
Note that the arbitrary prefixes we choose for the query are repeated in the result objects. The prefixes are arbitrary, but they must be valid identifiers, which means they cannot contain punctuation characters and must not begin with a digit.
This property prefixing scheme is not limited to sets of two properties. And prefixed properties can include operator suffixes. Let's find bands that have lots of hits and have recorded Christmas albums:
[{
"type":"/music/artist",
"name":null,
"a:album":"Greatest Hits",
"b:album":"Super Hits",
"c:album~=":"christmas",
"c:album":[]
}]
Another use of property prefixes is to constrain a property and also query the property at the same time. Let's find bands that have released a Greatest Hits album, and also ask for the names of all the albums they have released:
[{
"type":"/music/artist",
"name":null,
"album":[],
"includes:album":"Greatest Hits"
}]
Note that although property prefixes are arbitrary, we can choose identifiers that add meaning to our queries.
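When post-processing results, it is sometimes useful to split a key such as "c:album~=" back into its prefix, property name, and operator. The Python sketch below shows one way to do that. The function name is hypothetical, and the prefix rule is an assumption based on the description above: a prefix is taken to be letters, digits, and underscores, not starting with a digit. The operators recognized are the ones used in this chapter.

```python
import re

KEY_RE = re.compile(
    r"^(?:(?P<prefix>[A-Za-z_][A-Za-z0-9_]*):)?"   # optional identifier prefix and colon
    r"(?P<prop>.+?)"                               # property name, possibly fully qualified
    r"(?P<op><=|>=|~=|\|=|<|>)?$"                  # optional operator suffix
)

def parse_key(key):
    """Split a query key into (prefix, property, operator); missing parts are None."""
    match = KEY_RE.match(key)
    if match is None:
        raise ValueError("not a recognizable query key: %r" % key)
    return match.group("prefix"), match.group("prop"), match.group("op")

print(parse_key("c:album~="))            # ('c', 'album', '~=')
print(parse_key("includes:album"))       # ('includes', 'album', None)
print(parse_key("length<="))             # (None, 'length', '<=')
print(parse_key("/music/artist/album"))  # (None, '/music/artist/album', None)
```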
At the beginning of this tutorial, we wrote a query to determine the types of the object that represents The Police. In order to do this, we first asked for the id of The Police, and then used the resulting guid to uniquely identify the object so we could ask for its types. Property prefixes make this easier:
Query:
{
"constraint:type":"/music/artist",
"name":"The Police",
"query:type":[]
}
Result:
{
"constraint:type": "/music/artist",
"name": "The Police",
"query:type": [
"/music/artist",
"/common/topic"
]
}
As an interesting aside, let's return to the query with which we started this section. We want to find bands that have released "Greatest Hits" and "Super Hits" albums. There is actually a way to do this without property prefixes. It relies on the fact that Metaweb relationships are always bi-directional and that MQL queries can be "turned inside out":
[{
"type":"/music/artist",
"name":null,
"album":[{
"name":"Greatest Hits",
"artist":{
"album":"Super Hits"
}
}]
}]
Translated into English, this query says: "give me the names of all bands that have released an album named 'Greatest Hits', the artist of which has released an album named 'Super Hits'". The album property of a band object refers to an album object. And the artist property of the album object refers back to the band object. We can use this fact to further constrain the artist. This technique (some would say "hack") is worth understanding because it illustrates one of the deep properties of Metaweb objects.
MQL has no general-purpose way to express an OR relationship. Suppose we want a list of bands who have released an album named "Greatest Hits" OR an album named "Super Hits". The ~= pattern matching operator does not have an OR syntax, and there is no way to specify that an album name match either "Greatest Hits" OR "Super Hits". The only way to do this is to make two queries and combine their results: [9]
[{
"type":"/music/artist",
"name":null,
"album":"Greatest Hits"
}]
[{
"type":"/music/artist",
"name":null,
"album":"Super Hits"
}]
Combining the results of two queries is fairly straightforward. The only tricky issue is avoiding duplicates. If a band appears in the results of both queries, for example, you would want to take care that it did not appear twice in the combined result.
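A minimal Python sketch of the combining step follows. It assumes that both queries were extended with "id":null so that each result object carries a unique identifier to deduplicate on. The function name and the sample data, including the ids, are made up for illustration.

```python
def union_by_id(*result_lists):
    """Merge several query results, keeping the first occurrence of each id."""
    seen = set()
    combined = []
    for results in result_lists:
        for obj in results:
            if obj["id"] not in seen:
                seen.add(obj["id"])
                combined.append(obj)
    return combined

# Made-up stand-ins for the two parsed query results
greatest = [{"id": "/example/band_a", "name": "Band A"},
            {"id": "/example/band_b", "name": "Band B"}]
super_hits = [{"id": "/example/band_b", "name": "Band B"},
              {"id": "/example/band_c", "name": "Band C"}]

print([band["name"] for band in union_by_id(greatest, super_hits)])
# ['Band A', 'Band B', 'Band C']
```

Deduplicating on id rather than on name is a deliberate choice in this sketch, since two different bands may share a name.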
Combining property prefixes with a pattern matching operator and the optional directive, we can achieve something vaguely like an OR operation:
[{
"type":"/music/artist",
"name":null,
"album":[],
"album~=":"hits",
"great:album":[{
"name":"Greatest Hits",
"optional":true
}],
"superb:album":[{
"name":"Super Hits",
"optional":true
}]
}]
This query returns all bands that have any albums whose name includes the word "hits". It returns the names of those albums, and includes optional sub-queries for the particular names we're interested in.
We began the discussion of expressing OR in queries by saying that MQL has no general-purpose syntax for expressing OR. There is, however, a specialized syntax for expressing OR, and it is useful in a number of situations. |= can be used at the end of a property as a constraint like > or ~=. The value of such a constrained property should be a JSON array of strings. The constraint says "match any one of the values in this array". (That is: match the first value OR the second value OR the third value...)
The reason that this is not a general-purpose way to express OR in MQL is that it only works when the strings in the array are object ids or guids. The meaning and use of the |= constraint becomes much clearer with some examples. One straightforward use is to run the same query over multiple objects that are specified by id. The following query asks for the properties of three types:
[{
"id|=":["/type/type", "/type/property", "/type/key"],
"id":null,
"/type/type/properties":[]
}]
This next example asks for the ids of GIF or PNG (but not JPEG) images:
[{
"type":"/common/image",
"id":null,
"/type/content/media_type":null,
"/type/content/media_type|=":[
"/media_type/image/gif",
"/media_type/image/png"
]
}]
Finally, here is an example that uses both the |= constraint to express an OR and uses a property prefix to express AND. It asks for the French and Spanish translations of the country name "England":
Query:
{
"type":"/location/country",
"english:name": "England",
"foreign:name": [{
"value":null,
"lang":null,
"lang|=":["/lang/fr","/lang/es"]
}]
}
Result:
{
"type":"/location/country",
"english:name":"England",
"foreign:name":[{
"value":"Angleterre",
"lang":"/lang/fr"
},{
"value":"Inglaterra",
"lang":"/lang/es"
}]
}
Metaweb has no syntax to perform logical NOT operations. In general, with a huge universe of knowledge, the NOT of a result may be a very, very large set of objects. There is no way to write a single query that says: list all bands who have a Greatest Hits album but do not have an album that includes the word "Best". To do this, you'd first query the bands with a Greatest Hits album. Then you'd query the bands who have "album~=":"best", and then you'd subtract the results of the second query from those of the first.
As another example, suppose you wanted to know what bands had an album named "Greatest Hits", but wanted to exclude all country music. You could do one query for Greatest Hits albums, and then do another for all country music bands (using the /music/artist/genre property), and then subtract the second result from the first. This is not particularly efficient, since there are probably a whole lot of country music bands. Better would be a single query for albums named "Greatest Hits" that also asks for the genre of the album (with /music/album/genre). Then, parse the JSON result, and post-process it yourself to remove albums whose genre is "country".
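Both client-side strategies are sketched below in Python. The function names and the sample data are hypothetical. The first function assumes each result object carries an "id" (that is, "id":null was added to both queries), and the second assumes the genre was queried as an array of names.

```python
def subtract_by_id(keep, exclude):
    """Return the objects in `keep` whose id does not appear in `exclude`."""
    excluded_ids = {obj["id"] for obj in exclude}
    return [obj for obj in keep if obj["id"] not in excluded_ids]

def drop_genre(albums, genre):
    """Remove albums whose "genre" list contains `genre`."""
    return [album for album in albums if genre not in album.get("genre", [])]

# Made-up stand-ins for parsed query results
with_greatest_hits = [{"id": "/example/band_a"}, {"id": "/example/band_b"}]
with_best_album = [{"id": "/example/band_b"}]
print(subtract_by_id(with_greatest_hits, with_best_album))
# [{'id': '/example/band_a'}]

albums = [
    {"name": "Greatest Hits", "artist": "Band A", "genre": ["rock"]},
    {"name": "Greatest Hits", "artist": "Band B", "genre": ["country", "folk"]},
]
print([album["artist"] for album in drop_genre(albums, "country")])
# ['Band A']
```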
If you've enjoyed making queries against Freebase's repository of musical knowledge, you might also enjoy querying the underpinnings of the Metaweb infrastructure. Types, properties and namespaces are all Metaweb objects, and they can all be queried just like other objects. Here, for example, is how we find the properties of /type/object: [10]
Query:
{
"type":"/type/type",
"id":"/type/object",
"properties":[]
}
Result:
{
"type": "/type/type",
"id": "/type/object",
"properties": [
"/type/object/id",
"/type/object/guid",
"/type/object/type",
"/type/object/name",
"/type/object/key",
"/type/object/timestamp",
"/type/object/permission",
"/type/object/creator"
]
}
And let's ask about the name property:
Query:
{
"type":"/type/property",
"id":"/type/object/name",
"*":null
}
Result:
{
"type": "/type/property",
"id": "/type/object/name",
"guid": "#9202a8c04000641f80000000000000ca",
"name": "name",
"key": ["display_name", "name"],
"expected_type": "/type/text",
"unique": true,
"schema": "/type/object",
"master_property": null,
"reverse_property": [],
"creator": "/user/root",
"permission": "/boot/root_permission",
"timestamp": "2006-11-30T12:43:53.0081Z"
}
This kind of reflective query is not only useful for exploring the Metaweb infrastructure, but can be helpful in understanding the schemas of types you actually want to use. Suppose you know that the type /music/album has a property named track, and you want to know what type a track is so that you can query tracks directly. This is actually a very easy query, if you understand how types and properties work:
Query:
{
"id":"/music/album/track",
"/type/property/expected_type":null
}
Result:
{
"id":"/music/album/track",
"/type/property/expected_type":"/music/track"
}
Note that we omitted a type specification from this query and instead simply used the fully-qualified name of the expected_type property.
If you were planning to write a program that made many music-related queries, you might first want to explore all of Freebase's music-related types. But where do you get a list? You query the domain: [11]





