Advanced Search
The Commons allows for advanced searching. To activate advanced search, your search terms must contain a colon ":". The basic principle is that search terms are entered as field:search. Example:
title:water
Commons Search Fields
The search fields available on the Commons search are listed in the table below. You can also search on "text", as shown in examples following the table.
Field Name | Description | Example Search |
---|---|---|
title | The resource title. | title:road |
notes | The resource abstract/description. | notes:water |
tags | The keyword tags shown in ovals below description in the resource records. | tags:elevation |
groups | The resource category. Search using the code values listed in the topic category codes. | groups:biota |
organization | The publishing organization. Search using the code values listed on the publisher codes page, but note, you must replace the underscore with a dash (us_mn_state_mda must be us-mn-state-mda). | organization:us-mn-state-mda |
res_format | The resource format. Search using the code values listed on the resource formats page. | res_format: wms |
license_id | The resource license type. Search using the code values listed on the license page. | license_id: notspecified |
extras_dsPurpose | The purpose of the resource as shown in the "purpose" section in the Additional Info section on the resource page. | extras_dsPurpose:school |
extras_dsPeriodOfContent | The date of the content of the resource as shown in the "date of content" section in the Additional Info section on the resource page. | extras_dsPeriodOfContent:2014 |
extras_dsAccessConst | The access constraints of the resources as shown in the "access constraints" section in the Additional Info section on the resource page. | extras_dsAccessConst:none |
extras_dsOriginator | The resource's originating organization as shown in the "originating organization" section on the resource page. | extras_dsOriginator:Agriculture |
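The underscore-to-dash rule for organization codes can be applied mechanically. The following is a minimal Python sketch, assuming the publisher code is copied unchanged from the publisher codes page. The function name organization_query is hypothetical and is not part of the Commons.

```python
def organization_query(publisher_code: str) -> str:
    """Build an organization search term from a publisher code.

    Hypothetical helper: the search expects dashes where the
    publisher codes page shows underscores.
    """
    return "organization:" + publisher_code.strip().replace("_", "-")


print(organization_query("us_mn_state_mda"))  # organization:us-mn-state-mda
```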
Search Functions
Advanced search supports special characters that provide the search functions described below.
Multiple character wild card search
To perform a multiple character wild card search, use the "*" symbol. A multiple character wild card search matches 0 or more characters in place of the "*". Example:
title:well*
Single character wild card search
To perform a single character wild card search, use the "?" symbol. A single character wild card search matches terms in which the "?" is replaced by exactly one character. Example:
author:eri?sson
This will match both Ericsson and Eriksson, as well as any other spelling that has a single character in that position.
Although the Geospatial Commons does not support an author search, this example is included to illustrate the single character wild card search.
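The two wild cards behave much like shell-style patterns, so their matching rules can be tried out locally. The sketch below uses Python's standard fnmatch module as a stand-in; it only illustrates the "*" and "?" semantics described above and is not how the Commons evaluates a search. The function name wildcard_match is hypothetical.

```python
from fnmatch import fnmatchcase


def wildcard_match(term: str, pattern: str) -> bool:
    """Return True if term matches a pattern using "*" and "?".

    "*" stands for 0 or more characters, "?" for exactly one.
    Note: fnmatch also gives "[...]" a special meaning, so this
    sketch is only meant for patterns without square brackets.
    """
    return fnmatchcase(term, pattern)


for term in ["well", "wells", "wellhead", "wel"]:
    print(term, wildcard_match(term, "well*"))

for term in ["ericsson", "eriksson", "erisson"]:
    print(term, wildcard_match(term, "eri?sson"))
```

In this sketch "wel" fails the first pattern because the literal part "well" is incomplete, and "erisson" fails the second because "?" requires one character to be present.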
Fuzzy Searches
The Geospatial Commons supports fuzzy searches based on the Levenshtein distance, or edit distance, algorithm. To do a fuzzy search, use the tilde, "~", symbol at the end of a single-word term. Example:
author:powell~
This search will also find "jowell" and many other similar terms. Although the Geospatial Commons does not support an author search, this example is included to illustrate the fuzzy search.
An additional (optional) parameter can specify the required similarity. The value is between 0 and 1. The closer the value is to 1, the more similar a term must be to match. For example:
title:roam~0.8
If the parameter is omitted, the default value of 0.5 is used.
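To make the idea of edit distance concrete, the sketch below computes the Levenshtein distance between two terms in Python. It is a textbook dynamic-programming version written for illustration, not the search engine's own code, and it does not reproduce how the engine converts a distance into the 0 to 1 similarity value.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1
    # characters of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


print(levenshtein("powell", "jowell"))  # 1
print(levenshtein("roam", "foam"))      # 1
print(levenshtein("roam", "roams"))     # 1
```

A distance of 1 means the two terms differ by a single edit, which is why a fuzzy search for "powell" can also return "jowell".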
Proximity searches
The Geospatial Commons supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde, "~", symbol at the end of a phrase. Example:
To search for "flood" and "maps" within 5 words of each other in the abstract of a dataset, use the search:
notes:"flood maps"~5
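The idea behind a proximity search can be sketched in a few lines of Python. This is a simplified illustration that treats "within N words" as a difference of at most N word positions; the search engine's own distance calculation differs in its details (for example, in how word order is counted). The function name within_distance and the sample sentences are invented for this example.

```python
import re


def within_distance(text: str, first: str, second: str, max_distance: int) -> bool:
    """Return True if both words occur in text at most
    max_distance word positions apart (simplified rule)."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


near = "Flood hazard and inundation maps for the county"
far = "Flood records were compiled by several state agencies before any maps existed"

print(within_distance(near, "flood", "maps", 5))  # True
print(within_distance(far, "flood", "maps", 5))   # False
```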
Range Searches
Range Queries allow one to match datasets whose field(s) values are between the lower and upper bound specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically. Example:
title:[2005 TO 2010] (inclusive search)
title:{2005 TO 2010} (exclusive search)
Please note that the "TO" keyword needs to be spelled in CAPITAL letters.
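Because range bounds are compared lexicographically (character by character, as text) rather than numerically, some matches can be surprising. The Python sketch below mimics that comparison with ordinary string comparison; the function name in_range is hypothetical and the sketch is only an illustration of lexicographic ordering, not of the search engine's implementation.

```python
def in_range(value: str, lower: str, upper: str, inclusive: bool = True) -> bool:
    """Compare strings lexicographically, as a [lower TO upper]
    (inclusive) or {lower TO upper} (exclusive) range would."""
    if inclusive:
        return lower <= value <= upper
    return lower < value < upper


print(in_range("2007", "2005", "2010"))                   # True
print(in_range("2005", "2005", "2010"))                   # True: bound included
print(in_range("2005", "2005", "2010", inclusive=False))  # False: bound excluded
print(in_range("201", "2005", "2010"))                    # True: text order, not numeric
print(in_range("300", "2005", "2010"))                    # False: "3" sorts after "2"
```

The fourth line shows the main pitfall: as text, "201" falls between "2005" and "2010" even though the number 201 does not.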
Boolean Operators
Boolean operators allow terms to be combined through logic operators. The Geospatial Commons supports AND, "+", OR, NOT and "-" as Boolean operators (Note: Boolean operators must be ALL CAPS). The AND operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the AND operator is used.
OR
The OR operator links two terms and finds a matching dataset if either of the terms exists in a dataset. This is equivalent to a union using sets. The symbol || can be used in place of the word OR. Example:
To search for datasets that contain either "2005" or "2007" in their title use the query:
title:2005 OR title:2007
AND
The AND operator matches datasets where both terms exist anywhere in the text of a single dataset. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND. Example:
To search for datasets that contain both "2005" and "2007" in their title use the query:
title:2005 AND title:2007
or
title:2005 title:2007
Both searches are identical because AND is the default operator.
+
The "+" or required operator requires that the term after the "+" symbol exist somewhere in the field of a single dataset. Example:
To search for datasets that must contain "survey" and may contain "economy" use the query:
+text:survey text:economy
NOT
The NOT operator excludes datasets that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT. Example:
To search for datasets that contain "2005" but not "2007" in their title use the query:
title:2005 NOT title:2007
Note: The NOT operator can also be used with just one term. Example:
The following search will return all results whose originating organization does not contain the word "Agriculture" (for example, it excludes the Department of Agriculture).
NOT extras_dsOriginator:Agriculture
-
The "-" or prohibit operator excludes datasets that contain the term after the "-" symbol. Example:
To search for datasets that contain "2005" but not "2007" in their title use the query:
title:2005 -title:2007
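The set analogy used for OR, AND, and NOT above can be tried directly with Python sets. The sketch below uses a handful of invented dataset titles (they are not real Commons records) and a hypothetical helper named matching; it illustrates the logic only, not how the search engine evaluates a query.

```python
# Invented sample titles, keyed by a made-up dataset id.
titles = {
    1: "land cover 2005",
    2: "land cover 2007",
    3: "survey 2005 and 2007",
    4: "road centerlines",
}


def matching(term: str) -> set:
    """Ids of the sample datasets whose title contains the word."""
    return {ds_id for ds_id, title in titles.items() if term in title.split()}


has_2005 = matching("2005")
has_2007 = matching("2007")

either = has_2005 | has_2007     # like title:2005 OR title:2007 (union)
both = has_2005 & has_2007       # like title:2005 AND title:2007 (intersection)
only_2005 = has_2005 - has_2007  # like title:2005 NOT title:2007 (difference)

print(sorted(either))     # [1, 2, 3]
print(sorted(both))       # [3]
print(sorted(only_2005))  # [1]
```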
Grouping
The Geospatial Commons supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the boolean logic for a query. Example:
To search for datasets whose title contains "2005" and either "2007" or "survey", use the query:
title:2005 AND (title:2007 OR title:survey)
Field Grouping
The Geospatial Commons supports using parentheses to group multiple clauses to a single field. To search for a title that contains both the word "return" and the phrase "pink panther" use the query:
title:(+return +"pink panther")
Default field
The field called text is a virtual field containing all text from all other fields. This field also serves as the default field if you do not specify a field name. Example:
title:2005 AND survey
is equivalent to
title:2005 AND text:survey
Warning: If you don't specify at least one field, your query won't contain a colon and the Geospatial Commons will switch to simple search. This might lead to very surprising results, because simple search ignores most of the special characters used in advanced search.
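Since the presence of a colon is what activates advanced search, a query can be checked before it is submitted. The following Python sketch is a hypothetical pre-check based only on that rule as described on this page; the function name uses_advanced_search is invented and the Commons may apply additional rules.

```python
def uses_advanced_search(query: str) -> bool:
    """Return True if the query contains a colon, which this
    page describes as the trigger for advanced search."""
    return ":" in query


print(uses_advanced_search("title:2005 AND survey"))  # True
print(uses_advanced_search("2005 AND survey"))        # False: falls back to simple search
```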
Search examples and available fields
Fields available for searching in the Commons are listed in the table at the top of this page. Some search fields given in the examples are not implemented in the Commons (for example, the author search, because author information is not represented in the Commons). The types of search illustrated with those fields (single character wild card search and fuzzy search) are implemented in the Commons, but only for the search fields listed in the table.
Further reading
The Geospatial Commons uses Apache Solr as its search engine. For further details check the documentation at
- http://lucene.apache.org/core/3_6_0/queryparsersyntax.html
- http://wiki.apache.org/solr/SolrQuerySyntax
- http://wiki.apache.org/solr/DisMaxQParserPlugin (for simple search)
Note that not all of the functionality described in the Solr documentation is offered through the simplified search interface in the Commons.
Citation: This Advanced Search Help document was modified for the Minnesota Geospatial Commons from https://github.com/kata-csc/kata-packaging/blob/master/doc/search-help.txt