We try to make search and filtering easy, but we also recognize that some situations require more complex queries and strategies. For these situations we provide access to advanced search and filter syntax.

On the Topic Filter view you will find an option for "advanced mode".  Selecting this option provides an expanded input area for entering queries.  The input area will further expand as needed to accommodate large queries.  We have included explanations and examples of the available syntax below.  

We are interested in how you use Attensa and would appreciate any feedback you might have, including suggestions for improvements.

Terms

A query or filter is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.

A Single Term is a single word such as "test" or "hello".

A Phrase is a group of words surrounded by double quotes such as "hello dolly".

Multiple terms can be combined together with Boolean operators to form a more complex query.

Attensa Query and Filter supports fielded data.

Specifying Fields in a Query

Data is organized in fields. Searches can take advantage of fields to add precision to queries. For example, you can search for a term only in a specific field, such as a title field.

To specify a field, type the field name followed by a colon ":" and then the term you are searching for within the field.

For example, suppose an index contains two fields, title and text. If you want to find a document called "The Right Way" which contains the text "don't go this way," you could include the following terms in your search query:

title:"The Right Way" AND text:go
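
If you generate queries from a script, a fielded query like the one described above can be assembled with a small helper. This is a minimal sketch: the function name field_query is hypothetical and is not part of Attensa.

```python
def field_query(field: str, term: str) -> str:
    """Return `field:term`, wrapping the term in double quotes if it is a phrase."""
    if any(ch.isspace() for ch in term):
        term = f'"{term}"'
    return f"{field}:{term}"


# Combine two fielded terms with the AND operator.
query = " AND ".join([
    field_query("title", "The Right Way"),
    field_query("text", "go"),
])
print(query)  # title:"The Right Way" AND text:go
```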

Attensa Search fields:

Fields

title: Use this for finding terms that must be in the title of an article.

author: When you want to find all articles by a specific author.

text: Searches all fields for the specified term.

publishedDate: The date the article was published. As of right now, the date must be in ISO 8601 Combined Date and Time format. To search by a date range put the range in brackets. Example: [2014-08-11T23:59:59.999Z TO NOW]

created: The date the article was created in Attensa. This may be different from the publish date. For example, an article that was published on January 1 but added to Attensa on January 2 will have a created date of January 2. This can be useful for finding items that were added to the system from a certain date onward, regardless of their publish date. For a Stream, this is the date the Stream was created. The date must be in ISO 8601 Combined Date and Time format. To search by a date range put the range in brackets. Example: [2014-08-11T23:59:59.999Z TO NOW]

followersCount: The number of people following this Stream. Examples: [5 TO 50] and [* TO 10]

itemsCount: The number of articles contained within this Stream. Examples: [800 TO 1000] and [1000 TO *]
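
The date and count fields above all take bracketed ranges, and the date fields expect ISO 8601 timestamps. The sketch below shows one way to produce those strings from Python. The helper names iso8601_z and range_clause are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone


def iso8601_z(moment: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 UTC with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def range_clause(field: str, lower: str = "*", upper: str = "*") -> str:
    """Build an inclusive range clause; '*' leaves that end of the range open."""
    return f"{field}:[{lower} TO {upper}]"


start = datetime(2014, 8, 11, 23, 59, 59, 999000, tzinfo=timezone.utc)
print(range_clause("publishedDate", iso8601_z(start), "NOW"))
# publishedDate:[2014-08-11T23:59:59.999Z TO NOW]
print(range_clause("followersCount", upper="10"))
# followersCount:[* TO 10]
```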

Wildcard Searches

The query parser supports single and multiple character wildcard searches within single terms. Wildcard characters can be applied to single terms, but not to search phrases. 

The single character wildcard matches exactly one character. The special character is '?'. For example, the search string te?t would match both test and text.

The multiple character wildcard matches zero or more sequential characters. The special character is '*'. For example, the wildcard search test* would match test, testing, and tester. You can also use wildcards at the start or in the middle of a term. For example, te*t would match text and test, while *est would match test and pest.
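
To check what a wildcard pattern would match before using it in a filter, the two wildcard rules can be imitated with a regular expression. This is only an illustration of the matching rules as described here, not the query parser's actual implementation, and the function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate '?' (exactly one character) and '*' (zero or more) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(pattern: str, term: str) -> bool:
    """Return True when the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


for word in ["test", "text", "testing", "pest"]:
    print(word, matches("te?t", word), matches("test*", word), matches("*est", word))
```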

Fuzzy Searches

Attensa's query parser supports fuzzy searches based on the Damerau-Levenshtein Distance, or Edit Distance, algorithm. Fuzzy searches discover terms that are similar to a specified term without necessarily being an exact match. To perform a fuzzy search, use the tilde ~ symbol at the end of a single-word term. For example, to search for a term similar in spelling to "roam," use the fuzzy search:

roam~

This search will match terms like roams, foam, and foams. It will also match the word "roam" itself.

An optional distance parameter specifies the maximum number of edits allowed, between 0 and 2, defaulting to 2. For example:

roam~1

This will match terms like roams and foam, but not foams, since it has an edit distance of 2.
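
The edit distances quoted above can be checked by hand or with a short script. The sketch below computes the optimal string alignment variant of Damerau-Levenshtein distance, which counts insertions, deletions, substitutions, and swaps of adjacent characters as one edit each. Which exact variant the query parser uses is an assumption here.

```python
def edit_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        dist[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


for candidate in ["roam", "roams", "foam", "foams"]:
    print(candidate, edit_distance("roam", candidate))
```

With these distances, roam~1 would accept roams and foam (distance 1) and reject foams (distance 2).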

Proximity Searches

A proximity search looks for terms that are within a specific distance from one another.

To perform a proximity search, add the tilde character ~ and a numeric value to the end of a search phrase. For example, to search for "Oregon" and "Portland" within 10 words of each other in a document, use the search:

"Portland Oregon"~10

The distance referred to here is the number of term movements needed to match the specified phrase. In the example above, if "Oregon" and "Portland" were 10 spaces apart in a field, but "Oregon" appeared before "Portland", more than 10 term movements would be required to move the terms together and position "Oregon" to the right of "Portland" with a space in between.
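
The effect of word order on the distance can be illustrated for a two-word phrase. The sketch below assumes the distance is counted the way Lucene-style phrase matching is commonly described: each term's position is compared with where it would sit in an exact phrase match. Treat the formula as an assumption rather than a statement of how Attensa computes it; the function name is hypothetical.

```python
from typing import List, Optional


def required_distance(tokens: List[str], first: str, second: str) -> Optional[int]:
    """Smallest distance value at which the phrase "first second" would match tokens."""
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second]
    # In an exact match, `second` sits one position after `first`.
    distances = [abs(q - p - 1) for p in first_positions for q in second_positions]
    return min(distances) if distances else None


print(required_distance("Portland Oregon".split(), "Portland", "Oregon"))
print(required_distance("Oregon Portland".split(), "Portland", "Oregon"))
print(required_distance("Portland is in Oregon".split(), "Portland", "Oregon"))
```

Under this assumption, terms in reversed order always need a larger distance value than the same terms in query order with the same gap between them.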

Range Searches

A range search specifies a range of values for a field (a range with an upper bound and a lower bound). The query matches documents whose values for the specified field or fields fall within the range. Range queries can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically, except on numeric fields. For example, the range query below matches all documents whose mod_date field has a value between 20020101 and 20030101, inclusive.

mod_date:[20020101 TO 20030101]

Range queries are not limited to date fields or even numerical fields. You could also use range queries with non-date fields:

title:{Aida TO Carmen}

This will find all documents whose titles are between Aida and Carmen, but not including Aida and Carmen.

The brackets around a query determine its inclusiveness.

Square brackets [ ] denote an inclusive range query that matches values including the upper and lower bound.

Curly brackets { } denote an exclusive range query that matches values between the upper and lower bounds, but excluding the upper and lower bounds themselves.

You can mix these types so one end of the range is inclusive and the other is exclusive. Here's an example: title:[Aida TO Carmen}
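
The bracket rules can be summarized in a few lines of code. The sketch below evaluates a value against a range written in the syntax above, using plain lexicographic string comparison. It is an illustration of the inclusive and exclusive rules only; the function name in_range is hypothetical.

```python
def in_range(value: str, clause: str) -> bool:
    """Check a value against a range such as '[Aida TO Carmen}' (lexicographic)."""
    if clause[0] not in "[{" or clause[-1] not in "]}":
        raise ValueError("range must start with [ or { and end with ] or }")
    include_lower = clause[0] == "["
    include_upper = clause[-1] == "]"
    lower, upper = clause[1:-1].split(" TO ")
    # '*' leaves that end of the range open.
    above = lower == "*" or (value >= lower if include_lower else value > lower)
    below = upper == "*" or (value <= upper if include_upper else value < upper)
    return above and below


print(in_range("Aida", "{Aida TO Carmen}"))  # False: exclusive lower bound
print(in_range("Aida", "[Aida TO Carmen}"))  # True: inclusive lower bound
print(in_range("Carmen", "[Aida TO Carmen}"))  # False: exclusive upper bound
```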

Boosting a Term with ^

The query parser provides the relevance level of matching documents based on the terms found. To boost a term, use the caret symbol ^ with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.

Boosting allows you to control the relevance of a document by boosting its term. For example, if you are searching for "Portland Oregon" and you want the term "Portland" to be more relevant, you can boost it by adding the ^ symbol along with the boost factor immediately after the term. For example, you could type:

Portland^4 Oregon

This will make documents with the term Portland appear more relevant. You can also boost Phrase Terms as in the example:

"Portland Oregon"^4 "Oregon Pearl"

By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (for example, it could be 0.2).
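
When boosts are added programmatically, it helps to enforce the positive-factor rule in one place. The helper below is a hypothetical sketch, not an Attensa function.

```python
def boost(term: str, factor: float) -> str:
    """Append ^factor to a term or phrase; the factor must be positive."""
    if factor <= 0:
        raise ValueError("boost factor must be positive")
    # Phrases need double quotes so the boost applies to the whole phrase.
    if any(ch.isspace() for ch in term) and not term.startswith('"'):
        term = f'"{term}"'
    return f"{term}^{factor:g}"


print(boost("Portland", 4) + " Oregon")  # Portland^4 Oregon
print(boost("Portland Oregon", 4))       # "Portland Oregon"^4
print(boost("Pearl", 0.2))               # Pearl^0.2
```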

Boolean Operators Supported by the Query Parser

Boolean operators allow you to apply Boolean logic to queries, requiring the presence or absence of specific terms or conditions in fields in order to match documents. Attensa supports AND, "+", OR, NOT and "-" as Boolean operators.

When specifying Boolean operators with keywords such as AND or NOT, the keywords must appear in all uppercase.

The Boolean Operator OR

The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.

To search for documents that contain either "Portland Oregon" or just "Portland," use either of the following queries:

"Portland Oregon" Portland

"Portland Oregon" OR Portland

The Boolean Operator +

The + symbol (also known as the "required" operator) requires that the term after the + symbol exist somewhere in a field in at least one document in order for the query to return a match.

For example, to search for documents that must contain "Portland" and that may or may not contain "Pearl," use the following query:

+Portland Pearl

The Boolean Operator AND (&&)

The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.

To search for documents that contain "Portland Oregon" and "Oregon Pearl," use either of the following queries:

"Portland Oregon" AND "Oregon Pearl"

"Portland Oregon" && "Oregon Pearl"

The Boolean Operator NOT (!)

The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.

The following queries search for documents that contain the phrase "Portland Oregon" but do not contain the phrase "Oregon Pearl":

"Portland Oregon" NOT "Oregon Pearl"

"Portland Oregon" ! "Oregon Pearl"

The Boolean Operator -

The - symbol or "prohibit" operator excludes documents that contain the term after the - symbol.

For example, to search for documents that contain "Portland Oregon" but not "Oregon Pearl," use the following query:

"Portland Oregon" -"Oregon Pearl"

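The set comparisons used in the operator descriptions above (union, intersection, difference) can be made concrete with a toy collection. The documents and the simple substring matching below are invented for illustration and do not reflect how the search index actually stores or matches text.

```python
# Toy collection: document id -> text (made-up data).
docs = {
    1: "Portland Oregon has the Oregon Pearl district",
    2: "Portland Oregon weather",
    3: "Oregon Pearl galleries",
    4: "Seattle Washington",
}


def containing(phrase: str) -> set:
    """Ids of documents whose text contains the phrase."""
    return {doc_id for doc_id, text in docs.items() if phrase in text}


portland_oregon = containing("Portland Oregon")
oregon_pearl = containing("Oregon Pearl")

print(sorted(portland_oregon | oregon_pearl))  # OR  -> union:        [1, 2, 3]
print(sorted(portland_oregon & oregon_pearl))  # AND -> intersection: [1]
print(sorted(portland_oregon - oregon_pearl))  # NOT -> difference:   [2]
```
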
Escaping Special Characters

The query parser gives the following characters special meaning when they appear in a query: + - && || ! ( ) { } [ ] ^ " ~ * ? : /

To make the search provider interpret any of these characters literally, rather than as a special character, precede the character with a backslash character \. For example, to search for (1+1):2 without having it interpret the plus sign and parentheses as special characters for formulating a sub-query with two terms, escape the characters by preceding each one with a backslash:

\(1\+1\)\:2
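
If search text comes from user input or another system, escaping can be automated. The sketch below puts a backslash in front of every character from the list above. It also escapes the backslash itself, which is an assumption added so that the escaping stays unambiguous; the function name is hypothetical.

```python
# Characters with special meaning, taken from the list above, plus the backslash.
SPECIAL_CHARACTERS = set('+-&|!(){}[]^"~*?:/\\')


def escape_query(text: str) -> str:
    """Precede every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARACTERS else ch for ch in text)


print(escape_query("(1+1):2"))  # \(1\+1\)\:2
```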

Grouping Terms to Form Sub-Queries

The query parser supports using parentheses to group clauses to form sub-queries. This can be very useful if you want to control the Boolean logic for a query.

The query below searches for either "Portland" or "Oregon" and "website":

(Portland OR Oregon) AND website

This adds precision to the query, requiring that the term "website" exist, along with either the term "Portland" or the term "Oregon."
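
To see why the placement of parentheses matters, compare two groupings of the same three terms over a toy collection. The data is invented, and the second grouping is shown only for contrast; it is not a claim about how the parser resolves a query written without parentheses.

```python
# Toy collection: document id -> set of words (made-up data).
docs = {
    1: {"Portland", "website"},
    2: {"Oregon"},
    3: {"Oregon", "website"},
    4: {"Portland"},
}


def has(term: str) -> set:
    """Ids of documents that contain the term."""
    return {doc_id for doc_id, words in docs.items() if term in words}


# (Portland OR Oregon) AND website
grouped = (has("Portland") | has("Oregon")) & has("website")
# Portland OR (Oregon AND website)
regrouped = has("Portland") | (has("Oregon") & has("website"))

print(sorted(grouped))    # [1, 3]
print(sorted(regrouped))  # [1, 3, 4]
```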

Grouping Clauses within a Field

To apply two or more Boolean operators to a single field in a search, group the Boolean clauses within parentheses. For example, the query below searches for a title field that contains both the word "return" and the phrase "pink panther":

title:(+return +"pink panther")

Specifying Dates and Times

Queries against fields using the TrieDateField type (typically range queries) should use the appropriate date syntax:

publishedDate:[* TO NOW]
publishedDate:[1976-03-06T23:59:59.999Z TO *]
publishedDate:[1995-12-31T23:59:59.999Z TO 2007-03-06T00:00:00Z]
publishedDate:[NOW-1YEAR/DAY TO NOW/DAY+1DAY]
publishedDate:[1976-03-06T23:59:59.999Z TO 1976-03-06T23:59:59.999Z+1YEAR]
publishedDate:[1976-03-06T23:59:59.999Z/YEAR TO 1976-03-06T23:59:59.999Z]
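
The expressions that combine NOW with units such as YEAR and DAY are date math. The sketch below works through NOW-1YEAR/DAY and NOW/DAY+1DAY, assuming Solr-style evaluation: operations apply left to right, and /DAY rounds down to the start of the day in UTC. The fixed value standing in for NOW and the leap-day handling are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def floor_to_day(moment: datetime) -> datetime:
    """Round down to the start of the day (the /DAY operation)."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)


def minus_one_year(moment: datetime) -> datetime:
    """Subtract one calendar year (the -1YEAR operation)."""
    try:
        return moment.replace(year=moment.year - 1)
    except ValueError:
        # Assumption: February 29 falls back to February 28 in a non-leap year.
        return moment.replace(year=moment.year - 1, day=28)


now = datetime(2014, 8, 11, 15, 30, tzinfo=timezone.utc)  # stand-in for NOW

lower = floor_to_day(minus_one_year(now))      # NOW-1YEAR/DAY
upper = floor_to_day(now) + timedelta(days=1)  # NOW/DAY+1DAY

print(lower.isoformat(), "TO", upper.isoformat())
```

With this stand-in value, the range runs from the start of August 11, 2013 to the start of August 12, 2014, so it covers the whole of the current day.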
