gasilaffiliates.blogg.se - Apache lucene query syntax

#Apache lucene query syntax full

This parser can be very useful for concordance tasks (see also LUCENE-5317 and LUCENE-5318) and for analytical search. Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery. Most of the documentation is in the javadoc for SpanQueryParser. Until this is added to the Lucene project, I've added a standalone lucene-addons repo (with jars compiled for the latest stable build of Lucene) on GitHub. Any and all feedback is welcome.

Main additions in SpanQueryParser syntax vs. classic syntax:

Can require "in order" for phrases with slop with the ~> operator: "jakarta apache"~>3.

Can specify "not near": "fever bieber"!~3,10 :: find "fever" but not if "bieber" appears within 3 words before or 10 words after it.

Fully recursive phrasal queries with [ and ], as in: [[jakarta apache]~3 lucene]~>4 :: find "jakarta" within 3 words of "apache", and that hit has to be within four words before "lucene".

Can also use [] for single-level phrasal queries instead of ", as in: [jakarta apache].

Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"~3 :: find "apache" and then either "lucene" or "solr" within three words.

Can use multiterms in phrasal queries: "jakarta~1 ap*che"~2.

Did I mention full recursion: [[jakarta~1 ap*che]~2 (solr~ /l[ou]+[cs][en]+/)]~10 :: find something like "jakarta" within two words of "ap*che", and that hit has to be within ten words of something like "solr" or that "lucene" regex.

Can require at least x number of hits at boolean level: apache AND (lucene solr tika)~2.

Can use negative-only query: -jakarta :: find all docs that don't contain "jakarta".

Can use an edit distance > 2 for fuzzy query via SlowFuzzyQuery (beware of potential performance issues!).

Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance = 1, prefix = 2).

Can specify Optimal String Alignment (OSA) vs. Levenshtein for distance <=2: jakarta~1 (OSA) vs. jakarta~>1 (Levenshtein).

Same as classic syntax:

boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta.

multiple fields: title:lucene author:hatcher.

At a high level, there's a first-pass BooleanQuery/field parser, and then a span query parser handles all terminal nodes and phrases.

Classic QueryParser: most of its syntax.

SurroundQueryParser: recursive parsing for "near" and "not" clauses.

ComplexPhraseQueryParser: can handle "near" queries that include multiterms (wildcard, fuzzy, regex, prefix).

AnalyzingQueryParser: has an option to analyze multiterms.
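The fuzzy options above distinguish Optimal String Alignment (OSA) from plain Levenshtein distance. The practical difference is that OSA counts a swap of two adjacent characters as a single edit, while Levenshtein needs two. The sketch below is a standalone illustration of the two distance measures in plain Python. It does not use Lucene, and the function names `levenshtein` and `osa` are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using insertions, deletions, and substitutions only."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def osa(a: str, b: str) -> int:
    """Optimal String Alignment: Levenshtein plus adjacent transposition as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            # Adjacent characters swapped: count the swap as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


# "jakrata" is "jakarta" with the adjacent letters "a" and "r" swapped.
lev_distance = levenshtein("jakarta", "jakrata")
osa_distance = osa("jakarta", "jakrata")
```

Under these definitions, a query limited to one edit would accept the misspelling "jakrata" when transpositions count as a single edit, and reject it under plain Levenshtein.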

This parser extends QueryParserBase and includes functionality from the parsers listed above.

The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND. To search for documents that contain "jakarta apache" and "Apache Lucene" use the query: "jakarta apache" AND "Apache Lucene".

I'm trying to use Lucene to query a domain that has the following structure: Student 1-* Attendance *-1 Course.

The data in the domain is summarised below:

Course.name Attendance.mandatory Student.name

If I execute the query "courseName:cooking AND mandatory:Y" it returns Bob, because Bob is attending the cooking course, and Bob is also attending a mandatory course. However, what I really want to query for is "students attending a mandatory cooking course", which in this case would return nobody.

Is it possible to formulate this as a Lucene query? I'm actually using Compass, rather than Lucene directly, so I can use either CompassQueryBuilder or Lucene's query language.

For the sake of completeness, the domain classes themselves are shown below. These classes are Grails domain classes, but I'm using the standard Compass annotations and Lucene query syntax.
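The behaviour described in the question can be reproduced with a small plain-Python model, assuming each student ends up as one flattened document whose course names and mandatory flags are independent multi-valued fields. This is only a sketch: the attendance rows are invented to match the description (the original table rows are not available), the `courseMandatory` field and the `flatten` and `search` helpers are hypothetical, and nothing here uses the Compass or Lucene APIs.

```python
# Hypothetical data: Bob attends a non-mandatory cooking course and a
# mandatory course in another subject ("art" is a made-up name).
attendances = [
    {"student": "Bob", "course": "cooking", "mandatory": "N"},
    {"student": "Bob", "course": "art", "mandatory": "Y"},
]


def flatten(records):
    """Build one flattened document per student with multi-valued fields."""
    docs = {}
    for r in records:
        doc = docs.setdefault(
            r["student"],
            {"courseName": set(), "mandatory": set(), "courseMandatory": set()},
        )
        doc["courseName"].add(r["course"])
        doc["mandatory"].add(r["mandatory"])
        # Hypothetical composite field: keeps both values of ONE attendance together.
        doc["courseMandatory"].add(f'{r["course"]}_{r["mandatory"]}')
    return docs


def search(docs, **terms):
    """AND semantics: every named field must contain the requested term."""
    return sorted(
        name
        for name, doc in docs.items()
        if all(value in doc[field] for field, value in terms.items())
    )


docs = flatten(attendances)
naive = search(docs, courseName="cooking", mandatory="Y")   # terms match independently
paired = search(docs, courseMandatory="cooking_Y")          # terms tied to one attendance
```

In this model the naive AND query returns Bob because each term is satisfied by a different attendance, while the composite field returns nobody. Whether a composite field, or indexing one document per attendance, is practical in a real Compass mapping is an open assumption.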