Phrase vs. Distance Match in MySQL Full-Text Search


Historical context on why Boolean mode was out of scope for the initial implementation:

  • The MySQL documentation does not clearly explain when to use different full-text search modes.

  • The team chose the default ("out-of-the-box") full-text search mode, natural language mode, for simplicity.

  • Boolean mode was not used due to several drawbacks:

    • It does not automatically sort results by relevance, requiring extra work to compute and apply relevance scores manually.

    • Using Boolean mode would require extending the query parser to support its custom syntax.

    • Implementing these changes would have added significant complexity without a clear path to automation.

  • The default mode already handles relevance sorting automatically, which met user needs with less effort.
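
To make the relevance-sorting point concrete, here is a minimal sketch of the two modes side by side. The table name ft_demo_relevance, its columns, and the sample rows are hypothetical, and InnoDB with default full-text settings is assumed. It only illustrates the extra ordering step at the SQL level, not the query-parser changes mentioned above.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TABLE ft_demo_relevance (
  id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  content TEXT,
  FULLTEXT KEY ft_content (content)
) ENGINE=InnoDB;

INSERT INTO ft_demo_relevance (content) VALUES
  ('gene therapy is promising'),
  ('gene therapy trials compare one gene therapy with another'),
  ('cell biology overview');

-- Natural language mode (the default): with no ORDER BY, matching rows
-- are normally returned in order of decreasing relevance.
SELECT id, content
FROM ft_demo_relevance
WHERE MATCH(content) AGAINST('gene therapy');

-- Boolean mode: rows are not sorted by relevance automatically, so the
-- score is selected and sorted on explicitly.
SELECT id, content,
       MATCH(content) AGAINST('+gene +therapy' IN BOOLEAN MODE) AS score
FROM ft_demo_relevance
WHERE MATCH(content) AGAINST('+gene +therapy' IN BOOLEAN MODE)
ORDER BY score DESC;
```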

 


Phrase Matching ("word1 word2")

This is used to search for exact phrases, with:

  • Exact order

  • No extra words in between

  • No variations in the words themselves (punctuation between the words is ignored)

Example:

MATCH(content) AGAINST('"gene therapy"' IN BOOLEAN MODE)

Matches:

  • "gene therapy is promising"

Doesn’t match:

  • "therapy for gene mutation"

  • "gene-based cell therapy"

Phrase matching is strict: "gene" must be immediately followed by "therapy".
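
The phrase example can be tried end to end with a minimal sketch such as the following. The table name ft_demo_phrase and its columns are hypothetical, and InnoDB with default full-text settings is assumed. The three rows are the sample strings listed above.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TABLE ft_demo_phrase (
  id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  content TEXT,
  FULLTEXT KEY ft_content (content)
) ENGINE=InnoDB;

INSERT INTO ft_demo_phrase (content) VALUES
  ('gene therapy is promising'),
  ('therapy for gene mutation'),
  ('gene-based cell therapy');

-- The double quotes sit inside the single-quoted search string and mark
-- "gene therapy" as a phrase: same words, same order, nothing in between.
SELECT id, content
FROM ft_demo_phrase
WHERE MATCH(content) AGAINST('"gene therapy"' IN BOOLEAN MODE);
```

Going by the lists above, only the first row should come back.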

 


🧲 Proximity Matching ("word1 word2" @N)

Proximity matching finds rows in which all the quoted words occur within N words of each other, in any order, with other words allowed in between. The @N operator is available on InnoDB tables only.

Example:

MATCH(content) AGAINST('"gene therapy" @3' IN BOOLEAN MODE)

Matches:

  • "gene therapy is promising" ✅ (distance: 1)

  • "therapy for gene mutation" ✅ (distance: 2 — therapy, for, gene)

  • "a therapy based on gene editing" ✅ (distance: 3 — within threshold)

Doesn’t match:

  • "gene expression has little relation to cell therapy" ❌ (too far apart)

So while "gene therapy" (phrase) only matches that exact sequence, "gene therapy" @3 allows them to be near each other in any order and with some wiggle room between them.
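
The proximity example can be tried the same way. This is a minimal sketch, assuming a hypothetical table ft_demo_proximity on InnoDB with default full-text settings; the rows are the sample strings listed above.

```sql
-- Hypothetical demo table; all names here are illustrative only.
-- The @N operator requires an InnoDB FULLTEXT index.
CREATE TABLE ft_demo_proximity (
  id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  content TEXT,
  FULLTEXT KEY ft_content (content)
) ENGINE=InnoDB;

INSERT INTO ft_demo_proximity (content) VALUES
  ('gene therapy is promising'),
  ('therapy for gene mutation'),
  ('a therapy based on gene editing'),
  ('gene expression has little relation to cell therapy');

-- The quoted words come first and @3 follows the closing double quote,
-- all inside the single-quoted search string.
SELECT id, content
FROM ft_demo_proximity
WHERE MATCH(content) AGAINST('"gene therapy" @3' IN BOOLEAN MODE);
```

Going by the distances worked out above, the first three rows should match and the fourth should not. What a given server returns can also depend on settings such as the stopword list and the minimum token size, so it is worth checking against real data.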

 


🆚 Summary Comparison

| Feature | Phrase Match ("word1 word2") | Proximity Match ("word1 word2" @N) |
| --- | --- | --- |
| Word Order Matters | ✅ Yes | ❌ No |
| Must be Adjacent | ✅ Yes | ❌ No (within N words) |
| Allows Intervening Words | ❌ No | ✅ Yes (up to N-1 words) |
| Use Case | Exact matches (e.g., quotes) | Conceptual closeness |
| Flexibility | ❌ Rigid | ✅ Flexible |
| Available in BOOLEAN MODE? | ✅ Yes | ✅ Yes (InnoDB only) |


When to use which?

  • Use phrase matching when you're looking for exact quotations or tight phrases (e.g., "climate change", "artificial intelligence").

  • Use proximity search when you're more interested in conceptual relevance, especially in large or noisy text fields (e.g., medical records, article abstracts, etc.).

