Is the AI Search Plugin Close to Production-Ready?

Remember when I first wrote about building an AI-powered WordPress search with embeddings? Back then I was excited about the concept, but aware that there's a fine line between a proof of concept and something you'd want to run on a real site.

Well, a few months and several user complaints later, I’ve learned a lot about what it takes to make AI search actually work in production. Here’s what I’ve been fixing and why.

Reality Check

When I first shipped AI Search, I was focused on the cool tech: embeddings, cosine similarity, semantic understanding. The code worked, but I made some classic engineering mistakes.

On a site with 50 posts? No problem. On a site with 5,000 posts? Users started complaining about 10-second page loads.

The Service Registration Problem

The biggest issue was embarrassingly simple: I was calling our external embedding service during __construct(), which meant every single page load triggered an API call.

Users were seeing 20-second page loads because the plugin was trying to register with our service on every request, and, for whatever reason, the service hadn't been online for months.

The fix: move registration so it only happens when users actually save their settings, so only the admin settings page pays that cost. Revolutionary, I know.
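In WordPress terms, that fix boils down to hooking the registration call to the option-save event instead of the constructor. A minimal sketch — the option name `ai_search_settings` and the `ai_search_register_with_service()` helper are hypothetical stand-ins, not the plugin's actual identifiers:

```php
<?php
// Sketch only. WordPress fires update_option_{$option} after a
// settings option is successfully saved, so registration now runs
// on save instead of on every page load via __construct().
add_action( 'update_option_ai_search_settings', 'ai_search_on_settings_saved', 10, 2 );

function ai_search_on_settings_saved( $old_value, $new_value ) {
	// Skip the remote call entirely if the credentials didn't change.
	if ( ( $old_value['api_key'] ?? '' ) === ( $new_value['api_key'] ?? '' ) ) {
		return;
	}
	ai_search_register_with_service( $new_value['api_key'] ); // hypothetical helper
}
```

The nice side effect of keying off `update_option_{$option}` is that a failed or slow registration can only ever hurt the one admin request where settings were saved, never a visitor-facing page.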

The Search Performance Issue

Remember that numberposts => -1 from earlier? It was pulling every post into memory and calculating embeddings for all of them, just to return maybe 10 results. On larger sites, that meant thousands of similarity calculations per search.

I had to rethink the entire approach:

  • Added result caching (search results are cached for 5 minutes)
  • Limited the scope of posts we process
  • Added early exit conditions

The difference? Search went from 3-8 seconds to sub-second on most sites.
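The three changes above can be sketched together in one search function. This is an illustrative outline, not the plugin's actual code: the function names, the 500-post cap, and the 50-candidate early exit are hypothetical numbers chosen for the sketch, and `ai_search_similarity()` stands in for the cosine-similarity lookup:

```php
<?php
// Sketch of cached, bounded search. Assumes embeddings are already
// stored per-post; ai_search_similarity() is a hypothetical helper.
function ai_search_query( $query_embedding, $threshold = 0.75 ) {
	$cache_key = 'ai_search_' . md5( wp_json_encode( $query_embedding ) );

	// 1. Result caching: serve repeated searches from a transient.
	$cached = get_transient( $cache_key );
	if ( false !== $cached ) {
		return $cached;
	}

	// 2. Limited scope: a hard cap instead of numberposts => -1.
	$post_ids = get_posts( array(
		'numberposts' => 500,
		'post_status' => 'publish',
		'fields'      => 'ids',
	) );

	$results = array();
	foreach ( $post_ids as $post_id ) {
		$score = ai_search_similarity( $post_id, $query_embedding );
		if ( $score < $threshold ) {
			continue;
		}
		$results[ $post_id ] = $score;

		// 3. Early exit: stop once we have plenty of candidates.
		if ( count( $results ) >= 50 ) {
			break;
		}
	}

	arsort( $results );
	$results = array_slice( $results, 0, 10, true );

	// Matches the behaviour described above: cache for 5 minutes.
	set_transient( $cache_key, $results, 5 * MINUTE_IN_SECONDS );
	return $results;
}
```

Transients are the pragmatic choice here because they degrade gracefully: with an object cache they live in memory, without one they fall back to the options table, and either way a stale result simply expires after five minutes.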

Cache Management: Learning from Users

One thing I didn’t anticipate was how much users would want control over their embedding data.

So I built a proper cache management interface. Three levels of control:

  1. Clear cache only – Safe, embeddings regenerate automatically
  2. Clear post embeddings – Nuclear option, requires manual regeneration
  3. Clear everything – Start fresh (with appropriate warnings)

Plus real-time stats showing exactly what’s cached and how much space it’s using.
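The three levels map cleanly onto three cleanup functions. Again a sketch under assumptions: the transient prefix `ai_search_` and the `_ai_search_embedding` meta key are hypothetical names used for illustration:

```php
<?php
// Sketch of the three cleanup levels; key names are hypothetical.

// 1. Clear cache only: drop cached search results. Embeddings stay
// in post meta, so results simply regenerate on the next search.
function ai_search_clear_result_cache() {
	global $wpdb;
	$wpdb->query(
		"DELETE FROM {$wpdb->options}
		 WHERE option_name LIKE '_transient_ai_search_%'
		    OR option_name LIKE '_transient_timeout_ai_search_%'"
	);
}

// 2. Clear post embeddings: the nuclear option. Every post has to be
// re-embedded (more API calls) before search works again.
function ai_search_clear_embeddings() {
	delete_post_meta_by_key( '_ai_search_embedding' );
}

// 3. Clear everything: start fresh.
function ai_search_clear_all() {
	ai_search_clear_result_cache();
	ai_search_clear_embeddings();
}
```

Splitting the levels like this is what makes the warnings honest: option 1 costs nothing, option 2 costs API calls, and the UI can say so before the user clicks.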

Similarity Threshold Refactoring

In my original post, I mentioned needing a similarity threshold but didn’t really dive into the practical implications. Turns out, this was more important than I thought.

Initially, I had the range set to 0.0–1.0 with 0.01 precision. But users kept getting irrelevant results at lower thresholds. After analyzing actual usage patterns:

  • Values below 0.5 rarely returned useful results
  • Users needed finer control for their specific content types
  • The difference between 0.750 and 0.753 can be significant for some use cases

New approach: 0.5-1.0 range with 0.001 precision. Much better real-world results.
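In practice that's two small changes: the slider's attributes and a sanitizer that clamps whatever comes back. A sketch, assuming a hypothetical option name `ai_search_threshold`:

```php
<?php
// Sketch of the tightened threshold control; option name is hypothetical.

// Settings-field markup: range narrowed to 0.5–1.0, step to 0.001.
printf(
	'<input type="range" name="ai_search_threshold" min="0.5" max="1.0" step="0.001" value="%s" />',
	esc_attr( get_option( 'ai_search_threshold', '0.75' ) )
);

// Sanitize callback: clamp to the observed useful range, since values
// below 0.5 rarely returned relevant results, and round to the
// 0.001 precision users actually wanted.
function ai_search_sanitize_threshold( $value ) {
	return max( 0.5, min( 1.0, round( (float) $value, 3 ) ) );
}
```

Clamping in the sanitizer (rather than trusting the input's `min`/`max`) matters because the value can also arrive via the REST API or a hand-edited request, not just the slider.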

What’s Actually Next

I’m not done. The roadmap includes:

  • Search analytics – Understanding what people search for
  • Better content support – Custom fields, taxonomies, media
  • API endpoints – For headless setups and external integrations
  • Background processing – For sites with massive content volumes

The Real Takeaway

The technical implementation of embeddings and cosine similarity was the easy part. The hard part was making it work reliably for real users on real sites with real constraints.

If you’re building AI-powered features, spend as much time thinking about edge cases, performance, and user control as you do about the actual AI. Your users will thank you.

The current version (1.7.0) is what I should have shipped originally: it works, it's fast, and it gives users the control they need. Sometimes you have to ship the MVP to learn what the actual product should be.

