WordPress AI Search Pro API

AI Search Pro API Documentation

API Endpoints Reference

Gemini
Gemini API Endpoints

Base URL:
https://generativelanguage.googleapis.com/v1beta

POST models/{model}:embedContent (Implemented)
Generate embeddings for text content.

POST models/{model}:countTokens (Implemented)
Count tokens in text.

POST models/{model}:generateContent (Implemented)
Generate chat completions.

POST models/{model}:streamGenerateContent (Implemented)
Stream chat completions.
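
As a rough illustration of how the paths above combine with the base URL, the sketch below builds a full endpoint URL and a minimal text request body. The helper names (gemini_endpoint_url, gemini_text_payload) and the model name gemini-1.5-flash are assumptions made for this example and are not part of the plugin. Authentication (for example an API key header) is left out.

```php
<?php
// Base URL as listed in this reference.
const GEMINI_BASE_URL = 'https://generativelanguage.googleapis.com/v1beta';

/**
 * Build the full URL for a Gemini method such as "embedContent" or "countTokens".
 * Hypothetical helper for illustration only.
 */
function gemini_endpoint_url(string $model, string $method, bool $sse = false): string
{
    $url = GEMINI_BASE_URL . '/models/' . rawurlencode($model) . ':' . $method;
    // Streaming responses can be requested as server-sent events with alt=sse.
    return $sse ? $url . '?alt=sse' : $url;
}

/**
 * Minimal request body for a single piece of user text, in the shape
 * accepted by countTokens and generateContent.
 */
function gemini_text_payload(string $text): array
{
    return ['contents' => [['parts' => [['text' => $text]]]]];
}

echo gemini_endpoint_url('gemini-1.5-flash', 'countTokens'), "\n";
echo json_encode(gemini_text_payload('Hello world')), "\n";
```

The same URL helper covers all four endpoints; only the method name after the colon changes.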

Gemini Feature Summary
  • Token counting: Uses the countTokens endpoint (actual API counts, not a local estimate)
  • Streaming: Implemented with SSE
  • Caching: Custom WordPress transient implementation
  • Batch processing: Implemented with retry logic
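
To make the SSE streaming point more concrete, here is a minimal sketch of how text fragments might be pulled out of a streamGenerateContent response body once it has been received. The function name gemini_sse_text_chunks is hypothetical, and the sketch assumes each event arrives as a single "data:" line containing JSON with the text under candidates[0].content.parts; the plugin's own parser may differ.

```php
<?php
/**
 * Extract text fragments from a raw server-sent-events body.
 * Hypothetical helper: assumes one JSON object per "data:" line.
 */
function gemini_sse_text_chunks(string $raw): array
{
    $chunks = [];
    foreach (preg_split('/\r\n|\n|\r/', $raw) as $line) {
        // Only "data:" lines carry payloads; blank lines separate events.
        if (strncmp($line, 'data:', 5) !== 0) {
            continue;
        }
        $event = json_decode(trim(substr($line, 5)), true);
        if (!is_array($event)) {
            continue; // skip malformed or non-JSON payloads
        }
        // Collect every text part of the first candidate, if present.
        foreach ($event['candidates'][0]['content']['parts'] ?? [] as $part) {
            if (isset($part['text'])) {
                $chunks[] = $part['text'];
            }
        }
    }
    return $chunks;
}

$raw = "data: {\"candidates\":[{\"content\":{\"parts\":[{\"text\":\"Hel\"}]}}]}\n\n"
     . "data: {\"candidates\":[{\"content\":{\"parts\":[{\"text\":\"lo\"}]}}]}\n\n";
echo implode('', gemini_sse_text_chunks($raw)), "\n"; // prints "Hello"
```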

OpenAI
OpenAI API Endpoints

API Library: OpenAI PHP SDK (requests go through the client library rather than direct HTTP calls)

POST /v1/embeddings (Implemented)
Generate embeddings (via SDK).

POST /v1/chat/completions (Implemented)
Generate chat completions (via SDK).

OpenAI Feature Summary
  • Embeddings: Fully implemented
  • Chat completions: Fully implemented
  • Token counting: Word-based estimation (1.3 tokens per word)
  • Streaming: Fully implemented with callback support
  • Caching: WordPress transient-based with 1-hour TTL and 50-entry FIFO replacement
  • Batch processing: Stub implementation (OpenAI Batch API not yet integrated)

Detailed Implementation


OpenAI Token Counting

Method: count_tokens($text, $model = null)

  • Uses word-based estimation: 1.3 tokens per word (more accurate than character-based)
  • Can be enhanced with actual token counting via tiktoken library if needed
  • API responses include actual token usage counts for reference
$tokens = $provider->count_tokens('Hello world');  // ~3 tokens (2 words x 1.3 = 2.6)
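
The word-based estimate described above can be sketched as a standalone function. This is only an illustration: the name estimate_tokens_by_words is hypothetical, words are assumed to be whitespace-separated, and rounding up is an assumption chosen to match the "~3 tokens" example; the plugin's actual rounding is not documented here.

```php
<?php
/**
 * Estimate tokens at 1.3 tokens per whitespace-separated word, rounded up.
 * Hypothetical helper; not the plugin's count_tokens() implementation.
 */
function estimate_tokens_by_words(string $text): int
{
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    // preg_split returns false on invalid UTF-8; treat that as zero words.
    $count = $words === false ? 0 : count($words);
    // Integer form of ceil($count * 1.3), avoiding floating-point rounding.
    return intdiv($count * 13 + 9, 10);
}

echo estimate_tokens_by_words('Hello world'), "\n"; // prints 3
```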


OpenAI Streaming

Method: create_chat_completion($messages, $model, ['stream' => true, 'callback' => callable])

  • Streams responses in real time using OpenAI’s streaming API
  • Supports optional callback function for processing chunks
  • Automatically collects full response and estimates tokens
$result = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    [
        'stream' => true,
        'callback' => function($chunk) {
            echo $chunk;
            flush();
        }
    ]
);
// $result['content'] contains full streamed response
// $result['tokens'] contains estimated token count
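
The collect-and-estimate behaviour can be pictured with the following minimal sketch. It assumes the streamed chunks have already been reduced to plain strings, and collect_stream is a hypothetical function written for this example, not the plugin's internal code. The token figure reuses the 1.3 tokens-per-word heuristic with rounding up, which is also an assumption.

```php
<?php
/**
 * Pass each chunk to an optional callback while accumulating the full text.
 * Hypothetical helper: $chunks is any iterable of strings.
 */
function collect_stream(iterable $chunks, ?callable $callback = null): array
{
    $content = '';
    foreach ($chunks as $chunk) {
        $content .= $chunk;
        if ($callback !== null) {
            $callback($chunk); // e.g. echo and flush to the browser
        }
    }
    // Estimate tokens from the complete text once the stream has ended.
    $words = preg_split('/\s+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);
    return [
        'content' => $content,
        'tokens'  => intdiv(count($words) * 13 + 9, 10),
    ];
}

$result = collect_stream(['Hello', ' ', 'world'], function (string $chunk): void {
    echo $chunk;
});
// $result['content'] is "Hello world"; $result['tokens'] is 3
```

Keeping the callback optional means the same code path serves callers that only want the final result.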


OpenAI Caching

Methods:

  • create_chat_completion($messages, $model, ['cache_key' => 'unique_key'])
  • clear_context_cache()
  • get_cache_stats()

Features:

  • WordPress transient-based persistence
  • 1-hour TTL per cached entry
  • 50-entry FIFO replacement limit
  • Automatic cache key prefixing (wp_ai_search_openai_)
  • Returns 'cached' => true when serving from cache
// First call - hits API
$result = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    ['cache_key' => 'faq_answer']
);
// $result['cached'] = false

// Second call - served from cache
$cached = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    ['cache_key' => 'faq_answer']
);
// $cached['cached'] = true (served from cache, so no API request is made)
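
The TTL and FIFO rules listed under Features can be illustrated with a small in-memory sketch. The class FifoCacheSketch is hypothetical and uses a plain array so that it runs outside WordPress; in the plugin the entries are stored in transients, and the details of its eviction bookkeeping are not documented here. The optional $now parameter exists only to make expiry easy to demonstrate.

```php
<?php
/**
 * In-memory stand-in for a prefixed cache with a TTL and FIFO eviction.
 * Hypothetical class for illustration; not the plugin's implementation.
 */
final class FifoCacheSketch
{
    private const PREFIX = 'wp_ai_search_openai_';

    /** @var array Insertion-ordered map of prefixed key to value and expiry. */
    private array $entries = [];
    private int $maxEntries;
    private int $ttl;

    public function __construct(int $maxEntries = 50, int $ttl = 3600)
    {
        $this->maxEntries = $maxEntries;
        $this->ttl = $ttl;
    }

    /** Return the cached value, or null when missing or expired. */
    public function get(string $key, ?int $now = null)
    {
        $now = $now ?? time();
        $k = self::PREFIX . $key;
        if (!isset($this->entries[$k])) {
            return null;
        }
        if ($this->entries[$k]['expires'] <= $now) {
            unset($this->entries[$k]); // drop stale entry
            return null;
        }
        return $this->entries[$k]['value'];
    }

    /** Store a value, evicting the oldest entry when the limit is reached. */
    public function set(string $key, $value, ?int $now = null): void
    {
        $now = $now ?? time();
        $k = self::PREFIX . $key;
        if (!array_key_exists($k, $this->entries) && count($this->entries) >= $this->maxEntries) {
            // PHP arrays keep insertion order, so the first key is the oldest.
            unset($this->entries[array_key_first($this->entries)]);
        }
        $this->entries[$k] = ['value' => $value, 'expires' => $now + $this->ttl];
    }
}

$cache = new FifoCacheSketch(2, 3600);
$cache->set('a', 'first', 0);
$cache->set('b', 'second', 0);
$cache->set('c', 'third', 0);       // limit of 2 reached: 'a' is evicted
var_dump($cache->get('a', 1));      // NULL
var_dump($cache->get('b', 1));      // string(6) "second"
var_dump($cache->get('c', 3600));   // NULL (TTL elapsed)
```

Note that a lookup keyed only on cache_key returns the stored answer even if $messages differ between calls, so callers would need to choose keys that reflect the prompt.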