WordPress AI Search Pro API

Gemini
Gemini API Endpoints

Base URL:
https://generativelanguage.googleapis.com/v1beta

POST models/{model}:embedContent (Implemented)

Generate embeddings for text content.
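
As a rough illustration of what a call to this endpoint involves, the sketch below assembles the URL, headers, and JSON body for an embedContent request. The helper name gemini_embed_request() and the model name text-embedding-004 are assumptions chosen for the example; the plugin's own method names are not shown in this document. The request is only built here, not sent.

```php
<?php
// Sketch only: gemini_embed_request() is a hypothetical helper, not a plugin method.
const GEMINI_BASE_URL = 'https://generativelanguage.googleapis.com/v1beta';

function gemini_embed_request(string $model, string $text, string $api_key): array
{
    return [
        // The model name is part of the path: models/{model}:embedContent
        'url'     => GEMINI_BASE_URL . '/models/' . rawurlencode($model) . ':embedContent',
        'headers' => [
            'Content-Type'   => 'application/json',
            'x-goog-api-key' => $api_key,
        ],
        // The text to embed is wrapped in a content object with a list of parts.
        'body'    => json_encode([
            'model'   => 'models/' . $model,
            'content' => ['parts' => [['text' => $text]]],
        ]),
    ];
}

// Example model name; substitute whichever embedding model is configured.
$request = gemini_embed_request('text-embedding-004', 'Hello world', 'YOUR_API_KEY');
echo $request['url'], PHP_EOL;
```

In a WordPress context the resulting array could be handed to an HTTP function such as wp_remote_post(), which is left out here so the sketch runs on its own.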

POST models/{model}:countTokens (Implemented)

Count tokens in text.
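
A minimal sketch of the request body and response handling for countTokens follows. Both function names are hypothetical, and the sample response is hand-written in the shape the endpoint returns (a totalTokens field) rather than captured from a real call.

```php
<?php
// Hypothetical helpers for illustration; not the plugin's actual methods.

// Build the JSON body: the text goes into a single content entry with one part.
function gemini_count_tokens_body(string $text): string
{
    return json_encode(['contents' => [['parts' => [['text' => $text]]]]]);
}

// Read the token count from a response body, or return null if it is missing.
function gemini_parse_token_count(string $response_json): ?int
{
    $data = json_decode($response_json, true);
    return is_array($data) && isset($data['totalTokens'])
        ? (int) $data['totalTokens']
        : null;
}

// Hand-written sample response, not real API output.
$sample = '{"totalTokens": 3}';
var_dump(gemini_parse_token_count($sample));
```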

POST models/{model}:generateContent (Implemented)

Generate chat completions.
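
The sketch below shows one way chat messages might be mapped onto a generateContent request and how the reply text could be read back. It assumes messages arrive as arrays with 'role' and 'content' keys (an assumption, since the message shape is not specified here), maps the 'assistant' role to Gemini's 'model' role, and does not handle system messages. Both function names are hypothetical.

```php
<?php
// Hypothetical helpers; the message shape ['role' => ..., 'content' => ...] is assumed.

// Convert chat messages into the "contents" array generateContent expects.
function gemini_contents_from_messages(array $messages): array
{
    $contents = [];
    foreach ($messages as $message) {
        $contents[] = [
            // Gemini uses "user" and "model" as conversation roles.
            'role'  => $message['role'] === 'assistant' ? 'model' : 'user',
            'parts' => [['text' => $message['content']]],
        ];
    }
    return $contents;
}

// Concatenate the text parts of the first candidate; empty string if none.
function gemini_extract_text(array $response): string
{
    $text = '';
    foreach ($response['candidates'][0]['content']['parts'] ?? [] as $part) {
        $text .= $part['text'] ?? '';
    }
    return $text;
}

$body = json_encode([
    'contents' => gemini_contents_from_messages([
        ['role' => 'user', 'content' => 'What is a transient?'],
    ]),
]);
echo $body, PHP_EOL;
```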

POST models/{model}:streamGenerateContent (Implemented)

Stream chat completions.
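
When the stream is requested as server-sent events (the alt=sse query parameter), each event arrives as a "data:" line carrying a JSON chunk. The following sketch parses such a payload into text fragments. gemini_parse_sse() is a hypothetical name, and the code assumes the whole payload is already in memory, whereas a real streaming client would process lines as they arrive.

```php
<?php
// Hypothetical helper: extract text fragments from an SSE payload.
function gemini_parse_sse(string $raw): array
{
    $chunks = [];
    foreach (preg_split('/\r\n|\n|\r/', $raw) as $line) {
        if (strncmp($line, 'data:', 5) !== 0) {
            continue; // skip blank separator lines and non-data fields
        }
        $event = json_decode(trim(substr($line, 5)), true);
        if (!is_array($event)) {
            continue; // ignore payloads that are not valid JSON objects
        }
        foreach ($event['candidates'][0]['content']['parts'] ?? [] as $part) {
            if (isset($part['text'])) {
                $chunks[] = $part['text'];
            }
        }
    }
    return $chunks;
}

// Hand-written sample payload with two events.
$raw = 'data: {"candidates":[{"content":{"parts":[{"text":"Hel"}]}}]}' . "\n\n"
     . 'data: {"candidates":[{"content":{"parts":[{"text":"lo"}]}}]}' . "\n\n";
echo implode('', gemini_parse_sse($raw)), PHP_EOL;
```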

Gemini Feature Summary

Token counting: Uses the countTokens endpoint (actual counts from the API, not an estimate)
Streaming: Implemented with SSE
Caching: Custom WordPress transient implementation
Batch processing: Implemented with retry logic
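
The retry strategy used for batch processing is not described here. As one plausible shape, the sketch below retries a failing operation with exponential backoff. The function name, the default attempt count, and the delay values are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: retry an operation with exponential backoff.
// Attempt count and delays are illustrative, not the plugin's settings.
function retry_with_backoff(callable $operation, int $max_attempts = 3, int $base_delay_ms = 200)
{
    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            return $operation($attempt);
        } catch (RuntimeException $e) {
            if ($attempt >= $max_attempts) {
                throw $e; // out of attempts: surface the last error
            }
            // Wait base, 2x base, 4x base, ... before the next attempt.
            usleep($base_delay_ms * 1000 * (2 ** ($attempt - 1)));
        }
    }
}

// Usage: an operation that fails twice, then succeeds on the third attempt.
$calls = 0;
$result = retry_with_backoff(function () use (&$calls) {
    $calls++;
    if ($calls < 3) {
        throw new RuntimeException('temporary failure');
    }
    return 'ok';
}, 3, 1);
echo $result, PHP_EOL;
```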

OpenAI
OpenAI API Endpoints

API library: OpenAI PHP SDK (requests go through the client library rather than direct HTTP calls)

POST /v1/embeddings (Implemented)

Generate embeddings (via SDK).

POST /v1/chat/completions (Implemented)

Generate chat completions (via SDK).

OpenAI Feature Summary

Embeddings: Fully implemented
Chat completions: Fully implemented
Token counting: Word-based estimation (1.3 tokens per word)
Streaming: Fully implemented with callback support
Caching: WordPress transient-based with a 1-hour TTL and FIFO eviction (50-entry limit)
Batch processing: Stub implementation (OpenAI Batch API not yet integrated)

Detailed Implementation

OpenAI Token Counting

Method: count_tokens($text, $model = null)

Uses a word-based estimate of 1.3 tokens per word, intended to be closer than a character-based estimate
Exact counts would require a tokenizer such as the tiktoken library, which could be added if needed
API responses include actual token usage counts, which can be used to check the estimate

$tokens = $provider->count_tokens('Hello world');  // ~3 tokens
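
The word-based estimate can be sketched as a standalone function. estimate_tokens() is a hypothetical name, and rounding up is an assumption made so that the result agrees with the "~3 tokens" figure for 'Hello world'; the plugin's actual rounding is not stated.

```php
<?php
// Hypothetical sketch of a 1.3-tokens-per-word estimate.
function estimate_tokens(string $text): int
{
    // Split on runs of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // Rounding up is an assumption; 2 words * 1.3 = 2.6, which becomes 3.
    return (int) ceil(count($words) * 1.3);
}

echo estimate_tokens('Hello world'), PHP_EOL;
```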

OpenAI Streaming

Method: create_chat_completion($messages, $model, ['stream' => true, 'callback' => callable])

Streams responses in real time using OpenAI’s streaming API
Accepts an optional callback that is invoked for each chunk as it arrives
Collects the full response automatically and estimates its token count

$result = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    [
        'stream' => true,
        'callback' => function($chunk) {
            echo $chunk;
            flush();
        }
    ]
);
// $result['content'] contains full streamed response
// $result['tokens'] contains estimated token count
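
To make the collect-and-callback behavior concrete, the sketch below accumulates chunks from any iterable, invokes an optional callback per chunk, and returns the full text with an estimated token count. collect_stream() and demo_chunks() are hypothetical names; the generator stands in for the SDK's stream, and the 1.3-tokens-per-word estimate with rounding up is an assumption.

```php
<?php
// Hypothetical sketch: accumulate streamed chunks and call back on each one.
function collect_stream(iterable $chunks, ?callable $callback = null): array
{
    $content = '';
    foreach ($chunks as $chunk) {
        $content .= $chunk;
        if ($callback !== null) {
            $callback($chunk); // let the caller echo or forward the chunk
        }
    }
    $words = preg_split('/\s+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);
    return [
        'content' => $content,
        'tokens'  => (int) ceil(count($words) * 1.3), // estimated, not exact
    ];
}

// Stand-in for a real stream: yields the response in three pieces.
function demo_chunks(): Generator
{
    yield 'Hello';
    yield ' wor';
    yield 'ld';
}

$seen = [];
$result = collect_stream(demo_chunks(), function (string $chunk) use (&$seen) {
    $seen[] = $chunk;
});
echo $result['content'], PHP_EOL;
```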

OpenAI Caching

Methods:

create_chat_completion($messages, $model, ['cache_key' => 'unique_key'])
clear_context_cache()
get_cache_stats()

Features:

WordPress transient-based persistence
1-hour TTL per cached entry
50-entry limit with FIFO eviction (the oldest entry is removed first)
Automatic cache key prefixing (wp_ai_search_openai_)
Returns 'cached' => true when serving from cache

// First call - hits API
$result = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    ['cache_key' => 'faq_answer']
);
// $result['cached'] = false

// Second call - served from cache
$cached = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    ['cache_key' => 'faq_answer']
);
// $cached['cached'] = true (served from the transient cache; no API request is made)
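
The TTL and FIFO behavior can be illustrated with a small in-memory cache. This is a sketch under stated assumptions, not the plugin's implementation: the class name FifoCache is hypothetical, entries live in a PHP array instead of WordPress transients, and the optional $now parameter exists only to make expiry easy to demonstrate.

```php
<?php
// Hypothetical in-memory sketch of a TTL cache with FIFO eviction.
final class FifoCache
{
    /** @var array<string, array{value: mixed, expires: int}> */
    private array $entries = [];
    private int $max_entries;
    private int $ttl;

    public function __construct(int $max_entries = 50, int $ttl = 3600)
    {
        $this->max_entries = $max_entries;
        $this->ttl = $ttl;
    }

    // Return the cached value, or null if the key is missing or expired.
    public function get(string $key, ?int $now = null)
    {
        $now = $now ?? time();
        if (!isset($this->entries[$key]) || $this->entries[$key]['expires'] <= $now) {
            unset($this->entries[$key]);
            return null;
        }
        return $this->entries[$key]['value'];
    }

    public function set(string $key, $value, ?int $now = null): void
    {
        $now = $now ?? time();
        unset($this->entries[$key]); // re-inserting a key moves it to the back
        if (count($this->entries) >= $this->max_entries) {
            // PHP arrays keep insertion order, so the first key is the oldest.
            unset($this->entries[array_key_first($this->entries)]);
        }
        $this->entries[$key] = ['value' => $value, 'expires' => $now + $this->ttl];
    }
}

// With room for two entries, adding a third evicts the first.
$cache = new FifoCache(2, 3600);
$cache->set('a', 'A', 1000);
$cache->set('b', 'B', 1000);
$cache->set('c', 'C', 1000);
var_dump($cache->get('a', 1000)); // 'a' was the oldest entry and has been evicted
```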
