WordPress AI Search Pro API
AI Search Pro API Documentation
API Endpoints Reference
Gemini
Gemini API Endpoints
Base URL:
https://generativelanguage.googleapis.com/v1beta
POST models/{model}:embedContent (Implemented)
Generate embeddings for text content.
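As a rough illustration of what a call to this endpoint involves, the sketch below assembles the URL, headers, and JSON body for an embedContent request. The helper name build_embed_request, the model name text-embedding-004, and the choice of the x-goog-api-key header are assumptions made for the example; the plugin's own request code may differ.

```php
<?php
// Hypothetical helper (not part of the plugin): builds the pieces of an
// embedContent request against the base URL listed above.
function build_embed_request(string $model, string $text, string $api_key): array
{
    $base_url = 'https://generativelanguage.googleapis.com/v1beta';

    return [
        'url'     => sprintf('%s/models/%s:embedContent', $base_url, rawurlencode($model)),
        'headers' => [
            'Content-Type'   => 'application/json',
            'x-goog-api-key' => $api_key, // assumed auth style for this sketch
        ],
        'body'    => json_encode(
            [
                'model'   => 'models/' . $model,
                'content' => ['parts' => [['text' => $text]]],
            ],
            JSON_UNESCAPED_SLASHES
        ),
    ];
}

// Example values only; the model name and key are placeholders.
$request = build_embed_request('text-embedding-004', 'Hello world', 'YOUR_API_KEY');
echo $request['url'], PHP_EOL;
```

Inside WordPress, the resulting array could be handed to wp_remote_post($request['url'], ['headers' => $request['headers'], 'body' => $request['body']]).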
POST models/{model}:countTokens (Implemented)
Count tokens in text.
POST models/{model}:generateContent (Implemented)
Generate chat completions.
POST models/{model}:streamGenerateContent (Implemented)
Stream chat completions.
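To make the streaming format more concrete, here is a minimal sketch of reading server-sent event lines from this endpoint. It assumes the request was made with alt=sse, so that each event arrives as a "data: {json}" line, and that the text sits under candidates[0].content.parts[*].text. The helper extract_sse_text is hypothetical and is not the plugin's parser.

```php
<?php
// Hypothetical helper: pulls the generated text out of one SSE line.
// Returns null for lines that carry no JSON data payload.
function extract_sse_text(string $line): ?string
{
    $line = trim($line);
    if (strncmp($line, 'data:', 5) !== 0) {
        return null; // blank separator lines or other SSE fields
    }

    $payload = json_decode(trim(substr($line, 5)), true);
    if (!is_array($payload)) {
        return null; // malformed or incomplete JSON
    }

    // Assumed response shape: candidates[0].content.parts[*].text
    $text = '';
    foreach ($payload['candidates'][0]['content']['parts'] ?? [] as $part) {
        $text .= $part['text'] ?? '';
    }
    return $text;
}

// Simulated stream: two data events separated by a blank line.
$lines = [
    'data: {"candidates":[{"content":{"parts":[{"text":"Hel"}]}}]}',
    '',
    'data: {"candidates":[{"content":{"parts":[{"text":"lo"}]}}]}',
];

$full = '';
foreach ($lines as $line) {
    $full .= extract_sse_text($line) ?? '';
}
echo $full, PHP_EOL; // prints "Hello"
```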
Gemini Feature Summary
- Token counting: Uses actual API
- Streaming: Implemented with SSE
- Caching: Custom WordPress transient implementation
- Batch processing: Implemented with retry logic
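The summary does not describe how the retry logic works, so the following is only a sketch of one common approach: split the input into batches and retry a failed batch with exponential backoff. The names with_retry and $embed_batch, the batch size, and the delay values are all assumptions for illustration.

```php
<?php
// Hypothetical helper: runs $operation and retries on RuntimeException,
// doubling the delay after each failed attempt.
function with_retry(callable $operation, int $max_attempts = 3, int $base_delay_ms = 200)
{
    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            return $operation();
        } catch (RuntimeException $e) {
            if ($attempt >= $max_attempts) {
                throw $e; // out of attempts: surface the last error
            }
            usleep($base_delay_ms * 1000 * (2 ** ($attempt - 1)));
        }
    }
}

// Stand-in for an API call: fails once, then succeeds.
$calls = 0;
$embed_batch = function (array $texts) use (&$calls): array {
    $calls++;
    if ($calls === 1) {
        throw new RuntimeException('simulated transient failure');
    }
    return array_map('strlen', $texts); // fake "embeddings"
};

$results = [];
foreach (array_chunk(['a', 'bb', 'ccc'], 2) as $batch) {
    $batch_result = with_retry(fn() => $embed_batch($batch), 3, 1);
    $results = array_merge($results, $batch_result);
}
print_r($results);
```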
OpenAI
OpenAI API Endpoints
API Library: OpenAI PHP SDK (uses client library, not direct HTTP calls)
POST /v1/embeddings (Implemented)
Generate embeddings (via SDK).
POST /v1/chat/completions (Implemented)
Generate chat completions (via SDK).
OpenAI Feature Summary
- Embeddings: Fully implemented
- Chat completions: Fully implemented
- Token counting: Word-based estimation (1.3 tokens per word)
- Streaming: Fully implemented with callback support
- Caching: WordPress transient-based with 1-hour TTL and FIFO replacement
- Batch processing: Stub implementation (OpenAI Batch API not yet integrated)
Detailed Implementation
✓ OpenAI Token Counting
Method: count_tokens($text, $model = null)
- Uses word-based estimation: 1.3 tokens per word (more accurate than character-based)
- Can be enhanced with actual token counting via the tiktoken library if needed
- API responses include actual token usage counts for reference
$tokens = $provider->count_tokens('Hello world'); // ~3 tokens
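The estimation rule can be sketched as follows. This is an illustration rather than the plugin's code: the function name estimate_tokens, splitting on whitespace, and rounding up are assumptions, chosen so that the result matches the "~3 tokens" shown for 'Hello world'.

```php
<?php
// Hypothetical re-implementation of the word-based estimate:
// 1.3 tokens per whitespace-separated word, rounded up.
function estimate_tokens(string $text): int
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // ceil(count * 1.3) using integer arithmetic to avoid float rounding issues
    return intdiv(count($words) * 13 + 9, 10);
}

echo estimate_tokens('Hello world'), PHP_EOL; // prints 3
```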
✓ OpenAI Streaming
Method: create_chat_completion($messages, $model, ['stream' => true, 'callback' => callable])
- Streams responses in real-time using OpenAI’s streaming API
- Supports optional callback function for processing chunks
- Automatically collects full response and estimates tokens
$result = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    [
        'stream'   => true,
        'callback' => function ($chunk) {
            echo $chunk;
            flush();
        },
    ]
);
// $result['content'] contains full streamed response
// $result['tokens'] contains estimated token count
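The behavior described above (forward each chunk to the callback, collect the full response, then estimate tokens) might look roughly like this. The function collect_stream is hypothetical, and the token estimate reuses the 1.3 tokens per word rule with rounding up as an assumption.

```php
<?php
// Hypothetical sketch: consumes text chunks, forwards each one to an
// optional callback, and returns the assembled content plus a token estimate.
function collect_stream(iterable $chunks, ?callable $callback = null): array
{
    $content = '';
    foreach ($chunks as $chunk) {
        $content .= $chunk;
        if ($callback !== null) {
            $callback($chunk); // e.g. echo + flush() in a real handler
        }
    }

    // Word-based estimate: 1.3 tokens per word, rounded up (assumed rounding).
    $words = preg_split('/\s+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);

    return [
        'content' => $content,
        'tokens'  => intdiv(count($words) * 13 + 9, 10),
    ];
}

// Simulated stream of three chunks.
$seen = [];
$result = collect_stream(['Hello', ' wor', 'ld'], function (string $chunk) use (&$seen) {
    $seen[] = $chunk;
});
echo $result['content'], PHP_EOL; // prints "Hello world"
```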
✓ OpenAI Caching
Methods:
- create_chat_completion($messages, $model, ['cache_key' => 'unique_key'])
- clear_context_cache()
- get_cache_stats()
Features:
- WordPress transient-based persistence
- 1-hour TTL per cached entry
- 50-entry FIFO replacement limit
- Automatic cache key prefixing (wp_ai_search_openai_)
- Returns 'cached' => true when serving from cache
// First call - hits API
$result = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    ['cache_key' => 'faq_answer']
);
// $result['cached'] = false

// Second call - served from cache
$cached = $provider->create_chat_completion(
    $messages,
    'gpt-4o',
    ['cache_key' => 'faq_answer']
);
// $cached['cached'] = true (90% cost savings)
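The cache policy listed under Features (key prefix, 1-hour TTL, 50-entry FIFO limit) can be sketched with an in-memory stand-in. The class FifoTtlCache is hypothetical and stores entries in a PHP array so the example runs outside WordPress; a transient-backed version would presumably call set_transient() and get_transient() instead.

```php
<?php
// Hypothetical in-memory model of the cache policy described above.
final class FifoTtlCache
{
    /** @var array<string, array{value: mixed, expires: int}> insertion order = FIFO order */
    private array $entries = [];
    private int $max_entries;
    private int $ttl_seconds;
    private string $prefix;

    public function __construct(int $max_entries = 50, int $ttl_seconds = 3600, string $prefix = 'wp_ai_search_openai_')
    {
        $this->max_entries = $max_entries;
        $this->ttl_seconds = $ttl_seconds;
        $this->prefix = $prefix;
    }

    public function set(string $key, $value, ?int $now = null): void
    {
        $now = $now ?? time();
        $full_key = $this->prefix . $key;

        // Evict the oldest entry when adding a new key to a full cache.
        if (!isset($this->entries[$full_key]) && count($this->entries) >= $this->max_entries) {
            unset($this->entries[array_key_first($this->entries)]);
        }
        $this->entries[$full_key] = ['value' => $value, 'expires' => $now + $this->ttl_seconds];
    }

    public function get(string $key, ?int $now = null)
    {
        $now = $now ?? time();
        $full_key = $this->prefix . $key;

        if (!isset($this->entries[$full_key])) {
            return null; // miss
        }
        if ($this->entries[$full_key]['expires'] <= $now) {
            unset($this->entries[$full_key]); // expired
            return null;
        }
        return $this->entries[$full_key]['value'];
    }
}

// Small limit (2 entries) so the FIFO eviction is visible.
$cache = new FifoTtlCache(2, 3600);
$cache->set('a', 'A', 1000);
$cache->set('b', 'B', 1000);
$cache->set('c', 'C', 1000); // evicts 'a', the oldest entry

var_dump($cache->get('a', 1001)); // NULL (evicted)
var_dump($cache->get('c', 1001)); // string(1) "C"
var_dump($cache->get('b', 4600)); // NULL (TTL elapsed)
```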