Aria is SellTrove's AI-powered selling assistant. It helps sellers with product description writing, pricing suggestions, and customer inquiry responses — all contextualised with the seller's actual store data.
The core technical challenge: Aria needs to retrieve relevant context (a seller's product catalogue, past orders, pricing history) before generating a response. This is Retrieval-Augmented Generation (RAG).
The vector database choice: Pinecone for production, with OpenAI's text-embedding-3-small model for generating embeddings. Product descriptions, past order data, and seller preferences are embedded and stored. At query time, the user's question is embedded with the same model and the closest stored vectors are retrieved as context.
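To make "the closest vectors" concrete, here is a minimal sketch of nearest-neighbour retrieval by cosine similarity. It is illustrative only: `toy_embed` is a hypothetical bag-of-words stand-in for text-embedding-3-small, the in-memory list stands in for Pinecone, and the sample documents are made up.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Embed the query, then return the k most similar documents."""
    q = toy_embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]


# Made-up store data, embedded at query time here for brevity.
docs = [
    "handmade leather wallet in brown",
    "order history for ceramic mugs",
    "seller prefers free shipping over discounts",
]
closest = top_k("describe my leather wallet", docs, k=1)
print(closest)
```

A real embedding model captures meaning rather than word overlap, but the retrieval step has the same shape: embed the query, score it against the stored vectors, keep the best matches.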
The architecture: the user sends a message to Aria. The message is embedded. The top-k most similar vectors are retrieved from Pinecone, with the seller's tenant ID as a metadata filter so that only that seller's data can be returned. The retrieved context and the user's message are combined into a prompt. The LLM generates a contextualised response.
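The tenant filter and prompt assembly steps can be sketched as follows. This is an assumption-laden illustration, not Aria's implementation: the record layout, the `retrieve` and `build_prompt` helpers, the prompt wording, and the three-dimensional vectors are all hypothetical, and a plain Python list stands in for Pinecone.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def retrieve(query_vec, records, tenant_id, k=2):
    """Keep only this tenant's records, then return the k closest texts."""
    candidates = [r for r in records if r["tenant_id"] == tenant_id]
    candidates.sort(key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return [r["text"] for r in candidates[:k]]


def build_prompt(context, message):
    """Combine retrieved context and the user's message into one prompt."""
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Context from the seller's store:\n"
        f"{context_block}\n\n"
        f"Seller's message: {message}"
    )


# Made-up records: seller-b's belt is the closest vector overall,
# but the tenant filter must keep it out of seller-a's context.
RECORDS = [
    {"tenant_id": "seller-a", "text": "Leather wallet, listed at $40", "vector": [0.9, 0.1, 0.0]},
    {"tenant_id": "seller-a", "text": "Ceramic mug, listed at $12", "vector": [0.1, 0.9, 0.0]},
    {"tenant_id": "seller-b", "text": "Leather belt, listed at $25", "vector": [0.95, 0.05, 0.0]},
]

query_vec = [1.0, 0.0, 0.0]  # pretend this is the embedded user message
context = retrieve(query_vec, RECORDS, tenant_id="seller-a", k=1)
prompt = build_prompt(context, "What should I charge for my wallet?")
print(prompt)
```

Filtering before ranking is the point of the example: without it, the globally closest vector would leak another seller's data into the prompt.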
The latency budget: embedding (50ms) + vector retrieval (30ms) + LLM generation (1-3s) ≈ 1.1-3.1s end-to-end (1.08-3.08s before rounding), almost all of it LLM generation. Acceptable for a conversational assistant. Not acceptable for a synchronous API response.
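The budget arithmetic is easy to check in a few lines. The stage figures are the ones quoted above; the constant and function names are just illustrative.

```python
# Stage latencies in milliseconds, taken from the budget above.
EMBEDDING_MS = 50
RETRIEVAL_MS = 30
LLM_MS_RANGE = (1000, 3000)  # LLM generation: 1-3s


def end_to_end_ms(llm_ms: int) -> int:
    """Total latency for one request given the LLM generation time."""
    return EMBEDDING_MS + RETRIEVAL_MS + llm_ms


best_ms = end_to_end_ms(LLM_MS_RANGE[0])
worst_ms = end_to_end_ms(LLM_MS_RANGE[1])

# Fraction of the best-case total spent in LLM generation.
llm_share_best = LLM_MS_RANGE[0] / best_ms

print(f"end-to-end: {best_ms / 1000:.2f}s to {worst_ms / 1000:.2f}s")
print(f"LLM share of best case: {llm_share_best:.0%}")
```

Even in the best case the retrieval stages are a small slice of the total, so shaving milliseconds off embedding or vector search does little for perceived latency.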
— Dick Bassey | DevDick | 2026