An astrology chatbot lives or dies on continuity: a user asks about marriage, then follows up with "and when?", then "what about my partner's chart?" The Vedika API query endpoint is stateless by design, so conversation memory is something you build around it. The pattern is straightforward: capture the birth context once, persist a compact summary of what has already been answered, and replay only the relevant slice on each new turn. This guide shows how to do that against the real endpoints.
Why the endpoint is stateless (and why that's good)
Every call to POST /api/v1/astrology/query is scored independently. It takes a natural-language question and a birthDetails object, and it returns a reading. There is no server-side thread, no hidden session, and no implicit memory of the last message. That puts you in control of three things that matter for a production chatbot:
- Data residency and retention — you decide what conversation state is stored, where, and for how long.
- Cost — you decide how much prior context to replay, which directly affects per-query spend.
- Determinism — the same birthDetails always yields the same chart, so memory is about conversation facts, not re-deriving astronomy.
The underlying chart math comes from the open-source XALEN Ephemeris engine. Because it is deterministic, the planetary positions for a given birth moment never change between turns. Your memory layer therefore only has to remember what the user said and what was already explained — not recompute the sky.
The shape of a memory layer
A workable conversation-memory model for an astrology chatbot has three parts:
- Session identity — the immutable birth context, captured once at the start of a session.
- Turn log — the raw question/answer pairs, kept for audit and display.
- Rolling summary — a short, structured digest of resolved facts that you prepend to new questions.
The session identity is the part you must never lose or mutate. Store the exact birthDetails the moment a user provides their birth data:
// Captured once when the session begins
const session = {
  id: "sess_8f21",
  birthDetails: {
    datetime: "1990-08-15T14:30:00",
    latitude: 19.076,
    longitude: 72.8777,
    timezone: "Asia/Kolkata"
  },
  summary: "", // rolling digest of resolved facts
  turns: []    // raw Q/A pairs
};
Reuse session.birthDetails verbatim on every call. If you let the timezone or coordinates drift between turns — for example by re-geocoding a city name each time — the chart can shift and your answers will contradict each other. Capture once, freeze, reuse.
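One way to enforce "capture once, freeze, reuse" in code is to copy and freeze the birth context when the session is created. The sketch below is a minimal illustration; `createSession` and `deepFreeze` are hypothetical helper names, not part of the Vedika API, and it assumes birthDetails is plain JSON data.

```javascript
// Hypothetical helpers: these names are illustrative, not part of the Vedika API.

// Recursively freeze an object so nested fields cannot be reassigned either.
function deepFreeze(obj) {
  for (const value of Object.values(obj)) {
    if (value !== null && typeof value === "object") deepFreeze(value);
  }
  return Object.freeze(obj);
}

function createSession(id, birthDetails) {
  return {
    id,
    // Copy first so later changes to the caller's object cannot leak in,
    // then freeze so the session copy itself cannot be mutated.
    // Assumes birthDetails is plain JSON data (strings and numbers).
    birthDetails: deepFreeze(JSON.parse(JSON.stringify(birthDetails))),
    summary: "", // rolling digest of resolved facts
    turns: []    // raw Q/A pairs
  };
}

// The form input may be edited or re-geocoded later; the session copy will not follow it.
const input = {
  datetime: "1990-08-15T14:30:00",
  latitude: 19.076,
  longitude: 72.8777,
  timezone: "Asia/Kolkata"
};
const session = createSession("sess_8f21", input);
```

Only birthDetails is frozen here; summary and turns stay mutable because they are expected to change on every turn. A write to a frozen field throws in strict mode and ES modules, and is silently ignored in sloppy mode.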
Threading a follow-up question
When a user sends a follow-up like "and when will that happen?", you don't need to resend the entire transcript. Prepend a compact context preamble built from the rolling summary, then ask the new question. The endpoint reads it as a single self-contained query:
curl -X POST https://api.vedika.io/api/v1/astrology/query \
  -H "x-api-key: vk_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Context so far: user is asking about marriage timing; chart already cast for their birth details; 7th house and its lord were discussed. New question: when is marriage most likely?",
    "birthDetails": {
      "datetime": "1990-08-15T14:30:00",
      "latitude": 19.076,
      "longitude": 72.8777,
      "timezone": "Asia/Kolkata"
    }
  }'
In code, a single helper keeps the threading consistent. Notice that birthDetails is always the frozen session copy, and only the question string carries the conversational context:
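async function ask(session, userQuestion, { fast = false } = {}) {
  // Prepend the rolling summary only when there is one; the first turn goes out as-is.
  const context = session.summary
    ? `Context so far: ${session.summary}\n\nNew question: ${userQuestion}`
    : userQuestion;

  const res = await fetch("https://api.vedika.io/api/v1/astrology/query", {
    method: "POST",
    headers: {
      "x-api-key": process.env.VEDIKA_KEY,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      question: context,
      birthDetails: session.birthDetails, // always the frozen session copy
      ...(fast ? { speed: "fast" } : {})
    })
  });

  // Fail loudly on HTTP errors so a failed call is never logged as an answer.
  if (!res.ok) {
    throw new Error(`Query failed with HTTP ${res.status}`);
  }

  const data = await res.json();
  // Log the raw user question, not the context-expanded one, for display and audit.
  session.turns.push({ q: userQuestion, a: data.answer });
  return data;
}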
Use speed: "fast" for short clarifying turns ("which house is that?") and the standard path for substantive readings. Mixing the two within one conversation is fine — the chart inputs are identical, so only the depth of the response changes.
Keeping the summary small
The rolling summary is where conversations stay cheap. After each turn, fold the new exchange into a short digest rather than letting the context grow unbounded. A practical digest records: the user's primary concern, the chart already cast, the houses or periods already covered, and any constraints the user stated ("I only care about career, not health"). Regenerate it with your own LLM agent or a simple template — the goal is to hand the next query a paragraph, not a transcript.
This matters for billing. With per-query pricing in the $0.01–$0.05 range, a chatbot that replays 20 prior turns on every message pays for context it doesn't need. A 200-word summary captures the same continuity at a fraction of the input size.
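The "simple template" option for the digest can be sketched as a small structured object that is folded forward after each turn and rendered to text before the next query. Everything here is an assumption for illustration: the digest shape, the function names `createDigest`, `foldTurn`, and `renderSummary`, and the crude word cap are not part of the Vedika API.

```javascript
// Hypothetical digest shape and helpers; none of these names come from the Vedika API.
function createDigest(concern) {
  return { concern, covered: [], constraints: [] };
}

// Fold one turn's new facts into the digest, de-duplicating so repeated
// topics do not make the summary grow.
function foldTurn(digest, { covered = [], constraints = [] } = {}) {
  return {
    concern: digest.concern,
    covered: [...new Set([...digest.covered, ...covered])],
    constraints: [...new Set([...digest.constraints, ...constraints])]
  };
}

// Render the digest as one short paragraph, hard-capped at maxWords.
// The cap simply drops trailing words; a real system might summarize instead.
function renderSummary(digest, maxWords = 200) {
  const parts = [
    `user is asking about ${digest.concern}`,
    "chart already cast for their birth details"
  ];
  if (digest.covered.length) {
    parts.push(`already covered: ${digest.covered.join(", ")}`);
  }
  if (digest.constraints.length) {
    parts.push(`user constraints: ${digest.constraints.join(", ")}`);
  }
  return parts.join("; ").split(/\s+/).slice(0, maxWords).join(" ");
}

let digest = createDigest("marriage timing");
digest = foldTurn(digest, { covered: ["7th house and its lord"] });
digest = foldTurn(digest, {
  covered: ["7th house and its lord"], // repeated topic, stored once
  constraints: ["not interested in health"]
});
const summary = renderSummary(digest);
```

Assigning the rendered string to session.summary after each turn is enough for the ask helper to pick it up on the next call. Deciding which topics a reading actually covered is left to your own code or LLM agent.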
Streaming for chat-feel responsiveness
For a typing-indicator experience, swap the call to the SSE endpoint POST /api/v1/astrology/query/stream. It accepts the same body and emits the reading incrementally, so your chat UI can render tokens as they arrive instead of waiting for the full response. Your memory logic is unchanged — you still freeze birthDetails and prepend the summary; only the transport differs.
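A streaming variant of the helper might look like the sketch below. The request mirrors the non-streaming call, since the endpoint accepts the same body; the parsing follows the generic SSE wire format (events separated by a blank line, payload on `data:` lines). What each `data:` payload contains is not specified here, so the sketch passes it to the caller as a raw string. `parseSse` and `askStream` are hypothetical names.

```javascript
// Split an SSE text buffer into complete event payloads plus any trailing
// partial event, which the caller keeps until more bytes arrive.
function parseSse(buffer) {
  const blocks = buffer.split(/\r?\n\r?\n/);
  const rest = blocks.pop(); // last piece may be an incomplete event
  const events = [];
  for (const block of blocks) {
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one optional space
      .join("\n");
    if (data) events.push(data);
  }
  return { events, rest };
}

// Hypothetical streaming counterpart of ask(); onChunk receives each raw payload.
async function askStream(session, userQuestion, onChunk) {
  const question = session.summary
    ? `Context so far: ${session.summary}\n\nNew question: ${userQuestion}`
    : userQuestion;

  const res = await fetch("https://api.vedika.io/api/v1/astrology/query/stream", {
    method: "POST",
    headers: {
      "x-api-key": process.env.VEDIKA_KEY,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ question, birthDetails: session.birthDetails })
  });
  if (!res.ok) throw new Error(`Stream failed with HTTP ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { events, rest } = parseSse(buffer);
    buffer = rest; // keep the unfinished tail for the next read
    events.forEach(onChunk);
  }
}
```

Once the stream ends, the caller can join the chunks it rendered and push the result onto session.turns, so the turn log looks the same whichever transport was used.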
Multi-chart conversations
Relationship and synastry questions introduce a second person. Model this as multiple session identities under one conversation, not one giant blob. When a user says "now compare with my partner," attach a second frozen birthDetails and make the active subject explicit in your summary ("primary chart: user; secondary chart: partner"). Because each chart is deterministic, you can switch between them across turns without recomputation drift.
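A minimal sketch of that model keeps a map of frozen charts under one conversation and tracks which one is active. The shape and the helper names (`createConversation`, `addSubject`, `setActive`, `subjectPreamble`, `activeBirthDetails`) are assumptions for illustration, and the partner's birth values are made-up placeholders.

```javascript
// Hypothetical conversation shape holding several frozen charts.
function createConversation(id) {
  return { id, subjects: {}, active: null, summary: "", turns: [] };
}

// Attach a chart under a label. A label can be set only once, so a chart
// cannot be silently replaced mid-conversation.
function addSubject(conversation, label, birthDetails) {
  if (conversation.subjects[label]) {
    throw new Error(`subject "${label}" is already frozen`);
  }
  conversation.subjects[label] = Object.freeze({ ...birthDetails });
  if (conversation.active === null) conversation.active = label;
  return conversation;
}

function setActive(conversation, label) {
  if (!conversation.subjects[label]) {
    throw new Error(`unknown subject "${label}"`);
  }
  conversation.active = label;
}

// Text to place at the front of the rolling summary so every query states
// which chart it concerns. The first subject added is treated as primary.
function subjectPreamble(conversation) {
  const roles = Object.keys(conversation.subjects).map(
    (label, i) => `${i === 0 ? "primary" : "secondary"} chart: ${label}`
  );
  return `${roles.join("; ")}; active chart: ${conversation.active}`;
}

// The frozen birthDetails to send with the next query.
function activeBirthDetails(conversation) {
  return conversation.subjects[conversation.active];
}

const convo = createConversation("conv_01");
addSubject(convo, "user", {
  datetime: "1990-08-15T14:30:00", latitude: 19.076, longitude: 72.8777, timezone: "Asia/Kolkata"
});
addSubject(convo, "partner", { // placeholder values for illustration only
  datetime: "1991-03-02T09:15:00", latitude: 28.6139, longitude: 77.209, timezone: "Asia/Kolkata"
});
setActive(convo, "partner");
```

This sketch sends one chart per query, the active one. How a single query should carry two charts at once is not covered here and should be checked against the API docs.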
If you need raw computed positions to drive your own routing — say, to detect which house a question touches before you phrase the query — call the V2 computation layer at /v2/astrology/*. It uses flat datetime, latitude, longitude, and timezone fields and returns structured chart data you can cache against the session. Vedika exposes 700+ operations across 25 domains, covering Vedic (sidereal), Western (tropical), and KP in one API, so a single conversation can move between systems while reusing the same birth context.
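Because identical birth inputs produce identical positions, structured chart data is a natural thing to cache per session. The sketch below shows only the caching layer: `fetchChart` is an injected, hypothetical function standing in for whichever V2 operation you call, and no specific V2 path or response shape is assumed.

```javascript
// Build a stable cache key from the four flat birth fields.
function chartKey({ datetime, latitude, longitude, timezone }) {
  return [datetime, latitude, longitude, timezone].join("|");
}

// Wrap a chart-fetching function (hypothetical, supplied by the caller) so
// each distinct birth context is fetched at most once.
function createChartCache(fetchChart) {
  const cache = new Map();
  return function getChart(birthDetails) {
    const key = chartKey(birthDetails);
    if (!cache.has(key)) {
      // Store the promise itself so concurrent callers share one request.
      const pending = Promise.resolve(fetchChart(birthDetails));
      // Drop failed lookups so a later call can retry.
      pending.catch(() => cache.delete(key));
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Stub fetcher for illustration; a real one would call the V2 layer.
let stubCalls = 0;
const getChart = createChartCache(async (birthDetails) => {
  stubCalls += 1;
  return { key: chartKey(birthDetails) };
});

const birth = {
  datetime: "1990-08-15T14:30:00", latitude: 19.076, longitude: 72.8777, timezone: "Asia/Kolkata"
};
const firstLookup = getChart(birth);
const secondLookup = getChart({ ...birth }); // same values, different object: cache hit
```

Keying on the field values rather than object identity means the cache still hits if the frozen birthDetails is copied somewhere along the way.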
Honest sourcing in a remembered conversation
Memory raises a subtle integrity question: when a user references something "you said three turns ago," the bot should not invent a justification it never had. Vedika's readings attribute astrological claims to the classical texts practitioners actually train on — works such as the Brihat Parashara Hora Shastra, Phaladeepika, and, for KP, Krishnamurti's readers. When you build your summary, preserve those attributions rather than paraphrasing them away. A remembered fact that drops its source is how chatbots start bluffing; carrying the citation forward keeps follow-ups grounded.
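One way to keep attributions from being paraphrased away is to store each remembered fact alongside its source and render both into the summary. This is a minimal sketch: the `{ claim, source }` shape and `renderFacts` are hypothetical, the example values are placeholders rather than real reading output, and extracting the source from a reading is left to your own summarizer.

```javascript
// Hypothetical fact shape: { claim, source }. A fact with no recorded source
// is marked as such rather than being given an invented one.
function renderFacts(facts) {
  return facts
    .map(({ claim, source }) => `${claim} [${source || "source not recorded"}]`)
    .join("; ");
}

// Placeholder values for illustration only.
const facts = [
  { claim: "7th house and its lord were discussed", source: "Brihat Parashara Hora Shastra" },
  { claim: "user asked about timing" }
];
const attributed = renderFacts(facts);
```

Marking missing sources explicitly gives the next query an honest signal about which remembered facts can be cited and which cannot.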
Key facts
- The query endpoint is stateless — POST /api/v1/astrology/query holds no session; you own conversation memory.
- Freeze birth context once — store birthDetails at session start and reuse it verbatim so the chart never drifts.
- Carry a rolling summary, not a transcript — prepend a short digest to each new question to keep continuity cheap.
- Streaming is a drop-in — /api/v1/astrology/query/stream takes the same body over SSE; memory logic is unchanged.
- Determinism makes memory simple — identical inputs to the XALEN Ephemeris engine produce identical positions every turn.
- Per-query pricing is $0.01–$0.05; trimming replayed context directly reduces spend.
- Prototype free — the sandbox mirrors request and response shapes with no key required.
Where to start
Build the memory layer against the free sandbox first — it accepts the same shapes with no key, so you can validate threading and summarization end to end. When you're ready for live charts, pick a tier on the pricing page (Starter at $12/mo through Enterprise at $240/mo, with a fast path and voice on higher tiers) and check the request and response contracts in the API docs. For a deeper look at how the underlying readings stay source-attributable, see our notes on citation-backed astrology answers.
Frequently asked questions
Does the Vedika astrology query endpoint store conversation history for me?
No. Each call to POST /api/v1/astrology/query is stateless and scored independently. You own the conversation state: store the birthDetails once, persist a short summary of prior turns, and pass the relevant context back in with each new question.
How do I keep the birth chart consistent across many follow-up questions?
Persist the exact birthDetails object when a session starts and reuse it verbatim on every subsequent query. Because the engine is deterministic, identical inputs produce identical planetary positions, so the chart never drifts between turns.
What is a low-cost way to handle long astrology conversations?
Summarize rather than replay. Keep a rolling digest of resolved facts and prepend it as context, and use speed: "fast" for short clarifying turns. Because pricing is per query, trimming redundant context lowers cost.