For an astrology AI, the right architecture is a split one: use deterministic computed data for every numeric and positional fact, and use retrieval-augmented generation (RAG) only for the interpretive language. A model that recalls planetary positions from its training data drifts by degrees; a model with no retrieval layer invents citations. Vedika separates these concerns explicitly — chart math is computed before any language model runs, and interpretation is grounded in real classical texts via retrieval.
The two failure modes you are choosing between
When developers first wire a language model to an astrology use case, they hit one of two walls. The first is numeric hallucination: ask a model where the Moon was at a given birth time and it confidently returns a longitude that is wrong by several degrees, which is enough to flip a sign or a house. The second is doctrinal fabrication: ask why a placement matters and the model paraphrases a plausible-sounding rule that no classical text actually states, sometimes attaching an invented verse number.
These are different problems with different solutions. The first is solved by never letting the model compute positions — you compute them deterministically and hand them over. The second is solved by retrieval: pulling the actual passage the interpretation should rest on, so the model paraphrases something real.
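To make the first failure mode concrete, here is a minimal sketch of how a few degrees of drift changes a sign. The function names and the two longitudes are hypothetical illustrations, not output from any ephemeris.

```javascript
// Zodiac signs in order, each spanning 30 degrees of ecliptic longitude.
const SIGNS = [
  "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
  "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
];

// Normalize any angle into the range [0, 360).
function normalizeDegrees(deg) {
  return ((deg % 360) + 360) % 360;
}

// Map an ecliptic longitude to the name of the sign that contains it.
function signOf(longitude) {
  return SIGNS[Math.floor(normalizeDegrees(longitude) / 30)];
}

// Hypothetical values: a computed longitude and a "recalled" one 3 degrees off.
const computedLongitude = 88.4;
const recalledLongitude = computedLongitude + 3.0;

console.log(signOf(computedLongitude)); // Gemini
console.log(signOf(recalledLongitude)); // Cancer
```

A three-degree error is harmless in the middle of a sign and decisive near a boundary, which is why the position has to be computed rather than recalled.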
Why "just use a bigger model" does not fix it
Scale reduces but does not eliminate numeric drift, and it does nothing for source attribution — a larger model is simply more fluent at producing a citation that does not exist. The fix is architectural, not a matter of model size. Positions belong to an ephemeris; doctrine belongs to a corpus; the language model belongs on top, reasoning over both.
Where computed data is non-negotiable
Anything that has a single correct answer should be computed, not generated. In an astrology system, that covers a large surface area, including:
- Planetary longitudes, latitudes, speeds, and retrograde state
- Ascendant and house cusps under the chosen house system
- Divisional (varga) charts derived from the base positions
- Dasha and antardasha period boundaries and dates
- Detected yogas and combinations, which depend on exact placements
- Sidereal-versus-tropical conversion for Vedic versus Western
Vedika computes all of this with the XALEN Ephemeris, an open-source engine (Apache-2.0, published to crates.io as xalen, PyPI as xalen, and npm as @xalen/wasm) with roughly 2,200 tests. Its positions were validated against JPL DE440 and the swetest reference, with zero charts deviating beyond 0.1° across a reproducible JPL DE440 benchmark run. That figure is astronomical precision of the position math — it is not a claim about the correctness of any astrological interpretation, and it is not an endorsement by any space agency.
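Several items in the list above are mechanical derivations from base positions, which is what makes them computable. The sketch below is not the XALEN API; the function names are hypothetical, and the ayanamsa is passed in as a parameter because its value depends on the date and the chosen ayanamsa system. It shows a tropical-to-sidereal conversion and a navamsa (D9) sign lookup using the common rule that each sign is split into nine parts of 3°20', counted continuously from Aries.

```javascript
const SIGN_NAMES = [
  "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
  "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
];

// Wrap an angle into [0, 360).
function wrap360(deg) {
  return ((deg % 360) + 360) % 360;
}

// Sidereal longitude = tropical longitude minus the ayanamsa for that date.
// The ayanamsa must come from the ephemeris; it is an input here, not a constant.
function toSidereal(tropicalLongitude, ayanamsa) {
  return wrap360(tropicalLongitude - ayanamsa);
}

// Navamsa sign: 108 parts of 3 deg 20 min around the zodiac, cycling Aries..Pisces.
function navamsaSign(siderealLongitude) {
  const part = Math.floor((wrap360(siderealLongitude) * 9) / 30); // 0..107
  return SIGN_NAMES[part % 12];
}

// Illustrative inputs only: 24 is a placeholder ayanamsa, not a computed one.
console.log(toSidereal(10, 24)); // 346
console.log(navamsaSign(45));    // Taurus (15 deg Taurus falls in its fifth navamsa)
```

Because both functions are pure, the same base longitude always yields the same derived values, which is the property the rest of the stack depends on.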
What the structured output looks like
You can call the computation layer directly when you want the numbers without narrative. The V2 endpoints take flat birth parameters and return structured data you can render or interpret yourself.
curl -X POST https://api.vedika.io/v2/astrology/chart \
  -H "x-api-key: vk_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "datetime": "1990-08-15T14:30:00",
    "latitude": 19.0760,
    "longitude": 72.8777,
    "timezone": "Asia/Kolkata",
    "system": "vedic"
  }'
Because these positions are deterministic, the same input always yields the same chart — which is exactly the property you want under a language model. The model never has to "remember" where Saturn was; it is told.
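One practical consequence of determinism is that chart results can be cached on the client by their inputs. The sketch below is an assumption about how an integrator might do that, not a description of how the service works: `chartCacheKey` is a hypothetical helper, and rounding coordinates to four decimal places is an arbitrary choice made for illustration.

```javascript
// Build a canonical key for a chart request: fixed field order, normalized numbers.
// Two requests that describe the same birth data produce the same key.
function chartCacheKey(params) {
  const { datetime, latitude, longitude, timezone, system } = params;
  return JSON.stringify([
    datetime,
    Number(latitude).toFixed(4),  // assumption: 4 decimal places is precise enough
    Number(longitude).toFixed(4),
    timezone,
    system,
  ]);
}

const requestA = {
  datetime: "1990-08-15T14:30:00",
  latitude: 19.0760,
  longitude: 72.8777,
  timezone: "Asia/Kolkata",
  system: "vedic",
};

// Same data, different property order and a string latitude.
const requestB = {
  system: "vedic",
  timezone: "Asia/Kolkata",
  longitude: 72.8777,
  latitude: "19.076",
  datetime: "1990-08-15T14:30:00",
};

console.log(chartCacheKey(requestA) === chartCacheKey(requestB)); // true
```

Caching of this kind is only safe because the computed layer is deterministic; it would not be appropriate for the generated prose.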
Where RAG earns its place
Interpretation is the part that should be generated, but not freely. The job of retrieval here is to constrain the language model to doctrine that actually exists. Vedika ties interpretive statements to the texts practitioners are genuinely trained from — Brihat Parashara Hora Shastra and Phaladeepika for Vedic, the KP Readers for Krishnamurti Paddhati, Ptolemy's Tetrabiblos for Western foundations, and similar primary sources for Jaimini and Tajaka work. Blog summaries and generic web text are not part of the grounding corpus, because the whole point is attributability.
The retrieval step matters because it changes what the model is allowed to say. Instead of "a strong Jupiter generally brings wisdom" pulled from training-data vapor, the system surfaces the passage that grounds a specific claim about Jupiter's dignity in a specific house, and the model paraphrases that. The difference is the difference between a plausible answer and a defensible one.
RAG without computed grounding is still dangerous
It is tempting to think retrieval alone is enough — just retrieve interpretive text and let the model write. But if the underlying chart is wrong, the retrieval is grounded against the wrong placement. You will get a beautifully sourced interpretation of a Moon that was never in that sign. This is why the computed layer must run first and feed the retrieval and generation steps. Correct numbers in, grounded interpretation out.
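The ordering described above can be sketched as three steps. Everything here is a hypothetical stand-in: `computeChart` returns a hard-coded placeholder instead of calling an ephemeris, the corpus holds placeholder strings rather than real passages, and the final step only assembles the prompt a model would receive.

```javascript
// Hypothetical stand-in for the deterministic layer. A real system would call
// an ephemeris here; the placement below is a placeholder, not a computed value.
function computeChart(birthDetails) {
  return { placements: [{ planet: "Moon", sign: "Cancer", house: 4 }] };
}

// Hypothetical corpus keyed by "planet|sign". The source and text fields are
// placeholders, not quotations from any classical work.
const CORPUS = new Map([
  ["Moon|Cancer", { source: "<text, chapter, verse>", text: "<retrieved passage>" }],
]);

// Retrieval is keyed on computed placements, never on anything the model said.
function retrievePassages(chart) {
  const hits = [];
  for (const p of chart.placements) {
    const passage = CORPUS.get(`${p.planet}|${p.sign}`);
    if (passage) hits.push({ placement: p, passage });
  }
  return hits;
}

// Assemble what the language model would be given: facts first, then sources.
function buildPrompt(question, chart, hits) {
  const facts = chart.placements
    .map((p) => `${p.planet} in ${p.sign}, house ${p.house}`)
    .join("; ");
  const sources = hits
    .map((h) => `[${h.passage.source}] ${h.passage.text}`)
    .join("\n");
  return `Facts: ${facts}\nSources:\n${sources}\nQuestion: ${question}`;
}

function answerInputs(question, birthDetails) {
  const chart = computeChart(birthDetails);  // 1. numbers first
  const hits = retrievePassages(chart);      // 2. retrieval follows the computed chart
  return buildPrompt(question, chart, hits); // 3. generation sees both
}

console.log(answerInputs("What does my Moon placement indicate?", {}));
```

If step 1 returned the wrong sign, step 2 would faithfully retrieve passages for the wrong placement, which is the failure this section warns about.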
How the layers compose in one request
The AI query endpoint stitches the two halves together. You send a natural-language question plus structured birth details; the service computes the chart deterministically, retrieves the relevant grounded passages, and generates an answer that reasons over both.
curl -X POST https://api.vedika.io/api/v1/astrology/query \
  -H "x-api-key: vk_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What does my chart say about career timing this year?",
    "birthDetails": {
      "datetime": "1990-08-15T14:30:00",
      "latitude": 19.0760,
      "longitude": 72.8777,
      "timezone": "Asia/Kolkata"
    }
  }'
For latency-sensitive flows, add "speed": "fast" to route to Vedika Swift; omit it for the deeper Vedika Pro Ultra path. If you are streaming the answer into a UI, post to /api/v1/astrology/query/stream and read the Server-Sent Events as they arrive:
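// Assumes Node 18+ (global fetch) in an ES module, where top-level await is allowed.
const res = await fetch("https://api.vedika.io/api/v1/astrology/query/stream", {
  method: "POST",
  headers: {
    "x-api-key": "vk_live_your_key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    question: "Summarize my Saturn return.",
    birthDetails: {
      datetime: "1990-08-15T14:30:00",
      latitude: 19.076,
      longitude: 72.8777,
      timezone: "Asia/Kolkata",
    },
  }),
});

// Fail early on a non-2xx response instead of streaming an error body.
if (!res.ok) {
  throw new Error(`Stream request failed with status ${res.status}`);
}

// Read the response body chunk by chunk as the events arrive.
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries.
  process.stdout.write(decoder.decode(value, { stream: true }));
}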
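The loop above prints raw bytes, including the `data:` framing. If you want the event payloads themselves, a small parser for the standard Server-Sent Events framing is enough. This is a sketch under assumptions: `parseSseBuffer` is a hypothetical helper, it handles only `data:` fields with LF or CRLF line endings, and it treats each payload as an opaque string because the shape of the stream's payloads is not specified here.

```javascript
// Split buffered SSE text into complete events plus any trailing partial event.
// Events end at a blank line; an event's data is its "data:" lines joined by "\n".
function parseSseBuffer(buffer) {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop(); // incomplete until its terminating blank line arrives
  const events = [];
  for (const block of blocks) {
    const dataLines = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("data:")) {
        // The SSE format drops one optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return { events, rest };
}

// Simulated chunks that split an event in the middle, as a network might.
let pending = "";
const received = [];
for (const chunk of ["data: Sat", "urn\n\ndata: return\n\nda"]) {
  const { events, rest } = parseSseBuffer(pending + chunk);
  pending = rest;
  received.push(...events);
}

console.log(received); // [ 'Saturn', 'return' ]
console.log(pending);  // da
```

In the streaming loop, you would pass `pending + decoder.decode(value, { stream: true })` to the parser on each iteration and keep the returned `rest` for the next chunk.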
The contract for you as an integrator is simple: the numbers in the response come from computation, and the prose is grounded against real texts. You do not have to trust the model's memory for either.
Choosing your architecture
If you are building this yourself, the decision table below maps each kind of output to the layer that should produce it.
| Output | Produce with | Why |
|---|---|---|
| Planet positions, cusps, dashas, yogas | Deterministic computation | One correct answer; drift is unacceptable |
| "Why does this placement matter?" | RAG over classical sources | Must be attributable, not invented |
| Multi-factor synthesis and timing narrative | LLM over computed + retrieved inputs | Reasoning, grounded in correct facts |
| Source attribution / citations | Retrieval corpus, never the model alone | A model will fabricate a plausible verse |
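The last row of the table can be enforced mechanically: before a draft answer is shown, check every citation it contains against the passages that were actually retrieved. The sketch below assumes citations have already been extracted from the draft as strings; the function name and the citation identifiers are hypothetical placeholders.

```javascript
// Return every citation in the draft that was not among the retrieved passages.
function unsupportedCitations(draftCitations, retrievedPassages) {
  const allowed = new Set(retrievedPassages.map((p) => p.citation));
  return draftCitations.filter((c) => !allowed.has(c));
}

// Placeholder identifiers; real ones would come from the retrieval corpus.
const retrieved = [{ citation: "source-A:12.3" }, { citation: "source-B:4.1" }];
const draft = ["source-A:12.3", "source-C:9.9"];

// Only source-C:9.9 is flagged, because nothing retrieved backs it.
console.log(unsupportedCitations(draft, retrieved));
```

A non-empty result means the model produced a reference on its own, so the answer should be regenerated or the citation stripped before it reaches the user.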
Letting an AI agent call it directly
If your product is itself an LLM agent or an MCP-compatible client, you do not have to hand-write these calls. Vedika publishes a public astrology MCP server (npx @vedika-io/mcp-server, 36 tools) so a function-calling model can request a computed chart or a grounded reading as a tool invocation. The same separation holds: the tool returns computed positions and grounded interpretation, and your agent reasons over the result rather than guessing.
Key facts
- Computed data answers every numeric and positional question; RAG answers interpretive ones. Production systems need both.
- Vedika computes charts with the XALEN Ephemeris (Apache-2.0; crates.io/xalen, PyPI xalen, npm @xalen/wasm; ~2,200 tests), validated against JPL DE440 and swetest with zero charts beyond 0.1° over a 5M-chart run. This is an astronomical-precision figure, not an interpretation claim.
- Interpretation is grounded against classical sources practitioners are trained from: BPHS, Phaladeepika, the KP Readers, Ptolemy's Tetrabiblos, and similar primary texts.
- One API spans 700+ operations across 25 domains (704 enumerated, June 2026), covering Vedic (sidereal), Western (tropical), KP, Jaimini, Tajaka, Lal Kitab, and numerology, in 30 languages including 14 Indic.
- AI query: POST /api/v1/astrology/query (streaming at /api/v1/astrology/query/stream). Raw computation: /v2/astrology/*. Auth header x-api-key: vk_live_*. Base URL https://api.vedika.io.
- Plans start at $12/mo (Starter); per-query cost runs $0.01–$0.05. A free sandbox needs no key.
Try it
You can exercise the computed and grounded layers without a key in the free sandbox, read the endpoint contracts in the docs, and compare plans on the pricing page. For a deeper look at how facts are pinned before the model runs, see grounding astrology LLM output.
FAQ
Should an astrology AI use RAG or computed data?
Both, for different jobs. Compute everything numeric or positional, because those facts have one correct answer. Use RAG for interpretive text so the model paraphrases real doctrine instead of inventing it. Skipping computation gives you position drift; skipping retrieval gives you fabricated citations.
Why can't a language model just compute the chart itself?
Positions need an ephemeris plus exact time-zone and coordinate handling. Models approximate from training data and land degrees off, which flips signs, houses, and dasha timing. Vedika computes positions with the XALEN Ephemeris before any model sees the chart.
Can I get the computed data without the interpretation?
Yes — the /v2/astrology/* endpoints return structured longitudes, cusps, divisional charts, dashas, and yogas. Render or interpret them yourself, or call the AI query endpoint for grounded narrative over the same computed layer.
How does grounding stay attributable?
Interpretive claims are tied to texts practitioners actually train from — BPHS, Phaladeepika, the KP Readers, Tetrabiblos — retrieved at request time. The model paraphrases retrieved passages rather than generating citations from memory.