
Building a voice astrology assistant with an API

A practical guide to wiring speech-to-text, an astrology API, and text-to-speech into a voice assistant that answers birth-chart questions from spoken input.

To build a voice astrology assistant you chain three layers: a speech-to-text step that turns the spoken question into text, an astrology API call that computes the chart and generates an answer, and a text-to-speech step that speaks the reply. The Vedika API sits in the middle as a text-in, text-out service — you POST a question plus birth details to /api/v1/astrology/query and receive a written answer ready to voice. This guide walks through the full loop, the request shapes, latency choices, and the parts of a voice flow that quietly break in production.

The architecture of a voice astrology loop

A voice assistant is not a single API. It is a pipeline, and each stage has its own failure mode. Keeping the astrology computation as a plain text request — rather than something that ingests audio — means you can swap any speech component without rewriting your core logic.

  1. Capture and transcribe. Your client records audio and a speech-to-text layer returns a transcript such as “when will I get married?”
  2. Resolve birth details. You attach the user's stored birthDetails (datetime, latitude, longitude, timezone) to the transcript.
  3. Query the astrology API. You call /api/v1/astrology/query; the engine computes the chart and writes a grounded answer.
  4. Speak the answer. You hand the text to a text-to-speech engine and play it back.

Only step three is Vedika's concern. That separation is deliberate: you stay free to use whatever transcription and synthesis tooling fits your platform, budget, and language coverage.
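
The four stages above can be sketched as one function that takes the speech components as arguments, so either can be swapped without touching the astrology call. This is a minimal sketch, not a reference implementation: run_turn and the three callables (transcribe, query, speak) are hypothetical names standing in for whatever speech-to-text, HTTP, and text-to-speech code you actually use.

```python
from typing import Callable

# Placeholder types for the pluggable stages (names are illustrative only).
Transcribe = Callable[[bytes], str]    # audio bytes -> transcript
Query = Callable[[str, dict], str]     # (question, birthDetails) -> answer text
Speak = Callable[[str], None]          # answer text -> audio playback


def run_turn(audio: bytes, birth_details: dict,
             transcribe: Transcribe, query: Query, speak: Speak) -> str:
    """Run one conversational turn through the four stages."""
    question = transcribe(audio)               # 1. capture and transcribe
    answer = query(question, birth_details)    # 2 and 3. attach details, query
    speak(answer)                              # 4. speak the answer
    return answer


# Stand-in components so the sketch runs without any speech tooling.
spoken = []
result = run_turn(
    b"fake-audio",
    {"datetime": "1992-03-14T07:25:00", "latitude": 18.5204,
     "longitude": 73.8567, "timezone": "Asia/Kolkata"},
    transcribe=lambda audio: "when will I get married?",
    query=lambda question, details: f"(answer to: {question})",
    speak=spoken.append,
)
print(result)
```

Because the speech stages are plain callables, replacing a transcription vendor means passing a different function, and the query stage never sees audio.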

The core request: query with birth details

The main AI endpoint takes a natural-language question and a birthDetails object. This is the call your assistant makes on every conversational turn.

curl -X POST https://api.vedika.io/api/v1/astrology/query \
  -H "x-api-key: vk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "When is a good time for me to change jobs?",
    "speed": "fast",
    "birthDetails": {
      "datetime": "1992-03-14T07:25:00",
      "latitude": 18.5204,
      "longitude": 73.8567,
      "timezone": "Asia/Kolkata"
    }
  }'

The speed: "fast" flag is the one voice-specific tweak in the body. In a spoken exchange, a shorter reply that arrives a second sooner feels far more natural than a longer one padded with a few extra paragraphs. Reserve the default (slower, longer) path for moments when the user explicitly asks for a full reading.
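
One way to apply that rule is to build the request body in a single place and omit the flag only when the user asks for depth. The sketch below assumes that leaving speed out selects the default path, as described above; build_query_body, wants_full_reading, and the phrase list are hypothetical helpers, and real intent detection would likely need more than keyword matching.

```python
# Illustrative phrases only; tune these for your own users and languages.
FULL_READING_PHRASES = ("full reading", "detailed reading", "in detail")


def wants_full_reading(transcript: str) -> bool:
    """Crude keyword check for an explicit request for a longer answer."""
    text = transcript.lower()
    return any(phrase in text for phrase in FULL_READING_PHRASES)


def build_query_body(question: str, birth_details: dict,
                     full_reading: bool = False) -> dict:
    """Build the JSON body for /api/v1/astrology/query."""
    body = {"question": question, "birthDetails": birth_details}
    if not full_reading:
        body["speed"] = "fast"   # conversational turns favour latency
    return body


transcript = "Give me a full reading of my career prospects"
body = build_query_body(transcript, {"timezone": "Asia/Kolkata"},
                        full_reading=wants_full_reading(transcript))
print("speed" in body)
```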

Why birth details are numbers, not a city name

Notice the request carries latitude, longitude, and timezone rather than a place name. Chart math is sensitive to a few arcminutes of position and to the exact UTC offset at the moment of birth, so the API expects resolved coordinates. A voice user will say “I was born in Pune,” not “18.52 north,” which means geocoding belongs in your flow, not in the astrology call — more on that below.
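
Since the astrology call expects resolved values, it can help to check the stored details before sending them. The following is a sketch under stated assumptions: birth_detail_problems is a hypothetical helper, the checks are generic sanity checks (ISO 8601 datetime, coordinate ranges, a non-empty timezone string), and they are not the API's own validation rules.

```python
from datetime import datetime

REQUIRED_FIELDS = ("datetime", "latitude", "longitude", "timezone")


def _is_number(value) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def birth_detail_problems(details: dict) -> list:
    """Return a list of human-readable problems; empty means it looks sane."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS
                if name not in details]
    if problems:
        return problems

    try:
        datetime.fromisoformat(details["datetime"])
    except (TypeError, ValueError):
        problems.append("datetime is not an ISO 8601 string")

    lat, lon = details["latitude"], details["longitude"]
    if not _is_number(lat) or not -90 <= lat <= 90:
        problems.append("latitude must be a number between -90 and 90")
    if not _is_number(lon) or not -180 <= lon <= 180:
        problems.append("longitude must be a number between -180 and 180")

    tz = details["timezone"]
    if not isinstance(tz, str) or not tz.strip():
        problems.append("timezone must be a non-empty IANA name")
    return problems


# A place name where a coordinate belongs is caught before the API call.
print(birth_detail_problems({
    "datetime": "1992-03-14T07:25:00",
    "latitude": "Pune",
    "longitude": 73.8567,
    "timezone": "Asia/Kolkata",
}))
```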

Streaming for a responsive voice experience

Waiting for a complete answer before speaking makes an assistant feel sluggish. The streaming endpoint emits Server-Sent Events as the reply is generated, so you can start synthesising speech from the first full sentence.

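// Generic fetch-based SSE consumer; no vendor SDK required.
// Assumption: each event's "data:" field carries plain answer text. If the
// payload is JSON, parse it and extract the text field before buffering.
async function speakStreamingAnswer(question, birthDetails, onSentence) {
  const res = await fetch(
    "https://api.vedika.io/api/v1/astrology/query/stream",
    {
      method: "POST",
      headers: {
        "x-api-key": process.env.VEDIKA_API_KEY,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ question, speed: "fast", birthDetails }),
    }
  );
  if (!res.ok) throw new Error(`Stream request failed: HTTP ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let raw = "";      // SSE lines not yet parsed
  let buffer = "";   // answer text not yet spoken

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    raw += decoder.decode(value, { stream: true });

    // Strip SSE framing: keep only the payload of complete "data:" lines.
    const lines = raw.split(/\r?\n/);
    raw = lines.pop();                  // last element may be a partial line
    for (const line of lines) {
      if (line.startsWith("data:")) buffer += line.slice(5).replace(/^ /, "");
    }

    // Flush complete sentences to the text-to-speech engine as they arrive.
    // Requiring whitespace after the punctuation avoids splitting "18.52".
    let match;
    while ((match = buffer.match(/^[\s\S]*?[.!?]+(?=\s)/))) {
      onSentence(match[0].trim());      // hand this chunk to your TTS layer
      buffer = buffer.slice(match[0].length);
    }
  }
  if (buffer.trim()) onSentence(buffer.trim());
}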

The pattern is the same in any language: read the byte stream, accumulate text, and emit it to your speech engine one sentence at a time. The assistant begins speaking within a second or two while the rest of the answer is still being written, which is the single biggest perceived-latency win in a voice product.
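
To illustrate that the pattern carries over, here is a minimal Python sketch of the same two steps: strip the Server-Sent Events framing, then emit whole sentences. It assumes each data: line carries plain answer text (the event payload shape is not specified in this guide), and iter_sse_data and iter_sentences are hypothetical helper names.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at ., ! or ? followed by whitespace, so "18.52" stays intact.
_SENTENCE = re.compile(r".*?[.!?]+(?=\s)", re.DOTALL)


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each SSE "data:" line, dropping other framing."""
    for line in lines:
        if line.startswith("data:"):
            payload = line[5:]
            # The SSE format allows one optional space after the colon.
            yield payload[1:] if payload.startswith(" ") else payload


def iter_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Accumulate text chunks and yield complete sentences as they form."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            match = _SENTENCE.match(buffer)
            if match is None:
                break
            yield match.group().strip()
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()   # flush whatever is left when the stream ends


# Simulated stream; in practice the lines would come from the HTTP response.
stream = [
    "data: Saturn moves ",
    ": keep-alive comment",
    "data: on. Expect a shift near 18.52 degrees. ",
    "data: Stay patient",
]
for sentence in iter_sentences(iter_sse_data(stream)):
    print(sentence)   # hand each sentence to your TTS layer here
```

With the requests library, one plausible source of lines is a response opened with stream=True and read through iter_lines(decode_unicode=True); each sentence can then go to the speech engine as soon as it is yielded.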

Handling birth details in a spoken flow

The most common voice bug is not in the astrology call at all — it is in collecting birth data over speech. People rarely state coordinates, and date-of-birth phrasing is messy (“the fourteenth of March, ninety-two”). Solve this once, at onboarding, and every later turn becomes a clean question-only request.

Geocode the city, then cache it

Ask for the birth city by voice, geocode it to a latitude/longitude and an IANA timezone, and store that with the user profile. From then on, your assistant only needs the spoken question.

import os

import requests

VEDIKA_KEY = os.environ["VEDIKA_API_KEY"]

# birth_details is resolved ONCE at onboarding and stored per user.
birth_details = {
    "datetime": "1992-03-14T07:25:00",   # parsed from spoken date + time
    "latitude": 18.5204,                  # from geocoding "Pune"
    "longitude": 73.8567,
    "timezone": "Asia/Kolkata",
}

def ask(question: str) -> str:
    r = requests.post(
        "https://api.vedika.io/api/v1/astrology/query",
        headers={"x-api-key": VEDIKA_KEY},
        json={"question": question, "speed": "fast",
              "birthDetails": birth_details},
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["answer"]   # feed this string to your TTS engine

transcript = "will this year be good for my career?"   # from speech-to-text
print(ask(transcript))
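
The onboarding step that produces birth_details could look roughly like the sketch below. It is illustrative only: geocode is a hypothetical callable wrapping whichever geocoding service you choose, assumed to return a latitude, a longitude, and an IANA timezone name, and the profiles dictionary stands in for your real user store.

```python
from typing import Callable, Dict, Tuple

# Hypothetical signature: city name -> (latitude, longitude, IANA timezone).
Geocoder = Callable[[str], Tuple[float, float, str]]


def onboard_user(profiles: Dict[str, dict], user_id: str,
                 birth_datetime: str, city: str, geocode: Geocoder) -> dict:
    """Resolve birth details once and store them with the user profile."""
    if user_id not in profiles:
        latitude, longitude, timezone = geocode(city)   # the only lookup
        profiles[user_id] = {
            "datetime": birth_datetime,
            "latitude": latitude,
            "longitude": longitude,
            "timezone": timezone,
        }
    return profiles[user_id]


def fake_geocode(city: str) -> Tuple[float, float, str]:
    """Stand-in geocoder so the sketch runs offline."""
    return {"pune": (18.5204, 73.8567, "Asia/Kolkata")}[city.strip().lower()]


profiles: Dict[str, dict] = {}
details = onboard_user(profiles, "user-1", "1992-03-14T07:25:00",
                       "Pune", fake_geocode)
print(details["timezone"])
```

Later turns read profiles[user_id] directly, so the geocoder is not called again for that user.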

Confirm ambiguous dates before you compute

Spoken dates are easy to mishear. A good voice flow reads the parsed date back — “That's the 14th of March, 1992, correct?” — before the first chart query. An incorrect birth time changes the rising sign and house cusps, so a one-line confirmation prevents an entire reading from being quietly wrong.
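
A read-back prompt like the one quoted above can be generated from the parsed date. This is a small sketch with hypothetical helper names (ordinal, confirmation_prompt); it covers only the date, and a real flow would confirm the birth time the same way.

```python
from datetime import date

# Spelled out explicitly so the prompt does not depend on the system locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def ordinal(day: int) -> str:
    """Return the day with its English ordinal suffix, e.g. 14 -> '14th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def confirmation_prompt(parsed: date) -> str:
    """Build the sentence the assistant speaks before the first chart query."""
    month = MONTHS[parsed.month - 1]
    return f"That's the {ordinal(parsed.day)} of {month}, {parsed.year}, correct?"


print(confirmation_prompt(date(1992, 3, 14)))
```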

Computation endpoints when you need raw data, not prose

Sometimes a voice turn needs a fact, not a paragraph — “what's my moon sign?” The V2 computation routes return structured chart data you can phrase yourself, which keeps very short answers crisp and fast.

V2 endpoints under /v2/astrology/* take a flat body (datetime, latitude, longitude, timezone) rather than a nested birthDetails object. They expose the same engine used by the AI query path. Across the platform there are 700+ operations spanning 25 domains (704 enumerated as of June 2026), so most discrete facts a voice user might request have a direct computation route.
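
If you already store a nested birthDetails object for the query endpoint, the flat V2 body can be derived from it. The sketch below shows only that conversion; to_v2_body is a hypothetical helper, and no request is sent because this guide does not name a specific route under /v2/astrology/*.

```python
V2_FIELDS = ("datetime", "latitude", "longitude", "timezone")


def to_v2_body(birth_details: dict) -> dict:
    """Copy the four birth fields into a flat top-level body."""
    return {name: birth_details[name] for name in V2_FIELDS}


stored = {
    "datetime": "1992-03-14T07:25:00",
    "latitude": 18.5204,
    "longitude": 73.8567,
    "timezone": "Asia/Kolkata",
}
query_body = {"question": "what's my moon sign?", "birthDetails": stored}  # AI path
v2_body = to_v2_body(stored)                                               # V2 path
print(v2_body)
```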

All chart math runs on the XALEN Ephemeris, Vedika's own open-source engine (Apache-2.0, published to crates.io, PyPI, and npm). It is validated against JPL DE440 and swetest reference data, and no chart deviated by more than 0.1° of arc in a five-million-chart test. That figure describes astronomical precision, meaning the accuracy of planetary positions, and it gives your spoken answers a dependable computational foundation underneath the interpretation layer.

Three systems, thirty languages, one assistant

A voice assistant often serves users who expect different traditions. The same API resolves Vedic (sidereal), Western (tropical), and KP from one request surface, alongside Jaimini, Tajaka, Lal Kitab, and numerology. You can let the user pick their tradition by voice and route accordingly without integrating separate providers.

For spoken interfaces, language coverage matters as much as system coverage. The platform generates answers in 30 languages, including 14 Indic languages, so a Tamil or Hindi voice assistant can receive a reply in the same language it asked in — no separate translation hop that would add latency and drift.

Letting an LLM agent drive the calls

If your assistant is built around a function-calling model or runs inside an MCP-compatible client, you can skip hand-written HTTP. Vedika publishes a public astrology MCP server (npx @vedika-io/mcp-server) exposing 36 tools. An LLM agent can then invoke chart operations as tools, deciding which computation a spoken question needs and stitching the result into its reply.

Putting the pieces in place

A working voice astrology assistant is mostly orchestration: capture audio, transcribe, attach cached birth details, call the query endpoint (streaming for responsiveness), and synthesise the reply. The astrology API deliberately stays a text service so your voice stack remains yours to choose and replace. The hard parts — chart computation across three systems, grounded interpretation, multilingual output — are handled behind one request shape.

You can prototype the entire request flow against the free sandbox with no key, read the full request and response shapes in the docs, and check per-query costs on the pricing page. For background on the interpretation layer that turns chart data into spoken answers, see how to evaluate an astrology API.

FAQ

Do I need to send audio to the astrology API directly?

No. The API works with text and structured birth details. You transcribe speech to text yourself, send the question plus birthDetails to /api/v1/astrology/query, then pass the returned answer to a text-to-speech engine. The API stays text-in, text-out, which keeps it portable across any voice stack.

Can the assistant stream a spoken answer instead of waiting?

Yes. Use /api/v1/astrology/query/stream, which sends Server-Sent Events as the answer is generated. Buffer tokens into sentence chunks and feed each to your speech engine so the assistant starts speaking within a second or two.

How do I handle coordinates a voice user never speaks?

Collect the birth city by voice, geocode it once to coordinates and a timezone, and store it with the user profile. Later turns only need the spoken question because birthDetails are already on file.

Which speed setting should a voice assistant use?

Pass speed: "fast" on conversational turns for lower latency. Drop it only when the user explicitly asks to hear a longer, detailed reading.

Build on the Vedika astrology API

700+ operations, Vedic + Western + KP, 30 languages, an open-source XALEN ephemeris, and a built-in LLM. Free sandbox — no signup.
