Streaming STT Quickstart

This tutorial shows how to connect to the Truebar streaming API for speech-to-text (STT) without relying on any additional frameworks. You will authenticate, open a WebSocket pipeline, stream audio, and print interim and final transcripts.

All commands assume the following environment variables are set:

export TRUEBAR_USERNAME="alice@example.com"
export TRUEBAR_PASSWORD="super-secret"
export TRUEBAR_CLIENT_ID="truebar-client"
export TRUEBAR_AUTH_URL="https://auth.true-bar.si/realms/truebar/protocol/openid-connect/token"
export TRUEBAR_STT_WS_URL="wss://api.true-bar.si/api/pipelines/stream"
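
Because the sample reads these variables at run time, a missing one tends to surface later as a less obvious authentication or connection error. A minimal sketch of an up-front check follows; the helper name `missingEnv` is hypothetical and not part of any Truebar tooling. `TRUEBAR_CLIENT_ID` is left out of the list because the sample script falls back to `truebar-client` when it is unset.

```javascript
// Variables the sample script reads without a fallback value.
const REQUIRED_VARS = [
  "TRUEBAR_USERNAME",
  "TRUEBAR_PASSWORD",
  "TRUEBAR_AUTH_URL",
  "TRUEBAR_STT_WS_URL",
];

// Hypothetical helper: returns the names that are unset or blank in `env`.
function missingEnv(env, names = REQUIRED_VARS) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Example with a partial environment object (values are placeholders).
console.log(missingEnv({ TRUEBAR_USERNAME: "alice@example.com", TRUEBAR_PASSWORD: "" }));
// [ 'TRUEBAR_PASSWORD', 'TRUEBAR_AUTH_URL', 'TRUEBAR_STT_WS_URL' ]
```

In a real script you would call `missingEnv(process.env)` before requesting a token and stop if the result is non-empty.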

Swap the hostnames if you are targeting a playground or bespoke environment.

1. Prepare audio

Truebar expects mono 16 kHz PCM; the samples send it as raw 16-bit little-endian samples (s16le). Convert any existing WAV or MP3 file before running them:

ffmpeg -i sample.wav -ac 1 -ar 16000 -f s16le sample.pcm
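
A quick sanity check on the converted file is to compare its size with the duration you expect: 16 kHz mono audio with 16-bit samples occupies 32,000 bytes per second. The sketch below shows the arithmetic; the function name is illustrative.

```javascript
// Raw s16le mono audio: bytes = seconds * sampleRate * bytesPerSample.
function pcmDurationSeconds(byteLength, sampleRate = 16000, bytesPerSample = 2) {
  if (byteLength % bytesPerSample !== 0) {
    throw new Error("Byte length is not a whole number of samples");
  }
  return byteLength / (sampleRate * bytesPerSample);
}

console.log(pcmDurationSeconds(320000)); // 10
```

For a real file, pass the size reported by the filesystem, for example `statSync("sample.pcm").size` from `node:fs`. A result that is half or double the expected duration usually points to a different sample rate or channel count in the conversion.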

You can also capture audio from the microphone; the browser-focused guides cover that flow in detail.

ASR tag

Before you run the samples, set TRUEBAR_ASR_TAG to the tag of the online ASR stage you want to use (list the available stages with GET /api/pipelines/stages, or copy the tag from your existing .env.truebar). The script falls back to KALDI:en-US:*:*, which works only if a matching stage exists in your tenant.
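
The default tag suggests a four-part, colon-separated shape in which `*` appears to act as a wildcard. Assuming that shape holds for your stages (this page does not name the four fields, so the sketch refers to them by position only), a quick format check can catch a mistyped tag before the server rejects it. The function name `parseAsrTag` is hypothetical.

```javascript
// Assumption: tags have exactly four non-empty, colon-separated segments,
// as in "KALDI:en-US:*:*". The meaning of each segment is not documented here.
function parseAsrTag(tag) {
  const segments = tag.split(":");
  if (segments.length !== 4 || segments.some((s) => s === "")) {
    throw new Error(`Unexpected tag format: ${tag}`);
  }
  return {
    segments,
    // Zero-based positions of the segments that are "*".
    wildcardPositions: segments.flatMap((s, i) => (s === "*" ? [i] : [])),
  };
}

console.log(parseAsrTag("KALDI:en-US:*:*").wildcardPositions); // [ 2, 3 ]
```

The authoritative list of valid tags is still whatever GET /api/pipelines/stages returns for your tenant; this check only guards against typos.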


2. Run the sample

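Install the dependencies and run the script. Because the script uses ES module imports and top-level await, Node.js must load it as an ES module: either set "type": "module" in package.json or save the file as stt.mjs.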

npm install ws axios
node stt.js
stt.js
import axios from "axios";
import WebSocket from "ws";
import { readFileSync } from "node:fs";

// Join tokens into text; left- and right-handed tokens suppress the adjacent space.
const tokensToText = (tokens) => {
  let output = "";
  let prevRight = false;
  tokens?.forEach((token, index) => {
    const text = token?.text ?? "";
    if (!text) return;
    const left = Boolean(token?.isLeftHanded);
    if (index > 0 && !prevRight && !left) {
      output += " ";
    }
    output += text;
    prevRight = Boolean(token?.isRightHanded);
  });
  return output;
};

// Exchange the username and password for an access token.
async function fetchToken() {
  const form = new URLSearchParams({
    grant_type: "password",
    username: process.env.TRUEBAR_USERNAME,
    password: process.env.TRUEBAR_PASSWORD,
    client_id: process.env.TRUEBAR_CLIENT_ID ?? "truebar-client",
  });

  const { data } = await axios.post(process.env.TRUEBAR_AUTH_URL, form, {
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
  });

  return data.access_token;
}

const token = await fetchToken();
const ws = new WebSocket(process.env.TRUEBAR_STT_WS_URL, {
  headers: { Authorization: `Bearer ${token}` },
});
const pcm = readFileSync("sample.pcm");
const chunkSize = 3200 * 2; // 6400 bytes = 3200 16-bit samples = 200 ms @ 16 kHz
let streamed = false;

ws.on("message", (payload, isBinary) => {
  if (isBinary) return;
  const msg = JSON.parse(payload.toString());

  if (msg.type === "STATUS") console.log("STATUS:", msg.status);
  if (msg.type === "TEXT_SEGMENT") {
    const text = tokensToText(msg.textSegment.tokens);
    console.log(msg.textSegment.isFinal ? "FINAL" : "INTERIM", "-", text);
  }

  // Once the pipeline is configured, stream the whole file, then signal end of stream.
  if (msg.type === "STATUS" && msg.status === "CONFIGURED" && !streamed) {
    streamed = true;
    for (let offset = 0; offset < pcm.length; offset += chunkSize) {
      ws.send(pcm.subarray(offset, offset + chunkSize));
    }
    ws.send(JSON.stringify({ type: "EOS", lockSession: false }));
  }

  if (msg.type === "STATUS" && msg.status === "FINISHED") {
    ws.close();
  }
});

// Send the pipeline configuration as soon as the first server message arrives.
ws.once("message", () => {
  ws.send(
    JSON.stringify({
      type: "CONFIG",
      pipeline: [
        {
          task: "ASR",
          exceptionHandlingPolicy: "THROW",
          config: {
            tag: process.env.TRUEBAR_ASR_TAG ?? "KALDI:en-US:*:*",
            parameters: { enableInterims: true },
          },
        },
      ],
    }),
  );
});

ws.on("open", () => console.log("STT stream connected"));
ws.on("close", () => console.log("STT stream closed"));
ws.on("error", (err) => console.error("STT error", err));
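
The `tokensToText` helper in the script joins tokens with a space unless one side opts out: a token flagged `isLeftHanded` attaches to the token before it, and a token flagged `isRightHanded` attaches to the token after it. The standalone sketch below reproduces that rule and runs it on invented tokens, which are for illustration only and not captured from a real response.

```javascript
// Same joining rule as the sample script, reproduced so this runs on its own.
function tokensToText(tokens) {
  let output = "";
  let prevRight = false;
  tokens?.forEach((token, index) => {
    const text = token?.text ?? "";
    if (!text) return;
    const left = Boolean(token?.isLeftHanded);
    if (index > 0 && !prevRight && !left) output += " ";
    output += text;
    prevRight = Boolean(token?.isRightHanded);
  });
  return output;
}

// Invented tokens: closing punctuation is left-handed, the opening bracket right-handed.
const sampleTokens = [
  { text: "Hello" },
  { text: ",", isLeftHanded: true },
  { text: "world" },
  { text: "(", isRightHanded: true },
  { text: "test" },
  { text: ")", isLeftHanded: true },
];

console.log(tokensToText(sampleTokens)); // Hello, world (test)
```
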
Session cleanup

Always close the stream with {"type": "EOS", "lockSession": false}. Switch lockSession to true only when you intend to resume the same session later; otherwise keep it false so Truebar frees the pipeline immediately.
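
The order the script follows (binary audio chunks first, then a single EOS text frame) can be isolated into a generator, which also makes the chunk arithmetic easy to check. This is a sketch under the assumptions already used in the sample: 16-bit mono PCM at 16 kHz and the EOS payload shown above. The names `framesFor` and `chunkMs` are hypothetical; the 200 ms default yields the same 6400-byte chunks the script sends.

```javascript
// Yields binary audio chunks followed by the closing EOS message.
// Bytes per chunk = samples per chunk * 2 bytes per 16-bit sample.
function* framesFor(pcm, { chunkMs = 200, sampleRate = 16000, lockSession = false } = {}) {
  const chunkBytes = Math.round((sampleRate * chunkMs) / 1000) * 2;
  for (let offset = 0; offset < pcm.length; offset += chunkBytes) {
    yield pcm.subarray(offset, offset + chunkBytes);
  }
  yield JSON.stringify({ type: "EOS", lockSession });
}

// One second of silence (32000 bytes) splits into five 6400-byte chunks plus EOS.
const demoFrames = [...framesFor(Buffer.alloc(32000))];
console.log(demoFrames.length); // 6
console.log(demoFrames[5]); // {"type":"EOS","lockSession":false}
```

Inside the script's CONFIGURED handler, `for (const frame of framesFor(pcm)) ws.send(frame);` could stand in for the loop and the explicit EOS send, since `ws` sends Buffers as binary frames and strings as text frames.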

3. Next steps

  • Need microphone capture, diarisation, or browser-specific logic? Continue with the Streaming STT guide.
  • To record transcriptions, query the History API after closing the session.
  • Ready for synthesis? Jump to the Streaming TTS quickstart.