Offline Pipeline API
Truebar's offline API lets you run the same pipelines used for streaming, but as HTTP jobs. It is ideal for large audio archives, background text clean-up, and TTS workloads where real-time latency is not required.
#
When to choose the offline API
- Batch ingestion: Transcribe or synthesise long-form media without keeping a WebSocket connection open.
- Post-processing: Re-run NLP stages on previously captured transcripts.
- Automation: Integrate with schedulers or serverless jobs where HTTP fits better than persistent sockets.
#
Endpoint overview
- Path: POST /api/pipelines/process
- Auth: Authorization: Bearer <access_token>
- Pipeline: provided per request in the pipeline form part
- Input data: uploaded as a second form part (data)
- Async mode: append ?async=true to process large jobs asynchronously (you'll receive a 202 ACCEPTED with a status URL)
Requests use multipart/form-data so you can send JSON metadata and binary payloads together.
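As a minimal sketch of the async mode described above, the helpers below build the request URL with or without ?async=true and pull the status URL out of a 202 response. The function names and the example host are hypothetical; reading the status URL from the Location header follows the status-code notes later in this guide, but treat the exact shape of that header value as an assumption.

```python
from typing import Mapping, Optional
from urllib.parse import urlencode, urlsplit, urlunsplit

PROCESS_PATH = "/api/pipelines/process"


def build_process_url(base_url: str, async_mode: bool = False) -> str:
    """Return the process endpoint URL, adding ?async=true when requested."""
    parts = urlsplit(base_url.rstrip("/") + PROCESS_PATH)
    query = urlencode({"async": "true"}) if async_mode else ""
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def status_url_from_response(status_code: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return the status URL of an accepted async job, or None for any other status."""
    if status_code != 202:
        return None
    # Assumption: the status URL is carried in the Location header.
    return headers.get("Location")
```

With the requests library you could pass build_process_url(base, async_mode=True) as the URL and then hand response.status_code and response.headers to status_url_from_response.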
- cURL
- Python
- JavaScript (Node.js)
- Java
# cURL
curl --fail --location \
  --request POST "$TRUEBAR_API_BASE_URL/api/pipelines/process" \
  --header "Authorization: Bearer $TRUEBAR_ACCESS_TOKEN" \
  --form 'pipeline=@pipeline.json;type=application/json' \
  --form 'data=@sample.pcm;type=application/octet-stream'
# Python
import json
import os
from pathlib import Path

import requests

# Two-stage pipeline: speech recognition followed by punctuation.
pipeline = [
    {
        "task": "ASR",
        "exceptionHandlingPolicy": "THROW",
        "config": {
            "tag": "KALDI:sl-SI:COL:20221208-0800",
            "parameters": {"enableInterims": False},
        },
    },
    {
        "task": "NLP_pc",
        "exceptionHandlingPolicy": "SKIP",
        "config": {
            "tag": "NEMO_PUNCTUATOR:sl-SI:*:*",
            "parameters": {"enableSplitToSentences": True},
        },
    },
]

response = requests.post(
    f"{os.environ['TRUEBAR_API_BASE_URL']}/api/pipelines/process",
    headers={"Authorization": f"Bearer {os.environ['TRUEBAR_ACCESS_TOKEN']}"},
    files={
        "pipeline": ("pipeline.json", json.dumps(pipeline), "application/json"),
        "data": ("sample.pcm", Path("sample.pcm").read_bytes(), "application/octet-stream"),
    },
    timeout=30,
)
response.raise_for_status()

# Text pipelines return JSON; audio pipelines return raw bytes.
if response.headers.get("Content-Type", "").startswith("application/json"):
    print(response.json())
else:
    with open("result.bin", "wb") as handle:
        handle.write(response.content)
// JavaScript (Node.js)
import { readFileSync } from 'node:fs';
import FormData from 'form-data';
import axios from 'axios';

const pipeline = [
  {
    task: 'ASR',
    exceptionHandlingPolicy: 'THROW',
    config: { tag: 'KALDI:sl-SI:COL:20221208-0800', parameters: { enableInterims: false } },
  },
  {
    task: 'NLP_pc',
    exceptionHandlingPolicy: 'SKIP',
    config: { tag: 'NEMO_PUNCTUATOR:sl-SI:*:*', parameters: { enableSplitToSentences: true } },
  },
];

const form = new FormData();
form.append('pipeline', JSON.stringify(pipeline), {
  filename: 'pipeline.json',
  contentType: 'application/json',
});
form.append('data', readFileSync('sample.pcm'), {
  filename: 'sample.pcm',
  contentType: 'application/octet-stream',
});

const response = await axios.post(
  `${process.env.TRUEBAR_API_BASE_URL}/api/pipelines/process`,
  form,
  {
    headers: {
      Authorization: `Bearer ${process.env.TRUEBAR_ACCESS_TOKEN}`,
      ...form.getHeaders(),
    },
    // Keep the body as raw bytes so audio responses are not mangled by text decoding.
    responseType: 'arraybuffer',
  },
);

if (response.headers['content-type']?.startsWith('application/json')) {
  // Decode the raw bytes back into JSON for text pipelines.
  console.log(JSON.parse(Buffer.from(response.data).toString('utf8')));
} else {
  console.log(`Received ${response.data.length} bytes of audio.`);
}
// Java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

public class SubmitOfflineJob {
    public static void main(String[] args) throws IOException, InterruptedException {
        var boundary = "----TruebarBoundary" + UUID.randomUUID();

        String pipelineJson = """
            [
              {
                "task": "ASR",
                "exceptionHandlingPolicy": "THROW",
                "config": {
                  "tag": "KALDI:sl-SI:COL:20221208-0800",
                  "parameters": { "enableInterims": false }
                }
              },
              {
                "task": "NLP_pc",
                "exceptionHandlingPolicy": "SKIP",
                "config": {
                  "tag": "NEMO_PUNCTUATOR:sl-SI:*:*",
                  "parameters": { "enableSplitToSentences": true }
                }
              }
            ]
            """;

        byte[] audioBytes = Files.readAllBytes(Path.of("sample.pcm"));

        // First part: the pipeline definition as JSON.
        var builder = new StringBuilder();
        builder.append("--").append(boundary).append("\r\n");
        builder.append("Content-Disposition: form-data; name=\"pipeline\"; filename=\"pipeline.json\"\r\n");
        builder.append("Content-Type: application/json\r\n\r\n");
        builder.append(pipelineJson).append("\r\n");

        // Second part: headers for the binary audio payload.
        builder.append("--").append(boundary).append("\r\n");
        builder.append("Content-Disposition: form-data; name=\"data\"; filename=\"sample.pcm\"\r\n");
        builder.append("Content-Type: application/octet-stream\r\n\r\n");

        byte[] header = builder.toString().getBytes(StandardCharsets.UTF_8);
        byte[] footer = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);

        // Concatenate header, audio bytes, and closing boundary into one body.
        byte[] body = new byte[header.length + audioBytes.length + footer.length];
        System.arraycopy(header, 0, body, 0, header.length);
        System.arraycopy(audioBytes, 0, body, header.length, audioBytes.length);
        System.arraycopy(footer, 0, body, header.length + audioBytes.length, footer.length);

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(System.getenv("TRUEBAR_API_BASE_URL") + "/api/pipelines/process"))
            .header("Authorization", "Bearer " + System.getenv("TRUEBAR_ACCESS_TOKEN"))
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .POST(HttpRequest.BodyPublishers.ofByteArray(body))
            .build();

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        System.out.println("Status: " + response.statusCode());
        System.out.println("Content-Type: " + response.headers().firstValue("Content-Type").orElse("unknown"));
    }
}
pipeline.json contains the same stage definition you would send to the streaming API.
[
  {
    "task": "ASR",
    "exceptionHandlingPolicy": "THROW",
    "config": {
      "tag": "KALDI:sl-SI:COL:20221208-0800",
      "parameters": { "enableInterims": false }
    }
  },
  {
    "task": "NLP_pc",
    "exceptionHandlingPolicy": "SKIP",
    "config": {
      "tag": "NEMO_PUNCTUATOR:sl-SI:*:*",
      "parameters": { "enableSplitToSentences": true }
    }
  }
]
#
Text payload schema
Text-based pipelines expect a JSON array of text segments. Each segment wraps one or more tokens; at minimum you must supply the token text.
[
  {
    "isFinal": true,
    "startTimeMs": 0,
    "endTimeMs": 1500,
    "tokens": [
      { "text": "Danes" },
      { "text": "je" },
      { "text": "lep" },
      { "text": "dan" },
      { "text": ".", "isLeftHanded": true }
    ]
  }
]
Token fields:
- text (required): literal text.
- isLeftHanded / isRightHanded: spacing hints used by the quickstarts.
- startOffsetMs / endOffsetMs: token-level timings relative to the segment.
- speakerCode: diarisation speaker label.
- confidence: probability score returned by ASR/NLP stages.
Segment fields (isFinal, startTimeMs, endTimeMs) are optional unless you need precise alignment metadata.
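The schema above can be produced from a plain word list. The following is a small sketch, not part of any official client: words_to_segments is a hypothetical helper that emits the minimal payload (token text only, plus the optional isFinal flag) and rejects empty tokens, since text is the one required field.

```python
import json
from typing import Dict, List


def words_to_segments(words: List[str], is_final: bool = True) -> List[Dict]:
    """Wrap a list of words in a single text segment with one token per word."""
    tokens = []
    for word in words:
        # "text" is the only required token field, so refuse empty values early.
        if not isinstance(word, str) or not word:
            raise ValueError("every token needs non-empty text")
        tokens.append({"text": word})
    return [{"isFinal": is_final, "tokens": tokens}]


# Serialise for the "data" form part; ensure_ascii=False keeps non-ASCII letters readable.
payload = json.dumps(words_to_segments(["Danes", "je", "lep", "dan"]), ensure_ascii=False)
```

The resulting string can be sent as the data form part with content type application/json.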
#
Audio payloads
- Upload audio in the data form field.
- The server accepts common WAV/FLAC/MP3/MP4 containers; prefer mono 16 kHz PCM for best accuracy.
- Avoid multi-track media: the decoder does not auto-select an audio stream.
- For large uploads, use resumable uploads in your HTTP client where available. Note that curl's --compressed flag only requests a compressed response; it does not compress the uploaded body.
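To check a file against the mono 16 kHz recommendation before uploading, a short sketch using Python's standard wave module might look like this. It only handles uncompressed WAV files, and check_wav is a hypothetical helper name; other containers would need a different tool.

```python
import wave
from typing import List


def check_wav(path: str, expected_rate: int = 16000) -> List[str]:
    """Return human-readable warnings for a WAV file that is not mono at the expected rate."""
    warnings = []
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != 1:
        warnings.append(f"expected mono, found {channels} channels")
    if rate != expected_rate:
        warnings.append(f"expected {expected_rate} Hz, found {rate} Hz")
    return warnings
```

An empty list means the file matches the recommendation; any warnings suggest converting the audio before submitting the job.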
#
Response formats
- Pipelines ending in text stages (ASR, NLP) return JSON using the same segment structure as above.
- Pipelines ending in audio stages (TTS) return raw audio bytes (Content-Type: application/octet-stream).
- On success you'll receive 200 OK (sync) or 202 ACCEPTED (async); failed jobs include a JSON error object with id, timestamp, and message.
#
Worked examples
#
Punctuation-only text job
- cURL
- Python
- JavaScript (Node.js)
- Java
# cURL
curl --fail --location \
  --request POST "$TRUEBAR_API_BASE_URL/api/pipelines/process" \
  --header "Authorization: Bearer $TRUEBAR_ACCESS_TOKEN" \
  --form 'pipeline=[{"task":"NLP_pc","exceptionHandlingPolicy":"THROW","config":{"tag":"NEMO_PC:sl-SI:*:*","parameters":{"enableSplitToSentences":false}}}];type=application/json' \
  --form 'data=[{"tokens":[{"text":"Danes"},{"text":"je"},{"text":"lep"},{"text":"dan"}]}];type=application/json'
# Python
import json
import os

import requests

pipeline = [
    {
        "task": "NLP_pc",
        "exceptionHandlingPolicy": "THROW",
        "config": {
            "tag": "NEMO_PC:sl-SI:*:*",
            "parameters": {"enableSplitToSentences": False},
        },
    }
]

# One segment with one token per word; only the token text is required.
text_segments = [
    {"tokens": [{"text": word} for word in ["Danes", "je", "lep", "dan"]]}
]

response = requests.post(
    f"{os.environ['TRUEBAR_API_BASE_URL']}/api/pipelines/process",
    headers={"Authorization": f"Bearer {os.environ['TRUEBAR_ACCESS_TOKEN']}"},
    files={
        "pipeline": ("pipeline.json", json.dumps(pipeline), "application/json"),
        "data": ("segments.json", json.dumps(text_segments), "application/json"),
    },
    timeout=10,
)
response.raise_for_status()
print(response.json())
// JavaScript (Node.js)
import axios from 'axios';
import FormData from 'form-data';

const pipeline = [
  {
    task: 'NLP_pc',
    exceptionHandlingPolicy: 'THROW',
    config: {
      tag: 'NEMO_PC:sl-SI:*:*',
      parameters: { enableSplitToSentences: false },
    },
  },
];

const segments = [
  { tokens: [{ text: 'Danes' }, { text: 'je' }, { text: 'lep' }, { text: 'dan' }] },
];

const form = new FormData();
form.append('pipeline', JSON.stringify(pipeline), {
  filename: 'pipeline.json',
  contentType: 'application/json',
});
form.append('data', JSON.stringify(segments), {
  filename: 'segments.json',
  contentType: 'application/json',
});

const { data } = await axios.post(
  `${process.env.TRUEBAR_API_BASE_URL}/api/pipelines/process`,
  form,
  {
    headers: {
      Authorization: `Bearer ${process.env.TRUEBAR_ACCESS_TOKEN}`,
      ...form.getHeaders(),
    },
  },
);
console.log(data);
// Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SubmitPunctuationJob {
    public static void main(String[] args) throws Exception {
        String boundary = "----TruebarBoundaryPunctuation";
        String pipelineJson = """
            [
              {
                "task": "NLP_pc",
                "exceptionHandlingPolicy": "THROW",
                "config": {
                  "tag": "NEMO_PC:sl-SI:*:*",
                  "parameters": { "enableSplitToSentences": false }
                }
              }
            ]
            """;
        String dataJson = """
            [
              {
                "tokens": [
                  { "text": "Danes" },
                  { "text": "je" },
                  { "text": "lep" },
                  { "text": "dan" }
                ]
              }
            ]
            """;

        // Both parts are text, so the whole multipart body can be built as a string.
        String body = "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"pipeline\"; filename=\"pipeline.json\"\r\n"
            + "Content-Type: application/json\r\n\r\n"
            + pipelineJson + "\r\n"
            + "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"data\"; filename=\"segments.json\"\r\n"
            + "Content-Type: application/json\r\n\r\n"
            + dataJson + "\r\n"
            + "--" + boundary + "--\r\n";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(System.getenv("TRUEBAR_API_BASE_URL") + "/api/pipelines/process"))
            .header("Authorization", "Bearer " + System.getenv("TRUEBAR_ACCESS_TOKEN"))
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
Response:
[
  {
    "isFinal": true,
    "tokens": [
      { "text": "Danes" },
      { "text": "je" },
      { "text": "lep" },
      { "text": "dan" },
      { "text": ".", "isLeftHanded": true }
    ]
  }
]
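To turn a response like this back into a readable sentence, the tokens need to be joined with the spacing hints in mind. The sketch below assumes that isLeftHanded means "attach to the previous token without a space" and isRightHanded means "attach to the next token without a space"; this guide only calls them spacing hints, so verify that interpretation against real output. render_segment is a hypothetical helper name.

```python
from typing import Dict


def render_segment(segment: Dict) -> str:
    """Join token texts into one string, honouring the assumed handedness hints."""
    text = ""
    glue_next = False  # True when the previous token was right-handed
    for token in segment["tokens"]:
        attach_left = token.get("isLeftHanded", False)
        # Insert a space unless this token attaches leftwards or the previous one attaches rightwards.
        if text and not attach_left and not glue_next:
            text += " "
        text += token["text"]
        glue_next = token.get("isRightHanded", False)
    return text


response_segment = {
    "isFinal": True,
    "tokens": [
        {"text": "Danes"},
        {"text": "je"},
        {"text": "lep"},
        {"text": "dan"},
        {"text": ".", "isLeftHanded": True},
    ],
}
sentence = render_segment(response_segment)  # "Danes je lep dan." under the stated assumption
```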
#
Speech-to-text audio job
- cURL
- Python
- JavaScript (Node.js)
- Java
# cURL
curl --fail --location \
  --request POST "$TRUEBAR_API_BASE_URL/api/pipelines/process" \
  --header "Authorization: Bearer $TRUEBAR_ACCESS_TOKEN" \
  --form 'pipeline=[{"task":"ASR","exceptionHandlingPolicy":"THROW","config":{"tag":"KALDI:sl-SI:COL:20221208-0800","parameters":{"enableInterims":false,"enableSd":false}}}];type=application/json' \
  --form 'data=@sample.wav;type=audio/wav'
# Python
import json
import os
from pathlib import Path

import requests

pipeline = [
    {
        "task": "ASR",
        "exceptionHandlingPolicy": "THROW",
        "config": {
            "tag": "KALDI:sl-SI:COL:20221208-0800",
            "parameters": {"enableInterims": False, "enableSd": False},
        },
    }
]

response = requests.post(
    f"{os.environ['TRUEBAR_API_BASE_URL']}/api/pipelines/process",
    headers={"Authorization": f"Bearer {os.environ['TRUEBAR_ACCESS_TOKEN']}"},
    files={
        "pipeline": ("pipeline.json", json.dumps(pipeline), "application/json"),
        "data": ("sample.wav", Path("sample.wav").read_bytes(), "audio/wav"),
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
// JavaScript (Node.js)
import { readFileSync } from 'node:fs';
import FormData from 'form-data';
import axios from 'axios';

const pipeline = [
  {
    task: 'ASR',
    exceptionHandlingPolicy: 'THROW',
    config: {
      tag: 'KALDI:sl-SI:COL:20221208-0800',
      parameters: { enableInterims: false, enableSd: false },
    },
  },
];

const form = new FormData();
form.append('pipeline', JSON.stringify(pipeline), {
  filename: 'pipeline.json',
  contentType: 'application/json',
});
form.append('data', readFileSync('sample.wav'), {
  filename: 'sample.wav',
  contentType: 'audio/wav',
});

const { data } = await axios.post(
  `${process.env.TRUEBAR_API_BASE_URL}/api/pipelines/process`,
  form,
  {
    headers: {
      Authorization: `Bearer ${process.env.TRUEBAR_ACCESS_TOKEN}`,
      ...form.getHeaders(),
    },
  },
);
console.log(data);
// Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SubmitAsrJob {
    public static void main(String[] args) throws Exception {
        String boundary = "----TruebarBoundaryAsr";
        String pipelineJson = """
            [
              {
                "task": "ASR",
                "exceptionHandlingPolicy": "THROW",
                "config": {
                  "tag": "KALDI:sl-SI:COL:20221208-0800",
                  "parameters": { "enableInterims": false, "enableSd": false }
                }
              }
            ]
            """;

        byte[] audio = Files.readAllBytes(Path.of("sample.wav"));

        // Pipeline part, followed by the headers of the audio part.
        var builder = new StringBuilder();
        builder.append("--").append(boundary).append("\r\n");
        builder.append("Content-Disposition: form-data; name=\"pipeline\"; filename=\"pipeline.json\"\r\n");
        builder.append("Content-Type: application/json\r\n\r\n");
        builder.append(pipelineJson).append("\r\n");
        builder.append("--").append(boundary).append("\r\n");
        builder.append("Content-Disposition: form-data; name=\"data\"; filename=\"sample.wav\"\r\n");
        builder.append("Content-Type: audio/wav\r\n\r\n");

        // Concatenate header, audio bytes, and closing boundary into one body.
        byte[] header = builder.toString().getBytes(StandardCharsets.UTF_8);
        byte[] footer = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        byte[] body = new byte[header.length + audio.length + footer.length];
        System.arraycopy(header, 0, body, 0, header.length);
        System.arraycopy(audio, 0, body, header.length, audio.length);
        System.arraycopy(footer, 0, body, header.length + audio.length, footer.length);

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(System.getenv("TRUEBAR_API_BASE_URL") + "/api/pipelines/process"))
            .header("Authorization", "Bearer " + System.getenv("TRUEBAR_ACCESS_TOKEN"))
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .POST(HttpRequest.BodyPublishers.ofByteArray(body))
            .build();

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
The response is a JSON array of text segments. To receive audio instead (e.g., TTS), end the pipeline with a TTS stage; the response body will contain PCM bytes that you can stream to a player or persist as a file.
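Raw PCM bytes carry no header, so most players cannot open them directly. One way to persist them as a playable file is to wrap them in a WAV container with Python's standard wave module, as sketched below. The sample rate, channel count, and sample width used here (16 kHz, mono, 16-bit) are assumptions rather than documented properties of the TTS output; check the voice's actual format before relying on them. pcm_to_wav is a hypothetical helper name.

```python
import wave


def pcm_to_wav(pcm_bytes: bytes, path: str, sample_rate: int = 16000,
               channels: int = 1, sample_width: int = 2) -> None:
    """Write raw PCM bytes to a WAV file using the given (assumed) audio format."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
```

For example, pcm_to_wav(response.content, "speech.wav") would store the body of a TTS response fetched with the requests library.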
#
HTTP responses & errors
Successful operations return:
- 200 OK: synchronous job completed with results in the body.
- 202 ACCEPTED: async job accepted; poll the history API or the URL returned in the Location header.
- 204 NO_CONTENT: ad-hoc helper endpoints (e.g., deletes) completed successfully.
Common error codes:
- 400 BAD_REQUEST: malformed multipart payload or invalid pipeline definition.
- 401 UNAUTHORIZED / 403 FORBIDDEN: missing token or insufficient roles (PIPELINE_OFFLINE_API, STAGE_*).
- 404 NOT_FOUND: unknown session/job identifier when querying results.
- 409 CONFLICT: duplicate submission or resource already exists.
- 415 UNSUPPORTED_MEDIA_TYPE: decoder cannot read the uploaded audio.
- 500 INTERNAL_SERVER_ERROR: unexpected platform error (contact support with the id from the response body).
{
  "id": "b7f1c8c0-3a4b-4d9b-8d2e-f2b9f8a6c1de",
  "timestamp": "2024-04-22T08:15:30.123Z",
  "message": "Pipeline tag KALDI:sl-SI:COL:20221208-0800 not available for this tenant."
}
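Client code can parse this error object to surface the id and message in logs or support tickets. The following is a minimal sketch; ApiError and parse_error are hypothetical names, and falling back to empty fields when the body is not a JSON object is a defensive choice of this example, not documented behaviour.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class ApiError:
    status: int
    error_id: str
    timestamp: str
    message: str


def parse_error(status_code: int, body: Union[str, bytes, None]) -> ApiError:
    """Build an ApiError from an HTTP status and a response body that may or may not be JSON."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return ApiError(
        status=status_code,
        error_id=str(payload.get("id", "")),
        timestamp=str(payload.get("timestamp", "")),
        message=str(payload.get("message", "")),
    )
```

With the requests library, parse_error(response.status_code, response.text) gives you the id to quote when contacting support about a 500 response.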
#
Related guides
- Streaming STT Guide: real-time transcription.
- Streaming TTS Guide: real-time synthesis.
- History API: retrieve completed jobs and recordings.