Integrate into Website and Apps

To integrate your agent into your website or mobile app, use the websocket endpoint described below. It lets you send and receive audio in real time, enabling seamless and natural conversations.

This integration has four parts:

  1. Create a widget agent in Synthflow.
  2. Request a short-lived websocket session URL with your API key.
  3. Stream microphone audio to the websocket and play back agent audio from websocket messages.
  4. Wrap the connection, recording, and playback logic in your own client application.

Step 1. Create a widget agent

1. In your Synthflow dashboard, create an agent of the type widget. For more details on how to do that, see Create an agent.

2. Go to Agents and select the agent you have just created. Copy the ID next to its name; you will use it as the `assistant_id` parameter.

Step 2. Request a session

Make a GET request to the following endpoint:

```shell
curl -X GET "https://widget.synthflow.ai/websocket/token/{assistant_id}" \
  -H "Authorization: Bearer <your-synthflow-api-key>"
```

It will return a response of the form:

```json
{
  "sessionURL": "wss://widget.synthflow.ai/websocket/start?token=..."
}
```

You can then use this session URL to connect to the websocket.
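
If you request the session from code rather than curl, the call can be sketched as below. This is a minimal sketch assuming a server-side environment with a global `fetch` (Node 18+), so your API key is never exposed in the browser; `requestSession` and `parseSessionResponse` are illustrative names, not part of a Synthflow SDK.

```typescript
// Extract the websocket URL from the token response body.
function parseSessionResponse(body: string): string {
  const parsed = JSON.parse(body) as { sessionURL?: string };
  if (!parsed.sessionURL) {
    throw new Error("missing sessionURL in token response");
  }
  return parsed.sessionURL;
}

// Request a short-lived session URL for the given widget agent.
async function requestSession(assistantId: string, apiKey: string): Promise<string> {
  const response = await fetch(
    `https://widget.synthflow.ai/websocket/token/${assistantId}`,
    { headers: { Authorization: `Bearer ${apiKey}` } }
  );
  if (!response.ok) {
    throw new Error(`token request failed: ${response.status}`);
  }
  return parseSessionResponse(await response.text());
}
```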

Step 3. Send and receive events

At a high level, the websocket contract works like this:

  • Your client sends raw PCM16 microphone audio at 48 kHz.
  • Your client also sends a JSON readiness event when it is prepared to receive speech.
  • Synthflow returns raw PCM16 agent audio at 16 kHz.
  • Synthflow also returns JSON status events, including the signal that the agent is ready.

Send events

You can send two types of messages through the websocket:

  1. Binary messages containing the user’s speech (raw PCM16 audio, sample rate of 48000).
  2. JSON messages ({ "type": "status_client_ready" }) to signal that you are ready for the agent to start speaking.

Receive events

You will receive two types of messages:

  • Binary messages containing the agent’s speech (raw PCM16 audio, sample rate 16000).
  • JSON messages ({ "type": "status_agent_ready" }) that signal that the agent is ready to start receiving audio.
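
The receive side of this contract amounts to dispatching on the frame type: binary frames are audio, text frames are JSON status events. A minimal sketch, where `handleServerMessage` is a hypothetical helper (the message shapes are the ones described above):

```typescript
type StatusEvent = { type: string };

// Route a websocket frame: binary frames carry raw PCM16 agent audio at 16 kHz,
// text frames carry JSON status events such as status_agent_ready.
function handleServerMessage(
  data: ArrayBuffer | string,
  onAudio: (samples: Int16Array) => void,
  onStatus: (event: StatusEvent) => void
): void {
  if (data instanceof ArrayBuffer) {
    onAudio(new Int16Array(data));
  } else {
    onStatus(JSON.parse(data) as StatusEvent);
  }
}
```

For `event.data` to arrive as an ArrayBuffer rather than a Blob, remember to set `websocket.binaryType = "arraybuffer"` on the connection.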

Writing a client

To use the websocket connection in a webpage, follow these steps:

The sample implementation is split into three responsibilities:

  • audio output playback
  • microphone recording and PCM conversion
  • websocket lifecycle management
1. Write functions to record and play linear audio:

```typescript
type AudioOutControls = {
  enqueueAudioChunk: (chunk: Int16Array) => void;
  audioContext: AudioContext;
  stop: () => void;
};

async function startAudioOut(): Promise<AudioOutControls> {
  // Playback runs at 16 kHz, matching the agent audio sent by Synthflow.
  const ctx = new AudioContext({ sampleRate: 16000 });
  const gainNode = ctx.createGain();
  gainNode.connect(ctx.destination);
  await ctx.audioWorklet.addModule("/audio-out-worklet.js");
  const worklet = new AudioWorkletNode(ctx, "audio-out-worklet");
  worklet.connect(gainNode);
  return {
    enqueueAudioChunk: (chunk: Int16Array) =>
      worklet.port.postMessage({ buffer: chunk }),
    audioContext: ctx,
    stop: () => {
      worklet.disconnect();
      gainNode.disconnect();
      ctx.close();
    },
  };
}

async function startRecording(
  onAudioChunk: (chunk: ArrayBufferLike) => void,
  onRecordingStarted: () => Promise<void>,
): Promise<{ stop: () => void }> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  // Request 48 kHz explicitly, since the endpoint expects 48 kHz input.
  const audioContext = new window.AudioContext({ sampleRate: 48000 });
  const sampleRate = audioContext.sampleRate; // 48000
  const sourceNode = audioContext.createMediaStreamSource(stream);
  const scriptProcessor = audioContext.createScriptProcessor(8192, 1, 1);

  let hasRecordingStarted = false;
  scriptProcessor.onaudioprocess = (
    audioProcessingEvent: AudioProcessingEvent
  ) => {
    if (!hasRecordingStarted) {
      hasRecordingStarted = true;
      onRecordingStarted();
    }
    const floatData = audioProcessingEvent.inputBuffer.getChannelData(0);
    const int16Data = float32ToInt16(floatData);
    const byteBuffer = new Uint8Array(int16Data.buffer);
    onAudioChunk(byteBuffer);
  };

  sourceNode.connect(scriptProcessor);
  scriptProcessor.connect(audioContext.destination);
  return {
    stop: () => {
      scriptProcessor.disconnect();
      scriptProcessor.onaudioprocess = null;
      sourceNode.disconnect();
      audioContext.close();
      stream.getTracks().forEach((track) => track.stop());
    },
  };
}

function float32ToInt16(float32Array: Float32Array): Int16Array {
  const int16Array = new Int16Array(float32Array.length);
  for (let i = 0; i < float32Array.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Array[i]));
    int16Array[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return int16Array;
}
```
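
The float-to-PCM16 conversion is worth pinning down, since the negative and positive halves use different scale factors so that both endpoints of the signed 16-bit range are reachable. A standalone single-sample version (`sampleToInt16` is an illustrative name) that mirrors that clamp-and-scale logic, with explicit truncation matching assignment into an Int16Array:

```typescript
// Map a float sample in [-1, 1] to PCM16: negative values scale by 0x8000
// (reaching -32768), positive values by 0x7fff (reaching 32767).
function sampleToInt16(s: number): number {
  const clamped = Math.max(-1, Math.min(1, s));
  // Truncate toward zero, as assignment into an Int16Array does.
  return Math.trunc(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
}
```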
Serve the following AudioWorklet processor at `/audio-out-worklet.js`, the path loaded by `startAudioOut` above:

```javascript
class Processor extends AudioWorkletProcessor {
  constructor(options) {
    super();
    this.buffer = getAudioBuffer();
    this.port.onmessage = (ev) => {
      this.buffer.pushArray(new Int16Array(ev.data.buffer));
    };
  }

  process(inputs, outputs, parameters) {
    const output = outputs[0][0];
    for (let i = 0; i < output.length; i++) {
      let value = this.buffer.getSample();
      if (value === undefined) {
        break;
      }
      output[i] = value / 32768;
    }
    return true;
  }
}

function getAudioBuffer() {
  let samplePointer = 0;

  /** @type {Array<Int16Array>} */
  let arrays = [];

  /** @type {Int16Array | undefined} */
  let currentArray = undefined;

  return {
    getSample: () => {
      if (currentArray === undefined || samplePointer >= currentArray.length) {
        currentArray = arrays.shift();
        samplePointer = 0;
      }
      if (currentArray === undefined) {
        return undefined;
      }
      const sample = currentArray[samplePointer];
      samplePointer++;
      return sample;
    },
    pushArray: (array) => {
      arrays.push(array);
    },
  };
}

registerProcessor("audio-out-worklet", Processor);
```
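
The queue inside `getAudioBuffer` is what decides the underrun behavior: when no samples are buffered, `getSample` returns undefined and the processor leaves the rest of the output block silent. A standalone copy of that queue logic (`createSampleQueue` is an illustrative name, extracted here so it can run outside the AudioWorklet scope):

```typescript
// FIFO of Int16Array chunks, drained one sample at a time.
function createSampleQueue() {
  let samplePointer = 0;
  let arrays: Int16Array[] = [];
  let currentArray: Int16Array | undefined = undefined;

  return {
    getSample: (): number | undefined => {
      // Advance to the next queued chunk when the current one is exhausted.
      if (currentArray === undefined || samplePointer >= currentArray.length) {
        currentArray = arrays.shift();
        samplePointer = 0;
      }
      // Underrun: no chunks left, caller should output silence.
      if (currentArray === undefined) {
        return undefined;
      }
      return currentArray[samplePointer++];
    },
    pushArray: (array: Int16Array) => {
      arrays.push(array);
    },
  };
}
```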
2. Write a function to manage the websocket connection:

```typescript
const MESSAGE_TYPE_STATUS_AGENT_READY = "status_agent_ready";
const MESSAGE_TYPE_STATUS_CLIENT_READY = "status_client_ready";

// updateStatus is assumed to be your own UI helper for displaying call state.
async function startSynthflow(
  url: string,
  playAudioChunk: (chunk: Int16Array) => void
): Promise<{
  stop: () => void;
}> {
  updateStatus("Connecting to server", "warning");

  const websocket = new WebSocket(url);
  websocket.binaryType = "arraybuffer";

  websocket.onmessage = async (event) => {
    // Binary frames carry agent audio; text frames carry JSON status events.
    if (event.data instanceof ArrayBuffer) {
      const arrayBuffer = event.data as ArrayBuffer;
      const pcmSamples = new Int16Array(arrayBuffer);
      playAudioChunk(pcmSamples);
    } else {
      const data = JSON.parse(event.data);
      switch (data.type) {
        case MESSAGE_TYPE_STATUS_AGENT_READY:
          updateStatus("Connected to call", "success");
          console.log("Received agent ready message");
          break;
        default:
          console.log("Received unknown message from server", data);
          break;
      }
    }
  };

  const recordingControls = await startRecording(
    (audio) => {
      if (websocket.readyState === WebSocket.OPEN) {
        websocket.send(audio);
      }
    },
    async () => {
      sendWhenReady(
        websocket,
        JSON.stringify({ type: MESSAGE_TYPE_STATUS_CLIENT_READY })
      );
      console.log("Scheduled send client ready message");
    }
  );

  websocket.onclose = () => {
    recordingControls.stop();
    updateStatus("Disconnected from server", "error");
  };

  return {
    stop: () => {
      websocket.close();
    },
  };
}

function base64ToArrayBuffer(base64: string): ArrayBuffer {
  const binaryString = window.atob(base64);
  const len = binaryString.length;
  const bytes = new Uint8Array(len);
  for (let i = 0; i < len; i++) {
    bytes[i] = binaryString.charCodeAt(i);
  }
  return bytes.buffer;
}

// Retry until the socket leaves the CONNECTING state, then send once.
function sendWhenReady(websocket: WebSocket, message: string) {
  if (websocket.readyState === WebSocket.CLOSED) {
    console.log("WebSocket is closed, not sending message");
    return;
  } else if (websocket.readyState === WebSocket.OPEN) {
    websocket.send(message);
  } else {
    setTimeout(() => sendWhenReady(websocket, message), 50);
  }
}
```
3. Use the functions defined above to create a call:

```typescript
async function makeCall() {
  const sessionURL = ... // make a call to your server to get a session URL

  const audioOutControls = await startAudioOut();
  const synthflowControls = await startSynthflow(sessionURL, (audio) => {
    audioOutControls.enqueueAudioChunk(audio);
  });

  // To end the call, stop the connection and the playback:
  // synthflowControls.stop();
  // audioOutControls.stop();
}
```