Skip to content

Conversation Threading

Nodexa maintains conversation history server-side. You don't need to resend the entire message history on every request — instead, reference the previous response ID.


How Threading Works

Every response has a unique id with the resp_ prefix. To continue a conversation, pass that ID as previous_response_id in your next request:

{
  "model": "YOUR_ASSISTANT_ID",
  "input": "What did I just ask you?",
  "previous_response_id": "resp_01234567-89ab-cdef-0123-456789abcdef"
}

The platform will:

  1. Load the conversation history associated with that response ID
  2. Append the new user message to the history
  3. Generate a response in context of the full conversation
  4. Store the new exchange and return a new response ID

This new response ID is what you'll use for the next turn.


Multi-Turn Conversation Example

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://your-admin.example.com/v1',
  apiKey: process.env.NODEXA_API_KEY,
});

const assistantId = 'YOUR_ASSISTANT_ID';

// Turn 1: introduce yourself
const turn1 = await client.responses.create({
  model: assistantId,
  input: 'My name is Alice and I work in software engineering.',
});
console.log('Assistant:', turn1.output_text);
// "Nice to meet you, Alice! What are you working on in software engineering?"

// Turn 2: follow-up, referencing turn 1
const turn2 = await client.responses.create({
  model: assistantId,
  input: 'What is my profession?',
  previous_response_id: turn1.id,
});
console.log('Assistant:', turn2.output_text);
// "You mentioned that you work in software engineering."

// Turn 3: reference turn 2
const turn3 = await client.responses.create({
  model: assistantId,
  input: 'Can you suggest some career growth tips for someone in my field?',
  previous_response_id: turn2.id,
});
console.log('Assistant:', turn3.output_text);
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://your-admin.example.com/v1",
    api_key=os.environ['NODEXA_API_KEY'],
)

assistant_id = "YOUR_ASSISTANT_ID"

# Turn 1
turn1 = client.responses.create(
    model=assistant_id,
    input="My name is Alice and I work in software engineering.",
)
print("Assistant:", turn1.output_text)

# Turn 2
turn2 = client.responses.create(
    model=assistant_id,
    input="What is my profession?",
    previous_response_id=turn1.id,
)
print("Assistant:", turn2.output_text)

# Turn 3
turn3 = client.responses.create(
    model=assistant_id,
    input="Can you suggest some career growth tips for someone in my field?",
    previous_response_id=turn2.id,
)
print("Assistant:", turn3.output_text)
# Turn 1
TURN1=$(curl -s "https://your-admin.example.com/v1/responses" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_ASSISTANT_ID",
    "input": "My name is Alice and I work in software engineering."
  }')

echo "Assistant: $(echo $TURN1 | jq -r '.output_text')"
RESP1_ID=$(echo $TURN1 | jq -r '.id')

# Turn 2
TURN2=$(curl -s "https://your-admin.example.com/v1/responses" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"YOUR_ASSISTANT_ID\",
    \"input\": \"What is my profession?\",
    \"previous_response_id\": \"$RESP1_ID\"
  }")

echo "Assistant: $(echo $TURN2 | jq -r '.output_text')"
RESP2_ID=$(echo $TURN2 | jq -r '.id')

# Turn 3
curl "https://your-admin.example.com/v1/responses" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"YOUR_ASSISTANT_ID\",
    \"input\": \"Can you suggest some career growth tips for someone in my field?\",
    \"previous_response_id\": \"$RESP2_ID\"
  }"

Starting a New Conversation

To start a fresh conversation, omit previous_response_id. The platform will create a new conversation thread automatically.

{
  "model": "YOUR_ASSISTANT_ID",
  "input": "Hello, I have a new question for you."
}

Threading with Streaming

Threading works the same way when streaming. Read the response ID from the response.completed event:

const stream = await client.responses.create({
  model: 'YOUR_ASSISTANT_ID',
  input: 'Tell me about black holes.',
  previous_response_id: previousId,
  stream: true,
});

let responseId = null;

for await (const event of stream) {
  if (event.type === 'response.output_text.delta') {
    process.stdout.write(event.delta);
  }
  if (event.type === 'response.completed') {
    responseId = event.response.id;
  }
}

// Use responseId for the next turn

Threading with Tool Calls

When a response has status: "requires_action" (a function tool was called), you must continue the same thread with the function result:

// Step 1: Initial request
const response = await client.responses.create({
  model: 'YOUR_ASSISTANT_ID',
  input: 'What is the weather in Paris?',
  tools: [
    {
      type: 'function',
      name: 'get_weather',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
        },
        required: ['location'],
      },
    },
  ],
});

// Step 2: response.status === 'requires_action'
// Extract the tool call
const toolCall = response.output.find(item => item.type === 'function_call');

// Step 3: Execute the function
const weatherData = await myGetWeatherFunction(
  JSON.parse(toolCall.arguments).location
);

// Step 4: Return the result in a new request, continuing the SAME thread
const finalResponse = await client.responses.create({
  model: 'YOUR_ASSISTANT_ID',
  previous_response_id: response.id, // <- same thread
  input: [
    {
      type: 'function_call_output',
      call_id: toolCall.call_id,
      output: JSON.stringify(weatherData),
    },
  ],
});

console.log(finalResponse.output_text);

Best Practices

Always store the latest response ID. The response ID from the most recent turn is what you pass as previous_response_id — not the ID of the first message in the conversation.

// Correct: store and update the ID on each turn
let lastResponseId = null;

async function chat(message) {
  const response = await client.responses.create({
    model: assistantId,
    input: message,
    ...(lastResponseId ? { previous_response_id: lastResponseId } : {}),
  });
  lastResponseId = response.id; // always update
  return response.output_text;
}

Include x-user-id for per-user memory. When threading is combined with a user ID, the assistant will also recall long-term memory items stored from previous sessions.

const response = await client.responses.create(
  {
    model: assistantId,
    input: message,
    previous_response_id: lastResponseId,
  },
  {
    headers: { 'x-user-id': currentUserId },
  }
);

Do not reconstruct history manually. If you have a previous_response_id, there is no need to resend older messages in input. The platform loads the full history automatically. Sending redundant history in input alongside previous_response_id may result in duplicate messages.


Thread Persistence

Conversation history is stored server-side and persists indefinitely (subject to your platform's retention policy). Threads survive server restarts and can be resumed across client sessions.

If you need to invalidate a thread (e.g., the user's session ended), simply start a new one by omitting previous_response_id.