New Thinking Model - Qwen3.5 with Reasoning Support

Qwen3.5-4B - Think Before You Speak

We've upgraded our text generation model to Qwen3.5-4B, a new model with built-in thinking and reasoning capabilities. When enabled, the model thinks through problems step by step before giving you the final answer.

What's New

Thinking/Reasoning Mode - The model can now show its thought process via an enable_thinking parameter
OpenAI-Compatible Chat API - New /v1/chat/completions endpoint compatible with OpenAI SDKs
Streaming Support - Real-time SSE streaming with separate thinking and content streams
Better Quality - Improved text generation quality across all use cases

Thinking Mode

When enable_thinking is set to true, the model reasons through the problem internally before producing its response. The reasoning is returned separately in a reasoning_content field, so you can display it or hide it as needed.
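
As a minimal sketch of handling the two fields, the helper below splits a parsed response into its reasoning and its final answer. The name split_reasoning and the sample dictionary are illustrative assumptions; the response shape mirrors the examples in this post rather than a documented schema.

```python
from typing import Optional, Tuple


def split_reasoning(response: dict) -> Tuple[Optional[str], str]:
    """Return (reasoning, answer) from a parsed chat completion response."""
    message = response["choices"][0]["message"]
    # reasoning_content may be absent or empty when thinking is disabled.
    reasoning = message.get("reasoning_content") or None
    answer = message.get("content") or ""
    return reasoning, answer


# Hypothetical response, hand-written for illustration (not real API output).
sample = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "reasoning_content": "Sunlight refracts and reflects inside raindrops.",
                "content": "Rainbows form when raindrops split sunlight into colors.",
            }
        }
    ]
}

reasoning, answer = split_reasoning(sample)
if reasoning is not None:
    print("Thinking:", reasoning)  # show or hide this part as needed
print("Answer:", answer)
```

Returning None for missing reasoning lets calling code decide whether to render a "thinking" section at all.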

Chat Completions API

The new /v1/chat/completions endpoint is compatible with the OpenAI SDK format. You can use it with any OpenAI-compatible client library.

Python Example - Chat with Thinking

import requests
import os

API_KEY = os.getenv("TEXT_GENERATOR_API_KEY")
if API_KEY is None:
    raise Exception(
        "Please set TEXT_GENERATOR_API_KEY environment variable, "
        "login to https://text-generator.io to get your API key")

response = requests.post(
    "https://api.text-generator.io/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "What causes rainbows?"}
        ],
        "enable_thinking": True,
        "max_tokens": 2000,
    },
    headers={"secret": API_KEY}
)

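# Fail early on HTTP errors (for example, an invalid API key)
response.raise_for_status()

result = response.json()
message = result["choices"][0]["message"]

# The model's reasoning process (may be missing or empty)
if message.get("reasoning_content"):
    print("Thinking:", message["reasoning_content"])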

# The final answer
print("Answer:", message["content"])

Streaming Example

import requests
import json
import os

API_KEY = os.getenv("TEXT_GENERATOR_API_KEY")
if API_KEY is None:
    raise Exception("Please set TEXT_GENERATOR_API_KEY environment variable")
headers = {"secret": API_KEY, "Content-Type": "application/json"}

response = requests.post(
    "https://api.text-generator.io/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a short poem about the stars."}
        ],
        "stream": True,
        "enable_thinking": False,
        "max_tokens": 500,
    },
    headers=headers,
    stream=True,
)

response.raise_for_status()

for line in response.iter_lines():
    if not line:
        continue  # skip blank separator lines between SSE events
    text = line.decode("utf-8")
    if not text.startswith("data: ") or text == "data: [DONE]":
        continue
    chunk = json.loads(text[6:])
    delta = chunk["choices"][0]["delta"]
    # Print whichever text field is present, skipping empty or null values
    piece = delta.get("content") or delta.get("reasoning_content")
    if piece:
        print(piece, end="", flush=True)
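
To keep the thinking and content streams apart instead of printing them interleaved, the chunks can be accumulated into two buffers. The sketch below is one possible approach; collect_stream and the sample lines are hypothetical, and the chunk layout is assumed to match the streaming example above.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, str]:
    """Accumulate (reasoning, content) from decoded SSE lines."""
    reasoning_parts = []
    content_parts = []
    for text in lines:
        if not text.startswith("data: "):
            continue  # skip blank separators and non-data lines
        payload = text[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {})
        # Route each piece of text to its own buffer.
        if delta.get("reasoning_content"):
            reasoning_parts.append(delta["reasoning_content"])
        if delta.get("content"):
            content_parts.append(delta["content"])
    return "".join(reasoning_parts), "".join(content_parts)


# Hypothetical stream, hand-written for illustration (not real API output).
sample_lines = [
    'data: {"choices": [{"delta": {"reasoning_content": "Stars, night, "}}]}',
    'data: {"choices": [{"delta": {"reasoning_content": "light."}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Silver sparks "}}]}',
    'data: {"choices": [{"delta": {"content": "above."}}]}',
    "data: [DONE]",
]

thinking, answer = collect_stream(sample_lines)
print("Thinking:", thinking)
print("Answer:", answer)
```

With the requests example, the decoded lines could be supplied as (line.decode("utf-8") for line in response.iter_lines()).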

OpenAI SDK Compatible

from openai import OpenAI

client = OpenAI(
    api_key="your-text-generator-api-key",
    base_url="https://api.text-generator.io/v1",
    default_headers={"secret": "your-text-generator-api-key"},
)

response = client.chat.completions.create(
    model="qwen3.5-4b",
    messages=[
        {"role": "user", "content": "Explain quantum entanglement simply."}
    ],
    max_tokens=1000,
)

print(response.choices[0].message.content)
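
The SDK example above does not set enable_thinking. One possible way to send it through the OpenAI Python SDK is the extra_body argument, which the SDK merges into the JSON request body. Whether the endpoint reads enable_thinking when sent this way is an assumption based on the raw HTTP examples in this post, and build_chat_kwargs is a hypothetical helper.

```python
def build_chat_kwargs(prompt: str, enable_thinking: bool = True,
                      max_tokens: int = 1000) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "qwen3.5-4b",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Non-standard field: assumed to be read from the request body,
        # as in the requests and curl examples.
        "extra_body": {"enable_thinking": enable_thinking},
    }


kwargs = build_chat_kwargs("Explain quantum entanglement simply.")
print(kwargs["extra_body"])
```

With the client defined above, the call would then be client.chat.completions.create(**kwargs).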

JavaScript Example

const response = await fetch(
  "https://api.text-generator.io/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "secret": "your-api-key",
    },
    body: JSON.stringify({
      messages: [{ role: "user", content: "What is 2+2? Think step by step." }],
      enable_thinking: true,
      max_tokens: 2000,
    }),
  }
);

const data = await response.json();
console.log("Thinking:", data.choices[0].message.reasoning_content);
console.log("Answer:", data.choices[0].message.content);

Curl Example

curl -X POST https://api.text-generator.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "secret: YOUR_API_KEY" \
  -d '{
    "messages": [{"role": "user", "content": "What causes tides?"}],
    "enable_thinking": true,
    "max_tokens": 2000
  }'

Try It in the Playground

The Playground now has an "Enable Thinking/Reasoning" checkbox. Turn it on to see the model's thought process in a collapsible section above the response.

Legacy API Still Works

The existing /api/v1/generate endpoint continues to work as before. The model upgrade is seamless - your existing integrations will see improved quality without any code changes.

Pricing

Chat completions and thinking mode are included in all existing plans at no extra cost.

Try it yourself at: Text Generator Playground