Chat interface

The natural-language chat interface is a composite interface: one endpoint can perform many tasks. We first describe how to call the interface, then how to use it in different scenarios.

Make a request

  • Request method: POST
  • Request URL: /v1/ctai/query_text_chat
  • Content-Type: application/json

Request example

    "query_text": "Hello",
    "ctdoc_id": "test_doc_id",
    "session_id": "test_id",
    "enable_citation": "1",
    "enable_followup": "1",
    "smart_temp ": "0",
    "context": "test_context",
    "stream": "1",

Request parameters

| parameter | type | required | description |
| --- | --- | --- | --- |
| query_text | string | yes | Question text, e.g. "hello". |
| ctdoc_id | string | no | Document id(s), separated by commas. If document ids are specified, answers are drawn only from those documents; otherwise, from all documents. |
| session_id | string | yes | Session id, used to link multiple rounds of chat into one conversation. Only questions sharing the same session_id form a multi-round dialogue. At most 64 bytes; only digits, English letters, and the characters -_. are allowed. |
| enable_citation | string | no | Whether to return citations; returned by default. 0: no, 1: yes. |
| enable_followup | string | no | Whether to return recommended follow-up questions; returned by default. 0: no, 1: yes. |
| smart_temp | string | no | AI creativity level, 0-10. 0 is the most conservative, 10 the most liberal. |
| context | string | no | Context content; empty by default. |
| stream | string | no | Whether to stream the output; non-streaming by default. 0: no, 1: yes. |
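The constraints above can be enforced on the client before sending. The following is a minimal sketch; `buildChatRequest` and `isValidSessionId` are hypothetical helpers, not part of the API.

```typescript
// Field names follow the parameter table above.
interface ChatRequest {
  query_text: string;
  session_id: string;
  ctdoc_id?: string;
  enable_citation?: "0" | "1";
  enable_followup?: "0" | "1";
  smart_temp?: string; // "0".."10"
  context?: string;
  stream?: "0" | "1";
}

// session_id: at most 64 bytes, only digits, English letters and -_. characters.
function isValidSessionId(id: string): boolean {
  return id.length > 0 && id.length <= 64 && /^[0-9A-Za-z._-]+$/.test(id);
}

// Build a request body, validating session_id up front.
function buildChatRequest(
  queryText: string,
  sessionId: string,
  overrides: Partial<ChatRequest> = {}
): ChatRequest {
  if (!isValidSessionId(sessionId)) {
    throw new Error(`invalid session_id: ${sessionId}`);
  }
  return { query_text: queryText, session_id: sessionId, ...overrides };
}
```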

Description of request method

session_id is the key to forming a multi-round dialogue: requests sharing the same session_id are treated as one conversation. Conversely, to start a new topic, simply generate a new session_id and pass it in. session_id should therefore be generated using GUID rules to avoid duplicates in the system. context can be used to add context manually, or to customize the prompt to some extent (note that this usage normally applies only to the first request of a new session_id).
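Generating a GUID-style session_id can be sketched as follows, assuming a Node.js (16.7+) runtime; modern browsers expose an equivalent `crypto.randomUUID()` global.

```typescript
import { randomUUID } from "node:crypto";

// Uppercase matches the examples in this document; the API only requires
// digits, letters and -_. characters, so case is a stylistic choice.
function newSessionId(): string {
  return randomUUID().toUpperCase();
}
```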

Different scenarios call this API in different ways. The following examples show typical parameter combinations for the chat interface.

  • Serious-answer scene: do not specify ctdoc_id; you do not want the robot to improvise, you want it to show the source of the article, and you do not want follow-up questions. The request might look like this:
    {
        "query_text": "What are the conditions for being able to marry in the Civil Code",
        "session_id": "C9857E6A-2E26-4A9D-887E-BB8A4A0B9BD4",
        "enable_citation": "1",
        "enable_followup": "0",
        "smart_temp": "0",
        "stream": "1"
    }
  • Casual-chat scene: do not specify ctdoc_id; you want the robot to improvise freely, you do not want it to show sources, and you do want follow-up questions. The request might look like this:
    {
        "query_text": "Is there water on Mars",
        "session_id": "05FE9D4A-E82A-4E36-827A-EFB7A5ABF6E0",
        "enable_citation": "0",
        "enable_followup": "1",
        "smart_temp": "9",
        "stream": "1"
    }
  • Summary scene: specify ctdoc_id to summarize a given document. The robot may improvise to some extent and does not need to show sources, and you do want follow-up questions. The request might look like this:
    {
        "query_text": "What is the author's point of view mainly expressed in this book",
        "ctdoc_id": "the_book_doc_id",
        "session_id": "05FE9D4A-E82A-4E36-827A-EFB7A5ABF6E0",
        "enable_citation": "0",
        "enable_followup": "1",
        "smart_temp": "4",
        "stream": "1"
    }

To achieve the best results for a given scene, try various parameter combinations.

Receive the response

Responses come in two forms: non-streaming (stream = 0) and streaming (stream = 1). They differ as follows:

  • Non-streaming response: uses standard application/json output and returns the robot's entire answer at once, so the response arrives relatively slowly. The advantages are low system overhead, simple client code, and a complete response format.
  • Streaming response: uses the SSE protocol to output continuously, similar to the ChatGPT typewriter effect. The first data arrives quickly, usually under 1 second, but the client code is more complicated and the system overhead is larger.

Non-streaming response

  • Content-Type : application/json

    {
        "answer_text": "Hello",
        "ref_doc_list": [
            {
                "ctdoc_id": "1b58edcbfd9ad65d5e87bf77de721f5e",
                "ctai_doc": {
                    "ctdoc_id": "1b58edcbfd9ad65d5e87bf77de721f5e",
                    "doc_type": "doc",
                    "doc_name": "test_name",
                    "doc_url": "test_url",
                    "create_time": "1679886694",
                    "update_time": "1680103463",
                    "doc_file_url": "https://test.com/test.pdf"
                }
            }
            // ... other document information objects
        ],
        "follow_up_list": [
            {
                "title": "Recommended question 1"
            }
            // ... other recommended questions
        ]
    }
| parameter | type | description |
| --- | --- | --- |
| answer_text | string | Answer text. The id of each cited document is embedded in the text as [[xxxxxxx]]. |
| ref_doc_list | array | List of cited documents. |
| ctdoc_id | string | Document id. |
| ctai_doc | object | Document information object. |
| doc_type | string | Document type. Values: doc (file-type document), url (URL-type document). |
| doc_url | string | Valid only for documents of type url: the URL address of the document. |
| doc_name | string | Valid only for documents of type doc: the file name. |
| doc_file_url | string | Document download URL. |
| create_time | string | Creation time, an integer timestamp in seconds. |
| update_time | string | Last modification time, an integer timestamp in seconds. |
| follow_up_list | array | List of recommended follow-up questions. |
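Since cited document ids are embedded in answer_text as [[xxxxxxx]] markers, a client typically needs to extract them and match them against ref_doc_list. A small sketch (these helpers are assumptions, not part of the API):

```typescript
// Extract the ids inside [[...]] markers, in order of appearance.
function extractCitationIds(answerText: string): string[] {
  const ids: string[] = [];
  for (const m of answerText.matchAll(/\[\[([^\]]+)\]\]/g)) {
    ids.push(m[1]);
  }
  return ids;
}

// Optionally strip the markers for plain-text display.
function stripCitations(answerText: string): string {
  return answerText.replace(/\[\[[^\]]+\]\]/g, "");
}
```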

Streaming response

The streaming response uses the SSE (Server-Sent Events) protocol. It does not currently support the Last-Event-Id header; that is, an interrupted SSE connection cannot be resumed from a packet id.

A streaming response is split into two kinds of packets: "intermediate packets" and the "end packet". The last packet is the end packet; all packets before it are intermediate packets. After all packets have been sent, the connection is closed automatically. The rules are:

  • Intermediate packets:
    • An intermediate packet contains only the cited-document list and recommended-question list for the content in that packet;
    • The event field of an intermediate packet is an empty string;
  • The end packet:
    • The end packet contains no actual answer content; it is a special packet appended after all answer content has been output (via the intermediate packets);
    • The event field of the end packet is stop;
    • The end packet contains the cited-document list and recommended-question list for all citations of the overall answer to this query;
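The rules above can be sketched as a small accumulator. The packet shape mirrors the data object documented for the streaming response; the `accumulate` helper itself is an assumption, not part of the API.

```typescript
interface StreamPacket {
  msg_id: string;
  msg_sn: string;
  event: "" | "stop"; // "" = intermediate packet, "stop" = end packet
  answer_text: string;
  ref_doc_list?: unknown[];
  follow_up_list?: unknown[];
}

// Concatenate answer fragments and detect the end packet.
function accumulate(packets: StreamPacket[]): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const p of packets) {
    text += p.answer_text; // the end packet carries no answer content
    if (p.event === "stop") done = true;
  }
  return { text, done };
}
```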

Response Header

  • Content-Type : text/event-stream

Note: because the response is no longer JSON, do not send an Accept: application/json header when initiating the request.

Response example

Take the response "Hello world" as an example: it may be split into two data packets, with an end-flag packet (event = stop) appended at the end, for a total of three packets.

id: test_id_1
data: {"ret":"0","data":{"msg_id":"test_id_1","msg_sn":"0","event":"","answer_text":"Hello"}}

id: test_id_2
data: {"ret":"0","data":{"msg_id":"test_id_2","msg_sn":"1","event":"","answer_text":"world"}}

id: test_id_3
data: {"ret":"0","data":{"msg_id":"test_id_3","msg_sn":"2","event":"stop","answer_text":""}}

Specifically, the data field of each packet is formatted like this:

    {
        "ret": "0",
        "msg": "",
        "data": {
            "msg_id": "test_id",
            "msg_sn": "1",
            "event": "stop",
            "answer_text": "Hello",
            "ref_doc_list": [],
            "follow_up_list": []
        }
    }
| parameter | type | description |
| --- | --- | --- |
| ret | string | Result code, string type; 0 on success, non-0 on failure. Specific errors are detailed in the appendix and the individual interface descriptions. |
| msg | string | Error message; empty string on success. |
| data | object | Business parameter information object. |
| msg_id | string | Unique id of each packet; globally unique. |
| msg_sn | string | Serial number of each packet, starting from 0 and incremented by 1 per packet. |
| event | string | Event of the current streaming packet: an empty string indicates an intermediate packet, stop indicates the last packet. |
| answer_text | string | Same as the parameter of the same name in the non-streaming response. Each citation's document-id block and each follow-up block is returned whole within one packet, never split across packets. |
| ref_doc_list | array | Same as the parameter of the same name in the non-streaming response; see its description there. |
| follow_up_list | array | Same as the parameter of the same name in the non-streaming response; see its description there. |

How to use this API on the front-end

If using SSE from a JavaScript front end, we recommend Microsoft's @microsoft/fetch-event-source library. It is well encapsulated; here is an example:

import { EventSourceMessage, fetchEventSource } from '@microsoft/fetch-event-source';

// Note: getErrorInfo and the onMessage callback are application-level pieces
// assumed to exist elsewhere; the return shape at the end is likewise an
// assumption, since the original snippet was truncated.
const API_URL = '/v1/ctai/query_text_chat';

async function chat(d: object, onMessage?: (text: string, data: any) => void) {
  let result = '';
  let doc_list: any[] = [];

  await fetchEventSource(API_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(d),
    onmessage: (ev: EventSourceMessage) => {
      try {
        const res = JSON.parse(ev.data);
        const { ret, data, msg } = res;
        if (ret === '0') {
          // Merge the per-packet citation lists into one cumulative list.
          if (data.ref_doc_list && data.ref_doc_list.length > 0) {
            doc_list = doc_list.concat(data.ref_doc_list);
          }
          data.ref_doc_list = doc_list;
          result += data.answer_text;
          onMessage && onMessage(result, data);
        } else {
          const error = getErrorInfo(ret, msg);
          result = result ? result + '\n' + error : error;
        }
      } catch (error) {
        console.error('Failed to parse SSE packet', error);
      }
    },
    async onopen(response) {
      if (response.status !== 200) {
        throw new Error(`unexpected status ${response.status}`);
      }
    },
    onclose() {
      console.log('Final result', result);
    },
    onerror(err) {
      // Re-throw to stop fetchEventSource's automatic retries.
      throw err;
    },
  });

  return { result, doc_list };
}