Usage Log Parsing

Usage Log Parsing Methods and Billing Recommendations

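Usage logs are emitted only when tenant_id in the log is non-empty. In other words, when connecting to the central control service over WebSocket, the query string must carry a non-empty tenant value, for example: ws://localhost:8070/v1?tenant=89757. The same setting is required when calling the HTTP interface of the central control HTTP Adapter.

The central control service currently supports the following override parameters in the URL query. For a parameter with aliases, the service checks the name and then each alias in turn until it finds one that is present with a non-empty value: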

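| Field | Aliases | Meaning | Behavior |
| --- | --- | --- | --- |
| tenant | tenantid, tenant_id | Tenant ID | If specified, usage is recorded in the logs |
| device | deviceid, device_id | Device ID | If specified, forcibly overrides the device field in the Starter message |
| session | sessionid, session_id | Session ID | If specified, forcibly overrides the session field in the Starter message |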
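
The lookup order described above (name first, then each alias, stopping at the first non-empty value) could be sketched as follows. This is only an illustration of the documented behavior, not the service's actual code; the names `OVERRIDE_KEYS` and `resolve_overrides` are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Name first, then aliases, in the order listed in the table above.
OVERRIDE_KEYS = {
    "tenant": ("tenant", "tenantid", "tenant_id"),
    "device": ("device", "deviceid", "device_id"),
    "session": ("session", "sessionid", "session_id"),
}


def resolve_overrides(url: str) -> dict:
    """Return the first non-empty value found for each override parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    resolved = {}
    for field, keys in OVERRIDE_KEYS.items():
        for key in keys:
            values = [v for v in query.get(key, []) if v != ""]
            if values:
                resolved[field] = values[0]
                break  # stop at the first key that is present and non-empty
    return resolved
```

For example, a URL with `tenant=` (empty) and `tenant_id=abc` resolves the tenant to `abc`, because the empty `tenant` value is skipped.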

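All logs are in JSON format: each line is a standalone JSON object, written to the log file or to stdout. The log fields and their meanings are described below.

ASR Logs

The fields that may appear in an ASR billing event are as follows: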

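| Field | Type | Meaning | Example |
| --- | --- | --- | --- |
| level | string | Log level, indicating the severity of the entry | "info" |
| time | string | Time the event occurred, in ISO 8601 format | "2024-03-13T16:59:17.926+0800" |
| caller | string | Source file and line number that produced the log | "common/usage.go:75" |
| msg | string | Main log message describing the event | "processed billable ASR audio" |
| pid | integer | Process ID, identifying the process that produced the log | 99335 |
| flow | string | Business flow, here ASR | "ASR" |
| device | string | Device identifier of the device that produced this log | "test-cli-2" |
| session | string | Session identifier, uniquely identifying a session | "7bcaaa75-f2b9-4702-bd19-88601be3b8fb" |
| asr | string | Vendor identifier of the ASR service | "ASR7" |
| tenant_id | string | Tenant ID, the user or organization using the ASR service | "ourdevbox" |
| section | string | Log category, here "usage", marking a usage-related log | "usage" |
| log_idx | integer | Log index: sequence number of the log within the session, starting from 1 | 1 |
| round_idx | integer | Round index: sequence number of the round within the session, starting from 1 | 1 |
| current_sec | integer | Length of the ASR audio currently being processed, in seconds | 2 |
| total_sec | integer | Total length of all ASR audio processed so far in this session, in seconds | 12 |
| total_size | integer | Total amount of ASR audio data processed so far in this session, in bytes | 384000 |
| left_size | integer | ASR audio data still uncounted after this usage log, in bytes | 0 |
| batch_size | integer | Amount of ASR audio data that triggers a usage log, in bytes | 64000 |
| BYOL | boolean | Whether a user-provided vendor account is used | false |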

Sample logs:

{"level":"info","time":"2024-03-13T16:59:17.926+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":1,"round_idx":1,"current_sec":2,"total_sec":2,"total_size":64000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:18.975+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":2,"round_idx":1,"current_sec":2,"total_sec":4,"total_size":128000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:21.026+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":3,"round_idx":1,"current_sec":2,"total_sec":6,"total_size":192000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:23.096+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":4,"round_idx":1,"current_sec":2,"total_sec":8,"total_size":256000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:25.165+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":5,"round_idx":1,"current_sec":2,"total_sec":10,"total_size":320000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:27.237+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":6,"round_idx":1,"current_sec":2,"total_sec":12,"total_size":384000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:43.316+0800","caller":"common/usage.go:75","msg":"processed last billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":7,"round_idx":2,"current_sec":0,"total_sec":12,"total_size":415695,"left_size":31695,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T17:06:35.211+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"5046b7bd-da7b-40f0-91c1-26bfa515a9d2","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":1,"round_idx":1,"current_sec":2,"total_sec":2,"total_size":64000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T17:06:36.252+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"5046b7bd-da7b-40f0-91c1-26bfa515a9d2","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":2,"round_idx":1,"current_sec":2,"total_sec":4,"total_size":128000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T17:06:36.753+0800","caller":"common/usage.go:75","msg":"processed last billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"5046b7bd-da7b-40f0-91c1-26bfa515a9d2","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":3,"round_idx":1,"current_sec":0,"total_sec":4,"total_size":143360,"left_size":15360,"batch_size":64000, "BYOL": false}

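Filtering recommendations (all of the following conditions must hold):

  • the level field is the string "info";
  • the msg field contains the string "billable ASR audio";
  • the flow field is the string "ASR";
  • the tenant_id field is present and is not an empty string;
  • the current_sec field is present and its value is greater than 0.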
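
A minimal sketch of these five conditions as a predicate over parsed log lines is shown below. The function names `is_billable_asr` and `billable_asr_records` are hypothetical, and silently skipping lines that are not JSON objects is an assumption about how mixed log output should be handled.

```python
import json


def is_billable_asr(record: dict) -> bool:
    """Apply the five ASR filtering conditions to one parsed log record."""
    tenant_id = record.get("tenant_id")
    current_sec = record.get("current_sec")
    return (
        record.get("level") == "info"
        and "billable ASR audio" in str(record.get("msg", ""))
        and record.get("flow") == "ASR"
        and isinstance(tenant_id, str)
        and tenant_id != ""
        and isinstance(current_sec, (int, float))
        and not isinstance(current_sec, bool)
        and current_sec > 0
    )


def billable_asr_records(lines):
    """Yield parsed records that pass the filter; skip anything that is not a JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: non-JSON lines are simply ignored
        if isinstance(record, dict) and is_billable_asr(record):
            yield record
```

Note that the final "processed last billable ASR audio" entries in the sample logs carry current_sec 0 and are therefore dropped by the last condition.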

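Fields relevant to billing:

  • current_sec: length of the ASR audio about to be processed, in seconds; the current granularity is 2 seconds, so the value never exceeds 2;
  • total_sec: total length of all ASR audio processed so far in this session, in seconds;
  • tenant_id: tenant ID, the user or organization using the ASR service;
  • time: time the event occurred, in ISO 8601 format;
  • msg: main log message; "processed billable ASR audio" marks an in-progress entry, and "processed last billable ASR audio" marks the final entry written when the connection closes (this entry may be missing);
  • asr: vendor identifier of the ASR service actually used; billing models and unit prices differ between vendors;
  • BYOL: whether a user-provided vendor account is used; if true, the user supplied their own vendor account and bears that cost directly, so these entries should be excluded from billing.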

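For billing, first select the ASR usage logs using the conditions above, then filter by time and tenant, and finally sum current_sec to obtain the ASR usage of a single user over a period of time. If needed, fields such as session and device can be used for further filtering and analysis.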
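
The aggregation step could look roughly like the sketch below. It assumes the records have already been parsed and filtered with the conditions above, that time always matches the format seen in the sample logs, and that totals are wanted per tenant and vendor; `asr_seconds_by_tenant`, `TIME_FORMAT`, `UTC8` and `WINDOW` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Matches timestamps such as "2024-03-13T16:59:17.926+0800".
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def asr_seconds_by_tenant(records, start, end):
    """Sum current_sec per (tenant_id, asr vendor) for start <= time < end.

    `start` and `end` must be timezone-aware datetimes. Entries with BYOL set
    to true are skipped, since that cost is borne by the user.
    """
    totals = defaultdict(int)
    for record in records:
        if record.get("BYOL") is True:
            continue
        when = datetime.strptime(record["time"], TIME_FORMAT)
        if start <= when < end:
            key = (record["tenant_id"], record.get("asr", ""))
            totals[key] += record["current_sec"]
    return dict(totals)


# Example window: all of 2024-03-13 in UTC+8 (an arbitrary choice for illustration).
UTC8 = timezone(timedelta(hours=8))
WINDOW = (datetime(2024, 3, 13, tzinfo=UTC8), datetime(2024, 3, 14, tzinfo=UTC8))
```

Grouping by vendor as well as tenant keeps the totals usable when vendors are priced differently; drop the vendor from the key if a single per-tenant figure is enough.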

TTS Logs

The fields that may appear in a TTS billing event are as follows:

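| Field | Type | Meaning | Example |
| --- | --- | --- | --- |
| level | string | Log level, indicating the severity of the entry | "info" |
| time | string | Time the event occurred, in ISO 8601 format | "2024-03-13T19:24:35.879+0800" |
| caller | string | Source file and line number that produced the log | "ftts/flow.go:241" |
| msg | string | Main log message describing the event | "processing billable TTS query" |
| pid | integer | Process ID, identifying the process that produced the log | 65631 |
| flow | string | Business flow, here TTS | "TTS" |
| device | string | Device identifier of the device that produced this log | "device-wei" |
| session | string | Session identifier, uniquely identifying a session | "bd6b48a0-24be-4c58-8642-29f8b7d0c88c" |
| tts | string | Vendor identifier of the TTS service | "TTS5" |
| tenant_id | string | Tenant ID, the user or organization using the TTS service | "devbox" |
| request_index | integer | Request index: sequence number of the request within the session, starting from 1 | 1 |
| request | string | Request identifier, uniquely identifying a TTS request | "8scd9dzbez3vky5lg1s1ut815" |
| trace | string | Trace identifier, used to follow the processing of the request | "473b1daa-c469-4dd4-b90f-fe80d5d7c9e1" |
| section | string | Log category, here "usage", marking a usage-related log | "usage" |
| char_cnt | integer | Character count of the TTS request, i.e. of the text to be converted to speech | 73 |
| query_snap | string | Excerpt of the TTS request content, for preview and comparison | "19世纪末,保罗·埃尔利希" |
| hit_cache | boolean | Whether the cache was hit; true means the request was not actually sent to the vendor | false |
| language | string | Language of the TTS voice | "zh-CN" |
| voice_id | string | Voice identifier, indicating the type and gender of the TTS voice | "zh-cn-XiaoyiNeural" |
| facefeature_id | string | Face feature identifier, indicating certain features of the TTS service | "3d8917918f4c47d49a26fbea45808d44_s1" |
| conversion_id | string | Conversion identifier, used to track the conversion process of the TTS service | "nina" |
| BYOL | boolean | Whether a user-provided vendor account is used | false |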

Sample logs:

{"level":"info","time":"2024-04-07T06:53:16.594Z","caller":"ftts/flow.go:243","msg":"processing billable TTS query","pid":1,"flow":"TTS","device":"PAAS_LIVE","session":"357|123","tts":"TTS3","tenant_id":"kaifa-test","request_index":1,"request":"44444444444","trace":"1a08b1bc-5b9a-4574-9056-711ecd720c08","section":"usage","char_cnt":78,"rune_cnt":66,"query_snap":"<speak sttts:version=\"0.1\">开始打断内容,今天星期四<break time=\"0ms\"/></speak>","hit_cache":false,"language":"zh-CN","voice_id":"nina","facefeature_id":"0325_nina_s3_beauty","conversion_id":"", "BYOL": false}
{"level":"info","time":"2024-04-07T06:53:16.617Z","caller":"ftts/flow.go:243","msg":"processed billable TTS query","pid":1,"flow":"TTS","device":"PAAS_LIVE","session":"357|123","tts":"TTS3","tenant_id":"kaifa-test","request_index":1,"request":"44444444444","trace":"1a08b1bc-5b9a-4574-9056-711ecd720c08","section":"usage","char_cnt":78,"rune_cnt":66,"query_snap":"<speak sttts:version=\"0.1\">开始打断内容,今天星期四<break time=\"0ms\"/></speak>","hit_cache":true,"language":"zh-CN","voice_id":"nina","facefeature_id":"0325_nina_s3_beauty","conversion_id":"", "BYOL": false}
{"level":"info","time":"2024-04-07T07:13:07.299Z","caller":"ftts/flow.go:243","msg":"processing billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592278|6e5dc4df22174f5f8153750655c42b0a","tts":"TTS5","tenant_id":"166","request_index":1,"request":"6e5dc4df22174f5f8153750655c42b0a","trace":"cfda6671-df76-4684-9825-31ae060b9dba","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":false,"language":"zh-CN","voice_id":"zh-CN-XiaochenNeural","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": true}
{"level":"info","time":"2024-04-07T07:13:09.596Z","caller":"ftts/flow.go:243","msg":"processed billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592278|6e5dc4df22174f5f8153750655c42b0a","tts":"TTS5","tenant_id":"166","request_index":1,"request":"6e5dc4df22174f5f8153750655c42b0a","trace":"cfda6671-df76-4684-9825-31ae060b9dba","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":false,"language":"zh-CN","voice_id":"zh-CN-XiaochenNeural","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": true}
{"level":"info","time":"2024-04-07T07:18:38.030Z","caller":"ftts/flow.go:243","msg":"processing billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592279|c71d23205cf148daa55cb2b45fdbb848","tts":"TTS3","tenant_id":"166","request_index":1,"request":"c71d23205cf148daa55cb2b45fdbb848","trace":"b9c68547-8298-41f8-a3c8-aa86aa81d93a","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":false,"language":"zh-CN","voice_id":"nina","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": false}
{"level":"info","time":"2024-04-07T07:18:38.063Z","caller":"ftts/flow.go:243","msg":"processed billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592279|c71d23205cf148daa55cb2b45fdbb848","tts":"TTS3","tenant_id":"166","request_index":1,"request":"c71d23205cf148daa55cb2b45fdbb848","trace":"b9c68547-8298-41f8-a3c8-aa86aa81d93a","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":true,"language":"zh-CN","voice_id":"nina","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": false}

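Filtering recommendations (all of the following conditions must hold):

  • the level field is the string "info";
  • the msg field contains the string "billable TTS query";
  • the flow field is the string "TTS";
  • the tenant_id field is present and is not an empty string;
  • the char_cnt field is present and its value is greater than 0.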

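Fields relevant to billing:

  • char_cnt: character count of the current TTS request, i.e. of the text to be converted to speech. It is computed from each character's Unicode code point:

    • ASCII and Latin-1 characters: every character whose Unicode code point is 255 or below, including but not limited to the ASCII set and its Latin-1 extension (this range covers Latin letters and some symbols), counts as 1 billing unit.
    • Characters beyond ASCII and Latin-1: every character whose Unicode code point is above 255, including but not limited to Chinese, Japanese and Korean characters and Cyrillic letters, counts as 2 billing units.
    • Special characters: control characters, whitespace, and speech synthesis markup (SSML and USSML) tags are billed according to the same code-point ranges.

    For example, "Aloha" counts as 5 characters, "Voilà!" as 6, "你好!" (with a full-width exclamation mark) as 6, and "안녕하세요" as 10.

  • tenant_id: tenant ID, the user or organization using the TTS service;

  • time: time the event occurred, in ISO 8601 format;

  • msg: main log message; "processing billable TTS query" marks the start of synthesis, and "processed billable TTS query" marks its completion (this entry may be missing);

  • tts: vendor identifier of the TTS service actually used; billing models and unit prices differ between vendors;

  • hit_cache: whether the cache was hit; true means the request was not actually sent to the vendor. We recommend still charging the user for these requests, but excluding them from our own cost statistics;

  • facefeature_id: when non-empty, the TTS request also invoked the Face Feature service, whose GPU cost must be accounted for separately;

  • conversion_id: when non-empty, the TTS request also invoked the Voice Conversion service, whose GPU cost must be accounted for separately;

  • BYOL: whether a user-provided vendor account is used; if true, the user supplied their own vendor account and bears that cost directly, so these entries should be excluded from billing.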
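
The char_cnt counting rule described in the list above can be reproduced with a few lines of code, which is handy for cross-checking logged values. This is a sketch of the stated rule only; the function name `billing_units` is hypothetical, and it assumes the text is counted exactly as given, without any Unicode normalization.

```python
def billing_units(text: str) -> int:
    """Count 1 unit per code point <= 255 and 2 units per code point > 255."""
    return sum(1 if ord(ch) <= 255 else 2 for ch in text)
```

Markup is not special-cased: an SSML tag such as `<break/>` consists only of ASCII characters, so each of its characters contributes 1 unit. A decomposed accent (a base letter followed by a combining mark above U+00FF) would count differently from the precomposed "à" used in the example.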

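For billing, first select the TTS usage logs using the conditions above, then filter by time and tenant, and finally use char_cnt to compute the TTS usage of a single user over a period of time. If needed, fields such as session and device can be used for further filtering and analysis.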
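
One possible shape for this computation is sketched below. Because both "processing" and "processed" entries match the msg filter, the sketch assumes a request should be charged once and deduplicates on (session, request_index, request); that deduplication key, the per-vendor grouping, and the names `is_billable_tts` and `tts_chars_by_tenant` are assumptions made for illustration, and time-window filtering is left out for brevity.

```python
import json
from collections import defaultdict


def is_billable_tts(record: dict) -> bool:
    """Apply the five TTS filtering conditions to one parsed log record."""
    tenant_id = record.get("tenant_id")
    char_cnt = record.get("char_cnt")
    return (
        record.get("level") == "info"
        and "billable TTS query" in str(record.get("msg", ""))
        and record.get("flow") == "TTS"
        and isinstance(tenant_id, str)
        and tenant_id != ""
        and isinstance(char_cnt, int)
        and not isinstance(char_cnt, bool)
        and char_cnt > 0
    )


def tts_chars_by_tenant(lines):
    """Sum char_cnt per (tenant_id, tts vendor), counting each request once."""
    totals = defaultdict(int)
    seen = set()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: non-JSON lines are simply ignored
        if not isinstance(record, dict) or not is_billable_tts(record):
            continue
        if record.get("BYOL") is True:
            continue  # cost borne by the user's own vendor account
        # Assumption: the "processing" and "processed" entries of one request
        # share these three fields and should be charged only once.
        key = (record.get("session"), record.get("request_index"), record.get("request"))
        if key in seen:
            continue
        seen.add(key)
        totals[(record["tenant_id"], record.get("tts", ""))] += record["char_cnt"]
    return dict(totals)
```

Cache hits are deliberately not filtered here, in line with the recommendation to still charge users for them; a cost report would add a hit_cache check.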