Usage Log Parsing and Billing Recommendations #
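Usage logs are emitted only when tenant_id in the log is non-empty. In other words, when connecting to the central controller over WebSocket, the query string must carry a non-empty tenant value, for example: ws://localhost:8070/v1?tenant=89757. The same setting is required when calling the HTTP interface of the central controller's HTTP Adapter.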
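The central controller currently supports the following override parameters in the URL query. For each field, the controller checks the field name and then each alias in turn until it finds one that is present and non-empty: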
| Field | Aliases | Meaning | Behavior |
|---|---|---|---|
| tenant | tenantid, tenant_id | Tenant ID | If set, usage is recorded in the logs |
| device | deviceid, device_id | Device ID | If set, forcibly overrides the device field in the Starter message |
| session | sessionid, session_id | Session ID | If set, forcibly overrides the session field in the Starter message |
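The lookup order described above can be sketched as follows. This is only an illustration of "first present, non-empty value wins", not the central controller's actual code; the function name `resolve_overrides`, the `OVERRIDE_KEYS` table, and the `kiosk-1` device value are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Field name first, then its aliases, in the order listed in the table.
OVERRIDE_KEYS = {
    "tenant": ("tenant", "tenantid", "tenant_id"),
    "device": ("device", "deviceid", "device_id"),
    "session": ("session", "sessionid", "session_id"),
}


def resolve_overrides(url: str) -> dict[str, str]:
    """Return the first non-empty value found for each override field."""
    # parse_qs drops blank values by default, so "tenant=" counts as absent.
    query = parse_qs(urlsplit(url).query)
    resolved = {}
    for field, keys in OVERRIDE_KEYS.items():
        for key in keys:
            values = [v for v in query.get(key, []) if v]
            if values:
                resolved[field] = values[0]
                break
    return resolved


# An empty "tenant" falls through to the "tenant_id" alias.
print(resolve_overrides("ws://localhost:8070/v1?tenant=&tenant_id=89757&device=kiosk-1"))
# → {'tenant': '89757', 'device': 'kiosk-1'}
```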
All logs are in JSON format: each line is a standalone JSON object, written to the log file or to stdout. The log fields and their meanings are as follows:
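Because every line is an independent JSON object, the log can be read one line at a time. The sketch below is a minimal reader under that assumption; the name `iter_usage_records` is hypothetical, and skipping unparsable lines is a defensive choice rather than documented behavior. It keeps only records whose section field is "usage".

```python
import json
from typing import Iterable, Iterator


def iter_usage_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield usage records from an iterable of JSON-lines log text."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # defensive: ignore anything that is not valid JSON
        # Only JSON objects tagged as usage logs are of interest for billing.
        if isinstance(record, dict) and record.get("section") == "usage":
            yield record


# Abbreviated records for illustration; real log lines carry more fields.
sample = [
    '{"level":"info","flow":"ASR","section":"usage","current_sec":2}',
    '{"level":"info","msg":"unrelated message"}',
]
for rec in iter_usage_records(sample):
    print(rec["flow"], rec["current_sec"])
```

The same function accepts an open file object, since iterating a text file yields its lines.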
ASR Logs #
The fields that may appear in an ASR billing event are described below:
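| Field | Type | Meaning | Example |
|---|---|---|---|
| level | string | Log level, indicating the severity of the entry | "info" |
| time | string | Time at which the event occurred, in ISO 8601 format | "2024-03-13T16:59:17.926+0800" |
| caller | string | Source file and line number that produced the entry | "common/usage.go:75" |
| msg | string | Main log message, describing the specific event | "processed billable ASR audio" |
| pid | integer | Process ID, identifying the process that produced the entry | 99335 |
| flow | string | Business flow; here it is ASR | "ASR" |
| device | string | Device identifier, indicating the device that produced the entry | "test-cli-2" |
| session | string | Session identifier, uniquely identifying a session | "7bcaaa75-f2b9-4702-bd19-88601be3b8fb" |
| asr | string | Identifier of the ASR vendor | "ASR7" |
| tenant_id | string | Tenant ID, indicating the user or organization using the ASR service | "ourdevbox" |
| section | string | Log category; here it is "usage", marking a usage-related entry | "usage" |
| log_idx | integer | Log index: the sequence number of the entry within the session, starting from 1 | 1 |
| round_idx | integer | Round index: the sequence number of the round within the session, starting from 1 | 1 |
| current_sec | integer | Length of the ASR audio currently being processed, in seconds | 2 |
| total_sec | integer | Total length of all ASR audio processed so far in this session, in seconds | 12 |
| total_size | integer | Total amount of ASR audio data processed so far in this session, in bytes | 384000 |
| left_size | integer | Amount of ASR audio data still uncounted after this usage entry, in bytes | 0 |
| batch_size | integer | Amount of ASR audio data that triggers a usage entry, in bytes | 64000 |
| BYOL | boolean | Whether a vendor account supplied by the user is used | false |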
Sample log entries:
{"level":"info","time":"2024-03-13T16:59:17.926+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":1,"round_idx":1,"current_sec":2,"total_sec":2,"total_size":64000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:18.975+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":2,"round_idx":1,"current_sec":2,"total_sec":4,"total_size":128000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:21.026+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":3,"round_idx":1,"current_sec":2,"total_sec":6,"total_size":192000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:23.096+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":4,"round_idx":1,"current_sec":2,"total_sec":8,"total_size":256000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:25.165+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":5,"round_idx":1,"current_sec":2,"total_sec":10,"total_size":320000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:27.237+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":6,"round_idx":1,"current_sec":2,"total_sec":12,"total_size":384000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T16:59:43.316+0800","caller":"common/usage.go:75","msg":"processed last billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"7bcaaa75-f2b9-4702-bd19-88601be3b8fb","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":7,"round_idx":2,"current_sec":0,"total_sec":12,"total_size":415695,"left_size":31695,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T17:06:35.211+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"5046b7bd-da7b-40f0-91c1-26bfa515a9d2","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":1,"round_idx":1,"current_sec":2,"total_sec":2,"total_size":64000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T17:06:36.252+0800","caller":"common/usage.go:75","msg":"processed billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"5046b7bd-da7b-40f0-91c1-26bfa515a9d2","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":2,"round_idx":1,"current_sec":2,"total_sec":4,"total_size":128000,"left_size":0,"batch_size":64000, "BYOL": false}
{"level":"info","time":"2024-03-13T17:06:36.753+0800","caller":"common/usage.go:75","msg":"processed last billable ASR audio","pid":99335,"flow":"ASR","device":"test-cli-2","session":"5046b7bd-da7b-40f0-91c1-26bfa515a9d2","asr":"ASR7","tenant_id":"ourdevbox","section":"usage","log_idx":3,"round_idx":1,"current_sec":0,"total_sec":4,"total_size":143360,"left_size":15360,"batch_size":64000, "BYOL": false}
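Filtering recommendations. An entry must satisfy all of the following conditions:
- the level field is the string "info";
- the msg field contains the string "billable ASR audio";
- the flow field is the string "ASR";
- the tenant_id field exists and is not an empty string;
- the current_sec field exists and its value is greater than 0.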
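Fields relevant to billing:
- current_sec: length of the ASR audio currently being processed, in seconds. The current granularity is 2 seconds, so the value never exceeds 2.
- total_sec: total length of all ASR audio processed so far in this session, in seconds.
- tenant_id: tenant ID, indicating the user or organization using the ASR service.
- time: time at which the event occurred, in ISO 8601 format.
- msg: main log message. "processed billable ASR audio" marks an in-progress entry; "processed last billable ASR audio" marks the final entry written when the connection is closed (it may be missing).
- asr: vendor identifier, indicating which vendor provides the ASR service. Billing methods and unit prices differ between vendors.
- BYOL: whether a vendor account supplied by the user is used. If true, the user provided their own vendor account and bears that cost directly, so these entries should be filtered out when billing.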
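When billing, first filter out the ASR usage entries using the conditions above, then filter by time and tenant, and finally sum current_sec to obtain the ASR usage of a given user over a period of time. If needed, fields such as session and device can be used for further filtering and analysis.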
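The filter-then-sum procedure above can be sketched as follows. This is a minimal sketch, not a reference implementation: the function names are hypothetical, records are assumed to be already parsed into dicts, the window bounds are assumed to be timezone-aware datetimes, and grouping by (tenant_id, asr) is an illustrative choice made because unit prices differ by vendor.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Iterable

# Matches both "+0800" offsets and a trailing "Z" (Python 3.7+).
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def is_billable_asr(rec: dict) -> bool:
    """Apply the filter conditions listed above, plus the BYOL exclusion."""
    current_sec = rec.get("current_sec")
    return (
        rec.get("level") == "info"
        and "billable ASR audio" in str(rec.get("msg", ""))
        and rec.get("flow") == "ASR"
        and bool(rec.get("tenant_id"))
        and isinstance(current_sec, (int, float))
        and current_sec > 0
        and not rec.get("BYOL", False)
    )


def asr_seconds_by_tenant(records: Iterable[dict], start: datetime, end: datetime) -> dict:
    """Sum current_sec per (tenant_id, asr) for entries with start <= time < end."""
    totals = defaultdict(int)
    for rec in records:
        if not is_billable_asr(rec):
            continue
        when = datetime.strptime(rec["time"], TIME_FORMAT)
        if start <= when < end:
            totals[(rec["tenant_id"], rec.get("asr", ""))] += rec["current_sec"]
    return dict(totals)


# Abbreviated entries modeled on the sample log; real entries carry more fields.
base = {"level": "info", "flow": "ASR", "asr": "ASR7", "tenant_id": "ourdevbox", "BYOL": False}
records = [
    {**base, "time": "2024-03-13T16:59:17.926+0800", "msg": "processed billable ASR audio", "current_sec": 2},
    {**base, "time": "2024-03-13T16:59:18.975+0800", "msg": "processed billable ASR audio", "current_sec": 2},
    {**base, "time": "2024-03-13T16:59:43.316+0800", "msg": "processed last billable ASR audio", "current_sec": 0},
]
day_start = datetime(2024, 3, 13, tzinfo=timezone.utc)
day_end = datetime(2024, 3, 14, tzinfo=timezone.utc)
print(asr_seconds_by_tenant(records, day_start, day_end))
# → {('ourdevbox', 'ASR7'): 4}
```

The final entry of a session has current_sec equal to 0 in the sample log, so the current_sec > 0 condition drops it without affecting the total.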
TTS Logs #
The fields that may appear in a TTS billing event are described below:
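| Field | Type | Meaning | Example |
|---|---|---|---|
| level | string | Log level, indicating the severity of the entry | "info" |
| time | string | Time at which the event occurred, in ISO 8601 format | "2024-03-13T19:24:35.879+0800" |
| caller | string | Source file and line number that produced the entry | "ftts/flow.go:241" |
| msg | string | Main log message, describing the specific event | "processing billable TTS query" |
| pid | integer | Process ID, identifying the process that produced the entry | 65631 |
| flow | string | Business flow; here it is TTS | "TTS" |
| device | string | Device identifier, indicating the device that produced the entry | "device-wei" |
| session | string | Session identifier, uniquely identifying a session | "bd6b48a0-24be-4c58-8642-29f8b7d0c88c" |
| tts | string | Identifier of the TTS vendor | "TTS5" |
| tenant_id | string | Tenant ID, indicating the user or organization using the TTS service | "devbox" |
| request_index | integer | Request index: the sequence number of the request within the session, starting from 1 | 1 |
| request | string | Request identifier, uniquely identifying a TTS request | "8scd9dzbez3vky5lg1s1ut815" |
| trace | string | Trace identifier, used to follow the processing of the request | "473b1daa-c469-4dd4-b90f-fe80d5d7c9e1" |
| section | string | Log category; here it is "usage", marking a usage-related entry | "usage" |
| char_cnt | integer | Character count of the TTS request, that is, of the text to be converted to speech | 73 |
| query_snap | string | Excerpt of the TTS request content, for preview and comparison | "19世纪末,保罗·埃尔利希" |
| hit_cache | boolean | Whether the cache was hit; if true, the request was not actually sent to the vendor | false |
| language | string | Language of the TTS voice | "zh-CN" |
| voice_id | string | Voice identifier, indicating the type and gender of the TTS voice | "zh-cn-XiaoyiNeural" |
| facefeature_id | string | Face feature identifier, indicating certain characteristics of the TTS service | "3d8917918f4c47d49a26fbea45808d44_s1" |
| conversion_id | string | Conversion identifier, used to track the conversion step of the TTS service | "nina" |
| BYOL | boolean | Whether a vendor account supplied by the user is used | false |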
Sample log entries:
{"level":"info","time":"2024-04-07T06:53:16.594Z","caller":"ftts/flow.go:243","msg":"processing billable TTS query","pid":1,"flow":"TTS","device":"PAAS_LIVE","session":"357|123","tts":"TTS3","tenant_id":"kaifa-test","request_index":1,"request":"44444444444","trace":"1a08b1bc-5b9a-4574-9056-711ecd720c08","section":"usage","char_cnt":78,"rune_cnt":66,"query_snap":"<speak sttts:version=\"0.1\">开始打断内容,今天星期四<break time=\"0ms\"/></speak>","hit_cache":false,"language":"zh-CN","voice_id":"nina","facefeature_id":"0325_nina_s3_beauty","conversion_id":"", "BYOL": false}
{"level":"info","time":"2024-04-07T06:53:16.617Z","caller":"ftts/flow.go:243","msg":"processed billable TTS query","pid":1,"flow":"TTS","device":"PAAS_LIVE","session":"357|123","tts":"TTS3","tenant_id":"kaifa-test","request_index":1,"request":"44444444444","trace":"1a08b1bc-5b9a-4574-9056-711ecd720c08","section":"usage","char_cnt":78,"rune_cnt":66,"query_snap":"<speak sttts:version=\"0.1\">开始打断内容,今天星期四<break time=\"0ms\"/></speak>","hit_cache":true,"language":"zh-CN","voice_id":"nina","facefeature_id":"0325_nina_s3_beauty","conversion_id":"", "BYOL": false}
{"level":"info","time":"2024-04-07T07:13:07.299Z","caller":"ftts/flow.go:243","msg":"processing billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592278|6e5dc4df22174f5f8153750655c42b0a","tts":"TTS5","tenant_id":"166","request_index":1,"request":"6e5dc4df22174f5f8153750655c42b0a","trace":"cfda6671-df76-4684-9825-31ae060b9dba","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":false,"language":"zh-CN","voice_id":"zh-CN-XiaochenNeural","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": true}
{"level":"info","time":"2024-04-07T07:13:09.596Z","caller":"ftts/flow.go:243","msg":"processed billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592278|6e5dc4df22174f5f8153750655c42b0a","tts":"TTS5","tenant_id":"166","request_index":1,"request":"6e5dc4df22174f5f8153750655c42b0a","trace":"cfda6671-df76-4684-9825-31ae060b9dba","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":false,"language":"zh-CN","voice_id":"zh-CN-XiaochenNeural","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": true}
{"level":"info","time":"2024-04-07T07:18:38.030Z","caller":"ftts/flow.go:243","msg":"processing billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592279|c71d23205cf148daa55cb2b45fdbb848","tts":"TTS3","tenant_id":"166","request_index":1,"request":"c71d23205cf148daa55cb2b45fdbb848","trace":"b9c68547-8298-41f8-a3c8-aa86aa81d93a","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":false,"language":"zh-CN","voice_id":"nina","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": false}
{"level":"info","time":"2024-04-07T07:18:38.063Z","caller":"ftts/flow.go:243","msg":"processed billable TTS query","pid":1,"flow":"TTS","device":"PAAS_VIDEO_2D","session":"20989592279|c71d23205cf148daa55cb2b45fdbb848","tts":"TTS3","tenant_id":"166","request_index":1,"request":"c71d23205cf148daa55cb2b45fdbb848","trace":"b9c68547-8298-41f8-a3c8-aa86aa81d93a","section":"usage","char_cnt":449,"rune_cnt":231,"query_snap":"商汤科技作为人工智能软件公司,商汤科技以“坚持原创,让AI引领人类进步”为使命,旨在持续引领人工智能前沿研究,持续打造更具拓展性更普惠的人工智能软件平台,推动经济、社会和人类的发展,并持续吸引及培养顶尖人才,共同塑造未来。 \n\n\n\n商汤科技拥有深厚的学术积累,并长期投入于原创技术研究,不断增强行业领先的多模态、多任务通用人工智能能力,涵盖感知智能、自然语言处理、决策智能、智能内容生成等关键技术领域,同时包含AI芯片、AI传感器及AI算力基础设施在内的关键能力","hit_cache":true,"language":"zh-CN","voice_id":"nina","facefeature_id":"af0c4f927f7a4af7820167c0928ea357_s1_1","conversion_id":"", "BYOL": false}
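Filtering recommendations. An entry must satisfy all of the following conditions:
- the level field is the string "info";
- the msg field contains the string "billable TTS query";
- the flow field is the string "TTS";
- the tenant_id field exists and is not an empty string;
- the char_cnt field exists and its value is greater than 0.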
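Fields relevant to billing:
- char_cnt: character count of the current TTS request, that is, of the text to be converted to speech. The count is determined by each character's Unicode code point:
  - ASCII and the Latin-1 extension: every character whose Unicode code point is 255 or below, including but not limited to the ASCII character set and its Latin-1 extension (a range that covers Latin letters and some symbols), counts as 1 billing unit.
  - Characters beyond ASCII and the Latin-1 extension: every character whose Unicode code point is above 255, including but not limited to Chinese, Japanese, and Korean characters and Cyrillic letters, counts as 2 billing units.
  - Special characters: control characters, whitespace characters, and speech synthesis markup language (SSML and USSML) tags are billed according to the same code point ranges.

  For example, "Aloha" counts as 5 characters, "Voilà!" counts as 6 characters, "你好!" (ending in a full-width exclamation mark) counts as 6 characters, and "안녕하세요" counts as 10 characters.
- tenant_id: tenant ID, indicating the user or organization using the TTS service.
- time: time at which the event occurred, in ISO 8601 format.
- msg: main log message. "processing billable TTS query" marks the start of synthesis; "processed billable TTS query" marks the end of synthesis (it may be missing).
- tts: vendor identifier, indicating which vendor provides the TTS service. Billing methods and unit prices differ between vendors.
- hit_cache: whether the cache was hit. If true, the request was not actually sent to the vendor. We recommend charging the user all the same, but excluding these entries from our own cost statistics.
- facefeature_id: when non-empty, the TTS request also invoked the Face Feature service, whose GPU cost must be calculated separately.
- conversion_id: when non-empty, the TTS request also invoked the Voice Conversion service, whose GPU cost must be calculated separately.
- BYOL: whether a vendor account supplied by the user is used. If true, the user provided their own vendor account and bears that cost directly, so these entries should be filtered out when billing.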
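The char_cnt counting rule can be reproduced with a short function, which is handy for cross-checking a logged value against the request text. This is a sketch of the rule as stated above, not the service's own code; the name `billing_units` is hypothetical.

```python
def billing_units(text: str) -> int:
    """Count 1 unit per code point up to 255 and 2 units per code point above it."""
    return sum(1 if ord(ch) <= 255 else 2 for ch in text)


# The examples from the text; "\uff01" is the full-width exclamation mark.
assert billing_units("Aloha") == 5
assert billing_units("Voilà!") == 6
assert billing_units("你好\uff01") == 6
assert billing_units("안녕하세요") == 10

# Markup is counted like any other text: these 8 ASCII characters cost 8 units.
print(billing_units("</speak>"))  # → 8
```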
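When billing, first filter out the TTS usage entries using the conditions above, then filter by time and tenant, and finally use char_cnt to compute the TTS usage of a given user over a period of time. If needed, fields such as session and device can be used for further filtering and analysis.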
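A sketch of the TTS aggregation follows, with one assumption that goes beyond the text: in the sample log the "processing" and "processed" entries of a request carry the same char_cnt, and both match the msg filter, so this sketch counts each request once, keyed by (tenant_id, session, request). The function names and the dedup key are hypothetical, records are assumed to be already parsed into dicts, and time-window filtering is omitted for brevity.

```python
from collections import defaultdict
from typing import Iterable


def is_billable_tts(rec: dict) -> bool:
    """Apply the filter conditions listed above, plus the BYOL exclusion."""
    char_cnt = rec.get("char_cnt")
    return (
        rec.get("level") == "info"
        and "billable TTS query" in str(rec.get("msg", ""))
        and rec.get("flow") == "TTS"
        and bool(rec.get("tenant_id"))
        and isinstance(char_cnt, int)
        and char_cnt > 0
        and not rec.get("BYOL", False)
    )


def tts_units_by_tenant(records: Iterable[dict]) -> dict:
    """Sum char_cnt per (tenant_id, tts), counting each request only once."""
    seen = set()
    totals = defaultdict(int)
    for rec in records:
        if not is_billable_tts(rec):
            continue
        # Assumption: this triple identifies one TTS request across its entries.
        key = (rec["tenant_id"], rec.get("session"), rec.get("request"))
        if key in seen:
            continue
        seen.add(key)
        # Cache hits are still counted, following the recommendation to
        # charge the user even when the vendor was not called.
        totals[(rec["tenant_id"], rec.get("tts", ""))] += rec["char_cnt"]
    return dict(totals)


# Abbreviated entries modeled on the sample log; real entries carry more fields.
base = {"level": "info", "flow": "TTS", "tts": "TTS3", "tenant_id": "kaifa-test",
        "session": "357|123", "request": "44444444444", "char_cnt": 78, "BYOL": False}
records = [
    {**base, "msg": "processing billable TTS query", "hit_cache": False},
    {**base, "msg": "processed billable TTS query", "hit_cache": True},
]
print(tts_units_by_tenant(records))
# → {('kaifa-test', 'TTS3'): 78}
```

For internal cost statistics, the same loop could skip entries with hit_cache set to true; how to treat a request whose two entries disagree on hit_cache is not specified here and would need to be confirmed.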