set use_spk2info_cache=False

yuekaiz, 5 months ago
Parent
Current commit
8ded65e611
2 files changed, 3 insertions and 1 deletion
  1. runtime/triton_trtllm/README.md (+2, -0)
  2. runtime/triton_trtllm/run.sh (+1, -1)

+ 2 - 0
runtime/triton_trtllm/README.md

@@ -84,6 +84,8 @@ The following results were obtained by decoding on a single L20 GPU with 26 prom
 | Streaming, use_spk2info_cache=True | 2 | 323.04 | 316.83 | 0.0905 |
 | Streaming, use_spk2info_cache=True | 4 | 977.68 | 903.68| 0.0733 |
 
+> If your service only needs a fixed speaker, you can set `use_spk2info_cache=True` in `run.sh`. To add more speakers, refer to the instructions [here](https://github.com/qi-hua/async_cosyvoice?tab=readme-ov-file#9-spk2info-%E8%AF%B4%E6%98%8E).
+
 **Offline TTS (Full Sentence Latency)**
 | Mode | Note | Concurrency | Avg Latency (ms) | P50 Latency (ms) | RTF |
 |---|---|---|---|---|---|

+ 1 - 1
runtime/triton_trtllm/run.sh

@@ -15,7 +15,7 @@ trt_engines_dir=./trt_engines_${trt_dtype}
 
 model_repo=./model_repo_cosyvoice2
 
-use_spk2info_cache=True
+use_spk2info_cache=False
 
 if [ $stage -le -1 ] && [ $stop_stage -ge -1 ]; then
     echo "Cloning CosyVoice"
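
With this commit the default in `run.sh` is `use_spk2info_cache=False`, so a fixed-speaker deployment has to switch the flag back on, as the README note describes. A minimal sketch of doing that without hand-editing the script is shown below. The helper name `set_spk2info_cache` is hypothetical and not part of the repository. It assumes the assignment sits at the start of its own line, as it does in the hunk above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): rewrite the
# use_spk2info_cache assignment in run.sh in place.
set_spk2info_cache() {
    local script="$1"   # path to run.sh
    local value="$2"    # True or False
    local tmp
    tmp="$(mktemp)"
    # Replace only the line that begins with the assignment, then write the
    # result back through cat so the script keeps its file permissions.
    sed "s/^use_spk2info_cache=.*/use_spk2info_cache=${value}/" "$script" > "$tmp" \
        && cat "$tmp" > "$script"
    rm -f "$tmp"
}

# Example (path taken from the diff above):
#   set_spk2info_cache runtime/triton_trtllm/run.sh True
```

Going through a temporary file rather than `sed -i` avoids the differences between GNU and BSD `sed` in how the in-place flag is spelled.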