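A stress-testing tool focused on large language models. It can be customized to support multiple dataset formats and different API protocol formats.
Usage
Command line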
evalscope perf --help
usage: evalscope <command> [<args>] perf [-h] --model MODEL [--url URL] [--connect-timeout CONNECT_TIMEOUT] [--read-timeout READ_TIMEOUT] [-n NUMBER] [--parallel PARALLEL] [--rate RATE]
[--log-every-n-query LOG_EVERY_N_QUERY] [--headers KEY1=VALUE1 [KEY1=VALUE1 ...]] [--wandb-api-key WANDB_API_KEY] [--name NAME] [--debug] [--tokenizer-path TOKENIZER_PATH]
[--api API] [--max-prompt-length MAX_PROMPT_LENGTH] [--min-prompt-length MIN_PROMPT_LENGTH] [--prompt PROMPT] [--query-template QUERY_TEMPLATE] [--dataset DATASET]
[--dataset-path DATASET_PATH] [--frequency-penalty FREQUENCY_PENALTY] [--logprobs] [--max-tokens MAX_TOKENS] [--n-choices N_CHOICES] [--seed SEED] [--stop STOP] [--stream]
[--temperature TEMPERATURE] [--top-p TOP_P]
options:
-h, --help show this help message and exit
--model MODEL The test model name.
--url URL
--connect-timeout CONNECT_TIMEOUT
The network connection timeout
--read-timeout READ_TIMEOUT
The network read timeout
-n NUMBER, --number NUMBER
Number of requests to send. If None, requests are sent based on the dataset or prompt.
--parallel PARALLEL Number of concurrent requests. Default: 1.
--rate RATE Number of requests per second. Default: None. If set to -1, all requests are sent at time 0. Otherwise, a Poisson process is used to synthesize the request arrival times. Mutually exclusive
with --parallel.
--log-every-n-query LOG_EVERY_N_QUERY
Log every n queries.
--headers KEY1=VALUE1 [KEY1=VALUE1 ...]
Extra HTTP headers, given as key1=value1 key2=value2. The headers are sent with every query. Use this parameter to specify HTTP authorization and other headers.
--wandb-api-key WANDB_API_KEY
The wandb API key. If set, the metrics are saved to wandb.
--name NAME The name used for the wandb result and for the result db. Default: {model_name}_{current_time}
--debug Debug request sen