How To Print Concurrent Requests in PDF Format

This document explains why concurrent request reports in PDF format cannot be printed or reprinted directly from the Oracle Applications concurrent manager, and it provides several workarounds.
Subject: How To Print Concurrent Requests in PDF Format
 Doc ID: Note:333504.1    Type: HOWTO
 Last Revision Date: 05-JUL-2006    Status: PUBLISHED

In this Document
  Goal
  Solution
  References


 

 

Applies to:

Oracle Application Object Library - Version: 11.5.1 to 11.5
Pasta - Version: 11.5.1 to 11.5.10
Information in this document applies to any platform.
FNDPSTAX, PDF, pasta.cfg

Goal

How do you print or reprint concurrent request reports in PDF format directly from the concurrent manager?

Solution

Printing or reprinting concurrent request reports in PDF format directly from the concurrent manager is not supported for the following reasons:

1. Virtually no printers (or very few) understand the raw PDF format. The Adobe Acrobat Reader converts the PDF file to another format (RTF / Postscript) before sending the file to a printer; this conversion process is not present in Oracle Applications.

2. As per Bug 1243042, Oracle Applications supports PDF output for viewing only: "This is the way the Applications are designed and implemented... anything beyond this would be more of an enhancement request." That is, printing PDF directly from the concurrent manager is not supported. This concurrent manager limitation is also present in release 11.5.

3. Oracle Applications seeded reports are designed as "Character" (ASCII text) reports. Reports designed in Character mode use a different scaling than reports designed in Bitmap mode (Postscript, PDF, PCL, etc.): inches rather than points. Although you can change or convert a seeded Character report to another format, these seeded reports were not designed for it; therefore, the layout may or may not print properly, and customization of the reports and/or drivers may be required.

NOTE 1:  The option to change the output format of report programs exists in order to extend and customize Oracle Applications; that is, it allows custom reports in various formats to be built and added to Oracle Applications. Some customizations, specific to a particular business need, may require the assistance of Oracle Consulting or a third-party consulting group.

 

Limitations:

4. If the character set of your installation is UTF8, generating any report in PDF format (with or without printing) is not supported. PDF output is not supported under a UTF8 environment or any other non-Latin-1 character set. As per Note 189708.1, "Oracle Reports 6i Setup Guide for Oracle Applications 11i":

"Oracle Reports 6i cannot generate PDF output with multi-byte characters even with the settings described in this section...In Oracle Applications 11i with Oracle Reports 6i, PDF output is supported for only Latin-1 character sets such as US7ASCII, WE8ISO8859P1 or WE8MSWIN1252. Any other single-byte, multi-byte or Unicode character set such as UTF8 or WE8ISO8859P15 is not supported."

NOTE 2: Under UTF8, generating PDF via the XML Publisher product is supported; however, printing of XML Publisher generated PDF files is accomplished by third-party software and Apps seeded drivers--see step 8 of Note 316447.1, "About Oracle XML Publisher Release 5.5".

 

5. If the character set of your installation is UTF8, Pasta is required for all of your printing, regardless of the output format being used. The printing chapter of the Oracle Applications System Administrator's Guide states, "In order to print reports with the UTF8 character set, you 'must' configure PASTA."

NOTE 3: Older printers do not fully support the Unicode (UTF8) character set; therefore, it is essential to use Pasta to convert the output file to a format or character set that older printers do fully support. The most common problem is poor rendering of extended characters such as accented characters, the Euro symbol, or umlauts.

 

Workarounds:

6. From Oracle Applications, a PDF output file can be viewed in the Adobe Acrobat Reader and printed from there.

7. Although it may be possible to create a custom print driver or print program using the Adobe Acrobat Distiller, viable instructions on how to perform such a custom setup are very scarce; see Note 262657.1 (intended for Latin-1 character set environments).

8. Use Pasta, an Oracle post-printing program, to convert a copy of the report output file to the desired or printable output format. Please reference the Pasta User's Guide Release 3.0 on Metalink for additional information on Pasta. In this type of setup, you will need to determine which third-party product and which operating-system commands are needed to convert the output file to the desired format and submit the print job. Place the needed commands in the pasta.cfg file as printCommand=command...arguments. Another approach is to use a third-party file conversion program with the Pasta preprocessing command, preprocess=command...arguments (e.g., preprocess=pdf2ps {infile} {outfile}). See the examples at the end of the Pasta User's Guide under "Preprocessing Command".
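
As an illustration only, the pasta.cfg entries for such a setup might look like the fragment below. The preprocess example comes straight from this note; the lp command, the placeholder printer name, and the comments are assumptions for this sketch--consult the Pasta User's Guide for the exact directives and placeholder tokens supported by your release.

    # Convert the copy of the PDF output file to Postscript before printing.
    # pdf2ps ships with Ghostscript; any equivalent converter could be used.
    preprocess=pdf2ps {infile} {outfile}

    # Operating-system print command and arguments; replace <printer_name>
    # with a real printer and append the file-name token documented in the
    # Pasta User's Guide.
    printCommand=lp -d <printer_name>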

9. Lastly, use another output format, such as Postscript, that is fully supported for both viewing and printing.

 

NOTE 4:  Please keep in mind that Applications Support is not structured to support in-depth customization issues; because that knowledge is distributed across several groups, such assistance is very limited. Please reference WebIV Note 122452.1, "Oracle Support Services Policy Regarding Customizations".

References

Note 189708.1 - Oracle Reports 6i Setup Guide for Oracle Applications 11i
Note 240864.1 - Activating and Configuring IX Library
Note 99495.1 - Oracle Applications Postscript Printing
 