want to port JIT to MIPS - stack/code segment


http://marc.info/?l=webkit-dev&m=123572053829137&w=2



On ARM, we have a rule set called the EABI (Embedded Application Binary
Interface). It says the stack must always be word aligned, and must be
two-word (8-byte) aligned when you call other functions. The WebKit
interpreter callbacks return either a single pointer (sometimes an int
containing a boolean value) or double pointers. These return values can
be passed through registers, so there is no need to pre-allocate stack
space for them. The functions generated by g++ are also EABI compliant,
so we don't need to worry about the stack at all.

I think AssemblerBuffer only temporarily holds the generated machine
instructions. When the compilation phase is done, you need to call
AssemblerBuffer::executableCopy, which allocates a new executable memory
space, and that space is aligned by ExecutableAllocator.

Cheers,
Zoltan

> Zoltan,
> thanks a lot! Yeah, the issue is just JIT related.
> Do I need to take care of the stack in JIT code, say, align the stack
> before emitting an asm call? I guess there is no need, because MIPS is
> always aligned to 32 bits, and the only double-returning functions in
> WebKit return their result in registers, not memory.
> For AssemblerBuffer.h I think it is different, because the initial
> 256-byte buffer may not be aligned to 32 bits. I'll add
> __attribute__ ((aligned (4))) or 8.
> rgds
> joe


I am not sure I understand your questions. The code blocks are allocated
by mmap() or VirtualAlloc(), thus they are aligned to 4K. Smaller chunks
are aligned by the roundUpAllocationSize() function. Currently the
alignment is sizeof(void*) on both x86 and ARM. See ExecutableAllocator.h

The current JIT implementations don't store temporary variables on the
stack; they allocate a fixed-size buffer after the entry, and only free
it when you leave the JIT. This approach is much easier than keeping
track of the stack.

Cheers,
Zoltan

> gcc handles it well for x86. Now on MIPS I need to do the following, right?
> 1. make sure the (re)allocated code buffer is aligned to 64 bits, since
> gcc's malloc() only guarantees 32 bits
> 2. before any call instruction in JIT code, make sure the stack is also
> aligned to 64 bits.
> PPC has no JIT, thus no problem, right?
> rgds
> joe
