Abstract: Positional encoding is crucial for the Transformer to effectively process multimodal feature information in multispectral object detection. However, existing studies often directly apply ...