Platform-independent is Nothing

最新推荐文章于 2025-11-30 18:24:05 发布

最新推荐文章于 2025-11-30 18:24:05 发布 · 120 阅读

文章标签：

本文通过一个使用Perl处理文本文件的实际案例，揭示了在不同操作系统间处理文本文件时可能遇到的问题，特别是换行符的不同导致的匹配失败，并提出了相应的解决办法。

Last Friday I had to process some text files, adding some tags before some patterns in a text file. Perl is born for such a task. I first copied all the patterns into a file in Windows XP, then I switched to Linux to use Perl to finishing the task. But it never worked in the way I expected. The script was simple just reading target file line by line, then testing whether matching the patterns which read from files and chomped, if matching adding the tag. But it never worked, the line never matched the pattern, even they did match.

Suddenly, I realized the problem, the file containing the patterns was created in Windows XP, but I read and chomped it in Linux, So newline terminator '\r\n' was not chomped completely. As a result, lines never matched patterns except for last pattern, which was last line of that file and did not contain newline. So when ran it in Windows XP, I got expected results.

Here is the simple scene, the file is:

freedom
is
nothing
but
a
chance
to
be
better

Here is the Perl script:

#!/usr/bin/perl -w
my $src = "file.txt";
open SOURCE, $src or die "failed to open $src, $!";
chomp(@strings = <SOURCE>);
print "@strings\n";

Here is the output running in Linux:

user@desktop:~/home$ ls
file.txt  toy.pl

user@desktop:~/home$ file file.txt 
file.txt: ASCII text, with CRLF line terminators

user@desktop:~/home$ perl toy.pl file.txt 
 betterg

user@desktop:~/home$

if run toy.pl in windows, it get expected result:

z:\Explorer\assets>perl toy.pl
freedom is nothing but a chance to be better

So, Perl is doing the right thing here, the reason is that I shouldn't process Windows' text files directly in Linux, we should have converted it to Unix format or specified $/ to \r\n.

This is a trivial issue, but revealing an idea that real platform independence is nothing. Nothing can run correctly without a change across different platforms. C, C++, so called platform independent, but you have to pay extra attentions to word size, byte order and architecture gaps between different platforms. Java so called "compile once run everywhere", byte code is indeed platform independent, but byte code must run upon Java Virtual Machine, which is absolutely platform dependent, as a result, Java is platform dependent, to some extent at least!