Re: CRLF fun stuff again...


Subject: Re: CRLF fun stuff again...
From: Paul Schinder (schinder@leprss.gsfc.nasa.gov)
Date: Sat Feb 10 2001 - 09:38:03 EST


At 11:11 AM +0000 2/10/01, Duncan Sinclair wrote:
>Hi,
>
>Sorry for the continuing mailbox pollution...
>
>Steve Freitas writes:
>>I wrote:
>>> But the transformation is reversible. The conversion is as follows:
>>>
>>> CR -> LF
>>> LF -> CR
>>>
>>> You do the transformation twice and you get back an identical file.
>>
>>Not true with all files.
>
>Yes true for all files. No matter what the 8-bit values "13" and "10"
>mean, you can do this swap twice and you'll get back the exact same
>file.

No, it isn't reversible. If a file originally contains both CR and
LF (as most binary files do), and you convert the CRs to LFs, it's no
longer possible to tell which LFs are converted CRs and which LFs
were there to begin with. If you then do the reverse transformation,
there will be no LFs left in the file where originally there were
some. You simply fooled yourself with a single binary file that
happened to have only CR. Try one that has both to begin with, and
you'll find it ends up corrupted.
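
The one-way case described above can be sketched in a few lines of
Python. This is only an illustration, assuming a converter that
rewrites every CR as LF and a reverse pass that rewrites every LF as
CR; the function names cr_to_lf and lf_to_cr and the sample bytes are
made up for the example.

```python
def cr_to_lf(data: bytes) -> bytes:
    """Hypothetical one-way conversion: every CR (13) becomes LF (10)."""
    return data.replace(b"\r", b"\n")


def lf_to_cr(data: bytes) -> bytes:
    """Hypothetical reverse pass: every LF (10) becomes CR (13)."""
    return data.replace(b"\n", b"\r")


# Made-up sample that contains both a CR and an LF.
original = b"A\rB\nC"

converted = cr_to_lf(original)   # the CR and the original LF are now both LF
restored = lf_to_cr(converted)   # every LF becomes CR, including the original one

print(original, converted, restored)
print("round trip intact:", restored == original)
```

Note that this models a converter that rewrites in one direction at a
time. It does not model a simultaneous swap such as the quoted tr
command.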

You're also wrong about the Mac OS side. Not all files that are text
have file type TEXT, and not all files that are file type TEXT are
text files. TEXT is simply a label attached to a file by a program.
The operating system makes no effort to ensure that TEXT files are
actually text files.

Turning on CRLF conversion is a very bad idea. ftp has the same
problem, and the same data corruption occurs if you transfer a
binary file in ASCII mode.

>
>Here's a test with a binary jpeg file....
>
> (Using GNU tr - Sun's "tr" doesn't cope with binary files - if you
> repeat this test make sure you use one that does.)
>
>quartz:~% sum duncan.jpg
>32290 17 duncan.jpg
>
>Original file's checksum is 32290
>
>quartz:~% tr '\r\n' '\n\r' < duncan.jpg > foo.jpg
>quartz:~% sum foo.jpg
>13378 17 foo.jpg
>
>Transform once, checksum is 13378
>
>quartz:~% tr '\r\n' '\n\r' < foo.jpg > bar.jpg
>quartz:~% sum bar.jpg
>32290 17 bar.jpg
>
>Transform a second time, checksum is back to 32290
>
>Convinced yet???
>
>Cheers,
>
>
>Duncan.

-- 
Paul J. Schinder
NASA Goddard Space Flight Center
Code 693
schinder@leprss.gsfc.nasa.gov


