Subject: Re: German Umlauts - next questions
From: Axel Rose (rose@sj.com)
Date: Thu Aug 02 2001 - 06:47:44 EDT
Hello Harald,
With Perl you can help yourself.
First, a small helper script to find the character codes:
#!/usr/bin/perl
printf( "ä = %x\n", ord('ä') );
printf( "ö = %x\n", ord('ö') );
printf( "ü = %x\n", ord('ü') );
printf( "Ä = %x\n", ord('Ä') );
printf( "Ö = %x\n", ord('Ö') );
printf( "Ü = %x\n", ord('Ü') );
printf( "ß = %x\n", ord('ß') );
__END__
Run on a Mac (script saved as MacRoman), this prints:
ä = 8a
ö = 9a
ü = 9f
Ä = 80
Ö = 85
Ü = 86
ß = a7
Run under Unix (script saved as ISO-8859-1), this prints:
ä = e4
ö = f6
ü = fc
Ä = c4
Ö = d6
Ü = dc
ß = df
Then use this to build your own converter:
#!/usr/bin/perl -w
use strict;
my $filename = "achja-äöü-ÄÖÜ-ß-/-entreé";
print isoencode( $filename ), "\n";
sub isoencode
{
    my $input_name = $_[0];
    # incomplete!! TODO
    # from ISO-8859-1 to ES encoding
    # /, ae, oe, ue, Ae, Oe, Ue, sz
    $input_name =~ s/\//:2f/g;
    $input_name =~ s/\xe4/:8a/g;
    $input_name =~ s/\xf6/:9a/g;
    $input_name =~ s/\xfc/:9f/g;
    $input_name =~ s/\xc4/:80/g;
    $input_name =~ s/\xd6/:85/g;
    $input_name =~ s/\xdc/:86/g;
    $input_name =~ s/\xdf/:a7/g;
    # DEL char 0x7f, 127
    $input_name =~ s/\x7f/:7f/g;
    return( $input_name );
}
__END__
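
The same substitutions can also be kept in a table, which is easier to extend than a chain of s/// lines. This is only a sketch: the hash %iso2mac and the sub name isoencode_table are my own names, the entries are just the seven pairs from the two result lists above, and any other high-bit character is still passed through unchanged.

```perl
#!/usr/bin/perl -w
use strict;

# ISO-8859-1 byte => MacRoman code, copied from the two result lists.
# Still incomplete, extend as needed.
my %iso2mac = (
    "\xe4" => 0x8a,    # ae
    "\xf6" => 0x9a,    # oe
    "\xfc" => 0x9f,    # ue
    "\xc4" => 0x80,    # Ae
    "\xd6" => 0x85,    # Oe
    "\xdc" => 0x86,    # Ue
    "\xdf" => 0xa7,    # sz
);

sub isoencode_table
{
    my $name = $_[0];
    # "/" and DEL are written with their own code
    $name =~ s/([\/\x7f])/sprintf( ":%02x", ord( $1 ) )/ge;
    # high-bit characters found in the table; all others stay untouched
    $name =~ s/([\x80-\xff])/exists $iso2mac{$1} ? sprintf( ":%02x", $iso2mac{$1} ) : $1/ge;
    return( $name );
}

# the input is written with \x escapes so the script file's own encoding
# does not matter
print isoencode_table( "achja-\xe4\xf6\xfc-\xc4\xd6\xdc-\xdf-/" ), "\n";
```

For the sample name this prints achja-:8a:9a:9f-:80:85:86-:a7-:2f, the same as the s/// version.
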
If the input is already in MacRoman, it is easier because you don't need
a translation table. Each byte is written as its own hex code, e.g. \x8a becomes ":8a".
#!/usr/bin/perl -w
use strict;
my $s = "abc-äöü-def";
# beware the "/" in Mac filenames
$s =~ s/\//:2f/g;
# capture group instead of $&, which slows down every regex in the program
$s =~ s/([\x7f-\xff])/enc($1)/ge;
print $s, "\n";
sub enc
{
    my $arg = shift;
    return sprintf( ":%x", ord( $arg ) );
}
__END__
This of course is incomplete and only provided as a basis for
your own customization.
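
One possible way to fill in the missing table entries, assuming a Perl that ships the Encode module (5.8 or later, so not necessarily what you have installed): let Encode do the ISO-8859-1 to MacRoman step and then escape the bytes as in the second script. The sub name iso2es is made up for this sketch, and I have not checked it against a real ES volume.

```perl
#!/usr/bin/perl -w
use strict;
use Encode qw(decode encode);

# iso2es is a hypothetical helper name; the input is a filename
# given as ISO-8859-1 bytes
sub iso2es
{
    my $name = $_[0];
    # ISO-8859-1 bytes -> Perl characters -> MacRoman bytes.
    # Characters that have no MacRoman equivalent are replaced by
    # Encode's substitution character.
    my $mac = encode( "MacRoman", decode( "iso-8859-1", $name ) );
    # escape "/" and everything from 0x7f upwards as ":xx"
    $mac =~ s/([\/\x7f-\xff])/sprintf( ":%02x", ord( $1 ) )/ge;
    return( $mac );
}

print iso2es( "achja-\xe4\xf6\xfc-\xc4\xd6\xdc-\xdf-/-entre\xe9" ), "\n";
```

The umlauts come out as in the hand-written table (:8a, :9a, :9f and so on), and characters the table did not cover, such as the e-acute in the sample name, are handled as well.
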
HTH
Axel
----------------------------------------------------------------------
Axel Rose, Springer & Jacoby Digital GmbH & Co. KG, mailto:rose@sj.com
pub PGP key 1024/A21CB825 E0E4 BC69 E001 96E9 2EFD 86CA 9CA1 AAC5
"Was man nicht weiß, das eben braucht man.
Und was man weiß, kann man nicht brauchen."