[ Added comp.lang.perl, comp.mail.mime and comp.misc as recipients.
Followups to comp.mail.mime and comp.misc, or wherever you think appropriate. ]
> Hi all, Recently I've seen files using BASE64 encoding,
> What is it??? and How do you decode it??? Are there programs
> which decode/encode for DOS???
>
> Please respond by posting to this newsgroup.
[ Wrong newsgroup -- BASE64 definitely isn't _compression_.
It expands the input data rather than compressing it. ]
> Thanks to all and have a great day.
BASE64 is a binary-to-ASCII encoding, used mostly by
MIME (Multipurpose Internet Mail Extensions).
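To put a number on the expansion: every 3 input bytes become 4 output characters, so n bytes need 4 * ceil(n/3) characters, about a third more than the input, before any line breaks are added. A minimal sketch of that arithmetic follows; the name encoded_length is made up for this example, and it assumes a Perl 5 interpreter.

```perl
use strict;
use warnings;

# Hypothetical helper: number of base64 characters needed for $n input
# bytes, not counting the line breaks that mail software inserts.
sub encoded_length {
    my ($n) = @_;
    return 4 * int(($n + 2) / 3);    # 4 characters per (possibly partial) 3-byte group
}

foreach my $n (1, 3, 300, 1000) {
    printf "%5d bytes -> %5d characters\n", $n, encoded_length($n);
}
```

So a 300-byte file needs 400 characters, and even a single byte costs a full group of 4.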
RFC 1521: The base64 encoding is adapted from RFC 1421, with one change: base64
RFC 1521: eliminates the "*" mechanism for embedded clear text.
RFC 1521:
RFC 1521: A 65-character subset of US-ASCII is used, enabling 6 bits to be
RFC 1521: represented per printable character. (The extra 65th character, "=",
RFC 1521: is used to signify a special processing function.)
RFC 1521:
RFC 1521: NOTE: This subset has the important property that it is
RFC 1521: represented identically in all versions of ISO 646, including US
RFC 1521: ASCII, and all characters in the subset are also represented
RFC 1521: identically in all versions of EBCDIC. Other popular encodings,
RFC 1521: such as the encoding used by the uuencode utility and the base85
RFC 1521: encoding specified as part of Level 2 PostScript, do not share
RFC 1521: these properties, and thus do not fulfill the portability
RFC 1521: requirements a binary transport encoding for mail must meet.
RFC 1521:
RFC 1521: The encoding process represents 24-bit groups of input bits as output
RFC 1521: strings of 4 encoded characters. Proceeding from left to right, a
RFC 1521: 24-bit input group is formed by concatenating 3 8-bit input groups.
RFC 1521: These 24 bits are then treated as 4 concatenated 6-bit groups, each
RFC 1521: of which is translated into a single digit in the base64 alphabet.
RFC 1521: When encoding a bit stream via the base64 encoding, the bit stream
RFC 1521: must be presumed to be ordered with the most-significant-bit first.
RFC 1521: That is, the first bit in the stream will be the high-order bit in
RFC 1521: the first byte, and the eighth bit will be the low-order bit in the
RFC 1521: first byte, and so on.
RFC 1521:
RFC 1521: Each 6-bit group is used as an index into an array of 64 printable
RFC 1521: characters. The character referenced by the index is placed in the
RFC 1521: output string. These characters, identified in Table 1, below, are
RFC 1521: selected so as to be universally representable, and the set excludes
RFC 1521: characters with particular significance to SMTP (e.g., ".", CR, LF)
RFC 1521: and to the encapsulation boundaries defined in this document (e.g.,
RFC 1521: "-").
RFC 1521:
RFC 1521:                  Table 1: The Base64 Alphabet
RFC 1521:
RFC 1521:    Value Encoding  Value Encoding  Value Encoding  Value Encoding
RFC 1521:        0 A            17 R            34 i            51 z
RFC 1521:        1 B            18 S            35 j            52 0
RFC 1521:        2 C            19 T            36 k            53 1
RFC 1521:        3 D            20 U            37 l            54 2
RFC 1521:        4 E            21 V            38 m            55 3
RFC 1521:        5 F            22 W            39 n            56 4
RFC 1521:        6 G            23 X            40 o            57 5
RFC 1521:        7 H            24 Y            41 p            58 6
RFC 1521:        8 I            25 Z            42 q            59 7
RFC 1521:        9 J            26 a            43 r            60 8
RFC 1521:       10 K            27 b            44 s            61 9
RFC 1521:       11 L            28 c            45 t            62 +
RFC 1521:       12 M            29 d            46 u            63 /
RFC 1521:       13 N            30 e            47 v
RFC 1521:       14 O            31 f            48 w         (pad) =
RFC 1521:       15 P            32 g            49 x
RFC 1521:       16 Q            33 h            50 y
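The encoding direction described in the RFC text above can be sketched in a few lines of Perl. This is only an illustration under some assumptions: the name encode64 is made up, it needs Perl 5, and it does not break the output into lines the way a mail-ready encoder must (RFC 1521 limits encoded lines to 76 characters).

```perl
use strict;
use warnings;

# Table 1 as an array: index = 6-bit value, element = output character.
my @alphabet = ('A'..'Z', 'a'..'z', '0'..'9', '+', '/');

# Hypothetical helper: encode a byte string, without line wrapping.
sub encode64 {
    my ($data) = @_;
    my $out = '';
    while ($data =~ /(.{1,3})/gs) {           # take up to 3 bytes at a time
        my @b = unpack('C*', $1);
        my $n = @b;                           # 1, 2 or 3 bytes in this group
        push @b, 0 while @b < 3;              # zero-fill a short final group
        my $group = ($b[0] << 16) | ($b[1] << 8) | $b[2];   # 24 bits, MSB first
        my @c = map { $alphabet[($group >> $_) & 0x3F] } (18, 12, 6, 0);
        # A group of n bytes yields n+1 real characters, then '=' padding.
        $out .= join('', @c[0 .. $n]) . ('=' x (3 - $n));
    }
    return $out;
}

print encode64("Man"), "\n";    # prints TWFu
print encode64("Ma"),  "\n";    # prints TWE=
print encode64("M"),   "\n";    # prints TQ==
```

The last two lines show what the "=" pad character is for: it marks a final group that held only two bytes or one byte.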
Perhaps the following little Perl script gives an idea of how decoding works.
(There are definitely much shorter scripts for this; the shortest Perl decoders
for base64 are 2-3 lines, but they are quite cryptic.)
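# Build the decoding table: character -> 6-bit value (Table 1 above).
$i = 0;
foreach $c ('A'..'Z', 'a'..'z', '0'..'9', '+', '/') { $val{$c} = $i++; }
$val{'='} = 'EOF';      # the pad character marks the end of the data
$res = 0;               # bit accumulator
$bit = 0;               # number of bits currently held in $res
binmode(STDOUT);        # needed on DOS so that output bytes are not translated

sub char {
    local($c) = @_;
    return 1 if !defined $val{$c};
    return 0 if $val{$c} eq 'EOF';
    local($val) = $val{$c};
    $res <<= 6;
    $res |= $val;
    $bit += 6;
    if ($bit >= 8) {
        # A full byte is available: take the top 8 bits and output them.
        local($ch) = $res >> ($bit - 8);
        $res -= $ch << ($bit - 8);
        $bit -= 8;
        print pack('C',$ch);
        #print $ch , ' ';
    }
    return 1;
}

while (<>) {
    while (s/^(.)//) { &char($1); }
}

Here $val{$c} is the value of character $c in the decoding table,
and pack('C',$ch) converts the value $ch to the corresponding character.
The two while loops call the subroutine 'char' once per
input character. The line 'return 1 if !defined $val{$c};' causes
characters outside the alphabet (such as the newline at the end of each
line) to be skipped. I think the rest is quite clear
if you know C.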
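For comparison, here is one of the shorter and more cryptic approaches of the kind mentioned earlier. It is a sketch of a well-known trick, not necessarily the scripts I had in mind: remap the base64 alphabet onto the uuencode character set and let Perl's unpack('u', ...) do the bit shuffling. The name decode64 is made up for this example, and it assumes Perl 5.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a base64 string by disguising it as uuencoded data.
sub decode64 {
    my ($str) = @_;
    $str =~ tr|A-Za-z0-9+/||cd;        # drop everything outside the alphabet, '=' included
    $str =~ tr|A-Za-z0-9+/| -_|;       # value 0..63 -> chr(32)..chr(95), as uuencode uses
    my $out = '';
    while ($str =~ /(.{1,60})/g) {     # a uuencoded line carries at most 45 bytes
        # uuencode starts each line with chr(32 + number of decoded bytes).
        my $len = chr(32 + int(length($1) * 3 / 4));
        $out .= unpack('u', $len . $1);
    }
    return $out;
}

print decode64("TWFu"), "\n";    # prints Man
```

It is much shorter than the script above, but it hides the 6-bit to 8-bit regrouping inside unpack.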
For decoding BASE64 and other MIME attachments from mail, look at the mpack
program: ftp.andrew.cmu.edu:pub/mpack/
(Mpack is also available for DOS, but I can't say where to find compiled
binaries. That site has the source distribution.)
[ Answer cc'ed to questioner. ]
--
- Kari E. Hurtta / Elämä on monimutkaista