Q: What is Base64 encoding???


Post by Kari E. Hurt » Thu, 23 Mar 1995 21:57:26



[ Added comp.lang.perl, comp.mail.mime and comp.misc as recipients.
  Followups to comp.mail.mime and comp.misc, or wherever you think appropriate. ]


> Hi all, Recently I've seen files using BASE64 encoding,
> What is it??? and How do you decode it??? Are there programs
> which decode/encode for DOS???
>
> Please respond by posting to this newsgroup.

[ Wrong newsgroup -- BASE64 definitely isn't _compression_.
  It expands the input data rather than compressing it. ]

> Thanks to all and have a great day.

BASE64 is a binary-to-ASCII encoding, used mostly by
MIME (Multipurpose Internet Mail Extensions).

RFC 1521: The base64 encoding is adapted from RFC 1421, with one change: base64
RFC 1521: eliminates the "*" mechanism for embedded clear text.
RFC 1521:
RFC 1521: A 65-character subset of US-ASCII is used, enabling 6 bits to be
RFC 1521: represented per printable character. (The extra 65th character, "=",
RFC 1521: is used to signify a special processing function.)
RFC 1521:
RFC 1521:    NOTE: This subset has the important property that it is
RFC 1521:    represented identically in all versions of ISO 646, including US
RFC 1521:    ASCII, and all characters in the subset are also represented
RFC 1521:    identically in all versions of EBCDIC.  Other popular encodings,
RFC 1521:    such as the encoding used by the uuencode utility and the base85
RFC 1521:    encoding specified as part of Level 2 PostScript, do not share
RFC 1521:    these properties, and thus do not fulfill the portability
RFC 1521:    requirements a binary transport encoding for mail must meet.
RFC 1521:
RFC 1521: The encoding process represents 24-bit groups of input bits as output
RFC 1521: strings of 4 encoded characters. Proceeding from left to right, a
RFC 1521: 24-bit input group is formed by concatenating 3 8-bit input groups.
RFC 1521: These 24 bits are then treated as 4 concatenated 6-bit groups, each
RFC 1521: of which is translated into a single digit in the base64 alphabet.
RFC 1521: When encoding a bit stream via the base64 encoding, the bit stream
RFC 1521: must be presumed to be ordered with the most-significant-bit first.
RFC 1521: That is, the first bit in the stream will be the high-order bit in
RFC 1521: the first byte, and the eighth bit will be the low-order bit in the
RFC 1521: first byte, and so on.
RFC 1521:
RFC 1521: Each 6-bit group is used as an index into an array of 64 printable
RFC 1521: characters. The character referenced by the index is placed in the
RFC 1521: output string. These characters, identified in Table 1, below, are
RFC 1521: selected so as to be universally representable, and the set excludes
RFC 1521: characters with particular significance to SMTP (e.g., ".", CR, LF)
RFC 1521: and to the encapsulation boundaries defined in this document (e.g.,
RFC 1521: "-").
RFC 1521:
RFC 1521:                          Table 1: The Base64 Alphabet
RFC 1521:
RFC 1521:    Value Encoding  Value Encoding  Value Encoding  Value Encoding
RFC 1521:         0 A            17 R            34 i            51 z
RFC 1521:         1 B            18 S            35 j            52 0
RFC 1521:         2 C            19 T            36 k            53 1
RFC 1521:         3 D            20 U            37 l            54 2
RFC 1521:         4 E            21 V            38 m            55 3
RFC 1521:         5 F            22 W            39 n            56 4
RFC 1521:         6 G            23 X            40 o            57 5
RFC 1521:         7 H            24 Y            41 p            58 6
RFC 1521:         8 I            25 Z            42 q            59 7
RFC 1521:         9 J            26 a            43 r            60 8
RFC 1521:        10 K            27 b            44 s            61 9
RFC 1521:        11 L            28 c            45 t            62 +
RFC 1521:        12 M            29 d            46 u            63 /
RFC 1521:        13 N            30 e            47 v
RFC 1521:        14 O            31 f            48 w         (pad) =
RFC 1521:        15 P            32 g            49 x
RFC 1521:        16 Q            33 h            50 y
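
To make the quoted encoding process concrete, here is a minimal Perl sketch of the encoding direction. It is only an illustration of the 24-bit grouping described above, not code from the RFC. The name encode_base64_sketch is made up for this example, and the sketch assumes the input is a string of 8-bit bytes. It also leaves out the line breaks that MIME requires in encoded output.

```perl
use strict;
use warnings;

# The 64-character alphabet from Table 1, indexed by 6-bit value.
my @alphabet = ('A'..'Z', 'a'..'z', '0'..'9', '+', '/');

# encode_base64_sketch is a hypothetical name used only for this example.
sub encode_base64_sketch {
    my ($data) = @_;
    my $out = '';
    for (my $i = 0; $i < length($data); $i += 3) {
        my $chunk = substr($data, $i, 3);
        my $n     = length($chunk);          # 1, 2 or 3 input bytes
        # Form one 24-bit group, first byte in the high-order position.
        # A short final chunk is filled with zero bits.
        my ($b0, $b1, $b2) = unpack('C3', $chunk . "\0\0");
        my $group = ($b0 << 16) | ($b1 << 8) | $b2;
        # Split the group into four 6-bit indexes, most significant first.
        my @chars = map { $alphabet[($group >> $_) & 0x3F] } (18, 12, 6, 0);
        # A short final group keeps $n + 1 characters; the rest become "=".
        $out .= join('', @chars[0 .. $n]) . ('=' x (3 - $n));
    }
    return $out;
}

print encode_base64_sketch("Man"), "\n";   # prints TWFu
print encode_base64_sketch("Ma"),  "\n";   # prints TWE=
print encode_base64_sketch("M"),   "\n";   # prints TQ==
```

Every 3 input bytes become 4 output characters, which is why the encoded data is about one third larger than the original.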

Perhaps the following little Perl script gives an idea of how decoding works.
(There are definitely much shorter scripts for this; the shortest Perl decoders
 for base64 are 2-3 lines, but they are quite cryptic.)



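# Build the decoding table: each base64 character maps to its 6-bit value.
$i = 0;
foreach $c ('A'..'Z', 'a'..'z', '0'..'9', '+', '/') { $val{$c} = $i++; }
$val{'='} = 'EOF';

binmode(STDOUT);        # the output is binary; this matters on DOS

$res=0;                 # bits collected so far
$bit=0;                 # number of valid bits in $res

sub char {
        local($c) = @_;
        return 1 if !defined $val{$c};   # skip characters outside the table
        return 0 if $val{$c} eq 'EOF';   # '=' is padding, not data
        local($val) = $val{$c};
        $res <<=6;
        $res |= $val;
        $bit += 6;
        if ($bit >= 8) {
                # A full byte is available: output the top 8 bits, keep the rest.
                local($ch) = $res >> ($bit - 8);
                $res -= $ch << ($bit - 8);
                $bit -= 8;
                print pack('C',$ch);
                #print $ch , ' ';
        }
        return 1;
}

while(<>) {
        while (s/^(.)//) { &char($1); }
}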

Here $val{$c} is the value of character $c in the decoding table,
and pack('C',$ch) converts the value $ch to the corresponding character.
The two while loops call the subroutine 'char' once per
input character. The line 'return 1 if !defined $val{$c};' causes
invalid characters to be skipped. I think the rest is quite clear
if you know C.
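
For anyone who wants to check the logic, here is the same accumulator idea wrapped in a function that returns the decoded bytes instead of printing them. This is only a sketch: decode_base64_sketch is a name made up for this example, and unlike the script above it stops at the first "=" rather than skipping it. The cross-check assumes the MIME::Base64 module is installed (it is bundled with current Perl distributions).

```perl
use strict;
use warnings;
use MIME::Base64 ();    # used only to cross-check the sketch

# Decoding table: character -> 6-bit value, as in Table 1.
my %val;
my $next = 0;
$val{$_} = $next++ for ('A'..'Z', 'a'..'z', '0'..'9', '+', '/');

# decode_base64_sketch is a hypothetical name used only for this example.
sub decode_base64_sketch {
    my ($text) = @_;
    my ($res, $bit, $out) = (0, 0, '');
    for my $c (split //, $text) {
        last if $c eq '=';                # padding marks the end of the data
        next unless defined $val{$c};     # skip newlines and other bad characters
        $res = ($res << 6) | $val{$c};    # append 6 new bits
        $bit += 6;
        if ($bit >= 8) {                  # a full byte is available
            $bit -= 8;
            $out .= chr($res >> $bit);    # emit the top 8 bits
            $res &= (1 << $bit) - 1;      # keep only the leftover low bits
        }
    }
    return $out;
}

my $encoded = "SGVs\nbG8h";               # "Hello!" split across two lines
print decode_base64_sketch($encoded), "\n";
print "matches MIME::Base64\n"
    if decode_base64_sketch($encoded) eq MIME::Base64::decode_base64($encoded);
```

The embedded newline is skipped just like any other character outside the table, so the input can be fed in line by line or all at once.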

For decoding BASE64 and other MIME attachments from mail, look at the
mpack program: ftp.andrew.cmu.edu:pub/mpack/

(Mpack is also available for DOS, but I can't say where to find compiled
 binaries. That is a source distribution.)

[ Answer cc'ed to questioner. ]
--
- Kari E. Hurtta                             /  Elämä on monimutkaista


 
 
 

1. Bug with detection of mismatch of encoding declaration and detected encoding in ACEXML_PARSER

    ACE VERSION: 5.3.1

    HOST MACHINE and OPERATING SYSTEM:
        Windows 2000 SP3

    CONTENTS OF $ACE_ROOT/ace/config.h:
        #define ACE_HAS_STANDARD_CPP_LIBRARY 1
        #include "ace/config-win32.h"

    AREA/CLASS/EXAMPLE AFFECTED:
        ACEXML_PARSER, Parser.cpp

    DOES THE PROBLEM AFFECT:
        COMPILATION?     NO
        LINKING?                NO
        EXECUTION?          YES

    SYNOPSIS:
        In Parser.cpp, line 302 - 310:

  if (ACE_OS::strstr (astring,
       this->instream_->getEncoding()) != 0)
  {
   ACE_ERROR ((LM_ERROR,
      ACE_TEXT ("Detected Encoding is %s : Declared Encoding is %s"),
      this->instream_->getEncoding(), astring));
   this->report_fatal_error (ACE_TEXT ("Encoding declaration doesn't match detected encoding") ACEXML_ENV_ARG_PARAMETER);
   return;
  }

Here the operator != in the if statement should be operator ==. ACE_OS::strstr() returns 0 only when the detected encoding string is NOT a substring of the declared encoding, and only in that case should the encoding declaration be considered not to match the detected encoding.

   SAMPLE FIX/WORKAROUND:
        Change operator != to operator ==

2. PPP software?

3. Wow, I am glad that I am not responsible for any more PC's.....

4. error 10056

5. I am damned if I do, and I am damned if I don't

6. Sprint Instant Foncard

7. QS Editor/Librarian

8. hotmail compose functionality...

9. Qs: Who do you recommend and why for electronic distribution?

10. qs Quantity Survey Software

11. IS0 9000 / QS 9000