raw char * buffer to char * printable string conversion issue

Post by Tom » Sat, 15 Apr 2000 04:00:00



Hi all,

I'll try to be brief... I am trying to convert a string (well, a character
buffer) that can contain values 0 ... 255 to a printable string (kind of a
normal C-like notation). Basically, I'd be happy if I could find a system
function that would convert a string containing a NUL character in the
middle (the actual memory could look like this: 0x61 0x6c 0x00 0x61 0x6c)
into a printable form such as "al\x0al" or "al\0al".

I would be even happier if there was a routine that would take "al\0al"
and a pointer as input, produce the required buffer, and return
its size (not the number of characters up to the first NUL).

Now, I know it's reasonably easy to convert \x0 manually, but then I'll have
to do the same with \n (not \x0A),  \r, \t .... and then allow the user to
specify non-printable characters as a hex number \x00 - \xFF.

I know sscanf and sprintf (snprintf) are very powerful but I couldn't make
them do it ... I tried various combinations (see below - PPS), but with no
luck. Maybe someone had already gone through this hassle and knows how to do
it in a simple manner (kind of automagically)?

Thanks
Tom

PS. I know that literal strings of the form "al\0al" are converted to
appropriate buffers during compilation, but what I need is to get this input
from a file/user and then convert it to a raw buffer, or alternatively
(and preferably) take a raw buffer and convert it to user form, like in the
sample code below.

PPS. sample trial code

#include <string.h>
#include <stdio.h>
#include <string>
#include <iostream>

int main(int argc, char **argv)
{
   char pbuffer[1024];
   const char * const buffer = "ali\0is\0tall";

   size_t len = 11;
   size_t i = 0;

   std::string str;

   printf("\n\tchar *          = %s\n\tchar buffer     = ",buffer);
   for(i = 0; i< len; ++i) printf("%02x ",buffer[i]);
   printf("\n");

   // See what STL has for us
   str = std::string(buffer,len);
   printf("\tstring (str)    = %s\n",str.c_str());
   std::cout << "\tstring as such  = " << str << std::endl;

   // See if we can scan 11 characters and convert them together
   memset(pbuffer,0, 1024);
   sscanf(buffer,"%11c",pbuffer);
   printf("\tscanf (11c) is  = %s (",pbuffer);
   for(i = 0; i< strlen(pbuffer); ++i)
      printf("%02x ",pbuffer[i]);
   printf(")\n");

   // See what happens if we sprintf 11 characters
   // (note: %c expects an int, but a pointer is passed here)
   memset(pbuffer,0, 1024);
   snprintf(pbuffer,11,"%c",buffer);
   printf("\tsprintf (11c) is= %s (",pbuffer);
   for(i = 0; i< len; ++i)
      printf("%02x ",pbuffer[i]);
   printf(")\n");

   // See what happens if we use snprintf to repeat the 11 char printf
   memset(pbuffer,0, 1024);
   snprintf(pbuffer,11,"%s",buffer);
   printf("\tsprintf (11s) is= %s (",pbuffer);
   for(i = 0; i< len; ++i)
      printf("%02x ",pbuffer[i]);
   printf(")\n");

   // And try doing them one by one
   // (note: pbuffer is both source and destination, which sprintf does not allow)
   memset(pbuffer,0, 1024);
   for(i = 0; i< len; ++i)
      sprintf(pbuffer,"%s%c",pbuffer,buffer[i]);
   printf("\tloop result is  = %s (",pbuffer);
   for(i = 0; i< len; ++i)
      printf("%02x ",pbuffer[i]);
   printf(")\n");

   printf("\n\n");

   return argc;

}


Output (as pasted within the e-mail):

 char *          = ali
 char buffer     = 61 6c 69 00 69 73 00 74 61 6c 6c
 string (str)    = ali

 scanf (11c) is  = ali (61 6c 69 )
 sprintf (11c) is= \340 (ffffffe0 00 00 00 00 00 00 00 00 00 00 )
 sprintf (11s) is= ali (61 6c 69 00 00 00 00 00 00 00 00 )
 loop result is  = aliistall (61 6c 69 69 73 74 61 6c 6c 00 00 )
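
Editor's note: the "ffffffe0" in the dump above is most likely a sign-extension artifact. Where plain char is signed, a byte of 0x80 or above is promoted to a negative int before printf sees it. The sketch below is one way to format a dump that avoids this; hex_dump is a hypothetical helper name, not something from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "61 6c 00 ..." for the first len bytes of buf
 * into out (capacity outsize). Returns the number of characters written,
 * or -1 if out is too small. */
int hex_dump(const char *buf, size_t len, char *out, size_t outsize)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < len; ++i) {
        /* Each byte needs two hex digits, one space, and room for the NUL. */
        if (outsize - used < 4)
            return -1;
        /* Going through unsigned char prevents sign extension, so the byte
         * 0xe0 is written as "e0" rather than "ffffffe0". */
        used += (size_t)sprintf(out + used, "%02x ",
                                (unsigned)(unsigned char)buf[i]);
    }
    return (int)used;
}
```

The length is passed explicitly, so embedded NUL bytes are dumped like any other byte instead of ending the output early.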

 
 
 


Post by Jason Gree » Sun, 16 Apr 2000 04:00:00



> I'll try to be brief... I am trying to convert a string (well, a character
> buffer) that can contain values 0 ... 255 to a printable string (kind of a
> normal C-like notation). Basically, I'd be happy if I could find a system
> function that would convert a string containing a NUL character in the
> middle (the actual memory could look like this: 0x61 0x6c 0x00 0x61 0x6c)
> into a printable form such as "al\x0al" or "al\0al".

> I would be even happier if there was a routine that would take "al\0al"
> and a pointer as input, produce the required buffer, and return
> its size (not the number of characters up to the first NUL).

I think you are saying you want to convert a buffer containing binary
data into the string form used in C/C++ source code.

This function will do that for you.  But beware that it does not
perform bounds checking on str.  So you need to allocate at least 4
times as much memory for str as for buf, plus one byte for the
terminating NUL, just in case all the chars turn out to be of the
form \xxx.  This could be fixed though.

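#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Convert buflen raw bytes in buf to C source notation in str.
   No bounds checking: str must hold at least 4*buflen + 1 bytes. */
void BufToStr(const char* buf, int buflen, char* str)
{
    int i;

    for (i=0, *str=0; i<buflen; i++)
        switch(buf[i])
        {
            case '\n' : strcat(str, "\\n");  break;
            case '\r' : strcat(str, "\\r");  break;
            case '\t' : strcat(str, "\\t");  break;
            case '\"' : strcat(str, "\\\""); break;
            case '\\' : strcat(str, "\\\\"); break;
            /* Cast to unsigned char: isprint() is undefined for negative values */
            default   : sprintf(strchr(str,0),
                                isprint((unsigned char)buf[i])? "%c":"\\%03o",
                                buf[i]&0xFF);
        }
}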
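
Editor's note: the post says the missing bounds check "could be fixed". A minimal sketch of one possible fix follows. The name BufToStrN, the extra strsize parameter, and the -1 error return are assumptions made for illustration and are not part of the original function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounds-checked variant of BufToStr. Returns the length of
 * the escaped string, or -1 if str (capacity strsize) is too small. */
int BufToStrN(const char *buf, size_t buflen, char *str, size_t strsize)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && str != NULL);
    if (strsize == 0)
        return -1;
    for (i = 0; i < buflen; i++) {
        unsigned char c = (unsigned char)buf[i];
        char esc[5];    /* longest escape is "\ooo" plus the NUL */
        int n, k;

        switch (c) {
            case '\n': n = sprintf(esc, "\\n");  break;
            case '\r': n = sprintf(esc, "\\r");  break;
            case '\t': n = sprintf(esc, "\\t");  break;
            case '\"': n = sprintf(esc, "\\\""); break;
            case '\\': n = sprintf(esc, "\\\\"); break;
            default:
                if (isprint(c))
                    n = sprintf(esc, "%c", c);
                else
                    n = sprintf(esc, "\\%03o", (unsigned)c);
        }
        /* Refuse to write unless the escape and the final NUL both fit. */
        if ((size_t)n >= strsize - used)
            return -1;
        for (k = 0; k < n; k++)
            str[used++] = esc[k];
    }
    str[used] = '\0';
    return (int)used;
}
```

Tracking the write position in used also avoids rescanning str on every byte, which the strcat/strchr approach does. On failure the contents of str are unspecified and may not be NUL-terminated.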


 
 
 


Post by Tom » Tue, 18 Apr 2000 04:00:00


> This function will do that for you.  But beware that it does not
> perform bounds checking on str.  So you need to allocate at least 4
> times as much memory for str as for buf just in case all the chars
> turn out to be of the form \xxx.  This could be fixed though.

<cut>

Thanks.

In the meantime I have implemented raw->printable and printable->raw
conversion functions similar to yours.... maybe the printable->raw bit
was a bit bigger because I wanted to allow both hex and octal notation for
all characters (some string parsing was needed). At least I know that my
ideas are not totally out of the blue :)

Regards
Tom

PS. What I originally hoped for was a single function that would do it....
like OS/GNU/..... printableToBinary and binaryToPrintable routines.

 
 
 


Post by Jason Gree » Tue, 18 Apr 2000 04:00:00



> In the meantime I have implemented raw->printable and printable->raw
> conversion functions similar to yours.... maybe the bit of printable->raw
> was bit bigger coz I wanted to allow both hex and octal notation for all
> characters (kinda string parsing was needed). At least I know that my ideas
> are not totally out of blue :)

Take care with the embedded hex notation...

You may be well aware of this but many users are not.  The number of
hex digits after the \x is not limited to two as you might think.  So
the following will not get you free beer:

    printf("Free\x20Beer!\n");

Instead the compiler will attempt to insert the character 20BEE hex
into the string and will fail miserably.

To be sure you should use three-digit octal:

    printf("Free\040Beer!\n");

But any of these will also keep you happy:

    printf("Free\40Beer!\n");
    printf("Free\x20""Beer!\n");
    printf("Free\x20\x42\x65\x65r!\n");
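
Editor's note: a small sketch that checks the spellings above against the plain literal. The function name free_beer_spellings_agree is hypothetical. The over-long "\x20Beer" form is deliberately left out, since compilers typically reject or warn about it as out of range.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if every spelling from the post yields the text "Free Beer!". */
int free_beer_spellings_agree(void)
{
    static const char expected[] = "Free Beer!";
    static const char octal3[]   = "Free\040Beer!";          /* 3 octal digits */
    static const char octal2[]   = "Free\40Beer!";           /* 'B' ends the escape */
    static const char split[]    = "Free\x20" "Beer!";       /* literal split in two */
    static const char allhex[]   = "Free\x20\x42\x65\x65r!"; /* 'r' ends the escape */

    /* Escapes are converted before adjacent literals are joined, so the
     * split form has exactly the same size as the plain one. */
    assert(sizeof split == sizeof expected);

    return strcmp(octal3, expected) == 0
        && strcmp(octal2, expected) == 0
        && strcmp(split,  expected) == 0
        && strcmp(allhex, expected) == 0;
}
```

An octal escape takes at most three digits, which is why the three-digit form stays safe even when the next character is itself a digit.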

> PS. What I originally hoped for was a single function that would do it....
> like OS/GNU/..... printableToBinary and binaryToPrintable routines.

I have never seen such functions.  But then I have never looked. ;-)
 
 
 


Post by Tom » Wed, 19 Apr 2000 04:00:00


> You may be well aware of this but many users are not.  The number of
> hex digits after the \x is not limited to two as you might think.  So
> the following will not get you free beer:

>     printf("Free\x20Beer!\n");

Good point.... the thing is, when I implement raw->printable and
printable->raw I can impose any restrictions I wish to make the
results valid. So the user is only allowed to give two-digit hex numbers,
and anything after that is considered part of the string. I don't give the
compiler a choice... the compiler never sees the string entered by the
user. So although I may not conform to the ANSI standard for string
notation, I shall feel satisfied for now :))
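
Editor's note: Tom's own printable->raw code is not shown in the thread. The following is only a sketch of what such a parser might look like under the restriction he describes (exactly two hex digits after \x, one to three octal digits otherwise). The names StrToBuf and hex_val, the escape set, and the -1 error return are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical StrToBuf: decodes the escaped text in str into buf
 * (capacity bufsize). Returns the number of raw bytes produced, or -1 on
 * a malformed escape or when buf is too small. */
long StrToBuf(const char *str, char *buf, size_t bufsize)
{
    size_t n = 0;

    assert(str != NULL && buf != NULL);
    while (*str != '\0') {
        unsigned value;

        if (*str != '\\') {
            value = (unsigned char)*str++;
        } else {
            ++str;                          /* skip the backslash */
            switch (*str) {
                case 'n':  value = '\n'; ++str; break;
                case 'r':  value = '\r'; ++str; break;
                case 't':  value = '\t'; ++str; break;
                case '\\': value = '\\'; ++str; break;
                case '"':  value = '"';  ++str; break;
                case 'x': {
                    /* Exactly two hex digits, unlike the C compiler. */
                    int hi = hex_val((unsigned char)str[1]);
                    int lo = hi < 0 ? -1 : hex_val((unsigned char)str[2]);
                    if (lo < 0)
                        return -1;
                    value = (unsigned)(hi * 16 + lo);
                    str += 3;
                    break;
                }
                default: {
                    /* One to three octal digits, as in C. */
                    int digits = 0;
                    value = 0;
                    while (digits < 3 && *str >= '0' && *str <= '7') {
                        value = value * 8 + (unsigned)(*str++ - '0');
                        ++digits;
                    }
                    if (digits == 0 || value > 255)
                        return -1;
                }
            }
        }
        if (n >= bufsize)
            return -1;
        buf[n++] = (char)value;
    }
    return (long)n;
}
```

With this rule the user text Free\x20Beer! decodes to the ten bytes of "Free Beer!", because the parser stops after two hex digits instead of consuming "Bee" as well. The byte count is returned separately, so embedded NUL bytes in the result are not a problem.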

<CUT>

> > PS. What I originally hoped for was a single function that would do it....
> > like OS/GNU/..... printableToBinary and binaryToPrintable routines.

> I have never seen such functions.  But then I have never looked. ;-)

Shame about that. I could be fully compliant then :)
Tom
 
 
 

1. HOWTO filter single non-printable chars with a sequence of printable chars

Hi,

The context of this problem is: some mailers on Windows insert CP1252
encoded characters in mails but set charset=iso-8859-1. I use mutt on
Linux, which can filter mails before displaying them, so it's just a
question of defining a filter command line. The problematic CP1252
characters are: left/right single/double quotes, and em/en dashes. A
one-to-one mapping seemed to suffice for these, so I used tr. Now I
bumped into the horizontal ellipsis, octal 205 under CP1252, which I
think should be translated to '...'. This is where I got stuck.

I came up with this filter:

awk 'index($0, "\205") { gsub("\205", "...") } { print }'

but it feels like an overkill to filter every single mail through tr,
for the one-to-one mappings, and then awk, for the one-to-many
mappings.

Would anybody have a suggestion of a cleaner, more idiomatic way of
accomplishing this ?

Thanks for the patience in reading this
Paulo

2. lp remote!printer questions

3. [2.5] const char* to char* conversion in console.h

4. Network programming

5. unsigned char to a char* conversion ?

6. "Too many open files" error - how does one change limit ?

7. [2.4] const char* to char* conversion in console.h

8. script to process mail

9. char *strcasestr(char *haystack, char *needle) a simple case independent strstr()

10. return number string from a char string

11. Print line with non-printable chars

12. Non-printable chars in filenames: how do you get rid of them?

13. KernelJanitor - Change applicable char *foo to char foo[]