Need help with Unicode

Post by Neil Hain » Sat, 20 Jun 1998 04:00:00



Hi,

Has anyone implemented Unicode?  I have the Unicode standard, but find
its implementation-independent descriptions too far removed.  Can anyone
give me some C examples for (let's keep it simple) transforming ASCII
to Unicode and Unicode to ASCII?

TIA,

Neil

 
 
 

Post by Stephen Bayne » Tue, 23 Jun 1998 04:00:00



> Hi,

> Has anyone implemented Unicode?  I have the Unicode standard, but find
> its implementation-independent descriptions too far removed.  Can anyone
> give me some C examples for (let's keep it simple) transforming ASCII
> to Unicode and Unicode to ASCII?

At the simplest level:

ASCII to unicode:

        unicode_char = ascii_char;

Unicode to ASCII:

        if( unicode_char <= 0x7F )
            ascii_char = unicode_char;
        else
            report_error_cant_be_converted();

[ASCII is the subset of Unicode consisting of all characters <= 0x7F.
ISO 8859-1 is the subset of Unicode consisting of all characters <= 0xFF,
so use 0xFF as the limit instead if the target is ISO 8859-1.]
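The two assignments above could be wrapped up roughly as follows. This is only a sketch: the type name unicode_char_t and the function names are made up for illustration, and it assumes one 16-bit unit per character, which is enough for the ASCII range (0x00 to 0x7F) being discussed.

```c
#include <stddef.h>

/* Hypothetical type: one 16-bit Unicode code unit per character. */
typedef unsigned short unicode_char_t;

/* Widen one ASCII character. Returns 0 on success, -1 if the
   input is outside the ASCII range 0x00..0x7F. */
int ascii_to_unicode(unsigned char in, unicode_char_t *out)
{
    if (in > 0x7F)
        return -1;
    *out = in;
    return 0;
}

/* Narrow one Unicode character. Returns 0 on success, -1 if the
   character has no ASCII equivalent. */
int unicode_to_ascii(unicode_char_t in, unsigned char *out)
{
    if (in > 0x7F)
        return -1;
    *out = (unsigned char)in;
    return 0;
}

/* Convert a NUL-terminated ASCII string into dst, which holds
   dst_len units. Returns the number of characters converted
   (not counting the terminator), or (size_t)-1 on a non-ASCII
   byte or if dst is too small. */
size_t ascii_str_to_unicode(const char *src, unicode_char_t *dst,
                            size_t dst_len)
{
    size_t i;

    for (i = 0; src[i] != '\0'; i++) {
        if (i + 1 >= dst_len ||
            ascii_to_unicode((unsigned char)src[i], &dst[i]) != 0)
            return (size_t)-1;
    }
    if (dst_len == 0)
        return (size_t)-1;
    dst[i] = 0;
    return i;
}
```

Returning an error code instead of calling an error routine leaves it to the caller to decide what to do with an unconvertible character (substitute '?', abort, and so on).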

Beyond the simplest level it gets complex. The wide character set supported
by your C compiler may or may not be Unicode. If it is, then you have a good start
with the conversion routines it provides. The proposed C9X standard adds
some support for using Unicode numberings to describe character literals
and extra characters in variable names. X Windows has some wide character support;
I am not sure whether it is Unicode based or not.
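As a rough illustration of the compiler-provided routines, the standard mbstowcs() function converts a string in the current locale's encoding to wchar_t. The wrapper names below are made up. The __STDC_ISO_10646__ macro is the one the C99 standard ended up defining for implementations whose wchar_t values are ISO 10646 (Unicode) code points; if it is absent, that only means the implementation makes no such promise.

```c
#include <stdlib.h>
#include <wchar.h>

/* Returns 1 if the implementation states that wchar_t values are
   ISO 10646 (Unicode) code points, 0 if it makes no such claim. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Convert a multibyte string in the current locale's encoding to
   wide characters. dst holds dst_len elements. Returns the number
   of wide characters written (not counting the terminator), or
   (size_t)-1 on an invalid sequence or if dst is too small. */
size_t to_wide(const char *src, wchar_t *dst, size_t dst_len)
{
    size_t n;

    if (dst_len == 0)
        return (size_t)-1;
    n = mbstowcs(dst, src, dst_len);
    if (n == (size_t)-1)
        return n;               /* invalid multibyte sequence */
    if (n == dst_len)
        return (size_t)-1;      /* no room left for the terminator */
    return n;
}
```

A program would normally call setlocale(LC_ALL, "") once at startup so that mbstowcs() uses the user's encoding rather than the default "C" locale.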

Encoding conversion is not too difficult; it is just a matter of lots and lots
of conversion tables. Displaying Unicode is more difficult. Apart from issues
like the direction of writing in different scripts and the fact that some Unicode
characters can be regarded as combinations of others, Unicode is a character
encoding, not a glyph encoding. Some characters in some languages are displayed
using different glyphs in different contexts (for example, at the end of a word).
This requires support from the font and its display system to work.
I am not sure what is available here.
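To give an idea of what a conversion table looks like, here is a small sketch for ISO 8859-15 (Latin-9), chosen only because it is an easy case: it agrees with ISO 8859-1, and therefore with Unicode, everywhere except eight positions, so the table is tiny. The structure and function names are made up, and a real converter for a larger character set would use full lookup arrays rather than a linear search.

```c
#include <stddef.h>

/* One table row: a byte in ISO 8859-15 and its Unicode code point. */
struct map_entry {
    unsigned char  byte;
    unsigned short ucs;
};

/* The eight positions where ISO 8859-15 differs from ISO 8859-1. */
static const struct map_entry latin9_exceptions[] = {
    { 0xA4, 0x20AC },   /* euro sign */
    { 0xA6, 0x0160 },   /* S with caron */
    { 0xA8, 0x0161 },   /* s with caron */
    { 0xB4, 0x017D },   /* Z with caron */
    { 0xB8, 0x017E },   /* z with caron */
    { 0xBC, 0x0152 },   /* ligature OE */
    { 0xBD, 0x0153 },   /* ligature oe */
    { 0xBE, 0x0178 }    /* Y with diaeresis */
};

#define N_EXCEPTIONS \
    (sizeof latin9_exceptions / sizeof latin9_exceptions[0])

/* Every ISO 8859-15 byte has a Unicode equivalent. */
unsigned short latin9_to_unicode(unsigned char b)
{
    size_t i;

    for (i = 0; i < N_EXCEPTIONS; i++)
        if (latin9_exceptions[i].byte == b)
            return latin9_exceptions[i].ucs;
    return b;   /* identical to ISO 8859-1, hence to Unicode */
}

/* Returns 0 on success, -1 if u cannot be represented. */
int unicode_to_latin9(unsigned short u, unsigned char *out)
{
    size_t i;

    for (i = 0; i < N_EXCEPTIONS; i++) {
        if (latin9_exceptions[i].ucs == u) {
            *out = latin9_exceptions[i].byte;
            return 0;
        }
        if (latin9_exceptions[i].byte == u)
            return -1;  /* Latin-1 character displaced by the table */
    }
    if (u > 0xFF)
        return -1;
    *out = (unsigned char)u;
    return 0;
}
```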

Look out for the various internationalization FAQs:

  Programming for Internationalization FAQ

Try comp.unix.questions,comp.std.internat,comp.software.international,comp.lang.c,
comp.windows.x,comp.std.c,comp.answers,news.answers

  Finding Fonts for Internationalization FAQ

Try comp.std.internat,comp.software.international,comp.fonts,comp.windows.x,
comp.os.ms-windows.programmer.misc

--

Philips Semiconductors Ltd                  
Southampton SO15 0DJ                        +44 (01703) 316431
United Kingdom                              My views are my own.
Do you use ISO8859-1? Yes if you see © as copyright, ÷ as division and ½ as 1/2.

 
 
 
