About using the CodeWarrior PowerPlant LString class

Post by wal » Sat, 26 Jul 2003 16:45:30



As you know, the CodeWarrior PowerPlant LString class can only handle
strings of up to 255 bytes, but I want to handle strings longer than
255 bytes and still use the functions in the LString class. Please tell
me a method. Thank you.
q^_^p
 
 
 

Post by David Phillip Oste » Sun, 27 Jul 2003 16:17:13




> You know CW PP LString class only can handling less than 255 bytes
> string,but i want to handling string more than 255 bytes, also can use
> the functions in LString class. please tell me a method,thank you.
> q^_^p

Metrowerks has a fine implementation of the standard C++ string class,
that you get by saying:

#include <string>
using std::string;

You can convert from a string to an LStr255 with:

string s("from C to C++, make an initialized string");

LStr255 myPascalString(s.data(), s.size());

You can convert from an LStr255 to a string with:

string s1(myPascalString.TextPtr(), myPascalString.Length());

My advice is: keep everything in standard C++ strings, and learn to use
them. Convert to LStrings only when you need to. If you plan to
support Carbon (8, 9, or X), you should learn about CFStrings in
Mac OS and PowerPlant, and still keep everything in C++ strings,
using the UTF-8 encoding.
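To make the 255-byte limit concrete, here is a minimal portable sketch of what a Pascal-string conversion implies. It does not use PowerPlant at all: PascalBuf, ToPascal and FromPascal are hypothetical names invented for this illustration, standing in for a Str255 (one length byte followed by the text). The point is only that anything past 255 bytes is lost on the way in.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a Str255: byte 0 holds the length,
// bytes 1..255 hold the text. This is not a PowerPlant type.
struct PascalBuf
{
    unsigned char bytes[256];
};

// Copy at most 255 bytes of s into a Pascal-style buffer.
// Anything past 255 bytes is dropped.
inline PascalBuf ToPascal(const std::string& s)
{
    PascalBuf p = PascalBuf();  // zero-filled
    const std::size_t n = std::min<std::size_t>(s.size(), 255);
    p.bytes[0] = static_cast<unsigned char>(n);
    if (n != 0)
        std::memcpy(p.bytes + 1, s.data(), n);
    return p;
}

// Rebuild a std::string from the length byte and the text bytes.
inline std::string FromPascal(const PascalBuf& p)
{
    return std::string(reinterpret_cast<const char*>(p.bytes + 1),
                       p.bytes[0]);
}
```

A round trip through ToPascal() and FromPascal() preserves short strings and truncates long ones, which is why it makes sense to hold the text in a std::string and convert only at the boundary.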

The basic way of moving text into and out of PowerPlant user interface
objects used to be GetDescriptor() and SetDescriptor(), but as you
point out, these take Str255s, so they are limited to 255 characters.

LPane also declares CopyCFDescriptor() (the caller must CFRelease() the
result) and SetCFDescriptor(), which take CFStrings. CFStrings don't
have the 255-character length limit, natively handle Unicode, and
convert from UTF-8 to Unicode and back.

Here is what I use for converting between standard C++ strings and
CFStrings:

CFStringRef myCFString =
  CFStringCreateWithBytes(kCFAllocatorDefault,
      reinterpret_cast<const UInt8*>(s.data()), s.size(),
      kCFStringEncodingUTF8, false);

and to go the other way:
CFIndex cfLen, slen;
cfLen = CFStringGetLength(myCFString);
// find out how much space the UTF-8 conversion will take.
CFStringGetBytes(myCFString, CFRangeMake(0, cfLen), kCFStringEncodingUTF8,
    0, false, NULL, 0, &slen);
// make a string that big
string result(slen, '\0');
// write to it, unless there is nothing to write.
if (slen > 0)
    CFStringGetBytes(myCFString, CFRangeMake(0, cfLen), kCFStringEncodingUTF8,
        0, false, reinterpret_cast<UInt8*>(&result[0]), slen, &slen);

The advantage of UTF-8 is that 7-bit ASCII is already UTF-8 and
UTF-8 strings contain no embedded null bytes, so you can use them with
legacy functions like printf(). Metrowerks also has good support
for them with an extension to the standard "locale" facility.

The disadvantage is that sane sorting of characters outside 7-bit ASCII
is harder.
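A small sketch of that trade-off, in plain standard C++. CountCodePoints and ByteOrderSort are hypothetical helper names made up for this example. The first shows that a UTF-8 string is still an ordinary byte string (characters outside ASCII simply take more than one byte); the second shows what a plain std::sort does with such strings: it orders by raw byte value, so on typical implementations every uppercase ASCII letter sorts before every lowercase one, and anything outside ASCII sorts after "z".

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Count code points in a UTF-8 string by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
inline std::size_t CountCodePoints(const std::string& utf8)
{
    std::size_t n = 0;
    for (std::string::size_type i = 0; i < utf8.size(); ++i)
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80)
            ++n;
    return n;
}

// Sort by raw byte value, which is all std::string's operator< knows.
// This is the ordering that is not culturally "sane".
inline std::vector<std::string> ByteOrderSort(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end());
    return v;
}
```

For example, "\xC3\xA9t\xC3\xA9" (the French word for summer) is five bytes but three code points, and ByteOrderSort() places it after "zoo" rather than near "e".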

 
 
 

Post by Howard Hinnan » Mon, 28 Jul 2003 02:53:10


In article

| The advantage of UTF-8 is that 7-bit ASCII is already UTF-8 and
| UTF-8 strings contain no embedded null bytes, so you can use them with
| legacy functions like printf(). Metrowerks also has good support
| for them with an extension to the standard "locale" facility.
|
| The disadvantage is that sane sorting of characters outside 7-bit ASCII
| is harder.

It is just hard to resist a fun programming challenge. :-)  So please
take this in that spirit.  I read this and wondered how fast I could
cobble together "sane" sorting of UTF-8 stored in std::string, using
the extensions in MSL C++.  I borrowed from other posts I've made in
this area and threw the following together.  Please note that it is
barely tested, not terribly efficient, and not terribly well thought
out as a general purpose tool.  So it may not pass the sanity test! :-)
It is literally just cobbled together just for demonstration purposes
(and for fun).

The basic idea of the code below is to expand the std::string into a
std::wstring using the std::__utf_8<wchar_t> code conversion facet
found in <locale>.  Then sort a vector<pair<wstring,string>> using the
wstring as the key, and using MSL C++'s collate_byname facet to specify
a culturally sensitive rule (which you must supply).  Then pass back
the string part of each sorted pair.

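#include <locale>
#include <string>
#include <algorithm>
#include <cwchar>     // std::mbstate_t
#include <stdexcept>  // std::runtime_error
#include <utility>    // std::pair
#include <vector>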

struct my_collate
        : public std::collate_byname<wchar_t>
{
    my_collate();
};

my_collate::my_collate()
   : std::collate_byname<wchar_t>("C")
{
   rule_.set_rule(L"< A, a < B, b < C, c"
                  L"< D, d < E, e < , "
                  L"< F, f < G, g < H, h"
                  L"< I, i < , < J, j"
                  L"< Jh, jh < K, k < L, l"
                  L"< ll < M, m < N, n < O, o"
                  L"< P, p < Q, q < R, r < rr"
                  L"< S, s < Sh, sh < T, t < U, u"
                  L"< , < , < V, v < W, w"
                  L"< X, x < Y, y < Z, z");

}

std::wstring
Utf8ToWChar(const std::string& narrow)
{
   if (narrow.empty())
      return std::wstring();
   std::__utf_8<wchar_t> cvt;
   std::wstring wide(narrow.size(), L'\0');
   std::mbstate_t state = std::mbstate_t();  // start in the initial conversion state
   const char* from_beg = &narrow[0];
   const char* from_end = from_beg + narrow.size();
   const char* from_nxt;
   wchar_t* to_beg = &wide[0];
   wchar_t* to_end = to_beg + wide.size();
   wchar_t* to_nxt;
   std::codecvt_base::result r = cvt.in(state,
            from_beg, from_end, from_nxt, to_beg, to_end, to_nxt);
   switch (r)
   {
   case std::codecvt_base::error:
   case std::codecvt_base::partial:
   case std::codecvt_base::noconv:
      throw std::runtime_error("unknown problem");
   case std::codecvt_base::ok:
      wide.resize((std::wstring::size_type)(to_nxt - to_beg));
   }
   return wide;
}

class wless_first
{
public:
    typedef std::pair<std::wstring, std::string> Pair;
    explicit wless_first(const std::locale& loc) : loc_(loc) {}
    bool operator()(const Pair& p1, const Pair& p2) const
        {return loc_(p1.first, p2.first);}
private:
    std::locale loc_;
};

template <class ForwardIt>
void
sort_utf8(ForwardIt first, ForwardIt last)
{
    typedef std::vector<std::pair<std::wstring, std::string> > Map;
    Map v;
    // Compare with != so that forward iterators really are enough.
    for (ForwardIt i = first; i != last; ++i)
        v.push_back(Map::value_type(Utf8ToWChar(*i), *i));
    std::locale loc(std::locale(), new my_collate);
    std::sort(v.begin(), v.end(), wless_first(loc));
    for (Map::const_iterator i = v.begin(); i != v.end(); ++i, ++first)
        *first = i->second;
}

#include <iostream>
#include <iterator>  // std::ostream_iterator

int main()
{
    std::vector<std::string> v;
    v.push_back("AAAAB");
    v.push_back("AAAAa");
    v.push_back("RrrS");
    v.push_back("RrsS");
    v.push_back("e");
    v.push_back("E");
    v.push_back("f");
    sort_utf8(v.begin(), v.end());
    std::ostream_iterator<std::string> out(std::cout, "\n");
    std::copy(v.begin(), v.end(), out);
}

If I were going to sink more time into this (which I'm not), the first
thing I would do is figure out how to avoid passing the original
strings around so much.
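For readers without MSL: std::__utf_8<wchar_t> is an MSL extension, so the expansion step will not compile elsewhere as written. The following is a rough, hand-written sketch of the same idea (expand UTF-8 bytes into one value per code point), not the MSL implementation. DecodeUtf8 is a hypothetical name, and the result is a std::vector<unsigned long> rather than a std::wstring so that nothing is assumed about the width of wchar_t. It rejects bad lead bytes, bad continuation bytes and truncated sequences, but it does not check for overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into a sequence of code points.
inline std::vector<unsigned long> DecodeUtf8(const std::string& in)
{
    std::vector<unsigned long> out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

The decoded sequence could then serve as the sort key in place of the wstring, with whatever comparison rule is available on the platform in use.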

--
Howard Hinnant
Metrowerks

 
 
 

Post by David Phillip Oste » Mon, 28 Jul 2003 06:09:14




> In article


> | Disadvantage is sane sorting of characters outside of 7-bit ascii.

> It is just hard to resist a fun programming challenge. :-)  

...

Thank you! I'm filing this post in my personal Tips folder, along with
your previous postings of tips about MSL.

> If I were going to sink more time into this (which I'm not), the first
> thing I would do is figure out how to avoid passing the original
> strings around so much.

So, I took your challenge, and tried to avoid passing the original
strings around so much. I tried replacing the string copies with
references:

typedef std::pair<std::wstring, std::string & > Pair;
typedef std::vector<std::pair<std::wstring, std::string & > > Map;

but that doesn't work because of the well-known reference-to-reference
problem (solved with boost::ref() from boost.org), and
because the final step of the sort copies from the vector of pairs
back to the original sequence in a permuted order, and if we only
have references, we clobber some of the original strings.

So, instead, I just compute a permutation vector of indices, and
use it to move from a copy into the original:

--------- just the changed code ---------
class wless_first
{
public:
    typedef std::pair<std::wstring, int> Pair;
    explicit wless_first(const std::locale& loc) : loc_(loc) {}
    bool operator()(const Pair& p1, const Pair& p2) const
        {return loc_(p1.first, p2.first);}
private:
    std::locale loc_;
};

template <class ForwardIt>
void
sort_utf8(ForwardIt first, ForwardIt last)
{
    typedef std::vector<std::pair<std::wstring, int> > Map;
    // iterator_traits comes from <iterator>
    typedef typename std::iterator_traits<ForwardIt>::value_type Value;
    Map v;
    int index = 0;
    // Keep a copy of the input so the originals can be overwritten safely.
    std::vector<Value> copyInput(first, last);
    for (ForwardIt i = first; i != last; ++i){
        v.push_back(Map::value_type(Utf8ToWChar(*i), index++));
    }
    std::locale loc(std::locale(), new my_collate);
    std::sort(v.begin(), v.end(), wless_first(loc));
    // Move each string from the copy into its sorted position.
    for (Map::const_iterator i = v.begin(); i != v.end(); ++i, ++first){
        *first = copyInput[i->second];
    }
}

------------------

But, what I'd really like to have is a routine, that given a
pair of random access iterators that define a range, and a
forward iterator to a container of integer indices,
efficiently executes the permutation in place.

If I had PermuteInPlace(), I could get rid of copyInput in
the above code.
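One possible shape for such a routine, offered as a sketch rather than a finished tool: follow each cycle of the permutation and swap elements into place. The name PermuteInPlace comes from the paragraph above, but the signature and the convention (order[i] is the original position of the element that should end up at position i, which matches the int stored in each pair of the sort above) are assumptions of this example. The index vector is taken by value and used as scratch space, so it costs a copy of the indices but no copy of the strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Rearranges [first, first + order.size()) so that the element that
// started at position order[i] ends up at position i.
// Precondition: order is a permutation of 0 .. order.size()-1.
template <class RandomIt>
void PermuteInPlace(RandomIt first, std::vector<std::size_t> order)
{
    using std::swap;
    const std::size_t n = order.size();
    for (std::size_t start = 0; start < n; ++start)
    {
        // Walk the cycle that begins at start. Each swap puts the
        // correct element at cur and pushes the displaced one onward.
        std::size_t cur = start;
        while (order[cur] != start)
        {
            const std::size_t next = order[cur];
            assert(next < n);
            swap(first[cur], first[next]);
            order[cur] = cur;  // mark position cur as finished
            cur = next;
        }
        order[cur] = cur;      // the last position of the cycle is finished too
    }
}
```

With this, sort_utf8 could collect the second member of each sorted pair into a vector of indices and call PermuteInPlace(first, indices) instead of copying the input, provided the iterators are random access. Each element is swapped at most once per position, so the work is linear in the length of the range.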

Thanks again.

David Phillip Oster

 
 
 

Post by Howard Hinnan » Mon, 28 Jul 2003 09:09:38


In article

| So, instead, I just compute a permutation vector of indices, and
| use it to move from a copy into the original:
|
| --------- just the changed code ---------
| class wless_first
| {
| public:
|     typedef std::pair<std::wstring, int> Pair;
|     explicit wless_first(const std::locale& loc) : loc_(loc) {}
|     bool operator()(const Pair& p1, const Pair& p2) const
|         {return loc_(p1.first, p2.first);}
| private:
|     std::locale loc_;
| };
|
| template <class ForwardIt>
| void
| sort_utf8(ForwardIt first, ForwardIt last)
| {
|     typedef std::vector<std::pair<std::wstring, int> > Map;
|     // iterator_traits comes from <iterator>
|     typedef typename std::iterator_traits<ForwardIt>::value_type Value;
|     Map v;
|     int index = 0;
|     // Keep a copy of the input so the originals can be overwritten safely.
|     std::vector<Value> copyInput(first, last);
|     for (ForwardIt i = first; i != last; ++i){
|         v.push_back(Map::value_type(Utf8ToWChar(*i), index++));
|     }
|     std::locale loc(std::locale(), new my_collate);
|     std::sort(v.begin(), v.end(), wless_first(loc));
|     // Move each string from the copy into its sorted position.
|     for (Map::const_iterator i = v.begin(); i != v.end(); ++i, ++first){
|         *first = copyInput[i->second];
|     }
| }
| ------------------
|
| But, what I'd really like to have is a routine, that given a
| pair of random access iterators that define a range, and a
| forward iterator to a container of integer indices,
| efficiently executes the permutation in place.
|
| If I had PermuteInPlace(), I could get rid of copyInput in
| the above code.

<nod>, yeah, this is a good direction.

Search comp.lang.c++.moderated, subject "stable_sort and iter_swap",
posts by John Potter and Dennis Yelle.  They have permutation logic in
there, but it isn't cleanly separated out into its own template
function.  I haven't thoroughly studied it, but I suspect it is good
logic.  And a generic permute function would be handy (maybe
even standardizable? maybe also an inverse permute?).

One other thing I'm noticing is that std::pair doesn't have a swap
overload:

template <class T1, class T2>
inline
void
swap(pair<T1,T2>& p1, pair<T1,T2>& p2)
{
   swap(p1.first, p2.first);
   swap(p1.second, p2.second);
}

The MSL C++ sort heavily uses swap, and currently swapping pairs is
just using the generic swap (standards defect?).  So you might get a
performance advantage with the above swap.  You would have to drop it
into namespace std (which is technically illegal).  Alternatively you
could make the second part of the pair a user-defined, int-like type:

class my_int
{
// make it act like an int
};

typedef std::pair<std::wstring, my_int> Pair;

And then you could overload swap in my_int's namespace:

template <class T1>
inline
void
swap(std::pair<T1,my_int>& p1, std::pair<T1,my_int>& p2)
{
   using std::swap;  // lets the member swaps below find std::swap
   swap(p1.first, p2.first);
   swap(p1.second, p2.second);
}

Now you've got 100% legal code.  And your wstrings will sort using swap
instead of copy and assign.
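A self-contained sketch of that trick, so the lookup can be seen working. The namespace demo, the body of my_int, the call counter swap_calls() and the helper SwapPairs are all made up for this illustration. The counter is only there to make it observable that the overload is the one chosen. Whether a particular library's sort calls swap unqualified (so that this overload is found) is an implementation detail; the post above says the MSL sort does.

```cpp
#include <algorithm>
#include <string>
#include <utility>

namespace demo
{
    // Hypothetical int-like wrapper; only what this sketch needs.
    class my_int
    {
    public:
        my_int(int v = 0) : v_(v) {}
        operator int() const { return v_; }
    private:
        int v_;
    };

    // Counts calls to the overload below; for illustration only.
    inline int& swap_calls()
    {
        static int n = 0;
        return n;
    }

    // Found by argument-dependent lookup, because demo is an
    // associated namespace of std::pair<T1, demo::my_int>.
    template <class T1>
    inline void swap(std::pair<T1, my_int>& p1, std::pair<T1, my_int>& p2)
    {
        using std::swap;            // make std::swap visible for the members
        swap(p1.first, p2.first);   // member-wise swap of the key
        swap(p1.second, p2.second); // falls back to the generic std::swap
        ++swap_calls();
    }
}

// Swap two pairs the way a swap-based sort would: with an unqualified call.
inline void SwapPairs(std::pair<std::wstring, demo::my_int>& a,
                      std::pair<std::wstring, demo::my_int>& b)
{
    using std::swap;
    swap(a, b);
}
```

Calling SwapPairs() on two such pairs exchanges their contents and bumps the counter once, which shows the overload in demo was preferred over the generic std::swap.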

Have fun, and stop before you burn up too much time! :-)

--
Howard Hinnant
Metrowerks