How do I translate all upper case to smart mixed case?

Post by gas.. » Thu, 06 Aug 1992 12:16:59



Hi all,

I'm looking to convert all-upper case output to mixed case -- sentences
start with capitals, all others are lowercase.  Here are the rules
I need to follow:

After 1 period, next non-whitespace character is capitalized.
No case switching after 2 or more periods... to allow for ellipsis.
After 1 or more CR, next non-whitespace character is capitalized.

My knowledge ends with the command "tr /A-Z/ /a-z/" which of course
is rather less smart than I need.

Can the above be done reasonably simply?

Mucho thanks,
Nate.

--

Nathan Gasser       ><>        

 
 
 


Post by chris ro » Fri, 07 Aug 1992 03:00:08



>I'm looking to convert all-upper case output to mixed case -- sentences
>start with capitals, all others are lowercase.  Here are the rules
>I need to follow:
>After 1 period, next non-whitespace character is capitalized.
>No case switching after 2 or more periods... to allow for ellipsis.
>After 1 or more CR, next non-whitespace character is capitalized.

I think that last rule should read "after 1 or more _blank_ lines",
otherwise you'll capitalize the first word-character in every line.

>My knowledge ends with the command "tr /A-Z/ /a-z/" which of course
>is rather less smart than I need.
>Can the above be done reasonably simply?

In sed and awk?  Simply?  Bwahahahaha.  In lex, maybe.
Here's a stab at a perl solution.

        #!/usr/local/bin/perl -p

        $nocap = 0, next unless /\w/;
        $_ = "\L$_";
        s/([^.]\.\s+)(\w)/$1\u$2/g;
        s/\bi\b/I/g;
        $nocap = 1, s/\w/\u$&/ unless $nocap;

In English:

        cap next line if this line is blank (actually, if it has no words)
        make everything lowercase
        cap first word-character following a lone .
        cap lone 'i'  (bonus feature :-)
        cap first char in first line or any line following a blank line
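
Because the script above works one line at a time, a period that ends
one line cannot capitalize the first word of the next. A rough variation
that processes the whole text as one string might look like the sketch
below. The name smart_case and the demo text are assumptions made for
illustration, and the lookbehind (?<!\.) requires a Perl newer than the
one this thread was written for.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# smart_case is a hypothetical helper name, not part of the original post.
sub smart_case {
    my ($text) = @_;
    $text = lc $text;                           # make everything lowercase
    $text =~ s/\A(\W*)(\w)/$1\u$2/;             # cap first word-character of the text
    $text =~ s/(?<!\.)\.(\s+)(\w)/.$1\u$2/g;    # cap after a lone period, even across a newline
    $text =~ s/(\n[ \t]*\n\s*)(\w)/$1\u$2/g;    # cap after one or more blank lines
    $text =~ s/\bi\b/I/g;                       # cap lone 'i'
    return $text;
}

# Made-up demo input.
my $demo = "HELLO THERE. THIS IS A TEST... AND SO ON.\n"
         . "I AGREE.\n\nNEW PARAGRAPH HERE.\n";
print smart_case($demo);
```

In the demo, the ellipsis leaves "and" in lowercase, while the period
that ends the first line capitalizes the word that starts the second.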

You can augment this with all manner of random hacks --
capitalize names following "Mr." or "Mrs.", check all
words against a dictionary of personal and place names,
etc.
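
One of those hacks, capitalizing around "Mr." and "Mrs.", could be
sketched as follows. The title list and the name cap_after_titles are
assumptions for illustration, and the input is assumed to be lowercase
already.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical list of titles; extend to taste.
my @titles   = qw(mr mrs ms dr);
my $title_re = join '|', @titles;

# cap_after_titles is a made-up helper name.
sub cap_after_titles {
    my ($line) = @_;
    # Cap the title itself and the first letter of the word that follows it.
    $line =~ s/\b($title_re)\.(\s+)(\w)/\u$1.$2\u$3/gi;
    return $line;
}

print cap_after_titles("i met mr. smith and mrs. jones."), "\n";
```

This only touches the title and the following word; the sentence rules
still have to be applied separately.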


 WARNING: article may contain flammable material.  Do not expose to open
   flames.  In case of accidental ignition, douse keyboard with water.

 
 
 


Post by Tom Christianse » Fri, 07 Aug 1992 04:09:21


Here's a version from Larry Wall that handles proper nouns.

--tom

------- Forwarded Message

Date:         22 Jan 90 20:24:27 GMT

Subject:      Re: All-uppercase text to mixed case
Organization: Jet Propulsion Laboratory, Pasadena, CA
Newsgroups:   comp.lang.perl

Here is something I whipped up for Peter Yee.  It does the same thing,
only more so.  It uses /usr/dict/words plus an exception dictionary.
The one supplied is obviously the start of one for translating NASA articles,
though there's still some stuff missing from it.
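
A minimal sketch of the dictionary idea, using a small inline word list
as a stand-in for /usr/dict/words (the list and the name restore_proper
are assumptions for illustration): every dictionary entry that starts
with a capital is stored under its lowercase spelling, and each word of
the lowercased text is then looked up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for /usr/dict/words; only capitalized entries are kept.
my @words = qw(Pasadena Mercury March apple);

my %proper;
for my $w (@words) {
    next unless $w =~ /^[A-Z]/;
    $proper{lc $w} = $w;          # key: lowercase form, value: proper spelling
}

# restore_proper is a made-up helper name.
sub restore_proper {
    my ($text) = @_;
    $text =~ s/(\w+)/exists($proper{$1}) ? $proper{$1} : $1/eg;
    return $text;
}

print restore_proper("the probe left pasadena in march"), "\n";
```

Note that a plain lookup also capitalizes "march" when it is used as a
verb, which is the kind of case an exception dictionary is for.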

This won't work quite right until patch 9 comes out, with a fix for /\b/i.

Among other things, this program assumes that words containing no vowels
are acronyms, and should be capitalized.
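
The vowel heuristic can be tried in isolation with a sketch like this
one. The name upcase_vowelless is an assumption; the character class is
the one used in the program, which leaves out "y" and so treats it as a
vowel. The second substitution undoes the side effect on endings such
as 's.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# upcase_vowelless is a made-up helper name; input is assumed lowercase.
sub upcase_vowelless {
    my ($text) = @_;
    # Words made only of consonants are taken to be acronyms.
    $text =~ s/\b([b-df-hj-np-tv-xz]+)\b/\U$1/g;
    # Repair 'S, 'D and 'T that the step above upcased.
    $text =~ s/([a-z])'([SDT])\b/${1}'\l$2/g;
    return $text;
}

print upcase_vowelless("the jpl team's tv feed"), "\n";
```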

Larry Wall

#!/bin/sh
: make a subdirectory, cd to it, and run this through sh.
echo 'If this kit is complete, "End of kit" will echo at the end'
echo Extracting unuc
sed >unuc <<'!STUFFY!FUNK!' -e 's/X//'
X#!/usr/bin/perl
X
Xprint STDERR "Loading proper nouns...\n";
Xopen(DICT,"/usr/dict/words") || die "Can't find /usr/dict/words: $!\n";
Xwhile (<DICT>) {
X    if (/^[A-Z]/) {
X       chop;
X       ($lower = $_) =~ y/A-Z/a-z/;
X       $proper{$lower} = $_;
X    }
X}
Xclose DICT;
Xprint STDERR "Loading exceptions...\n";
X
Xopen(PATS,"unuc.pats") || die "Can't find unuc.pats: $!\n";
X
X$prog = <<'EOT';
Xwhile (<>) {
X    next if /[a-z]/;
X    y/A-Z/a-z/;
X    s/(\w+)/$proper{$1} ? $proper{$1} : $1/eg;
X    s/^(\s*)([a-z])/$1 . (($tmp = $2) =~ y:a-z:A-Z:,$tmp)/e;
X    s/([-.?!]["']?(\n\s*|  \s*)["']?)([a-z])/$1 . (($tmp = $3) =~ y:a-z:A-Z:,$tmp)/eg;
X    s/\b([b-df-hj-np-tv-xz]+)\b/(($tmp = $1) =~ y:a-z:A-Z:,$tmp)/eg;
X    s/([a-z])'([SDT])\b/$1 . "'" . (($tmp = $2) =~ y:A-Z:a-z:,$tmp)/eg;
XEOT
Xwhile (<PATS>) {
X    chop;
X    next if /^$/;
X    next if /^#/;
X    if (! /;$/) {
X       $foo = $_;
X       $foo =~ y/A-Z/a-z/;
X       print STDERR "Dup $_\n" if $proper{$foo};
X       $foo =~ s/([^\w ])/\\$1/g;
X       $foo =~ s/ /(\\s+)/g;
X       $foo = "\\b" . $foo if $foo =~ /^\w/; # XXX till patch 9
X       $foo .= "\\b" if $foo =~ /\w$/;
X       $i = 0;
X       ($bar = $_) =~ s/ /'$' . ++$i/eg;
X       $_ = "s/$foo/$bar/gi;";
X    }
X    $prog .= '    ' . $_ . "\n";
X}
Xclose PATS;
X$prog .= "}\ncontinue {\n    print;\n}\n";
X
X$/ = '';
X#print $prog;
Xeval $prog; die $@ if $@;
!STUFFY!FUNK!
echo Extracting unuc.pats
sed >unuc.pats <<'!STUFFY!FUNK!' -e 's/X//'
XA.M.
XAir Force
XAir Force Base
XAir Force Station
XAmerican
XApr.
XAriane
XAug.
XAugust
XBureau of Labor Statistics
XCIT
XCaltech
XCape Canaveral
XChallenger
XChina
XCorporation
XCrippen
XDaily News in Brief
XDaniel Quayle
XDec.
XDiscovery
XEdwards
XEndeavour
XFeb.
XFord Aerospace
XFri.
XGeneral Dynamics
XGeorge Bush
XHeadline News
XHOTOL
XI
XII
XIII
XIV
XIX
XInstitute of Technology
XJPL
XJan.
XJul.
XJun.
XKennedy Space Center
XLDEF
XLong Duration Exposure Facility
XLong March
XMar.
XMarch
XMartin
XMartin Marietta
XMercury
XMon.
Xin May
Xs/\bmay (\d)/May $1/g;
Xs/\boffice of (\w)/'Office of ' . (($tmp = $1) =~ y:a-z:A-Z:,$tmp)/eg;
XNational Science Foundation
XNASA Select
XNew Mexico
XNov.
XOMB
XOct.
XOffice of Management and Budget
XPresident
XPresident Bush
XRichard Truly
XRocketdyne
XRussian
XRussians
XSat.
XSep.
XSoviet
XSoviet Union
XSoviets
XSpace Shuttle
XSun.
XThu.
XTue.
XU.S.
XUnion of Soviet Socialist Republics
XUnited States
XVI
XVII
XVIII
XVice President
XVice President Quayle
XWed.
XWhite Sands
XKaman Aerospace
XAerospace Daily
XAviation Week
XSpace Technology
XWashington Post
XLos Angeles Times
XNew York Times
XAerospace Industries Association
Xpresident of
XJohnson Space Center
XSpace Services
XInc.
XCo.
XHughes Aircraft
XCompany
XOrbital Sciences
XSwedish Space
XArnauld
XNicogosian
XMagellan
XGalileo
XMir
XJet Propulsion Laboratory
XUniversity
XDepartment of Defense
XOrbital Science
XOMS
XUnited Press International
XUnited Press
XUPI
XAssociated Press
XAP
XCable News Network
XCape York
XZenit
XSYNCOM
XEastern
XWestern
XTest Range
XJcsat
XJapanese Satellite Communications
XDefence Ministry
XDefense Ministry
XSkynet
XFixed Service Structure
XLaunch Processing System
XAsiasat
XLaunch Control Center
XEarth
XCNES
XGlavkosmos
XPacific
XAtlantic
!STUFFY!FUNK!
echo ""
echo "End of kit"
: I do not append .signature, but someone might mail this.
exit

------- End of Forwarded Message
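
To see what the program does with a plain entry from unuc.pats, here is
a sketch that follows the same steps for a single phrase and returns the
generated substitution as text. The name pattern_for is an assumption
for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# pattern_for is a made-up helper name mirroring the steps in unuc.
sub pattern_for {
    my ($phrase) = @_;
    my $re = lc $phrase;                 # match against lowercased text
    $re =~ s/([^\w ])/\\$1/g;            # escape punctuation such as '.'
    $re =~ s/ /(\\s+)/g;                 # any run of whitespace between words
    $re = "\\b$re" if $re =~ /^\w/;      # word boundary at the front
    $re .= "\\b"   if $re =~ /\w$/;      # and at the back
    my $i = 0;
    (my $repl = $phrase) =~ s/ /'$' . ++$i/eg;   # keep the original whitespace
    return "s/$re/$repl/gi;";
}

print pattern_for("Air Force Base"), "\n";
print pattern_for("U.S."), "\n";
```

Capturing the whitespace and putting it back through $1, $2 and so on
means a phrase that was split across a line break stays split in the
output.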

--

    "UNIX was not designed to stop you from doing stupid things, because
     that would also stop you from doing clever things." -- Doug Gwyn

 
 
 
