Using XHTML entities in XML documents: Legal?

Post by Peter C. Chapi » Sun, 06 Jul 2003 20:49:06



I have a need to include Greek letters in some of my XML documents (the
documents contain astronomical information and many stars are named using
Greek letters). Following some earlier postings on the subject of
entities, I did the following:

---- top of file ----
<?xml version="1.0"?>

<!-- I added this to an existing document. -->
<!DOCTYPE observation-set [
<!ENTITY % HTMLsymbol PUBLIC
   "-//W3C//ENTITIES Symbols for XHTML//EN"
   "xhtml-symbol.ent">
%HTMLsymbol;
]>

<?xml-stylesheet type="text/xsl" href="AOML.xsl"?>

<!-- This is the existing document root. -->
<observation-set
  xmlns="http://www.ecet.vtc.edu/~pchapin/AOML_0.0"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.ecet.vtc.edu/~pchapin/AOML_0.0
                     AOML.xsd">

  <!-- Now I believe I can use &alpha;, &beta;, etc. here. -->

</observation-set>
---- end of file ----

I'm attempting to borrow the entity definitions that were created for
XHTML. I downloaded the file xhtml-symbol.ent from the W3C and have a
copy locally in the same folder as the XML document that references it.
My desire was to now be able to use things like &alpha; and &beta; in my
XML document.

This mostly works. In particular, it works fine with IEv6. My XML
documents also validate (no complaints about undefined entities) with XSV
and XMLSpy (using MSXML, I believe). Also if I use Xalan to style
the document, it generates appropriate HTML. In fact I was able to prove
that Xalan is reading the external file containing the entity
definitions: I temporarily changed the definition of &alpha; to be the
same as &beta;. When Xalan wrote its output it serialized the character I
had written as "&alpha;" in the XML document into "&beta;" in the output
HTML document. Very cool.

However, with Mozilla v1.3 I get "undefined entity" errors. Even if I
include in the internal subset an explicit definition of the entities I'm
using, Mozilla still doesn't seem to notice them. Is this a problem with
Mozilla or am I missing something in my document? It is my desire to
support Mozilla so disregarding this problem is not really an option.

On a possibly related note, the Xerces (v2.3.0) parser seems to notice
the entities but it produces errors of this sort:

[Error] AO-2003-06-16.xml:16:75: Element type "observation-set" must be
declared.

The (line, column) of the error points to the end of the opening
observation-set tag. This error does not occur if I remove the <!DOCTYPE
observation-set [...]>. It almost seems as if Xerces sees the DOCTYPE
declaration and commits itself to the idea that a DTD is being used when,
in fact, the document uses an XML Schema. (It complains about all the
other elements as well, not just the document element). However, neither
XSV nor MSXML seemed to have that problem. Is this an issue with Xerces
or is mixing DOCTYPE and XML Schemas a bad thing?

Thanks for any clarification you can provide.

Peter

Post by Martin Honne » Sun, 06 Jul 2003 21:28:50



> I have a need to include Greek letters in some of my XML documents (the
> documents contain astronomical information and many stars are named using
> Greek letters). Following some earlier postings on the subject of
> entities, I did the following:

> ---- top of file ----
> <?xml version="1.0"?>

As you don't specify an encoding for your XML, you use UTF-8 or UTF-16,
both of which are capable of encoding Greek letters without the need to
use entities.
So why do you need to use entities?
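
A minimal Python sketch of this point, assuming the file is saved as UTF-8. The element name `star` and the sample text are made up for illustration; only the standard-library parser is used:

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document with a literal Greek letter and no DOCTYPE.
# With no encoding declaration, the parser treats the raw bytes as UTF-8.
raw = '<?xml version="1.0"?><star>\u03b1 Centauri</star>'.encode("utf-8")

root = ET.fromstring(raw)
print(root.text == "\u03b1 Centauri")  # the literal character survives parsing
```

The alpha is stored as the two bytes 0xCE 0xB1, so no entity declarations are needed at all.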

--

        Martin Honnen
        http://JavaScript.FAQTs.com/

Post by Peter C. Chapi » Mon, 07 Jul 2003 04:02:32



> As you don't specify an encoding for your XML, you use UTF-8 or UTF-16,
> both of which are capable of encoding Greek letters without the need to
> use entities.
> So why do you need to use entities?

Well, I don't have an editor that allows me to easily enter or view Greek
letters. I have been meaning to look into the matter of editing "Unicode"
files (that is, files that use characters above U+007F to a non-trivial
extent). I haven't walked that road as yet and I guess I figured the
entity solution would address the matter for the half dozen or so Greek
characters that I need per document in my current situation.

Since posting my original note I spent some time with the Mozilla bug
database. It turns out that Mozilla (at least in older versions) doesn't
read external entities (apparently non-validating parsers are not required
to do so). Furthermore, once it encounters a reference to an external
parameter entity that it does not read, it stops processing the entity
declarations in the rest of the internal DTD subset. Apparently this is
according to the XML specification.

However, contrary to my earlier assertion, Mozilla does read the internal
DTD subset. The reason it didn't notice the Greek entity definitions that
I tried before was that I put them *after* the reference to the external
entity. When I remove the external entity entirely it works fine.
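
The ordering effect can be reproduced outside the browser. This is only a sketch: it assumes Python's standard-library parser (built on expat, which does not read external parameter entities by default) behaves like the non-validating parser described above, and the `star` element and helper names are invented for the test.

```python
import xml.etree.ElementTree as ET

# Shared pieces of two hypothetical test documents.
EXTERNAL = ('<!ENTITY % HTMLsymbol PUBLIC '
            '"-//W3C//ENTITIES Symbols for XHTML//EN" "xhtml-symbol.ent">\n'
            '%HTMLsymbol;\n')
ALPHA = '<!ENTITY alpha "&#945;">\n'
BODY = '<observation-set><star>&alpha; Centauri</star></observation-set>'


def build(subset):
    """Wrap an internal DTD subset and the sample body into one document."""
    return ('<?xml version="1.0"?>\n<!DOCTYPE observation-set [\n'
            + subset + ']>\n' + BODY)


def star_text(doc):
    """Return the text of <star>, or None if the parser rejects the document."""
    try:
        return ET.fromstring(doc).find("star").text
    except ET.ParseError:
        return None


declared_first = star_text(build(ALPHA + EXTERNAL))  # alpha before %HTMLsymbol;
declared_last = star_text(build(EXTERNAL + ALPHA))   # alpha after %HTMLsymbol;
print(declared_first, declared_last)
```

With the declaration placed first, the entity should expand normally. With it placed after the unread `%HTMLsymbol;` reference, the declaration is skipped and the parser should report `&alpha;` as undefined, which matches what Mozilla did.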

Thus I can get the effect I want if I define all the Greek letter
entities in the internal DTD subset of each document that I produce. That
is not ideal but it is workable, I think.
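
Since the same declarations have to be pasted into every document, they could be generated instead of typed. The following is a rough sketch, not a finished tool: `greek_entity_decls` is a hypothetical helper, it assumes each entity name matches a Unicode "GREEK SMALL LETTER" name, and the `star` element is invented for the check at the end.

```python
import unicodedata
import xml.etree.ElementTree as ET


def greek_entity_decls(names):
    """Build one <!ENTITY ...> line per lowercase Greek letter name."""
    lines = []
    for name in names:
        # unicodedata.lookup raises KeyError for names Unicode does not know.
        char = unicodedata.lookup("GREEK SMALL LETTER " + name.upper())
        lines.append('<!ENTITY %s "&#%d;">' % (name, ord(char)))
    return "\n".join(lines)


subset = greek_entity_decls(["alpha", "beta", "gamma"])
doc = ('<?xml version="1.0"?>\n<!DOCTYPE observation-set [\n' + subset + '\n]>\n'
       '<observation-set><star>&beta; Orionis</star></observation-set>')

print(subset)
print(ET.fromstring(doc).find("star").text)
```

Names from the XHTML set that are not plain letter names (`sigmaf`, `thetasym` and the like) would need a hand-written table rather than the Unicode name lookup.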

Peter

Post by Andreas Prilo » Mon, 07 Jul 2003 04:09:52



>> So why do you need to use entities?

> Well, I don't have an editor that allows me to easily enter or view Greek
> letters.

Then use  &#number;  references.
 http://www.unics.uni-hannover.de/nhtcapri/multilingual2.html#greek
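
For anyone who would rather not look the numbers up by hand, the mapping can be automated. This sketch uses the HTML 4 entity table that ships with Python; `to_numeric_refs` is a made-up helper name, and the regular expression is a simplification that ignores CDATA sections and comments.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser already knows; leave them untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def to_numeric_refs(text):
    """Replace named HTML entities such as &alpha; with numeric references."""
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: keep as written
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, text)


print(to_numeric_refs("&alpha; Cen &amp; &beta; Cen"))
# prints: &#945; Cen &amp; &#946; Cen
```

Run as a preprocessing step, this lets the source be written with names while the published file carries only numeric references, which need no DTD.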

--
Top posting.
What's the most irritating thing on Usenet?

Post by Peter C. Chapi » Mon, 07 Jul 2003 11:10:09




> > Well, I don't have an editor that allows me to easily enter or view Greek
> > letters.

> Then use  &#number;  references.

The document is far more readable and writable using, for example,
"&alpha;" than it is using "&#945;". While the numeric references do work,
they don't really seem like a very nice solution in this case. I read and
write these documents manually.

Peter

Post by chris.dan » Mon, 07 Jul 2003 23:59:13



> Well, I don't have an editor that allows me to easily enter or view Greek
> letters. I have been meaning to look into the matter of editing "Unicode"
> files (that is, files that use characters above U+007F to a non-trivial
> extent). I haven't walked that road as yet and I guess I figured the
> entity solution would address the matter for the half dozen or so Greek
> characters that I need per document in my current situation.

Windows:

http://www.esperanto.mv.ru/UniRed/ENG/

Java:

http://www4.vc-net.ne.jp/~klivo/sim/simeng.htm

Unices:

http://www.yudit.org/

hth,
Chris

 
 
 
