PageBox: servlet running in sandbox on J2EE PageBox


Advanced functions in Cuckoo

Element support

Horizontal rule

To create a horizontal rule (<hr/>) in the HTML page, create an empty paragraph in the Horizontal style.


Preserve

The <pre> element preserves the text format. Within a <pre> element all characters are interpreted literally and retained in display, including multiple spaces, tabs, carriage returns and linefeeds.

You can use two Word styles to create <pre> elements:

  • Preserve. Preserve is a character style and allows you to preserve the text format of a character string.

  • PreservePar. PreservePar is a paragraph style and allows you to preserve the text format of a multi-line area.

Mathematics

Cuckoo 0.0.5 and above supports the characters and styles used in Mathematics and Chemistry.

It relies on Word and browser capabilities.

Mathematics and Greek support in Word 97/2000

Subscript and Superscript

In Mathematics you often need to write things such as x².

In Chemistry you often need to write things such as H₂O.

Cuckoo relies on Word Superscript and Subscript to display smaller characters above or below the normal characters.

If you have not yet configured Word with Superscript and Subscript, here is how to do it.

Open up the Tools... Customize menu and choose the All Commands category:

Scroll down to find the Subscript and Superscript options. Then click the desired option and drag it onto the toolbar. Your toolbar should now contain the x₂ and x² buttons.

Now, to write x², type x as usual, click the x² button, type 2 and click the x² button again to revert to normal mode.

Note:

Normally you don’t need to configure Word because Subscript and Superscript are already saved in cuckoo.dot.

Equation Editor

The Equation Editor comes with Microsoft Office or Works (including OEM editions). However, a standard installation does not include it, so you need your installation CD-ROM to run a custom installation. Once the Equation Editor is installed, follow the same procedure as for Superscript and Subscript:

  • Open up the Tools... Customize menu and choose the Insert category

  • Scroll down to find the Equation Editor option. Then click the option and drag it onto the toolbar. Your toolbar should now contain the √α button.

You can learn how to use the Equation Editor on spot.pcc.edu/academ/math/faculty/simonds/handouts/equation/ and on www.rdg.ac.uk/SerDepts/su/Topic/WordProc/WoP2Kequ01/.

When Cuckoo finds an Equation in the document, it generates a small .gif image and an <img> element.

Symbols

A Font named Symbol supports Mathematical symbols and Greek characters in Word.

If you type 'a' with the Symbol font, an α is displayed.

The Symbol font is the same as the Symbol font of browsers. You can find the codes of Greek and Mathematical characters here. However, in Word, Symbol characters are stored on two bytes, for instance F022 (hexadecimal), which reads as -4062 when taken as a signed integer, for ∀.
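The two notations given for the F022 example describe the same 16-bit value. Here is a minimal sketch of the arithmetic, written in JavaScript like the scripts further down this page. The function names are illustrative and not part of Cuckoo, and the idea that Word stores a Symbol character as F000 plus its Symbol font code is an assumption drawn from this single example.

```javascript
// Illustrative helpers, not part of Cuckoo.

// Read a 16-bit code as a signed integer, the way a VBA Integer holds it.
function asSigned16(code) {
  return code >= 0x8000 ? code - 0x10000 : code;
}

// Assumption: Word stores a Symbol character as F000 + its Symbol font code,
// so the low byte is the code used with the Symbol font in a browser.
function symbolCodeFromWord(wordCode) {
  return wordCode & 0xFF;
}

var wordCode = 0xF022;                          // the example from the text
var signedValue = asSigned16(wordCode);         // negative, because bit 15 is set
var symbolCode = symbolCodeFromWord(wordCode);  // hexadecimal 22
```

The signed form matters because VBA's AscW returns an Integer, so codes above 7FFF come back negative.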

Issues

Support of Greek and Mathematical characters is buggy. There is a gap between what should be supported and what really works and there are problems on browsers as well as in Word.

Browsers

HTML 4.0 uses Unicode as its base character set, so we should be able to display ∀, for instance, with &#8704;. Furthermore, named entities were defined, so we should be able to use &forall;, which is arguably easier to remember.

Before HTML 4.0 the only solution was to use the Symbol font. We could code ∀ with <font face="Symbol">&#34;</font>, where 34 is the decimal value of hexadecimal 22.
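The three notations discussed above can be produced from one small record per character. This is only a sketch: the record layout and the function names are assumptions made for the example, not something Cuckoo defines.

```javascript
// Illustrative sketch: the record fields and function names are assumptions.
var forAll = { unicode: 8704, symbol: 34, name: "forall" };

// HTML 4 style: numbered entity based on the Unicode code point.
function numberedEntity(ch) {
  return "&#" + ch.unicode + ";";
}

// HTML 4 style: named entity.
function namedEntity(ch) {
  return "&" + ch.name + ";";
}

// Pre-HTML 4 style: the Symbol font code wrapped in a <font> element.
function symbolFontMarkup(ch) {
  return '<font face="Symbol">&#' + ch.symbol + ";</font>";
}
```

Keeping both codes in the same record makes it possible to generate one page per family of browsers from a single source.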

The math browser page shows how well your browser supports HTML 4 and Symbol font.

The table below summarizes the situation:

| Browser | Behavior with default installation |
|---|---|
| Old browsers and Opera 5 | Only Symbol font |
| Internet Explorer 5 and 6 | Partial support of named entities and Unicode + support of Symbol font |
| Netscape 6.2, Mozilla 0.9.6 and Opera 6 | Full support of named entities and Unicode. No support of Symbol font |

To summarize, it is impossible to display symbols and Greek characters on all browsers without checking the browser type.

Insert Symbol

Whatever symbol you choose, Word's Insert Symbol command inserts x0028, that is '(', in the document. The real character code is stored somewhere else (the symbol is displayed correctly) but it cannot be retrieved with VBA. Therefore we created a Word document containing all supported characters.

Download and unzip this file. Then copy and paste Mathematical symbols from this file into your document.

For Greek characters, it is faster to type the Latin equivalent. You can easily remember that an α is an a, a β is a b and so on.

Unicode

Browsers support UTF-8 (Netscape 4.7 less well than the others), and for good reason: it is a compact format in which each character is represented on a minimal number of bytes. UTF-8 is the Unicode Transformation Format that serializes a Unicode code point as a sequence of one to four bytes. Here is the transformation table:

| Code points | 1st byte | 2nd byte | 3rd byte | 4th byte |
|---|---|---|---|---|
| U+0000...U+007F | 00...7F | | | |
| U+0080...U+07FF | C2...DF | 80...BF | | |
| U+0800...U+0FFF | E0 | A0...BF | 80...BF | |
| U+1000...U+FFFF | E1...EF | 80...BF | 80...BF | |
| U+10000...U+3FFFF | F0 | 90...BF | 80...BF | 80...BF |
| U+40000...U+FFFFF | F1...F3 | 80...BF | 80...BF | 80...BF |
| U+100000...U+10FFFF | F4 | 80...8F | 80...BF | 80...BF |
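The transformation table can be expressed as a short function. The sketch below is illustrative only: utf8Bytes and hex are hypothetical names, not part of Cuckoo, and like the table the function does not single out the surrogate range.

```javascript
// Illustrative sketch of the transformation table; not part of Cuckoo.
// Returns the UTF-8 bytes of a Unicode code point as an array of numbers.
function utf8Bytes(codePoint) {
  if (codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError("not a Unicode code point: " + codePoint);
  }
  if (codePoint <= 0x7F) {
    return [codePoint];                          // 1 byte
  }
  if (codePoint <= 0x7FF) {
    return [0xC0 | (codePoint >> 6),             // 2 bytes
            0x80 | (codePoint & 0x3F)];
  }
  if (codePoint <= 0xFFFF) {
    return [0xE0 | (codePoint >> 12),            // 3 bytes
            0x80 | ((codePoint >> 6) & 0x3F),
            0x80 | (codePoint & 0x3F)];
  }
  return [0xF0 | (codePoint >> 18),              // 4 bytes
          0x80 | ((codePoint >> 12) & 0x3F),
          0x80 | ((codePoint >> 6) & 0x3F),
          0x80 | (codePoint & 0x3F)];
}

// Format the bytes in hexadecimal for comparison with the table.
function hex(bytes) {
  var out = [];
  for (var i = 0; i < bytes.length; i++) {
    var h = bytes[i].toString(16).toUpperCase();
    out.push(h.length < 2 ? "0" + h : h);
  }
  return out.join(" ");
}
```

For instance hex(utf8Bytes(8704)) gives "E2 88 80", which matches the U+1000...U+FFFF row of the table.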

There are other representations of Unicode characters. The most important is UTF-16/UCS-2, which is supported by the Windows API and by programs such as Notepad. It uses one 16-bit unit to code about 63K characters and two 16-bit units to code about 1M more. UCS stands for Universal Multiple-Octet Coded Character Set and is defined by ISO 10646.
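The step from one 16-bit unit to two can be sketched as follows. The function name is hypothetical; the constants D800 and DC00 are the starting points of the two halves of what the Unicode standard calls a surrogate pair.

```javascript
// Illustrative sketch; utf16Units is not part of Cuckoo.
// Returns the 16-bit units that represent a code point in UTF-16.
function utf16Units(codePoint) {
  if (codePoint < 0x10000) {
    return [codePoint];                    // one 16-bit unit
  }
  var offset = codePoint - 0x10000;        // 20 bits remain
  var high = 0xD800 + (offset >> 10);      // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);     // bottom 10 bits
  return [high, low];                      // two 16-bit units
}
```

All the characters handled on this page fit in a single unit, so the two-unit case is shown only for completeness.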

For more information read the Unicode standard.

We can also represent Unicode characters in an 8-bit ASCII file using entity codes, that is codes starting with &#, for instance &#8704;. For more information you can look at http://www.bbsinc.com/iso8859.html and at RFC 2070. However, the number in an entity code is the UCS-2 value of the character, not its UTF-8 byte sequence. A great source of information is A tutorial on character code issues by Jukka Korpela.
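As a rough illustration of entity codes, the sketch below converts between characters and decimal numbered entities. The function names are hypothetical, and the code only covers characters that fit in 16 bits.

```javascript
// Illustrative sketch; the function names are not part of Cuckoo.
// Covers the 16-bit range only.

// Replace every character above 7-bit ASCII with a decimal numbered entity.
function toNumberedEntities(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);         // 16-bit value of the character
    out += code > 127 ? "&#" + code + ";" : text.charAt(i);
  }
  return out;
}

// Reverse operation: turn decimal numbered entities back into characters.
function fromNumberedEntities(text) {
  return text.replace(/&#(\d+);/g, function (match, digits) {
    return String.fromCharCode(parseInt(digits, 10));
  });
}
```

The entity form survives any transport that is safe for ASCII, at the cost of several bytes per character.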

Implementation

cuckoo.dot

cuckoo.dot processes characters written with the Symbol font.

If the character is listed both in the Math browser page and in the Word document, cuckoo.dot converts it into the equivalent Unicode numbered entity, for instance &#8704; for ∀. Otherwise it copies the ASCII value of the character:

If r.text > ChrW(61440) Then ' The character is an F0xx code
    Dim k As Integer
    k = AscW(r.text) ' Convert the character to an integer (negative for F0xx)
    processMathChar = txt & Chr(4096 + k) ' 4096 + k is the xx part of F0xx
Else
    processMathChar = txt & r.text
End If
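The whole decision, entity for a listed character and plain character otherwise, can be sketched in JavaScript. This is not the cuckoo.dot code: the function name is hypothetical and the table is a four-entry excerpt chosen for the example, not the list cuckoo.dot really uses.

```javascript
// Illustrative sketch. The table is a small excerpt chosen for the example,
// not the list actually used by cuckoo.dot.
var SYMBOL_TO_UNICODE = {
  0x22: 8704,   // for all
  0x24: 8707,   // there exists
  0x61: 945,    // alpha
  0x62: 946     // beta
};

// wordCode is the 16-bit code Word stores for a character.
function processSymbolChar(wordCode) {
  if (wordCode <= 0xF000) {
    return String.fromCharCode(wordCode);  // not an F0xx code: copied unchanged
  }
  var symbolCode = wordCode - 0xF000;      // same value as 4096 + AscW(...) in VBA
  var unicode = SYMBOL_TO_UNICODE[symbolCode];
  if (unicode !== undefined) {
    return "&#" + unicode + ";";           // listed: Unicode numbered entity
  }
  return String.fromCharCode(symbolCode);  // not listed: plain character
}
```

With this excerpt, processSymbolChar(0xF022) returns "&#8704;" while an unlisted code such as 0xF07E falls back to the plain character "~".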

cuckoo.dot also identifies Subscript and Superscript and wraps the character(s) in a <span> element.

math-gen.js

cuckoo.dot generates XML and HTML 4 files suitable for display in Netscape 6, Mozilla 0.9.6 and Opera 6.

These files will not display all Greek characters and Mathematical symbols on other browsers.

math-gen.js is a WSH script that does two things:

  1. It converts Unicode numbered entities into Symbol character codes

  2. It inserts a script that detects the browser version and redirects to the HTML 4 version when the browser is Netscape 6, Mozilla or Opera 6:

// Go to the HTML 4 version of the current page
function redirect() {
    window.location = window.location.toString().replace(".html", "-nav6.html");
}

var agt = navigator.userAgent.toLowerCase();
var is_major = parseInt(navigator.appVersion);
// Netscape itself, not a browser that only claims Mozilla compatibility
var is_nav = ((agt.indexOf("mozilla") != -1) && (agt.indexOf("spoofer") == -1)
    && (agt.indexOf("compatible") == -1) && (agt.indexOf("opera") == -1)
    && (agt.indexOf("webtv") == -1) && (agt.indexOf("hotjava") == -1));
var is_nav6up = (is_nav && (is_major >= 5));
var is_gecko = (agt.indexOf("gecko") != -1);
var is_opera6 = (agt.indexOf("opera 6") != -1 || agt.indexOf("opera/6") != -1);
if (is_nav6up || is_gecko || is_opera6)
    setTimeout("redirect();", 10);
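To try the detection rules outside a browser, the same tests can be wrapped in functions that take the user agent and version strings as parameters. This is only a sketch with hypothetical function names; math-gen.js inserts the inline script shown above, not these functions.

```javascript
// Illustrative, testable variant of the detection above; the function names
// are not part of math-gen.js.
function wantsHtml4Version(userAgent, appVersion) {
  var agt = userAgent.toLowerCase();
  var major = parseInt(appVersion, 10);
  var isNav = agt.indexOf("mozilla") != -1 &&
              agt.indexOf("spoofer") == -1 &&
              agt.indexOf("compatible") == -1 &&
              agt.indexOf("opera") == -1 &&
              agt.indexOf("webtv") == -1 &&
              agt.indexOf("hotjava") == -1;
  var isNav6up = isNav && major >= 5;
  var isGecko = agt.indexOf("gecko") != -1;
  var isOpera6 = agt.indexOf("opera 6") != -1 || agt.indexOf("opera/6") != -1;
  return isNav6up || isGecko || isOpera6;
}

// Address of the HTML 4 version of a page.
function html4Address(address) {
  return address.replace(".html", "-nav6.html");
}
```

Because the rules only look for substrings, any browser whose user agent mentions Gecko is sent to the HTML 4 version.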

Note:

math-gen.js is designed to process HTML files generated by cuckoo-gen.js.

UTF16to8.exe

UTF16to8 is a small utility that converts Unicode numbered entities into UTF-8.

Run UTF16to8 on HTML 4 files (for Netscape 6, Mozilla, Opera 6 and IE 6 with the proper font installed), and therefore not on files processed by math-gen.js.

The implementation is quite simple:

/*
 * UTF-16 to UTF-8
 * 1) 00000yyy yyxxxxxx -> 110yyyyy 10xxxxxx
 * 2) zzzzyyyy yyxxxxxx -> 1110zzzz 10yyyyyy 10xxxxxx
 */
// code contains what is after &# and before ; in the entity code,
// parsed by sscanf(buf, "%d", &code);
if (code < 128) { // 7-bit code: a single byte, unchanged
    char b = (char)code;
    fwrite(&b, 1, 1, ofd);
} else if (code < 2048) { // Upper byte with 3 bits or less set
    int upper = code >> 6;
    upper += 0xc0; // 11000000
    char bupper = (char)upper;
    int lower = code & 0x3f;
    lower += 0x80; // 10000000
    char blower = (char)lower;
    // Write the bytes, most significant first
    fwrite(&bupper, 1, 1, ofd);
    fwrite(&blower, 1, 1, ofd);
} else { // Other 16-bit codes, up to 65535
    int supper = code >> 12;
    supper += 0xe0; // 11100000
    char bsupper = (char)supper;
    int upper = code & 0xfc0;
    upper >>= 6;
    upper += 0x80; // 10000000
    char bupper = (char)upper;
    int lower = code & 0x3f;
    lower += 0x80; // 10000000
    char blower = (char)lower;
    // Write the bytes, most significant first
    fwrite(&bsupper, 1, 1, ofd);
    fwrite(&bupper, 1, 1, ofd);
    fwrite(&blower, 1, 1, ofd);
}

The use of UTF16to8 is optional. However, if your pages are written in non-Latin languages, UTF16to8 divides the page size by at least two. Your pages are also easier to read.
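The size reduction can be estimated with a few lines of script. The sketch below is not part of UTF16to8: the function names are hypothetical, and it assumes that everything in the text other than the numbered entities is plain ASCII.

```javascript
// Illustrative sketch; the function names are not part of UTF16to8.

// Number of UTF-8 bytes needed for a 16-bit character code.
function utf8Length(code) {
  if (code < 0x80) return 1;
  if (code < 0x800) return 2;
  return 3;
}

// Compare the size of a text written with numbered entities with the size
// of the same text once the entities are converted to UTF-8.
// Assumption: apart from the entities, the text is plain ASCII.
function compareSizes(text) {
  var utf8Count = 0;
  var rest = text.replace(/&#(\d+);/g, function (match, digits) {
    utf8Count += utf8Length(parseInt(digits, 10));
    return "";
  });
  utf8Count += rest.length;                // remaining ASCII characters
  return { entities: text.length, utf8: utf8Count };
}
```

For the two-character text "&#8704;&#945;" the function reports 13 bytes with entities against 5 bytes in UTF-8.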

MathGreek

We defined a MathGreek Word style with the Symbol font.

We also defined a MathGreek CSS class:

.MathGreek {
    font-size: small;
    font-family: Symbol;
}

You don’t need to use MathGreek because Cuckoo tests the font.

However if you define another style with the Symbol font, you must also define a CSS class with a Symbol font-family.

math-verif

Click on math-verif.html to see the different advanced functions of Cuckoo.

Contact: support@pagebox.net
©2001-2004 Alexis Grandemange.