Package edu.nrao.sss.html

HTML support.


Class Summary
HtmlAnchor An HTML anchor (HTML.Tag.A).
HtmlAttribute An attribute (see HTML.Attribute) of an HTML element.
HtmlElement Parent to other HTML elements.
HtmlTable An HTML table.
HtmlTableCell A cell in an HTML table.
HtmlTableRow A row in an HTML table.
HtmlTableTagHandler A handler of table-related HTML tags that may be called by subclasses of HTMLEditorKit.Parser.
 

Enum Summary
HtmlTableCell.Type Types of table cells.
HtmlTableRow.Type Types of table rows.
 

Package edu.nrao.sss.html Description

HTML support.

This package was first created to aid the parsing of HTML table elements. Although the package is more general than that, the table element remains its focus, and its table support is the most fully developed part of the package. If we see a need for more HTML support, we will enhance this package and reduce the emphasis on tables.

This package relies on the javax.swing.text.html package, which is oriented toward letting Swing widgets display HTML and which supports HTML 3.2 but not 4.x. The dated nature of Java's package affects this package; see Limitations.

Main Features of HTML Elements

The HtmlElement class is the parent of all HTML elements. Its chief feature is that an element may be created and populated programmatically and then asked to write itself as HTML. Clients need not be concerned with how a given element is written as HTML. At this time, the only concrete elements are HtmlAnchor and the table elements (table, table row, table cell). All elements may have one or more attributes, represented by the HtmlAttribute class.
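
As a rough illustration of the idea, and not of this package's actual API, the sketch below builds an element in code and then asks it to write itself as HTML. The class SketchElement and all of its methods are hypothetical names invented for this example; the real signatures are in the HtmlElement and HtmlAttribute class documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an element that can write itself as HTML. */
final class SketchElement {
    private final String tagName;
    // LinkedHashMap keeps attributes in the order they were added.
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private String contents = "";

    SketchElement(String tagName) {
        this.tagName = tagName;
    }

    SketchElement setAttribute(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    SketchElement setContents(String text) {
        this.contents = text;
        return this;
    }

    /** Writes the start tag, attributes, escaped contents, and end tag. */
    String toHtml() {
        StringBuilder sb = new StringBuilder("<").append(tagName);
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(' ').append(e.getKey())
              .append("=\"").append(escape(e.getValue())).append('"');
        }
        sb.append('>').append(escape(contents));
        sb.append("</").append(tagName).append('>');
        return sb.toString();
    }

    // Escapes the characters that are significant in HTML markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        SketchElement anchor = new SketchElement("a")
            .setAttribute("href", "index.html")
            .setContents("Home");
        System.out.println(anchor.toHtml());
    }
}
```

The point of the design is that the caller sets values and never assembles markup by hand; escaping and tag layout stay inside the element.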

HTML Tables

The HtmlTable class is mainly a collection of rows. Its most important features are:

  1. Ability to populate, or create, itself from an HTML source. (See readHtmlFrom(Reader, int) and createFromHtml(Reader).)
  2. Ability to populate itself from delimited text. (See readTextFrom(Reader, String).)
  3. Ability to write itself as HTML. (See writeHtmlTo(Writer, int, boolean).)
  4. Ability to write itself as text. (See writeTextTo(Writer, String, String).)
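
The first and last of these features can be sketched directly against javax.swing.text.html, the package this one builds on. The example below is a minimal sketch, not the HtmlTable implementation: TableTextSketch, readRows, and toDelimitedText are hypothetical names, and the code ignores attributes, nested tables, and row types. It collects the cell text of each row through an HTMLEditorKit.ParserCallback and then writes the rows as delimited text.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

/** Hypothetical helper; not part of edu.nrao.sss.html. */
final class TableTextSketch {

    /** Collects the text of each td/th cell, grouped by tr row. */
    static List<List<String>> readRows(Reader html) {
        final List<List<String>> rows = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private List<String> row;    // row being filled, or null
            private StringBuilder cell;  // cell being filled, or null

            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.TR) {
                    row = new ArrayList<>();
                } else if ((t == HTML.Tag.TD || t == HTML.Tag.TH) && row != null) {
                    cell = new StringBuilder();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (cell != null) {
                    cell.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag t, int pos) {
                if ((t == HTML.Tag.TD || t == HTML.Tag.TH) && cell != null) {
                    row.add(cell.toString());
                    cell = null;
                } else if (t == HTML.Tag.TR && row != null) {
                    rows.add(row);
                    row = null;
                }
            }
        };
        try {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(html, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    /** Joins cells with cellDelim and terminates each row with rowDelim. */
    static String toDelimitedText(List<List<String>> rows,
                                  String cellDelim, String rowDelim) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            sb.append(String.join(cellDelim, row)).append(rowDelim);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String html = "<table><tr><th>Name</th><th>Value</th></tr>"
                    + "<tr><td>a</td><td>1</td></tr></table>";
        List<List<String>> rows = readRows(new StringReader(html));
        System.out.print(toDelimitedText(rows, ",", "\n"));
    }
}
```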

Limitations

Nesting tables in the cells of other tables is a common construct. The HtmlTableCell class has only an unparsedContents property; it does not recognize when it is holding a table. When an outer table is parsed, the entire contents of each cell are placed in this unparsedContents property. If client software suspects that the contents might be a table, it can pass those contents to the HtmlTable class for parsing. At this time, though, there is no way to tell the cell anything about the semantics of its contents.
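
One way client code might decide whether a cell's unparsed contents are worth handing to HtmlTable is to scan them for a table start tag first. The sketch below does this with the standard Swing parser. NestedTableSketch and containsTable are hypothetical names that are not part of this package, and the string passed in is assumed to be whatever the cell's unparsedContents property holds.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

/** Hypothetical helper; not part of edu.nrao.sss.html. */
final class NestedTableSketch {

    /** Returns true if the given HTML fragment contains a table start tag. */
    static boolean containsTable(String unparsedContents) {
        // One-element array so the anonymous callback can record the result.
        final boolean[] found = {false};
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.TABLE) {
                    found[0] = true;
                }
            }
        };
        try {
            new ParserDelegator().parse(
                new StringReader(unparsedContents), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found[0];
    }

    public static void main(String[] args) {
        System.out.println(containsTable("<table><tr><td>x</td></tr></table>"));
        System.out.println(containsTable("plain text"));
    }
}
```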

As noted above, Java's package does not support HTML 4.x, so it knows nothing about the thead, tbody, and tfoot elements. This package has some recognition of these tags and should properly parse a table that uses them. It also writes these tags, where appropriate, when converting a table to HTML. When Java begins supporting HTML 4.x, we will revisit these tags. The most likely change is that the row type will move out of the HtmlTableRow class and that HtmlTable will hold a header, a body, and a footer section.
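
The gap in Java's package can be checked with HTML.getTag, which looks a tag name up among the HTML.Tag constants and returns null when there is none. The sketch below is only a probe of javax.swing.text.html; SectionTagSketch and isKnownToSwing are hypothetical names. On JDKs whose HTML.Tag has no constants for the table section elements, the lookup is expected to fail for thead, tbody, and tfoot while succeeding for table, tr, and td.

```java
import javax.swing.text.html.HTML;

/** Hypothetical probe; not part of edu.nrao.sss.html. */
final class SectionTagSketch {

    /** Returns true if Swing's HTML class has a Tag constant for this name. */
    static boolean isKnownToSwing(String lowerCaseTagName) {
        return HTML.getTag(lowerCaseTagName) != null;
    }

    public static void main(String[] args) {
        String[] names = {"table", "tr", "td", "thead", "tbody", "tfoot"};
        for (String name : names) {
            System.out.println(name + " known: " + isKnownToSwing(name));
        }
    }
}
```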

Since:
2007-03-16
Author:
David M. Harland


Copyright © 2009. All Rights Reserved.