Agent System POND 1.2 (28.2.2002)

FIM.Util.WWW.Form
Class HTMLForm

java.lang.Object
  |
  +--FIM.Util.WWW.Form.HTMLForm
All Implemented Interfaces:
Element, Serializable

public class HTMLForm
extends Object
implements Element, Serializable

Models an HTML form. The built-in form support of HTMLDocument works ONLY when a GUI is used, so it cannot be used directly for handling forms. This class builds on HTMLDocument.
Sample code for parsing a certain form and submitting it:

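import java.io.StringReader;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import FIM.Util.WWW.Form.HTMLForm;

// Parse the page source into an HTMLDocument
StringReader in=new StringReader(srcPageAsString);
HTMLEditorKit editorKit=new HTMLEditorKit();
HTMLDocument doc=(HTMLDocument)editorKit.createDefaultDocument();
// Ignore charset directives in the page; the input is already decoded text
doc.putProperty("IgnoreCharsetDirective",Boolean.TRUE);
editorKit.read(in,doc,0);
// Extract the form with the given name and set its base URL
HTMLForm form=new HTMLForm(doc,nameOfTheFormToParse);
form.setBaseURL(baseURLOfTheForm);
......
// Modify the contained elements
......
// Submit without naming a submitting element; the result is the returned page
String resultPage=form.doSubmit(null);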

The elements of the form are stored in a flat structure; groups (e.g. radio buttons or checkboxes with the same name) are stored separately.

See the HTML 4.01 specification

Version:
1.0, 1.7.2000
Author:
Michael Sonntag
See Also:
Serialized Form

Field Summary
protected  MutableAttributeSet formAttributes
          Attributes of the form.
protected  Vector formElements
          The elements of the form.
protected  String formName
          The name of the form.
protected  Hashtable groups
          The groups of elements within the form.
 
Constructor Summary
HTMLForm(HTMLDocument doc, int nthForm)
          Parses a certain form from a document by its position number (first form=1).
HTMLForm(HTMLDocument doc, String formName)
          Creates a new HTML form by extracting the information from a document.
 
Method Summary
 String doSubmit(ActionElement submitter)
          Submits the form.
 String doSubmit(ActionElement submitter, CookieStore store, URL referer)
          Submits the form.
 URL getActionURL()
          Retrieve the URL where the form is submitted to.
 AttributeSet getAttributes()
          Fetches the collection of attributes this element contains.
 URL getBaseURL()
          Retrieve the base URL of the form.
 Document getDocument()
          Fetches the document associated with this element.
 Element getElement(int index)
          Fetches the child element at the given index.
 int getElementCount()
          Gets the number of child elements contained by this element.
 int getElementIndex(int offset)
          Gets the child element index closest to the given offset.
 FormElement[] getElements(String name)
          Fetches the child elements (form elements) with the given name.
 int getEndOffset()
          Fetches the offset from the beginning of the document that this element ends at.
static HTMLForm getForm(String formPage, int nthForm)
          Creates a new HTML form by extracting the information from a document by its position number (first form=1).
static HTMLForm getForm(String formPage, String formName)
          Creates a new HTML form by extracting the information from a document.
static HTMLForm[] getForms(HTMLDocument doc)
          Parses all forms in the document and returns them as an array.
static HTMLForm[] getForms(String formsPage)
          Parses all forms in the document and returns them as an array.
 ElementGroup getGroup(String groupName)
          Retrieve the group object for a certain name.
 String getMethod()
          Retrieve the method for submitting the form.
 String getName()
          Fetches the name of the form.
 Element getParentElement()
          Fetches the parent element, which is always null, as forms have no parent.
 String getPostText(FormElement elem, int length)
          Retrieve the text visible immediately after an element.
 String getPreText(FormElement elem, int length)
          Retrieve the text visible immediately before an element (50 characters).
 int getStartOffset()
          Fetches the offset from the beginning of the document that this form begins at.
 boolean isDocumentLocalSubmit()
          Check whether this form will be submitted to the same host the document came from.
 boolean isLeaf()
          Is this element a leaf element?
 ElementGroup newGroup(HTMLDocument doc, String groupName, boolean singleSelectionOnly)
          Creates a new group element and adds it to the list of groups.
protected  void parseForm(HTMLDocument.Iterator iter)
          Parses a single form.
 void resetForm()
          Resets the form to its initial state.
 void setBaseURL(URL baseURL)
          Sets the base URL of the form.
 String toString()
          Returns a string representation of the form.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

formName

protected String formName
The name of the form.

formElements

protected Vector formElements
The elements of the form. They are stored in the order in which they are encountered. Elements are stored flat, without any hierarchy; grouping is handled through the separate groups. Only elements that provide data (at least potentially) are included.
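
The storage layout described here can be pictured with the following minimal sketch. It is an illustration only, not the POND implementation: the classes SketchElement and FlatFormModel are hypothetical, generic collections stand in for the Vector and Hashtable fields, and the rule that only radio buttons and checkboxes are grouped is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a form element; not the POND FormElement class.
class SketchElement {
    final String name;
    final String type;
    final String value;

    SketchElement(String name, String type, String value) {
        this.name = name;
        this.type = type;
        this.value = value;
    }
}

class FlatFormModel {
    // All data-carrying elements in document order, without any nesting.
    final List<SketchElement> elements = new ArrayList<>();
    // Elements that belong together (same name), kept apart from the flat list.
    final Map<String, List<SketchElement>> groups = new LinkedHashMap<>();

    void add(SketchElement e) {
        elements.add(e);
        // Assumption: only radio buttons and checkboxes are grouped in this sketch.
        if (e.type.equals("radio") || e.type.equals("checkbox")) {
            groups.computeIfAbsent(e.name, k -> new ArrayList<>()).add(e);
        }
    }

    public static void main(String[] args) {
        FlatFormModel form = new FlatFormModel();
        form.add(new SketchElement("user", "text", ""));
        form.add(new SketchElement("color", "radio", "red"));
        form.add(new SketchElement("color", "radio", "blue"));
        System.out.println(form.elements.size() + " elements, "
                + form.groups.get("color").size() + " in group color");
    }
}
```

Every element appears exactly once in the flat list; a group only adds a second way of reaching the elements that share a name.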

formAttributes

protected MutableAttributeSet formAttributes
Attributes of the form.

groups

protected Hashtable groups
The groups of elements within the form, e.g. all checkboxes with the same name, all radio buttons belonging together, or a group of elements to select from (SELECT element).
Constructor Detail

HTMLForm

public HTMLForm(HTMLDocument doc,
                String formName)
         throws ParseException
Creates a new HTML form by extracting the information from a document. If the document contains multiple forms, the name of the form must be provided; otherwise the first form is returned.
Parameters:
doc - the document in which the form is contained
formName - the value of the "name" attribute of the desired form. If null, the first form encountered is parsed
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)

HTMLForm

public HTMLForm(HTMLDocument doc,
                int nthForm)
         throws ParseException
Parses a certain form from a document by its position number (first form=1).
Parameters:
doc - the HTML document to extract the form from
nthForm - number of the form to return from the document
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)
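
The two ways of selecting a form (by name, or by 1-based position) can be illustrated with the standard Swing HTML parser. The sketch below is not how HTMLForm is implemented; the class FormLocator and its methods are hypothetical and only show how a form number starting at 1 and a null form name map onto a list of forms in document order.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class FormLocator {
    // Collects the "name" attribute of every FORM start tag in document order.
    // Unnamed forms are recorded as null so that positions stay correct.
    static List<String> formNames(String page) {
        final List<String> names = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.FORM) {
                    Object name = attrs.getAttribute(HTML.Attribute.NAME);
                    names.add(name == null ? null : name.toString());
                }
            }
        };
        try {
            // true = ignore charset directives, the input is already a String
            new ParserDelegator().parse(new StringReader(page), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Converts a 1-based form number into a 0-based index, or -1 if out of range.
    static int indexOfNth(List<String> names, int nthForm) {
        return (nthForm >= 1 && nthForm <= names.size()) ? nthForm - 1 : -1;
    }

    // Finds a form by name; a null name selects the first form. Returns -1 if not found.
    static int indexOfName(List<String> names, String formName) {
        if (formName == null) {
            return names.isEmpty() ? -1 : 0;
        }
        return names.indexOf(formName);
    }

    public static void main(String[] args) {
        String page = "<html><body><form name=\"login\" action=\"/login\"></form>"
                + "<form name=\"search\" action=\"/search\"></form></body></html>";
        List<String> names = formNames(page);
        System.out.println("forms found: " + names);
        System.out.println("index of form number 2: " + indexOfNth(names, 2));
    }
}
```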
Method Detail

parseForm

protected void parseForm(HTMLDocument.Iterator iter)
                  throws ParseException
Parses a single form. Creates all the subelements according to the content.
Parameters:
iter - the form to parse (in HTMLDocument style)
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)

getForm

public static HTMLForm getForm(String formPage,
                               String formName)
                        throws ParseException
Creates a new HTML form by extracting the information from a document. If the document contains multiple forms, the name of the form must be provided; otherwise the first form is returned.
Parameters:
formPage - the page, as a String, in which the form is contained
formName - the value of the "name" attribute of the desired form. If null, the first form encountered is parsed
Returns:
the parsed form from the page
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)

getForm

public static HTMLForm getForm(String formPage,
                               int nthForm)
                        throws ParseException
Creates a new HTML form by extracting the information from a document by its position number (first form=1).
Parameters:
formPage - the page, as a String, in which the form is contained
nthForm - number of the form to return from the document
Returns:
the parsed form from the page
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)

getForms

public static HTMLForm[] getForms(String formsPage)
                           throws ParseException
Parses all forms in the document and returns them as an array.
Parameters:
formsPage - the page with the forms to parse
Returns:
an array of all forms in the page (length=0 if no forms found)
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)

getForms

public static HTMLForm[] getForms(HTMLDocument doc)
                           throws ParseException
Parses all forms in the document and returns them as an array.
Parameters:
doc - the document to parse
Returns:
an array of all forms in the document (length=0 if no forms found)
Throws:
ParseException - if an error occurred during parsing (wrong element names, ...)

getMethod

public String getMethod()
Retrieve the method for submitting the form. Should either be "GET" or "POST".
Returns:
method for submitting the form

setBaseURL

public void setBaseURL(URL baseURL)
Sets the base URL of the form.
Parameters:
baseURL - the new base URL for the form

getBaseURL

public URL getBaseURL()
Retrieve the base URL of the form. Returns null if none is specified in the document or set explicitly.
Returns:
the base URL of the form or null
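
The base URL matters because the action attribute of a form is often relative. The following sketch shows how such an action could be resolved against a base URL. It is not taken from HTMLForm: the class ActionResolver is hypothetical, java.net.URI is used in place of URL for the resolution, and the handling of a missing action is an assumption.

```java
import java.net.URI;

class ActionResolver {
    // Resolves the form's action attribute against the base URL of the page.
    // Assumption: an absent or empty action submits to the page itself.
    static URI resolveAction(URI baseUrl, String action) {
        if (action == null || action.isEmpty()) {
            return baseUrl;
        }
        return baseUrl.resolve(action);
    }

    public static void main(String[] args) {
        URI base = URI.create("http://www.example.org/shop/order.html");
        System.out.println(resolveAction(base, "submit.cgi"));
        System.out.println(resolveAction(base, "/cgi-bin/submit"));
    }
}
```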

isDocumentLocalSubmit

public boolean isDocumentLocalSubmit()
Check whether this form will be submitted to the same host the document came from.
Returns:
true if the form is sent to the same host it is from
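
One plausible reading of this check is a comparison of host names. The sketch below assumes that only the host is compared, case-insensitively, and that scheme and port are ignored; the actual rule used by HTMLForm is not documented here, and the class LocalSubmitCheck is hypothetical.

```java
import java.net.URI;

class LocalSubmitCheck {
    // True if the action URL points to the same host the page was loaded from.
    // Assumption: only host names are compared; scheme and port are ignored.
    static boolean isSameHost(URI documentUrl, URI actionUrl) {
        String docHost = documentUrl.getHost();
        String actionHost = actionUrl.getHost();
        return docHost != null && docHost.equalsIgnoreCase(actionHost);
    }

    public static void main(String[] args) {
        URI page = URI.create("http://www.example.org/form.html");
        System.out.println(isSameHost(page, URI.create("http://www.example.org/cgi-bin/send")));
        System.out.println(isSameHost(page, URI.create("http://tracker.example.com/collect")));
    }
}
```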

getActionURL

public URL getActionURL()
Retrieve the URL where the form is submitted to. Does NOT include the parameters if the method for submitting is GET.
Returns:
the URL to send the form to
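
For a GET submission the form data travels in the query string, which is why it is not part of the action URL itself. The sketch below shows how such a request URL could be assembled from an action URL and the field values. The class GetQueryBuilder is hypothetical, and the use of UTF-8 for encoding is an assumption; the charset actually used by doSubmit is not stated in this documentation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class GetQueryBuilder {
    // URL-encodes one name or value (spaces become '+').
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Appends the fields as name=value pairs to the action URL.
    static String buildGetUrl(String actionUrl, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder(actionUrl);
        // Continue an existing query string with '&', otherwise start one with '?'.
        char separator = actionUrl.indexOf('?') >= 0 ? '&' : '?';
        for (Map.Entry<String, String> field : fields.entrySet()) {
            sb.append(separator)
              .append(encode(field.getKey()))
              .append('=')
              .append(encode(field.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("q", "agent systems");
        fields.put("lang", "de");
        System.out.println(buildGetUrl("http://www.example.org/search", fields));
    }
}
```

With POST the same encoded pairs would go into the request body instead, and the request URL would stay equal to the action URL.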

resetForm

public void resetForm()
Resets the form to its initial state. All elements are reset.

doSubmit

public String doSubmit(ActionElement submitter)
                throws IOException
Submits the form. The parameter identifies the element causing this action to happen (e.g. when multiple submit buttons are used, only the one invoked is passed along). Uses a local CookieStore and no referer.
Parameters:
submitter - the element causing the form to submit or null
Returns:
the page the server sent in response to the submission
Throws:
IOException - if an error occurred during submitting the form or receiving the response
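
The role of the submitter can be shown with a small sketch: all ordinary fields are sent, but of the submit buttons only the one that triggered the submission contributes its value. The classes below are hypothetical stand-ins and do not use the POND ActionElement or FormElement types; encoding of names and values is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubmitData {
    // Hypothetical description of a form control.
    static final class Control {
        final String name;
        final String value;
        final boolean submitButton;

        Control(String name, String value, boolean submitButton) {
            this.name = name;
            this.value = value;
            this.submitButton = submitButton;
        }
    }

    // Returns the name=value pairs to send. Submit buttons other than the
    // submitter are skipped; a null submitter means no button is sent at all.
    static List<String> pairsToSend(List<Control> controls, Control submitter) {
        List<String> pairs = new ArrayList<>();
        for (Control c : controls) {
            if (c.submitButton && c != submitter) {
                continue;
            }
            pairs.add(c.name + "=" + c.value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        Control user = new Control("user", "bob", false);
        Control save = new Control("action", "save", true);
        Control delete = new Control("action", "delete", true);
        List<Control> controls = Arrays.asList(user, save, delete);
        System.out.println(pairsToSend(controls, delete));
        System.out.println(pairsToSend(controls, null));
    }
}
```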

doSubmit

public String doSubmit(ActionElement submitter,
                       CookieStore store,
                       URL referer)
                throws IOException
Submits the form. The submitter parameter identifies the element causing this action to happen (e.g. when multiple submit buttons are used, only the one invoked is passed along).
Parameters:
submitter - the element causing the form to submit or null
store - the cookies to send/receive
referer - the referer for this form
Returns:
the page the server sent in response to the submission
Throws:
IOException - if an error occurred during submitting the form or receiving the response

getPreText

public String getPreText(FormElement elem,
                         int length)
Retrieve the text visible immediately before an element (50 characters).
Parameters:
elem - the element to get the surrounding text for
length - the length of the text to get
Returns:
the text before the element

getPostText

public String getPostText(FormElement elem,
                          int length)
Retrieve the text visible immediately after an element.
Parameters:
elem - the element to get the surrounding text for
length - the length of the text to get
Returns:
the text after the element
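
The effect of the length parameter in getPreText and getPostText can be sketched on a plain string. This is an illustration under assumptions, not the actual implementation: the class SurroundingText is hypothetical, the element is represented only by its start and end offsets in the visible text, and results are simply cut short at the boundaries of the text.

```java
class SurroundingText {
    // Up to 'length' characters that end directly before startOffset.
    static String preText(String text, int startOffset, int length) {
        int from = Math.max(0, startOffset - length);
        return text.substring(from, startOffset);
    }

    // Up to 'length' characters that begin directly at endOffset.
    static String postText(String text, int endOffset, int length) {
        int to = Math.min(text.length(), endOffset + length);
        return text.substring(endOffset, to);
    }

    public static void main(String[] args) {
        // The element is assumed to occupy offsets 6 (inclusive) to 13 (exclusive).
        String text = "Name: [field] (required)";
        System.out.println("before: '" + preText(text, 6, 50) + "'");
        System.out.println("after:  '" + postText(text, 13, 50) + "'");
    }
}
```

Text of this kind is typically what a human reader would see as the label of a field, which makes it useful for guessing the meaning of an element.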

getDocument

public Document getDocument()
Fetches the document associated with this element.
Specified by:
getDocument in interface Element
Returns:
the document

getParentElement

public Element getParentElement()
Fetches the parent element, which is always null, as forms have no parent.
Specified by:
getParentElement in interface Element
Returns:
always null

getName

public String getName()
Fetches the name of the form.
Specified by:
getName in interface Element
Returns:
the element name

getAttributes

public AttributeSet getAttributes()
Fetches the collection of attributes this element contains.
Specified by:
getAttributes in interface Element
Returns:
the attributes for the element

getStartOffset

public int getStartOffset()
Fetches the offset from the beginning of the document that this form begins at.
Specified by:
getStartOffset in interface Element
Returns:
the starting offset >= 0

getEndOffset

public int getEndOffset()
Fetches the offset from the beginning of the document that this element ends at.
Specified by:
getEndOffset in interface Element
Returns:
the ending offset >= 0

getElementIndex

public int getElementIndex(int offset)
Gets the child element index closest to the given offset. The offset is specified relative to the beginning of the document, not the beginning of the form.
Specified by:
getElementIndex in interface Element
Parameters:
offset - the specified offset >= 0
Returns:
the element index >= 0
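
What "closest" means is not spelled out here. One possible interpretation is sketched below: an element that contains the offset wins, otherwise the element with the smallest distance to the offset is chosen. The class ClosestElement, the representation of children as offset arrays, and the return value of -1 for a form without elements are all assumptions of this sketch.

```java
class ClosestElement {
    // starts[i] (inclusive) and ends[i] (exclusive) are the document offsets of child i.
    static int closestIndex(int[] starts, int[] ends, int offset) {
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        for (int i = 0; i < starts.length; i++) {
            int distance;
            if (offset < starts[i]) {
                distance = starts[i] - offset;      // offset lies before the child
            } else if (offset >= ends[i]) {
                distance = offset - ends[i] + 1;    // offset lies behind the child
            } else {
                distance = 0;                       // offset lies inside the child
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] starts = {100, 150, 220};
        int[] ends = {120, 180, 240};
        // All offsets are counted from the beginning of the document.
        System.out.println(closestIndex(starts, ends, 160));
        System.out.println(closestIndex(starts, ends, 130));
    }
}
```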

getElementCount

public int getElementCount()
Gets the number of child elements contained by this element.
Specified by:
getElementCount in interface Element
Returns:
the number of child elements >= 0

getElement

public Element getElement(int index)
Fetches the child element at the given index.
Specified by:
getElement in interface Element
Parameters:
index - the specified index >= 0
Returns:
the child element

getElements

public FormElement[] getElements(String name)
Fetches the child elements (form elements) with the given name. May return multiple elements (e.g. for radio buttons, all elements of the group are returned, but not the group itself). Comparison of the name is case-insensitive.
Parameters:
name - the specified name (if null all elements are returned)
Returns:
the child elements (array of length 0 if none found)
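
The lookup rules stated above (case-insensitive match, null returns everything, empty array when nothing matches) can be sketched as follows. NamedField and NameLookup are hypothetical classes used only for this illustration; they are not part of POND.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a form element with a name and a value.
class NamedField {
    final String name;
    final String value;

    NamedField(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

class NameLookup {
    // Returns all fields whose name matches, ignoring case.
    // A null name returns every field; no match yields an empty array.
    static NamedField[] byName(List<NamedField> all, String name) {
        List<NamedField> hits = new ArrayList<>();
        for (NamedField field : all) {
            if (name == null || name.equalsIgnoreCase(field.name)) {
                hits.add(field);
            }
        }
        return hits.toArray(new NamedField[0]);
    }

    public static void main(String[] args) {
        List<NamedField> fields = Arrays.asList(
                new NamedField("Color", "red"),
                new NamedField("color", "blue"),
                new NamedField("user", "bob"));
        System.out.println(byName(fields, "COLOR").length + " fields named color");
    }
}
```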

isLeaf

public boolean isLeaf()
Is this element a leaf element?
Specified by:
isLeaf in interface Element
Returns:
true if a leaf element (no elements in the form) else false

getGroup

public ElementGroup getGroup(String groupName)
Retrieve the group object for a certain name.
Parameters:
groupName - the name of the group
Returns:
the group of elements

newGroup

public ElementGroup newGroup(HTMLDocument doc,
                             String groupName,
                             boolean singleSelectionOnly)
Creates a new group element and adds it to the list of groups. If a group with this name already exists, the existing group is returned and no new one is created. Groups can only contain subclasses of SelectableElement.
Parameters:
doc - the document this group belongs to
groupName - the name of the group
singleSelectionOnly - if true, only one element of the group may be selected at any time.
Returns:
the created group
See Also:
SelectableElement
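
The behaviour described for newGroup (return the existing group if the name is already known, otherwise create one) together with the meaning of singleSelectionOnly can be sketched like this. SketchGroup and GroupRegistry are hypothetical classes; the sketch stores selected values as plain strings rather than SelectableElement objects, and it assumes that an existing group keeps its original singleSelectionOnly setting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a group of selectable elements.
class SketchGroup {
    final String name;
    final boolean singleSelectionOnly;
    final List<String> selected = new ArrayList<>();

    SketchGroup(String name, boolean singleSelectionOnly) {
        this.name = name;
        this.singleSelectionOnly = singleSelectionOnly;
    }

    // Selects a value; in a single-selection group the previous selection is dropped.
    void select(String value) {
        if (singleSelectionOnly) {
            selected.clear();
        }
        if (!selected.contains(value)) {
            selected.add(value);
        }
    }
}

class GroupRegistry {
    private final Map<String, SketchGroup> groups = new HashMap<>();

    // Returns the existing group with this name, or creates and registers a new one.
    SketchGroup newGroup(String groupName, boolean singleSelectionOnly) {
        SketchGroup existing = groups.get(groupName);
        if (existing != null) {
            return existing; // assumption: the existing group's settings are kept
        }
        SketchGroup created = new SketchGroup(groupName, singleSelectionOnly);
        groups.put(groupName, created);
        return created;
    }

    public static void main(String[] args) {
        GroupRegistry registry = new GroupRegistry();
        SketchGroup radio = registry.newGroup("color", true);
        radio.select("red");
        radio.select("blue");
        System.out.println("selected in radio group: " + radio.selected);
    }
}
```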

toString

public String toString()
Returns a string representation of the form.
Overrides:
toString in class Object
Returns:
string representation of the form


Copyright 2001,2002 Michael Sonntag & Institute for Information Processing and Microprocessor Technology (FIM), Johannes-Kepler-University Linz, Altenbergerstr. 69, A-4040 Linz, Austria.