FIM.Util.WWW
Class WWWPageUtils
java.lang.Object
|
+--FIM.Util.WWW.WWWPageUtils
- public class WWWPageUtils
- extends Object
Utility functions for managing WWW-Pages. Allows retrieving webpages and posting forms. Includes
support for using cookies.
- Version:
- 1.0, 1.7.2000
- Author:
- Michael Sonntag
Field Summary |
static int |
MAX_REDIRECT_DEPTH
Maximum number of redirects to follow (MAX_REDIRECT_DEPTH+1 will be the final page). |
Method Summary |
static String |
canonicalize(String str)
Takes a string of HTML as input and canonicalizes it. |
static String |
fetchForm(String page,
CookieStore cookies,
String request,
URL referer,
boolean post)
Send a form and retrieve the response. |
static String |
fetchForm(URL page,
CookieStore cookies,
String request,
URL referer,
boolean post)
Send a form and retrieve the response. |
static String |
fetchPage(String page,
CookieStore cookies)
Retrieve a page and return it as a single string. |
static String |
fetchPage(URL page,
CookieStore cookies)
Retrieve a page and return it as a single string. |
static String |
stripTags(String str)
Strips all tags from a string. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
MAX_REDIRECT_DEPTH
public static final int MAX_REDIRECT_DEPTH
- Maximum number of redirects to follow (MAX_REDIRECT_DEPTH+1 will be the final page).
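The redirect limit can be pictured with a small, self-contained sketch. It is not the class's actual implementation: the class name `RedirectLimitSketch`, the value 5, and the `nextLocation` lookup function are assumptions made for illustration, and no network access is involved.

```java
import java.util.function.Function;

final class RedirectLimitSketch {
    // Hypothetical value; the real value of the constant is not stated in this page.
    static final int MAX_REDIRECT_DEPTH = 5;

    /**
     * Follows redirects starting at {@code start}.
     * {@code nextLocation} returns the redirect target of a page,
     * or null if the page does not redirect any further.
     */
    static String resolve(String start, Function<String, String> nextLocation) {
        String current = start;
        for (int hops = 0; hops < MAX_REDIRECT_DEPTH; hops++) {
            String target = nextLocation.apply(current);
            if (target == null) {
                return current; // no further redirect: this is the final page
            }
            current = target;
        }
        // After MAX_REDIRECT_DEPTH redirects, page number MAX_REDIRECT_DEPTH+1
        // is treated as the final page, even if it would redirect again.
        return current;
    }
}
```

With a lookup that always redirects, the loop still terminates after MAX_REDIRECT_DEPTH hops, which is the point of having such a limit.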
WWWPageUtils
public WWWPageUtils()
fetchPage
public static String fetchPage(String page,
CookieStore cookies)
throws MalformedURLException,
IOException
- Retrieve a page and return it as a single string. Lines are separated by '\n'.
- Parameters:
page - the page to fetch
cookies - the CookieStore to use (if null, cookies will be ignored and none given out)
- Returns:
- the page as a single string
- Throws:
IOException - if an error occurred while fetching the page
MalformedURLException - if the string could not be converted to a URL
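How a page might be collapsed into a single string with '\n' between lines is sketched below. This is an illustration only: `ReadAllLinesSketch` and `readAll` are hypothetical names, the input is any `Reader` rather than a live connection, and leaving out a trailing '\n' after the last line is an assumption, since the description only says that lines are separated by '\n'.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

final class ReadAllLinesSketch {
    /** Reads all lines from the reader and joins them with '\n'. */
    static String readAll(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder page = new StringBuilder();
        String line;
        boolean first = true;
        // readLine() accepts "\n", "\r" and "\r\n" as line terminators,
        // so the result uses '\n' regardless of what the server sent.
        while ((line = reader.readLine()) != null) {
            if (!first) {
                page.append('\n');
            }
            page.append(line);
            first = false;
        }
        return page.toString();
    }
}
```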
fetchPage
public static String fetchPage(URL page,
CookieStore cookies)
throws IOException
- Retrieve a page and return it as a single string. Lines are separated by '\n'.
Must be synchronized, because fetchPage and fetchForm both call HttpURLConnection.setFollowRedirects(), which is a static method and therefore affects all connections.
- Parameters:
page - the page to fetch
cookies - the CookieStore to use (if null, cookies will be ignored and none given out)
- Returns:
- the page as a single string
- Throws:
IOException - if an error occurred while fetching the page
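The reason for the synchronization can be illustrated with a minimal sketch. It assumes that the global flag is switched off for the duration of a fetch and restored afterwards; the class name `FollowRedirectsGuardSketch` and the method `withoutAutoRedirects` are hypothetical and not part of this API.

```java
import java.net.HttpURLConnection;
import java.util.function.Supplier;

final class FollowRedirectsGuardSketch {
    /**
     * Runs {@code work} while automatic redirects are disabled.
     * The method is synchronized because setFollowRedirects() changes
     * a setting shared by every HttpURLConnection in the JVM.
     */
    static synchronized <T> T withoutAutoRedirects(Supplier<T> work) {
        boolean previous = HttpURLConnection.getFollowRedirects();
        HttpURLConnection.setFollowRedirects(false);
        try {
            return work.get();
        } finally {
            // Restore the previous global setting, even if work fails.
            HttpURLConnection.setFollowRedirects(previous);
        }
    }
}
```

Without the lock, two threads could interleave their changes to the shared flag and one of them could fetch with the wrong redirect setting.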
fetchForm
public static String fetchForm(String page,
CookieStore cookies,
String request,
URL referer,
boolean post)
throws MalformedURLException,
IOException
- Send a form and retrieve the response.
- Parameters:
page - the URL to post the form to. Must include the full action (e.g. "http://www.acme.com/doIt"), but not the parameters (e.g. "?do=action&param=now+and+then")
cookies - the CookieStore to use (if null, cookies will be ignored and none given out)
request - the request of the page (i.e. the content of the inputs; e.g. "?do=action&param=now+and+then")
referer - the URL of the page containing the form (request property "Referer"); not set if null
post - if true, the submit method is POST, otherwise GET
- Returns:
- the page as a single string
- Throws:
IOException - if an error occurred while fetching the page
MalformedURLException - if the string could not be converted to a URL
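A request string of the form shown in the parameter description could be assembled roughly as follows. This is a sketch under assumptions: `FormRequestSketch` and `encode` are hypothetical helpers, UTF-8 is an assumed encoding, and the result has no leading "?", which the documented example shows and which a caller may need to prepend.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

final class FormRequestSketch {
    /** Encodes form inputs as name=value pairs joined by '&'. */
    static String encode(Map<String, String> inputs) {
        StringBuilder request = new StringBuilder();
        try {
            for (Map.Entry<String, String> input : inputs.entrySet()) {
                if (request.length() > 0) {
                    request.append('&');
                }
                // URLEncoder turns spaces into '+' and escapes reserved characters.
                request.append(URLEncoder.encode(input.getKey(), "UTF-8"))
                       .append('=')
                       .append(URLEncoder.encode(input.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
        return request.toString();
    }
}
```

Passing a map with insertion order (for example a `LinkedHashMap`) keeps the inputs in the order in which they appear in the form.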
fetchForm
public static String fetchForm(URL page,
CookieStore cookies,
String request,
URL referer,
boolean post)
throws IOException
- Send a form and retrieve the response.
- Parameters:
page - the URL to post the form to. Must include the full action (e.g. "http://www.acme.com/doIt"), but not the parameters (e.g. "?do=action&param=now+and+then")
cookies - the CookieStore to use (if null, cookies will be ignored and none given out)
request - the request of the page (i.e. the content of the inputs; e.g. "?do=action&param=now+and+then")
referer - the URL of the page containing the form (request property "Referer"); not set if null
post - if true, the submit method is POST, otherwise GET
- Returns:
- the page as a single string
- Throws:
IOException - if an error occurred while fetching the page
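The difference between the two submit methods can be sketched with a plain HttpURLConnection. This is an assumption about one possible setup, not the class's actual code: `FormConnectionSketch` and `prepare` are hypothetical, the request is assumed to start with "?" as in the documented example, and the sketch only configures the connection without sending anything.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

final class FormConnectionSketch {
    /** Configures (but does not open) a connection for a form submission. */
    static HttpURLConnection prepare(URL action, String request, URL referer, boolean post)
            throws IOException {
        // GET: the request is appended to the action URL.
        // POST: the action URL stays as it is; the request would go into the body.
        URL target = post ? action : URI.create(action.toString() + request).toURL();
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setRequestMethod(post ? "POST" : "GET");
        if (referer != null) {
            conn.setRequestProperty("Referer", referer.toString());
        }
        if (post) {
            conn.setDoOutput(true); // allows writing the form data to the body later
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        }
        return conn;
    }
}
```

For a POST, the caller would still have to write the form data to `conn.getOutputStream()` before reading the response; that step is left out here because it requires a reachable server.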
stripTags
public static String stripTags(String str)
- Strips all tags from a string. This removes everything between '<' and '>'.
Converts <br>, <p> and </p> to "\n" and removes all other linebreaks and tabs.
Also converts special characters (&nbsp; = " ", &auml; = "ä", ...) and types of " ".
- Returns:
- the input string stripped of all HTML tags
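The described behaviour can be approximated with a few regular expressions. The following is a minimal sketch, not the actual implementation: `StripTagsSketch` is a hypothetical class, the order of the steps is an assumption, and only a small, assumed subset of entities is handled.

```java
final class StripTagsSketch {
    static String stripTags(String html) {
        // 1. Remove linebreaks and tabs that are already in the input.
        String s = html.replaceAll("[\\r\\n\\t]", "");
        // 2. Turn <br>, <p> and </p> into "\n" (case-insensitive, no attributes).
        s = s.replaceAll("(?i)<br\\s*/?>|</?p\\s*>", "\n");
        // 3. Remove everything else between '<' and '>'.
        s = s.replaceAll("<[^>]*>", "");
        // 4. Convert a few entities; this list is an assumption and far from complete.
        //    &amp; is handled last so that "&amp;nbsp;" is not decoded twice.
        return s.replace("&nbsp;", " ")
                .replace("&auml;", "\u00e4")
                .replace("&quot;", "\"")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&amp;", "&");
    }
}
```

In this sketch a paragraph tag with attributes, such as `<p class="x">`, is removed in step 3 without producing a linebreak; whether the real method behaves the same way is not stated.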
canonicalize
public static String canonicalize(String str)
- Takes a string of HTML as input and canonicalizes it.
All linebreaks and multiple spaces within tags are removed.
E.g. "<\n / p>" will be changed to "</p>".
- Parameters:
str - the input string
- Returns:
- the canonicalized string
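One way to obtain the result from the example is sketched below. It is an interpretation, not the actual implementation: `CanonicalizeSketch` is a hypothetical class, and the choice to collapse whitespace inside a tag to single spaces while dropping it next to '<', '/' and '>' is an assumption that happens to reproduce the documented example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CanonicalizeSketch {
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    static String canonicalize(String html) {
        Matcher m = TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String tag = m.group()
                    // linebreaks and runs of whitespace become a single space
                    .replaceAll("\\s+", " ")
                    // no space after '<' or around a leading '/'
                    .replaceAll("^< ?(/?) ?", "<$1")
                    // no space before '>' or around a trailing '/'
                    .replaceAll(" ?(/?) ?>$", "$1>");
            m.appendReplacement(out, Matcher.quoteReplacement(tag));
        }
        m.appendTail(out); // text after the last tag is copied unchanged
        return out.toString();
    }
}
```

Text outside of tags is left untouched in this sketch, so multiple spaces between words in the page content survive.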
Copyright 2001,2002 Michael Sonntag & Institute for Information Processing and Microprocessor Technology (FIM), Johannes-Kepler-University Linz, Altenbergerstr. 69, A-4040 Linz, Austria.