Having trouble?
Try the newer version at the University of Alberta:
http://taporware.ualberta.ca

Summarizer
Summary

This tool creates a summary of statistical information on a given document. It enables the user to select what types of information to display in the summary. Options include high frequency words, sentences with high frequency words, high frequency word context, collocation and element/text distribution.

For more details, see here.

Walkthrough

Example: fetch HTML from http://www.w3.org/; extract the text between <body> and </body>; then list the top five high frequency words found in the resulting text, the sentences that contain one or more of those words, the five words before and after the first occurrence of each high frequency word, and the collocates of the high frequency words that are no more than one word apart.
  1. Source text
    1. Enter `http://www.w3.org/' in the Source URL field.
  2. Subtext limited to
    1. Enter `body' in the Elements field.
  3. Summary limited to
    1. Check List top 10 frequency words and set 10 to 5;
    2. select From all words from the radio button group below;
    3. check all remaining check boxes and leave the other fields at their default values.
  4. Results
    1. Select By Frequency from the Sort drop-down menu.
» Source text *
  Example: http://www.w3.org/
Summary

Determines the HTML source. HTML can be obtained from a URL or by uploading a file.

Fields

Source URL
HTML from the entered URL will be used as the data source for the analysis.

Local file
Use this field to upload a local HTML file for analysis.
» Subtext limited to
(separate multiple elements with a `,')
Summary

Limits the analyzed text to text that appears within the specified tag(s). Multiple tags should be delimited by commas. Leaving this field empty includes all text in the document.

Fields

Elements
The text extraction will be restricted to the tag(s) entered here. Multiple tags should be separated by commas.
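
As a rough illustration of this kind of restriction (not the tool's own code), the sketch below uses Python's standard html.parser module to keep only text found inside the listed tags. The names ElementTextExtractor and extract_text are hypothetical, and void tags such as br are assumed not to appear in the element list.

```python
from html.parser import HTMLParser


class ElementTextExtractor(HTMLParser):
    """Collect text that appears inside any of the requested tags."""

    def __init__(self, elements=""):
        super().__init__()
        # "body, p" -> {"body", "p"}; an empty string means no restriction.
        self.wanted = {e.strip().lower() for e in elements.split(",") if e.strip()}
        self.depth = 0      # number of requested tags currently open
        self.chunks = []    # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.wanted and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep everything when no elements were given, otherwise only
        # text nested inside at least one requested tag.
        if not self.wanted or self.depth > 0:
            self.chunks.append(data)


def extract_text(html, elements=""):
    """Return whitespace-normalized text limited to the given tags."""
    parser = ElementTextExtractor(elements)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

Calling extract_text(page, "body") would correspond to entering `body' in the Elements field, and extract_text(page) to leaving the field empty.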
» Summary limited to
(separate words by `,')
Summary

Determines which types of information will be presented in the results.

Fields

List top n frequency words
Selecting this option includes a list of the n most frequent words in the results.

From all words
Takes into account all words in the document when searching for high frequency words.

Matching pattern
Allows the user to define a pattern that will limit which words will be accounted for when searching for high frequency words.

From word list
Limits the words accounted for when searching for high frequency words to those present in a user-defined word list.

Not from word list
Limits the words accounted for when searching for high frequency words to those not present in a user-defined word list.

Type in
Allows the user to manually enter a list of words that will be referred to when From word list or Not from word list is selected in the above options.

From local file
Allows the user to upload a local file containing a list of words which will be referred to when From word list or Not from word list is selected in the above options.
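
The four word-selection options above can be pictured with a small Python sketch. It is only an approximation under stated assumptions: the function name top_words, the tokenizing expression, the lower-casing of words, and the use of Python regular expressions for Matching pattern are all hypothetical, since the tool's own pattern syntax and tokenization are not described here.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, digits or apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9']+")


def top_words(text, n=10, pattern=None, word_list=None, exclude=False):
    """Return the n most frequent words as (word, count) pairs.

    pattern   -- optional regular expression a word must fully match
    word_list -- optional iterable of words
    exclude   -- False keeps only listed words ("From word list");
                 True drops listed words ("Not from word list")
    """
    words = [w.lower() for w in WORD_RE.findall(text)]
    if pattern is not None:
        rx = re.compile(pattern)
        words = [w for w in words if rx.fullmatch(w)]
    if word_list is not None:
        listed = {w.strip().lower() for w in word_list}
        # Keep a word when its membership in the list differs from `exclude`.
        words = [w for w in words if (w in listed) != exclude]
    return Counter(words).most_common(n)
```

With only text and n supplied, the function behaves like From all words; passing word_list with exclude=True is one way to filter out a stop list.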

List sentences that have n or more high frequency words
Selecting this option will display sentences containing n or more high frequency words.
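
A minimal Python sketch of this filter follows. The name sentences_with_words is hypothetical, sentences are split naively at `.', `!' or `?' followed by whitespace, and "n or more" is assumed to mean n distinct high frequency words; the tool itself may count differently.

```python
import re


def sentences_with_words(text, high_freq_words, n=1):
    """Return sentences containing at least n distinct high frequency words."""
    targets = {w.lower() for w in high_freq_words}
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        words = {w.lower() for w in re.findall(r"[A-Za-z0-9']+", sentence)}
        if len(targets & words) >= n:
            hits.append(sentence)
    return hits
```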

For each high frequency word, list (first|first three|all) context(s) with context length of n words before and after
Displays high frequency words within their found context (first, first three or all occurrences) with n words before and after each match.
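
The context listing resembles a keyword-in-context display. The Python sketch below shows one possible reading; the function name contexts and the limit parameter (1 for first, 3 for first three, None for all) are hypothetical, and matching is assumed to be case-insensitive.

```python
import re


def contexts(text, word, n=5, limit=1):
    """Return up to `limit` contexts of `word` with n words on either side.

    limit=1 gives the first context, limit=3 the first three, None all.
    """
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    results = []
    for i, token in enumerate(tokens):
        if token.lower() == word.lower():
            before = tokens[max(0, i - n):i]   # fewer than n near the start
            after = tokens[i + 1:i + 1 + n]    # fewer than n near the end
            results.append(" ".join(before + [token] + after))
            if limit is not None and len(results) >= limit:
                break
    return results
```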

List collocation within n words of the high frequency words
Lists collocates of high frequency words that appear no more than n words apart.
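
One way to read "no more than n words apart" is as a window of n positions on either side of each occurrence. The Python sketch below counts collocates under that assumption; the name collocates is hypothetical, and the tool may measure distance differently.

```python
import re
from collections import Counter


def collocates(text, word, n=1):
    """Count the words that occur within n positions of `word`."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9']+", text)]
    target = word.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            # Up to n words before and n words after this occurrence.
            window = tokens[max(0, i - n):i] + tokens[i + 1:i + 1 + n]
            counts.update(w for w in window if w != target)
    return counts
```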

Elements against text distribution
Displays the number of occurrences of each element and the average number of words found within each unique element.
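
The distribution can be approximated with Python's standard html.parser module, as in the sketch below. The names ElementDistribution and element_distribution are hypothetical, and the sketch assumes that words are credited to every enclosing element; how the tool treats nested elements is not stated here.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never enclose text, so they are not tracked as open.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ElementDistribution(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []                      # currently open tags
        self.occurrences = defaultdict(int)  # tag -> times it was opened
        self.words = defaultdict(int)        # tag -> words found inside it

    def handle_starttag(self, tag, attrs):
        self.occurrences[tag] += 1
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        count = len(re.findall(r"[A-Za-z0-9']+", data))
        # Credit the words once to each distinct enclosing tag name.
        for tag in set(self.stack):
            self.words[tag] += count


def element_distribution(html):
    """Map each tag to (occurrences, average words per occurrence)."""
    parser = ElementDistribution()
    parser.feed(html)
    parser.close()
    return {tag: (n, parser.words[tag] / n)
            for tag, n in parser.occurrences.items()}
```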
» Results
Summary

Allows the user to choose how the results will be formatted and whether they should be displayed in a new browser window.

Fields

Sort
Sets the order in which the results are listed, for example By Frequency.

Display as
Determines the format in which results will be delivered.

Open results in new window
Checking this box will display the results in a new window. This option is selected by default. Pop-up blockers may prevent the new window from opening; if that happens, deselect this option.
`*' indicates a required field

TAPoRware Project, McMaster University,