Having trouble?
Try the newer version at the University of Alberta:
http://taporware.ualberta.ca

Aggregate text from different sources
Summary

This tool aggregates texts or subtexts from different locations into a single text. The original texts can come from the internet, from your local machine, or both. Aggregating subtexts requires all documents to share a common subtext tag; for example, limiting the subtext to body requires every text to have a <body> tag. The aggregator tool extracts the contents of the texts and combines them into a single new text.

Walkthrough

Example: fetch HTML from http://www.w3.org/, http://www.cnn.com/ and http://www.kuro5hin.org/, extract the text found between <body> and </body>, and strip the HTML tags from the results.
  1. Source text
    1. Select the URL(s) option and enter `http://www.w3.org/', `http://www.cnn.com/' and `http://www.kuro5hin.org/' in the text area below the radio button, one URL per line.
  2. Subtext limited to
    1. Enter `body' in the Elements field.
  3. How to handle markup
    1. Select the Strip tags option. This ensures that all tags are stripped from the source text.
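
The walkthrough above can be approximated offline. The following is a minimal sketch in Python, not the tool's actual implementation: the names `ElementTextExtractor`, `extract_text`, `aggregate` and `fetch` are hypothetical, and collapsing whitespace and joining texts with a blank line are assumptions made for readability.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class ElementTextExtractor(HTMLParser):
    """Collect the text found inside one element, dropping all tags."""

    def __init__(self, element):
        super().__init__(convert_charrefs=True)
        self.element = element.lower()  # HTMLParser reports tag names in lowercase
        self.depth = 0                  # greater than 0 while inside the element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == self.element:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.element and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Note: text inside <script> and <style> is also reported here.
        if self.depth > 0:
            self.chunks.append(data)


def extract_text(html, element="body"):
    """Return the tag-free text inside `element` (assumed: whitespace collapsed)."""
    parser = ElementTextExtractor(element)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def aggregate(documents, element="body"):
    """Join the extracted subtexts into one text (assumed: blank-line separator)."""
    return "\n\n".join(extract_text(doc, element) for doc in documents)


def fetch(url):
    """Download a page as text (assumed: UTF-8, undecodable bytes replaced)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

With network access, `aggregate([fetch(u) for u in urls])` would mirror the three steps: the URL list is the source text, `element="body"` limits the subtext, and the parser discards the markup.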
» Source text *
Summary

Determines the texts that will be aggregated into a single document. Documents can be on the web or uploaded.

Fields

Source text
Determines the texts that will be used in the aggregate.

URLs
A list of URLs from which text will be fetched. Put each URL on its own line.

Local file with list of links
Local file containing a list of URLs. URLs in the file should be delimited by new lines.

Plus
Allows you to upload a local file for inclusion in the aggregate.
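
Both the URLs field and the local file of links use a newline-delimited list. As a rough illustration of that format, the sketch below reads such a list in Python. The function name `parse_url_list` is hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions; the tool itself may treat them differently.

```python
def parse_url_list(text):
    """Return the URLs in a newline-delimited list.

    Assumed behavior: blank lines are skipped and surrounding
    whitespace is trimmed from each entry.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


# The same three addresses used in the walkthrough, one per line.
urls = parse_url_list(
    "http://www.w3.org/\nhttp://www.cnn.com/\nhttp://www.kuro5hin.org/\n"
)
```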
» How to handle markup
Summary

Determines whether tags are stripped or retained when texts are aggregated.

Fields

Strip tags
Selecting this option will strip all tags from the component texts.

Smart strip selected tags to form XML Corpus
Selecting this option causes non-XML texts to be commented out.
» Results
Summary

Allows the user to choose how the results will be formatted and whether they should be displayed in a new browser window.

Fields

Display as
Determines the format in which results will be delivered.

Open results in new window
Checking this box displays the results in a new window. This option is selected by default. If a pop-up blocker prevents the new window from opening, deselect this option.
`*' indicates a required field.

TAPoRware Project, McMaster University,