Having trouble?
Try the newer version at the University of Alberta:
http://taporware.ualberta.ca

Find Text — Co-occurrence
Summary

This tool finds two patterns that occur within a specified distance of one another. Given a primary pattern and a co-pattern, TAPoR searches the document for places where both appear within the user-specified number of words, sentences, or lines. If desired, the results can be narrowed to text found within certain tags.

Note: When "Paragraphs" is selected as the context, the element name entered by the user is overwritten by the "p" tag. This can be confusing, because some HTML texts do not use the "p" tag to mark paragraphs. In that case, please use one of the other options.

Walkthrough

Example: fetch HTML from http://www.w3.org/; extract text found between the <body> and </body> tags; limit the results to strings of text that contain both `web' and `W3C', where both words are found no more than ten words apart.
  1. Source text
    1. Enter `http://www.w3.org/' in the URL field.
  2. Subtext limited to
    1. Enter `body' in the Elements field.
  3. What to find
    1. Enter `web' in the Primary pattern field;
    2. enter `W3C' in the Co-pattern field.
  4. Context for concordance
    1. From the Context drop-down menu, select Words;
    2. enter `10' in the Context length field.
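The kind of search this walkthrough describes can be sketched in a few lines of Python. This is only an illustration and not TAPoR's own implementation: the function name `find_cooccurrences`, the case-insensitive whole-word matching, and the way distance is counted (difference in word positions) are all assumptions made for the example.

```python
import re

def find_cooccurrences(text, primary, co_pattern, max_distance):
    """Return (primary_pos, co_pos) word-position pairs that lie
    no more than max_distance words apart.

    Assumption: matching is case-insensitive and on whole words only.
    """
    words = re.findall(r"\w+", text.lower())
    primary_hits = [i for i, w in enumerate(words) if w == primary.lower()]
    co_hits = [i for i, w in enumerate(words) if w == co_pattern.lower()]
    return [(p, c)
            for p in primary_hits
            for c in co_hits
            if abs(p - c) <= max_distance]

# Hypothetical sample text, standing in for the text inside <body>.
sample = "The W3C develops standards for the web so that the web works for everyone."
pairs = find_cooccurrences(sample, "web", "W3C", 10)
print(pairs)
```

Lowering the distance limit narrows the results in the same way that a smaller Context length would in the form.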
*
» Source text
  Example: http://www.w3.org/

Summary

Determines the HTML source. HTML can be obtained from a URL or by uploading a file.

Fields

Source URL
HTML from the entered URL will be used as the data source for the analysis.

Local file
Use this field to upload a local HTML file for analysis.
*
» Subtext limited to
(separate multiple elements with a `,')
Summary

Limits the included text to text that appears within the specified tag(s). Multiple tags should be delimited by commas. Leaving this field empty includes all text in the document.

Fields

Elements
The text extraction will be restricted to the tag(s) entered here. Multiple tags should be separated by commas.
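As a rough sketch of what restricting text to certain elements involves, the Python below uses the standard library's `html.parser` to keep only text found inside the listed tags. The class and function names are hypothetical, and the treatment of an empty element list (keep everything) simply mirrors the behaviour described above; how TAPoR actually parses HTML is not stated here.

```python
from html.parser import HTMLParser

class ElementTextExtractor(HTMLParser):
    """Collect text that appears inside any of the named elements."""

    def __init__(self, elements):
        super().__init__()
        # Comma-separated element names, e.g. "body" or "h1, p".
        self.elements = {e.strip().lower() for e in elements.split(",") if e.strip()}
        self.depth = 0      # how many selected elements are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.elements:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.elements and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # An empty element list means "include all text".
        if not self.elements or self.depth > 0:
            self.chunks.append(data)

def extract_text(html, elements=""):
    parser = ElementTextExtractor(elements)
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    return " ".join(" ".join(parser.chunks).split())

page = ("<html><head><title>Skip me</title></head>"
        "<body><h1>Title</h1><p>Hello <b>web</b></p></body></html>")
print(extract_text(page, "body"))
```

One limitation of this sketch is that it relies on closing tags: a selected element that is never closed (a bare `<p>`, for instance) leaves the extractor switched on for the rest of the document.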
*
» What to find
Summary

Searches for the co-occurrence of two text patterns in the text.

Fields

Primary pattern
Enter a word to search for in co-occurrence with the co-pattern.

Co-pattern
Enter a word to search for in co-occurrence with the primary pattern.
» Context for concordance
Summary

Allows the user to define the context type (e.g. words or sentences) and length.

Fields

Context

Words
places the search term in context by the specified number of words.

Lines
places the search term in context by the specified number of lines.

Sentences
places the search term in context by the specified number of sentences.

Context Length
Indicates the number of words/lines/sentences to be displayed before and after the search term for context purposes.
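To illustrate how a context length measured in words might work, the Python sketch below returns each hit together with the given number of words before and after it. The function name `word_contexts`, the whitespace-based word splitting, and the punctuation stripping are assumptions for the example; the tool's own tokenization may differ.

```python
def word_contexts(text, term, length):
    """Return each occurrence of term with `length` words of context
    on either side (fewer at the start or end of the text)."""
    words = text.split()
    contexts = []
    for i, word in enumerate(words):
        # Ignore surrounding punctuation and case when comparing.
        if word.strip(".,;:!?\"'()").lower() == term.lower():
            start = max(0, i - length)
            contexts.append(" ".join(words[start:i + length + 1]))
    return contexts

# Hypothetical sample text.
sample = "The W3C develops standards for the web, so that the web works for everyone."
for line in word_contexts(sample, "web", 2):
    print(line)
```

The same windowing idea carries over to the Lines and Sentences options if the text is first split into lines or sentences instead of words.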
» Results
Summary

Allows the user to choose how the results will be formatted and whether they should be displayed in a new browser window.

Fields

Display as
Determines the format in which results will be delivered.

Open results in new window
Checking this box displays the results in a new window. This option is selected by default. If a pop-up blocker prevents the new window from opening, de-select this option.
`*' indicates a required field

 

 

TAPoRware Project, McMaster University,