Having trouble?
Try the newer version at the University of Alberta:
http://taporware.ualberta.ca

Find Text — Collocation
Summary

This tool takes a word from the user and returns all words directly before and directly after it based on the given context (i.e. words, lines, sentences). Results can be sorted alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation).
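The Z-score definition above can be illustrated with a short sketch. This page does not say which distribution the tool measures against, so the code below assumes the simplest reading: each collocate's count is compared with the mean and standard deviation of all collocate counts. The function name and the sample counts are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(counts):
    """Return {word: z}, where z = (count - mean) / standard deviation."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # All counts are identical, so nothing deviates from the mean.
        return {word: 0.0 for word in counts}
    return {word: (n - mu) / sigma for word, n in counts.items()}

# Hypothetical collocate counts, for illustration only.
counts = {"semantic": 5, "the": 3, "of": 3, "data": 1}
scores = z_scores(counts)
```

A positive score means the collocate occurs more often than the average collocate, a negative score means less often.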

Note: If you select the "Words" context with a long context length, or any other context (Lines, Sentences, or Paragraphs), the specified pattern is very likely to appear more than once in a single concordance text. When this happens, the collocates in that text are counted more than once as well, so the collocate counts are not accurate. For the same reason, the Z-score values are not accurate. We plan to fix this in a later version.
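The double counting described in this note can be reproduced in a few lines. The sketch below is not the tool's code; it assumes a plain list of word tokens and a "Words" context, and shows one possible remedy: counting each token position at most once. The function name and the `dedupe` flag are hypothetical.

```python
from collections import Counter

def collocate_counts(tokens, keyword, window, dedupe=False):
    """Count words within `window` tokens of each occurrence of `keyword`."""
    hits = [i for i, tok in enumerate(tokens) if tok == keyword]
    hit_set = set(hits)
    positions = []
    for i in hits:
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        # Skip the keyword itself; keep every other position in the window.
        positions.extend(j for j in range(start, stop) if j not in hit_set)
    if dedupe:
        # Overlapping windows contribute each position only once.
        positions = set(positions)
    return Counter(tokens[j] for j in positions)

tokens = "the web is the web of data".split()
naive = collocate_counts(tokens, "web", window=3)
fixed = collocate_counts(tokens, "web", window=3, dedupe=True)
```

With a window of 3, the two occurrences of "web" have overlapping contexts, so the naive count for "the" is 3 although the word appears only twice in the text; the deduplicated count is 2.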

Note: When "Paragraphs" is selected as the context, the element name entered by the user is overridden by the "p" tag. This may cause confusion because some HTML texts do not use the "p" tag to mark paragraphs. In that case, please use one of the other options.

For more details, see here.

Walkthrough

Example: fetch HTML from http://www.w3.org/; extract text found between <body> and </body>; limit the resulting text to strings that contain instances of the word `web'; display up to ten words before and after each match; sort co-occurring words alphabetically.
  1. Source text
    1. Enter `http://www.w3.org/' in the URL field.
  2. Subtext limited to
    1. Enter `body' in the Elements field.
  3. What to find
    1. Select Word(s) to find and enter `web' in the adjacent text field.
  4. Context for concordance
    1. From the Context drop-down menu, select Words;
    2. enter `10' in the Context length field.
  5. Results
    1. Select co-occurring words alphabetically from the Sort drop-down menu.
*
» Source text
  Example: http://www.w3.org/

Summary

Determines the HTML source. HTML can be obtained from a URL or by uploading a file.

Fields

Source URL
HTML from the entered URL will be used as the data source for the analysis.

Local file
Use this field to upload a local HTML file for analysis.
» Subtext limited to
(separate multiple elements with a `,')
Summary

Limits the included text to text that appears within the specified tag(s). Multiple tags should be delimited by commas. Leaving this field empty includes all text in the document.

Fields

Elements
The text extraction will be restricted to the tag(s) entered here. Multiple tags should be separated by commas.
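As a rough illustration of what restricting extraction to certain elements means, the sketch below uses Python's standard `html.parser` module. It is not the tool's implementation; the class name `ElementText` and the function `extract_text` are hypothetical, and the handling of an empty field follows the description above.

```python
from html.parser import HTMLParser

class ElementText(HTMLParser):
    """Collect text that appears inside any of the requested elements."""

    def __init__(self, elements):
        super().__init__()
        # "title, h1" -> {"title", "h1"}; an empty field means "keep everything".
        self.wanted = {e.strip().lower() for e in elements.split(",") if e.strip()}
        self.depth = 0      # how many requested elements are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.wanted and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if not self.wanted or self.depth > 0:
            self.chunks.append(data)

def extract_text(html, elements=""):
    parser = ElementText(elements)
    parser.feed(html)
    parser.close()
    # Normalise runs of whitespace to single spaces.
    return " ".join(" ".join(parser.chunks).split())

page = "<html><head><title>T</title></head><body><h1>Web</h1><p>Open web</p></body></html>"
body_text = extract_text(page, "body")
```

The sketch relies on matching closing tags, so pages that leave a requested element unclosed would need extra handling.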
*
» What to find
Summary

Filters results based on the occurrence of words, phrases, or patterns in the input text.

Fields

Word(s) to find
Enter a word or phrase to search for within the text. Separate multiple words with commas. Quotation marks are not needed when searching for phrases.

Pattern to Find
Enter a regular expression to search for within the text. Unix-style searching may also be used.
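The two fields are closely related: a comma-separated list of words and phrases can be expressed as a single regular expression. The sketch below shows one way to do that with Python's `re` module. Whole-word matching and case-insensitivity are assumptions made for the example, not documented behaviour of the tool, and `words_to_pattern` is a hypothetical name.

```python
import re

def words_to_pattern(field):
    """Turn 'web, open standards' into one compiled regular expression."""
    terms = [t.strip() for t in field.split(",") if t.strip()]
    if not terms:
        raise ValueError("no search terms given")
    alternatives = []
    for term in terms:
        # Escape each word and allow any whitespace between the words of a phrase.
        alternatives.append(r"\s+".join(re.escape(w) for w in term.split()))
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)

pattern = words_to_pattern("web, open standards")
sample = "The Web relies on open  standards; webs do not count."
matches = [m.group(0) for m in pattern.finditer(sample)]
```

Here the phrase needs no quotation marks, and "webs" is not matched because of the word-boundary assumption.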
» Context for concordance
Summary

Allows the user to define the context type (e.g. words or sentences) and the context length.

Fields

Context

Words
places the search term in context by the specified number of words.

Lines
places the search term in context by the specified number of lines.

Sentences
places the search term in context by the specified number of sentences.

Context Length
Indicates the number of words/lines/sentences to be displayed before and after the search term for context purposes.
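The effect of the Context and Context Length fields can be sketched as follows. The text is split into units (words, lines, or sentences), and each unit containing the search term is returned together with the requested number of units before and after it. The sentence splitter is a deliberately crude assumption, and the function names are hypothetical.

```python
import re

def split_units(text, context):
    """Split text into the units named by the Context field."""
    if context == "words":
        return text.split()
    if context == "lines":
        return text.splitlines()
    if context == "sentences":
        # Crude assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    raise ValueError("unknown context: " + context)

def contexts(text, term, context="words", length=1):
    """Return each matching unit with `length` units before and after it."""
    units = split_units(text, context)
    needle = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    results = []
    for i, unit in enumerate(units):
        if needle.search(unit):
            start = max(0, i - length)
            results.append(units[start:i + length + 1])
    return results

text = "The web is open. Standards keep the web open. Anyone can publish."
by_words = contexts(text, "web", "words", 2)
by_sentences = contexts(text, "web", "sentences", 1)
```

Note that the two sentence-level results overlap, which is the situation the accuracy note in the summary warns about.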
» Results
Summary

Allows the user to choose how the results will be formatted and whether they should be displayed in a new browser window.

Fields

Sort
Sorts the results alphabetically, by frequency, or by Z-score.

Display as
Determines the format in which results will be delivered.

Open results in new window
Checking this box displays the results in a new window. This option is selected by default. Pop-up blockers may prevent the new window from opening; if that happens, clear this option.
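For the Sort field, the three orderings named in the tool summary (alphabetical, frequency, Z-score) can be sketched as below. The row layout `(word, count, z_score)`, the descending order for the numeric sorts, and the alphabetical tie-break are all assumptions made for the example; the sample rows are invented.

```python
def sort_rows(rows, order):
    """Sort (word, count, z_score) rows in one of three ways."""
    if order == "alphabetical":
        return sorted(rows, key=lambda r: r[0].lower())
    if order == "frequency":
        # Highest count first; ties broken alphabetically (an assumption).
        return sorted(rows, key=lambda r: (-r[1], r[0].lower()))
    if order == "zscore":
        # Highest Z-score first; ties broken alphabetically (an assumption).
        return sorted(rows, key=lambda r: (-r[2], r[0].lower()))
    raise ValueError("unknown sort order: " + order)

# Invented rows, for illustration only.
rows = [("semantic", 5, 1.41), ("the", 3, 0.0), ("of", 3, 0.0), ("data", 1, -1.41)]
alphabetical = [r[0] for r in sort_rows(rows, "alphabetical")]
by_frequency = [r[0] for r in sort_rows(rows, "frequency")]
```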
`*' indicates a required field

 

 

TAPoRware Project, McMaster University,