2005-10-04 00:11:20 v0.90
Jacson, say Jackson, think J-Scan

Filters


 

Interface - de.spieleck.app.jacson.JacsonFilter

A filter consumes chunks from a source and provides chunks to the later stages of processing. To whatever reads from it, a filter looks like just another source: the interface JacsonFilter is a subinterface of JacsonChunkSource.
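
The "filter looks like a source" relationship can be sketched as follows. This is only an illustration: ChunkSource, ChunkFilter, nextChunk() and setSource() are hypothetical stand-ins invented for this example, not the actual signatures of JacsonChunkSource or JacsonFilter, and the end-of-input convention (returning null) is an assumption as well.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

public class FilterAsSourceSketch {
    /** Hypothetical stand-in for JacsonChunkSource; null is assumed to mark end of input. */
    interface ChunkSource {
        String nextChunk();
    }

    /** Hypothetical stand-in for JacsonFilter: a source that reads from another source. */
    interface ChunkFilter extends ChunkSource {
        void setSource(ChunkSource upstream);
    }

    /** A trivial source that hands out the elements of a list one by one. */
    static class ListSource implements ChunkSource {
        private final Iterator<String> it;
        ListSource(List<String> chunks) { this.it = chunks.iterator(); }
        public String nextChunk() { return it.hasNext() ? it.next() : null; }
    }

    /** A modifying filter: every chunk is passed on in upper case. */
    static class UpperCaseFilter implements ChunkFilter {
        private ChunkSource upstream;
        public void setSource(ChunkSource upstream) { this.upstream = upstream; }
        public String nextChunk() {
            String chunk = upstream.nextChunk();
            return chunk == null ? null : chunk.toUpperCase(Locale.ROOT);
        }
    }

    /** The reader only needs a ChunkSource; it cannot tell a filter from a plain source. */
    static List<String> drain(ChunkSource source) {
        List<String> out = new ArrayList<>();
        for (String c = source.nextChunk(); c != null; c = source.nextChunk()) {
            out.add(c);
        }
        return out;
    }

    public static void main(String[] args) {
        ChunkFilter filter = new UpperCaseFilter();
        filter.setSource(new ListSource(List.of("alpha", "beta")));
        System.out.println(drain(filter)); // prints [ALPHA, BETA]
    }
}
```

Because drain() accepts any ChunkSource, filters written in this style can be stacked on top of each other without the reader having to know about it.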

Every filter can ...

  • ... reject chunks in the process.
  • ... modify chunks in the process.
  • ... introduce new chunks into the processing.
  • ... store information for later use.
Note that a single filter can do more than one of these things: it can reject some chunks, modify others, and introduce new chunks into the processing.
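
As a minimal sketch of one filter playing several of these roles at once, the example below rejects empty chunks, trims the others, splits comma-separated chunks into several new ones, and keeps a counter for later use. The ChunkSource interface and the class MultiRoleFilter are hypothetical and exist only for this illustration; they are not part of the Jacson API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class MultiRoleFilterSketch {
    /** Hypothetical chunk source; null is assumed to mark end of input. */
    interface ChunkSource {
        String nextChunk();
    }

    /** Hypothetical filter that rejects, modifies, inserts and stores information. */
    static class MultiRoleFilter implements ChunkSource {
        private final ChunkSource upstream;
        private final Deque<String> pending = new ArrayDeque<>(); // chunks waiting to be delivered
        private int seen = 0;                                     // information stored for later use

        MultiRoleFilter(ChunkSource upstream) { this.upstream = upstream; }

        public String nextChunk() {
            while (pending.isEmpty()) {
                String chunk = upstream.nextChunk();
                if (chunk == null) return null;   // upstream is exhausted
                seen++;
                if (chunk.isEmpty()) continue;    // reject: empty chunks are dropped
                if (chunk.contains(",")) {
                    // insert: one incoming chunk becomes several outgoing chunks
                    for (String part : chunk.split(",")) pending.add(part.trim());
                } else {
                    pending.add(chunk.trim());    // modify: surrounding blanks are removed
                }
            }
            return pending.poll();
        }

        int chunksSeen() { return seen; }
    }

    static List<String> drain(ChunkSource source) {
        List<String> out = new ArrayList<>();
        for (String c = source.nextChunk(); c != null; c = source.nextChunk()) out.add(c);
        return out;
    }

    public static void main(String[] args) {
        Iterator<String> it = List.of("  one ", "", "two, three").iterator();
        MultiRoleFilter filter = new MultiRoleFilter(() -> it.hasNext() ? it.next() : null);
        System.out.println(drain(filter));       // prints [one, two, three]
        System.out.println(filter.chunksSeen()); // prints 3
    }
}
```

The internal queue is what lets the filter hand out more chunks than it has read so far; a filter that only rejects or modifies would not need it.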

The implementation of Jacson currently comes with the following filters in package de.spieleck.app.jacson.filter:
Filter | Description | Type
ConstFilter | Either passes all chunks unmodified or blocks all incoming chunks. Its major purpose is to provide a basic implementation to be subclassed by other filters. | reject
CaseFilter | Converts all chunks to upper- or lowercase (this is XML configurable). | modify
FileInsertFilter | Uses information from the chunk to look up a line from an external file and insert it as a chunk. | modify
GroupingFilter | Reorganizes the incoming chunks and groups them using a configurable delimiter token. | modify
HeadFilter | Cuts the first n (configurable) chunks from the input and forwards them. Somewhat like UNIX head. | reject
HeadPadFilter | Prepends something as or to the first chunk. Somewhat like awk's BEGIN. | modify
JacsonStateFilter | Blocks or accepts chunks depending on a variable in the current JacsonState. | reject
PadFilter | Attaches a prefix and a postfix to a chunk. | modify
RegExpContainsFilter | Only lets through (or blocks) chunks which contain a match for a configurable regular expression. | reject
RegExpExtractFilter | Extracts one or more pieces from the incoming chunk (by regular expression bracket operators), merges them with a configurable separator, and passes the result on as a new chunk. | modify
RegExpMatchFilter | Only lets through (or blocks) chunks which match the regexp as a whole. | reject
RegExpSubstFilter | Performs a configurable regular expression replacement on the chunk before passing it along. | modify
ReplaceFilter | Replaces chunks by means of a configurable lookup table. You can choose whether other chunks are blocked or pass unmodified. | modify
SelectionExtractFilter | Uses a Selection to break pieces out of a chunk as new chunks. | insert
SelectionFilterFilter | This very powerful filter breaks pieces out of a chunk by a Selection, pipes them through a separate chain of JacsonFilters (and Evals), and then remerges the pieces into the original chunks. | modify
SortFilter | Sorts all incoming chunks. | modify
StartsWithFilter | Blocks all chunks which either start or do not start (configurable) with a certain configurable String. | reject
StemmingFilter | Uses a stemmer class to normalize chunks which are words. Currently (0.6) Jacson has a German and an English stemmer included in de.spieleck.app.lang. | modify
SubstitutionFilter | Replaces a chunk according to a configurable mapping. | modify
TailFilter | Cuts the last n (configurable) chunks from the input and forwards them. Somewhat like UNIX tail. | reject
TailPadFilter | Appends something as or to the last chunk. Somewhat like awk's END. | modify
TrimFilter | Trims away surrounding blanks from a chunk. | modify
UniqFilter | Detects chunks which equal the directly preceding one. Those are either omitted or are the only ones presented (inverse). | reject
URLDecodeFilter | Performs URL-decoding on chunks. This is incompletely implemented, just enough for my use case. | modify
WebSessionFilter | Used in logfile analysis to apply the usual definition of a web user session as ID string + timeout. ID strings are determined by a regular expression, usually either the client IP or the combination of client IP + browser ID. | modify
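
To make the chaining of filters more concrete, here is a small sketch that re-implements the behaviour described for UniqFilter and HeadFilter and stacks one on top of the other. It is not the Jacson code: the interface ChunkSource, the classes UniqSketch and HeadSketch, and their constructor parameters are hypothetical, written only from the descriptions in the table.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class UniqHeadChainSketch {
    /** Hypothetical chunk source; null is assumed to mark end of input. */
    interface ChunkSource {
        String nextChunk();
    }

    static ChunkSource fromList(List<String> chunks) {
        Iterator<String> it = chunks.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Drops chunks equal to their direct predecessor, or (inverse) keeps only those. */
    static class UniqSketch implements ChunkSource {
        private final ChunkSource upstream;
        private final boolean inverse;
        private String previous; // last chunk read from upstream, null before the first one

        UniqSketch(ChunkSource upstream, boolean inverse) {
            this.upstream = upstream;
            this.inverse = inverse;
        }

        public String nextChunk() {
            String chunk;
            while ((chunk = upstream.nextChunk()) != null) {
                boolean repeat = chunk.equals(previous);
                previous = chunk;
                if (repeat == inverse) return chunk; // normal: pass non-repeats; inverse: pass repeats
            }
            return null;
        }
    }

    /** Forwards only the first n chunks, somewhat like UNIX head. */
    static class HeadSketch implements ChunkSource {
        private final ChunkSource upstream;
        private int remaining;

        HeadSketch(ChunkSource upstream, int n) {
            this.upstream = upstream;
            this.remaining = n;
        }

        public String nextChunk() {
            if (remaining <= 0) return null; // everything after the first n chunks is rejected
            remaining--;
            return upstream.nextChunk();
        }
    }

    static List<String> drain(ChunkSource source) {
        List<String> out = new ArrayList<>();
        for (String c = source.nextChunk(); c != null; c = source.nextChunk()) out.add(c);
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "a", "b", "b", "b", "c");
        System.out.println(drain(new UniqSketch(fromList(input), false)));                    // prints [a, b, c]
        System.out.println(drain(new UniqSketch(fromList(input), true)));                     // prints [a, b, b]
        System.out.println(drain(new HeadSketch(new UniqSketch(fromList(input), false), 2))); // prints [a, b]
    }
}
```

In Jacson itself such chains are set up through configuration rather than by nesting constructors; the nesting here only serves to show the order in which chunks flow through the stages.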
