Building a Chunker
Chunking is an analysis of a sentence that identifies its constituents (noun groups, verbs, verb groups, etc.), each made up of closely related words. Chunks are non-overlapping regions of text. Usually, each chunk contains a head, possibly with some function words and modifiers before or after it, depending on the language. Chunks are also non-recursive: a chunk cannot contain another chunk of the same category.
Some of the possible groups are:
Noun Group
Verb Group
For example, the sentence 'He reckons the current account deficit will narrow to only 1.8 billion in September.' can be divided as follows:
[NP He ] [VP reckons ] [NP the current account deficit ] [VP will narrow ] [PP to ] [NP only 1.8 billion ] [PP in ] [NP September ]
Each chunk has an opening boundary and a closing boundary that delimit the word group as a minimal non-recursive unit.
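To make the boundary idea concrete, here is a minimal Python sketch that reads the bracketed notation from the example above and turns it into per-token tags. The function names parse_chunks and to_iob are hypothetical, chosen only for this illustration. The B-/I- prefixes follow the widely used IOB convention, where B marks the first token of a chunk and I marks a token inside one. The sketch assumes every token belongs to some chunk, so it does not produce tags for tokens outside chunks.

```python
import re

# Matches one bracketed chunk such as "[NP the current account deficit ]".
# Nested brackets are deliberately not supported, since chunks are non-recursive.
CHUNK_RE = re.compile(r"\[(\w+)\s+([^\[\]]+?)\s*\]")


def parse_chunks(bracketed):
    """Return a list of (label, tokens) pairs from bracketed chunk notation."""
    return [(label, words.split()) for label, words in CHUNK_RE.findall(bracketed)]


def to_iob(chunks):
    """Flatten chunks into (token, tag) pairs using B-/I- prefixes."""
    tagged = []
    for label, tokens in chunks:
        for i, token in enumerate(tokens):
            # The first token opens the chunk; the rest are inside it.
            prefix = "B" if i == 0 else "I"
            tagged.append((token, f"{prefix}-{label}"))
    return tagged


text = ("[NP He ] [VP reckons ] [NP the current account deficit ] "
        "[VP will narrow ] [PP to ] [NP only 1.8 billion ] [PP in ] [NP September ]")

chunks = parse_chunks(text)
for token, tag in to_iob(chunks):
    print(token, tag)
```

In this encoding the opening boundary of each chunk is the token tagged B-, and the closing boundary is the last token before the next B- tag or the end of the sentence.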