by Visiting Instructor Dr. Damian Conway
Next Public Offering:
TBA; Help Us Schedule!
Description
This 1-day seminar, written and presented by Dr. Damian Conway,
will show you how to use a range of standard Perl features and
several CPAN modules (including Conway's Parse::RecDescent and
Text::Balanced) to decipher and process a variety of complex
data and command formats. It is a practical introduction to the
techniques of grammar-based recursive-descent parsing.
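For readers new to the idea, recursive-descent parsing can be sketched in a few lines: each grammar rule becomes one function, and rules call each other the way the grammar refers to them. The sketch below is only an illustration under stated assumptions. It is written in Python rather than Perl, it does not use Parse::RecDescent, and the arithmetic grammar and the names `Parser` and `evaluate` are hypothetical choices made for this example, not material from the seminar.

```python
import re


def tokenize(text):
    """Return integer literals and single non-space characters as tokens."""
    return re.findall(r"\d+|\S", text)


class Parser:
    """Hand-written recursive-descent parser for a tiny arithmetic grammar."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        # Look at the next token without consuming it (None at end of input).
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        # Consume the next token, optionally insisting on a specific one.
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    # expr: term (('+' | '-') term)*
    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    # term: factor (('*' | '/') factor)*
    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    # factor: NUMBER | '(' expr ')'
    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()  # recursion: a factor may contain a whole expr
            self.take(")")
            return value
        if token.isdecimal():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Parse and evaluate the whole input, rejecting trailing tokens."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.peek()!r}")
    return value
```

Calling `evaluate("2 + 3 * (4 - 1)")` walks the rules top-down: `expr` calls `term`, which calls `factor`, which recurses into `expr` for the parenthesised part. A parser generator automates the step of turning a written grammar into functions like these.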
You might like to read comments from an attendee at
The Perl Conference 4.0 presentation of this seminar.
Attendees will learn:
- how to design and build parsers to process Apache
configuration files and log data,
- how to process structured expressions (e.g. search engine
queries),
- how to balance nested brackets and match delimiters without a
regular expression,
- how to fold, spindle and mutilate the comments in a C program,
- how to dissect C++ type declarations with a self-adapting parser,
- how to allow embedded Perl code in your own data format or
command language,
- how to deal with ambiguous data by parsing it in multiple
universes simultaneously,
- how to get Parse::RecDescent to write most of your grammar for you,
- how to parse modular text (e.g. source with #includes in it),
- how to pre-filter your source code by tricking Perl into
(nearly) parsing Perl,
- how to debug Parse::RecDescent parsers efficiently and how to
improve the efficiency of your Parse::RecDescent grammars,
- how to convert natural language queries into SQL,
- how to pull pesky unmatched <P> tags from HTML,
- how to write a program that does stand-up comedy! 8-}
See below for the full Seminar Outline.
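As a taste of one item in the list above, balancing nested brackets without a regular expression usually comes down to keeping a stack of the closing delimiters still owed. The following is a minimal sketch of that general technique, assuming only `()`, `[]` and `{}` as delimiters. It is written in Python rather than Perl, and the function name `extract_bracketed` and its return shape are hypothetical choices for this illustration; they should not be read as the Text::Balanced interface.

```python
# Opening delimiters mapped to the closers they require (an assumed set).
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def extract_bracketed(text):
    """Return (bracketed_prefix, remainder) for text starting with a bracket.

    Leading whitespace is skipped. Raises ValueError if the text does not
    start with an opening bracket or if the brackets do not balance.
    """
    start = len(text) - len(text.lstrip())
    if start == len(text) or text[start] not in PAIRS:
        raise ValueError("input does not start with an opening bracket")

    stack = []  # closers we still expect, innermost last
    for i in range(start, len(text)):
        ch = text[i]
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            if not stack or stack.pop() != ch:
                raise ValueError(f"mismatched {ch!r} at offset {i}")
            if not stack:
                # The outermost bracket just closed: split here.
                return text[start:i + 1], text[i + 1:]
    raise ValueError("unclosed bracket")
```

For example, `extract_bracketed("(a [b] {c}) rest")` returns the pair `("(a [b] {c})", " rest")`. This sketch ignores quoting and escapes, so a bracket inside a quoted string would still be counted.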
NOTE:
The first part of this seminar was presented in
1999 as Tutorial P22 at The Perl Conference 3.0.
Both parts were presented in tutorial sessions
at The Perl Conference 4.0 in Monterey in July 2000.
Who Should Attend
The techniques presented in this course have general applicability,
and will be useful to anyone who needs to process structured input
of any kind.
This seminar is designed for those who are familiar with basic
Perl programming and have experience with regular expressions,
subroutines, hashes and arrays, references, data structures built
on hashes and arrays, and calling the methods of object-oriented
Perl modules.
Most of these prerequisites can be satisfied by attending
the Consultix "Perl Programming, plus Modules" course
(or having equivalent experience),
and studying the following resources:
Author & Instructor
Dr. Damian Conway
holds a Ph.D. in Computer Science and is a Senior Research Fellow with the
School of Computer Science and Software
Engineering at Monash University, Melbourne, Australia.
He is the author of numerous well-known Perl modules including:
Class::Contract,
Text::Autoformat,
Parse::RecDescent,
Text::Balanced,
Lingua::EN::Inflect,
Class::Multimethods,
Switch,
Quantum::Superpositions,
NEXT,
Filter::Simple,
Attribute::Handlers,
Inline::Files,
and Coy (all available from your local
CPAN mirror).
Damian was the winner of the 1998, 1999, and 2000 Larry Wall
Award competitions for the most practical Perl utility program.
He is a member of the technical committee for The Perl Conference,
a former columnist for The Perl Journal,
author of the book Object Oriented Perl,
a member of the Perl 6 design team, and a popular public speaker.
In 2001, Damian received the first YAS Perl Development Grant and
spent the year working on projects for the betterment of Perl.
He is continuing this work in 2002 under a similar grant from
The Perl Foundation.
Our instructors, including Visiting Instructor Dr. Damian Conway,
are renowned for their ability to communicate complex concepts in
simple terms and to make the study of dry technical material enjoyable.
We pride ourselves on providing training experiences that
our customers rave about!
Seminar Outline
Part I
- A brief history of parsing
  - grammars, rules, recursive descent, etc.
- Implementing parsers
  - top-down vs bottom-up approaches
- Useful tools
  - Text::Balanced, Parse::Yapp, perl-byacc, Parse::RecDescent
- Simple parsing
  - Parsing delimited text, parsing Perl subsets
- Parsing data
  - Parsing Apache log files
  - optional subrules, list parsing
  - run-time parser generation
- Parsing input
  - The Text::Query modules
  - OO parsing
  - operator precedence, lists, look-ahead, rejections, etc.
- Parsing code
  - parsing C and C++
  - stateful grammars
  - porting yacc grammars (including left-recursion)
  - self-extending parsers, committing rules, deferred actions
  - grammar precompilation
- Parsing natural language
  - generating SQL queries for natural language input
  - synthetic stand-up via reciprocal parsers
Part II
- Miscellaneous advanced features of Text::Balanced
  - precompiling delimiter extractions
  - extracting tagged text
  - extracting Perl variables
  - extracting mixed components
- Miscellaneous extra features of Parse::RecDescent
  - Named items (the %item hash)
  - Debugging grammars: <trace>, <warn>, <hint>, and <nocheck>
  - Context information: $thisline, $lastoffset, @itempos, etc.
  - Extreme prejudice: the <fail> directive
- Non-deterministic parsing
  - tracking "goodness-of-match"
  - the <score> and <autoscore> directives
- Pre-tokenization
  - the <token> directive
  - token-based parsing
- Automatic grammar generation
  - autoactions
  - autostubbing
  - autotrees
  - the <perl_quotelike>, <perl_codeblock>, and <perl_variable> directives
- Generic rules
  - the <matchrule> directive
  - subrule arguments: @arg and %arg
- Handling distributed text
  - processing file inclusions recursively
  - processing file inclusions by input modification
  - other uses of input modification
- Semi-grammatical parsing
  - when Parse::RecDescent is overkill and regexes don't appeal
  - CSV revisited, text interpolation, simple command interfaces
- Self-modification
  - Run-time parser generation and self-extending parsers revisited
  - A self-modifying Apache config/log file parser
- (Nearly) parsing Perl
  - parsing with Text::Balanced on Occam's Razor
  - source code filtering
- Metagrammars
  - building a grammar for parsing grammars
  - beat poetry and postmodern literature