O'Reilly Short Cuts: Getting Started with Pyparsing [EN]

Abstract
This is the sad truth: parsing is boring. And writing a parser is even worse.
Written: 05/12/2007
Published: 10/12/2007
Last modified: 10/12/2007
by Giovanni Giorgi

O'Reilly User Group

If you can choose a scripting language for a parsing job, you might think of doing it in Perl.

If you go that way, take a deep breath and dive into the black sea of Perl's funny regexps. They are funny only if you have a special love for regular expressions.

But if you are more comfortable with Python, pyparsing [2] is a better solution.

Pyparsing is a library written in Python for building parsers from a grammar that you would otherwise describe in BNF (Backus-Naur Form [3]); the grammar is written directly as Python expressions.
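
To give an idea of what this looks like, here is a minimal sketch (not taken from the book). The BNF in the comments and the names `word` and `greeting` are assumptions chosen for illustration. It uses the classic camelCase method names such as `parseString`; pyparsing 3 also offers snake_case equivalents such as `parse_string`.

```python
from pyparsing import Word, alphas

# Illustrative BNF (an assumption for this sketch):
#   greeting ::= word "," word "!"
#   word     ::= letter+
word = Word(alphas)
greeting = word + "," + word + "!"

# Whitespace between tokens is skipped by default.
tokens = greeting.parseString("Hello, World!").asList()
print(tokens)  # ['Hello', ',', 'World', '!']
```

Each line of the grammar maps almost one-to-one onto a BNF rule, which is probably the main reason the library feels more readable than a long regular expression.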

O'Reilly has just published a "Short Cuts" e-book written by Paul McGuire [1]; in less than 70 pages you get a very good insight into pyparsing.

Even if you are new to Python, the book is very easy to read.

And if you know nothing about parsers or about Backus and Naur, you will find an easy path to understanding them. Parsing is a tricky topic because of the grammar theory behind it, but for everyday work you can follow McGuire's introduction.

After some simple examples, you'll dive into a small web page parser.

It is amazing how you can extract data from web pages without a complex SAX parser, using only a very compact grammar.
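
As a rough illustration of this kind of extraction (a sketch, not the book's example), the snippet below pulls link targets and link texts out of an HTML fragment. The HTML string and the result name `text` are made up for the example; `makeHTMLTags`, `SkipTo` and `scanString` are pyparsing helpers.

```python
from pyparsing import SkipTo, makeHTMLTags

# makeHTMLTags returns expressions for the opening and closing tag;
# attributes of the opening tag become named results (e.g. .href).
a_start, a_end = makeHTMLTags("a")
link = a_start + SkipTo(a_end)("text") + a_end

# Hypothetical page fragment, used only for this sketch.
html = (
    '<p>See <a href="http://example.com/a">first</a> and '
    '<a class="ext" href="http://example.com/b">second</a>.</p>'
)

# scanString walks the whole input and yields every match it finds,
# skipping over the text that does not match the grammar.
links = [(t.href, t.text) for t, start, end in link.scanString(html)]
print(links)
```

The whole "grammar" is two lines, and everything outside the links is simply ignored.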

After these introductory examples, the book takes us to a more complex task: a parser for S-expressions, a Lisp-like expression language.

This example is important because complex data structures are often recursive, as S-expressions are.
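
The recursion can be sketched with pyparsing's `Forward` placeholder. This is a deliberately simplified grammar, assumed for illustration only; the atom definition and the sample input are not the ones used in the book.

```python
from pyparsing import Forward, Group, Suppress, Word, ZeroOrMore, alphanums

LPAR, RPAR = Suppress("("), Suppress(")")

# Forward declares the expression before it is defined, so the
# grammar can refer to itself.
sexp = Forward()
atom = Word(alphanums + "+-*/_")  # simplified notion of an atom
sexp_list = Group(LPAR + ZeroOrMore(sexp) + RPAR)
sexp <<= atom | sexp_list

tree = sexp.parseString("(define (square x) (* x x))").asList()
print(tree)  # [['define', ['square', 'x'], ['*', 'x', 'x']]]
```

The nested Python lists mirror the nesting of the parentheses, which is what makes the result easy to process afterwards.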

The last chapter, "Search Engine in 100 Lines of Code", is a well-written example that shows us how to build a grammar for a small search engine.
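
A grammar for search queries could look roughly like the sketch below. It is not the chapter's code: the operator set (`and`, `or`, `not`), the use of `infixNotation` and the sample query are assumptions made for illustration.

```python
from pyparsing import CaselessKeyword, Word, alphas, infixNotation, opAssoc

AND, OR, NOT = map(CaselessKeyword, ("and", "or", "not"))

# A search term is any word that is not one of the operators.
term = ~(AND | OR | NOT) + Word(alphas)

# Operators are listed from highest to lowest precedence.
query = infixNotation(term, [
    (NOT, 1, opAssoc.RIGHT),
    (AND, 2, opAssoc.LEFT),
    (OR, 2, opAssoc.LEFT),
])

parsed = query.parseString("python and not perl or ruby", parseAll=True).asList()
print(parsed)  # [[['python', 'and', ['not', 'perl']], 'or', 'ruby']]
```

The nested result already encodes operator precedence, so a search engine only has to walk the tree and combine the matching documents for each term.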

So this e-book is a "must" if you need to do even simple parsing and you... do not want to go crazy with too many regular expressions!

About the author

Giovanni Giorgi, born in 1974. After a classical high school diploma (liceo classico), he graduated in Computer Science in February 2000 and currently works in financial software (online trading, web solutions).
Passionate about programming languages, he is also interested in politics and literature.

Resources

  1. O'Reilly Short Cuts: Getting Started with Pyparsing
    http://www.oreilly.com/catalog/9780596514235/index.html
  2. Pyparsing web site.
    http://pyparsing.wikispaces.com/
  3. Backus-Naur Form
    http://en.wikipedia.org/wiki/Backus-Naur_Form