There is a Wiki Object Model (WOM), greatly inspired by the XLinq object model, which represents the parsed result. It is also used as the intermediate representation for the grammar. WOM consists of the following classes:
WomDocument - represents the whole document and has methods to read WOM from XML and write it to XML.
WomName - a new class, which I have not checked in yet, that represents an individual atomic string in WomNameTable.
WomNameTable - a class that inherits from XmlNameTable and can be used with XmlReader and XmlWriter. It represents a list of atomic strings stored as WomName objects. It is a thread-safe class that can be shared across multiple threads, and it can significantly reduce memory use. It is implemented as a singleton.
WomProperty - a name-value pair that represents a property (an attribute in XML terms) of a WomElement.
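The atomization idea behind WomNameTable can be sketched roughly as follows. This is only an illustration in Python, not the FlexWiki code: the `Name` and `NameTable` classes and the `instance` and `add` methods are hypothetical stand-ins, and the real class is a .NET class deriving from XmlNameTable.

```python
import threading


class Name:
    """Hypothetical stand-in for WomName: one atomized string."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"Name({self.value!r})"


class NameTable:
    """Hypothetical stand-in for WomNameTable: a shared, thread-safe table."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._names = {}
        self._lock = threading.Lock()

    @classmethod
    def instance(cls):
        # Singleton access: every caller gets the same table.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def add(self, text):
        # Return the existing Name for equal text, or store a new one.
        with self._lock:
            name = self._names.get(text)
            if name is None:
                name = Name(text)
                self._names[text] = name
            return name


table = NameTable.instance()
a = table.add("para" + "graph")
b = table.add("paragraph")
same_object = a is b  # equal strings map to one shared Name object
```

Because equal strings resolve to one shared object, element and property names that repeat throughout a document are stored once, and names can be compared by identity instead of character by character.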
While WOM can be considered almost complete, I would estimate the parser classes to be about 60% done. The parser aims to implement something similar to a Parsing Expression Grammar (PEG) (see Wikipedia), which is essentially regular expressions with non-terminals (rules). It is very powerful and can cover a broader range of grammars than LL or LR, but the price you pay is higher memory utilization and slower speed, which I believe is a much less significant issue today than it was 20-30 years ago. My implementation is inspired by the regular expression implementation in Rotor. The parser consists of the following classes:
ParserCharSet - represents a set of characters. It can be either character ranges or Unicode character classes.
ParserEngine - after the redesign it will contain all the tables required to parse source text, plus a method to initiate parsing. The tables will include the instruction list, the rule list, and the list of char sets.
ParserEngineProcessor - an instance of this class will be created every time we parse a new string. It will use the ParserEngine tables to parse the text. The process is very similar to a regular expression engine, which executes instruction by instruction and backtracks if it cannot match at a certain step. It will preserve all intermediate rule results to provide optimal performance, as is done in PEG.
ParserRule - an individual rule, which consists of the rule name and its start position in the instruction table.
ParserXmlGrammarReader - should read a Wiki grammar written in XML and transform it to the parser WOM. It can use ParserExpressionReader to read the expressions written inside the rules.
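The matching process described for ParserEngineProcessor (try a step, backtrack on failure, and remember rule results per position) can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not the FlexWiki design: it interprets nested expression tuples directly instead of executing a compiled instruction table, and the `Processor` class, the expression tags, and the sample grammar are all hypothetical.

```python
# Expressions are tuples; every tag below is an illustrative assumption.
#   ("chars", "az09")       one character from the ranges a-z and 0-9
#   ("seq", e1, e2, ...)    match each expression in order
#   ("choice", e1, e2, ...) ordered choice: first matching alternative wins
#   ("star", e)             zero or more repetitions (greedy)
#   ("rule", "name")        call a non-terminal


class Processor:
    """Created once per input string; the grammar table is shared, read-only."""

    def __init__(self, rules, text):
        self.rules = rules
        self.text = text
        self.memo = {}  # (rule name, position) -> end position or None

    def match(self, expr, pos):
        """Return the end position of a match starting at pos, or None."""
        kind = expr[0]
        if kind == "chars":
            ranges = expr[1]
            if pos < len(self.text):
                ch = self.text[pos]
                for i in range(0, len(ranges), 2):
                    if ranges[i] <= ch <= ranges[i + 1]:
                        return pos + 1
            return None
        if kind == "seq":
            for sub in expr[1:]:
                pos = self.match(sub, pos)
                if pos is None:
                    return None
            return pos
        if kind == "choice":
            for sub in expr[1:]:
                end = self.match(sub, pos)  # backtrack: retry from pos
                if end is not None:
                    return end
            return None
        if kind == "star":
            while True:
                end = self.match(expr[1], pos)
                if end is None or end == pos:
                    return pos
                pos = end
        if kind == "rule":
            key = (expr[1], pos)
            if key not in self.memo:  # reuse earlier rule results
                self.memo[key] = self.match(self.rules[expr[1]], pos)
            return self.memo[key]
        raise ValueError(f"unknown expression kind: {kind}")


rules = {
    "letter": ("chars", "azAZ"),
    "digit": ("chars", "09"),
    "word": ("seq", ("rule", "letter"),
             ("star", ("choice", ("rule", "letter"), ("rule", "digit")))),
    # First alternative needs a trailing ':'; the second is a bare word.
    "item": ("choice", ("seq", ("rule", "word"), ("chars", "::")),
             ("rule", "word")),
}

processor = Processor(rules, "abc1")
end = processor.match(("rule", "item"), 0)
```

For the input `abc1`, the first alternative of `item` matches `word` and then fails on the missing `:`. The engine backtracks to the second alternative, which finds the result for `word` at position 0 already stored, so the rule is not evaluated a second time. That stored-result table is where the extra memory mentioned above goes.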
NUnit Tests
A number of NUnit tests cover the behavior of the classes listed above. I have implemented an extension to NUnit that allows tests to be provided in XML.
Additional Work
This is an overview of the current state of the parser. I am currently refactoring and implementing ParserEngineProcessor. After that I will need to implement the ParserXmlGrammarReader class and create an XSLT to translate WOM to XHTML.
Information about work in progress on FlexWiki parser
2/7/2008 12:07:22 PM - -66.78.121.44
An automatic testing tool (see http://www.nunit.org/ ).
9/24/2008 2:34:26 PM - FLWCOM-jwdavidson