For the last two days I have been struggling with various lexer/parser combinations.
Classical flex + (win)bison combination
As flex and bison were designed to work together, integration is not a problem. There were some issues while compiling (and linking) the generated code with Visual Studio, but there is a lot of information on the web (notably on StackOverflow) and the documentation is rather good, so I did not spend a lot of time on them. I was about to write some code beyond the basics when I realized that flex does not have Unicode support. So I began searching the web for alternatives.
Quex
Next I attempted to use Quex, which boasts full Unicode support, but I gave up after spending a day struggling with bison integration issues.
Ragel
So I ended up with Ragel. It can be integrated with bison, and there is a tutorial available: http://zusatz.wordpress.com/2012/07/08/connecting-ragel-to-bison-in-c/ . Unfortunately, the tutorial does not come with downloadable code, so I created an example from it and hosted it on CodePlex. You will need VS2012, win_bison and ragel to build it.
The sample also includes very basic support for Russian UTF-16 characters.
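To give an idea of what "very basic support" for Russian characters in UTF-16 amounts to, here is a minimal hand-written sketch. It is not the code from the sample (there the character classes are expressed in the Ragel grammar); the helper names is_russian_letter, is_word_char and split_words are made up for illustration. It relies on the fact that the Russian alphabet occupies U+0410..U+044F plus the letters Ё (U+0401) and ё (U+0451), all of which fit in a single UTF-16 code unit. It assumes a compiler with C++11 char16_t support, which is newer than what VS2012 offers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true for Russian letters, upper or lower case.
// U+0410..U+044F covers the basic alphabet; U+0401 and U+0451 are the
// letter "yo", which sits outside that contiguous range.
inline bool is_russian_letter(char16_t c) {
    return (c >= 0x0410 && c <= 0x044F) || c == 0x0401 || c == 0x0451;
}

// Hypothetical helper: characters allowed inside a word token.
inline bool is_word_char(char16_t c) {
    return (c >= u'a' && c <= u'z') || (c >= u'A' && c <= u'Z') ||
           c == u'_' || is_russian_letter(c);
}

// Splits UTF-16 input into maximal runs of word characters and skips
// everything else. No surrogate pairs are handled, which is fine for
// Russian text because every letter is one code unit.
std::vector<std::u16string> split_words(const std::u16string& input) {
    std::vector<std::u16string> words;
    std::size_t i = 0;
    while (i < input.size()) {
        if (!is_word_char(input[i])) {
            ++i;  // separator: skip it
            continue;
        }
        const std::size_t start = i;
        while (i < input.size() && is_word_char(input[i])) {
            ++i;
        }
        assert(i > start);  // a word always has at least one code unit
        words.push_back(input.substr(start, i - start));
    }
    return words;
}
```

A real lexer would return token codes to bison instead of strings, but the character classification is the part that flex could not do for me.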
Tools worth noting
I also considered ANTLR and Boost.Spirit.
Additional links: Bison documentation, WinBison download, IBM tutorial on lex+yacc.