boostorg/tokenizer

Boost.Tokenizer is part of the Boost C++ Libraries. It provides a flexible and easy-to-use way to break a string or other character sequence into a series of tokens.

License

Distributed under the Boost Software License, Version 1.0.

Properties

  • C++03
  • Header-Only

Build Status

| Branch | GHA CI | Appveyor | Coverity Scan | codecov.io | Deps | Docs | Tests |
|---|---|---|---|---|---|---|---|
| master | Build Status | Build status | Coverity Scan Build Status | codecov | Deps | Documentation | Enter the Matrix |
| develop | Build Status | Build status | Coverity Scan Build Status | codecov | Deps | Documentation | Enter the Matrix |

Overview

The following example breaks up a phrase into words.

#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>

int main(){
    std::string s = "This is,  a test";
    typedef boost::tokenizer<> Tok;
    Tok tok(s);
    for (Tok::iterator beg = tok.begin(); beg != tok.end(); ++beg){
        std::cout << *beg << "\n";
    }
}

Using a range-based for loop (C++11 or later)

#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>

int main(){
    std::string s = "This is,  a test";
    boost::tokenizer<> tok(s);
    for (const auto& token : tok) {  // bind by reference to avoid copying each token
        std::cout << token << "\n";
    }
}

Documentation

Documentation can be found at Boost.Tokenizer

Related Material

Boost.Tokenizer Chapter 10 at theboostcpplibraries.com contains several examples, including one for escaped_list_separator.

Contributing

This library is being maintained as part of the Boost Library Official Maintainer Program.

Open Issues on

Acknowledgements

From the author:

I wish to thank the members of the boost mailing list, whose comments, compliments, and criticisms during both the development and formal review helped make the Tokenizer library what it is. I especially wish to thank Aleksey Gurtovoy for the idea of using a pair of iterators to specify the input, instead of a string. I also wish to thank Jeremy Siek for his idea of providing a container interface for the token iterators and for simplifying the template parameters for the TokenizerFunctions. He and Daryle Walker also emphasized the need to separate interface and implementation. Gary Powell sparked the idea of using isspace and ispunct as the defaults for char_delimiters_separator. Jeff Garland provided ideas on how to change the order of the template parameters in order to make tokenizer easier to declare. Thanks to Douglas Gregor, who served as review manager and provided many insights both on the boost list and in e-mail on how to polish up the implementation and presentation of Tokenizer. Finally, thanks to Beman Dawes, who integrated the final version into the boost distribution.

License

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)