CoreComponents 3.0.0
A Modern C++ Toolkit
Utf16Iterator Class Reference

Iterate code points of a UTF-16 encoded string.

#include <cc/Utf16Iterator>

Public Types

using iterator_category = std::bidirectional_iterator_tag
 
using value_type = char32_t
 
using difference_type = std::ptrdiff_t
 
using pointer = void
 
using reference = char32_t
 

Public Member Functions

 Utf16Iterator ()
 Create an invalid iterator.
 
 Utf16Iterator (const std::uint16_t *start, const std::uint16_t *end, const std::uint16_t *pos, ByteOrder endian=localEndian())
 Create a new iterator.
 
 Utf16Iterator (const Utf16Iterator &b)=default
 
Utf16Iterator & operator++ ()
 Prefix increment operator: step one code point forward
 
Utf16Iterator & operator-- ()
 Prefix decrement operator: step one code point backward
 
Utf16Iterator operator++ (int)
 Postfix increment operator: step one code point forward and return old position.
 
Utf16Iterator operator-- (int)
 Postfix decrement operator: step one code point backward and return old position.
 
Utf16Iterator & operator+= (ptrdiff_t d)
 Assignment addition operator: iterate in forward direction a given distance d.
 
Utf16Iterator & operator-= (ptrdiff_t d)
 Assignment subtraction operator: iterate in backward direction a given distance d.
 
Utf16Iterator operator+ (ptrdiff_t d) const
 Addition operator: get iterator in forward direction at distance d.
 
Utf16Iterator operator- (ptrdiff_t d) const
 Subtraction operator: get iterator in backward direction at distance d.
 
std::ptrdiff_t operator- (const Utf16Iterator &b) const
 Difference operator: compute distance in number of characters.
 
 operator bool () const
 Cast to bool operator: indicate if this iterator can step forward another code point.
 
char32_t operator* () const
 Dereference operator: decode current code point.
 
size_t operator+ () const
 Unary plus operator: return the current decoding position as a byte offset.
 
bool operator== (const Utf16Iterator &b) const
 Compare for equality.
 
bool operator!= (const Utf16Iterator &b) const
 Compare for inequality.
 

Detailed Description

Iterate code points of a UTF-16 encoded string.

The Utf16Iterator allows iterating over the Unicode code points of a UTF-16 word sequence. The iterator always halts at the string boundaries. When stepped beyond a string boundary, the iterator switches to an invalid state and starts delivering zero code points.

If placed at a string's terminating zero character it is possible to step backwards into the string.

Illegal code sequences are stepped over without raising an error, although illegal code points may be delivered in this case. In both forward and backward iteration, at least one illegal code point is delivered for each ill-coded word.
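The decoding behavior described above can be illustrated with a minimal, self-contained sketch. It does not use Utf16Iterator and is not the library's implementation; `Decoded`, `decodeAt()` and `countCodePoints()` are hypothetical names introduced only for this illustration. The sketch assumes that an unpaired surrogate is delivered unchanged as a single (illegal) code point, which is one plausible reading of the paragraph above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type (not part of CoreComponents): a decoded code point
// plus the number of 16-bit words it occupies in the input.
struct Decoded {
    char32_t codePoint;
    std::size_t length; // 1 or 2 words
};

// Decode the code point starting at words[i].
inline Decoded decodeAt(const std::uint16_t *words, std::size_t count, std::size_t i)
{
    assert(i < count);
    const std::uint16_t w = words[i];
    const bool isHigh = (0xD800 <= w && w <= 0xDBFF);
    if (isHigh && i + 1 < count) {
        const std::uint16_t w2 = words[i + 1];
        if (0xDC00 <= w2 && w2 <= 0xDFFF) {
            // Well-formed surrogate pair: combine 10 + 10 payload bits.
            const char32_t cp =
                0x10000 + ((char32_t(w - 0xD800) << 10) | char32_t(w2 - 0xDC00));
            return {cp, 2};
        }
    }
    // BMP code point, or an unpaired surrogate passed through unchanged
    // (assumption: this mirrors "illegal code points are delivered").
    return {char32_t(w), 1};
}

// Count code points by stepping forward one code point at a time.
inline std::size_t countCodePoints(const std::uint16_t *words, std::size_t count)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < count; i += decodeAt(words, count, i).length) ++n;
    return n;
}
```

In this sketch each ill-coded word yields exactly one code point, so the number of delivered code points never exceeds the number of words in the input.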

Todo
create a test harness

Constructor & Destructor Documentation

◆ Utf16Iterator() [1/2]

Create an invalid iterator.

◆ Utf16Iterator() [2/2]

Utf16Iterator ( const std::uint16_t * start,
const std::uint16_t * end,
const std::uint16_t * pos,
ByteOrder endian = localEndian() )

Create a new iterator.

Parameters
start    Pointer to start of UTF-16 encoded string
end      Pointer to end of UTF-16 encoded string (behind last valid word)
pos      Current position within the UTF-16 encoded string
endian   Byte order
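To illustrate why a byte order parameter matters for UTF-16 input, the following sketch reads a single 16-bit word from two raw bytes in either order. `WordOrder` and `loadWord()` are hypothetical names used only here; `WordOrder` is a stand-in for the library's ByteOrder type, whose actual enumerators are not shown on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the library's ByteOrder type.
enum class WordOrder { LittleEndian, BigEndian };

// Assemble one UTF-16 word from two consecutive bytes in the given byte order.
inline std::uint16_t loadWord(const unsigned char *bytes, WordOrder order)
{
    assert(bytes != nullptr);
    return (order == WordOrder::LittleEndian)
        ? std::uint16_t(bytes[0] | (bytes[1] << 8))   // low byte first
        : std::uint16_t((bytes[0] << 8) | bytes[1]);  // high byte first
}
```

The same two bytes yield different words depending on the order, which is presumably why the constructor defaults to the local byte order and lets the caller override it for foreign data.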

Member Function Documentation

◆ operator++() [1/2]

Utf16Iterator & operator++ ( )

Prefix increment operator: step one code point forward

◆ operator--() [1/2]

Utf16Iterator & operator-- ( )

Prefix decrement operator: step one code point backward
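Stepping backward over UTF-16 requires checking whether the preceding word is the trailing half of a surrogate pair. The sketch below shows one way this can be done on a plain word array; `stepBack()` is a hypothetical helper and not the library's implementation. It assumes that a trailing surrogate without a matching leading surrogate counts as a single ill-coded word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: return the index at which the code point ending just
// before words[i] starts.
inline std::size_t stepBack(const std::uint16_t *words, std::size_t i)
{
    assert(i > 0);
    const std::uint16_t w = words[i - 1];
    if (0xDC00 <= w && w <= 0xDFFF && i >= 2) {
        const std::uint16_t w0 = words[i - 2];
        // A low surrogate preceded by a high surrogate forms one code point.
        if (0xD800 <= w0 && w0 <= 0xDBFF) return i - 2;
    }
    // BMP code point or unpaired surrogate: step back a single word.
    return i - 1;
}
```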

◆ operator++() [2/2]

Utf16Iterator operator++ ( int )

Postfix increment operator: step one code point forward and return old position.

◆ operator--() [2/2]

Utf16Iterator operator-- ( int )

Postfix decrement operator: step one code point backward and return old position.

◆ operator+=()

Utf16Iterator & operator+= ( ptrdiff_t d)

Assignment addition operator: iterate in forward direction a given distance d.

◆ operator-=()

Utf16Iterator & operator-= ( ptrdiff_t d)

Assignment subtraction operator: iterate in backward direction a given distance d.

◆ operator+() [1/2]

Utf16Iterator operator+ ( ptrdiff_t d) const

Addition operator: get iterator in forward direction at distance d.

◆ operator-() [1/2]

Utf16Iterator operator- ( ptrdiff_t d) const

Subtraction operator: get iterator in backward direction at distance d.

◆ operator-() [2/2]

std::ptrdiff_t operator- ( const Utf16Iterator & b) const

Difference operator: compute distance in number of characters.

◆ operator bool()

operator bool ( ) const
explicit

Cast to bool operator: indicate if this iterator can step forward another code point.

◆ operator*()

char32_t operator* ( ) const

Dereference operator: decode current code point.

◆ operator+() [2/2]

size_t operator+ ( ) const

Unary plus operator: return the current decoding position as a byte offset.
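A byte offset differs from both the word index and the code point index, because every UTF-16 word occupies two bytes and a code point may span two words. The following sketch computes a byte offset from two word pointers; `byteOffset()` is a hypothetical helper, and it assumes the offset is measured from the start of the string, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte offset of pos relative to start
// (assumption: offsets are counted from the start of the string).
inline std::size_t byteOffset(const std::uint16_t *start, const std::uint16_t *pos)
{
    assert(start <= pos);
    return std::size_t(pos - start) * sizeof(std::uint16_t);
}
```

For example, a position just behind one BMP character followed by one surrogate pair is at word index 3, code point index 2, and byte offset 6.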

◆ operator==()

bool operator== ( const Utf16Iterator & b) const

Compare for equality.

◆ operator!=()

bool operator!= ( const Utf16Iterator & b) const

Compare for inequality.