Enum textwrap::WordSeparator
pub enum WordSeparator {
AsciiSpace,
UnicodeBreakProperties,
Custom(fn(line: &str) -> Box<dyn Iterator<Item = Word<'_>> + '_>),
}
Describes where words occur in a line of text.
The simplest approach is to say that words are separated by one or more ASCII spaces (' '). This works for Western languages without emojis. A more complex approach is to use the Unicode line breaking algorithm, which finds break points in non-ASCII text.
Line breaks occur between words. See WordSplitter for options on how to handle hyphenation of individual words.
Examples
use textwrap::core::Word;
use textwrap::WordSeparator::AsciiSpace;
let words = AsciiSpace.find_words("Hello World!").collect::<Vec<_>>();
assert_eq!(words, vec![Word::from("Hello "), Word::from("World!")]);
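The limitation of space-based splitting can be illustrated without the crate. The following is a minimal std-only sketch; `count_space_separated` is a hypothetical helper written for this illustration and is not part of textwrap. It shows that text written without spaces, such as CJK text or a run of emojis, ends up as a single chunk, so a space-based separator finds no break opportunities inside it.

```rust
/// Hypothetical helper (not part of textwrap): counts the non-empty
/// chunks obtained by splitting `line` on ASCII spaces.
fn count_space_separated(line: &str) -> usize {
    line.split(' ')
        .filter(|chunk| !chunk.is_empty())
        .count()
}
```

Under this sketch, "Hello World!" yields two chunks, while "你好世界" and "😂😍" each yield one.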
Variants
AsciiSpace
Find words by splitting on runs of ' ' characters.
Examples
use textwrap::core::Word;
use textwrap::WordSeparator::AsciiSpace;
let words = AsciiSpace.find_words("Hello World!").collect::<Vec<_>>();
assert_eq!(words, vec![Word::from("Hello "),
Word::from("World!")]);
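To make "splitting on runs of ' '" concrete, here is a minimal std-only sketch of the idea. It is an illustration under stated assumptions, not the crate's implementation: `split_on_ascii_spaces` is a hypothetical name, and it returns plain string slices rather than `Word` values. Each chunk keeps its trailing run of spaces, which mirrors the "Hello " word in the example above.

```rust
/// Hypothetical sketch (not textwrap's implementation): splits `line`
/// into chunks, where each chunk is a word followed by the run of
/// ASCII spaces that comes after it.
fn split_on_ascii_spaces(line: &str) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut in_whitespace = false;

    for (idx, ch) in line.char_indices() {
        // A non-space character right after a run of spaces starts a
        // new chunk, so the previous chunk ends here.
        if in_whitespace && ch != ' ' {
            chunks.push(&line[start..idx]);
            start = idx;
        }
        in_whitespace = ch == ' ';
    }

    // Emit whatever remains after the last boundary.
    if start < line.len() {
        chunks.push(&line[start..]);
    }
    chunks
}
```

Only U+0020 counts as a separator in this sketch; tabs and non-breaking spaces stay inside the chunk they appear in.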
UnicodeBreakProperties
Split line into words using Unicode break properties.
This word separator uses the Unicode line breaking algorithm described in Unicode Standard Annex #14 to find legal places to break lines. There is a small difference in that the U+002D (Hyphen-Minus) and U+00AD (Soft Hyphen) don’t create a line break: to allow a line break at a hyphen, use WordSplitter::HyphenSplitter.
Soft hyphens are not currently supported.
Examples
Unlike WordSeparator::AsciiSpace, the Unicode line breaking algorithm will find line break opportunities between some characters with no intervening whitespace:
#[cfg(feature = "unicode-linebreak")] {
use textwrap::core::Word;
use textwrap::WordSeparator::UnicodeBreakProperties;
assert_eq!(UnicodeBreakProperties.find_words("Emojis: 😂😍").collect::<Vec<_>>(),
vec![Word::from("Emojis: "),
Word::from("😂"),
Word::from("😍")]);
assert_eq!(UnicodeBreakProperties.find_words("CJK: 你好").collect::<Vec<_>>(),
vec![Word::from("CJK: "),
Word::from("你"),
Word::from("好")]);
}
A U+2060 (Word Joiner) character can be inserted if you want to manually override the defaults and keep the characters together:
#[cfg(feature = "unicode-linebreak")] {
use textwrap::core::Word;
use textwrap::WordSeparator::UnicodeBreakProperties;
assert_eq!(UnicodeBreakProperties.find_words("Emojis: 😂\u{2060}😍").collect::<Vec<_>>(),
vec![Word::from("Emojis: "),
Word::from("😂\u{2060}😍")]);
}
The Unicode line breaking algorithm will also automatically suppress line breaks around certain punctuation characters:
#[cfg(feature = "unicode-linebreak")] {
use textwrap::core::Word;
use textwrap::WordSeparator::UnicodeBreakProperties;
assert_eq!(UnicodeBreakProperties.find_words("[ foo ] bar !").collect::<Vec<_>>(),
vec![Word::from("[ foo ] "),
Word::from("bar !")]);
}
Custom(fn(line: &str) -> Box<dyn Iterator<Item = Word<'_>> + '_>)
Find words using a custom word separator.
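As a rough illustration of the function shape this variant expects, the sketch below splits a line after every '/' so that each path segment becomes its own word. It is written to compile without the crate, so `Word` here is a simplified, hypothetical stand-in for `textwrap::core::Word`, and `split_path_segments` is an invented name. With the real crate, one would presumably build the words with `Word::from` and pass the function as `WordSeparator::Custom(split_path_segments)`.

```rust
/// Simplified stand-in for `textwrap::core::Word`, used only so this
/// sketch is self-contained. The real type carries more information.
#[derive(Debug, PartialEq)]
struct Word<'a> {
    text: &'a str,
}

/// Hypothetical custom separator: every '/'-terminated segment of
/// `line` becomes one word, and the '/' stays attached to the segment
/// it ends.
fn split_path_segments(line: &str) -> Box<dyn Iterator<Item = Word<'_>> + '_> {
    Box::new(line.split_inclusive('/').map(|text| Word { text }))
}
```

Because the variant holds a plain `fn` pointer rather than a closure, the separator cannot capture state from its environment; any configuration has to be baked into the function itself.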
Implementations
impl WordSeparator
Trait Implementations
impl Clone for WordSeparator
fn clone(&self) -> WordSeparator
1.0.0 · fn clone_from(&mut self, source: &Self)