This function uses the Hunspell stemmer to stem a vector of words. It uses the Brazilian Portuguese dictionary by default and, unlike hunspell::hunspell_stem, returns only one stem per word.

stem_modified_hunspell(words, complete = TRUE)

Arguments

words

character vector of words to be stemmed

complete

whether words must be completed or not (default: TRUE)

Details

The rslp stemmer is then applied to the Hunspell-stemmed result.

Because hunspell_stem can return several stems for each word, the function keeps, for each word, the stem that appears most often in the vector.
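
The selection step described above can be sketched in base R. This is only an illustration of the idea, not the package's actual implementation: the helper name pick_most_frequent_stem is hypothetical, the candidate lists are written by hand rather than taken from real hunspell::hunspell_stem output, and both the tie-breaking rule (first candidate wins) and the NA returned for words without candidates are assumptions.

```r
# Hypothetical helper, not part of ptstem: pick one stem per word from a list
# of candidate stems, preferring the candidate seen most often overall.
pick_most_frequent_stem <- function(candidates) {
  # Count how often each candidate stem occurs across all words
  counts <- table(unlist(candidates))
  vapply(candidates, function(stems) {
    # Assumption: words with no candidate stems yield NA
    if (length(stems) == 0) return(NA_character_)
    # which.max returns the first maximum, so ties go to the earlier candidate
    stems[which.max(counts[stems])]
  }, character(1))
}

# Hand-written candidate lists standing in for hunspell_stem output
candidates <- list(
  c("gosto", "gostar"),
  "gostar",
  "gostar",
  "casa"
)
pick_most_frequent_stem(candidates)
#> [1] "gostar" "gostar" "gostar" "casa"
```

The first word has two candidates, and "gostar" wins because it occurs three times across the whole vector while "gosto" occurs once.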

Examples

words <- c("balões", "aviões", "avião", "gostou", "gosto", "gostaram")
ptstem:::stem_modified_hunspell(words)
#> [1] "balões" "aviões" "aviões" "gostou" "gostou" "gostou"