Stemming with Snowball
Speaker: Olly Betts
Track: Internationalization, Localization and Accessibility
Type: Short talk (20 minutes)
Room: Petit amphi
Time: Jul 18 (Fri): 11:30
Duration: 0:20
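Search is pervasive in the modern world, and for many human languages an effective search feature benefits significantly from stemming: reducing the inflected forms of a word to a common stem, so that a search for “connect” also matches “connected” and “connecting”.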
Snowball provides stemming for 30 human languages, and can generate code for 9 programming languages, with further programming languages supported via bindings to the generated C code. It’s widely used behind the scenes, including by Lucene, PostgreSQL and Xapian. If you’ve used the search features on the Debian website, lists or wiki, you’ve likely used Snowball.
I’ll give an introduction to Snowball covering what it can do for you, what it can’t do for you, and what you might be able to do for Snowball if you’re interested.