Natural Language Processing in Golang
Libraries for working with human languages.
- dpar - Transition-based statistical dependency parser.
- go-eco - Similarity, dissimilarity and distance matrices; diversity, equitability and inequality measures; species richness estimators; coenocline models.
- go-i18n - A package and an accompanying tool to work with localized text.
- go-mystem - CGo bindings to Yandex.Mystem, a Russian morphological analyzer.
- go-nlp - Utilities for working with discrete probability distributions and other tools useful for doing NLP work.
- go-stem - Implementation of the Porter stemming algorithm.
- go-unidecode - ASCII transliterations of Unicode text.
- go2vec - Reader and utility functions for word2vec embeddings.
- gojieba - Go implementation of jieba, a Chinese word splitting algorithm.
- golibstemmer - Go bindings for the Snowball libstemmer library, including Porter 2.
- gounidecode - Unicode transliterator (also known as unidecode) for Go.
- icu - Cgo binding for the detection and conversion functions of the icu4c C library. Guaranteed compatibility with version 50.1.
- libtextcat - Cgo binding for the libtextcat C library. Guaranteed compatibility with version 2.2.
- MMSEGO - Go implementation of MMSEG, a Chinese word splitting algorithm.
- paicehusk - Go implementation of the Paice/Husk stemming algorithm.
- porter - This is a fairly straightforward port of Martin Porter’s C implementation of the Porter stemming algorithm.
- porter2 - Really fast Porter 2 stemmer.
- prose - A library for text processing that supports tokenization, part-of-speech tagging, named-entity extraction, and more.
- RAKE.go - A Go port of the Rapid Automatic Keyword Extraction (RAKE) algorithm.
- segment - A Go library for performing Unicode text segmentation as described in Unicode Standard Annex #29.
- sentences - A sentence tokenizer: converts text into a list of sentences.
- snowball - Snowball stemmer port (cgo wrapper) for Go. Provides word stem extraction using the native Snowball library.
- stemmer - Stemmer packages for the Go programming language. Includes English and German stemmers.
- textcat - A Go package for n-gram based text categorization, with support for UTF-8 and raw text.
- whatlanggo - A natural language detection package for Go. Supports 84 languages and 24 scripts (writing systems such as Latin and Cyrillic).
- when - A natural language date/time parser for English and Russian, with pluggable rules.