Welcome to pressagio’s documentation!

Pressagio is a library that predicts text based on n-gram models. For example, you can send a string and the library will return the most likely word completions for the last token in the string.
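To make the idea of n-gram completion concrete, here is a minimal sketch in plain Python. It does not use pressagio and is not how the library is implemented. It only counts bigrams in a toy corpus and ranks the words that follow the previous token and start with the partially typed last token. The function names build_bigram_counts and complete_last_token, and the toy corpus, are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count how often each word follows each preceding word."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for previous, current in zip(tokens, tokens[1:]):
        counts[previous][current] += 1
    return counts


def complete_last_token(counts, phrase, limit=5):
    """Return up to `limit` completions for the last token of `phrase`."""
    tokens = phrase.lower().split()
    # A trailing space (or an empty phrase) means the last token is still empty.
    prefix = "" if phrase.endswith(" ") or not tokens else tokens.pop()
    if not tokens:
        return []  # no preceding word to condition on
    followers = counts.get(tokens[-1], Counter())
    # most_common() yields the followers from most to least frequent.
    ranked = [word for word, _ in followers.most_common() if word.startswith(prefix)]
    return ranked[:limit]


# Hypothetical toy corpus, not part of pressagio.
corpus = "the cat sat on the mat and the cat ate the cake"
model = build_bigram_counts(corpus)
print(complete_last_token(model, "on the c"))
```

A real model would typically combine several n-gram orders and smooth the counts. This sketch uses only bigrams to keep the ranking step visible.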

Example Usage

The repository contains two example scripts in the folder example that demonstrate how to build a language model and how to use that model for prediction. Read the code of those two scripts to see how to use pressagio in your own projects. The following steps use the two scripts to predict the next word in a phrase.

First, you have to build a language model. We will use the script example/text2ngram.py to add the 1-, 2- and 3-grams of a given text to an SQLite database. For demonstration purposes we will use a simple text file that comes with pressagio’s tests. Run the script three times, once per n-gram size, to create a table for each size:

$ python example/text2ngram.py -n 1 -o test.sqlite tests/test_data/der_linksdenker.txt
$ python example/text2ngram.py -n 2 -o test.sqlite tests/test_data/der_linksdenker.txt
$ python example/text2ngram.py -n 3 -o test.sqlite tests/test_data/der_linksdenker.txt

This creates a file test.sqlite in the current directory. We can now use this database to get a prediction for a phrase. We will use the script example/predict.py, which reads the configuration file example/example_profile.ini. Note that you always need a configuration file if you want to use the built-in predictor. To get a prediction, run:

$ python example/predict.py
['warm', 'der', 'und', 'die', 'nicht']

The script prints nothing but the list of predicted words.
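The two steps above, storing n-gram counts in an SQLite database and then querying it for likely next words, can be sketched with Python's standard sqlite3 module. This is an illustration only: the table name bigram, its columns, and the helper functions store_bigrams and predict_next are hypothetical and are not taken from pressagio's own database layout or scripts.

```python
import sqlite3


def store_bigrams(connection, text):
    """Count the bigrams in `text` and add them to a hypothetical `bigram` table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS bigram ("
        "previous TEXT, word TEXT, occurrences INTEGER, "
        "PRIMARY KEY (previous, word))"
    )
    tokens = text.lower().split()
    for previous, word in zip(tokens, tokens[1:]):
        # Make sure a row exists for the pair, then increment its counter.
        connection.execute(
            "INSERT OR IGNORE INTO bigram (previous, word, occurrences) "
            "VALUES (?, ?, 0)",
            (previous, word),
        )
        connection.execute(
            "UPDATE bigram SET occurrences = occurrences + 1 "
            "WHERE previous = ? AND word = ?",
            (previous, word),
        )
    connection.commit()


def predict_next(connection, previous, limit=5):
    """Return the words seen most often after `previous`, most frequent first."""
    rows = connection.execute(
        "SELECT word FROM bigram WHERE previous = ? "
        "ORDER BY occurrences DESC, word ASC LIMIT ?",
        (previous.lower(), limit),
    )
    return [row[0] for row in rows]


# An in-memory database keeps the sketch free of side effects.
connection = sqlite3.connect(":memory:")
store_bigrams(connection, "the cat sat on the mat and the cat ate the cake")
print(predict_next(connection, "the"))
```

Passing a file name such as test.sqlite to sqlite3.connect instead of ":memory:" would persist the counts between runs, which is the role the database file plays in the example scripts.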

Running the tests

$ python -m unittest discover
