OLIVER SERANG

"If you want your car to get fifty miles per gallon, fine. You can retool your car a little bit. But if I tell you it has to run on a gallon of gas for five hundred miles, you have to start over."

-Laszlo Block

EVERGREENFOREST INFERENCE ENGINE

EVERGREEN (EN) /
VINTERGRÖNA (SW) /
常緑 (JP) /
À FEUILLES PERSISTANTES (FR) /
SEMPRE-VIVA (PR) /
IMMERGRÜN (DE) /
SEMPERVIRENT (LN):

adjective |

1 Perennially fresh or interesting; enduring

2 An engine for Bayesian inference built around convolution trees.
It is built to withstand models and data with large treewidth / high degree dependencies.

The EvergreenForest Inference engine is an environment of next-generation inference algorithms for

rapidly developing statistical models /
breaking down large sums of random variables /
uncovering patterns from high-throughput scientific experiments /
playing with data and discovering fun & awesome things!

ABOUT THE EVERGREENFOREST

COME RIGHT IN, WE'RE OPEN!

EvergreenForest is written as an open-source, header-only C++11 library, so simply #include "Evergreen/evergreen.hpp" and you're all set. It has been tested with both g++ and clang++.

YOUR TABLE AWAITS

The InferenceGraphBuilder type makes it easier to build models: simply insert Dependency objects into an InferenceGraphBuilder, and it will do the rest.

Pre-programmed Dependency types include:

TableDependency: For entering discrete distributions. These distributions can have arbitrary numbers of dimensions, and are constructed using the Tensor class from the TRIOT library.

AdditiveDependency: For constraints of the form $$Y = X_1 + X_2 + \cdots + X_n.$$ For multivariate problems, users can approximate by adding multiple dependencies $$A = C_1 + C_2 + \cdots + C_n\\ B = D_1 + D_2 + \cdots + D_n$$ or users can use the less efficient but exact true multidimensional dependency $$(A,B) = (C_1,D_1) + (C_2,D_2) + \cdots + (C_n,D_n).$$ The multidimensional version will be solved as a multidimensional convolution tree in the model.

ConstantMultiplierDependency: For constraints of the form $$Y = 0.73 \times X.$$ Like the AdditiveDependency, multivariate dependencies of this form can be efficiently approximated as multiple dependencies $$A = 0.9 \times C\\ B = 101.5 \times D$$ or encoded exactly as a true multidimensional dependency $$(A,B) = (0.9, 101.5) \times (C,D).$$
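To make the additive constraint concrete, here is a minimal standalone sketch of what it means for discrete distributions: the PMF of a sum of independent variables is the convolution of their PMFs, and a sum of many variables can be combined pairwise in a balanced tree. The helper names add_pmfs and sum_pmfs are hypothetical and are not part of the EvergreenForest API. The sketch uses naive quadratic convolution and only the forward (inputs to sum) direction, whereas the engine described here uses convolution trees with faster convolution and passes messages in both directions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of EvergreenForest.
// PMF of Y = X1 + X2 for independent X1 and X2, where pmf[i] = Pr(X = i)
// and both supports start at 0. Naive O(n*m) convolution.
std::vector<double> add_pmfs(const std::vector<double>& x1,
                             const std::vector<double>& x2) {
  if (x1.empty() || x2.empty()) return std::vector<double>();
  std::vector<double> y(x1.size() + x2.size() - 1, 0.0);
  for (std::size_t i = 0; i < x1.size(); ++i)
    for (std::size_t j = 0; j < x2.size(); ++j)
      y[i + j] += x1[i] * x2[j];  // the outcome i + j collects Pr(i) * Pr(j)
  return y;
}

// Hypothetical helper, not part of EvergreenForest.
// PMF of Y = X_1 + ... + X_n, combining neighbors pairwise, layer by layer,
// so that the additions form a balanced binary tree.
std::vector<double> sum_pmfs(std::vector<std::vector<double> > layer) {
  // An empty sum equals 0 with probability 1.
  if (layer.empty()) return std::vector<double>(1, 1.0);
  while (layer.size() > 1) {
    std::vector<std::vector<double> > next;
    for (std::size_t i = 0; i + 1 < layer.size(); i += 2)
      next.push_back(add_pmfs(layer[i], layer[i + 1]));
    // An unpaired last node is carried up to the next layer unchanged.
    if (layer.size() % 2 == 1) next.push_back(layer.back());
    layer = next;
  }
  return layer[0];
}
```

For example, passing three fair-coin PMFs (each with probability 0.5 on 0 and 0.5 on 1) to sum_pmfs yields the distribution of the number of heads in three tosses.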

You can also create your own derived Dependency types and the underlying MessagePasser types that specify how they perform inference.

SCHEDULING WITHOUT RESERVATIONS

A constructed InferenceGraph (built manually or with an InferenceGraphBuilder) can be solved with loopy belief propagation. Message passing in loopy belief propagation can be performed manually or it can be performed automatically by the Scheduler type. Together, these offer the ability to perform efficient prototyping for problems from multiple areas of work.

Included derived types of Scheduler:

FIFOScheduler: Simple and lightweight, this scheduler passes messages in a lazy manner, which can achieve greater performance from trimmed convolution trees.

PriorityScheduler: A solid all-purpose scheduler that passes the message that has changed most since it was last passed.

RandomSubtreeScheduler: Achieves good performance on tree-like graphs (such as HMMs) by building two random subtrees of the original InferenceGraph and then alternating between a full pass on each tree.

And if a use-case would benefit from a new type of Scheduler, the object-oriented design means that new types derived from Scheduler can be constructed by users on a problem-specific basis.
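The selection rule described for the PriorityScheduler (pass the message that has changed most since it was last passed) can be sketched in a few lines. Everything below is an illustrative assumption rather than the library's implementation: EdgeState, change, next_edge, and mark_passed are hypothetical names, the change is measured as the largest absolute entry-wise difference, and a linear scan stands in for whatever priority structure a real scheduler would use.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical record for one directed edge (not an EvergreenForest type).
struct EdgeState {
  std::vector<double> current;      // message as it would be sent now
  std::vector<double> last_passed;  // message as it was last sent
};

// Largest absolute entry-wise change since the edge was last passed.
double change(const EdgeState& e) {
  double d = 0.0;
  const std::size_t n = std::min(e.current.size(), e.last_passed.size());
  for (std::size_t i = 0; i < n; ++i)
    d = std::max(d, std::fabs(e.current[i] - e.last_passed[i]));
  return d;
}

// Index of the edge whose message changed most, or -1 if no edge changed
// by more than tol (a simple convergence criterion).
long next_edge(const std::vector<EdgeState>& edges, double tol) {
  long best = -1;
  double best_change = tol;
  for (std::size_t i = 0; i < edges.size(); ++i) {
    const double d = change(edges[i]);
    if (d > best_change) {
      best_change = d;
      best = static_cast<long>(i);
    }
  }
  return best;
}

// Record that the edge's current message has now been sent.
void mark_passed(EdgeState& e) { e.last_passed = e.current; }
```

A driver loop would repeatedly call next_edge, pass that message (which updates the current messages on neighboring edges), call mark_passed, and stop once next_edge returns -1.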

MADE WITH NATURAL AND ARTIFICIAL INTELLIGENCE

EvergreenForest includes key technologies not yet implemented anywhere else: lazy, trimmed convolution trees, lazy p-norm rings approximation to p-convolution, from-scratch PMF implementations using TRIOT, TRIOT convolution for small problems, template-recursive FFT convolution for large problems, and a novel template-recursive & cache-oblivious algorithm for bit-reversal permutation.
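As a rough illustration of the idea behind p-convolution, the sketch below evaluates the p-norm form directly: each output entry is the p-norm of the products that contribute to it. With p = 1 this is ordinary convolution, and as p grows it approaches max-convolution (the largest contributing product). The function name p_convolve is hypothetical, and the quadratic loop is for clarity only; it says nothing about how the library's lazy, FFT-based implementation is organized.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of EvergreenForest.
// r[k] = ( sum over i + j = k of (a[i] * b[j])^p )^(1/p), for nonnegative a, b.
// p = 1 gives standard convolution; large p approximates max-convolution.
std::vector<double> p_convolve(const std::vector<double>& a,
                               const std::vector<double>& b, double p) {
  if (a.empty() || b.empty()) return std::vector<double>();
  std::vector<double> r(a.size() + b.size() - 1, 0.0);
  // Accumulate the p-th powers of every contributing product.
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t j = 0; j < b.size(); ++j)
      r[i + j] += std::pow(a[i] * b[j], p);
  // Take the p-th root to finish the p-norm.
  for (std::size_t k = 0; k < r.size(); ++k)
    r[k] = std::pow(r[k], 1.0 / p);
  return r;
}
```

The p-norm never underestimates the true maximum, and the overestimate shrinks as p grows. Very large p can underflow when the entries are small, which is one reason practical implementations need more care than this sketch takes.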

CLASSICAL MODERN: A SYMPHONY IN C(++11)

EvergreenForest is split into modules, including a module for the TRIOT tensor library, a module for FFT, a module for p-convolution (the fastest implementation in existence), a module for probability mass functions (PMFs), a module with the core engine components (such as graphs, schedulers, etc.), and the Evergreen module with wrappers (e.g., InferenceGraphBuilder types). These can be used all together in harmony via the header Evergreen/evergreen.hpp or can be used as standalone libraries in other projects. When used in a standalone manner, each module is still header-only, and only requires one header to be included.

A PLAYFUL SAMPLER OF COLD FUSION CUISINE

The source library includes a few demos, which span from simple illustrations (manually building an HMM to locate GC-rich regions of a genome) all the way to complex and new approaches to classic problems (elemental quantification with shared isotope peaks).
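For readers unfamiliar with the GC-rich HMM problem, here is a small standalone sketch of the underlying model, written without EvergreenForest. It runs forward filtering on a two-state HMM (background versus GC-rich) over a DNA string. The function name gc_rich_posterior and all parameter values are illustrative assumptions and are not taken from the bundled demo, which builds the model as an inference graph instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of EvergreenForest.
// Forward filtering for a two-state HMM: state 0 = background, 1 = GC-rich.
// Returns, for each position t, Pr(state_t = GC-rich | bases 0..t).
std::vector<double> gc_rich_posterior(const std::string& seq) {
  // Illustrative parameters (assumptions, not values from the demo).
  const double stay = 0.9;            // Pr(state unchanged between bases)
  const double p_gc[2] = {0.4, 0.8};  // Pr(base is G or C | state)
  double f[2] = {0.5, 0.5};           // belief over states before any base

  std::vector<double> out;
  for (std::size_t t = 0; t < seq.size(); ++t) {
    const bool gc = (seq[t] == 'G' || seq[t] == 'C');
    // Transition step: propagate the previous belief one position forward.
    double pred[2];
    pred[0] = stay * f[0] + (1.0 - stay) * f[1];
    pred[1] = (1.0 - stay) * f[0] + stay * f[1];
    // Emission step: weight each state by the likelihood of this base.
    for (int s = 0; s < 2; ++s)
      pred[s] *= gc ? p_gc[s] : (1.0 - p_gc[s]);
    // Normalize so the two state probabilities sum to 1.
    const double norm = pred[0] + pred[1];
    f[0] = pred[0] / norm;
    f[1] = pred[1] / norm;
    out.push_back(f[1]);
  }
  return out;
}
```

Positions where the returned value stays high mark candidate GC-rich regions. A full analysis would also run a backward pass so that each position is informed by the bases on both sides.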

TAKE IT TO GO

An unrestrictive MIT software license means that the EvergreenForest library and its constituent modules can be used easily in your project.