### Bayesian Programming

Probability as an extension of logic

Introduction

# A need for new inference algorithms

A modeling methodology is not sufficient to run Bayesian programs. We also require an efficient Bayesian inference engine to automate the probabilistic calculus. This assumes we have a collection of inference algorithms adapted and tuned to more or less specific models and a software architecture to combine them in a coherent and unique tool.

Numerous such Bayesian inference algorithms have been proposed in the literature. The purpose of this book is not to present these different computing techniques and their associated models once more. Instead, we offer a synthesis of this work and a number of bibliographic references for those who would like more detail on these subjects.

Chapters 12 to 15 are dedicated to that.

## Chapter 12 — Bayesian Programming Formalism:

The purpose of this chapter is to present Bayesian Programming formally and to demonstrate that it is very simple and very clear but, nevertheless, very powerful and very subtle. Probability is an extension of logic, as mathematically sane and simple as logic, but with more expressive power than logic.

It may seem unusual to present the formalism at the end of the book. We have done this to help comprehension and to assist intuition without sacrificing rigor. After reading this chapter, anyone can check that all the examples and programs presented earlier comply with the formalism.

## Chapter 13 — Bayesian Models Revisited:

The goal of this chapter is to review the main probabilistic models currently used.

We systematically use the Bayesian Programming formalism to present these models, because it is precise and concise, and it simplifies their comparison. We mainly concentrate on the definition of these models. Discussions about inference and computation are postponed to Chapter 14 and discussions about learning and identification are postponed to Chapter 15.

We chose to divide the different probabilistic models into three categories: the general purpose probabilistic models, the engineering oriented probabilistic models, and the cognitive oriented probabilistic models.

In the first category, the modeling choices are made independently of any specific knowledge about the modeled phenomenon. Most of the time, these choices are essentially made to keep the inference tractable. However, the technical simplifications of these models may be compatible with large classes of problems and consequently may have numerous applications.

In the second category, on the contrary, the modeling choices and simplifications are decided according to some specific knowledge about the modeled phenomenon. These choices could eventually lead to very poor models from a computational viewpoint. However, most of the time, problem-dependent knowledge, such as conditional independence between variables, leads to very significant and effective simplifications and computational improvements.

Finally, in the cognitive-oriented probabilistic models category, different models are presented according to a cognitive classification where common cognitive problems are linked to common Bayesian solutions.

Several of these models were already presented with more detail in the previous chapters. Certain models will appear several times in different categories but presented with a different point of view for each presentation. We think that these repetitions are useful as our goal in this chapter is to give a synthetic overview of all these models.

## Chapter 14 — Bayesian Inference Algorithms Revisited:

This chapter surveys the main available general purpose algorithms for Bayesian inference.

It is well known that general Bayesian inference is a very difficult problem, which may be practically intractable. Exact inference has been proved to be NP-hard [Cooper, 1990], as has the general problem of approximate inference [Dagum and Luby, 1993].

Numerous heuristics and restrictions to the generality of possible inferences have been proposed to achieve admissible computation time. The purpose of this chapter is to make a short review of these heuristics and techniques.

Before starting to crunch numbers, it is usually possible (and wise) to make some symbolic computations to reduce the amount of numerical computation required. The first section of this chapter presents the different possibilities. We will see that these symbolic computations can be either exact or approximate.

Once simplified, the expression obtained must be numerically evaluated. In a few cases exact (exhaustive) computation may be possible thanks to the previous symbolic simplification, but most of the time, even with the simplifications, only approximate calculations are possible. The second section of this chapter describes the principles of the main algorithms.

## Chapter 15 — Bayesian Learning Revisited:

In Chapter 4 we have seen how data are used to transform a “specification” into a “description”: the free parameters of the distributions are instantiated with the data making the joint distribution computable for any value of the variables. This identification process may be considered as a learning mechanism allowing the data to shape the description before any inferences could be made. In this chapter, we consider learning problems in more detail and show how some of them may be expressed as special instances of Bayesian programs.