
For a long time, it was impossible to imagine that a computer could accurately classify and segment images, summarize or generate text, or play strategic computer games at a superhuman level. Recently, artificial neural networks have started to excel at these tasks, often outperforming both humans and alternative methods. Artificial neural networks are a family of machine learning algorithms capable of approximating functions by extracting increasingly complex hierarchical representations from data.

This thesis aims to present key results in the approximation theory of artificial neural networks, assuming only undergraduate mathematics. This research field studies necessary and sufficient conditions under which neural networks can approximate an arbitrary function belonging to a particular family, where the notion of approximation is formalized within a function space. Theorems addressing these questions are known as universal approximation theorems. We will state and prove several universal approximation theorems for continuous functions on compact sets. These results will then be generalized to spaces of Lebesgue integrable and square-integrable functions. We will also discuss the universal approximation of Borel measurable functions in a probabilistic sense. We will conclude with an experimental study of the relationship between the established theoretical results and practical applications.