#2021s #Biggest #Breakthroughs #Math #Computer #Science

In the 1950s, scientists programmed a first-generation computer to learn using the same rules as the human brain. This model is called a neural network. It’s made up of many basic units that compute by taking inputs from other units, performing simple calculations on them, and passing the results on. The answers are recorded by designated output units.
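As a rough illustration of what one such unit does, here is a minimal Python sketch. The weighted sum, the sigmoid squashing function, and every number in it are assumptions chosen for illustration; they are not taken from the 1950s models mentioned above.

```python
import math

def unit(inputs, weights, bias):
    """One basic unit: weigh the incoming values, add them up, squash the result."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid keeps the output between 0 and 1

# Two hidden units read the same raw inputs (all values are hypothetical).
raw_inputs = [0.5, -1.0]
hidden = [
    unit(raw_inputs, [1.0, 2.0], 0.0),
    unit(raw_inputs, [-1.0, 0.5], 0.1),
]

# A designated output unit takes the hidden units' results as its own inputs.
output = unit(hidden, [1.0, 1.0], -1.0)
print(output)
```

Learning, in this picture, would amount to adjusting the weights and biases; the sketch only shows the forward computation.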

Pathways strengthen, and the model learns, when many different inputs result in the same output. Today, advanced versions of these models, called deep neural networks, are the most successful AI ever created. They power image and speech recognition and recognize patterns in massive sets of data, making astonishing predictions about the future.

But there’s a dark mystery here: We have no idea how deep neural networks work. What happens inside their many hidden layers, with billions of parameters, that allows them to converge to a solution? Now, for the first time ever, a group of researchers has found a mathematical key that might just open this black box.

Historically, a lot of methods in machine learning were designed to have simple properties, where the model designer was guaranteed that they would obtain a certain solution if they used a particular algorithm or model. Deep neural networks move away from that paradigm.

The model designer chooses how many units they would like to have in each one of the hidden layers. That’s often referred to as the width of the neural network.

Yasaman Bahri and her colleagues at Google’s Brain Team hoped to mathematically simplify a deep neural network by considering an extreme case: They took the network’s width to be infinite. One of the goals was to understand what happens in that limit of infinitely wide, deep neural networks to see if we could say anything concrete about it. Incredibly, they could.

Bahri mathematically reduced those infinite-width networks to something simpler called kernel machines, algorithms with a history going back to the 19th-century mathematician Carl Friedrich Gauss.

A kernel method captures a richness that is similar to a deep neural network in terms of its functional dependence on the input, but it’s simpler than a deep neural network because it’s linear in its parameters, and neural networks in general are non-linear.

Kernel machines find patterns in data by projecting the data into extremely high dimensions.

They map each point on a low-dimensional data set to a point in a higher dimension, and then use the data point’s characteristics to identify a class it belongs to. They do this classification using a geometric object called a hyperplane. Kernel machines can compute infinite-dimensional data while staying firmly in the realm of lower-dimensional space.

Bahri and her team were able to show that kernel methods are mathematically equivalent to an idealized version of a deep net – the first major step to cracking open the black box of deep learning.

It’s very rare to have an exact equivalence between two things, in particular if that other thing is exactly solvable. You can write down the expressions mathematically. From my perspective, that’s quite appealing.

If this equivalence can be extended beyond idealized neural networks, it may explain how practical deep neural networks achieve their astonishing results.

To describe finite-width networks, more work is needed, but at least if you have an exactly solvable model where you have that full handle, you can then start to be systematic about what’s missing. And so, to my mind, that gives you a starting point.
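The idea that a kernel machine works in a high-dimensional space while only ever touching the low-dimensional data can be made concrete with a small sketch. The example below uses a degree-2 polynomial kernel, picked here purely for illustration; it is not the kernel that arises from infinite-width networks, and the function names and sample points are hypothetical.

```python
import math

def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def feature_map(p):
    """Explicitly lift a 2-D point into a 3-D feature space."""
    x1, x2 = p
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(p, q):
    """Same inner product as in the lifted space, computed without ever lifting."""
    return dot(p, q) ** 2

p = (1.0, 2.0)
q = (3.0, -1.0)

explicit = dot(feature_map(p), feature_map(q))  # work done in 3-D
implicit = poly_kernel(p, q)                    # work done in 2-D
print(explicit, implicit)                       # equal up to rounding error
```

Here the lifted space has only three dimensions, so the shortcut saves little. For other standard kernels, such as the Gaussian kernel, the corresponding feature space is infinite-dimensional, and the shortcut is the only practical way to compute in it.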

There is a great debate happening in the world of set theory, the study of collections of numbers and other mathematical objects. It’s about the nature of infinity – or, rather, infinities. The remarkable thing about infinities is that they come in different sizes. Take two infinite sets of numbers.

If every number in one set can be paired with exactly one number in the other, with nothing left over on either side, the sets are said to be in bijection, or the same size. So it’s just not the case that all infinite sets are bijective with each other, that there’s a one-to-one correspondence between them.

You have actually lots of examples of sets such that one is strictly smaller than the other.

These questions about the sizes of infinities go back nearly 150 years, when the German mathematician Georg Cantor rocked the mathematical world by discovering something that seems counterintuitive: The infinite set of real numbers, which includes every point on the number line, outnumbers natural numbers – even though there are infinitely many of both. After all, you can try to pair up these sets, but you would have skipped over an infinite amount of real numbers between them.
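One standard way to see why any attempted pairing must miss real numbers is Cantor’s diagonal argument, which the narration does not spell out. The sketch below is a finite toy version of it: given any list of digit strings, it builds a new string that differs from the k-th entry in its k-th digit, so it cannot appear anywhere in the list. The function name and the sample digits are hypothetical.

```python
def diagonal_escape(listing):
    """Build a digit string that differs from the k-th row at position k."""
    # Choosing only 5 or 6 avoids the 0.4999... = 0.5000... ambiguity of decimals.
    return "".join("5" if row[k] != "5" else "6" for k, row in enumerate(listing))

# Pretend each row holds the first decimal digits of the real number
# paired with the natural numbers 0, 1, 2, ... (digits are made up).
listing = [
    "1415926",
    "7182818",
    "4142135",
    "5772156",
    "6180339",
    "3025850",
    "0000000",
]

escaped = diagonal_escape(listing)
print(escaped)
print(escaped in listing)  # False: the new string was skipped by the pairing
```

The real argument applies the same construction to an infinite list with infinitely long digit strings, which is why no pairing of the natural numbers with the real numbers can be complete.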

In other words, these infinities are not equal. The smaller one’s size, or cardinality, was classified as aleph-zero, and the next biggest size is aleph-one. Cantor conjectured that this aleph-one is exactly the size of the continuum of real numbers.

In other words, there are no sizes of infinity between the sets of natural and real numbers. This came to be known as the continuum hypothesis. But he could never prove it. The axioms, or basic foundations of mathematics, simply didn’t govern things like Cantor’s infinity problem.
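In the standard notation of set theory, which the narration itself does not use, the hypothesis can be written compactly. Here $2^{\aleph_0}$ is the usual symbol for the cardinality of the real numbers.

```latex
|\mathbb{N}| = \aleph_0, \qquad |\mathbb{R}| = 2^{\aleph_0}, \qquad
\text{continuum hypothesis:}\quad 2^{\aleph_0} = \aleph_1 .
```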

But this year, set theorist David Asperó and his longtime collaborator Ralf Schindler got closer to finding the answer anyway. We had the right scenario in place about 10 years ago, but there was a missing ingredient. And then it just came, just by chance, quite randomly.

The breakthrough came when Asperó used a technique, known as forcing, to create a mathematical object called a witness. He used this witness to verify that an extremely powerful axiom, Martin’s maximum plus plus, actually implies a rival axiom, star.

By unifying the two rival axioms, Asperó and Schindler showed that they are both likely true – implying that an extra size of infinity actually does sit between the sets of natural and real numbers. This would prove that there are more reals than Cantor had thought, finally bringing closure to the 150-year-old mystery and offering a coherent alternative to the continuum hypothesis.

But this is not even close to the end of the story. Newer and stronger axioms are already challenging this result and a battle is underway to decide which side is right. The final chapter on the actual number of real numbers – the true size of the continuum – is yet to be written.

This is a Liouville field–or, at least, an idea of it. It’s a wildly chaotic mathematical surface where the height of every point is chosen at random. Forty years ago, the theoretical physicist Alexander Polyakov found a striking use for these fields: as a model of quantum physics.

Polyakov intuited that they could do this by unlocking the behavior of theoretical objects called strings, and building a simplified model of quantum gravity in two dimensions.

It can be seen as a toy model for higher-dimensional cases, but it can also be seen as string theory, because string theory in some sense is about two-dimensional surfaces.

But Polyakov’s formula for understanding the Liouville field stopped tantalizingly short of being rigorous. Although he attempted to explain it using a path integral developed by Richard Feynman, the Liouville field resisted being described in this way.

Around the year 2013, we started looking at Polyakov’s path integral and according to what physicists were saying, if we could make sense of this path integral, we could directly construct in the continuum quantum gravity. So we took the path integral and we said, “Okay, let’s give it a shot. We’re going to try to define it.”

Mathematician Vincent Vargas and his colleagues set out to precisely describe Polyakov’s path integral using a different approach: probability theory. They began by transforming the Liouville field into a far milder object called a Gaussian free field.

It’s kind of this very rough landscape where there are infinite spikes everywhere. From the point of view of physics, the Gaussian free field is a boring theory. It’s a trivial theory where everything is computable straight away.

And what we realized and which came as a surprise, I think, to the physics community, is that we could express anything that you naturally want to compute on this theory just in terms of something on the Gaussian free field. There were other leads to build on as well.

In 1984, as a potential workaround to his failed path integral, Polyakov began trying to develop a technique called the bootstrap, a mathematical ladder that gradually builds up to a full, complex representation of a field.

By chance, two unrelated pairs of physicists in the 1990s built on Polyakov’s bootstrap and managed to completely solve the Liouville field. They called their formula “DOZZ.” Even though it seemed to work, it was a miraculously lucky guess, and they couldn’t prove it.

This formula, which comes out of a black box–I mean, it’s black magic–well, it actually has a probabilistic meaning.

So what we did is show that our path integral probability construction is equivalent, completely equivalent to that bootstrap construction. Last year, they unveiled a new and improved version of Polyakov’s path integral, defined in terms of the Gaussian free field. The work also explains the mysterious origins of the DOZZ formula.

We’re mapping probability theory to the bootstrap, or, if I were just talking on the mathematical side, we’re mapping probability theory to representation theory. And with that, they proved that the Liouville field models quantum gravity, exactly as Polyakov thought it would 40 years ago.

By bridging these two fields of math, which are really distinct a priori, it enables us to compute things that physicists don’t know how to compute.
