The first episode of our podcast, "Machine Learning Made Known." In it, Ola, Andrew, and Megan have a short interview with our Swiss machine learning engineer, Timo Rohner, who talks about Hilbert, Gödel, and deep learning.
“Radio Craftinity” – Episode 1: The History of Mathematics and Deep Learning
Hosts: Ola, Megan, Andrew
Ola: Hello again, dear listeners, and welcome to Radio Craftinity. It’s Ola, Megan and Andrew.
Andrew: Yo! And this is the next episode of our podcast, “Machine Learning Made Known.” I hope you have your cups of coffee ready, 'cause ours are filled to the top. Let’s begin by introducing our guest – a Machine Learning Engineer – Timo Rohner!
Timo: Hi guys.
Megan: So, Timo, one of your fields of interest is Deep Learning. It wasn’t until recently that this term became popularized. It’s definitely a hot topic, not only in the AI industry but outside the field as well.
Timo: Yeah, deep learning has gone from a niche field to a term that gets lumped in with the AI buzzword. Interest in deep learning has skyrocketed: journalists write articles about a topic they barely understand, and even everyone’s grandmother is talking about the endless possibilities and opportunities that AI offers.
Ola: Sure, can you tell us more about how we can use it and if it actually works?
Timo: The reason why deep learning is effective is multifaceted. To be a bit of a simpleton about it, it seems to work and it’s the best we’ve got when it comes to solving specific problems, such as, for example, complex object detection in images. But there’s a far more nuanced argument based on historical development to be made in favor of using Deep Learning. The beginning of the 20th century marked a very important time in mathematics-
Ola: Oh, right, right! Sorry for interrupting, but I think this should be a good time to add that our guest here has also a background in mathematics, right?
Megan: Well, he certainly dresses like he does.
Timo: Guys, you know I’m still here right?
Megan: Ok, ok, sorry for that – go on. I’ll try to keep my fellow hosts from interrupting.
Andrew: Hey, I’m innocent.
Ola: Oh yeah, sure, you're always innocent.
Timo: So! The beginning of the 20th century marked a very important time in mathematics. Due to the success of empirical science, especially in the 19th century, scientists demanded that mathematicians justify mathematics in a rigorous way without relying on Platonism – that is, without appealing to mathematical objects as really existing entities. Multiple schools of thought sought to make mathematics more rigorous, paradoxes kept appearing, and ultimately we got the foundational crisis of mathematics. Whitehead and Russell’s Principia Mathematica turned out to be a waste of time, and Hilbert’s program was proven unachievable by Gödel.
Andrew: Could you elaborate on David Hilbert and his challenge?
Timo: Sure! David Hilbert was a German mathematician, a professor first at Königsberg and later at Göttingen. He wanted mathematics to be grounded in formalism and to render the foundations of mathematics consistent and complete. Consistency means there are no contradictions in mathematics; completeness means that every true statement can be proven to be true. Both were shown to be deeply problematic by Gödel. With Gödel’s incompleteness theorems and Turing’s proof that the halting problem has no solution, a certain pessimism arose. The field’s focus shifted away from finding “perfect” analytical solutions and proving the consistency of formal systems toward numerical analysis and computer science. This development is very emblematic of how and why deep learning became such a powerful way of solving complex problems in the 21st century.
Megan: We have quite an interesting interplay of concepts within mathematics. So, how exactly does deep learning provide more effective solutions to so many hard problems?
Timo: Actually, Deep Learning can be seen as a sort of move away from finding “solutions” to specific problems to using a brute force approach to find a “solution” whose existence we presuppose without ever actually finding that solution or even knowing whether such a solution exists. In general, we move away from building complex programs that deal with very specific tasks, to writing programs that are very general and then we “teach” the program how to do a very specific thing, not by actually changing the code itself but by feeding the program with a large amount of data. In a way, the informational content was moved from the program and the code itself to training datasets.
Andrew: Could you elaborate more on the precise methodology? What's happening here?
Timo: Sure! We build a model that can be configured with a collection of parameters. Then we feed that model training data and adjust the parameters until the model fits the training data as well as possible. And then we sort of hope that the model has captured the characteristics of the training data that are needed to “learn” how to accomplish a specific goal. It’s only recently that we’ve really started using deep neural networks for complex tasks – only since around 2006 have we been able to leverage GPUs to optimize deep neural networks.
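Timo's description – a model configured by parameters, with the parameters adjusted until it fits the training data – can be sketched in a few lines of Python. This is a hypothetical toy example (a one-neuron linear model trained by gradient descent on made-up data), not code from the episode:

```python
# Toy "model": y = w * x + b, with two trainable parameters w and b.
# We generate training data from a known rule (w=2, b=1) and then
# recover those parameters purely by fitting the data.
data = [(x, 2.0 * x + 1.0) for x in range(10)]

w, b = 0.0, 0.0      # initial parameter values
lr = 0.01            # learning rate

for _ in range(2000):            # training loop: repeated passes over the data
    for x, y in data:
        pred = w * x + b         # model's current prediction
        err = pred - y           # how wrong it is on this sample
        # gradient-descent update for squared error
        w -= lr * err * x
        b -= lr * err
```

After training, `w` and `b` end up close to the values used to generate the data. Nothing in the code "knows" the rule; the information lives in the dataset, which is exactly the shift Timo describes.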
Ola: So, why is this happening? Where does all the hype come from?
Timo: The gist of it is that we are witnessing a synergistic fusion of insights from software engineering, optimization, traditional machine learning and advances in hardware capabilities. And the most important recent change is the appearance of datasets large enough and computers fast enough for old methods to reveal their true power. Many of the core concepts for deep learning have been in place since the 80s or 90s.
Megan: So what happened within the last decade that spurred this change?
Timo: The two most crucial components are the availability of massive labeled datasets and GPU computing. Data along with GPUs probably explains most of the improvements we’ve seen. Deep learning is a fire that needs a lot of wood to keep burning, and our data works very well for that purpose.
Ola: Can you explain how exactly GPUs improve the process?
Timo: It turns out that neural networks are just a bunch of floating-point calculations, and those can be done in parallel. GPUs are great at exactly these types of calculations. The transition from CPU-based to GPU-based training has led to massive speedups for these models and, as a result, allowed us to go bigger and use more data.
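The point about parallelizable floating-point calculations can be illustrated with a naive matrix multiplication – the core operation of a neural-network layer. The weights and input below are made-up values for illustration:

```python
# A neural-network layer is essentially a matrix multiplication.
# Each output element C[i][j] depends only on row i of A and column j
# of B, so all of them can be computed independently - exactly the
# kind of work a GPU spreads across thousands of threads at once.
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

W = [[0.5, -1.0],   # hypothetical layer weights
     [2.0,  0.0]]
x = [[1.0],         # input column vector
     [3.0]]

out = matmul(W, x)  # one layer's pre-activation output
```

A CPU walks through these dot products mostly one after another; a GPU computes them simultaneously, which is where the training speedup comes from.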
Ola: Oosh, that was a serious dose of information! I’m sure you could go on and on about those topics, but unfortunately, that is all the time we have for today.
Megan: Right! We almost lost track of time! Anyways, we hope you enjoyed it as much as we did and stay tuned for more interesting talks with our brilliant guests.
Andrew: And hosts! Don’t forget about our hosts.
Ola: Yeah, yeah, you’re awesome Andrew. Everyone knows that.
Megan: Alright, time to wrap it up! You'll hear from us soon, folks. Cheers.