Dr. Mirman's Accelerometer

How to survive the EU AI Act with Modulos' CEO, Kevin Schawinski

December 10, 2023 Matthew Mirman Season 1 Episode 6

We're excited to bring you an episode that's brimming with insights from the brilliant Kevin Schawinski, founder and CEO of Modulos AG, who joins us to talk all things AI regulation and how he got there from his former work as a physics professor. Our conversation takes a deep dive into the EU AI Act, its impact on AI practitioners and data scientists, and the work Modulos is doing to navigate this legislative landscape. With Kevin's knowledge, we draw insightful comparisons between AI compliance and the compliance seen in regulated systems like information security.

The episode rounds off with a lively discussion on the use of AI in astronomy, giving us a glimpse into how AI is being used to answer complex scientific questions. Kevin indulges us with tales from his journey building a startup in the AI field, and we also delve into the often polarised debate about remote work, sharing our own experiences and strategies. So, gear up for an episode that's chock-full of insights, experiences, and some thought-provoking discussions on AI, startup culture, and the future of work.

Modulos LinkedIn

Episode on YouTube

Accelerometer Podcast
Accelerometer YouTube

Anarchy
Anarchy Discord
Anarchy LLM-VM
Anarchy Twitter
Anarchy LinkedIn
Matthew Mirman LinkedIn

Speaker 1:

Hello and welcome to another episode of the Accelerometer. I'm Dr. Matthew Mirman, CEO and founder of Anarchy. Today, we're going to be discussing AI legislation with Kevin Schawinski, founder and CEO of Modulos, an AI governance platform. Now, AI legislation has been in the news a lot lately, so I'm really excited to be talking to an expert on this topic. Kevin comes to us with a fairly unusual background, having previously been a physics professor at ETH. I'm really curious to hear how he made that jump and a little bit more about what Modulos does. Kevin, do you think you can tell us a little bit more about your work?

Speaker 2:

Yeah, sure. So we're at a threshold in how AI practitioners, data scientists and all the software practitioners that work on AI are going to be putting their products and services together. And it's forced not by innovation in technology; it's going to be forced by government regulation. So the EU has been working on something called the EU AI Act, the Artificial Intelligence Act, since about 2018. And this was more or less ignored by people for a long time. It was being developed by the Commission, by the various bodies of the EU, and it came to the forefront in a rather dramatic way when the wave of ChatGPT and the innovation from that hit, and, of course, governments got very interested in the business of regulating AI. And so now the AI Act is close to a political deal. So whatever I say in this podcast, by the time you see this, it may no longer be accurate, because the horse trading, the private, non-public horse trading, is happening right now. So whatever I say, caveat emptor.

Speaker 1:

So tell me a little bit about how you got into your work on AI governance.

Speaker 2:

So we founded Modulos a couple of years ago, with a focus on data-centric AI as a technology.

Speaker 2:

So we had all these really cool tools to help you identify, on a sample-by-sample level, the specific sources of error, noise and bias that would cause problems in the performance of your model.

Speaker 2:

So you could easily debug your data to find the samples that would hurt your accuracy or some other performance metric. They could also tell you where a particular source of bias came from, so they could help you de-bias both your data set and the resulting model that you trained on it. And almost two years ago now, we did a SWOT analysis as an exercise to see where we were, and one of the threats that we came up with was: well, what happens if the EU does regulate AI? We did our research, and actually they were working on it and, as I said earlier, it was very much unknown. We read the draft act as it was at the time, and we realized this will change so many aspects of how AI is used by companies and how it's served to the public that there needed to be tooling and infrastructure around building AI in a completely different way, in a way that's maybe not so natural to most developers.
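To make the idea concrete, here is a minimal sketch of sample-level data debugging, illustrative only and not Modulos' actual method: score each training sample by how much removing it changes validation accuracy, so the samples that hurt the model float to the top. The dataset and model below are stand-ins.

```python
# Minimal sketch of sample-level data debugging: rank training samples
# by how much removing each one changes validation accuracy.
# Illustrative only, not Modulos' actual method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

def val_accuracy(X_tr, y_tr):
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return model.score(X_val, y_val)

baseline = val_accuracy(X_train, y_train)

# Leave-one-out influence: a positive score means removing the sample
# helps, so it is a candidate label error or outlier worth inspecting.
scores = []
for i in range(len(X_train)):
    mask = np.arange(len(X_train)) != i
    scores.append(val_accuracy(X_train[mask], y_train[mask]) - baseline)

suspects = np.argsort(scores)[::-1][:10]  # ten most harmful samples
print("Inspect these training indices first:", suspects)
```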

Speaker 1:

So what sort of tooling, exactly, are you building around AI to prepare for this act?

Speaker 2:

So I'll do it maybe a little bit by analogy. A type of regulatory and standards compliance that most developers will be familiar with is information security management systems. If you want to get SOC2 certified or ISO 27001 certified, you have to build certain systems around your information security, test them and maintain them, and you have to build the policies around them. This used to be essentially a manual exercise: you'd have an Excel sheet with the controls that you needed to fulfill, so that when the auditor came, you got your certification. Now a whole bunch of companies have built amazing software products around that to help you with it. These are Vanta, Drata and others. They build really good software products that also automate most of the process that goes into building information security.
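Products like Vanta and Drata are proprietary, but the pattern they automate looks roughly like the sketch below: a registry of controls, each paired with an automated evidence check, replacing the hand-maintained Excel sheet. The control IDs and checks here are invented for illustration.

```python
# Rough shape of compliance automation: controls paired with automated
# evidence checks instead of a hand-maintained Excel sheet. Control IDs
# and checks are invented for illustration, not from any real standard.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Control:
    control_id: str            # e.g. an ISO 27001 Annex A reference
    description: str
    check: Callable[[], bool]  # automated evidence check

def mfa_enabled() -> bool:
    return True  # stand-in: query your identity provider's API here

def backups_recent() -> bool:
    return True  # stand-in: check the timestamp of the latest backup

controls = [
    Control("A.9.4", "MFA required for all admin accounts", mfa_enabled),
    Control("A.12.3", "Backups completed within the last 24h", backups_recent),
]

# The "auditor view": every control's current pass/fail status.
for c in controls:
    status = "PASS" if c.check() else "FAIL"
    print(f"{c.control_id} {status} {c.description}")
```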

Speaker 2:

Another analogy, and this is maybe closer to what the AI Act will do, is for those people involved in building highly regulated systems. Probably the most prominent one would be medical devices, but also things to do with aviation security and safety. Those are in many cases already highly regulated and must meet certain standards, and those standards include methodologies for managing quality and managing risk. And the EU has basically taken that approach, not directly, but heavily inspired especially by the medical device regulation, and written an AI Act that makes building high-risk AI applications into an exercise similar to building a medical device.

Speaker 1:

Given these new regulations, which add a lot of extra work before you can release an AI for a high-risk application, how do you think prospective PhD students and researchers can prepare to build their own projects in AI?

Speaker 2:

So pure research purposes, of course, are not touched by the AI Act, because you're not putting products on the market if you just build them.

Speaker 2:

There's been a lot of fear generated around this topic, saying the EU is trying to outlaw open source. This is nonsense. You can build open source, you can do research, you can do whatever you want. The AI Act starts to apply when you create a product out of it that you then put to market. And so again, the analogy would be: if you're a student building, you know, new therapies for people, new devices to help people, if you do them in the lab at the university, of course there are certain protocols to obey, but it's not the same as saying, okay, now I want to sell this to patients. And so if you're a student today, if you're thinking about working on high-risk AI applications, you should be aware of what's in the AI Act, and also actually the other laws and regulations that other countries are working on. Get familiar with that way of working, so that when you go into the private sector you know what you're facing, but don't let it slow down your research work.

Speaker 1:

What, in this case, counts as a product?

Speaker 2:

I'll preface this by saying I am not a lawyer and I'm not giving you legal advice. But if you're putting it on the market, that is, you're selling it or offering it as a service, that is on the market. It doesn't necessarily have to be a consumer-facing product, though that's often going to be the case. There are also applications, dealing especially with health and safety, where the AI component may be something that you as an individual would never see. Let me think of an example here. One thing that's explicitly name-checked is an emergency dispatch system: you call your emergency number, and the system that prioritizes, you know, who gets the police car or the ambulance first, that might be powered by an AI system. That is for sure high risk, but you would never see it. It's still on the market.

Speaker 1:

So you seem a little bit excited about this sort of regulation. Is that an accurate statement?

Speaker 2:

Excited is the wrong word. I think it's going to be a sea change in how we deal with this really cool technology, and the sooner we get ready for it, I think the better it will be and the easier it will be. And again, I love analogies, because it's really the only thing we have here, since we don't really know what's coming. So in 2016, the EU introduced GDPR, and we all know GDPR as the annoying thing where you have to click away the cookie banner. But of course, that's not what GDPR is. It's actually not even in GDPR. Cookie banners are not in there.

Speaker 2:

GDPR is all about how personal data, private data, is supposed to be handled. And when GDPR was introduced, the challenge, the engineering challenge, was to essentially rebuild the data infrastructure of almost all companies, not just in Europe but around the world. So we looked into this. The GDPR introduction had a two-year transition period, so from when it was passed until the penalties started, it was a two-year period, and nobody really knows for sure how much money was spent by private industry to get ready for that moment, but a study from the European Commission itself puts the number at around 200 billion euros. So that's the scale. That's the only comparison we have for the scale of the engineering challenge that the AI community has, starting probably in a few months.

Speaker 1:

Do you think that these regulations go far enough?

Speaker 2:

That's a very leading question. So, in order to tackle that question, let's talk about what the EU is actually intending with it. And what they're intending is consumer protection. They care about making sure that AI-based products are safe and respect people's rights. And, if you want to dig really deep, ultimately it addresses the concerns that people have. Like, if you look at surveys today, are people excited about AI? Some people are, people in the tech space are, but most people are actually rather concerned about AI: the effect it will have on our lives, on our economic opportunities, our jobs in the future, about discrimination, about being sorted by AI and losing opportunities in life. And so the EU takes the position: okay, we're going to introduce a safety regime here that is similar to what we have for other types of products.

Speaker 2:

So when you buy a car today, it is required to have certain safety features: it has to have an airbag, it has to have crumple zones, it has to be able to roll over and protect you, and all these things. And so when you buy a car, you don't really think too deeply about, well, will this car kill me? You just assume that the government forced the car companies to make it safe, and that's a little bit the approach the AI Act takes. That's also the basis behind its risk-based approach, which is really a core component of the Act: how regulated an AI application is depends on what the application is. So what are you doing with it?

Speaker 2:

And we can talk about that, because it's actually very fuzzily defined, but that's what they're addressing. So if we're saying, well, it's consumer protection, it's product safety, if you want to put it like that, it makes sense, and I think we can all be behind it. And then we can immediately question: well, does it achieve its purpose? Does it achieve the goals it set itself? And that I can't answer, because the negotiations aren't even finished and we don't know how products and services will change once it's introduced. So we'll have to wait and see if it really does achieve what it sets out to do.

Speaker 1:

What sort of regulations would you be looking for?

Speaker 2:

I have some favorite aspects of the AI Act that I think are really, really good.

Speaker 2:

I'll give you one or two examples.

Speaker 2:

So the AI Act is very concerned with making sure that fundamental rights are respected and that you're not discriminated against by AI, and there's a really nice provision in there that says that if you know that your AI application is at risk of discriminating against a certain group, and certain really stringent conditions hold, you can relax certain other provisions in data protection in order to get good training data, to make sure your application doesn't discriminate, or that you minimize the discrimination.

Speaker 2:

I think that's really well thought out and it's a good thing. A second example that I think is really, really good is that it asks you to be mindful of feedback loops. What do we mean by that? The AI Act says: worry about even small levels of discrimination that can accumulate and then, worst case, affect your training data and create a negative feedback loop. So even though maybe your gender bias is only a fraction of a percent, if it's something that happens again and again and again, you might set up a negative feedback loop without even realizing it, one that will lead to massive discrimination long term. So you should think about that and think about how to mitigate it. I really like that.
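A toy calculation shows why this matters. The numbers below are invented purely to illustrate the dynamic: a sub-percent gap, amplified slightly on every retraining round because the model learns from its own previous decisions, compounds into serious discrimination.

```python
# Toy illustration of a bias feedback loop. Two groups start with a
# 0.5% acceptance-rate gap; each retraining round amplifies the skew
# slightly because the model learns from its own previous decisions.
# The 5% per-round amplification is an invented number.
gap = 0.005
AMPLIFICATION = 1.05

for rnd in range(1, 61):
    gap *= AMPLIFICATION  # retraining on own outputs compounds the skew
    if rnd % 10 == 0:
        print(f"after {rnd:2d} retraining rounds: gap = {gap:.2%}")
```

Left unchecked, the fraction-of-a-percent gap Kevin mentions approaches ten percent after sixty rounds in this sketch; the Act's provision asks builders to notice and mitigate the dynamic before it compounds.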

Speaker 1:

You mentioned before that the public is worried about AI. Is that something that you're worried about?

Speaker 2:

Yeah, I mean, if you think about how ubiquitous and powerful these systems are, and that in most places people don't even know that they're there and what power they have over our lives, yeah, I would like there to be some standards to comply with, some disclosure on what these systems are really doing and how they work.

Speaker 1:

What's your personal nightmare scenario?

Speaker 2:

Well, I think the nightmare scenarios are very mundane: that these systems go unchecked, and the things that we hold dear, our fundamental rights, equality, equal access, democracy, rule of law, elections, are affected by these powerful systems without any recourse or disclosure or oversight. Regulatory regimes like the AI Act at least point in the right direction to mitigate that.

Speaker 1:

And in what places do we have these systems that we're not really aware of?

Speaker 2:

That's a very personal question, because I think it really depends on how much you know about where AI really is, and probably the people listening to us are fairly aware of how ubiquitous it is. But I think most people are not aware that if they apply for health insurance, AI might be used to assess whether they should get coverage.

Speaker 1:

Is that really the same type of AI, though, that this act is aiming to cover? Because we've had this sort of machine learning and analysis of insurance for at least like 30 years. I mean, depending on what you consider machine learning, we've had that since the beginning of the concept of insurance. Is the act mostly meant to cover the recent advances around language modeling?

Speaker 2:

So it wasn't originally.

Speaker 2:

It was originally meant for the sort of traditional machine learning, all the way down to logistic regression and Excel.

Speaker 2:

And then, when ChatGPT happened, there was a last-minute effort to bring these things into the Act, and I know today, as we're recording this, there are massive negotiations, new proposals on how to categorize the different levels of large language models, foundation models, generative AI. Defining these terms in law is actually very, very difficult, and they're trying to do it in record time.

Speaker 2:

But the Act takes a very interesting approach here. It basically says it doesn't really matter what's in it: as long as it's a type of machine or a system that learns from and uses data, then it's covered, it's AI, and what matters is what you use it for. So if you're using ChatGPT to give you a recipe based on what you have in the fridge, nobody cares about that, right? There doesn't need to be too much scrutiny on it. If you're using ChatGPT to complete your employees' performance evaluations, as I've heard from some companies that were experimenting with those tools, suddenly you have a high-risk system, because now you have this incredibly complicated modern thing, an LLM, making decisions on your future chances.
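To make the use-case logic concrete, here is a minimal sketch of how that classification reads in code. The tier names follow the Act's risk-based structure; the specific mapping is a paraphrase for illustration, not a reading of the final legal text and not legal advice.

```python
# Sketch of the AI Act's use-case-driven logic: the same model can land
# in different risk tiers depending on what it is used for. The mapping
# below is a paraphrase for illustration, not the legal text.
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "prohibited practice"
    HIGH = "high risk: conformity assessment, risk and quality management"
    LIMITED = "limited risk: transparency obligations"
    MINIMAL = "minimal risk: no new obligations"

# The key point: the first two entries could be the exact same LLM.
USE_CASE_TIERS = {
    "recipe_suggestions": RiskTier.MINIMAL,
    "employee_performance_evaluation": RiskTier.HIGH,
    "emergency_dispatch_triage": RiskTier.HIGH,
    "customer_service_chatbot": RiskTier.LIMITED,
}

def classify(use_case: str) -> RiskTier:
    # Unlisted use cases need real legal analysis; defaulting to minimal
    # here just keeps the sketch simple.
    return USE_CASE_TIERS.get(use_case, RiskTier.MINIMAL)

print(classify("recipe_suggestions").value)               # no new obligations
print(classify("employee_performance_evaluation").value)  # high risk
```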

Speaker 1:

So really, in this case, it sounds a little bit like this act is almost a GDPR 2.0, rather than something novel about machine learning, if it's just about data usage.

Speaker 2:

Yeah, I think GDPR, in scope and also penalties and approach, is very similar. And there's, of course, also the political aspect to it, which is that Brussels very much saw GDPR as part of their goal, their ambition, to be the global regulator. The Brussels effect, if you've heard the term: where Brussels sets the rules for the world because they're the largest market. The EU is the largest market, and if you're building, you might as well build to that standard. So the EU is very conscious of that. They did it for data and privacy. America at the time didn't really react to it, so there's no federal law competing with GDPR, and so now, if you're a business, also in the US, you worry about GDPR. And the EU wants the same thing with the AI Act. They want to transmit European norms and European values globally and say: if you're building AI, you're building to our standards.

Speaker 1:

What's something about the future of AI that you're particularly optimistic about?

Speaker 2:

I think the number of use cases and the things that you can do with it, I mean, we're talking such superlatives that I don't even want to list them. The opportunity is so huge. We just need to make sure that those applications have some degree of oversight, regulation and safety so that they don't do harm. And I'm not talking about the sort of extinction-doom discussion; I'm talking about really mundane, everyday effects: that those are thought about deeply and carefully by the people who build them, that we don't just rush things out because they're cool, that we have a process for building things that are safe. So, to go back to the old car analogy: yeah, we can build cars that are really cool and really fast, but what people want is a car that doesn't kill them.

Speaker 1:

So what's something that you wish startups were doing more of right now?

Speaker 2:

I've seen a lot of concern from startups about how regulation will basically kill their business and just benefit the incumbents, you know, the Googles and the Metas of the world, who will adapt to any regulatory regime and actually use it to their advantage. I would say, if you're building an AI application today and you think you might be anywhere near that high-risk list, start building with those principles in mind even today, even before the Act is law, and take it seriously. Your customers will thank you for it, and your market access will be much easier. So again, to go back to the security certificates: if you're a SaaS startup today, you want to have that SOC2 certification, right, because you're B2B and your customers will demand it. And the same thing will be true for the AI aspect. They will ask you, you know, are you AI compliant? Have you been assessed? Do you have a certification? And if you can say, yeah, in fact we thought about this right from the beginning, you're going to have an edge over your competitors who didn't do that.

Speaker 1:

I know that you were previously an astrophysicist, a professor at ETH, and that must have been an incredible jump, getting into startups after that.

Speaker 2:

Yeah, well, I've always done sort of maybe slightly unusual things, even as an astrophysicist. So my research topic was the co-evolution of galaxies and supermassive black holes, and to do that I basically studied data from large surveys, surveys being basically data recorded by telescopes, whether they be on the ground or in space. Sometimes I even went to the telescopes, sometimes I worked with the instruments, but mostly it was what we would now call data analytics.

And so when I was a PhD student, I had my first run-in with AI, because I wanted to get large numbers of galaxies of different shapes. There are fundamentally two types of galaxies: there are spiral galaxies, like the Milky Way, the sort of swirly things, and there are elliptical or early-type galaxies that look like a football or a rugby ball, and we don't really understand how one turns into the other. So I was working on that for my PhD, and I tried to get more of a particular type of galaxy, the early-type galaxies. I looked into methods for sorting through the data, the sort of computer vision, machine learning type tools that were available at the time. I tried some of them, and they just weren't good enough. So I said, all right, I'm going to sit down for a week, go through 50,000 images and do it myself, until I got a blinding headache.

And then one evening, on a Friday in the pub, I was talking to a colleague about this, and we said: why don't we take all that, there's a million images that we could sort through, why don't we put them on a website and see if there's a bunch of people out there that would want to help us? And this kind of blew up. So we built this website called Galaxy Zoo, put the galaxy images there, and when we launched, we were the second most viewed story on BBC News. Number one was "man flies to wedding a year early". You cannot beat that. And then the day after, we were beaten by "huge dog is reluctant media star". So again, you cannot beat that. But we were so wildly successful that the server we were accessing, where the astronomers in the US had put the data, actually physically melted, like a cable melted, it's not a figure of speech, because by the time we were done, we had hundreds of thousands of people clicking away on galaxies at rates that we never imagined.

So we sort of put two and two together, because colleagues from other areas of science started emailing us, like: can I give you my data and have people sort through it? So we started this thing called the Zooniverse, which ended up with, I don't know, over two and a half million users clicking away at images. We would now call them labellers. We didn't think of them as labellers for AI at the time; we thought of them as the actual device with which to classify the data, and they were really good at that. It's actually still going on.

Speaker 2:

I then went to the US for a bit and got really heavily into X-ray astronomy, looking at distant supermassive black holes with very high-energy photons. It's also a very, very different way of looking at things, because you literally count the individual photons: like, there's one, there's a second one, over a couple of weeks. Very different regime, very fun. I came back home to Switzerland, to ETH, and this was like 2012. And very quickly I actually got really, I don't know, unmotivated working on galaxy evolution and black holes, because we were just making no progress. And so I basically said, all right, I see that machine learning's come a long way. Why don't we use that for science?

Speaker 2:

And so I actually got more interested in the use of machine learning than in the science it enabled, and we had lots of cool projects. And so when I met my co-founder, who was a computer science professor at ETH, he has now moved to the University of Chicago, we decided to start a startup, focusing first on data-centric AI, which was very new at the time, and which is really all about giving you the tools to iterate between model and data and find the sources of error, noise and bias in the data. So basically asking: hey, which small number of samples in my data are ruining the accuracy of my model? Or, even more interesting, and here you see the connection to the AI Act and what's coming now: which samples in my data are responsible for the bias in the model? Where does the bias really come from? And so we developed really cool tools for that, and that's how we ended up one day discovering the AI Act.

Speaker 1:

So I presume you're no longer a professor. How has it felt transitioning?

Speaker 2:

I mean, you just do it, you don't look back, and you make the decisions that you think make the most sense, and I think there's no better place to be right now than working on AI. There are so many cool things happening so fast, it's impossible to keep up, and I'm just super excited about being part of that world. There's actually been a little return: I'm only very marginally involved, but I'm part of a group of astronomers using LLMs for astronomy research, and one of the things we did is we actually fine-tuned a LLaMA model on astronomy data. It's actually able to answer astronomy questions pretty well, I would say at the level of maybe a good master's student or early PhD student. So it's pretty good, and we see this as a foundation for all sorts of cool applications.

Speaker 2:

For using LLMs for science, I think astronomy is a perfect place to do it, because there are only two places where you have full access to all the data. Either you are Google or Meta or whatever, and you own all the data in the world, so you can build really cool things there. Or you go to science, and there's really only one science where all the data are public, and that's astronomy. And the reason for that is that most data is taken by government-funded facilities. So by law in the US, all data taken by Hubble are public. There's no copyright, there's no restriction, there's no license, and so all the knowledge, all the data in astronomy is actually available, and so you can build with that.
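Kevin doesn't go into the training setup, so the following is only a generic sketch of what fine-tuning a LLaMA-family model on a corpus of astronomy abstracts typically looks like with the Hugging Face stack. The model ID, file path and hyperparameters are placeholders, not the group's actual configuration.

```python
# Generic sketch of LoRA fine-tuning a LLaMA-family model on astronomy
# abstracts. Model ID, file path and hyperparameters are placeholders,
# not the research group's actual setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # placeholder; gated, requires access
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: train small adapter matrices instead of all base weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# One abstract per line in a plain-text file (hypothetical path).
data = load_dataset("text", data_files="astro_abstracts.txt")["train"]
data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="astro-llama", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```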

Speaker 1:

You mentioned that your LLaMA model can answer astronomy questions. Can you give us an example of what one of these questions might be?

Speaker 2:

So it's tuned on the research abstracts of scientific papers, so you can ask it, for example: what do we know about the metallicity distribution of outer-halo stars in the Milky Way, and what does it mean for Milky Way assembly? So really nerdy, technical astronomy questions.

Speaker 1:

Are you always asking it questions that you know the answer to, or have you ever asked it to solve something unexpected?

Speaker 2:

I think I will let the other members of our team speak to that. It's actually also dangerous for me because, I mean, I left astronomy four or five years ago, so I wouldn't claim to be up to date.

Speaker 1:

But even on known problems. Early on, people discovered that GPT-3.5 wasn't particularly good at math; it couldn't add two numbers together past a certain size. So have you tried giving it simple inductive problems?

Speaker 2:

I haven't, but what I've noticed it's actually good at is motivating why it knows things, like what the data behind it is. That maybe sounds more impressive than it is, but it would, say, identify the right instrument or survey or campaign that produced certain results. It seemed pretty good at that for me.

Speaker 1:

What's been the biggest transition for you, coming from academia, going into startups?

Speaker 2:

Academia is far more zero-sum than startups. In academia, if I write the paper, you can't. If you get the prize, I can't. There's a limited supply of these things, so there's always, even with your best friends and colleagues, an absolute competition. In startups, there isn't. If I discover tomorrow that there's another AI governance startup focusing on the AI Act, I say: wow, that's cool, that's validation, the idea is good, the market is big, let's both go for it. I would actually feel positive about that.

Speaker 1:

In your startup journey, have there been any bumps since you started?

Speaker 2:

Of course. Startups are hard; if they were easy, everyone would do it. No, I think the biggest decision we took, and it was a calculated risk, was moving away from data-centric AI and saying, okay, we focus on governance, regulation and compliance, which, to me, was a new topic I had to get educated on. It was a gamble, because we didn't know when the law was going to come, what was really going to be in it, whether it was really going to be a big deal. We took a calculated bet, and we were really amongst the first people building towards that. It's gratifying to see that now we're starting to have competitors.

Speaker 1:

How do you think about building a culture for your company?

Speaker 2:

A culture has two components. It's something that you as a founder or co-founder can bring in, but also, if you build a good team, you have people with strong opinions about how things should be, and you need to harness that, guide it, and make sure everyone's moving in the same direction. That's a continual challenge. There are tools for that, there are methods for finding that alignment, but I think it's really important that people who are thoughtful and have strong opinions aren't ignored and are able to contribute to the culture you want to build as a team.

Speaker 2:

I don't think there's one correct way to do it. I think there are very different ways, depending maybe also on your core team and what works for you. One of the big debates people are having right now is work from home, work remote: do you come into the office, do you work only in the office? People have very strong opinions on that. I actually talked to our people about how they feel they work best, and we looked at that and found our solution, our sweet spot.

Speaker 1:

I think we're out of time, but thank you so much for coming on. It was really wonderful.

Speaker 2:

Thank you very much.
