Transcript

Applying Emerging AI and Machine Learning Tools to Find Equitable Solutions to Scientific Discovery

Jeffrey Reid:
Hello and welcome everyone. I'm Jeffrey Reid, the Chief Data Officer at the Regeneron Genetics Center. And this is the Nucleus by Regeneron, our new podcast series where we discuss the science that fascinates and intrigues us. And today we are very lucky to kick off the podcast series with Daphne Koller, the CEO and founder of insitro. Hello, Daphne.

Daphne Koller:
Hello, Jeffrey. It's a pleasure to be here.

Jeffrey Reid:
Great to talk to you. I always learn something when we chat. Maybe you can introduce yourself and tell us a little bit about who you are and how you ended up at insitro.

Daphne Koller:
I have a very unusual career journey that is somewhat circuitous. I started out my career as one of the early people into the field of artificial intelligence back in the early nineties when it was still considered somewhat of a fringe discipline. And when I graduated with my PhD from Stanford in 1993, you couldn't actually say you were doing AI because people would treat you as if you weren't a serious researcher. And then I came back to Stanford actually as the first machine learning hire into Stanford's computer science department and helped move it away from its logic roots into more modern-day AI.
I ended up leaving Stanford on what was supposed to be a two-year leave of absence in 2011, 2012 to follow a passion project, which was in the area of technology-assisted education.
I am one of the two co-founders of Coursera, which is today one of the largest online education platforms in the world with well over a hundred million learners worldwide. And that kind of led me to an ultimate departure from academia. And then in 2016, I kind of raised my head up over the trenches for the first time since the founding of Coursera and said, oh my goodness, AI is changing the world and will only continue to accelerate in doing that, but really isn't having that much of an impact in the life sciences.
And I felt like one of the main reasons for that is that there just aren't that many people who speak both languages. And I was in the privileged position of having spent a large chunk of my career in each of those two disciplines and felt like I could help bridge across that chasm. And after a stint at Calico, where I learned a tremendous amount of science and had my first exposure to drug discovery and development, in 2018 I founded insitro, which is a company that really tries to bring the tools of AI, machine learning and data at scale to deliver better medicines to the patients who can benefit most.

Jeffrey Reid:
Well, thanks for that landscape of your history. I have to say, as you know, I also have bridged academic worlds. And I think this is why your perspective is really so useful, because you do have both of those perspectives. So, with that in mind, maybe we can level set and talk about just baseline definitions for AI and ML, particularly because within biology and bioscience, I think there's some misconception.

Daphne Koller:
Okay, so different people have different definitions of AI and ML, and a lot of people just use them almost as synonyms for each other. My personal definition is that AI is the discipline of trying to get machines to behave intelligently, where intelligence is typically defined in a very human-centric way, by the way that humans exhibit intelligence. Machine learning to me is the discipline of getting machines to learn how to perform difficult tasks by learning from large amounts of data. Now it turns out that many, arguably most, tasks in artificial intelligence are best solved via machine learning. Hence the very large overlap between these two disciplines. But I would say that there's a bunch of tasks that I would define as ML where I'm not sure that people would think of them as artificial intelligence, because what the machine is learning to do is a task that a human will never be able to do, such as finding really subtle patterns in images of cells or large DNA sequences or things like that.

Jeffrey Reid:
So, talk more about the impact of AI and ML or really these emerging computational tools on the way researchers and scientists are pursuing drug development and research.

Daphne Koller:
So, I think I would divide the impact into two different trajectories. One is just straight productivity gains and the other is deep discovery. I think that the kinds of tools that AI has developed, the large language model tools like ChatGPT and the like, have really transformed our ability to interact in natural language with multiple different tasks. And I think that there are huge productivity gains to be had simply by giving people that ready access to these computational frameworks, so people can now do these tasks without actually needing to know how to program. And I think that unlocks a lot of opportunity for people to just engage with their data in whole new ways without needing to have a data scientist attached to them at the hip, if you will. But I think that is an accelerant, and it's not necessarily transformative in the same way as what we get when we allow our computational tools, AI and ML tools, to take the increasingly large and complex data sets that are being generated by scientists, specifically in the life sciences, and interrogate them in whole new ways.
One of the parallel revolutions that I see to the AI revolution, one that maybe hasn't gotten quite as much press, but I think is equally fundamental, is the ability to quantitatively measure biology at high fidelity and high scale. And by biology, I mean everything from what happens within a cell, measuring proteins, to what happens when you look more broadly at cells, like cell morphology and cell behavior, all the way to human physiology. And there are more and more tools that we have at our disposal to generate measurements of these different biological systems at different biological scales. And there's just no way for the human mind to assimilate and extract the information that exists within these data.

Jeffrey Reid:
What are the data priorities? What is the biggest win we're going to get in terms of our ability to access this kind of information, and what are the data sets you think we should be aspiring to build to make the discoveries that we need to make?

Daphne Koller:
So, at insitro, we've made a very deliberate decision to focus on two kinds of data. One is data that we produce in-house, where we get to generate cells, perturb cells, and then measure cells at the single-cell level using omic modalities, transcriptional profiling, proteomics, as well as multiple imaging modalities, including tracking cells as they evolve. So it's not just dead cells, it's actually live cells. And the reason we like that modality is because you really can now start to interrogate with active experimentation that is often guided by the AI: what is the set of questions that we should ask those cells, how should we perturb them? And then measure what happens as a consequence. You get to interrogate causality with active experimentation, which is amazing. The other type of data that we really love is data from humans. And the reason is that while you can interrogate biology at the level of cells, ultimately what you care about is curing humans, not cells.
And so large human population cohorts, where you can get a perspective on human physiology and on how different genetics, which is a different view on causality, gives you different phenotypic consequences, different clinical traits that emerge from that, are a hugely valuable resource that allows us to understand that so-called genotype-phenotype connection at the human level.

Jeffrey Reid:
So, with that in mind, what do you think the science of the future would look like in this space? What advice would you give a young scientist who wants to really contribute to drug development and drug discovery and wants to make the world a better place and a healthier place?

Daphne Koller:
So, if you had asked me that question five years ago or even three years ago, I would've said, you really have to become computationally literate. You really have to learn how to program. If you asked me today, I would actually tilt that a little bit and say that you really have to understand how to start with the data and let the data drive your thinking rather than start with a hypothesis and then cherry pick data to get at the hypothesis. I still think you need to learn maybe not the granular elements of programming, but the ability to become a very strong structured thinker because we're all going to have AI copilots that help us with maybe the more granular elements of if I want to do this analysis, they will pull in the right bits of code and put in that and actually have that analysis be executed. But you still need to think about how to put the big pieces together, how to formulate questions in the right structured way so that the AI can be maximally helpful to you in driving your work.

Jeffrey Reid:
So, let's shift a little bit to some of the opportunities that are out in the field. So there's all of this excitement around the new drug modalities with CRISPR and RNA interference, and we're seeing just mind-blowing, transformative stuff that's coming out of early trials in the clinic across the industry. But as we get closer and closer to these more granular, more personalized medicines, is there a role maybe for AI and ML in thinking about how we de-risk a personalized therapeutic where maybe it's not so easy to do a trial or maybe it wouldn't be clear exactly what kind of trial you would do to convince yourself that this will really work?

Daphne Koller:
I think that's a very interesting question, Jeff, and I think it begins with actually how comfortable regulatory agencies are going to get with a trial that is “N” of a few. And I think we're maybe a bit of a ways from that, because it is still the case that we don't know upfront what some of the idiosyncratic side effects of particular variants of our medicines are going to be. And so, I'm not sure that that will be the first thing that will happen, at least not for small subsets of common diseases. For those “N” of one diseases that we've all read about, then maybe yes. But I think actually a more interesting personalization trajectory is actually going to be with selecting the right combination of treatments for a given patient.
Because as we all know, disease, in most common diseases, is not driven by one gene going awry. It's actually a disequilibrium where this one's a little bit too high and this other one's a little bit too low, and you really need to sort of maybe intervene in the system in multiple different ways in order to optimize the treatment for a given individual. And so I actually think that the more interesting personalization trajectory is by understanding, in this complex continuum of different ways in which a system can become dysregulated, what is the right set of interventions, and then bringing in a set of drugs that have already been approved in a unique combination that is specific to that patient's needs.

Jeffrey Reid:
Yes, we absolutely believe in combinations, and at Regeneron we're taking that across therapeutic areas. So, let's shift to something that's near and dear to my heart: diversity, equity and inclusion, and ethics-related questions around AI. Obviously technology is transformative here. As you mentioned, we've seen this amazing behavior emergent from the large language models, and it's very clear that this is going to disrupt the world in a lot of different ways that we've been discussing. But how do we, one, make sure that the people who are involved in this revolution are being chosen equitably and not being brought in from some perspective of bias, and, two, how do we make models and convince ourselves that we are making models that are operating in an equitable way and aren't overly trained to just replicate the biases out in the world?

Daphne Koller:
On the people side, I think we need to intervene in multiple different steps of the process. I actually think that by far the most important intervention point is as early in the funnel as you can get, make sure that young girls aren't scared away from engaging in STEM disciplines and in AI specifically because it's not the kind of thing that girls do. And I have two daughters, both of whom are in STEM, and I can tell you that even in the very privileged school that they were fortunate enough to go to, there was a preponderance of boys going into STEM disciplines, and that continued on into their college days. So I think that is a much more important intervention point, albeit one that is longer term.
Because the truth is if you don't have enough great people in your funnel as you hire, then you're kind of forced to work within the population that you have access to.

Jeffrey Reid:
So, specifically around race and ethnicity, particularly because this is so important to genetics, I am curious, do you see that as different or related to some of the concerns around gender impacts?

Daphne Koller:
Your question really has to do with how do we ensure that the science that we do is representative of the patient population that we're trying to address? And I think there is a rightful concern about the fact that many of the population cohorts that we work with are very European-centric. And I don't need to tell you, because you've been at the forefront of it, just how difficult it is to recruit population cohorts from a broader population base. So I think there's, again, multiple prongs to the solution. One is to continue the great work that you at Regeneron and others have done on trying to create larger, more diverse population cohorts and make sure that they are accessible to the broader research population. I will tell you that one of the big frustrations that I've had with the All of Us cohort, which did in fact recruit with a focus on diversity, which I think is great, is that that cohort is not available to industry researchers, which I think is a real shame because that's where a lot of the clinically relevant research actually gets done.
So having said that, I do think that even the discoveries that we make from European cohorts are valuable to a much broader population than just Europeans because, and you and I talked about this, Jeff, when you visited us, the fundamental human biology is the same across all humans. And the extent to which diseases are prevalent in one population or another might be different. And so the number of patients that are helped in different population subsets might be different. But if you find a drug that is effective in a given biology, that biology is likely to be relevant across not just the population in which it was discovered. So, I like to tell myself that the work that we're doing is not going to be relevant only for Europeans. But boy, I would love for it to be the case that we have access to a broader population cohort: A, so that we can confirm that, and B, because we know that the more diversity you have in your human genetics, the more new findings you can get at, because we get to see variants that are only there in a certain subset of the population and are often new windows into disease. And we don't see them if we only apply the lens of a, say, European-centric population.

Jeffrey Reid:
Oh, yeah, I totally see that. The opportunity, I mean, it's just scientifically indefensible to only focus on a single ancestry. We know we are missing an enormous amount of science that could have a lot of impact on people's lives if we don't move outside of European-ancestry sequencing.
So, moving again to DEI related topics, I'm sort of curious, you were talking a little bit about your daughters and their experience. Tell me a little bit about times that you've maybe felt like you're the only or the other, and how have you dealt with that?

Daphne Koller:
So, I've been the only or the other for most of my career. I mean, even today in computer science, I think today it's a little bit better. Throughout most of my career, women were about 10% of the population. So I was often the only woman in the room. And I think it really began to hit me the more I moved into the later part of my professional career, because maybe early on I was kind of just a student and I was doing pretty well. And so there weren't maybe overt signs of discrimination. I also had very supportive parents, which was amazing. But as you make your way higher and higher, if you will, into the higher levels of your professional career, the extent to which people don't recognize that a woman can rise to that level can be quite staggering.
I'll give you one of my quote favorite examples. It's favorite not in a good sense. This was when I was at Coursera, and I was introduced by a mutual acquaintance to a Fortune 50 CEO as the Coursera co-founder and CEO. And I said, great to meet you, would love to chat. My assistant James can help us set up a meeting. And his assistant responded back saying, dear Daphne, could you please confirm James' availability for a meeting?

Jeffrey Reid:
Wow.

Daphne Koller:
Yeah, that one really is kind of pretty high up there.
And then the thing is, what do you do when that happens? Because it's the kind of thing where standing on your high horse and saying, I expect to be referred to as Professor Koller, just makes you not look very good. And at the same time, if you just let it slide, then it just continues that way. And so it is kind of like you're damned if you do, damned if you don't. And so one of the things that I say when I'm asked this question in various audiences is that often if you are an ally, a bystander who sees these things, it's best for you to comment. So, my colleague at the time could have responded saying, thank you for your email, my colleague, Professor Koller, and I would be happy to whatever. And that kind of statement is a lot more effective, I think, if it comes from somebody else than if it comes from you. And similarly, and I know a lot of us have felt like this, there's a group of people around the table talking about a problem and you make a point and the conversation just moves on, and then a male colleague makes that same point five minutes later and everybody says, John, what a great idea. And then you can't come up and say, but I said that five minutes ago. Because you look like, you know.

Jeffrey Reid:
Yeah, like attention seeking.

Daphne Koller:
But someone else can say, John, thank you for amplifying on Jane's idea.

Jeffrey Reid:
Yeah, I think allyship is incredibly powerful there. Just the opportunity to take that moment, where obviously somebody is feeling discomfort or there's this interaction that isn't going the way they would want, and having the ally step in and say, hey, I see this, and this is not right. That can be really, really useful.
What are you all working on right now at insitro that's really exciting to you and what can we expect to see from y'all down the horizon?

Daphne Koller:
So, we have developed in the last year kind of a crystallization of our vision, which we call pipeline through platform, which is that we've built this incredible platform that allows us to generate new therapeutic hypotheses and turn them into medicines. And we're really starting to see that play out in the productivity and the ability to come up with de-risked insights. And by the way, a little tip of the hat, I will say we've done a bit of a landscape survey of what other companies are out there that we believe have truly successfully executed on this vision of building a pipeline through a platform. And honestly, there aren't many of those, and Regeneron is one of the few. So, a real tip of the hat; I'm not just saying that to be nice, and we're hoping to be another one of those.

Jeffrey Reid:
Well, we're definitely excited to see what you all do with it. So, thank you so much for taking the time to chat with me and definitely looking forward to seeing you again soon.

Daphne Koller:
Thank you. Me too.
