How Tag.bio Makes It Easier to Interrogate Your Data

The discoveries medical researchers and drug developers can make are constrained by the kinds of questions they can ask of their data. Unfortunately, when it comes to clinical trial data, or gene expression data, or population health data, it feels like you need a Ph.D. in computer science just to know which questions are “askable” and how to frame them. This week, Harry talks with the founders of a startup working to solve that problem. Tag.bio aims to make it possible for any worker in the life sciences sector—even if they don’t have a Ph.D. in computer science or data science—to interrogate their data quickly and automatically. The idea is to help them uncover trends or connections in their data that would otherwise require months of work and help from a data scientist or a data engineer.

The company was founded in 2014 as a spinoff from the University of California, San Francisco Cancer Center. That’s where co-founder Jesse Paquette first invented a system that let oncology researchers ask guided questions of their data without help from a bioinformatics expert. Now Paquette is Tag.bio’s chief science officer, and in this episode, he’s joined by Tag.bio CEO Tom Covington to talk about how the startup’s technology works and why easier access to data is critical to faster progress in drug discovery and to the whole idea of precision medicine.

Please rate and review The Harry Glorikian Show on Apple Podcasts! Here’s how to do that from an iPhone, iPad, or iPod touch:

1. Open the Podcasts app on your iPhone, iPad, or Mac.

2. Navigate to The Harry Glorikian Show podcast. You can find it by searching for it or selecting it from your library. Just note that you’ll have to go to the series page which shows all the episodes, not just the page for a single episode.

3. Scroll down to find the subhead titled “Ratings & Reviews.”

4. Under one of the highlighted reviews, select “Write a Review.”

5. Next, select a star rating at the top — you have the option of choosing between one and five stars.

6. Using the text box at the top, write a title for your review. Then, in the lower text box, write your review. Your review can be up to 300 words long.

7. Once you’ve finished, select “Send” or “Save” in the top-right corner.

8. If you’ve never left a podcast review before, enter a nickname. Your nickname will be displayed next to any reviews you leave from here on out.

9. After selecting a nickname, tap OK. Your review may not be immediately visible.

That’s it! Thanks so much.

TRANSCRIPT

Harry Glorikian: I’m Harry Glorikian, and this is MoneyBall Medicine, the interview podcast where we meet researchers, entrepreneurs, and physicians who are using the power of data to improve patient health and make healthcare delivery more efficient. You can think of each episode as a new chapter in the never-ending audio version of my 2017 book, “MoneyBall Medicine: Thriving in the New Data-Driven Healthcare Market.” If you like the show, please do us a favor and leave a rating and review at Apple Podcasts.

Harry Glorikian: In healthcare and drug discovery, everybody’s got data. Knowing what to do with your data and how to get value out of it is the trick. That’s what we’ve spent the last 60-something episodes of this podcast talking about.

Unfortunately, when it comes to clinical trial data, or gene expression data, or population health data, it feels like you need a Ph.D. in computer science just to know what questions to ask and how to ask them.

But there’s a startup in San Francisco that aims to break down that barrier and make it possible for any worker in the life sciences sector to interrogate their data quickly and automatically. The idea is to help them uncover trends or connections in their data that would otherwise require months of work and help from a data scientist or a data engineer.

The company is called Tag.bio, and it was founded in 2014 as a spinoff from the University of California, San Francisco Cancer Center. That’s where co-founder Jesse Paquette first invented a system that let oncology researchers ask guided questions of their data without help from a bioinformatics expert.

Now Paquette is Tag.bio’s chief science officer. And I’ve got him here today, together with chief executive officer Tom Covington, to talk about how Tag.bio’s technology works, and why easier access to data is critical to faster progress in drug discovery and to the whole idea of precision medicine.

Harry Glorikian: Tom, Jesse, welcome to the show.

Tom Covington: Thanks, Harry. Thanks for having us.

Harry Glorikian: So, I’m trying to wrap my head around Tag.bio and, and all the technical details and everything, but, but sort of, I want to step back and give people who are listening the chance to understand the organization and the goals. And so I’ll start with a grand vision question and it’d be like, okay: What’s wrong with precision medicine and the way that we’re sort of looking at data today?

Tom Covington: Yeah, I think first and foremost, precision medicine is, at its heart, a bit of a data management problem. There are disparate data sources within healthcare and life sciences, so to truly enable kind of an N of 1 or small N medicine, it requires the integration of those data types and the ability to ask questions of those disparate data sources.

And there isn’t really, or there previously has not been, a great solution for that problem. As a part of that, given the complexity of the underlying data, there are experts in manipulating and analyzing data, but they are not the same experts that are going to be practicing medicine or advancing science, the knowledge workers in the healthcare and life sciences space. If they have a question that could be answered in data, they have to hand off that question to an expert in manipulating or analyzing data: data scientists, bioinformaticians, analysts, and the like. And that process is slow and human powered.

And so if you have a question that can be answered in data, and you’re a physician, it may take you one to two months to get an answer. We’re trying to take that process and turn it into something that takes two minutes or less.

Harry Glorikian: I keep thinking we’ll just hybridize them and then we’ll have the best of both worlds. But I think that might take too long based on my experience when we first came up with the term bioinformatics, right? Stick two people in a room and have them figure it out. And that, that took a while. But what’s not working so well? What’s not working as well as it could in this whole life science arena? How do you guys see what you’re working on bringing that one step closer to being more fundamentally useful and providing value to the industry?

Tom Covington: Yeah, so I think the easiest way to think about it is with kind of use case examples. So we were working with a researcher who had published a paper on thymoma, the cancer, and when that was uploaded to The Cancer Genome Atlas and, theoretically, they had mined this data for all of its worth, we gave him access to the platform. And over the course of an evening and a glass of wine, he found three novel insights in his data that warranted publication in a paper. And essentially what we did was reduce the cost of him asking and answering a question of the data. Whereas previously it would have taken him months to ask one of these questions, he was able to ask and then iterate on the questions until he found the right question that generated the right output, and that output was novel. And I think that’s the big advantage of this kind of acceleration of discovery that happens via platforms like ours.

Harry Glorikian: So go ahead, Jesse.

Jesse Paquette: A lot of people are going to look at data problems in the life sciences and healthcare space, and they’re going to say, well, the problem has to do with the siloization of data. It has to do with the quality of data. It has to do with the integratability of data, and a lot of cultural problems that exist in the system. And then they’re going to fall back to the old adage, which is, 90% of the time of a data scientist or data engineer is just processing the data, working on the quality, getting the data in analysis-ready shape.

So we get the question: Why aren’t we solving that problem? Well, it’s a hard problem. And ultimately what we’ve realized over many years is that if you’re spending 90% of your time working on processing and transforming and getting data analysis-ready, you don’t have enough time to do any of the analyses that you really want to do. Case in point, this TCGA data set. They did analyses, basically what they could, they got published, and they wanted to do so much more. The data has so much value, but you have to spend so much time just getting it ready. This is really what we’re trying to accomplish: making this data rapidly analyzable, in sort of an assembly-line way.

Harry Glorikian: Yeah, I’m trying to draw analogies to other things that I see going on in the tech industry, like, codeless sort of programming, where people who aren’t familiar with the data analytics side of it can sort of pull different analytics packages or, or scripts that they can use to run on their data without having to know how to code everything up from scratch. Is that a reasonable analogy to make? I mean, the other one that I was thinking of earlier that was GitHub, right. Where people can access these things that are written once by one person, but used by multiple people. So you don’t have to always go back to a data scientist and say, do this for me.

Jesse Paquette: Yeah, I can, I can take that. In some sense what you’re talking about is a marketplace and we do have a longer term vision for being a great marketplace for resources around an analysis of data. So if, if we have a really good turnkey connector to a critical data source, like an electronic medical record or a genomics data source, we can bring that in and people could use our system to build hybrid solutions. In many ways, it’s, it’s similar to, I think the way JavaScript works with NPM, or R works with all of its R libraries or Python works with all of its Python libraries. There’s this whole world of really useful stuff out there that you can sort of just swap in and out and, and, and make useful.

And I think our system really does that very well with data sources and data modules that represent algorithms or app workflows on data. That’s a long-term vision of ours, definitely. I think in the short term, what we’re focusing on mostly is the low-code system and being able to deploy useful application layers on top of data in such a way that you can just do it really quickly with robustness and security, and then also get to iterate with the end users. It’s very important: if you’re going to build an application for a physician or for a researcher, you have to work with them to make sure that it’s really useful.

Harry Glorikian: Yeah, that was the word I was actually, it’s funny that escaped me. Low-code was the word. There’s too many damn new words that I need to keep track of for all these changes that are happening. So your VP of customer, Mark Mooney, said that you guys are solving this, quote “Last mile” of data analysis. What does he mean by that? Yeah.

Tom Covington: Yeah, so if you think about it from a physician’s perspective, or a scientist’s perspective, it gets back to this long lag between making an analysis request and getting a result. We touched on GitHub earlier. Let’s say you’ve got something in GitHub that can be reused by others. How big a population can it be reused by? If it’s in GitHub, it’s likely for data scientists and other practitioners of those arts. For the physicians, the knowledge workers who are trying to extract the insights and make discoveries in data, they need a place where they can actually ask questions. And that’s where this kind of low-code application development environment helps, because you can very quickly build and deploy apps that speak the language of the domain expert, and allow them to ask their questions as they come to them, as opposed to having to work with or through a data scientist to generate those insights.

Harry Glorikian: But it is the data scientists that are helping build certain parts of this, right? So they’re not excluded from the process.

Tom Covington: So no, no, they’re critical to the process. They basically, instead of most of the time, and Jesse can speak more fluently on this, or eloquently on this, but traditionally, if you have a request, you hand it off to a data scientist, they tend to do ad hoc analysis.

So they’re like, okay, what are, what are the tools that I’ve got at my disposal? What’s the fastest way to generate this answer for the requester? And they will use various scripts and various languages and come up, generate the output. If there is a follow-up question, some of that may be reusable, but not all of it.

And then the process of doing a follow-up question can take a lot of time. If the data scientist instead builds an analysis app that allows reparameterization, and the ability for the end user to ask 10,000 variants of a similar type of question, then they can do the work once, publish it, and then lots of people can use that basic workflow to answer their specific question.
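
[Editor’s illustration] The idea of doing the work once and letting users reparameterize it can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function name `mean_response`, the record fields, and the sample values are all hypothetical and are not taken from Tag.bio’s platform.

```python
from statistics import mean

# Hypothetical sample records; the field names are assumptions for illustration.
RECORDS = [
    {"cohort": "treated", "sex": "F", "response": 0.8},
    {"cohort": "treated", "sex": "M", "response": 0.6},
    {"cohort": "untreated", "sex": "F", "response": 0.3},
    {"cohort": "untreated", "sex": "M", "response": 0.5},
]


def mean_response(records, **filters):
    """Average the 'response' field over records matching every filter.

    The analysis logic is written once; each new question is just a
    different set of keyword arguments.
    """
    selected = [
        r for r in records
        if all(r.get(key) == value for key, value in filters.items())
    ]
    if not selected:
        return None  # no records match this combination of parameters
    return mean(r["response"] for r in selected)


# Two variants of the same question, with no new analysis code written.
treated = mean_response(RECORDS, cohort="treated")
untreated_women = mean_response(RECORDS, cohort="untreated", sex="F")
```

The design point is that a follow-up question changes only the arguments, not the code, which is what makes the workflow reusable by people who did not write it.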

Harry Glorikian: So it’s basically, over time, any organization will end up theoretically with a library of these analytic tools that then they can use in different variations. And so theoretically then maybe the data scientist can work on more complex issues.

Tom Covington: Exactly. The more fun stuff.

Harry Glorikian: Yeah. Okay. So, let’s go to history, right? You guys started this in 2014. I don’t even know if there was a low-code movement happening in tech in 2014. I’m not so sure.

Jesse Paquette: WordPress is low-code.

Tom Covington: I guess we started it a bit, a bit early. But it was based on some of Jesse’s career as a bioinformatician and what he saw as shortcomings within the industry and the ultimate job of empowering physicians and scientists to make discoveries and find insights quickly. He recognized that this was a constraint on the pace of innovation. And so when we started the company, nobody was talking about data mesh. I’m not even sure there was much around low code other than, as Jesse mentioned, in WordPress. But there has been a shift in the past couple of years towards data mesh as an improved solution for data lakes and data warehouses. And low-code is the preferred path forward for developing software applications.

Harry Glorikian: Yeah, Jesse. I mean, I think if I remember correctly, you were doing that, you were doing sort of analytics of gene sequence data at Life, right? So is this, is that where you, the epiphany came?

Jesse Paquette: Before then, actually. I was working at the UCSF cancer center and as Tom described, I was in a situation where I was working with a number of really talented researchers, these knowledge workers that had interesting datasets, but it all required computational analysis. They were either too big or too complex. And I found myself repeatedly doing a lot of analyses. And at some point I thought, what? I can start to automate this. I can start to automate that. And I put together a platform for a specific purpose. It’s called EGAN, E-G-A-N, which stands for exploratory gene association networks. And it was basically a new way of looking at data as well as a new way of structuring data so that these analyses can be done more repeatedly, and in more of a workflow.

Jesse Paquette: And then I went to Life Tech and worked on similar applications. I went to a company called Ayasdi in Palo Alto. That was they, they had a, just a blockbuster algorithm which is, which is still really cool. And they were building applications around that and I was working on their life science applications for them, and it really comes down to the user experience. Physicians, they need to be able to start with something and know how to use it out of the box or with very minimal training. And when they come back and when they have that question again, two weeks later, they want to be able to come right back to the application and use it like they’re using email or they’re using Google or just like using the, tapping on their phone.

And, and it was, it was interesting. We started working in sports. And with the sports users specifically with an NFL team, our earliest iteration of our platform, what we had was a very complicated user experience. And we showed them how to do this really cool analysis analyzing when a certain receiver was getting passes and scoring touchdowns. And he said, well, great, but can you do it again for a different player? And I said, Oh yeah, well, I just have to click here, here, here, here, here, here, and here. The light bulb went off and we realized all we should have to do is just choose a different player. That’s how it should work.

Harry Glorikian: Yeah, but this is sort of like, I should be using it. Cause I’m asking questions all the time about a company or a technology or, and there’s all this data behind it that I’m sort of putting together to do my analytics of what makes a good company, what doesn’t make a good company. When is a technology on its upwardly mobile curve, right? So I’m, it’s not the same type of data, but it’s definitely data that I will make decisions based on. So a tool like this, I can see has more application than just where you guys are focused.

But Tom, your background is mechanical engineering manufacturing, clean energy. How did you two get together and start a bioinformatics company?

Tom Covington: Yeah, well, so Jesse and I have known each other for about 12, 13 years now. We played soccer together every Monday night and I knew a little bit about what he was doing. But one night after a match, when we often would go have a beer afterwards, he started talking about his idea. And his idea was essentially built on the foundations of EGAN, which he had developed to allow biologists to do some of their own pathway analysis when he was at UCSF. And as he started talking about it, I thought back to my time as a race engineer for Honda, which I was for several years. We were always generating large amounts of data at the track every weekend and trying to analyze that to improve the software, come up with new algorithms, new ways of controlling the engines. And I was pretty good at torturing the data in Excel, but that was the limit of my capabilities.

And what I recognized that he was describing was a tool for people like myself, to allow me to rapidly find insights in complex data. And that was pretty appealing. And this is, as you kind of alluded to, a fairly generic platform. We have aimed it at the healthcare and life sciences space, because from our perspective, precision medicine is a long-described Holy Grail. There are some inherent challenges, specifically with the disparate data sources and bringing them together. And my wife is a physician at UCSF, Jesse’s worked at UCSF, I’ve worked at UCSF. We kept getting pulled back into the healthcare and life sciences space. And so we decided to focus there, and we think it’s a satisfying and fantastic opportunity. At some point we may evolve beyond precision medicine, but for right now, we’re very clearly focused on precision medicine and the opportunities that it provides.

Harry Glorikian: So I like the word tortured, torturing data. I got it. I’ve got to use that in a few places. I’ve always tried to be nice to the data, so it’s nice to me, but I’m happy to torture it. And it does sound like there’s more of a generic application to what you guys are creating. I know that everything requires some focus, but this does look like it could be used in a lot of other spaces, even if you drew diagrams of adjacent areas that would give you that expansion.

So I was thinking, what does Tag.bio mean? Is it based on Jesse’s previous work on tag-based analysis? Where did that name come from?

Jesse Paquette: Essentially, yes. I mean, if I had to answer quickly, yes. I don’t think Tom still has the whiteboard where we drew all of the possible names of the company and started to put together portmanteaus and stuff. I think his kids have long since drawn over that multiple times. We were going for a lot of things with the name. We wanted it to be short. We wanted it to not be a name where people had to ask, “How do you say that?” Which a lot of startups get, right? You try to come up with a really crafty way of spelling something, and then the first question you get is, “How do you say that?” So it’s clear. And it uses the .bio domain, for better or for worse. And it really relates to the concept which we had initially, which I had even farther back, going back to UCSF, which is based around treating categorical data as sets.

And so it gets a bit into the mathematics of things. Basically, if we talk about the set of patients who lived versus the patients who died, right, in categorical data it’s represented as sort of deceased or alive. And many times algorithms are just going to look at that and treat them as words, or treat them as certain things, as statisticians would.

But if you consider those to be sets, you can start to intersect sets with others. Like you have treatment, right? So some people were treated, and some of them responded well and some didn’t respond well. Some people weren’t treated, and they responded well or didn’t respond well. And all of a sudden you start thinking about that using set mathematics. Tags are a good concept for that.
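
[Editor’s illustration] The set-based view described here can be sketched with Python’s built-in sets. The patient IDs and tag names below are made up for illustration; this is not Tag.bio’s data model, only an example of treating each categorical value as a set of patients and combining those sets.

```python
# Each tag maps to the set of patient IDs carrying that categorical value.
# IDs and tag names are hypothetical.
tags = {
    "treated": {1, 2, 3, 4},
    "untreated": {5, 6, 7, 8},
    "responded": {1, 2, 5},
    "alive": {1, 2, 3, 5, 6},
}

# Union: everyone in the study.
all_patients = tags["treated"] | tags["untreated"]

# Intersection: patients carrying both tags.
treated_responders = tags["treated"] & tags["responded"]
untreated_responders = tags["untreated"] & tags["responded"]

# Difference: patients carrying one tag but not the other.
treated_nonresponders = tags["treated"] - tags["responded"]
deceased = all_patients - tags["alive"]

print(sorted(treated_responders))     # [1, 2]
print(sorted(untreated_responders))   # [5]
print(sorted(treated_nonresponders))  # [3, 4]
print(sorted(deceased))               # [4, 7, 8]
```

Once every category is a set, a new cohort definition is just another combination of intersections, unions, and differences, rather than a new piece of analysis code.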

Tom Covington: Yeah, the simplest explanation is we live in tagged data and we came from biology.

Harry Glorikian: So you guys have been working on this for seven years, right? If my math is correct. And that’s enough time for both the product and business model to have evolved. I’m assuming that it has, a few times. Can you walk me through how the platform has changed over time, or how the concept of the ideal customer for the platform has changed?

Tom Covington: Yeah.

Jesse Paquette: If I could start from the technical side, I don’t think that the platform has changed really at all. It’s exactly what we designed seven years ago. It’s just gotten a whole lot better based on all of the team members that we brought in to do the work for us, all the things that Tom and I don’t do particularly well. We’ve been able to complement ourselves with cloud architecture people working on projects in specific healthcare or life science areas. But when it comes down to the core tech and how useful it is and how scalable it is, I don’t think it’s changed. So I’ll let Tom talk about the business, because that has changed.

Tom Covington: Yeah, our original vision was to essentially mirror the World Wide Web, but for data. So in the World Wide Web, you’ve got data, you’ve got web servers, you’ve got a communication protocol, HTTP, and you’ve got browsers for interfacing with that content. And we wanted to mirror that for data. And so we have data servers, we’ve got a smart API as a communication protocol, and you can similarly access content on those data servers via a web portal. That concept is [gone, but] the platform has remained the same. What we’ve learned through customer interactions is how to improve the user experience around accessing data. And I think that in our explorations in multiple verticals, speaking about that NFL team, there were really simple kind of aha moments, like, “Oh, that’s going to be critical for kind of any user.” And so we’ve learned a lot from the interactions with customers about how to improve the user experience. So I think from the platform perspective, and the kind of flexibility and generic applicability of it, by looking at a bunch of different verticals initially, we learned what was going to be core across verticals.

Tom Covington: Part of the reason for the focus on healthcare and life sciences is that, on the surface, they look pretty different in terms of their data types. But we’ve developed a platform that can be kind of agnostic to data types and analysis types. And so it is well-suited to marrying two disparate types of data together. And so for us, the opportunity of precision medicine is one that kind of emerged from those realizations and those learnings from other customers, from the types of people that want to use it, and from how the business has evolved. Originally, we started with kind of researchers, people that were not quite high enough in an organization to make buying decisions. We’ve since learned, and we now approach at a higher level within an organization. Because this is a concept that is different enough that it requires some vision, and there are various users within an ecosystem, from the IT side and security side all the way up to the end-user domain experts. And so you need to approach at a high enough level of an organization that they can see the vision and be receptive to the idea that the current status quo is not working well enough and not fast enough. The cost of answering a question from data is just far too high. And if it is that high, you are fundamentally limiting the pace of innovation within an organization.

Harry Glorikian: Yeah. I mean, because I was thinking to myself, I’m like, the next level would be like, again, if somebody writes the analytics part that can be reused at multiple organizations, right. That just theoretically speeds everything along, regardless of the data source that it’s ingesting. But how did you guys come about this whole idea of like, quote, “analysis apps” and do you guide users to like, this might be the right one for you to click on, to use for this? Or do you guys just provide the platform?

Tom Covington: Jesse. Do you want to take that?

Jesse Paquette: I mean, there’s the technical aspect and then there’s the business aspect. I’ll talk about the technical aspect, and it’s something that we’re learning about with every interaction we have with a user or a customer. With big organizations there are policies in place. There are either formalized SOPs or there are rigid sort of cultural silos and things like that. And as everybody knows, even if you have the most useful thing, if you don’t institute some form of change management or training within the organization, you’re not going to get the adoption that you need, even if you just have the best tool ever. If you put Google in front of somebody who’s never seen Google before, they still might not use it unless you actually turn on their phone and point their fingers at it. And so we do make some effort to onboard users.

We think it’s very useful. We also then get to observe their experience and learn about the naive user experience. Something we care about specifically. And the experienced user is also important. We find that we have some power users who just love our system and they have no problem trying to do all sorts of fancy things with it, to the point where they want more apps. And, and at that point it’s, it’s up to us or their in-house development team to start giving them some more apps on some, maybe some new data that they need. And it’s, so we, we do spend a fair amount of time with our users. Yeah, Tom?

Tom Covington: Yeah, I think, kind of from a big-picture perspective, the platform is flexible enough that you can build very simple apps and also very sophisticated apps. So, an example of a simple app would be, how much does this particular drug cost within a hospital system? That’s a simple dropdown. Any user can see the title of the app and click on it and know exactly what it’s going to do. And then you get into more complicated apps, where it may be doing some advanced clustering algorithm, and you’ve got to select the cohort that you want to look at. But it’s designed so that the data scientist developer of these apps can write them in a way that will speak to the end user.

So, a healthcare app is going to a physician who is gonna understand that intuitively versus a researcher at a large pharma organization, they’re gonna have different data, different analysis needs, their apps are gonna speak their language. And so it’s a lot of it is down to, and this is one of our learnings through these various customer interactions, was that we need to enable the building and deployment of apps that speak the language of the domain expert and make it really easy and intuitive for them. When they just, they see an app they’re like, “Oh, I know what this is going to do automatically because I can, I recognize the, the analysis methodology, or I recognize the data fields in there.”

But it’s all tied around making the user experience as easy as possible, so there is minimal onboarding. One of the things that other software platforms that allow analyses don’t do so well with is the user experience. Just think about something like Excel. If I build an Excel model and then share it with you, you may have questions or concerns about tweaking anything, because you don’t know what went into that Excel model. And you can add all sorts of things. You can do all sorts of things. There’s all sorts of functionality available within the front end of Excel. And honestly, there’s too much complexity. Even Excel can be overwhelming to somebody who hasn’t used it before. And we’re trying to make something that the least sophisticated computer user would be able to understand just from clicking around and trying it and running an analysis.

Harry Glorikian: I should start using this myself for all this stuff I try to do. But how hard is it to sell the product, and the big ideas behind it, to potential customers? I mean, do they go like, “Oh my God, I totally get it. Now I’m jumping on this.” Or is it, I don’t want to call it a slog, but how much education does it take for an organization to get this big idea?

Tom Covington: Yeah. So it previously has been a slog, because it is enough of a shift in the thinking that it takes some time for them to understand. Use cases and deployments at some of the large pharma and health care organizations that we’re currently at have certainly helped. The other thing that has really helped make things go faster is the recent kind of adoption of data mesh as kind of a new paradigm for the next generation of data lakes and data warehouses, with domain-specific data products.

The fact that other people are talking about that, and that we essentially built to that seven years ago, has certainly made things easier. There’s less education that has to happen from us with respect to a customer. Also low-code: that is something that, for the most part, you can just say, and people kind of intuitively understand because there are other examples in the marketplace. And so I think that we started the company pretty early relative to where the market was. But now the market is kind of catching up in terms of understanding the core concepts. And so that has made customer acquisition a lot easier.

Jesse Paquette: I’d like to add one more thing. So we’ve been talking a lot about end user experience. And that’s been our primary focus from the beginning. Over the last couple of years, we have learned about a second domain of user experience, which is equally important, which is the developer experience. We’ve always been trying to support our internal developers, our collaborator developers, and our customer developers, but now we’re working on improving their experience.

So if they’re data scientists, they should be able to work natively in R and Python to develop on our platform, and they should be able to bring their own algorithms and their own visualizations into our platform. If they are more of a front-end application developer, they want to use JavaScript. And they’re okay using the JSON low-code templates to configure the platform and the data nodes. If they’re data engineers, they’re going to be working on the data plumbing layer, and we need to have a very good API system and set of SDK software development tools, right, for mapping the data in from the state-of-the-art data platforms that they’re very proud of.

So we want to fit very nicely within the things that people have already been building, and in doing so we find that the reception we’re getting from customers is much more positive, because instead of saying, “You’ve got to throw away all this stuff and use tag.bio,” it’s, “Well, tag.bio fits right here, and it fits right there, and it could fit over there, but you’re using that other thing. So we’ll just wait on that one for a while.”

Harry Glorikian: Okay. So somebody buys this and puts it in place, starts to utilize it. How do you guys measure, I don’t know, a payback? How do you measure advancement? How do you measure impact? Because, right, all of this is to make life easier and faster, and to find that billion-dollar molecule faster, if you’re looking at it that way, or to identify a patient that would benefit from something faster, right? I’m assuming there are lots of use cases that you guys have. So how do you measure the “Holy shit, I found it” moment?

Tom Covington: Yeah, that’s a great question. One of the things that the platform kind of inherently does is it keeps a history of every analysis that’s been run, so a user has a full history of their analyses. Thinking back to an Excel model: any tweak you make to an Excel model, you may notate by just changing the file name. In our world, every analysis that’s been run is annotatable, it’s replayable, it is shareable. So you’ve got a user history, then you’ve got an organization’s user history, across all data nodes and all users. So from an ROI perspective, the simplest metric is: how many more questions are you able to ask of your data than you previously could? The quick answer is it’s about 1000x more. Just by short-circuiting the process to ask and answer your question, people ask a lot more questions, not surprisingly.

The other is, we hear from the customers. Their direct feedback on how impactful it’s been, how much it has changed the culture of the organization, how people are now talking about data the same way. Whereas previously, the domain experts, the knowledge workers, talked about data in a different way than the people who are actually practicing the arts of extracting information from data. So we see it on the cultural side, but then we also hear use cases from, say, one of our large AMCs. They’re using it right now for strategic financial recovery after COVID, and they’ve been tasked with, how do we reduce costs and increase revenue while still maintaining or improving care? And there are examples from that that are literally in the millions of dollars, just from one physician asking questions over the course of a couple of hours, able to identify opportunities and then surface those. And they implement them and, sure enough, it’s dramatic in terms of the impact to the organization.

So that’s the kind of feedback that we get. And that’s why the use cases are so impactful when we engage with new customers. We can say, look, this is what was possible at organization X. And this can be similarly possible for you and your organization.

Harry Glorikian: Yeah. You almost want to publish all that to make sure that everybody gets the message because that’s the goal, right?

Tom Covington: Yeah. There will be publications that come out of this, because some of the work they’re doing and the impact it’s having on organizations is going to be replicable at other places. And there are novel ways of thinking about data, looking at data, that they get to leverage via tag.bio that fundamentally are going to change these organizations for the better.

Jesse Paquette: I’d like to bring up one thing, and it kind of relates to what Tom was saying. It sort of boils down to a bit of an ethos that we started with, which back in 2014 was sort of completely contrary to the hype of AI that was happening between, say, 2014 and 2016. We would talk to a lot of folks and they would say, are you AI? And we would have these debates about, Tom, do we actually say we’re AI? And we’d think, okay, now we’re going to say we’re AI because everyone cares about it. And then we would think, no, we are definitively not AI. While we have machine learning algorithms under the hood, we are first and foremost focused on the knowledge and the discovery power of the knowledge worker: the physician who has 20 years of experience in the ER, the biochemist who’s been working at a pharmaceutical company and in academia for 20, 30 years. They have so much information, and their community of peers has so much information, detailed knowledge inside their brains, that is not being joined properly with the data that exists in these databases. And that’s really what we’re trying to do, is bring those two together. And it’s interesting to try to quantify. As Tom was talking about, we’re working on those metrics.

Harry Glorikian: So who do you guys see as your competitors? Because when I hear low-code and things like that, there’s, I immediately go to the tech side. Right. Because they’re all, the valuations are off the chart right now on some of these things, but who do you see as competitors and how do you differentiate from them?

Tom Covington: That’s a great question and it’s one we’ve gotten a lot. So we kind of tie three areas together: there’s this data engineering aspect, there’s the data science aspect, and then there’s the end user experience. We have competitors in all three of those areas, but there are none that span those three areas. So we may have folks that are doing some really great work on the data engineering side, or maybe on the data science side, or even in the end user software side. But there are none that currently link those three legs together. So some of the competitors may start to approach us in certain areas, but there is not a kind of end-to-end solution that takes generally analysis-ready data, marries it with these data science capabilities, and then turns that into a low-code application platform. So, for the time being, we’re a bit unique. But obviously, as we start to gain more traction, there are going to be people that are going to start trying to approximate what we’re doing. And we anticipate that and look forward to the competition. But realistically, right now there’s no great solution that kind of packages up those three legs that we span.

Jesse Paquette: We’ve encountered a lot of potential customers or customers of ours that had previously tried to stitch together a solution which didn’t look like ours, but was trying to solve the same problem: one that really connects those three layers, the algorithms, the data engineering, and the end user experience. And they’re trying to stitch them together using open source components. They’re basically trying to support a whole software environment within either a pharmaceutical or a healthcare organization. And it’s really hard for them to sustain. The technical debt mounts, and the project eventually fails.

So we do see that. For example, when we approach a large pharma or big healthcare institution, they are familiar with the problem. They probably have an in-house solution that they either built, or they had some consulting firm come in and build for them. And some people in that organization feel rather proud of that thing that they’ve built. And other folks absolutely hate it because it doesn’t solve 80% of their problems. And it’s an interesting environment to get into, but it’s usually not another vendor. It’s an in-house, self-built solution.

Harry Glorikian: Yeah. Tough to get over some of those issues. I know if one of my partners was here, the first thing he’d say is, I’m sure you guys are filing IP on some of this. So hopefully you guys are able to protect it and create at least a moat around what you guys are building. Because it does sound like it was way ahead of a lot of the competitors.

Tom Covington: We have filed for some patent protection, or some patents, yes.

Harry Glorikian: So, COVID seems to have had an impact on, it seems like, every organization I talk to these days, and some of it has caused things to move a lot faster. Have you guys seen an acceleration of your business, and/or are there places where people have said, yeah, your system is how I’m going to help find a solution from analyzing patients in COVID? I’m looking at it from both sides, right? Where the telemedicine came whooshing in, because everybody needed it. And so I’m trying to figure out, did it accelerate your business? And then through the acceleration, did it actually help identify opportunities in patient populations?

Tom Covington: Yeah, so it hasn’t been as dramatic as, say, telemedicine, because clearly everybody needed that right away. And so there was a big push in that effort. But it has accelerated certain aspects because, once you’ve got COVID patients, you want to understand that patient population and you want to be able to do research on those patients. And so from that perspective, it has accelerated some business. Specifically, there’s a large AMC that wanted to create a COVID patient registry and do analyses on it.

And we were able to get that up and running for them in about five days which allowed their researchers to do some pretty sophisticated analyses around survival, looking at what the makeup was, what was correlated with folks that ended up being, for example, intubated. So there was a clear need on their part to very rapidly be able to perform analysis on their COVID patients. And tag.bio was able to fill that need very quickly for them. And so I think there are other examples like that, that have been accelerated via COVID or the pressing need of COVID. But there’s, it’s also not as high a priority, say as telemedicine. So I think it’s been good for us in general. But I also think it is not quite as bright and shiny as the, “Oh my God, we need a solution for how we can continue to see patients when they can’t come into clinic.”

Jesse Paquette: I would add that I think what we’re doing is we’re riding a much larger, but slower moving wave because of COVID, which has to do with cloud adoption. We are working with a number of cloud providers as channel partners, and within the healthcare and life science space, there is a lagging surge in cloud adoption. And we’re seeing more interest in our platform, more meetings, more proofs of concept, and more getting through the stages of the sales cycle, which is usually a really long sales cycle in healthcare and life sciences. You have to get a lot of people to approve. You have to go through the security approvals and the risk assessments, and you get the right people to sign off at all levels. There’s a lot of stakeholders within the organization. But being part of this cloud wave means that the organization has already decided, we’re going to pick one of the major cloud providers. We’re going to build out more infrastructure, perhaps all of our infrastructure, on that cloud. And it’s this sort of new green field opportunity where useful applications like ours can come in and be easily adopted compared to the older model, where there’s more inertia.

Tom Covington: Yeah, that’s a, that’s a great point. Yeah.

Harry Glorikian: So what have I not asked you guys? I mean, I’m also thinking about, how does all this data, does the platform actually let you also visualize some of it? Because seeing things in certain ways makes it easier for me to tease things apart when I’m looking at it. But what have I not asked you about your platform that you think I missed?

Tom Covington: It’s a good question. I mean, I think one of the things that we are realizing is that there’s a lot of value in having full provenance of analysis and having kind of a full history. It creates essentially an additional data source for how data are being used within an organization.

So being able to understand which data nodes are of value, which analysis apps are of value. We talk about UDATs, or useful data artifacts, and those could be gene signatures. That could be a particular cohort of patients. Those UDATs get discovered via the platform and then get shared via the platform. And then the visibility on those is accessible to the kind of senior leaders within an organization. You start to understand the value of your data a lot better. And right now, particularly on the life sciences side, and even on the healthcare side, they may have immense volumes of data that are not being utilized. They’re being stored because they believe there’s value in them. But the time to extract that information is so high, and the cost associated with asking questions is so high, that you don’t have a good sense of, what are valuable datasets, what are valuable analysis applications? And we’ve provided this additional useful dataset for an organization around where the greatest value lies within their organization, within their industry, and within their infrastructure.

Jesse Paquette: I’d like to extrapolate on that, if I could. Again, to quote our VP of customer, Mark Mooney, we think about it this way. Even if you have the most useful data analysis application on top of your data right now, what happens is that people use it and you get information and you start to save it to your computer. You start to take it away from the system to be able to take action on it. Maybe, for example, in health care, you might realize that if you do something in the ER, you’re going to improve patient care and improve your bottom line. And it’s a really useful thing. What Tom had just described, the useful data artifacts, means that there’s a gravity in our system, that all of the useful things that are found and created in our system stay central to the system, with attribution and provenance about who made them and who created them. They become shareable and reusable units of information, which is a very different paradigm than other analysis systems. Say, if you take your favorite visualization app, you’re going to take something away. You’re going to send it to somebody in an email. It goes away from the system. And ours is really trying to bring all of the useful things that were created from the system and keep them there so that they can be found and reused.

Harry Glorikian: Yeah, I’m almost thinking like you would rank these, you would, at some point be able to rank them to let people know which ones are more or less useful and maybe why they were useful. Right. Which might generate more of that type of data.

Tom Covington: Exactly.

Harry Glorikian: Wow. So great learning about this. Because I have to admit, when I started reading about this, I’m like, I’m going to get in over my head really quickly, but this was incredibly useful. It sounds like something I almost wish was self-serve and I could use it for some of the stuff that I have, but it sounds like it’s more, you have to deploy it within a certain network, as opposed to one individual like me utilizing it.

Tom Covington: We are coming for you, though. It’s going to be probably a year and a half or so, but yes, ultimately we want to empower people like yourself to be able to set up a system like this for yourself relatively easily.

Harry Glorikian: This was great. I look forward to keeping in touch and hearing how this evolves, and maybe one of these days I’ll be your beta user to try my own data analytics and see how we can use it for our own organization.

Tom Covington: That would be fantastic. We would love to help.

Harry Glorikian: Thank you so much for joining me today.

Tom Covington: Thank you very much for having us. We really appreciate it. And we enjoyed the conversation.

Jesse Paquette: Thanks, Harry.

Harry Glorikian: That’s it for this week’s show. We’ve made more than 50 episodes of MoneyBall Medicine, and you can find all of them at glorikian.com under the tab “Podcast.” You can follow me on Twitter at hglorikian. If you like the show, please do us a favor and leave a rating and review at Apple Podcasts. Thanks, and we’ll be back soon with our next interview.
