Speaker 1: Thank you, Cheryl. Yeah, I'm very proud to be the first presenter this year; it's a big responsibility as well. Okay, so let me tell you about Beagle, and first of all, what Beagle is. Imagine that your eDiscovery platform understands natural language and has reasoning capabilities. This is essentially the core of what Beagle can do. Right now, all the platforms are essentially hard-coded, meaning they can only do what they are programmed to do. And even when they use predictive analytical tools, those are usually fairly straightforward algorithms based on simple heuristics. In contrast, Beagle can actually understand the intentions of the user, understand the user's requests, understand the content of the documents it works on, and match them with each other.

That helps on several levels, for example with search. You no longer have to iterate over different synonyms of your keywords or phrase your citations in different ways. You can simply ask questions and get much more meaningful results, with far fewer false positives.

Based on these capabilities, we also introduced the concept of automated document review. This is a way to analyze large sets of documents, work that is currently done entirely by humans, usually by contract attorneys. We do it in an automated way, which lets our clients save a lot of time and money, and accuracy also goes up, because with the way people currently do this, it's quite difficult to achieve the desired level of quality.

I also want to quickly talk about how all of this started. The idea of building this product arose over six years ago, when I was still doing my PhD. I was working on deep learning, mostly on computer vision problems, but I was keeping an eye on what was happening in adjacent areas, and I was quite impressed with some of the results. Even then, over six years ago, it was already possible to ask natural language questions about Excel tables and get meaningful answers. I was frankly so impressed with these emerging reasoning capabilities that I decided I wanted to build a product based on them.

The legal industry was one of the obvious choices, for several reasons. First, I knew that legal services are very expensive, and what attorneys mostly do is analyze content, analyze text, and reason about it, so there was a natural match. Second, reducing the price of legal services also increases access to justice for smaller litigants who currently cannot afford even to enter the litigation process, because they anticipate the very high costs associated with it.

As for how this got going: I was at Snap for the last five years, where I led the machine learning team. At some point, I decided to pursue this idea. I found a co-founder, Udit Sood, who is currently a senior associate litigator at Covington, a pretty big law firm. Together, we narrowed this vision down to a tool for eDiscovery, a platform for eDiscovery, for several reasons. First of all, this is the area where these reasoning capabilities are needed the most.
And second, he knew all the issues with the current solutions. He realized that they are very far from perfect and that there was an opportunity to build something much more powerful and capable than what is on the market right now. Later on, I found our CTO, Maxim. He was previously an ML engineer at Facebook and, like me, a machine learning manager at Snap. I convinced him to become the CTO; it was not easy. But yeah, that is our team. We are also advised by top industry experts. One of them is the director of eDiscovery at another big law firm. And our third advisor is a general counsel at several recent unicorns; right now he is at one of the top companies, I can't say which one, but he has a lot of experience and is helping us a lot. Okay, can we go next?
Speaker 2: Sergey, one more question. Sure. I'm not sure you said it, but it's one of my favorite things. You've already successfully launched and sold a product, that's how you got to Snap, right?
Speaker 1: Yes, exactly. I was the founder of another startup, which built machine-learning effects for mobile devices, and at some point it was acquired by Snap.
Speaker 2: Excellent. Thanks. Sorry to interrupt.
Speaker 1: Good. Carry on. So, I wanted to talk about a case study we conducted and about automated document review overall. Let's look at this email. It comes from a collection of Coca-Cola emails that were published as part of a litigation. In this email, they discuss the publication of a research article which claims that one of the major predictors of childhood obesity is lack of physical activity rather than sugary drinks. Obviously, the Coca-Cola folks are happy about that.

If we go to the next slide: given a hypothetical request for production, for example one looking for scientific findings about sugar-sweetened beverages, this is the result we would generate for this type of email. Unlike human reviewers, who produce a simple yes-or-no binary tag, we also generate a document summary, the reasoning behind exactly why we made this decision, and a relevancy score, so you can sort all your documents and go through them accordingly. Most important is the reasoning, so you can verify why the decision was made. With just a single tag, you can't really do that, but here you can see the summary, see the reasoning, understand why, and if there is any problem, you can fix it. Can we go next?

So, we conducted the case study: we took 500 emails like that and asked two groups of contract attorneys to review them. The first group did it the way they usually do, just yes-or-no tagging. The second group was asked to do it the way we do: with a response for each RFP, of which there were 11, and with the reasoning behind each particular decision. The results were very, very interesting. First, there was a huge difference in the number of responsive documents: the second group identified six times fewer responsive documents than the first one. I'm not even talking about Beagle yet; this alone tells you a lot about the quality of human review. If you do it the usual way, you get a serious false-positive problem.

Then we compared with Beagle. What we found is that Beagle's responsiveness rates and precision and recall were higher than what we could get even from the group doing the careful review with full reasoning. The recall rate was about the same for all three groups; however, precision differed a lot, and Beagle actually came out on top of both human groups. We were very happy to see these results, and no cheating, I promise. I think it shows a lot of promise for what can be achieved using services like ours. Can we go next?
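To make the per-RFP output described above concrete, here is a minimal sketch of how a document plus a request for production could be turned into a summary, reasoning, and relevancy score. The `call_llm` helper, the prompt format, and the JSON schema are illustrative assumptions, not Beagle's actual implementation:

```python
import json
from dataclasses import dataclass

@dataclass
class ReviewResult:
    summary: str      # short summary of the document
    reasoning: str    # why the document is or is not responsive to the RFP
    relevancy: float  # score in [0, 1], used for sorting the review queue
    responsive: bool  # final yes/no tag, derived from the score

def call_llm(prompt: str) -> str:
    """Hypothetical LLM interface; swap in any completion API."""
    raise NotImplementedError

def review_document(document_text: str, rfp: str, threshold: float = 0.5) -> ReviewResult:
    # Ask the model to justify its decision, not just tag the document,
    # so a human can later verify the reasoning.
    prompt = (
        "You are reviewing a document for responsiveness to a request "
        f"for production.\n\nRFP: {rfp}\n\nDocument:\n{document_text}\n\n"
        'Reply as JSON: {"summary": ..., "reasoning": ..., "relevancy": 0 to 1}'
    )
    data = json.loads(call_llm(prompt))
    return ReviewResult(
        summary=data["summary"],
        reasoning=data["reasoning"],
        relevancy=float(data["relevancy"]),
        responsive=float(data["relevancy"]) >= threshold,
    )
```

The point of keeping `reasoning` and `relevancy` alongside the binary tag is exactly what the talk describes: reviewers can sort by score and audit the stated rationale instead of trusting an opaque yes/no.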
Speaker 2: Oh, sorry, can I ask one or two follow-up questions? Of course, of course. And Kalina, can you just go back a slide so I can ask about the specifics? So I think the idea here was to really prove out the technology, right? You went in and did a head-to-head comparison, a Coke-versus-Pepsi type matchup. And I think the other thing that's going to be really important as we move more AI into legal is how we prove that it's legitimate, and this gets into it. Can you talk just a teensy tiny bit about how you're going to verify or legitimize the results for folks?
Speaker 1: Yeah, exactly. If necessary, we can take a random sample of the results we produce. Those randomly selected documents, with all the reasoning and results we provide, are then reviewed by human attorneys, who can verify whether the reasoning is correct or not. If we treat what they say as the ground truth, we can compute recall and precision from that and compare them with industry standards. As far as I understand, an error rate of 25% or less is currently considered acceptable, which is a pretty large number to me, and we can certainly demonstrate that Beagle can do better than that.
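The validation loop described here, sampling the machine's output, having attorneys re-review the sample as ground truth, and scoring against it, reduces to standard precision and recall arithmetic. A minimal sketch, with the data structures as assumptions:

```python
import random

def sample_for_validation(doc_ids: list[str], k: int = 50, seed: int = 42) -> list[str]:
    """Randomly select documents whose machine tags will be re-reviewed by attorneys."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, k)

def precision_recall(machine_tags: dict[str, bool], ground_truth: dict[str, bool]):
    """Both dicts map doc id -> responsive?; ground_truth comes from attorney review."""
    tp = sum(1 for d, tag in machine_tags.items() if tag and ground_truth[d])
    fp = sum(1 for d, tag in machine_tags.items() if tag and not ground_truth[d])
    fn = sum(1 for d, tag in machine_tags.items() if not tag and ground_truth[d])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Under this framing, the 25% acceptable error rate mentioned above corresponds roughly to a precision of 0.75 on the sampled set.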
Speaker 2: I love it. And then one last question: timelines. I know, because I've done head-to-heads with other products before, that the computers are obviously way faster. Can you give us any feedback on this particular case study, like how long it took the humans versus the computer?
Speaker 1: Yes. So in this case, it was a pretty small data set, so it's a bit difficult to compare, because most of the time went into setting up the instructions and specifying what needed to be done. I think the actual review time was relatively small, probably a couple of days. However, by our own estimates, Beagle can be equivalent to about 100 attorneys, and I don't think it's realistically possible to hire or coordinate that many. So yeah, you can do the math.
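To make "you can do the math" concrete, here is an illustrative back-of-the-envelope calculation. The per-attorney review pace below is an assumed ballpark, not a figure from the talk; only the 500-document count and the 100-attorney equivalence come from the speaker:

```python
# Illustrative only: the review pace is an assumption, not case-study data.
docs = 500                    # documents in the case study
docs_per_attorney_hour = 50   # assumed linear-review pace per attorney
attorney_hours = docs / docs_per_attorney_hour           # 10 attorney-hours
equivalent_attorneys = 100    # speaker's estimate of Beagle's throughput
wall_clock_hours = attorney_hours / equivalent_attorneys  # ~0.1 hours
print(f"{attorney_hours:.0f} attorney-hours vs ~{wall_clock_hours * 60:.0f} minutes")
```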
Speaker 2: Excellent. Thank you. Sorry.
Speaker 1: Carry on. Sure. Can we go next? Yes. So, I talked about automated document review, but I also want to cover several other features. First of all, there is natural language search and QA. It's no surprise anymore that you can do natural language question answering over a particular document, which is of course useful if you have a 100-page contract or something like that: you can find particular elements of the contract by asking questions. But we also provide this functionality over the whole data set, and that is already not such an easy thing to do. You get the responses as a list of documents, and moreover, we provide a short snippet summarizing the documents we identified, with links to the evidence relevant to your question. So if you want a quick idea of what the answer is, you can look at the snippet, see the evidence, and verify that the evidence is indeed supportive; and if you want a more thorough analysis, you can go deeper. The key thing, again, is that you don't have to keep reformulating your questions or cycling through keywords and modifiers. It even works across languages: imagine your document is in French and you ask in English; it is still able to identify those documents if the content is relevant.

Another thing we will soon introduce to the platform is natural language statistical queries and visualization. Imagine you have a lot of invoices and you want to see the revenue dynamics across different groups of those invoices. If you want to do that right now, you have to export all the numbers into a separate spreadsheet and build a chart on top of it. In Beagle, you will be able to just ask the query and get those charts straight away, because we can combine all the relevant sections, extract the numbers, and build the chart on top of them. I think this is another very useful feature that will simplify users' work. Can we go next?

Yeah, I can't really talk much about what we are building right now, but overall, we will be enhancing Beagle's capabilities to approach the level of a junior attorney. We will be adding more capabilities in terms of what it can do with the data, how it can handle different requests, and so on. At some point, you will basically be able to ask for insights. For example, you receive data from the opposing party and you want to ask what weaknesses this data set shows and how you might use them to support your own position. This is something you would normally have to work very hard to get, but if you get those kinds of hints, it's fairly easy to verify whether they actually hold. So imagine you get this hint, and your work is no longer doing the search yourself, but rather verifying the results of what the machine does. That is the trend we see, where everything is going, and that is what we mean by cruelty-free discovery.
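A minimal sketch of the cross-lingual retrieval idea described above, using a multilingual embedding model so an English query can match French documents. The model choice and overall approach are assumptions for illustration, not Beagle's actual stack:

```python
# pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

# A multilingual model embeds queries and documents into one shared vector
# space, so an English question can score against French text directly.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

docs = [
    "Le contrat expire le 31 décembre 2023.",      # French: contract expiry
    "Quarterly revenue grew 12% year over year.",  # English: unrelated
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

query = "When does the contract expire?"
q_vec = model.encode([query], normalize_embeddings=True)[0]

scores = doc_vecs @ q_vec  # cosine similarity, since vectors are normalized
for i in np.argsort(-scores):
    print(f"{scores[i]:.3f}  {docs[i]}")
# The French sentence about contract expiry should rank first.
```

And the natural-language statistical queries reduce to translating a question into an aggregation over fields extracted from documents. A sketch with a hypothetical invoice schema:

```python
import matplotlib.pyplot as plt

# Hypothetical invoice records, as if extracted from reviewed documents.
invoices = [
    {"month": "2023-01", "group": "retail", "amount": 1200.0},
    {"month": "2023-01", "group": "wholesale", "amount": 3400.0},
    {"month": "2023-02", "group": "retail", "amount": 1500.0},
]

# "Show revenue dynamics per group" becomes a group-by over (group, month).
totals: dict[str, dict[str, float]] = {}
for inv in invoices:
    by_month = totals.setdefault(inv["group"], {})
    by_month[inv["month"]] = by_month.get(inv["month"], 0.0) + inv["amount"]

for group, by_month in totals.items():
    months = sorted(by_month)
    plt.plot(months, [by_month[m] for m in months], label=group)
plt.legend()
plt.ylabel("revenue")
plt.show()
```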
Speaker 2: I love it. So, a quick peek into the future and the product vision, but it sounds like there are a few secret things you're still working on.
Speaker 1: Yeah, yeah, exactly.
Speaker 2: I love it. And then how can people connect if they want to just stay up to date on the status of the product?
Speaker 1: Sure. Yeah. Please subscribe to our waitlist. Here is the QR code for that. And we will stay in touch with you and you will get all the recent updates.