How EarMark Turns Meetings Into Finished Work (Full Transcript)

EarMark uses real-time voice AI to generate tickets, docs, and prototypes from meetings—and scaled reliably by switching to AssemblyAI.

[00:00:00] Speaker 1: Hi, everyone. It's Mark from Assembly. I'm joined here today by Mark and Sandon from EarMark. We're going to be talking through how they built EarMark and why they chose Assembly and how they can recommend other founders and PMs to be thinking about voice AI in their apps. Hi, Sandon. Hi, Mark. Thanks so much for coming on today. Maybe just walk us through first what EarMark does.

[00:00:22] Speaker 2: Yeah, so EarMark is a productivity suite where the work completes itself. So what EarMark does is it listens to your meetings in real time and turns what's said into finished work. Things like docs, tickets, updates, and next steps. And unlike generic AI meeting tools, it actually produces real artifacts, real work. So product teams can move forward without all the manual follow-up.

[00:00:46] Speaker 1: Nice. Was that how EarMark started? Could you walk me through the origin story and what the first version looked like?

[00:00:53] Speaker 2: Yeah, so the origin story and the first version was radically different than what we're working on now. We actually started EarMark as a Vision Pro product. You know, so essentially what it was, was a AR VR rehearsal experience, right? Where we wanted to help product engineering leaders be more preparatory, like as they're influencing. I mean, a lot of folks are certain leaders, a lot of people in product, nobody reports to them, right? So the concept was, how do we make people effectual, like in their ability to influence? So we started with this idea of real-time speech coaching, right? So, you know, you could go into an immersive environment, you know, in your Vision Pro. You know, you have your Google slide deck, you know, presented. And then you would have real-time feedback in terms of, you know, whether or not you're breathing. You know, should you enunciate a little bit better? Should you speak up more? Maybe you're speaking too quickly, right? And provide real-time feedback around those concepts. And then we decided to, you know, conduct a bunch of user research, right, around that particular solution. And it turns out that, you know, what we learned and the key insight was that nobody really prepares for anything. So we made a rehearsal product for people who don't prepare for presentations. So that was the key learning. And then what we did was we took, essentially, we just pivoted, right? And we took the idea of a real-time feedback experience, right? And we put it on the web. And the concept there was, could we essentially enable product folks to be more informed in the moment, right? So that was, like, sort of the first thread we started pulling as a service. And that evolved to automated creation of artifacts and deliverables of work while people were having conversations. And we're about five iterations in since our Vision Pro solution. And, yeah, just in market and having conversations with customers and prospects every day.

[00:02:47] Speaker 1: That's really cool. I like that. I guess it kind of developed and matured over time. And now you're building something people want. So that's always a good place to be in. Yes. Do you dogfood EarMark internally at all? Like, does your own team use it day to day?

[00:03:01] Speaker 2: Yeah, we use it every day. So one concept would be, I mean, like, for Sandon and I and Dylan, our go-to-market leader, what we'll do is we'll have just unstructured conversations, you know, that could be, like, brainstorming, right? An idea would be, you know, could you, like, you know, so we'll talk through sales status, right? Things about customer sentiment, things about the product. And then we'll actually, like, go through ideation in real time. The thing for EarMark that's really powerful is it'll take these unstructured conversations and turn them into structured artifacts, first and foremost, right? So, you know, requirements or maybe support documentation for our customers, as an example, or maybe go-to-market messaging in terms of, you know, type of language you're using, maybe based off of customer conversations we've had, right? So that's been really useful. But something that's actually more useful in our most recent iteration of the product is the ability to essentially push to Cursor, you know, push to v0, push to Codex, and actually have conversations lead to prototypical flows. And so for us, the concept and the true problem we're trying to solve is this idea of can we essentially author, you know, the deliverables that you normally would create maybe a day after, several days after, in that 30-minute increment or hour increment of the meeting that you're actually in, right? And the real unlock there is, you know, if we're talking through concepts, right, and then actually looking at prototypical flows, that's a great way to kind of close that cycle time up. You know, because R&D teams are just overtaxed, you know, Mart, I'm sure you understand, like, how hard just the business of R&D is, you know, and for us, the biggest thing is, you know, can we just, you know, help cycle times for everybody? 
And then also, you know, for product managers and a lot of folks that aren't in engineering, you know, like a lot of like R&D is moving so much faster with AI. Can we essentially enable, you know, product folks to keep up and maybe even stay ahead of their engineers that are running five times, 10 times faster?

[00:05:07] Speaker 3: Yeah, I think my, you know, my favorite use case is, you know, like imagine you're, you're in a meeting, you know, as Mark was mentioning, you're, you're brainstorming on features and you're 30 minutes in, you're 40 minutes in, and then suddenly you share your screen and then you'd be like, is this what you were talking about? And like, just having that there, like in that context is so cool because like everyone's minds is like already thinking about that. And then you can continue to riff and it's just like, like the quality is so much better and just reducing like that cycle time and in not having to do those follow ups is kind of such a huge benefit.

[00:05:43] Speaker 1: I honestly, the thought of working by just talking is a dream come true for me. And it has been for me personally, now that I use Super Whisper on Mac, like most of the things I write now are hardly with the keyboard, it's all with the voice and having kind of like an interactive platform to do that on like meetings, obviously, but having artifacts coming out of those conversations, that's like a game changer. Yes. So you guys use Assembly under the hood. Could you tell me more about why you chose Assembly? What was that decision looking like? Did you do an eval? Did you do a vibe eval instead? Did you have another solution prior to Assembly? Why did you change? Maybe kind of walk me through the story of how Assembly got into the picture.

[00:06:28] Speaker 3: Yeah, there were two big reasons for why we chose Assembly. So we were using another provider for transcription before, but we were running into two big issues. I would say the first issue was kind of like the plumbing. It's almost like we had to build a ton of abstractions just to get it to work. Like you can think of things like microphone management, the WebSocket lifecycle, a lot of reconnection logic, and just tons of work to make it reliable. But I think the biggest piece was that we were kind of getting slammed on the concurrency limits. And it was really hard for us to predict scale or even like a launch when we did like a product launch or like, hey, like, we're not sure, like we seem to be like around the edge. But if we want to get over this, we have to sign like this really expensive like enterprise contract. So it was just kind of like a lot of unknowns. And actually, just kind of like right before launch, we kind of discovered AssemblyAI, we did a quick, you know, some quick tests with it, we discovered, hey, like this transcription, it's not only really fast, but it's also really accurate, and more so than what we were using. And we actually swapped it out in four days, right before launch. And we launched with that. And it's been it's been great ever since.
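
The "reconnection logic" mentioned above is often built around capped exponential backoff. The following is a minimal Python sketch of that general pattern, not EarMark's or AssemblyAI's actual code: `backoff_delays`, `connect_with_retry`, and every parameter name are hypothetical, and `connect` stands in for whatever function opens the streaming WebSocket.

```python
import random
from typing import Callable, Iterator


def backoff_delays(base: float = 0.5, cap: float = 30.0,
                   attempts: int = 6) -> Iterator[float]:
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def connect_with_retry(connect: Callable[[], object],
                       sleep: Callable[[float], None],
                       jitter: Callable[[float], float] = lambda d: random.uniform(0, d),
                       **backoff_kwargs) -> object:
    """Call connect() until it succeeds, sleeping after each failed attempt.

    `connect` is a hypothetical stand-in for opening the transcription
    stream; it is assumed to raise ConnectionError when the socket drops.
    """
    last_error = None
    for delay in backoff_delays(**backoff_kwargs):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            # Randomized ("jittered") waits keep many clients from
            # reconnecting at the same instant after an outage.
            sleep(jitter(delay))
    raise ConnectionError("gave up after repeated failures") from last_error
```

In a real client `sleep` would be `time.sleep` (or an async equivalent); passing it in as a parameter keeps the retry policy easy to test without waiting.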

[00:07:45] Speaker 1: That must have been just a mad rush, like four days before launch, you're trying to get like all the plumbing moved over from one provider to another. Yeah. I want to like, double tap on that concurrency limit conversation. Do you open one session for everyone on the call? Or does like each person get one stream?

[00:08:04] Speaker 3: Every person will get one stream. So if there's one person on the call, who's actually using EarMark, they'll get one stream, but if four people are using it simultaneously, that will be that will be four streams. So if you think about that, like even one meeting, like one workplace could be using four streams at the same time, and then obviously multiply that out within that one workspace. And that's a lot. And then of course, over different companies and whatnot. So that's like, you know, a ton of concurrent streams. And what's really cool about Assembly is having this unlimited concurrency stream where there's almost kind of like a back off policy, like we'll get you to a certain threshold. And then it just continues to add on based off of that. So that's, that's been that's been fantastic for for us scaling. And yeah, I'm not sure how we would have survived without that.
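
Because each active user holds one stream, the number a concurrency limit has to cover is the peak count of overlapping sessions. The sketch below shows one common way to compute that from session start and end times. It is illustrative only: `peak_concurrency` and the sample numbers are hypothetical, not EarMark data.

```python
def peak_concurrency(sessions: list[tuple[float, float]]) -> int:
    """Return the maximum number of sessions open at the same time.

    Each session is a (start, end) pair in any consistent time unit,
    and represents one user's transcription stream.
    """
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a stream opens
        events.append((end, -1))    # a stream closes
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a stream that closes at t frees its slot before one opens at t.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Four users in one 30-minute meeting, plus two users in other meetings.
streams = [(0, 30), (0, 30), (0, 30), (0, 30), (10, 40), (30, 60)]
print(peak_concurrency(streams))  # 5
```

Run over historical session logs, a count like this gives a rough sense of how close a workspace sits to any fixed concurrency ceiling.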

[00:08:56] Speaker 1: Nice. And what are you guys focused on building now? Like, where's the product headed?

[00:09:03] Speaker 2: So, so Mark, the one thing that we're trying to focus on this idea of a true, you know, chief of staff for product teams that is predictive in terms of what individuals need, what teams need the ability to not only task or delegate work to your chief of staff, but then also have proactive tasking, you know, being aware of, maybe if you think about sort of a multiplayer setting. If, let's say, there's a delivery team, you know, that's like offshore, right, that had a blocker last Tuesday, you know, could you be proactively notified, right, of what those blockers are, you know, sort of day in, day out, like, what should I truly pay attention to, you know, as, as a product or engineering leader. So I think for us, you know, just the idea of like, sort of this, this co-presence that is really helping, you know, like every aspect of your work. We know that, you know, product leaders and engineering leaders are so overtaxed, right, in terms of not only meeting schedules, but then the types of deliverables that result from them. The one thing that, that we've learned is, you know, every 30 minutes, you know, is different, you know, in terms of the audiences you're speaking to, every audience requires a different type of artifact or deliverable, right. And oftentimes they require different levels of fidelity of those artifacts based on seniority or whatever the immediate need is, right. And that's just a huge, you know, sort of contextual lift, you know, for folks that are working 60 hours a week, right. Like, so I think a big part of that is, you know, can we help those folks like in their roles to create capacity to be more strategic, right. Like, you know, for a lot of customers we speak to, oftentimes they haven't talked to customers in months because they're just sort of beholden to the needs of internal teams and deliverables around those teams and keeping their teams mobilized and fed. 
So the idea for us is, you know, this chief of staff that basically creates capacity for folks to basically, you know, do the things that got them into product in the first place or engineering in the first place, right. So that's the concept of the tool. The other piece is this idea of like, you know, a second brain, you know, which is kind of similar to this idea of a chief of staff, but could we have a second brain for product teams, you know, which is a queryable, you know, pool of context, right, organized by project. And can you essentially, you know, have essentially a system of action, right, relative, you know, relative to maybe a system of record in more of a traditional form.

[00:11:33] Speaker 1: Honestly, it kind of sounds like a, sorry about that. Um, honestly, it kind of sounds like a game changer for founders where like they're coming in and out of these meetings with different people, with customers, with like employees of the company. Um, like not every meeting looks the same and not every meeting needs the same artifact to come out of it. So like they could really benefit from EarMark by like being productive all day when they're on those calls, talking to people and like getting things done, even though like their calendar is full.

[00:12:05] Speaker 2: Yeah, the unlock for our customers, which has been, which has been a really cool thing to see is this idea of unlimited task agents, you know, that are running in real time, like in the background, you know, like as conversations and the workday progresses. Um, for a lot of folks that we speak to, it's like they can't imagine not having unlimited task agents operating in the background as conversations progress, you know, so that's a, that's a cool like sort of unlock for a lot of people.

[00:12:32] Speaker 1: It kind of reminds me of my coding workflow where like nowadays I have like four different tabs open on my terminal just with open code chipping away at different projects. But since EarMark is like a real-time product, those terminals are constantly being opened and we can have like way more than just four. Like it could in theory be unlimited.

[00:12:52] Speaker 2: Correct. Yeah. Yeah. We haven't actually, uh, uh, tested the unlimited part, but, uh, but we're, you know, it, it, it, it's, it's, you know, we're, we're fairly certain that, uh, that you can put quite a few task agents towards, towards work.

[00:13:05] Speaker 1: Really nice. Maybe you guys could show me around the product itself and we can give like the audience an idea of what it looks like and how, how you can use it.

[00:13:14] Speaker 4: Yeah, let me pull that up for a quick second here.

[00:13:18] Speaker 3: So welcome to, welcome to EarMark. Um, so I'm kind of in my main page. Uh, I have a meeting that I've already prerecorded here. Um, kind of one of our, our, our retros here. Um, but basically what we wanted to, to do is kind of like design almost like a utility type tool that was just super straightforward to use, just really quick to get in, um, where one, a user can kind of start capturing their meeting. Um, and during a meeting or when they're done with their meeting, um, they can create essentially any artifacts that they want, which is kind of completed outputs. Um, so kind of like what we were talking a little bit earlier, one of my favorite use cases, um, is to come up with, um, engineering specs. So essentially based off the meeting, based on the transcript, um, it's essentially pulling in kind of what it thinks are, um, actual work items that engineers can work on. Um, so for example, in this meeting, um, that I have here, uh, users were talking about a missing 404 page. And what's really neat about this is I have just kind of like quick actions to essentially kind of like build in Cursor, um, which would open it up right into the external app. And I could literally just kind of start going, um, and get that running, um, get that running from here or likewise. Okay. Maybe if I didn't want to jump into Cursor, um, real quick, maybe I need to save this for later. Why not just add this right into Linear? Cool. This looks good. I'll create the issue. And now that's on my, that's on my tracker. Um, so those are kind of examples of just kind of like getting into action really quickly. Um, for more communication examples, we have kind of a bunch of different templates such as, hey, if I want to follow up with my team on Slack, um, I have this nice template that has kind of emojis. It's like really short and condensed, um, has all of the action items in here. 
Or maybe I, you know, need a more kind of like traditional PRD that I just want to get started and, uh, get onto a first, uh, draft here. Um, all right. Um, so yeah, here's a traditional PRD that, uh, someone might see. And, uh, on the left, um, I think, I think most people might be used to essentially kind of like chatting, uh, to change. We're going to be introducing this topic called, uh, uh, vibe, uh, vibe docking, um, which is kind of on the left is essentially the format of which you see on the right. And what's really cool about this is, uh, I can just kind of, kind of go in here and maybe be like, you know, add, um, add emojis kind of like under this executive summary section. And what that's going to do is it's just like, it's going to like regenerate, um, live for me, just kind of like based off of what I might tweak in here. Or like, maybe I like, I add another section, um, like, um, I don't know, customer quotes, um, if there were any quotes, um, you know, in, in this particular meeting and, uh, let's see what it comes up with. Okay. Cool. So yeah. Indirect feedback. Quotes. Um, so it's, it's neat because you can really just kind of like adjust and fine tune on the spot, um, and then send this out, um, to your platform of choice when you are done.

[00:16:32] Speaker 1: But, uh, yeah, that's, that's EarMark. That's real neat. Thanks so much for, for sharing this with us. Um, I want to kind of take the conversation in another direction and maybe start to think about how like the future way that we interface with computers and with AI is with voice. Um, you mentioned sort of like this chief of staff, uh, that lives on your computer. Maybe you guys can share a bit more of like a vision of how you think we would use this tech five to 10 years from now when, you know, ASR becomes like really accurate, really perfect, um, super fast. And of course we, we started to build up a lot of this context, um, from work that we've done and artifacts that we've already generated.

[00:17:14] Speaker 3: Yeah. One, one of our, one of our goals is how to turn, essentially, knowledge work from being really, uh, reactive to proactive. So, you know, imagine kind of like a true chief of staff. Like as you come into work, um, like in the morning, as you're walking down the halls, you know, imagine someone is being like, Hey, like overnight, this vendor might've renegotiated the deal. Or maybe there was an engineering team that's in a different time zone as you, and they ran into a blocker and like, here's that blocker and just kind of surfacing those things for you. Um, so you don't have to find out kind of later in the day or maybe worry about, Hey, like, you know, what's the status on this? Those things just come to you. So you can then decide, okay, like, where do I want to take action today? And it's almost like coming into work like every day and knowing what are like the top three important things to work on. And those are all in real time. And the extra kicker on that is since we know the context now, imagine if EarMark could actually take action on that. So imagine like, okay, now I actually want to delegate some of this work off. Or maybe I want to take on some of this work that, uh, you know, that, uh, um. But, uh, that's, that's kind of our, our five-year, it's like a grand vision framework.

[00:18:34] Speaker 1: Yeah.

[00:18:34] Speaker 2: And the other piece too, Mart is, um, you know, we talk a lot about the, the relevance, uh, or maybe slightly, you know, irrelevance of systems of record, you know, today where a lot of work, you know, for knowledge workers is, um, essentially entry, you know, to make sure that, um, you know, records are kept. Right. That you can have a credible report to your key stakeholders, you know, because everybody is, you know, uh, entering, you know, whatever is required, like within their system of record. Um, we think that the future of work is going to evolve to where systems of record might be a little less important, maybe in a traditional state, because if you capture all conversational context, right. You, there's nobody that has to play scribe, um, to enter these things, you know, to have visibility in terms of what's happening in R&D as an example. Um, so that's a really powerful unlock. Um, you know, this, this sort of evolution to systems of action, right. Is something that we're really, um, uh, optimistic about as well, where, you know, it's, it's not entry, you know, sort of passive, right. You still have to go and execute whatever the work is within the system of record. You know, the, the, the idea that, um, things will basically self seed themselves in terms of tasking is a really powerful concept. Um, and something that, uh, that we really look forward to.

[00:20:01] Speaker 1: Yeah.

[00:20:01] Speaker 3: That's really nice. Sorry, if I could just add one more thing. Um, I think one of the reasons why we're really bullish on voice is because that's where like 90% of like the conversations happen at work. And so like, if you imagine like you have all of your documents, you have all of your slacks, like that's great. That's a lot of data, but there's still so much that happens in conversations and meetings or side conversations. Um, and just imagine that also captured in kind of with your second brain and what you could do with it.

[00:20:33] Speaker 1: That's really nice. I like the idea of like a daily brief coming into work and like just being caught up on everything and conversations that happened while I was asleep. And leveraging that to be more productive in my day. That's like, that's a really nice vision for the future. Um, so do you guys have any like advice for founders who are building in general, but also with voice AI? Like, um, I know, I guess your product has matured and it's grown. Um, I guess maybe you can share some insights from, from there. Yeah.

[00:21:05] Speaker 3: Two big pieces of advice. I think one is like privacy by design. Uh, voice data is very sensitive by default. Um, so you need to be like really intentional about what you store, how long you store it. Do you encrypt it? Um, do you avoid storing it? Um, and actually for us, for EarMark, um, you know, because we, we view privacy so, so strongly. Um, we do have an option on all of our plans, um, which is called temporary mode where we actually don't store, um, the transcripts or any of your data at, at all. Like there's no, there's no retention plan. It literally just bypasses our database completely. Um, so really designing around that and thinking about that is really important. Um, but, uh, one of the lessons that we learned early on, too, for voice AI products is actually making the UX really forgiving. So when a user is using a voice AI product, they're actually taking action in something else. Like they could be in a meeting, they could be on a phone call or they could be in a conversation with something. The product that you're using is almost secondary to them. So if you are trying to capture a conversation and in order to start that capture, it's like four button clicks or different configurations, the user is just not going to use it. So it needs to be dead obvious. Like, can it be one click or could it do it for you? And if there's a blip that happens, can it resolve itself or can it figure things out itself? Um, just removing that kind of like that, uh, those decisions from people while they're using something else, I think is a, is a, is a huge thing that, uh, that it's easy. It's easy to overlook when, when building with voice AI products.
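
The "temporary mode" idea, where transcript text never reaches the database, can be illustrated with a small Python sketch. This is an assumption about the general shape of such a feature, not EarMark's implementation: `CaptureSession`, its fields, and the plain list standing in for a database are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CaptureSession:
    """Hypothetical meeting-capture session with an opt-in zero-retention mode."""
    temporary: bool
    live_buffer: list[str] = field(default_factory=list, init=False)

    def on_transcript(self, text: str, database: list[str]) -> None:
        # Keep text in memory so artifacts can be generated during the meeting.
        self.live_buffer.append(text)
        # In temporary mode the write is skipped entirely, so nothing
        # about this session is ever persisted.
        if not self.temporary:
            database.append(text)

    def end(self) -> None:
        # Drop the in-memory copy as soon as the session closes.
        self.live_buffer.clear()


database: list[str] = []          # stand-in for real persistent storage
session = CaptureSession(temporary=True)
session.on_transcript("users reported a missing 404 page", database)
session.end()
print(database)  # []
```

Making the retention decision at the single write point, rather than deleting afterwards, is what lets a mode like this claim that data bypasses storage completely.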

[00:22:49] Speaker 1: Yeah.

[00:22:50] Speaker 2: And Mark, just general founder experience. Um, one, one thing that, uh, Sandon and I, uh, sort of, you know, sort of evolved to is, is, um, I don't know, like there's so much, uh, out there in the form of best practices and like founder content. Um, and there's so much, uh, um, uh, that's what I'm looking for. There's, there's, there's so much dogma, right. Associated with just being a founder that, um, I think, I think that the pro tip is, is like perspectives are nice. Um, and there are some frameworks that are helpful, but there's nothing more, uh, helpful and impactful than just lived experience and just charging through and just being a founder and not, um, uh, thinking the dogma, you know, is, is what it really is. You, you kind of learn that like everybody has an opinion, right. But you have to sort of figure out what works best for you, uh, in your organization in terms of driving it to success. Um, and I think for a lot of folks that are sort of like on the precipice of like becoming founders or when they're in it really early, um, they, they kind of get caught up in that group think, you know, in a way that I think is less than productive. Um, so, uh, yeah. So the pro tip there is like advice is great. Um, but you know, chart your own path.

[00:24:04] Speaker 1: That's, it's really funny that you mentioned that because one of our like unofficial company values is beware the dogma. Like, um, we were going through a phase where we like really grew as a company and we immediately reached for like that enterprise software growth playbook that's common. And part of the reason why you guys have, um, stuff like unlimited currency with, uh, sorry, unlimited concurrency with no enterprise agreement is because we realized that's not how our customers pay for and buy AI products. They want flexibility. They want to know that this is a company that they can scale and grow with and kind of like letting go of that traditional playbook has allowed us to serve our customers and keep them really happy. So I think that's really great advice. Thank you so much for sharing that with us, Mark.

[00:24:51] Speaker 4: Yeah, absolutely.

[00:24:53] Speaker 1: Well, that's all I really had. Um, thanks for your time. And, uh, I guess I will leave like your contact information down below. If, if you want to try EarMark yourself or get in touch with Mark and Sandon, um, everything will be in the description. Uh, thanks guys for your time. This was really such a great conversation. Yeah.

[00:25:10] Speaker 2: Thank you, Mark. Really appreciate the opportunity. Thank you, Martin.

Summary

EarMark is a voice-first productivity suite that listens to meetings in real time and turns conversation into finished work artifacts—docs, tickets, updates, next steps, and even prototypes—reducing follow-up and cycle time for product and engineering teams. The company began as a Vision Pro AR/VR rehearsal and real-time speech coaching tool, but pivoted after user research showed people rarely rehearse; they moved the real-time feedback concept to the web and iterated into automated artifact generation and “chief of staff/second brain” workflows. EarMark dogfoods daily, using unstructured internal discussions to generate structured requirements, support docs, GTM messaging, and to push outputs into tools like Cursor, Linear, Slack, and PRDs via templated “vibe docking” prompts that regenerate artifacts live. They chose AssemblyAI after a prior transcription provider required heavy engineering “plumbing” (mic management, WebSocket lifecycle, reconnection logic) and imposed restrictive/expensive concurrency limits; AssemblyAI provided fast, accurate transcription and effectively unlimited scalable concurrency, enabling a last-minute provider swap four days before launch. Looking ahead, they envision proactive, voice-driven systems of action: a predictive chief of staff that surfaces blockers, status, and priorities across time zones, and reduces reliance on manual systems of record by capturing conversational context and auto-seeding tasks. Advice for builders: treat privacy as a core design constraint (minimize retention; offer modes that store nothing) and make voice UX extremely forgiving and low-friction because users are multitasking; for founders generally, avoid rigid “best practice” dogma and learn through execution.

Title

EarMark: Turning Meetings Into Real Work With Voice AI

Keywords

EarMark
voice AI
meeting transcription
real-time artifacts
productivity suite
product management
engineering workflows
chief of staff AI
second brain
systems of action
AssemblyAI
unlimited concurrency
WebSockets
mic management
privacy by design
data retention
temporary mode
UX for voice
Cursor
Linear
Slack
PRD generation
prototyping

Key Takeaways

EarMark converts real-time meeting audio into actionable deliverables (tickets, PRDs, specs, updates) rather than generic summaries.
The product pivoted from a Vision Pro rehearsal/speech coaching concept to web-based real-time meeting-to-work automation after learning users don’t rehearse.
Dogfooding shows value: unstructured conversations become structured artifacts and can be pushed directly into tools like Cursor and Linear.
Transcription infrastructure matters: prior provider required extensive reliability ‘plumbing’ and had restrictive concurrency limits.
AssemblyAI was adopted days before launch due to higher accuracy, speed, and scalable (effectively unlimited) concurrency.
Future vision: proactive ‘chief of staff’ experiences that surface priorities/blockers and enable delegation, shifting from systems of record to systems of action via captured conversational context.
Voice is a key data source because most work coordination happens in conversations not captured in documents or chat.
Build voice AI with privacy as a first-class design constraint; consider zero-retention/temporary modes.
Make voice UX extremely low-friction and self-healing because users are multitasking in meetings.
Founder lesson: use frameworks lightly; avoid dogma and learn by executing what fits your context.

Sentiments

Positive: The conversation is optimistic and forward-looking, highlighting successful pivots, strong product-market pull, excitement about faster workflows, and satisfaction with AssemblyAI’s speed, accuracy, and scalability. Cautionary notes appear around privacy and founder dogma, but overall tone remains constructive and enthusiastic.
