How to Build a Privacy-First AI Medical Scribe (Full Transcript)

A step-by-step walkthrough of transcription, role-based speakers, SOAP notes, PII redaction, and automatic deletion using AssemblyAI APIs.
Download Transcript (DOCX)
Speakers
add Add new speaker

[00:00:02] Speaker 1: If you're building an AI-ambient medical scribe, you've probably realized that your transcription layer can make or break your app. You need accuracy on industry terms, privacy that you can guarantee your customers, and you need your app to actually fit the real-world scenarios and situations that your users are going to be in. That's why today I'll be building a very simple AI medical scribe, and along the way I'll share some tips and tricks so that you can see how Assembly might be able to help you build the best AI medical scribe. So let's dig right in. What we have here is a very simple transcription request in less than 30 lines of Python, but I'm going to add more features step by step, and eventually this will be a more fleshed out AI medical scribe. This transcription request works as-is, so why don't we just give it a quick try. So the transcription looks good, but now we want to add speaker labels so that we can differentiate between the doctor and the patient in the discussion. Alright so now we see the transcripts separated by A and B, but if we use speech understanding we can do speaker identification, since we know that the roles in the conversation is doctor and patient, we can tell the API to label the transcript with their roles instead of just generic A and B. So here we specify speaker identification, and we say speaker type role, doctor and patient. Let's give it a try. Here we go, so now we see the doctor and the patient, and we can differentiate between their role in the transcript itself. If I'm not wrong, the doctor prescribes Tremadol, but it is written in a way that isn't capitalized. So to fix that, I can add Tremadol to the key terms prompt, which basically biases the model to help transcribe it in the way that we specify. And now we see that the T in Tremadol is capitalized. Our transcription looks good, so why don't we generate some SOAP notes based off of this consultation as a summary for the doctor. To do this, I'll use LLM gateway, which is an interface for various large language models, which you can call with your Assembly AI API key. So we'll add our prompt, and then we'll make our request to LLM gateway. All we do here is pass the prompt in as a user message, along with the transcript, and then we call the request in a POST request, and we can print out the response. Let's try. All right, so here we can see the SOAP notes being generated. All of that looks good, but in a medical context, it's important that we delete the data after we make a request, so as to preserve the patient's privacy. To redact this personal information in the transcript, we can add PII redaction, which basically ensures that any of that personal information doesn't appear in the transcript, and therefore in the SOAP notes that we generate. So here we can see that the patient's and the doctor's names are redacted according to our settings in PII. All right, so we have an accurate transcription with PII redacted, and we're also generating SOAP notes. All of this looks great, but I think the next step is to delete the data from AssemblyAI's servers after the request is completed. In a production environment, you'd save the results on your database instead of having it on ours, and that way you can be certain that we don't store any of your data on our end. To do this, we just have to add a DELETE request at the end of our script. So each of these lines put in a DELETE request for the transcript, and then for the LLM gateway response. And just to prove that we really do delete the data off of our servers, we'll make one final GET request and print the result so that we can verify that it's deleted. All right, here you can see that the audio URL has been deleted by user. The text is empty and just states deleted by user, and so we can really tell that this transcript request was deleted from our servers. In case you were curious, because we used an audio URL from our own S3 bucket, that audio file isn't stored on our servers and is immediately deleted upon a DELETE request as well. We also offer a time to live on these requests such that they are automatically deleted after a set period of time. If this is something that you're interested in, please get in touch with our team. It looks like our AI ambient medical scribe is complete. I hope you enjoyed this video, and if you have any questions, please feel free to reach out to our team. Bye.

ai AI Insights
Arow Summary
The speaker demonstrates building a simple AI ambient medical scribe in Python using AssemblyAI. They start with basic transcription, then add speaker diarization and role-based speaker identification (doctor/patient). They improve domain accuracy by biasing key terms (e.g., proper capitalization of a medication name). Next, they generate SOAP notes from the transcript via an LLM gateway using the same API key. To protect privacy, they enable PII redaction so names and other personal data are removed before summarization. Finally, they show how to delete transcript and LLM outputs from AssemblyAI servers via DELETE requests, verify deletion with a GET request, and mention options like automatic time-to-live deletion and the handling of audio URLs stored externally (e.g., S3).
Arow Title
Building an AI Medical Scribe with AssemblyAI: Transcription, SOAP Notes, and Privacy
Arow Keywords
AI ambient medical scribe Remove
AssemblyAI Remove
medical transcription Remove
speaker diarization Remove
speaker identification Remove
role labeling Remove
key term biasing Remove
domain terminology Remove
SOAP notes Remove
LLM gateway Remove
PII redaction Remove
HIPAA privacy Remove
data deletion Remove
DELETE request Remove
time to live (TTL) Remove
Arow Key Takeaways
  • Transcription quality is critical for ambient medical scribe apps, especially for medical terminology accuracy and privacy guarantees.
  • Add speaker labels, then upgrade to role-based speaker identification to distinguish doctor vs. patient in the transcript.
  • Use key-term prompting/biasing to improve recognition and formatting of industry-specific terms (e.g., medication names).
  • Generate clinician-friendly summaries like SOAP notes by sending transcripts to an LLM via a gateway interface.
  • Enable PII redaction so personal identifiers are removed from transcripts and downstream summaries.
  • Implement explicit deletion of transcripts and LLM outputs after processing, and verify deletion via a follow-up GET request.
  • Consider automatic retention controls such as TTL-based deletion; external audio URLs (e.g., S3) can be deleted immediately as well.
Arow Sentiments
Positive: The tone is instructional and promotional, emphasizing how features like role-based speaker labels, key-term biasing, PII redaction, and data deletion improve accuracy and privacy, culminating in a successful build.
Arow Enter your query
{{ secondsToHumanTime(time) }}
Back
Forward
{{ Math.round(speed * 100) / 100 }}x
{{ secondsToHumanTime(duration) }}
close
New speaker
Add speaker
close
Edit speaker
Save changes
close
Share Transcript