Every conversation
becomes a protocol.

Live transcription with speaker recognition. GDPR-compliant, in 13 languages.

Start the recording, talk to colleagues, clients, patients or yourself – anymize transcribes live, detects speakers after the recording ends and files the finished protocol as a document in your account. From then on: reusable in chats, knowledge bases, projects. Without a single word going to a US service.

What you get

Three results.
One working pass.

01

Live text as you speak

The transcribed text appears right during the recording – you see what was understood and can correct or refine as soon as the meeting pauses. No waiting on a batch result.

02

Speaker recognition after the recording

The moment you stop the recording, speaker recognition runs across the full transcript. Every passage is assigned to the right speaker. No participant cap – two people, five, twelve or more, the model separates as many voices as are distinguishable.

03

An editable protocol as a document

The finished transcript lives as a document inside anymize – just like any uploaded PDF. From there it becomes a building block of your further workflow: dropped into chats, added to knowledge bases, linked to projects, summarized as an artifact.

How it works

Two ways
to start.

Path 01

Desktop microphone

The standard path for the office. Pick your microphone from the dropdown, click “Start recording” – transcription is live from that moment. Pause button for interruptions, stop button ends the session and kicks off speaker recognition.

Path 02

Smartphone via QR code

Perfect for meeting rooms, bedside rounds, interviews at the customer's site. anymize shows a QR code on your desktop – scan it with your phone and the same recording surface opens in the mobile browser. You record with the phone, the transcript appears in parallel on both devices. No app download, no account switch, no separate recording software.
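The handoff described above can be sketched in a few lines. This is a hypothetical illustration only, not anymize's actual API: the desktop mints a short-lived session ID, encodes its URL in a QR code, and any device that opens that URL joins the same recording session.

```python
# Hypothetical sketch of the QR handoff. The URL scheme and function
# names are illustrative assumptions, not anymize's real endpoints.
import secrets


def create_session(base_url: str) -> tuple[str, str]:
    """Create a recording session ID and the URL to encode as a QR code."""
    session_id = secrets.token_urlsafe(16)
    return session_id, f"{base_url}/record/{session_id}"


session_id, qr_url = create_session("https://example.invalid")
# The desktop renders qr_url as a QR code; the phone that scans it opens
# the same session in its browser, so transcript updates reach both clients.
```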

In practice

Meeting in the conference room, laptop out of reach or fan too loud. Phone out, scan the QR, recording runs. Back at your desk the finished transcript is ready in your anymize account.

The speech stack

Self-hosted.
In 13 languages.

Speech recognition runs on Voxtral Transcribe 2 – the open speech model from Mistral, a European provider. We operate it ourselves inside our infrastructure: no audio stream goes to US services like Otter.ai or Rev, no transcript lands with any third party.

Supported languages
13 languages
  • DE German
  • EN English
  • FR French
  • IT Italian
  • ES Spanish
  • PT Portuguese
  • NL Dutch
  • RU Russian
  • AR Arabic
  • HI Hindi
  • ZH Chinese
  • JA Japanese
  • KO Korean

Mixed-language meetings (e.g. DE + EN alternating) are recognized without manual switching.

Voxtral Transcribe 2

Open model from Mistral, operated by us.

European infrastructure

No routing to Otter.ai, Rev or any other US provider.

Multilingual

Language switches within the same meeting without manual toggles.

Why self-hosted

Voice recordings are among the most sensitive documents of any company: client meetings, patient rounds, investor conversations, internal strategy sessions. With most transcription services, the audio lands with the provider – often in the US, often with unclear privacy commitments. With anymize, processing happens exclusively in our European infrastructure (EU, hosted at Hetzner in Germany). No audio, no transcript, no metadata leaves to third parties.

Speaker recognition

Who said what.
Precisely traceable.

After the recording ends, speaker recognition runs once across the full transcript. The result: every passage carries a speaker label – “Speaker 1”, “Speaker 2” and so on.

No participant limit

Two people, five, twelve – the model separates as many speakers as it can distinguish vocally. Ideal for round-table meetings, board sessions, conferences with many participants.

Assign names afterwards

You rename the labels (“Speaker 1”, “Speaker 2”) once at the top of the transcript. The new name is automatically propagated throughout the entire document. “Speaker 1” becomes “Dr. Schmidt”, “Speaker 2” becomes “Client” or “Anna”. In one step, not passage by passage.

Before assignment
Speaker 1
Speaker 2
After one click
Dr. Schmidt
Client
Propagated across the whole document
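The one-step rename can be pictured as a simple operation over labeled transcript segments. A minimal sketch, assuming a segment structure of our own invention (anymize's internal data model is not published):

```python
# Hypothetical sketch: a transcript as a list of labeled segments.
# Renaming a label once propagates to every segment carrying it.
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str
    text: str


def rename_speaker(transcript: list[Segment], old: str, new: str) -> None:
    """Replace one speaker label across the whole transcript in place."""
    for seg in transcript:
        if seg.speaker == old:
            seg.speaker = new


transcript = [
    Segment("Speaker 1", "Let's review the findings."),
    Segment("Speaker 2", "I have a question about the timeline."),
    Segment("Speaker 1", "Go ahead."),
]
rename_speaker(transcript, "Speaker 1", "Dr. Schmidt")
# Every "Speaker 1" passage now reads "Dr. Schmidt"; other labels are untouched.
```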

Why only after the recording

Speaker separation becomes noticeably more accurate when the model can analyze the full conversation. Individual voices are learned better through many samples than through the first three sentences. That is why recognition runs once at the end – with clearly better results than any live variant could deliver.

Anonymization

Anonymization
on demand.

Transcripts typically contain a lot of personal data: names of participants, addresses, phone numbers, medical conditions, company references. Whether these appear in clear text or as placeholders in the document is your choice.

Anonymization enabled

Anonymized on demand

If you enable anonymization, the finished transcript is run once through the anymize anonymization pipeline after the recording ends: names, addresses, IBANs, case numbers and the remaining 40+ categories become placeholders. The anonymized transcript can then safely be handed to international frontier models (for summarization, analysis, translation) – without any participant being identifiable.

Originals stay reachable

Thanks to bidirectional anonymization, the original data remains accessible: in preview you see it normally. In chats based on the transcript, you get answers back with the real names. On export you decide per file whether originals or placeholders are written.
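The idea of bidirectional anonymization can be sketched as a pair of mappings: forward to placeholders, reverse to originals. The placeholder format and helper names below are illustrative assumptions, not anymize's implementation:

```python
# Hypothetical sketch of bidirectional anonymization: sensitive values are
# swapped for stable placeholders, and a private reverse mapping restores
# them on preview or export. Placeholder format is illustrative.
def anonymize(text: str, entities: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Replace each sensitive value with its placeholder; return the
    anonymized text plus the reverse mapping for later restoration."""
    reverse = {}
    for value, placeholder in entities.items():
        text = text.replace(value, placeholder)
        reverse[placeholder] = value
    return text, reverse


def deanonymize(text: str, reverse: dict[str, str]) -> str:
    """Restore original values from their placeholders."""
    for placeholder, value in reverse.items():
        text = text.replace(placeholder, value)
    return text


original = "Dr. Schmidt met Anna at Musterstrasse 12."
entities = {
    "Dr. Schmidt": "[PERSON_1]",
    "Anna": "[PERSON_2]",
    "Musterstrasse 12": "[ADDRESS_1]",
}
masked, reverse = anonymize(original, entities)
# masked: "[PERSON_1] met [PERSON_2] at [ADDRESS_1]."
restored = deanonymize(masked, reverse)
# restored == original
```

The masked text is what an external model would see; the reverse mapping never leaves the trusted side.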

Or don't anonymize

For internal meetings without sensitive content or for audio from public sources (lectures, podcasts) you skip anonymization – that saves processing steps, and the transcript stays in clear text.

Reuse

From transcript
to finished work.

A finished transcript is rarely the end product. You want a summary, a to-do list, an anonymization for sending, a structure for filing. In anymize this happens without media breaks:

01

Drop into a chat as context

Start a chat, point to the transcript document. The AI reads it in and you ask: “Summarize the core decisions in 5 points.” · “Extract all to-dos with owner.” · “What did Dr. Schmidt say about topic X?”

02

Move into a knowledge base

For recurring topics (weekly standups, client meetings, clinical rounds) set up a knowledge base and collect all relevant transcripts there. The AI pulls from them when you ask later – with citations.

Knowledge bases
03

Link with a project

Does the transcript belong to an ongoing mandate, advisory project or product launch? Link it with the matching project. All participants in the project now have the transcript as context – without you manually sharing it.

Projects
04

Turn into an artifact

Let the AI produce a finished artifact from the transcript: a formal meeting report, a decision protocol, a to-do list as a table, a client memo as a letter. Edit in the WYSIWYG editor, export as Word or PDF.

Artifacts

Use cases

When live transcription
makes the difference.

Six realistic deployments – drawn from the working realities of our customers:

Client meeting (lawyer)
What you get

Complete conversation protocol with speaker assignment client / firm

Specifics

Anonymized for peer review, original for the file

Patient rounds (clinic)
What you get

Transcript with speaker assignment physician / patient / care

Specifics

Directly used as a source for the discharge letter, § 203 StGB preserved

Team meeting (agency / consulting)
What you get

Protocol with all to-dos and decisions, labeled per participant

Specifics

Highlights in an artifact for colleagues who did not attend

Customer call (sales / CS)
What you get

Conversation transcript with reference to the CRM contact

Specifics

Extraction of objections, commitments, next steps

Expert interview (research)
What you get

Verbatim transcript, speakers clearly separated

Specifics

Reused as a citable source in research projects

Lecture / webinar
What you get

Full-text script for post-processing

Specifics

Generation of summaries, flashcards, glossaries

The pattern

Wherever post-processing is required – meeting report, discharge letter, case note, meeting summary – live transcription plus AI post-processing saves the main work: the manual typing and structuring.

What you should know about live transcription.

Frequently asked questions

Which languages are supported?

Thirteen languages: German, English, French, Italian, Spanish, Portuguese, Dutch, Russian, Arabic, Hindi, Chinese, Japanese, Korean. Mixed-language meetings (e.g. German/English alternating) are recognized without manual switching.

Start now.
14-day free trial.

All models. All features. No credit card.

We stand behind anymize. And we know – when an AI tool touches client, patient or employee data, a demo video isn't enough. That's why we give you 14 days of full access – all models, all features, no credit card. Enough time to be certain before you trust us.

Your AI workplace awaits.