Live Kaiwa

Everyone is laughing. The guy across from me slaps the table and says something that makes the whole room erupt. I smile and nod. I have no idea what he said.

It's a Sunday afternoon in our neighborhood meeting hall. About ten of us are sitting cross-legged around folding tables, drinking beer after a long morning of clearing waterways. My lower back aches from shoveling mud out of the irrigation channels that feed the rice paddies. I feel good about the work. But the meeting that follows, where the neighborhood leaders discuss upcoming plans, budgets, and schedules, goes completely over my head.

I live in a rice-farming community in rural Japan. Maybe 40 homes. Once or twice a month, we all get together for some kind of work. Weed-whacking the paths between paddies. Cleaning the waterways. Fixing roads. Picking up garbage along the river. I've had a great relationship with my neighbors since we moved here, which has been both a blessing and a slow trap. The better the relationship gets, the more they involve me. More responsibilities, more committees, more meetings. I'm grateful for the trust. But the meetings are where it all falls apart.

In most of my daily life, I understand 80 or 90 percent of what's happening around me in Japanese. At the grocery store, at my kids' school events, chatting with the neighbor who drops off vegetables from her garden. I can hold my own. But these neighborhood meetings hit a wall I can't get over. The local dialect is thick. The vocabulary is so specific to farming, municipal budgets, and community traditions that I've never encountered any of it elsewhere. People riff on inside jokes that go back decades. They reference events from before I was born. I'd walk out of a two-hour meeting having understood maybe 5 percent of what was discussed.

Among foreigners living in Japan, there's a certain pride about language ability. Many people I know have spent years grinding through textbooks and conversation practice, and they've earned real fluency. For them, reaching for any kind of tool or crutch feels like an admission that the work wasn't enough. You should just study harder. And I get that. I respect the dedication.

But I'm pragmatic enough to know where I stand. I have three kids. My job is conducted entirely in English. My wife wants the kids speaking English at home, so that's what we default to in the house. I have a dozen hobbies pulling at my free time. The math doesn't work. I would need one to two hours of focused study every day, for years, to close the gap between "chatting with neighbors" and "following a rapid-fire discussion about drainage fees in Tochigi dialect." I'm also just not a natural language learner. Some people absorb languages like sponges. I am not one of those people.

So I built something.

Live Kaiwa grew out of frustration. I wanted to walk out of a neighborhood meeting understanding what actually happened, not the 5 percent version I'd been piecing together from context clues. The idea was simple: capture the audio in real time, transcribe the Japanese, translate it to English, and display it all on my phone so I could follow along.

I'd wanted something like this for years, but the speech-to-text and translation quality just wasn't there. In the past few months, both the accuracy and the speed finally caught up. You can get a reliable transcription and translation back fast enough to keep pace with a live conversation.

The project turned out to be a perfect collision of things I already knew and things I wanted to learn. Speech-to-text, translation, text-to-speech, API integrations: I'd worked with all of these in previous jobs and side projects. The plumbing was familiar. But I'd never built anything with Next.js, and this gave me an excuse to pick it up.

It works by streaming audio from the browser microphone to a transcription and translation API, displaying everything in real time. A running summary in English keeps track of what's been discussed and flags any action items. It also generates suggested responses, so I can follow what's being said and have something ready if I need to speak up. Everything saves to local storage and is exportable, because I don't need to be keeping people's potentially private conversations in a database.
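For the curious, the local-storage and export part can be sketched in a few lines of TypeScript. This is an illustration, not the actual Live Kaiwa code: the `Segment` shape, the `SESSION_KEY` value, and the function names are all assumptions. The store is passed in as a parameter so the same code can run against the browser's `localStorage` or against a simple in-memory stand-in.

```typescript
// Hypothetical shape of one transcribed utterance; field names are assumptions.
interface Segment {
  timestamp: number; // milliseconds since the session started
  japanese: string;  // transcribed source text
  english: string;   // translated text
}

// The subset of the Web Storage API this sketch relies on.
// The browser's localStorage satisfies this interface structurally.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key)
      ? this.data[key]
      : null;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

const SESSION_KEY = "live-kaiwa-session"; // hypothetical storage key

// Read every saved segment, or an empty list if nothing is stored yet.
function loadSegments(store: KeyValueStore): Segment[] {
  const raw = store.getItem(SESSION_KEY);
  return raw ? (JSON.parse(raw) as Segment[]) : [];
}

// Append one segment and write the whole list back as JSON.
function appendSegment(store: KeyValueStore, segment: Segment): void {
  const segments = loadSegments(store);
  segments.push(segment);
  store.setItem(SESSION_KEY, JSON.stringify(segments));
}

// Format milliseconds as mm:ss for the exported transcript.
function formatTime(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return pad(Math.floor(totalSeconds / 60)) + ":" + pad(totalSeconds % 60);
}

// Export the session as plain text: Japanese line, then indented English.
function exportTranscript(store: KeyValueStore): string {
  return loadSegments(store)
    .map((s) => `[${formatTime(s.timestamp)}] ${s.japanese}\n    ${s.english}`)
    .join("\n");
}
```

In a browser, something like `appendSegment(window.localStorage, segment)` would keep the data on the device, which is the point: nothing has to leave the phone unless you export it yourself.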

[Image: The transcript view during a neighborhood meeting, showing real-time Japanese transcription with English translation and a running summary on the right]

At the end, it generates a visualization recap of the conversation that you can play back and export. I use it to catch things I missed in real time and to review the full conversation before the next meeting.

[Image: The visualizer places speakers in a conversation space and maps their interactions in real time]

I've used it at a few meetings now. The difference is hard to overstate. I went from nodding along politely to actually knowing what was decided, who volunteered for what, and when the next workday would be.

The best side projects are the ones where you can't tell if you're working or playing. I get to tinker with APIs and streaming audio on a Saturday afternoon, and then the next day or the following weekend I get to test what I built. The feedback loop is short and often involves edamame and beer.