Show HN: Voiceliner – Capture structured braindumps on the go
a9.io
Hey HN! A few months ago, all I wanted was a voice memos app where I could hold-to-record. Holding and releasing lets me “burst” quick chains of thoughts together. It’s especially useful when walking with friends, to capture stray references and ideas.
After testing the initial burst interaction, I realized I wanted to transcribe them, and relate notes together into a hierarchy. Other features came naturally, like geotagging each note and swiping during recording to change the “temperature” (importance) of a note.
The app is open source and written in Flutter.
I built a similar product for myself some time back - https://play.google.com/store/apps/details?id=net.ashishb.vo... (Android only)
This is brilliant - simple, novel, local, open source. Love it.
I had a dream at the beginning of the pandemic that people who spend all day in Zoom calls might be able to spend all day hiking as well. I tried quite hard to make it work a couple of times - big USB battery in the backpack - but LTE signal was never good enough up in the hills here.
You could integrate vosk for local on-device private transcription. https://github.com/alphacep/vosk-api
How have I not heard about this before now? I hope it's as good as it looks!
thanks, this is a great idea. Curious how it stacks up against other libraries.
i did a quick comparison with regards to word error rate... https://robmsmt.github.io/2021/09/04/benchmarking-asr-first-...
For my particular dataset it worked quite well. It was the best offline ASR that I tried.
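For anyone unfamiliar with the metric: word error rate is usually computed as the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. A minimal sketch, assuming simple lowercasing and whitespace tokenization (the function name is made up here, and the linked benchmark may normalize text differently):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word was dropped
                            curr[j - 1] + 1,      # extra word was inserted
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

One substituted word and one dropped word against a 7-word reference gives 2/7, roughly 29%.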
Thank you, I was just thinking today that I needed a tool exactly like this.
My go-to brain dumping tool is Simplenote, but there's too much separation between an ephemeral thought and the process of recording it.
Especially clever because I think all the tools to do this have existed for basically as long as Android has existed, but this is a very good application of those tools.
Okay, this is super cool. When I started taking walks I found I was thinking through a lot of things but didn't have a way to remember all the things. So I started using the voice transcription notepad (so nice that people these days are less hostile to someone walking down the street mumbling about things :-)). In my case I was stuck taking the document and then re-editing/moving it over to Evernote manually which I didn't always want to do. This looks like it will replace my workflow with a single tool. That is super awesome.
One feature request would be 'activate by airpod tap' so I wouldn't even have to hold my phone, just tap my airpods to make a note.
Did the transcription work well? I always found myself taking more time editing the text later than the time I saved with dictation.
Generally 90+% accurate on the transcriptions. I recognize however that speaking English with a mid-range male voice with a northern California accent is kind of like cheating.
Super clean setup on iOS! Can’t wait to try it out, the concept is lovely.
I use Notion a TON but it’s not great for the most immediate time sensitive notes.
Do you think you could integrate Google Calendar / reminders in some way? A lot of the kind of notes I would record on this kind of app have some form of either deadline, or are only relevant after a certain date or time. For example, “check in on this thread Monday morning”. I use Google Calendar reminders for this right now as they stack up but it’s not a great solution.
I'm less bullish on specific deadlines for these kinds of notes. I'm planning to do a weekly summary notification - "you made X notes last week, Y of them were high-priority" https://github.com/maxkrieger/voiceliner/issues/20
I’ll give you some examples of some legitimate notes/reminders I’d take:
- check this thing that opens tomorrow at 6pm
- talk to the lawyers about x before friday
- Fill in my PLF some time tomorrow so it’s ready for my trip
Etc.
Reminders would be a useful addition to this for me, too.
If this would support a bluetooth PTT microphone, it would be perfect. Then I could leave the phone in my pocket and have tangible, physical buttons to record with. I don't like messing with a touchscreen when out and about.
I wonder if this could integrate with Siri (or Assistant), e.g. 'Tell Voiceliner <your note>'. Be better than fiddling with your phone, at least.
Looks like there's an issue: https://github.com/maxkrieger/voiceliner/issues/35
FYI bluetooth microphones almost all degrade to some horrifically terrible codec that makes voice recognition much less reliable.
I don't understand why in 2021 Bluetooth degrades to "worse than a 1970's land line" in quality as soon as something tries to use the microphone.
I had high hopes for the MYLE, which I saw at CES 2016, but it never ended up getting built.
https://web.archive.org/web/20161231051940/http://getmyle.co...
What's your favorite PTT mic? I'm always on the lookout for great device recommendations!
I don't have one. But I think I would buy the Aina PTT Voice Responder if I bought one.
https://shop.ainaptt.com/ptt-devices/21-ptt-voice-responder....
I have used a few of these for testing at work (I'm a developer). They're pretty solid devices, well built, audio quality is _subjectively_ pretty good and they have headset ports.
Curious what API limitations are preventing you from doing the transcription device-local on Android?
This looks really cool. I've been looking for something like this for a while. Is there an APK download? I don't use the Google store and the GitHub page didn't seem to have an obvious link.
You can use Aurora store to download and install almost any play store app.
Yes but it seems a bit perverse to do this for an Open Source project.
Cool concept. Looking forward to trying it out.
I've thought about taking voice notes before, though I've imagined that as more of a private hands-free thing. I'm curious what your experience of using it in public or with others around (the walks with friends, but also family or colleagues?) is like?
(I sometimes go on a walk while I talk myself through a problem; I've noticed I almost always stop speaking while someone else is in earshot. I suspect I'd also be inclined to avoid taking voice notes with others around.)
It’s certainly awkward! I’m personally willing to stomach it because it’s kinda cool to “invent a new social primitive”, as dorky as it appears. There’s a very real tradeoff between mere awkwardness and fumbling with a text entry/forgetting the thing altogether.
I also recently rolled out a “create text note” escape hatch in the menu.
I think this is exactly what I've been looking for. I like going for walks and listening to audiobooks and podcasts, and I've been using Google's recorder as a way to "highlight" my findings, but it doesn't let me add to existing recordings so I can't collate sessions together.
I'll be trying this on as a replacement.
If this had automagical sorting / hierarchy of my recordings based on key words, or allowed me to "shuffle" my entire collection of recordings according to a few pre-set algorithms, this would be interesting.
Example: I spend 5 weeks recording 200 sound bites about real estate development in PR. I do no organization. I click a button in the app marked "Organize by opportunity". It sorts my recordings into 4 folders with 2-3 nested with titles like "The Tulum project" and "Evan's group".
I don't particularly need transcription because I don't want to do any of the work implementing the feature I just described ...
As it is, it looks neat but I'll stick with iOS built-in recorder.
How would any non-domain specific tool (ie a voice recorder app for real estate or even real estate in PR) even know what "opportunity" means?
It could do a loose keyword match but unless you used the words "Tulum" or "Evan" how would it know to link notes together without context on who Evan is?
> How would any non-domain specific tool (ie a voice recorder app for real estate or even real estate in PR) even know what "opportunity" means?
> It could do a loose keyword match but unless you used the words "Tulum" or "Evan" how would it know to link notes together without context on who Evan is?
Does it need to know? Fairly vanilla NLP can provide the data to categorize (or index) by identified parts of speech, such as verbs, proper nouns, etc. If you have a large enough pile of notes, categorizing or subcategorizing by combinations would be useful.
There are pitfalls, such as lacking sufficient context for disambiguating between identically named people (eg. your sister Mary vs. Mary from work), but that doesn't negate the utility of such a feature.
Further refinements for association and disambiguation would be highly contingent, but that very contingency can be modeled with Bayesian classification (or more advanced attentional mechanisms) that learns when to apply them. For example, a bit of sentiment analysis could help associate Mary (that you're often mad at) with the words 'project' and 'report', but Mary (that you like) with 'barbecue' and 'holiday' for clustering purposes.
These supplementary techniques necessarily operate on 'small data', and the real challenge is finding natural UI flows and affordances to suggest them to the user when appropriate and solicit feedback without overwhelming.
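A rough sketch of the simplest version of this idea, in Python. It does not do real part-of-speech tagging: it treats capitalized words that are not at the start of a sentence as proper-noun candidates, which is a crude stand-in for what an NLP library would provide. All function names and the sample notes are made up for illustration.

```python
import re
from collections import defaultdict

def proper_noun_candidates(text: str) -> set[str]:
    """Capitalized words that do not start a sentence (a crude stand-in for POS tagging)."""
    found = set()
    for sentence in re.split(r"[.!?]+\s*", text):
        words = re.findall(r"[A-Za-z]+", sentence)
        for word in words[1:]:  # the first word of a sentence is capitalized regardless
            if word[0].isupper() and word != "I":
                found.add(word)
    return found

def index_notes(notes: list[str]) -> dict[str, list[int]]:
    """Map each candidate keyword to the indices of the notes that mention it."""
    index = defaultdict(list)
    for i, note in enumerate(notes):
        for keyword in sorted(proper_noun_candidates(note)):
            index[keyword].append(i)
    return dict(index)

notes = [
    "Talked to Evan about the lot near Tulum.",
    "The beach road in Tulum floods in spring.",
    "Call the lawyers before the weekend.",
]
print(index_notes(notes))
```

Keywords shared by several notes then become candidate folders; anything with no match stays in the flat list. Disambiguating two people named Mary would need the extra signals described above.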
If it had 10+ algorithmic shuffles based on keywords, I'd just click the button until it was shuffled somewhere in the domain of "close enough." Then I could reorder the folders myself. Maybe it's counterintuitive, but as a user, having an algorithm shuffle things wrong is actually preferable to me, rather than me starting from a large flat list of arbitrary unlabeled recordings (assuming I would not take the time to label each one as I record it).
Why not just have customizable keyword categories? So then you could preface it with said keywords / tags.
"HIGH/LOW OPPORTUNITY x name x location"
"CHECK OUT x on DAY"
"BUY x"
Etc...
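A minimal sketch of how spoken prefixes like these could be matched against a transcript, assuming the user configures a plain list of category phrases. The function name and the "UNSORTED" fallback are made up for illustration and are not part of Voiceliner.

```python
def categorize(transcript: str, categories: list[str]) -> tuple[str, str]:
    """Return (category, remaining text); category is "UNSORTED" if no prefix matches."""
    words = transcript.strip().split()
    # Try longer prefixes first so "CHECK OUT" wins over a hypothetical "CHECK".
    for category in sorted(categories, key=lambda c: len(c.split()), reverse=True):
        prefix = category.lower().split()
        # Compare case-insensitively and ignore punctuation the transcriber may add.
        spoken = [w.lower().strip(".,!?:") for w in words[:len(prefix)]]
        if spoken == prefix:
            return category, " ".join(words[len(prefix):])
    return "UNSORTED", " ".join(words)

print(categorize("Check out the new bakery on Saturday", ["BUY", "CHECK OUT"]))
```

Matching is done on the transcript, so it only works as well as the transcription of the keyword itself.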
Awesome!!! I have had it in the back of my mind for years to do something like this. I am so glad that you have stepped up. Very, very much needed. Now I just need to figure out how best to bulk transcribe the hours and years of audio recording...
I tried to do this for my rambling half-hour in-car voice "notes" using AWS S3 -> Lambda -> Transcribe. I got the flow working but the transcript is literally unplayable and I'm not sure how to get it to a usable quality. I have vague plans to experiment with IBM Watson but it's way back-burnered.
Was looking for something like this just today. Does it support hands-free operation (or while we're at it, does someone else know something that does)?
This is currently my main criterion. I want something that captures my thoughts while hiking without seeing or touching the screen. Currently dabbling with Siri shortcuts, but they're pretty buggy and lacking.
So if Voiceliner could either support the Shortcuts API and/or switch into a mode that's press-to-record / or start-stop, but somehow works on the connected Airpods only, that would be awesome.
Bonus points for re-reading the transcription to me and very light editing on top (like document switching).
Is there something like that?
If anyone is curious about a workflow that might work on iOS: Settings > Accessibility > Back Tap. You can then assign a Shortcut to run when you tap the back of your phone.
The taps are finicky for me, maybe because of my phone case. I might try them out again and see if it's worth ditching my case.
Have you tried using voice control?
it's open source - and i can't imagine activating via play/pause would be overly difficult to implement, if it does not already do so.
Congrats on launching!
I've wanted exactly-this for years. I've sketched a few versions but it stayed on the back-burner for me, partly because friends/etc didn't see the appeal.
I'm really excited to try it out.
This sounds great to take quick but organized notes for my use case - researching and prototyping, but when the notes need to be converted into a more formal 'report' later.
Amazing! I’ve been looking for a way to put down my thoughts for a while, tried carrying a pocket notebook.. the notes app.. zen journal .. but writing is so much friction .. I used voice notes for a while but couldn’t search them for later so it was difficult.. I hope this is the one ^^
What do you use for the transcription part? Paid library or something open? More curious on accuracy.
According to the source it's Azure. https://github.com/maxkrieger/voiceliner/blob/main/lib/repos...
on iOS, it's on-device. on Android, Azure - so far I haven't hit the free limit of 5hr/mo. Might start charging Android users if we hit the free limit.
Android has on-device transcription for some devices (e.g. Pixel 5 and above). Maybe you could use that instead of charging?
EDIT: If you feel strongly about this and think it's possible please send in PRs. Thanks!
Unfortunately you then lose ability to play back the original audio https://stackoverflow.com/questions/2319735/voice-recognitio.... This is a major usability tradeoff IMO, though I'm willing to be swayed to add an option.
I suspect Google's limiting this because they don't want devices "freeloading" their cloud transcription service, since most phones can't do it on-device.
Can you not buffer the audio and then send the buffer to the transcription service, allowing you to keep the original?
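To make the idea concrete, here is a minimal Python sketch of "buffer, keep, then transcribe". It assumes the app itself captures raw 16-bit mono PCM chunks and that the transcription service accepts WAV bytes; `transcribe` is a hypothetical callable standing in for whatever service is used. It says nothing about whether Android's built-in recognizer exposes the audio, which is the limitation discussed above.

```python
import io
import wave
from typing import Callable, Iterable

def record_and_transcribe(chunks: Iterable[bytes],
                          transcribe: Callable[[bytes], str],
                          sample_rate: int = 16000) -> tuple[bytes, str]:
    """Buffer raw 16-bit mono PCM chunks, keep them as WAV bytes, and transcribe the same audio."""
    pcm = b"".join(chunks)  # the full recording, held in memory
    wav_buffer = io.BytesIO()
    with wave.open(wav_buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    wav_bytes = wav_buffer.getvalue()
    text = transcribe(wav_bytes)  # hypothetical call to the transcription service
    return wav_bytes, text
```

The returned WAV bytes can be written to disk for playback while the text is stored alongside them.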
If this were all on-device I'd use this in a heartbeat. I'd even pay for it. I worry about privacy though - I appreciate you went with Azure instead of Google, however!
You really think Azure and Google Cloud have that much difference? If you use GCP, you can select whether they can also use the data for training, per request. Does Azure have that option or clarity?
Their respective reactions to 2013 is all the clarity you need.
Not just Pixel 5 and above, you can transcribe in real time with any Android phone using the Google Recorder app. Which essentially means that it doesn't need a special processor (as they marketed for Pixel 6) or cloud services to transcribe in real time.
App probably loads a model for offline use, I don't know if you could somehow use that app as an API or something.
Same goes for Chrome. You can see live transcript of any audio in Chrome Desktop without internet. That transcript is readonly and cannot be copy pasted even.
The link you posted is a 7 year old question; is the information still relevant? Surely copying the original audio is enough of a workaround?
Not a mobile dev so could be wrong.
Oh this is great. Just tried it out. A couple feature wishes:
- Sync the audio + text somewhere (although maybe this can be done with SyncThing already?)
- Add a widget / app action to support one-tap voice notes from the home screen
Crikey - my Xmas present has just arrived!! I have never gone from seeing something on hn and having it installed and loved so quickly!
Thank you, thank you, thank you.
And open sourcing it too - can't love you more ;)
Super cool! I’ve been slowly getting over the self consciousness of using Siri in public. Maybe I’ll start with around the house first…
Great app! And this comes from a guy who has _very_ few apps installed on his phone! Would it be difficult to have it recognize a different language? (german)
Interesting. I've mostly achieved this with BlitzMail (basically, one button to an open textbox that emails to yourself.) But I might check this out as well.
Suggestion: a play button for the whole outline folder you are in… for example: I want to listen to everything outlined in that folder while running
Upvote here :) https://github.com/maxkrieger/voiceliner/issues/6
Installed. Been needing something like this for a while!
Love the initial setup wizard. Great way to teach the user, clarifying what various permissions are for.
I've wanted this for a while – thanks so much!
cool. but on Android, wouldn't Gboard speech input, which also works offline, suffice for the speech input and transcription part?
If you're okay with your voice being piped to Google instead of Microsoft.
Doesn't it work offline now (optionally)?
Ask me 20 years ago!
Great idea! I've been trying for a while to find the best way to take notes when on the go, just installed to test it.
This is perfect! Any chance to be able to select the target language for transcription? Portuguese - Brazil for example :)
Coming soon! https://github.com/maxkrieger/voiceliner/issues/17
I love really niche but really useful gems like this! Could anyone recommend any others?
This looks pretty awesome! Is there a way to export them? I’d like to push them into Trello or another tool
Keeping it all local is brill. There’s def “signup fatigue” with apps like this. This is perfect.
This is the first instant download I have done from HN. Can't wait to try it out later. Great job!
Fantastic! I love that it is all local.
The audio goes out.
Only on Android.
Love that first item on your first App Store screenshot: “Re-watch the mother of all demos”.
Well done! Love that you can create outlines from voice recordings - super cool.
Love it! Can't wait to see how this evolves over time.
Is it possible to change transcription language on iOS?
It follows system settings, but I'm going to add app-specific localization soon! https://github.com/maxkrieger/voiceliner/issues/17
Is there a way to just type instead of speaking?
Yup, check the menu in the corner or long-press anywhere in the empty space.
Is there a direct .apk download?