Intel open sourced Stephen Hawking’s speech system

blogs.msdn.com

432 points by btzll 10 years ago · 65 comments

joefreeman 10 years ago

Fun fact: the latest version of this software uses SwiftKey under the hood - http://swiftkey.com/en/blog/swiftkey-reveals-role-professor-... (Disclaimer: I used to work for SwiftKey)

  • Nexxxeh 10 years ago

    As a loyal SwiftKey on Android user, the prediction engine for SwiftKey is unnervingly good. Glad to see it's put to a better use than helping me write Facebook and HN posts.

    It raises the question, though: why isn't there a proper SwiftKey keyboard for Windows? The OSK on 8.1 is awful compared to SwiftKey on Android. The Windows 10 one is an improvement, but I'd still pay for a better one.

    • greyskull 10 years ago

      Have they improved performance of SwiftKey? I had the paid version for years, was even a VIP (won a t-shirt and everything), but I switched off last year in favor of Fleksy as a faster, more lightweight alternative.

      • mtgx 10 years ago

        I think the performance of SwiftKey improved about half a year ago. But I think around the same time I started noticing a significant drop in SwiftKey's accuracy. It "feels" less accurate to me than it was, although I use it with 3 enabled languages at once and I imagine that also brings lower accuracy by default. Still, I think it has become quite a bit worse than before, and I worry they did that on purpose as a compromise to improve performance.

    • balladeer 10 years ago

      Was a loyal user. Then they started making themes, emojis and all those bells and whistles and thrusting it all in the app which made it really slow and bulky.

  • kragen 10 years ago

    Does this mean that SwiftKey is now open source, that what they've open-sourced doesn't include the actual prediction engine, or that what they've open-sourced is not the latest version? The GitHub page says they use Presage.

    • modeless 10 years ago

      Interesting, since Presage is GPL 2 but this project is Apache licensed. Intel says "Integration with Presage is through the Windows Communication Framework" which I guess is a roundabout way of avoiding the GPL?

      • kragen 10 years ago

        The Apache license v2 is GPL-compatible, which was actually one of the major reasons for having a v2.

        • modeless 10 years ago

          Apache licensed code can be included in a GPL project, not the other way around. If the project includes any GPL code, the whole thing is GPL.

          • saurik 10 years ago

            I think it is better to think of it from the perspective that this code and this project is Apache, but any binaries that result from linking this code to that code are under GPL.

nbevans 10 years ago

Bear in mind this project was started around the same time there was a ton of uncertainty around the future of Silverlight and WPF. Alas, one did die, one lives on for now. But nobody knew that at the time, including Intel, or apparently Microsoft. WinForms has never faced any forward compatibility uncertainty so it is a good long-term bet.

tonyedgecombe 10 years ago

WinForms is still a great way to write desktop software if you don't need all the features of WPF, I just started a new project with it and have been really productive.

  • Pxtl 10 years ago

    WinForms is not simple. There are so many core classes that have counterintuitive edge-cases and overcomplicated behavior, and so many things you'd expect to work by default don't.

    Databinding is a complete trainwreck, the Combo-box class is horribly overcomplicated by its double-duty as text-entry and drop-down-list, the DataGridView is a complete beast of leaky abstractions, and the layout engine completely falls apart if somebody alters the DPI unless you obsessively test DPI alterations yourself.

    I don't blame Microsoft for any of this - it was 2000 and they were making a wrapper around some terrifying legacy code.

    But this thing should have been tossed in the dustbin of history a long time ago.

    • jorgeleo 10 years ago

      I believe that no technology is simple on its own; it all depends on the abstractions that you are used to. Counterintuitive is dependent on how things are expected to work.

      Data binding is not solid, but it is a quick hack to display data. The solution is using a business model and MVP or MVC.

      What do you find complicated about the combo class?

      Data grid view... It is a train wreck, but then again, there is not much need to use it if you have a proper model behind it.

      The layout engine does suck... The only alternative I found is to use the DevExpress layout control. The rumor is that 4.6 solves this.

      WinForms is solid and it has very little chance of disappearing. Areas of the screen can be controlled independently, which means UI encapsulation is there... Something not easily done in HTML.

    • 0x37 10 years ago

      What a hyperbole of a comment. It really isn't that bad.

    • duncan_bayne 10 years ago

      Sure. But - serious question - what offering from Microsoft would you replace it with?

      • Pxtl 10 years ago

        I keep using it because I'm used to all its warts and idiosyncrasies. So I don't know what properly-supported alternative one should use. I just get annoyed by how many brand-new fresh-out-of-college developers I meet that use it. They need something better.

        • duncan_bayne 10 years ago

          Having used both WPF and Silverlight until 2010 (at which point I abandoned the Microsoft stack altogether) I agree, but I don't think the answer is either of those technologies.

          Have you tried building GUI apps in Racket? That's the sort of thing I was wishing for when using either Java or .NET to build Windows GUIs.

          • Pxtl 10 years ago

            I've done some academic intro-to-FP stuff in Racket, but haven't really got my feet wet with a non-toy application in it. So the GUI framework is good?

            • duncan_bayne 10 years ago

              Yup. I haven't built anything of significance in it (yet) but it's proved really easy to learn, and (again, in my limited experience) rock-solid stable and fast enough:

              A trivial example:

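                #lang racket
                ;; Only racket/gui/base is needed for this example.
                (require racket/gui/base)

                ;; Menu callback: receives the menu item and the event, then quits.
                (define (menu-file-exit-click item event)
                  (exit 0))

                ;; Top-level window.
                (define frame
                  (new frame% [label "Demo"] [height 480] [width 640]))

                ;; Menu bar attached to the window.
                (define menu
                  (new menu-bar% [parent frame]))

                ;; "File" menu; "&" marks the keyboard mnemonic.
                (define menu-file
                  (new menu% [parent menu] [label "&File"]))

                ;; "Exit" item wired to the callback above.
                (define menu-file-exit
                  (new menu-item% [parent menu-file] [label "E&xit"] [callback menu-file-exit-click]))

                ;; Show the window and start handling events.
                (send frame show #t)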
              
              ... is all you need to create a basic GUI app with a File -> Exit menu option. And that really is all there is - no resource compilation, no code-behind, no separate languages for expressing the UI and the actions connected to it.

  • ZanyProgrammer 10 years ago

    WinForms is still the easiest way to write desktop software quickly. If you don't need a fancy UI for customer-facing work, it's a great platform for internal use.

    • Aleman360 10 years ago

      ... for some definition of "easiest." Maybe I would agree if you don't need to support a custom look-and-feel (Win32-looking apps don't really fly anymore), responsive layout, high DPI, touch/pen input, system theme colors, accessibility, localization, and haven't learned XAML.

      WPF and UWP apps are both far easier.

      • jimmaswell 10 years ago

        Win32 is the style that Windows apps are supposed to be and what people expect in a Windows app. What are you talking about? In what way does the win32 look "not fly"?

        I've tried to use WPF and it was just a major pain and felt like a mess. There is no impediment to "responsiveness".

        Where did you get the idea winforms apps don't follow the system colors?

        • Aleman360 10 years ago

          > Win32 is the style that Windows apps are supposed to be and what people expect in a Windows app.

          Maybe in the Windows XP era. None of the built-in apps in Windows 10 look like Win32 apps, apart from legacy stuff that hasn't been ported yet and now looks sorely out of place.

          https://msdn.microsoft.com/en-us/library/dn894631.aspx?f=255...

          Much of the Windows shell isn't even written in Win32 anymore, it's all UWP (source: I work on the start menu).

    • lbruder 10 years ago

      Have a look at Lazarus (http://www.lazarus-ide.org/). It uses a different language (FreePascal instead of C#), but for me it's much more productive, and the programs written with it run without any framework and feel much snappier.

  • jimmcslim 10 years ago

    The data binding in WPF seemed more powerful to me... enabling MVVM approaches, although I'm sure the same is possible in WinForms as well.

    Does WinForms still get love in the new versions of the SDK or is it considered done now?

    • toong 10 years ago

      WinForms is in maintenance mode. The only updates it gets are to make sure it runs on new versions of Windows & maybe some security-related updates.

      (There still are bugs around, but I think they will not be addressed ever. Fixing those could potentially break existing applications relying on that behaviour.)

    • whoisthemachine 10 years ago

      I don't know, I've done both (Silverlight MVVM and Winforms) plenty and I prefer the manual approach. Databinding is cool until you have any sort of complexity in your app, and then it becomes unwieldy quickly.

      • mariusmg 10 years ago

        Databinding (and MVVM by extension) is ok only in very simple scenarios. Once you hit some complexity you end up in a world of hurt.

        Databinding is not a leaky abstraction, it's a fucking flood abstraction.

        • stupidcar 10 years ago

          Um, if you're using MVVM, then by definition you should have a clean separation between the properties of your Model, which is part of your domain, and the properties of your View-Model, which is part of your presentation tier and directly tied to a particular View. If you have this, what exactly can be leaked by databinding? The only things you're binding to should be properties of your View-Model.

          • icegreentea 10 years ago

            While that's largely true when dealing with pure business logic, there's a maddeningly large amount of UI logic, and hybrid business/UI logic that becomes really annoying (or outright impossible) to handle in pure MVVM with WPF. Reasons for this include that many user interface properties aren't exposed nicely to allow data-binding.

            Probably the most infamous one is that the WPF listview with multiple select enabled doesn't allow you to bind to the collection of selected items. Instead you have to do all sorts of workarounds that, while individually not too bad, when put all together make all the other hard work you put into doing MVVM on the components that you fully control super frustrating.

            • r-n 10 years ago

              > many user interface properties aren't exposed nicely to allow data-binding

              You hit the nail on the head with this. This is why I don't like to use WPF outside of writing small utilities.

        • m_fayer 10 years ago

          I've worked on large and complex apps with MVVM in WPF and on the web with Angular, and have not regretted using MVVM for a second.

          Databinding is indeed a leaky abstraction, but at the same time it's a very powerful one. I'm willing to learn the inner workings of the binding system to avoid performance pitfalls and other weirdnesses. I'm also willing to continually wrap all sorts of not MVVM-ready components to make them data-binding friendly. When people talk about databinding being a leaky abstraction, what I hear is "I was promised magic and it's not actually magical."

          Also - there are many different approaches to how it's done, with various tradeoffs. Compiled bindings on Android and the new Windows platforms look interesting, and you should also check out how ReactiveUI approaches it.

          In the end though, I've never been able to achieve a satisfactory level of loose coupling, testability, and portability without databinding. Despite the overhead and occasional surprises, it's paid off in spades as far as quality and productivity.

    • tbrownaw 10 years ago

      As far as I can tell, it seems to be done.

lorenzhs 10 years ago

Discussion of a previous article, focusing on the difficulties during the development of ACAT and tailoring it towards Stephen Hawking: https://news.ycombinator.com/item?id=8686757

btzll (OP) 10 years ago

Github repository: https://github.com/01org/acat

dimman 10 years ago

To everyone who's interested in programming and is thinking, "What should I do/program?": I'm sure there are a lot of small things or applications you can build to help other people in need. See it as a learning experience and something that might have a huge impact on other people's lives. How's that for a motivator? Kudos and respect to all the people behind this project and to Stephen Hawking himself.

twotwotwo 10 years ago

My mom had ALS, and used single-switch input for a while, after typing and writing on paper weren't possible. She wrote out little notes about what she was thankful for, prayers, practical messages (she had Type 1 diabetes, and told folks her insulin doses), and recipes this way. Eventually she had to switch to giving messages to a human holding a letter board by looking towards them for 'yes' and away for 'no'--cameras or Hawking's infrared-laser-based system weren't really feasible.

We looked at some software called EZ Keys from a company called Words Plus. (I don't think she used it specifically, at least for long--I know she used another program, a DOS-based one called Living Better that ran in 40-col. mode and that I can't find anything about on the Internet.) EZ Keys looked more or less like Intel's thing -- scan rows, scan items in a row, completions/predictions over at left. It even had an option to use a frequency-sorted keyboard like the Intel one, with the common letters pushed to the top left (since those are the first rows/cols to be scanned). Hawking apparently used EZ Keys, so it's possible the Intel folks intentionally gave their thing a similar interface to make the transition easy.
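
To make the benefit of a frequency-sorted layout concrete, here is a minimal sketch of row/column scanning cost. The grid width, the cost model (one step per row highlighted plus one step per item highlighted), and the letter frequencies are illustrative assumptions, not details taken from EZ Keys or ACAT.

```python
# Illustrative sketch: expected scan steps for row/column scanning.
# Assumed cost model: the scanner highlights rows top to bottom, then items
# left to right within the chosen row; each highlight counts as one step.

# Approximate English letter frequencies in percent (rough, illustrative values).
FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7, "s": 6.3,
    "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8, "u": 2.8, "m": 2.4,
    "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0, "p": 1.9, "b": 1.5, "v": 1.0,
    "k": 0.8, "j": 0.15, "x": 0.15, "q": 0.1, "z": 0.07,
}

def scan_cost(index, width):
    """Steps to reach the item at a flat index in a grid with `width` columns."""
    row, col = divmod(index, width)
    return (row + 1) + (col + 1)

def expected_cost(layout, width=6):
    """Frequency-weighted average number of scan steps per letter."""
    total = sum(FREQ[ch] for ch in layout)
    return sum(FREQ[ch] * scan_cost(i, width) for i, ch in enumerate(layout)) / total

def frequency_layout(width=6):
    """Place letters so the most frequent ones get the cheapest cells."""
    letters = sorted(FREQ, key=FREQ.get, reverse=True)
    cells = sorted(range(len(letters)), key=lambda i: scan_cost(i, width))
    layout = [None] * len(letters)
    for cell, ch in zip(cells, letters):
        layout[cell] = ch
    return layout

alphabetical = sorted(FREQ)
by_frequency = frequency_layout()

print(f"alphabetical: {expected_cost(alphabetical):.2f} steps per letter")
print(f"by frequency: {expected_cost(by_frequency):.2f} steps per letter")
```

Under this cost model the frequency-sorted grid always needs fewer steps on average than the alphabetical one, because the cheapest cells go to the most common letters.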

It is worth remembering that no user cares if it's WinForms or whatever. Some folks might like a nicer voice if they haven't gotten used to theirs like Hawking ;), but the main concern is just getting the message across. Intel seems to have worked on the right stuff: better prediction (Presage http://presage.sourceforge.net/, which looks interesting) and context-sensitive controls. The infrared-laser-based input method sounds cool, too.

This is a neat space: an optimization/prediction problem where improvements can be a significant help to someone. (There are also practical optimizations that don't have much to do with the general word-prediction problem: sometimes people have to say things about their care, food, etc., or generic 'hi' and 'bye', and it's good if those are fast.) A Web page or Chrome extension can do a lot--how close can you get to smoothly operating the Web with just the spacebar? the arrow keys and Enter? or plain old typing, but slowed down and using 0-9 for completions?

I've heard that nowadays, people with communication trouble and enough movement use text-to-speech on mobile gadgets with their nifty and highly refined predictive input and that's awesome.

blackbeard 10 years ago

Thinkpad love there as well. His "custom" computer appears to be an X220 tablet in an enclosure.

  • noir_lord 10 years ago

    Would make sense, easily available commodity hardware that is reliable and in a decently small form-factor.

acqq 10 years ago

Does this include Hawking's speech synthesis at all (there was an article saying that his voice is based on some hardware device: http://www.wired.com/2015/01/intel-gave-stephen-hawking-voic... )? As I understand it, it's "just" a "navigation" system (replacing the mouse and keyboard with a virtual key driven by facial movement). If so, the title (the "speech system") is misleading.

The project also doesn't use SwiftKey but

https://github.com/01org/acat

"Presage, an intelligent predictive text engine created by Matteo Vescovi."

  • acqq 10 years ago

    It seems that the company which now owns the "Paul" voice preset and DECtalk is SpeechFX:

    http://www.speechfxinc.com/dectalk.html

    I don't know how different that is from the CallText 5010, which was, as the Wired article states, eventually bought by Nuance Communications. Still, as per Wikipedia:

    https://en.wikipedia.org/wiki/DECtalk

    "The CallText 5010 is still listed on Hawking's site as of 2015.[9]"

  • voiceclonr 10 years ago

    @acqq: Shameless plug. This doesn't seem to have speech. However, I've tried to build a text to speech synthesizer in www.voiceclonr.com. Appreciate if you could try and leave feedback.

    • acqq 10 years ago

      If I understood correctly, the open-sourced version uses Microsoft's Speech API as an example. Searching for that, I find gems like this:

      https://connect.microsoft.com/VisualStudio/feedback/details/...

      "System.Speech has a memory leak - by eoghanoh

      Status: Closed as Won't Fix"

      I see your work is based on http://hts.sp.nitech.ac.jp/ Can you tell us what are your changes?

      Edit: I see HN already commented on your work:

      https://news.ycombinator.com/item?id=9812734

      • voiceclonr 10 years ago

        So much developer rage in that Status :) On the HMM stuff, it was pretty much the baseline code from the link. The things I recall experimenting were more about getting it done faster (threading some training phases, different gcc options during synthesis etc).

      • hobarrera 10 years ago

        So the core and a really important part (speech synthesis) is still closed source.

        That's a shame really, I was really looking forward to trying it out. And the title is grossly misleading.

jmpeax 10 years ago

If anyone is considering downloading this to use his voice to annoy your best friend John with things like "Hello, my name is Stephen Hawking. The universe is big, but not as big as John's mother.", let it be known that this software doesn't sound like Hawking.

datawaslost 10 years ago

Presage is great, but to clarify some other comments - it doesn't involve any specific dataset, like SwiftKey - it simply does nice smoothed predictions when given a large database of n-grams (groups of words) and their frequencies. It's fairly easy to chop up a corpus into n-grams using NLTK or other tools, and there's a good port for Python called Pressagio.
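
As a rough illustration of frequency-based n-gram prediction, the sketch below counts unigrams and bigrams from a tiny made-up corpus and ranks candidate next words by a linearly interpolated score. It does not use Presage or Pressagio; the corpus, the `build_counts` and `predict` helpers, and the interpolation weight are assumptions chosen for the example, and Presage's actual smoothing may differ.

```python
from collections import Counter

# Tiny made-up corpus; a real system would use a large one.
CORPUS = "the black hole is a region of space . the black cat sat . the hole is deep ."

def build_counts(text):
    """Return unigram and bigram frequency tables for whitespace-split tokens."""
    tokens = text.split()
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def predict(prev_word, prefix, unigrams, bigrams, weight=0.7, limit=3):
    """Rank words starting with `prefix` by an interpolated bigram/unigram score."""
    total = sum(unigrams.values())
    prev_count = unigrams[prev_word]
    scores = {}
    for word, count in unigrams.items():
        if not word.startswith(prefix):
            continue
        p_unigram = count / total
        # Fall back to the unigram estimate alone if prev_word was never seen.
        p_bigram = bigrams[(prev_word, word)] / prev_count if prev_count else 0.0
        scores[word] = weight * p_bigram + (1 - weight) * p_unigram
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:limit]

unigrams, bigrams = build_counts(CORPUS)
print(predict("black", "", unigrams, bigrams))
print(predict("the", "h", unigrams, bigrams))
```

Mixing in the unigram score means a word can still be suggested when its bigram was never observed, which is the basic purpose of smoothing.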

My startup Spoken - http://spokenaac.com - uses n-gram predictions to help users with aphasia or other language disorders speak. The user interface challenges aren't quite as intense as Stephen Hawking's binary input, but it's an interesting field if you're into design and big data.

ris 10 years ago

Forcing a disabled man to use Internet Explorer. Surely this is the basest form of cruelty.

andersonmvd 10 years ago

Now security researchers will analyze the code to find vulnerabilities to exploit Stephen Hawking's speech system. Next headline: Stephen Hawking's voice sounds like Justin Bieber's voice, lol.

nimitkalra 10 years ago

The code in the GitHub repository [1] is pretty interesting to look around in.

[1] https://github.com/01org/acat

melling 10 years ago

Does it make sense to be more aggressive in predicting by giving the user a second level from which to choose? For example, if he types 'b', then it could offer to type 'black' or 'black hole'.

On the iPad, for example, if I type 'f', I get shown 'for', then if I accept, I always see 'example' and 'instance'.
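
One way to picture the two-level idea is to offer, next to each completed word, the phrase formed with its most frequent follower. The sketch below does this with follower counts from a made-up sample text; the `suggest` function, the `min_share` threshold, and the data are hypothetical and do not describe how ACAT or the iPad keyboard work.

```python
from collections import Counter, defaultdict

# Made-up sample text for illustration only.
TEXT = ("black hole black hole black hole black board "
        "big bang big bang for example for instance")

def build_model(text):
    """Count words and, for each word, the words that follow it."""
    tokens = text.split()
    words = Counter(tokens)
    followers = defaultdict(Counter)
    for first, second in zip(tokens, tokens[1:]):
        followers[first][second] += 1
    return words, followers

def suggest(prefix, words, followers, min_share=0.6, limit=4):
    """Offer completions of `prefix`; add a two-word phrase when one follower dominates."""
    candidates = sorted(
        (w for w in words if w.startswith(prefix)),
        key=lambda w: (-words[w], w),
    )
    suggestions = []
    for word in candidates:
        suggestions.append(word)
        if followers[word]:
            nxt, count = followers[word].most_common(1)[0]
            # Only spend a slot on the phrase if the follower is likely enough.
            if count / sum(followers[word].values()) >= min_share:
                suggestions.append(f"{word} {nxt}")
    return suggestions[:limit]

words, followers = build_model(TEXT)
print(suggest("b", words, followers))
print(suggest("f", words, followers))
```

The threshold reflects the trade-off: every phrase shown takes a slot from another single-word completion, so it is only offered when one follower clearly dominates.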

  • rjbwork 10 years ago

    I believe there is a predictive typing keyboard based off of Tries that is floating around out there for one of the mobile OSes.

  • twotwotwo 10 years ago

    Suspect "sometimes, but rarely," because choosing "hole" after "black" is pretty cheap--you need your expected savings from potentially saving that choice to exceed the cost of bumping your worst option from the list. It's more likely to be practical when you can offer lots of choices rather than the typical three on mobile, though (bumping a 10th choice is cheaper than bumping a 3rd choice).

    Separately, predefined phrases/templates can be really practical for things related to care, food, saying hi and bye, etc. That's a special case--user likely cares more about getting it done efficiently than choosing the exact wording they want each time.

sagivo 10 years ago

Amazon passwords: https://github.com/search?utf8=%E2%9C%93&q=filename%3Aaws.ym...

almost_started 10 years ago

.Net WinForms? No wonder it sounds so bad.
