4.2010 – 6.2011

Microsoft Touch Mouse


Before 2009, Microsoft Research explored novel input architectures that combined multi-touch interaction with basic mouse functionality. In 2009, I helped Microsoft Hardware turn this research into a real product, ensuring the touch interaction model carefully balanced gesture recognition against core mouse functionality, and later designed the product’s onboarding to drive user adoption of the gesture set.

Microsoft released its Mouse 2.0 research paper for the 2009 ACM UIST symposium just as Apple introduced its Magic Mouse to the world. The paper covered the pros and cons of five different hardware architectures, described the approach to low-level touch processing, and highlighted new potential desktop interactions. Both companies saw exciting opportunities for multi-touch mice, and Microsoft Hardware responded to Apple with something more impressive than Arc Touch Mouse. Of the five research prototypes, the ‘Cap’ mouse architecture had the most potential and the least risk. I joined the product team shortly after it kicked off.

The product targeted those wanting a premium multi-touch mouse that transforms Windows 7 into the ideal experience – one just as exciting for the PC as Magic Mouse was for the Mac.

…but more comfortable.

The goals for advancing Microsoft’s research were to…

  1. Design a full-size wireless mouse that’s truly comfortable even when gesturing.
  2. Deliver fun, useful, and easy-to-perform multi-touch gestures, creating the ideal Windows 7 experience without compromising any core mouse functionality.
  3. Ensure the multi-touch gestures are quickly learned, easily remembered and adopted into everyday use.

And for the launch of Windows 8, evolve Touch Mouse’s behavior to showcase what’s new in Windows.

This required developing a clear mental model for gestures and reinforcing it with the right associated actions. Prior research explored a large range of opportunities – mostly manipulations – some appropriate for a select group of apps and others implying a next-generation operating system. Influencing Windows was out of scope, and the gestures had to be relevant more broadly. The mental model forming as I joined the team was one finger for controlling content, two fingers for controlling the window, and three fingers for controlling all windows. The familiar two-finger pinch/rotate manipulations were also interesting, but they were awkward to perform on a mouse and would have complicated this model, so they were ultimately discarded for simplicity.

With the mental model taking shape, the next step was associating the most appropriate, meaningful actions with each gesture. Scrolling content with one finger in any direction was table stakes. Mapping a thumb motion to back/forward enabled the popular mouse button functionality and aligned nicely with the one-finger mental model. For two- and three-finger gestures, the premise was to control windows, replacing repetitive mouse movement with a more efficient interaction. Maximizing, minimizing and snapping windows mapped well to moving two fingers: move up to maximize, down to minimize, and left or right to snap. Mapping three fingers to show or hide all windows extended the metaphor and leveraged Instant Viewer, an IntelliPoint software feature providing functionality sorely wanted by all Windows users.
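The gesture set described above can be summarized as a small lookup table. This is only an illustrative sketch: the key names, action strings, and the `action_for` helper are hypothetical and are not identifiers from the shipped software. It also assumes that a left or right two-finger move snaps the window to that same side, and it leaves the thumb and three-finger gestures direction-agnostic because the text does not specify their directions.

```python
from typing import Optional

# Hypothetical names throughout; the mapping mirrors the mental model:
# one finger = content, two fingers = the window, three fingers = all windows.
GESTURE_ACTIONS = {
    ("one_finger", "any"): "scroll content",
    ("thumb", "any"): "navigate back/forward",
    ("two_fingers", "up"): "maximize window",
    ("two_fingers", "down"): "minimize window",
    ("two_fingers", "left"): "snap window left",    # assumed: snaps to the same side
    ("two_fingers", "right"): "snap window right",  # assumed: snaps to the same side
    ("three_fingers", "any"): "show or hide all windows",
}


def action_for(touch: str, direction: str) -> Optional[str]:
    """Look up a gesture, falling back to a direction-agnostic entry."""
    specific = GESTURE_ACTIONS.get((touch, direction))
    if specific is not None:
        return specific
    return GESTURE_ACTIONS.get((touch, "any"))
```

Keeping the table this small reflects the design choice in the text: each finger count has one scope, so gestures that do not fit (such as pinch/rotate) have no entry.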

The sheer complexity of this product was tangible in the early engineering builds, more so than anything the group had previously shipped. High-level experience goals, typical user research methodologies and a monthly cadence of subjective user studies validating the design were not going to produce timely, actionable feedback. The team needed a framework that translated goals into action, so I drove the definition of both objective and subjective target metric criteria (e.g. meet a specific average rating or meet a specific spec). It did not matter how cool the gestures were if you could not confidently use the mouse. Finding the right balance was key, so I also established metrics to track and characterize the sources of user frustration.

Development approached these challenges with machine learning (i.e. recording individual gestures, training the recognizer against a subset, and evaluating accuracy against the rest). While this process was efficient, I saw areas that could improve. When collecting recordings, people would simply hold the mouse and make one gesture after another, not knowing when the task was complete or whether they had done it right. This did not capture real behavior, confused people, and often led to discarded recordings – a problem when you need good data to deliver a great experience. In addition, data was only collected while people performed gestures, not in between or during more typical use. Frustrations occurring while pointing, clicking and scrolling also needed to be solved. To improve the tools and ultimately the process, I advocated for task feedback and task diversity, including activities that forced people to move, release and regrasp the mouse. I also advocated for another tool that could dump the last 60 seconds of sensor data with a keystroke combo or the click of a ‘Report Bug’ button.
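The idea of dumping the last 60 seconds of sensor data can be sketched with a rolling buffer that discards anything older than the window as new frames arrive. This is a minimal illustration, not the team's actual tool: the `SensorHistory` class, its method names, and the frame format are all assumptions made for the example.

```python
from collections import deque


class SensorHistory:
    """Keep only the most recent `window_seconds` of sensor frames (hypothetical)."""

    def __init__(self, window_seconds: float = 60.0):
        self.window_seconds = window_seconds
        self._frames = deque()  # (timestamp, frame) pairs, oldest first

    def record(self, timestamp: float, frame) -> None:
        """Store a frame and drop frames that fall outside the window."""
        self._frames.append((timestamp, frame))
        cutoff = timestamp - self.window_seconds
        while self._frames and self._frames[0][0] < cutoff:
            self._frames.popleft()

    def dump(self) -> list:
        """Return a snapshot, e.g. when a 'Report Bug' action is triggered."""
        return list(self._frames)
```

Because the buffer records continuously, a snapshot taken at the moment of frustration also includes the pointing, clicking and regrasping that preceded it, which is the data that gesture-only recordings missed.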

Timely feedback was now critical to the acceptance of our experience measures and for driving iteration. I knew that more frequent, focused feedback and automated synthesis could accelerate our pace. With no existing solutions to meet our needs, I worked with a vendor to drive development of another tool, which collected survey data in a custom SQL database and automatically aggregated responses into an Excel file that visualized results against our metrics in a meaningful way. With a historical view of our progress (e.g. over hardware and software builds), we could sort, filter and analyze response frequency and distribution, then easily summarize our findings for management or dive deep into the details for development. Feedback that once took three weeks now took just a couple of days, enabling the team to tune the experience more rapidly.
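The aggregation step can be illustrated with a short sketch that groups survey ratings by build and metric and checks each average against a target. The original tool used a custom SQL database and Excel; this example swaps in plain in-memory Python, and the `summarize` function, metric names, and target values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize(responses, targets):
    """Aggregate ratings per (build, metric) and compare against targets.

    responses: iterable of (build, metric, rating) tuples.
    targets: dict mapping each metric to its minimum acceptable average rating.
    """
    grouped = defaultdict(list)
    for build, metric, rating in responses:
        grouped[(build, metric)].append(rating)

    summary = {}
    for (build, metric), ratings in sorted(grouped.items()):
        average = mean(ratings)
        summary[(build, metric)] = {
            "average": average,
            "count": len(ratings),
            "meets_target": average >= targets[metric],
        }
    return summary
```

Keying the summary by build gives the historical view described above: the same metric can be compared across successive hardware and software builds to see whether a change moved it toward its target.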

Now it was time to apply these tools to the biggest challenges facing the product. For many people, simply holding or moving the mouse easily triggered something unintended. Strong survey feedback confirmed the issue and its importance and suggested a correlation with perceived comfort, but it did not provide enough detail to solve the problem. We needed to identify patterns across users for further algorithm development. To help, I captured richer recordings, compositing a screen capture of the logged sensor input with video of the hand using the mouse. Through analysis, I identified specific patterns of touch movement combined with mouse movement, as well as patterns of secondary touch instantiation and its effect on tracking primary touches. My recommendations led to methods of mitigating inadvertent gesture actuation and later to refinements in scrolling performance.