Simplified Twitter Microsyntax for the Haiti Earthquake

18 Jan 2010

In this post, I have typeset many more sentences in bold than I usually do, so readers can quickly skim through it.

I applaud the efforts of U. Colorado’s EPIC Group to help victims of the Haiti earthquake call for help using Twitter, and to make their tweets discoverable and actionable. I just performed a Twitter search for the terms #haiti -RT -http (which includes all tweets tagged #haiti, except retweets and tweets containing links) to inspect some of the tweets that are directly related to happenings on the ground. As expected, they are only a minuscule percentage of the total number of tweets about #haiti. Syntax is thus sorely needed to achieve a decent signal-to-noise ratio to assist relief efforts.

In my opinion, though, the current version of the tweet syntax seems too formal, too rigid and a tad too complicated for victims or rescuers on the ground. I am a programmer, and even I had trouble mentally parsing a few of the examples provided. We must keep in mind that Haiti is a bilingual or trilingual country (and none of its languages is English), so any syntactic terms used should preferably be semi-obvious to victims and rescuers who are not native English speakers.

Roles of Microsyntax

  1. Make tweets discoverable: Microsyntax can assist local search-and-rescue efforts and unaffected Twitter users in determining if a tweet is actionable. This task is partly a Signal Detection Task and partly a Data Mining problem. In both situations, microsyntax can prove helpful: all that’s needed is a single tag that emphasizes that a particular tweet is actionable (versus not), e.g. #haiti #rescue (or #haitirescue, to avoid having to type a second # (hash) sign). This will greatly increase the sensitivity parameter d’ of the signal detection task.
  2. Make data mining easier: Once a tweet has been detected to be actionable, its contents must be parsed into a form that local efforts can take action upon. While it’s true that all the other proposed microsyntactic tags make it easier for applications to parse the data, this is at the cost of requiring users to learn new syntax. This seems to me a little too much to expect from victims of a recent calamity of this scale as well as from rescue workers with other higher priorities. Instead, as long as our tools can identify relevant tweets, computers should be able to perform the second task of parsing locations, names, and verbs from tags quite easily.

Also, microsyntactic terms need not always be prefixed with # (hash) signs; these are often difficult to type using cell phone keyboards, and on some handsets, may hamper input methods such as T9. The intervening # signs also make tweets containing the proposed microsyntax harder to read for someone browsing through them.

To summarize, the proposed syntax imposes a heavy cognitive load on victims and search-and-rescue workers in order to make parsing easier for machines. However, the task of parsing details from tweets can also easily be performed by large numbers of humans, i.e. by crowdsourcing via volunteer efforts or via tools such as Amazon’s Mechanical Turk.

Simpler, Lighter Microsyntax

The following are examples of microsyntax that are more readable, yet also parseable by machines. All situations are based on the ones in the original proposed microsyntax. Most are directly based on the EPIC microsyntax, with a few simplifications.

  • Rule 1: Always write in the third-person. This takes care of part of the name problem.
  • Rule 2: Instead of using #loc for locations, use “at”. It’s much more natural and not much more difficult to parse.
  • Rule 3: Verbs are actionable. Not syntactic verbs, but English (or French or Haitian Creole) verbs. It’s a trivial task to populate a tool with a dictionary to detect all word forms correctly.
  • Rule 4: Anything that cannot be parsed ends up as the equivalent of the #info tag (see EPIC syntax).
  • Rule 5: The entire text of the tweet should always be available to a human, so whatever information was incompletely parsed can be understood manually, and optionally added to the parsed version by a human.

The general aim is to require as little syntax knowledge as possible, and to keep as close as possible to the natural way people write tweets.

Examples

TWEET-BEFORE: Sherline Birotte aka Memen. Last seen at 19 Ruelle Riviere College University of Porter a 3 story schol building
TWEET-AFTER: #haiti #ruok #name Sherline Birotte aka Memen. Last seen #loc 19 Ruelle Riviere College University of Porter #info a 3 story schol building
Simplified Microsyntax: #haiti #rescue Looking for Sherline Birotte aka Memen. Last seen at 19 Ruelle Riviere College University of Porter, a 3 story school building

This tells us:
What = Looking for someone.
Who = Sherline Birotte aka Memen (identified fuzzily based on initial capital letters)
Where = 19 Ruelle Riviere College University of Porter (automatically parsed based on “at”)
What else = “a 3 story school building” (i.e. everything else in the tweet)
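
The reading above can be approximated with a few regular expressions. The following Python sketch is only an illustration, not a tool used by EPIC or anyone else: the function name parse_tweet, the choice of #rescue as the actionable tag, and the rule that a name is two or more consecutive capitalized words are all assumptions made for this example.

```python
import re

# Hashtags start with a letter, so "6#12" in an address is left alone.
HASHTAG = re.compile(r"#[A-Za-z]\w*")
# Assumed heuristic: a name is two or more consecutive capitalized words.
NAME = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+")
# Rule 2: the location is whatever follows "at", up to a comma or period.
PLACE = re.compile(r"\bat\s+([^,.]+)")


def parse_tweet(text):
    """Split a tweet into tags, a probable name, a probable place, and the rest."""
    tags = [tag.lower() for tag in HASHTAG.findall(text)]
    body = " ".join(HASHTAG.sub(" ", text).split())
    place = PLACE.search(body)
    # Look for the name only before the location, so street names are skipped.
    before_place = body[:place.start()] if place else body
    name = NAME.search(before_place)
    return {
        "actionable": "#rescue" in tags,
        "who": name.group(0) if name else None,
        "where": place.group(1).strip() if place else None,
        "text": body,  # Rule 5: the full text always stays available to a human
    }
```

The heuristics are deliberately crude. A tweet with several "at" phrases, or a name written in lowercase, would need either better rules or a human reader, which is the role Rule 5 reserves for the untouched text.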

TWEET-BEFORE: Mirna Nazaire lives in P-A-P at Bizoton 6#12. Entire neighborhood without food. People are dying.
TWEET-AFTER: #haiti #need #food #name Mirna Nazaire lives in #loc PAP at Bizoton 6 #12 #info neighborhood w/o food. People dying
Simplified Microsyntax: #haiti #rescue Mirna Nazaire at PAP at Bizoton 6#12 needs food. Entire neighborhood without food. People dying.

This tells us:
What = needs food. (automatically detected from the verb in the sentence.)
What do they need = food (automatically detected from the object in the sentence.)
Who = Mirna Nazaire (heuristically determined from initial capital letters.)
Where = PAP at Bizoton 6 #12 (detected from microsyntax “at”)
What else = “neighborhood w/o food. People dying.” (Rest of the tweet, unfiltered.)

TWEET-BEFORE: French hospital is now open and ready to receive the wounded at the french lycee in rue marcadieux bourdon
TWEET-AFTER: #haiti #offering #med #loc french lycee in rue marcadieux bourdon #num 30+ #info French hospital is open and ready 2 receive wounded
Simplified Microsyntax: #haiti #rescue French hospital ready to offer help to 30+ wounded at the french lycee in rue marcadieux bourdon

This tells us:
What: Hospital. Also, something to do with medical efforts. (no need to tag explicitly, we can infer that from ‘hospital’.)
Where: The french lycee in rue marcadieux bourdon. (Automatically parsed from microsyntax “at”.)
How many people: 30+. (It’s already a number, no need to state “#num” explicitly.)

These are just a few suggestions. I will be contacting the PIs (principal investigators) of the EPIC project directly with some of my recommendations, but please continue to follow their syntax until they recommend anything different. The current syntax proposal isn’t perfect, but it is more important to avoid fragmenting the tagspace.

ReTweeting: Attribution for Discovery versus Attribution for Creation

25 Dec 2009

During the past few months, I have found myself consuming more news and articles via recommendations from friends and those I follow on Twitter than via traditional source-based subscription (e.g. subscribing to specific feeds or newspapers). Social media discovery is here, and the best part of reTweeted links is that they have already gone through a round of peer review by peers I trust.

Often, I’m tempted to reTweet that content myself, or post it to Facebook, or share it via Google Reader. A few of these media keep attribution intact (e.g. Google Reader adds the “Shared by” metadata for each person in the chain that shared the content.) Others such as Twitter are restricted by the length of the post, so the “RT @” list quickly gets too long and inevitably gets trimmed along the way.

But there’s no accepted practice for how this list should be trimmed. Should you keep the first Tweeter, even if that person is not the author of the content? (E.g. someone who read an NY Times article and tweeted about it.) Should you keep the last reTweeter, who was your direct link to the content in question? What about multiple Tweeters re-posting links to the same content, so it’s not a tree any more, but a forest of links (imagine a directed graph with edges denoting “shared by X to Y”).

The problem is that by including attribution about the process of discovery, we end up attaching higher value to discovery than creation. When someone reTweets a secondary source of information, attribution for the primary source is often trimmed away. This is especially bad for Creative Commons works that require attribution when re-posted, but is bad in general for any kind of work and for authors of that work.

I have come to the conclusion that although attribution for discovery is important, it’s hard to apply consistently in fixed-character-length media. It’s a completely different story in case of original content generated by the tweeter himself/herself: e.g. one-liners, or authors tweeting links to their (longer) content. Attribution for original content is vastly more meaningful than attribution for promoting someone else’s content (although the value of that act is substantial as well.)

So from now on, I will only attribute original content in my tweets and Facebook updates. My intention is not to discount the value of the source that shared the content with me, but instead to promote the original author of that content wherever possible.

HOWTO Use custom DNS redirects to save browser keystrokes

5 Dec 2009

Given the recent interest in DNS and its role in the public infrastructure of the Internet, sparked by the release of Google Public DNS, here’s a hack that can help you save keystrokes in the browser while accessing your favorite sites. Instead of typing in “youtube.com” or “twitter.com”, you can just type “y” or “t”. If you’re looking for a map of San Francisco, CA, you can type “map/sf” and jump to the right place in Google Maps.

A bright bold blinking marquee disclaimer before we start: this is advanced territory. If you don’t know what sudo is and why 127.0.0.1 is special, be careful following these instructions because you may unintentionally destroy your ability to do anything at all on the Internet — including looking up instructions for getting unstuck. Also, these instructions only apply to Mac OS X and Linux, or other UNIX variants.

Redirect custom DNS hostnames to frequently-accessed sites

The file /etc/hosts on your machine is consulted by the DNS resolver before making a request to a DNS server. The idea is to add new DNS entries to the hosts file on your machine, pointing short domains such as g and t to 127.0.0.1. Now, whenever you type g or t into your browser, the hostname will be matched from your /etc/hosts file, instead of receiving an NXDOMAIN reply (i.e., this domain does not exist) from an upstream DNS provider. Since this request is received by your own machine, you can then handle it to do whatever you want, including, but not limited to, redirecting the user to the intended destination.

This HOWTO assumes that Apache is installed and running on your system with PHP and mod_rewrite support.

Modify /etc/hosts

Open /etc/hosts in your favorite text editor, and add one line for each shortcut you’d like to set up. Leave everything else unchanged. (You will need to sudo edit this file.)

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1 localhost
127.0.0.1 c # for Calendar
127.0.0.1 f # for Facebook
127.0.0.1 g # for Google Search
127.0.0.1 m # for Mail
127.0.0.1 map # for Maps
127.0.0.1 t # for Twitter
127.0.0.1 w # for Wikipedia
127.0.0.1 y # for Yelp
127.0.0.1 yo # for YouTube

255.255.255.255	broadcasthost
::1             localhost
fe80::1%lo0	localhost

You can test that this change worked by typing the address (e.g. http://g/) into your browser. Instead of seeing a page that says that your browser “can’t find the server ‘g’”, you should now see a page saying that your server isn’t configured correctly, or welcome to Apache, or whatever you would see if you typed http://localhost/ instead. If that worked, proceed.
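
To see which short names a hosts file would answer locally, its lines can be read the way the file format describes: an address, then one or more names, with anything after # ignored. This Python sketch is an illustration only; parse_hosts is a hypothetical helper, and keeping the first address seen for a name is an assumption about resolver behavior rather than a guarantee.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the first address listed for it."""
    table = {}
    for line in text.splitlines():
        # Drop comments, whether they fill the line or trail an entry.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)
    return table


sample = """\
##
# Host Database
##
127.0.0.1 localhost
127.0.0.1 g # for Google Search
127.0.0.1 map # for Maps
::1             localhost
"""
shortcuts = parse_hosts(sample)
```

Any name missing from the resulting table would still go to the upstream DNS server and come back as NXDOMAIN.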

Configure Apache to handle requests for unknown domains/URIs

Edit the following lines in /etc/apache2/httpd.conf. The following code shows an excerpt with lots of context around the line you need to edit. Locate the relevant section in your file.


#
# This should be changed to whatever you set DocumentRoot to.
#
<Directory "/Library/WebServer/Documents">
    #
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.2/mod/core.html#options
    # for more information.
    #
    Options +Indexes +FollowSymLinks +MultiViews

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   Options FileInfo AuthConfig Limit
    #
    # Change the following line from "AllowOverride None" to "AllowOverride All".
    AllowOverride All

    #
    # Controls who can get stuff from this server.
    #
    Order allow,deny
    Allow from all

</Directory>

Locate your Apache root directory. It’s usually /Library/WebServer/Documents on the Mac or /var/www in Ubuntu. If you’re unsure, check where it is by issuing the following command in a terminal: (assuming you’re running Apache 2.x)

grep "DocumentRoot" /etc/apache2/httpd.conf

In that directory, save the following file. It should be named exactly .htaccess. (That’s htaccess with a period at the beginning, so it’s a hidden file on UNIX.) Save it as /Library/WebServer/Documents/.htaccess on Mac OS X or /var/www/.htaccess on Ubuntu.

<IfModule mod_rewrite.c>

RewriteEngine on
RewriteBase /

RewriteCond    %{REQUEST_FILENAME} !-f
RewriteCond    %{REQUEST_FILENAME} !-d
RewriteRule    (.*) /index.php [L]

</IfModule>

The actual redirection script

Here’s the script that I use for redirection, but you can roll out your own, and do anything with each request you receive. (If you do something phenomenally awesome, I’d love to hear about it in your comments.) As you can see, it’s customized to the sites I frequent, including location preferences (e.g. the Yelp shortcut takes me to Yelp San Francisco directly. The search box is preconfigured for SF.)

Save this as /Library/WebServer/Documents/index.php (on Mac OS X) or as /var/www/index.php on Ubuntu.

<?php
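  // Strip the leading slash, so that http://map/sf yields the search term "sf".
  $uri = preg_replace('/^\//', '', $_SERVER['REQUEST_URI']);
  // The shortcut typed into the browser arrives as the requested host name.
  switch($_SERVER['SERVER_NAME']) {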
    case 'c':
      redir('http://calendar.google.com/');
      break;
    case 'f':
      redir('http://facebook.com/');
      break;
    case 'g':
      redir('http://www.google.com/search?q=' . $uri);
      break;
    case 'm':
      redir('http://mail.google.com/');
      break;
    case 'map':
      redir('http://maps.google.com/?q=' . $uri);
      break;
    case 't':
      redir('http://twitter.com/');
      break;
    case 'w':
      if ('' === $uri) {
        redir('http://en.wikipedia.org/wiki/');
      } else {
        redir('http://en.wikipedia.org/wiki/Special:Search/' . $uri);
      }
      break;
    case 'y':
      if ('' === $uri) {
        redir('http://yelp.com/sf/');
      } else {
        redir('http://yelp.com/search?ns=1&find_loc=San%20Francisco,%20CA&find_desc=' . $uri);
      }
      break;
    case 'yo':
      if ('' === $uri) {
        redir('http://www.youtube.com/');
      } else {
        redir('http://www.youtube.com/results?search_query=' . $uri);
      }
      break;
  }

  // Send the browser to the destination and stop processing this request.
  function redir($url) {
    header('Location: ' . $url);
    exit;
  }
?>

That’s it. Now type your shortcuts into your browser instead of the longer URLs, and there you are. If you run into trouble, leave a comment and I’ll address it.
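
For readers who prefer a lookup table to a long switch statement, the same dispatch idea can be sketched in Python. This is a variation for illustration and not a port of the script above: SHORTCUTS and target_url are hypothetical names, only four of the shortcuts are shown, an empty search term goes to the site home page, and the path is assumed to have been percent-decoded already.

```python
from urllib.parse import quote

# Hypothetical table: shortcut host -> (home page, search URL template or None).
SHORTCUTS = {
    "t": ("http://twitter.com/", None),
    "g": ("http://www.google.com/", "http://www.google.com/search?q={}"),
    "map": ("http://maps.google.com/", "http://maps.google.com/?q={}"),
    "yo": ("http://www.youtube.com/",
           "http://www.youtube.com/results?search_query={}"),
}


def target_url(host, path):
    """Return the redirect target for a shortcut host, or None if it is unknown."""
    entry = SHORTCUTS.get(host)
    if entry is None:
        return None
    home, search = entry
    term = path.lstrip("/")
    if not term or search is None:
        return home
    # Encode the term so that spaces and symbols survive inside a query string.
    return search.format(quote(term, safe=""))
```

Adding a shortcut then means adding one row to the table, plus the matching line in /etc/hosts.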

The only downside of this approach

Redirecting involves an additional HTTP request to your machine, which introduces additional latency. The request, however, is from your machine to itself, so no network is involved. Personally, I feel that typing out the keystrokes this technique saves would take longer than the extra request does. And you don’t lose anything if you set this up and don’t use it: just continue to type entire URLs and you will never pay a latency penalty.

On-the-fly CSS Compression in PHP

8 Sep 2009

Web site optimization experts suggest that webmasters try to minimize the number and size of HTTP requests necessary to serve web pages. Web designers often use multiple CSS files because they are easier to manage, but this requires as many HTTP requests as there are CSS files.

This free script serves all your CSS files as a single HTTP resource, minified (by removing comments and extraneous whitespace), and gzip-compressed. It also asks browsers to cache the CSS content for at least a day before trying to fetch a new version.

The best part is that this does not pre-process the files, so it does not add any steps to your deployment process. It’s licensed free for commercial and non-commercial use, with attribution requested.
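
The steps described above (combine, minify, compress, cache) can be sketched in a few lines of Python. This is not the PHP script itself, only an illustration of the technique: minify_css and build_response are hypothetical names, and the minifier is naive (it would also alter braces or semicolons that appear inside quoted strings).

```python
import gzip
import re


def minify_css(css):
    """Remove comments and extraneous whitespace from CSS text (naive)."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # strip /* ... */ comments
    css = re.sub(r"\s+", " ", css)                   # collapse runs of whitespace
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)      # no spaces around punctuation
    return css.strip()


def build_response(sheets, max_age=86400):
    """Combine stylesheets into one gzip-compressed body plus response headers."""
    combined = minify_css("\n".join(sheets))
    body = gzip.compress(combined.encode("utf-8"))
    headers = {
        "Content-Type": "text/css; charset=utf-8",
        "Content-Encoding": "gzip",
        # 86400 seconds is one day, matching the caching goal described above.
        "Cache-Control": f"max-age={max_age}",
    }
    return headers, body
```

A real server would also check the request’s Accept-Encoding header before sending a gzip body; that check is left out here to keep the sketch short.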

The query: Protocol

1 Sep 2009

Update: I implemented this idea at http://queryprotocol.appspot.com. Comments, questions, and suggestions are welcome!

When trying to explain a concept to others over email, I often find myself linking to a search engine’s result pages for a specific query, instead of a single destination URL. These are non-navigational queries, and there is no single result that I expect to be the most important one. Instead, my intention is to provide readers with a variety of links on the topic so that they may draw their own conclusions or solve their own problem; all they need is a nudge towards the right query term to use. If, over time, better search results are available for the same query, then future readers get the benefit of automatically updated results.

E.g. Q: Where can I find the latest numbers related to the spread of the Swine Flu?
A: Try [H1N1 update].

To do this today, I simply link to my favorite search engine, Google. But that does not seem fair to fans of other search engines: Bing, Yahoo!, Altavista, and others. I would prefer to use a notation that allows the reader to use their choice of search engine to obtain the results. Just as we specify our default browser and default email client, we should be able to pick our default search engine.

We have already solved the first two problems (picking default browsers and email clients) using protocol handlers in the operating system. When I pass around a link to a web page, starting with http://, I do not specify the browser it should open in. Your operating system determines that it’s a link to a hyper-text transfer protocol (HTTP) document, and invokes your default browser. Similarly, for emails, the mailto: protocol provides for an application-agnostic way to invoke the user’s default email client to send an email.

It is easy to see how a query: protocol could be implemented similarly. To point you to the search results for a particular term, I would send you the following link: (don’t click on it, it won’t work — at least as of this writing.)

[h1n1 update]

The URL that the above links to is query:h1n1+update. Note there’s no HTTP protocol marker specified. If the OS wanted, it could provide local results as well. This means that the protocol extends seamlessly to Desktop Search as well.

Syntactically, this validates as a URI. Just as the mailto: protocol handler defines standard parameter names, subject, cc, and bcc, similar parameters can be standardized for the query: protocol. These may include corpus restricts (corpus={web, images, desktop, ...}), pagination controls (start=0, num=10), or domain restricts (site=manas.tungare.name).

Implementation is simple: all operating systems and major browsers support external custom protocol handlers. They can be configured as follows:

Protocol Prefix: query
Application Name: /Path/to/Application

The application does not need to be very complicated. It’s a mere stub, which, depending upon the user’s preferred search engine, converts a URI of the form query:h1n1+update to http://google.com/search?q=h1n1+update or http://bing.com/search?q=h1n1+update and opens that link in the user’s default browser.
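
A minimal sketch of such a stub, in Python, might look like the following. The ENGINES table and the function name resolve_query_uri are assumptions for illustration; the two URL prefixes are the ones mentioned above, and a real handler would read the preferred engine from a user setting.

```python
# Search URL prefixes for the engines named in the post; the table itself
# and its keys are hypothetical.
ENGINES = {
    "google": "http://google.com/search?q=",
    "bing": "http://bing.com/search?q=",
}


def resolve_query_uri(uri, engine="google"):
    """Turn a query: URI into a search URL for the user's preferred engine."""
    scheme, separator, terms = uri.partition(":")
    if separator != ":" or scheme.lower() != "query":
        raise ValueError("not a query: URI: " + uri)
    # The terms are already URL-encoded (h1n1+update), so pass them through.
    return ENGINES[engine] + terms
```

The handler registered with the operating system would then hand the returned URL to the default browser, for example with Python’s webbrowser.open.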

Eventually, if browsers understand the query: protocol, there is no need for the stub application, and users may be able to share and exchange queries and yet seek results using their favorite search engines.

(The opinions expressed in this blog post are solely my own, and may not reflect the opinions of my employer, Google.)

My custom LaTeX styles

15 Feb 2009

By popular demand, my custom LaTeX styles are now available for download. (All of them have been dedicated to the public domain, I disclaim all copyright.)

There are four for now, but the set will grow. The first lets you use OpenType fonts (which is pretty much any system font); the limitation is that this works only with a single distribution, XeTeX, available only for Mac OS X, and a new experimental build for Linux. A second style typesets all chapter titles, sections, subsections and subsubsections in a sans-serif font (instead of the default serif). It is very effective when used in conjunction with the OpenType style.

The final two adjust margins to require less paper for draft prints and such. Be good, be green.

One-button Phone Number Sharing

10 Feb 2009

Send this Phone Number to the Current Caller

How often have you found yourself calling a friend to get the phone number of a mutual friend? And then having to hold the phone while your friend pulls up the contact list on their phone, then recites the number to you, and then you write it on paper because your phone won’t let you add contacts while you’re on a call, and then you misplace the number you wrote on paper, ad nauseam. Why isn’t there a single button that says “Send this Phone Number to the Current Caller”?

It’s a common problem. You’re out and about, and realize you need to call a specific person, but you don’t have their phone number (or more often, you have it on your desktop computer, or your laptop, but that doesn’t do you any good in the current situation.) So you decide that the best thing to do is to call a mutual friend and ask them.

When they receive a phone call from you, they’re fumbling to hold the call while they look in their address book. (That is, if they’re lucky, and if their phone actually lets them open the contact list while they’re on a call.) More often, what happens is that they tell you to hang up while they consult their address book. And then you have to hunt for a piece of scrap paper because your phone won’t let you add a number to the list like that.

What the world needs is a button next to each phone number in the contact list that only appears whenever you’re on a call. The button, when pressed, sends an SMS from you to the current caller, and contains within it the information from the contact record you just selected. It doesn’t have to be too fancy, a two-line VCF record should do nicely.

If the recipient’s phone understands this method of contact transfer, it can prompt the user and import it automatically. If not, the user can still read the SMS herself, and dial the number. No more paper, no more fumbling, no more “let me call you back”.
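
To get a feel for how small such a message would be, here is a Python sketch that builds a contact record and checks that it fits in a single SMS. Everything in it is illustrative: the function names, the sample contact, and the choice of a six-line vCard 3.0 layout (an assumption made here to include the begin, version, and end lines; the post only asks for something very short). The 160-character limit applies to messages in the basic GSM alphabet.

```python
SMS_LIMIT = 160  # characters in one SMS using the basic GSM 7-bit alphabet


def contact_vcard(name, phone):
    """Build a small vCard-style record for one name and one phone number."""
    given, _, family = name.rpartition(" ")
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{family};{given};;;",
        f"FN:{name}",
        f"TEL;TYPE=CELL:{phone}",
        "END:VCARD",
    ]
    return "\r\n".join(lines)


def fits_in_one_sms(text):
    """True if the text is short enough for a single message."""
    return len(text) <= SMS_LIMIT


# Hypothetical contact, for illustration only.
card = contact_vcard("Jane Doe", "+15550100")
```

A phone that does not recognize the record can still display it as plain text, and the TEL line remains readable to a human.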

It’s so easy, a caveman could do it. If only phones implemented it!

Email should have Expiration Dates

2 Nov 2008

The entire idea behind this blog post has been summed up in the title, so all I need to do now is to explain why I think email should have expiration dates, and how that would make personal information management better.

Email, as we all know, started off as a way of sending short messages to colleagues within a department. It has since evolved into a monster of a tool that does everything it was never designed to do. The paradox is that it is exactly the kinds of messages that email was designed to handle that cause me the most trouble these days.

  1. I often receive email from my friends about meeting up for lunch. This is important, but only for that particular day (and that too, if I receive it before lunch time).
  2. My research collaborators send me email when a paper submission deadline is near, with the draft attached to it. Those emails are not nearly as important after the deadline.
  3. My friends and I exchange travel plans over email, but is it as useful after the trip is done?

These are the kinds of messages I’m talking about: important but time-sensitive. Then there are others which are not really important, but simply one-time notifications that I can take action on and then forget (“bill is due in 2 days”, “X added you as a friend”, “your order was received”, “your package has shipped”, “free donuts in break room”, “we are not meeting today”, etc.)

Why do they linger on in my mailbox for years? They become indistinguishable from the really important email that I need to save for years, such as some very interesting and intelligent discussions I have had with others. Note that I’m not including spam in this discussion, because in my opinion, there are adequate spam-filtering tools circa 2008 that perform well enough for most users for the most part with an acceptable false positive rate. Not perfect, but acceptable.

The Keeping Problem

Email is no longer ephemeral: people hold on to their email for years. This is what results in the Keeping Problem in Personal Information Management: there is so much information coming at us that we don’t want to spend the time to decide what to keep and what to trash, so we end up keeping all of it. We hope we never have to do spring cleaning, and instead rely on search to find what we want.

Filing is not the answer

Many people file and tag their email, but is the cost of doing so (time as well as attention) worth the payoff at the end? Consider the two alternatives: spending 10 minutes each day filing your email, versus spending an hour a month looking for that one email. When you are swimming in a sea of email that shows no signs of abating, the second alternative soon starts looking better.

Same needle, bigger haystack

The bigger the haystack grows, the harder it is to find the needle. The solution is to reduce the size of the haystack. Automatically. Most other solutions empower the user to filter, sort, file, tag and do other sorts of things to their email that do not scale very well. That’s where Email Expiration Dates come into play. For it to work, they need to be (1) defined and (2) honored.

Defining an Email Expiration Tag

Email expiration tags can be defined in several ways by several entities that handle the email message at some point of time in transit.

  1. By the sender of that email who cares about the recipients;
  2. By the email client (MUA) used by the sender, automatically inferring from certain common-sense words; e.g. subject contains lunch and body is less than 100 bytes;
  3. By the email server software that intelligently tags email based on common patterns seen across multiple users;
  4. By the recipient’s email client, based on heuristics;
  5. By the recipient’s email client, based on a user-defined rule set;
  6. Or explicitly by the recipient in a spring cleaning session.
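
The common-sense inference in item 2 can be written down as a small rule. This Python sketch is an illustration under stated assumptions: infer_expiry is a hypothetical function, and expiring a lunch message at the end of the day it was received is one possible reading of “important only for that particular day”.

```python
from datetime import datetime, time


def infer_expiry(subject, body, received):
    """Guess an expiration time for a message, or return None if unsure."""
    is_short = len(body.encode("utf-8")) < 100
    if "lunch" in subject.lower() and is_short:
        # Assumed rule: a short lunch message expires at the end of that day.
        return datetime.combine(received.date(), time(23, 59, 59))
    return None
```

Returning None for everything else keeps the rule conservative: a message without an inferred date simply never expires.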

Honoring an Email Expiration Tag : Fully standards-compliant

RFC 822 allows custom tags (Sec. 4.7.5). These are commonly referred to as X- headers, since the specification requires that all such tags be prefixed with “X-”. Many applications built on email make use of such tags: mailing lists use the X-List-* headers to specify the list name, subscribe URL and unsubscribe URL in a mail message. Spam filtering software such as SpamAssassin assigns a score to each email, saved as an X- header. Mail clients are free to interpret these tags as they see fit.

An expired email will not be automatically deleted if the user does not want it to be. This is important for archival purposes and to satisfy the stringent reporting requirements of the Sarbanes-Oxley Act. But now the user can make a one-button choice about whether expired emails are deleted, archived, moved away or kept around.
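
Here is a minimal sketch of how a mail client might define and honor such a tag, using Python’s standard email package. The header name X-Expiration-Date is hypothetical (no such header is standardized in this post), as are the function names and the policy strings “delete”, “archive” and “leave”.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime, parsedate_to_datetime

EXPIRY_HEADER = "X-Expiration-Date"  # hypothetical header name


def set_expiry(msg, when):
    """Stamp a message with an expiration time (a timezone-aware datetime)."""
    msg[EXPIRY_HEADER] = format_datetime(when)


def is_expired(msg, now):
    """True only if the message carries the tag and its time has passed."""
    raw = msg[EXPIRY_HEADER]
    if raw is None:
        return False  # untagged mail never expires
    return parsedate_to_datetime(raw) <= now


def action_for(msg, now, policy="leave"):
    """Apply the user's chosen policy to expired mail; keep everything else."""
    return policy if is_expired(msg, now) else "keep"


donuts = EmailMessage()
donuts["Subject"] = "free donuts in break room"
set_expiry(donuts, datetime(2008, 11, 3, 17, 0, tzinfo=timezone.utc))
```

Because the decision lives in action_for rather than in the message, the same expired email can be deleted by one user and archived by another.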

With help from legitimate bulk email senders (not spammers)

Bulk mail such as Facebook notifications could have expiration dates set to “one week after receipt”. Bill reminders could set the expiration date to be “2 days past deadline” (and then send another notification if payment is not received by then.) Donut announcements could expire at the end of the day. Talk announcements could expire at the end of the talk.

Fixing the post-vacation blues

Returning from a vacation is no longer refreshing, as we are thinking about the sheer volume of email we need to process once we get home. If I was on vacation when the donuts were on the table, I should not be bothered about it when I return. Go away! If it’s an invitation to a talk that happened while I was away, I don’t need to hear about it now.

What will it take for adoption?

Defining a standard is no use if it isn’t used. The best way for such a solution to be adopted is for a major email provider to implement it themselves, perhaps in a limited beta. On the interface side, this requires two additions: one for sending, one for processing received messages. The widget at the sender’s end is simply a calendar picker, or a drop-down with relative dates (“tomorrow”, “next week”, etc.) At the receiving end, it’s a three-way radio button that lets users “Delete”, “Archive” or “Leave alone” expired messages.

Till then, it’s back to manual spring cleaning. Oh well.

Acknowledgments: I have had several stimulating discussions with my advisor, Manuel Pérez-Quiñones, and my colleague, Pardha Pyla, about our respective email filing strategies (discussions that mostly began as venting sessions). This idea no doubt borrows from my analysis and conclusions based on some of those conversations.

TGIF (apparently) Works!

27 Feb 2008

It’s no secret that Google hosts an employee-only event every Friday where we get to talk to Larry, Sergey and Eric directly (though the contents of each TGIF session are confidential.) In June, I walked up to the mic and asked them why Google wasn’t the default search engine in Opera Mini, the #1 mobile browser. It used to be #1 at that time; today, perhaps Mobile Safari has taken over that spot.

Today, I heard that this has happened. Seems like a good thing that they actually take feedback from interns seriously, and/or maybe I’m taking too much credit. :) In any case, I’m happy, because the older Opera Mini didn’t even let you customize the home screen to pick a search engine. I hope that has changed too (user choice is good.)

(Note: Both snippets mentioned here are public news; nothing confidential was released in the making of this post.)

Separating Phone Numbers from Phones

17 Jan 2008

Last night, I left my cell phone in my car. As with most of my follies, I realized it a few oh-no-seconds after I got home, but only after I’d taken off my jacket, gloves, cap, shoes and socks. It was an unnecessary walk in below-zero temperatures, but it got me thinking about phones, identities, what’s wrong about it all, and how it could be made better.

The problem is this: phones and phone numbers are tightly coupled together [1]. No wonder people keep their phones close to their heart — their personal identity is locked in it. If I don’t carry my phone, there’s no way to answer calls that I receive at that phone number. I can perhaps check voicemail from another phone, but still cannot make and receive phone calls under my own phone number.

Now compare this to email: if you go on a vacation without your own laptop computer, it is still possible to “borrow” someone’s random computer and check your messages. The messages you send will have your ID (your email address) attached to them, and the people you interact with will have no idea what machine you used (and there is no need for them to know.)

Why can’t we have a phone identity (our phone number) separate from the device (our phone) that is used to access it? If I forget my phone in the car overnight, I should be able to just add my phone identity to the home phone. That way, all calls that would have been received by my handset in the car will now be received at my home phone, and callers/callees will not know a thing. The next morning, I would re-establish my identity on my cell phone, and things will be back to usual.

I’m not a big fan of call redirects: that puts a temporary bandage on the problem instead of actually solving it. I don’t want my identity routed to another identity: I want to be able to use my own identity wherever.

This would also open up the market for multiple-identity phones. A couple can add both their identities to a single home phone in the evening, while they carry individual cell phones during the day. Forgot your cell phone at home? No problem, just borrow a loaner phone from the office receptionist and use it all day long (just as you would borrow a loaner security badge if you forgot yours). It would also make it easy for a group of people to be able to respond to a single phone call, e.g. despatch services for emergencies. A group of doctors could share a single phone number. Whoever is on emergency call duty would add the group phone number to his/her cell phone, and remove it after the duty ends.

Historically, a phone number has been tied to a phone, mostly because of technical constraints, beginning with the days of the human-operated telephone exchange. Email has shown that identities (email addresses) can be independent of devices (computers), that many identities can share a device, and many devices can be used by a single identity.
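To make the many-to-many idea concrete, here is a minimal sketch in Python of a registry that lets one number live on several devices and one device carry several numbers. Everything in it (the `IdentityRegistry` class, its method names, the sample numbers and device labels) is hypothetical and only illustrates the data model; it says nothing about how a carrier would actually implement this.

```python
from collections import defaultdict


class IdentityRegistry:
    """Toy many-to-many mapping between phone numbers and devices."""

    def __init__(self):
        self._devices_for = defaultdict(set)  # number -> devices it rings on
        self._numbers_on = defaultdict(set)   # device -> numbers it answers for

    def attach(self, number, device):
        """Add an identity (phone number) to a device."""
        self._devices_for[number].add(device)
        self._numbers_on[device].add(number)

    def detach(self, number, device):
        """Remove an identity from a device; a no-op if it was not attached."""
        self._devices_for[number].discard(device)
        self._numbers_on[device].discard(number)

    def ring(self, number):
        """Return every device that should ring for a call to this number."""
        return sorted(self._devices_for[number])

    def identities(self, device):
        """Return every number currently attached to this device."""
        return sorted(self._numbers_on[device])


registry = IdentityRegistry()
registry.attach("555-0100", "my-cell")
# Phone left in the car overnight: add the same identity to the home phone.
registry.attach("555-0100", "home-phone")
# A second person shares the home phone in the evening.
registry.attach("555-0199", "home-phone")
```

The next morning, `registry.detach("555-0100", "home-phone")` would restore the usual arrangement, and the shared doctors' number from the earlier example is just one number attached to whichever cell phone is on duty.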

It’s an easy conceptual step forward to move to the many-to-many model instead of the current one-to-one. But there is a tremendous amount of change required of the infrastructure, and it won’t be cheap. Since I don’t happen to be in the business of implementing it (at least not yet!), I’ll just write about this idea and hope that someone picks it up. Maybe someone will listen, and like it, and implement it.

Then I won’t have to walk out in the $#@*%$#^ snow to fetch a %$#%#$* cell phone.

[1] The more pedantic among us will point out that GSM phones keep the user’s identity on a SIM card, and CDMA phones maintain a single ID tied to the hardware serial number (ESN or MEID) of a phone. Although possible, swapping identities across phones is not easy in either case: in the first, you must have your current phone handy, which does not help solve my problem of having left the phone in the car overnight, and the second requires a long phone call to the carrier to make the change. Neither is as quick or handy as the method I envision.

How do I eat Pringles chips out of a can?

29
Oct
2007

I ask you, the blogosphere, to enlighten me on the best way to eat Pringles that does not involve a bowl. The Pringles can is one of the iconic designs of modern times — uniformly-shaped potato chips in a tube — that seems to value form over function.

Let’s admit: eating chips is a secondary task for most Americans. These are snacks people munch on when they’re doing other things. Thus, these chips should be easy to grab with one hand and have the other hand free for the television remote, steering wheel or keyboard/mouse. At the same time, it is important that chips don’t spill, or worse yet, crumble in your hand. So what’s the best way to eat them without needing a bowl? (Using a bowl would just be weaseling out of this problem into one already solved in The Textbook.)

The first few chips are easy. (Isn’t that the case with everything? :) ) They’re within the grasp of your fingers, so it’s no different than plucking a few chips from a bag. It’s after the top few disappear that the problem starts. Should I force my hand into the can? Should I invert the can so the chips fall out into my hand? Should I tilt the can ever so slightly and tap on the side to have the chips exit one by one instead of stampeding all over themselves?

I’ve tried to dig in with my hand to get to the next few, but my hand is too big to fit inside the can, and it’s probably not a good idea anyway. I shudder to think of the day I’m in an Emergency Room with a Pringles can wrapped around my wrist, with $200/hour doctors cutting off an embarrassing roll of cardboard from the one organ that distinguishes men from apes. No, excavating anything but the top few is a job for professional archaeologists.

I’ve tried inverting the can with the lid on, so (I hoped) the chips would all accumulate on the lid, and then I could simply open it up and eat a few. The problem is, the quantum stable state for potato chips is a pile of crumbs. Inverting the can gets all the crumbs to the bottom of the can, and when the lid is opened, that’s what comes out first.

I’ve tried tilting the can at a precise angle and knocking on the side until the top few chips make their way slowly out the door. This sometimes works, but takes a long time, and very skillful knocking/tapping/flicking to get the right number of chips out of the can. Often, you’ll spend five minutes tapping unsuccessfully, then, in a burst of frustration, you’ll tap just a little bit harder, and have Pringles rain upon you. No go.

Dear Mommy taught me to search the Web before posting random questions to total strangers, so I did my homework. Here’s an innovative method of eating Pringles, but I’m no chopsticks ninja. And eating chips with chopsticks vaguely reminds me of the Seinfeld episode with George eating Snickers with a knife. You get the point, sort of.

So my question to you is, what’s the best way you’ve found to eat Pringles out of a can without spilling any crumbs, using a minimum number of hands to do it? A second, deeper, question, from my obvious position as a design and HCI person is, why has such a design resisted change over so many years despite being so hard to eat from?

Press Coverage of my Intern Work at Google

7
Oct
2007

It’s been exactly a month since my feature launched on Google Books. I went on an ego-surfing trip to see who had covered it. Here’s what I found.

The Case for Decentralized Social Networks

3
Oct
2007

This article was originally written October 3, 2007 and published here before OpenSocial was announced. With this blog post, I’m moving it to my blog to avail of features such as commenting and cross-linking with other related posts. I have not edited it since the original writing; if I do, edits will appear as updates marked as such.

Social networks are currently walled gardens: you need an account on multiple social networking sites to be able to interact with all your friends. This article makes a case for opening up the core protocols that define person-to-person interaction (decentralized networks) and various aspects of your public personality (decentralized applications). It is possible to use a few well-known semantic Web protocols and microformats to break down the walls and make the Internet a true social network.

Social networking Web sites are currently walled gardens. If you’re on MySpace, you cannot communicate with Facebook users or Orkut users. Although the features provided by most sites are comparable, if not equivalent, one must have an account on each of these sites to interact with members from that community.

The Motivation

That is not how social networks work in the real world. I do not need to be a citizen of a country or a follower of a religion to converse with the members of that country or religion, respectively. OK, this is a far-fetched analogy, but consider email.

The Evolution of Email

Before email as we know it today was in widespread use, the earliest way to send a message to anyone using a computer was simply to drop a file in their home directory. You could thus only send a message to users of the same computer as you. Ray Tomlinson came up with the idea of addressing users using the “@” sign, so email could be sent across computers. In the opinion of Jon Postel, this was a nice hack that finally evolved into an IETF RFC.

Today, we are able to address email to anyone on any network that’s connected to the Internet. Their ISP, or operating system, or mail transmission agent (MTA) or mail user agent (MUA; commonly referred to as a mail client) has no bearing on whether they will receive our email or not. The diversity in the email ecosystem allows me to receive, download and view my email in exactly the way I want.

Fast forward to Instant Messaging

Instant messaging evolved similarly, with ICQ, AOL, Yahoo!, and Microsoft all developing their own functionally equivalent but non-interoperable protocols for essentially the same task. You had to have an account with each of those providers to be able to talk to their users. Along came XMPP and Jabber, followed by the development of an IETF RFC for instant messaging, which has now found support in commercial products such as Google Talk. XMPP does not require users to have accounts on multiple servers; if you have an account on one XMPP server, you can chat with any user on any other XMPP network (provided other prerequisites such as authorization are met.)

Why this makes sense for Social Networking

Social networking is no longer one of the fringe activities on the Web. There are several Web sites that purportedly do the same thing (and I’m too lazy to list them all.) The point is, social networking is now becoming a conduit rather than a destination. Much of our online time is spent on social activities, and the importance of individual users and their individual contributions taken together is increasing.

So what would it look like?

A decentralized social network would let users sign up at whichever Host site they prefer (just as you can sign up with any email provider today.) They would be able to participate and interact with users of any other such Host site, with no additional signing up to do. They would be able to create a profile that best reflects their motivation in signing up: a college student may sign up at a Host that allows him to display his classes and academic interests, while a professional may choose a Host site that emphasizes her skills and experience. Applications running on any Host site will have access to the Friends List of the user account they are running under, even across Host sites. A user’s profile may even be fragmented across multiple Hosts, with each Host hosting a particular aspect, or type of content for the user.

Advantages for Users

  1. You don’t have to sign up at multiple sites.
  2. You can choose a Host site that best suits your personality. If you like a casual, “explosion-in-a-media-factory” look, sign up for MySpace. If you prefer a more professional look, go for Facebook. If you want to expose more professional data than personal data, LinkedIn is your Host. If you would like to express your affiliation with your company or non-profit, use their Host site as your primary home.
  3. If none of these suit you, just roll out your own Host site that hosts exactly one profile: yours. That won’t prevent you from being part of the larger social network.
  4. Since individual Hosts manage their users’ profiles, privacy can be controlled better. You will retain the choice to pick a service that best matches your privacy expectations.

Advantages for Developers

  1. When a new social networking site announces their own API and protocols, developers won’t have to scamper to port their existing app to it. They simply continue hosting it themselves, and the newcomer Host will simply talk the same language out-of-the-box.
  2. Developers will also have the flexibility of tailoring their interface whichever way they want — they will not be required to adhere to the strict interface guidelines of individual sites.
  3. Those who rely heavily on eyeballs and advertising can continue to host their own content, not subject to a third party’s terms of service.
  4. Expert users may choose to develop apps for themselves. These one-offs will be easy to integrate with that particular user’s profile.

Advantages for Hosts

  1. Host sites will be able to position themselves in a market better, and distinguish themselves from other offerings in a better way than existing sites can.
  2. The network effect will no longer be the dominant reason for users picking one social networking site over another, and such sites will have to compete on real features and good design, rather than simply “because all my friends are here”.
  3. Closer ties between users and hosts will enable them to tailor their services to the particular class of users they attract.

A Change in Philosophy

A few things will need to be re-thought, because, in a decentralized network, there are better alternatives to existing ways of doing things.

Disseminating and aggregating specialized content

We have seen specific websites and companies excelling in managing different types of data. Flickr specializes in photos, YouTube in videos, Blogger in blog posts, Twitter in one-line “twits”. All-purpose sharing websites such as Facebook also let you upload and embed all these types of content. What is the use of duplicating this content on each individual social networking site?

In a decentralized network, my blog could stay at Blogger, my photos on Flickr, and my videos on YouTube. My personal profile is simply an aggregation of these multiple aspects of my personality. What’s more, to design my own profile, I could just pick and choose the “modules” I want from a palette of available syndication options. (In fact, my own website is already designed like that: content you see here is aggregated from Twitter, Flickr, and FeedBurner, plus a few hosted pages.)

Developers can concentrate on what they do best, and outsource the rest of it to experts in individual areas. Photo album designers will not have to reinvent Flickr, and video distributors can simply leech YouTube’s bandwidth for their hosting.

Profile information can be mashed up

A user’s profile information can easily be mashed up for quick one-off applications. For example, if I need to create a list of all my friends from a particular group to print greeting cards, I do not need to write an application, submit it to Facebook and wait for their approval. I simply deploy it to my own Host’s server and get done in the time it takes to write “SELECT * FROM Friends WHERE Group = ‘christmas-cards’” (oh, and I would totally pick a host that provides a SQL interface for social data!) I can have an address book that integrates with my web-based email client, that maintains an updated list of email addresses of all my friends, of course pulled from their individual profiles.
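As a rough illustration of how small such a one-off could be, here is a sketch using Python's built-in `sqlite3` module against an in-memory table. The `Friends` table, its columns, and the sample rows are all made up for the example; no Host is known to expose this schema. One practical wrinkle: `Group` is a reserved word in SQL, so the column name has to be quoted.

```python
import sqlite3

# Hypothetical schema standing in for a Host's SQL view of my friends list.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Friends (name TEXT, email TEXT, "Group" TEXT)')
conn.executemany(
    "INSERT INTO Friends VALUES (?, ?, ?)",
    [
        ("Alice", "alice@example.org", "christmas-cards"),
        ("Bob", "bob@example.org", "coworkers"),
        ("Carol", "carol@example.org", "christmas-cards"),
    ],
)

# The one-line query from the text, with the reserved word quoted and the
# group name passed as a parameter instead of being pasted into the SQL.
rows = conn.execute(
    'SELECT name, email FROM Friends WHERE "Group" = ? ORDER BY name',
    ("christmas-cards",),
).fetchall()

card_list = [name for name, _email in rows]
```

The same query result could feed the address book mentioned above, since each row already carries the friend's current email address.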

Rethinking privacy and authorization

Authentication is easy (we’ll look at that soon.) Authorization is hard. But this is a problem that should be easy for public-key cryptography to solve. I’m not a cryptography expert, so anything I say here will be wrong. But I trust that if the experts put their mind to this, it shouldn’t be too hard to solve without requiring Alice, Bob and other alphabet-soup-inspired characters to make all their keys public.

Enhanced Search

To some, this may sound like a gross invasion of privacy, but in fact, deciding what information should be public, and making that publicly-accessible information searchable, are two different problems. Privacy gate-keepers at each Host will decide what content to make publicly accessible. Once that decision has been made, all the major search engines can index the public information (without having access to any of the private stuff.) Google made the Web searchable. A search engine for The One Social Network will make the world’s population searchable.

The Ground Work

A quick analysis of what’s required to make this happen makes us realize that much of the groundwork has already been laid.

Representing People and Relationships

The chief contribution of the recent boom in social networking is the recognition of the Person as a first-class entity on the Web. Earlier, the only way to represent a person on the Web was via her home-page. But that, too, was a static representation, largely disconnected from the activities and evolution of that person.

A recent push towards including semantic markup in Web pages has led to the development of microformats, a light-weight method of marking up entities within Web content in terms of loosely defined formats that do not interfere with the already-existing presentation duties of HTML. There is the hCard microformat defined for representing a person. The XHTML Friends Network establishes a format for indicating relationships among individuals on the Web. A lot of users and Host sites have made their pages XFN-Friendly, i.e., they have added semantic markup to the lists of their friends to indicate relationships.
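Because XFN puts relationships in the `rel` attribute of ordinary links, reading a friends list needs nothing more than an HTML parser. The sketch below uses Python's standard `html.parser` module; the `XFNParser` class and the sample markup are invented for illustration, and the code does not check that each `rel` token is a valid XFN value (a real page may also carry unrelated values such as `nofollow`).

```python
from html.parser import HTMLParser


class XFNParser(HTMLParser):
    """Collect (href, relationships) pairs from links that carry a rel attribute."""

    def __init__(self):
        super().__init__()
        self.friends = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        if rel and href:
            # rel holds space-separated relationship values, e.g. "friend met".
            self.friends.append((href, rel.split()))


# Hypothetical friends list as it might appear on a profile page.
SAMPLE = """
<ul>
  <li><a href="http://alice.example.org/" rel="friend met">Alice</a></li>
  <li><a href="http://bob.example.org/" rel="colleague">Bob</a></li>
  <li><a href="/about">About this site</a></li>
</ul>
"""

parser = XFNParser()
parser.feed(SAMPLE)
```

After the call to `feed`, `parser.friends` holds one entry per marked-up link, and the plain "About" link is ignored because it has no `rel` attribute.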

Representing Activities

Blogs and twits have emerged as easy ways for people to broadcast their activities to whoever is ready to lend an interested ear. There already are standards that help people share these activity logs in standard formats: Atom and RSS.

Representing Personal Information

Again, microformats have been defined for such diverse things as user-posted reviews, calendar entries, résumés, addresses, geographical location information, with a whole lot of other discussions in progress. The mother of all social networking artifacts, tagging, has also been microformatized.

Communicating Across Diverse Websites

Many sites these days are opening up their APIs for external applications to access and modify users’ data over the Web. SOAP, XML-RPC and other, more formal protocols have given way to REST (Representational State Transfer) as a light-weight software architecture for distributed systems. With RESTful websites, it is easy for independent applications to modify data stored on servers: examples include Google’s GData APIs for many properties, Flickr’s API for accessing photos and metadata, Twitter’s API for posting twits, and many other services.

Distributed Authentication

Systems such as OpenID are emerging as viable standards for truly distributed authentication and identity management. There is no reason why an OpenID-based system cannot be used for the Network We Talked About. If we throw in the ability for Hosts to share authentication lists, that would make all Hosts available to all Users, and the question of having to “pick” a particular host may be moot.

What’s Missing?

Communication Protocols for Posting Messages Across Hosts

REST is here, but it only defines the transport architecture. A RESTful communication protocol will have to be developed for users to be able to post messages to other users on other Hosts. Nothing monumental, but just one thing that needs to be done.

Representing Groups Across Hosts

Groups of users will need a way to be recognized across Hosts. A simple way of doing this would be a naming scheme that stays unique across the network, much as Usenet groups have been. A lot of the lessons learned from the design of Usenet can be used here, because today’s social networks are very similar to Usenet, with a few other goodies thrown in.

Current Efforts

Although we are far from this vision, some sites (mainly Facebook) seem to have started on this path. The Facebook platform was a unique step in allowing developers to access users’ profile information. Facebook is still a walled garden, though. In part to increase traffic, they also have taken baby steps in making users’ profiles available to search engines. MySpace profile pages are still very un-crawl-able. Flickr, Upcoming, and other Yahoo! sites use microformats extensively. Facebook provides RSS feeds of user activity.

Although these are steps in the right direction, they are not enough. Hopefully, we will reach a critical mass of social networking sites that adopt an open social network policy. Till then, you can find me at my many online haunts.

How many languages does it take to change a Keynote slide?

30
Sep
2007

I was playing with Telekinesis on Friday, which lets you use an iPhone as a remote control for your Mac. The idea is simple: Telekinesis runs a web server on your machine, and the iPhone connects to it. It ships with a few Telekinesis Applications (or “tapps”), or you can write your own to control your own programs.

I wrote one to control Keynote presentations from your iPhone. It’s fairly simple: it shows you the current slide and the presenter’s notes for that slide, and it lets you go forward and backward through your slide deck. (No, it’s not release-quality yet, but expect it in a few days.)

So here’s the real meat of this blog post: (Warning: geeky-acronym-land ahead.)

  • Being a Mac OS X app, Telekinesis’s UI is written in Objective-C.
  • It exposes a web server that can run PHP scripts.
  • My remote application is a set of PHP scripts that sit on the Mac and run when the iPhone user launches the app.
  • On the iPhone, the user makes a request to the PHP script, which generates HTML, CSS and JavaScript to format the page for the iPhone.
  • To capture the current slide, I use a command line program (screencapture) inside a shell script from within PHP.
  • I resize the large slide for the iPhone using another shell script, and push it out to the phone as a stream of bytes, via PHP.
  • To change slides, the user taps the Next and Previous buttons on the iPhone, which use AJAX (JavaScript, XML, XMLHttpRequest) to send the request to a PHP script;
  • the PHP script interprets this request, and wants to use AppleScript to ask Keynote to update the current slide. But since there is no direct way to invoke AppleScript from PHP, we use the command-line tool osascript in a shell script to run our AppleScript.
  • Keynote hears the call to action from our AppleScript, and changes the slide.
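The last hop in that chain, a script shelling out to `osascript`, can be sketched as follows. This is Python standing in for the PHP and shell glue described above, not the actual tapp code. The `osascript -e` flag is the standard way to run a one-line AppleScript, but the two AppleScript commands shown are assumptions: the exact verbs depend on the Keynote version, so check its scripting dictionary before relying on them.

```python
import subprocess

# Assumed AppleScript one-liners; the verbs vary between Keynote versions.
SCRIPTS = {
    "next": 'tell application "Keynote" to show next',
    "previous": 'tell application "Keynote" to show previous',
}


def build_command(action):
    """Return the osascript argument list for a slide action."""
    if action not in SCRIPTS:
        raise ValueError(f"unknown action: {action}")
    # Passing a list (not a shell string) avoids quoting problems.
    return ["osascript", "-e", SCRIPTS[action]]


def change_slide(action):
    """Run the command; only meaningful on a Mac with Keynote presenting."""
    return subprocess.run(build_command(action), check=True)
```

Keeping `build_command` separate from `change_slide` means the request handling can be checked on any machine, while the part that actually drives Keynote stays a one-line call.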

So, our champion team now includes the following players: Objective-C, PHP, HTML, CSS, JavaScript, Shell Script, and AppleScript: all with the single goal of changing a Keynote slide.

Has anyone changed lightbulbs with an iPhone yet?

You know what I did last Summer?

6
Sep
2007

This, covered at the Official Google Book Search Blog.

While it is easy to share links, photos, videos, and opinions on the Web, sharing books with your friends online used to be tough — and tougher even, to share individual clippings from a book. This summer, I worked with the Book Search team to add clip-sharing features to Google Book Search.

You can now highlight a section of text in any public domain book in Book Search, create a clip from it, and share it with the world. You can post your favorite clips to your blog along with a personal annotation, collect them in a Google Notebook, or share them with friends anywhere you decide to embed the link. Your clip looks exactly as it appears in the book, or if you prefer plain text, we have that too.

Also at the Official Google Blog, about collecting, sharing and discovering new books.

We’ve also launched a way to let users select, copy and embed segments of public domain books (like the Newton quote) in any web page. We hope to make it as easy to blog and quote from a book as it is from any web page. Like many innovations at Google, a stellar summer intern worked on this.

Of course, no project is a single-person effort: Bill Schilit, my mentor; Nathan Naze, JavaScript God; Adam Mathes, Venu Vemula, and the rest of the Book Search team laid the foundation and were an integral part of this feature.

A Proposal to Integrate Site-Specific Search Boxes into Browser Chrome

14
May
2007

Why do I have to search for the search box on any site I visit, before I can type my query into it? Given that almost every well-designed site has a search field, and it has been recommended as a good usability practice since 2001, why is it sometimes hidden deep inside the layout? Here is a suggestion for a change in the browser UI that will enable users to find the search box faster. Even faster than other suggestions so far. It only involves a little semantic markup on the part of page authors and some redesign on the part of browser makers.

The search box within the browser is underused.

The search box is one of the few basic design patterns omnipresent on the Web. It is also a de facto usability practice to place this search box towards the top right corner of the page. Yet, every site has it at a slightly different location (and some, even towards the bottom.) The user needs to search visually for the search box, or at least glance around until she finds it.

During this entire time, the search box within the browser chrome typically lies unused. It has a default search engine defined where it directs all queries typed into it. Why can’t the currently loaded website take advantage of this in-built search field for its own site-restricted search?

How it would work:

When a user is browsing a site that supports this feature, any searches conducted using the browser search box will send the queries to the site in question, and be able to display search results directly. When no site is loaded, the queries are directed to the user’s default search engine, just as it is now.

This proposal does not require any significant changes to any markup language — all that is needed is to enhance the markup with semantic knowledge, and microformats are just the answer! Simply marking up a <form> element with the CSS class "search" should be enough to tell the browser that this is a search form that should be promoted to the browser’s search box chrome.
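Here is a minimal sketch of what the browser side might do with that markup, written in Python with the standard `html.parser` module. The `SearchFormFinder` class, the `search_url` helper, and the sample page are all hypothetical, and the sketch assumes the form submits with GET and has a single named text field. When no such form is found, the helper returns `None`, which is the cue to fall back to the user's default search engine.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class SearchFormFinder(HTMLParser):
    """Find the first <form class="search"> and its first named text field."""

    def __init__(self):
        super().__init__()
        self.action = None   # action URL of the search form, if any
        self.field = None    # name of the query input inside that form
        self._inside = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and self.action is None:
            if "search" in (a.get("class") or "").split():
                self.action = a.get("action") or ""
                self._inside = True
        elif tag == "input" and self._inside and self.field is None:
            if a.get("type", "text") in ("text", "search") and a.get("name"):
                self.field = a["name"]

    def handle_endtag(self, tag):
        if tag == "form":
            self._inside = False


def search_url(page_url, page_html, query):
    """Return the site's own search URL, or None if the page offers no form."""
    finder = SearchFormFinder()
    finder.feed(page_html)
    if finder.action is None or finder.field is None:
        return None
    # Resolve a relative action against the page, then append the query.
    return urljoin(page_url, finder.action) + "?" + urlencode({finder.field: query})


# Hypothetical page that opts in by adding class="search" to its form.
PAGE = (
    '<form class="search" action="/find" method="get">'
    '<input type="text" name="q"><input type="submit" value="Go">'
    "</form>"
)
site_result = search_url("http://shop.example.com/products/1", PAGE, "blue widgets")
```

The page author's only job is the `class="search"` attribute; everything else is inferred from the form the site already has.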

Prior work on similar problems

HTML 3.2 (yes, 3.2) defines a link relationship type for search pages. Adding <a href="/search" rel="search"> indicates that the outgoing link is to a search page. Browsers currently don’t do much with this information (please correct me if you are aware of a browser that does something intelligent with this information). OpenSearch is (in their words) a simple format for the sharing of search results. Useful as it is, OpenSearch is more geared towards large-scale general search engines, and browser makers are adopting it as a standard for letting their users pick and install search plugins in browsers.

However, being able to customize the search field on a per-site basis with zero configuration on the part of the user is not addressed by any of these proposals.

Addressing Potential Criticism

A few critics might argue that such usage dilutes the purpose of the search field (“It’s meant for searching via a search engine, not on a per-site basis”) or be concerned about possible user confusion (“Is my query being sent to a search engine or to the site I’m now visiting?”).

Considering an intentionality-driven approach to design, this UI is perfectly aligned with the user’s intentions. If a page has been rendered in a browser window, the user’s intention is likely to search within that site. If the user wanted to perform a search using a search engine, there is always the possibility of loading a new tab and then searching via the same field.

Issues of Mode

This also brings up the question of whether such a UI is inherently mode-based. (Modes in a User Interface are said to exist when a single input can result in two or more possible outcomes, depending upon the state of the UI at that point. It is generally considered bad design to employ modes in a UI because it invariably leads to user confusion.) In this case, it is arguable whether or not this UI employs modes.

The user task is “to search”, and the current site can be considered a specialized search engine (the specialization is that they only search within their own site). Given that a lot of sites employ site-limited versions of generalized search engines (e.g. Google’s Custom Search Engines), this notion is not much of a stretch. Hence, I argue that these are not two different modes, but two different search providers for the same box, just as current implementations offer users a choice among Google, Yahoo, AltaVista and others.

Indicating the currently active search providers in browser chrome

It is also easy to indicate the destination of the search queries in a visually accessible format. Safari (for example) displays the word “Google” in the search field. When a site-search box is displayed in its place, it is trivial to display the name of the site instead of the word “Google”. It is also trivial to reuse the favicon for the same purpose.

Why not?

What do you think about this idea? Comments, suggestions, enhancements appreciated!

The Letter and Intent of Creative Commons Licensing

10
Feb
2007

I subscribe to Seth Godin’s blog, and I find his opinions very thought-provoking, I might add. I especially like his rants on usability and good versus bad experience design.

This morning, I read a post by him about one of “his” books being sold on Amazon. To explain why the “his” is in quotes, here’s what happened: Seth wrote the book in 2005 and licensed it under a specific Creative Commons (CC) license [1]. The book was and is still available for free from Seth’s website as an unlocked PDF. A book publisher, who had nothing to do with Seth directly, went ahead and printed the book which is now available for $9.99 at Amazon.com [2]. Seth is now pissed off at someone doing something like this, and is encouraging the readers of his blog not to buy that book.

I think that the publisher’s action is not only within the letter of the law, but also within the intent of the Creative Commons license Seth used. There is more than one CC license, and the specific one that Seth chose allowed free copying of the book, as long as authorship was properly attributed. Although there exists a Creative Commons license that disallows commercial usage, Seth chose not to apply that clause (which, by the way, he now considers was a mistake back in 2005.) This leads me to conclude that the publisher was offered those rights by Seth himself.

I can understand Seth’s getting pissed off because someone else was making money off his effort, but at $9.99, I think it nicely covers printing costs and perhaps makes a little profit for the printer [3]. If I already had the PDF eBook and still wanted a paper copy, I’d be super-willing to pay $9.99 for simply the printing, binding, cover, etc. I see nothing wrong with the printing of the book.

Although I admire Seth’s decision to license his work under a CC license, I feel he is going against the intent of the license by exhorting his readers not to buy a work that was permitted expressly because of that license. If he really wanted to follow the spirit of the Creative Commons, he should have provided a link to the book on Amazon and encouraged his readers to buy a paper copy in addition to the free eBook they might already have downloaded. His current actions undermine the spirit of openness that the original grant of the license had fostered.


[1] I also use a Creative Commons license for this website and for all my non-academic writing.
[2] I do not know the publisher and I do not earn any money as commission or from Amazon referrals. Just to make it clear, you know.
[3] Maybe more. I don’t know much about the printing industry.

From the Desktop to the Phone … Seamlessly

16
Nov
2006

Google just announced a new feature in Google Maps: Click to Call. When you find a business on Google Maps, you can ask to be connected directly. Google then calls you on the number you provide, and places a call to the business at the other end.

This is yet another example of seamless task migration. The user’s ultimate goal in locating a business is to get in touch with them. The most common way to do this today is to call using a phone (at least as long as Voice-over-IP is not as ubiquitous as cellphones and land-lines). Lo, Google bridged the gap. End-to-end support for a user’s tasks across multiple devices is a challenge that has only recently begun to get its due attention.

Hopefully, we will soon be able to do the same with phone numbers all over the Web. Imagine a button on my website that says, “click to call me”. Or, a button on my photo albums page that says, “view as a slideshow on the living room TV”. Or being able to press a button on your car radio to “read more about the currently-advertised product once I’m back home”.

A Tale of Two Interfaces

23
Oct
2006

Synergy, a mouse and keyboard sharing utility, has proven insanely useful to us users of multiple machines on a single desk. Think of it as a software KVM switch, but minus the “V” (for video.) You can arrange multiple machines side-by-side and Synergy seamlessly moves the mouse pointer and keyboard input from one machine to another at desktop boundaries. It’s a great idea and a great tool.

I use QuickSynergy on my PowerBook and Mac Mini, but later happened to look at the official GUI client on my friend’s Windows laptop. It’s not often that a user interface provokes a blog post on a Monday morning, but this was it.

Here are the screenshots:

QuickSynergy (on Mac OS X):

QuickSynergy.png
QuickSynergy Client.png
QuickSynergy About.png

Synergy (on Windows):

1. Synergy Main Screen.png
2. Synergy Configuration.png
3. Synergy Options.png
4. Synergy Hot Keys.png
5. Synergy Advanced Options.png
6. Synergy Auto Start.png
7. Synergy Info.png
8. Synergy Log.png
9. Synergy Running Test.png
10. Synergy Started.png

You will notice that QuickSynergy has exactly one dialog box (with two tabs, one to use when running as a server, and another when running as a client) plus one About dialog. Synergy has a total of 9 dialog boxes (plus one About dialog). The question I wish the developers had asked themselves is whether throwing in a dialog box for every single configurable parameter was the right thing to do. It seems like the UI Designer(s) simply gave up on trying to understand the users’ needs, and instead just threw everything out to the user: “here, now there’s a dialog box for every single line in the configuration file, go figure it all out.” In my opinion, that’s the designer shirking his or her responsibility of actually designing.

Synergy Relative Mouse Moves.png

I wonder how many regular users would ever want to change some of the arcane options. And if a savvy user did want to, she could just edit the config file! Even as a Computer Science Ph.D. student, I have no idea what the “Relative Mouse Moves” option means, or why I should care about it. (If you say RTFM, that’s already the sign of a bad interface.)

QuickSynergy (on Mac OS X): QuickSynergy.png
Synergy (on Windows): 2. Synergy Configuration.png

Notice how, in the configuration screen, QuickSynergy simply shows you one screen with four text fields on the four sides, whereas Synergy expects you to enter the positions as “Machine X is to Direction Y of Machine Z.” The first way is so much more natural, but guess why Synergy implements the second way? Because the configuration file is written that way.
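To make the difference concrete, here is a minimal sketch of the translation a GUI could do on the user’s behalf: take the “four fields around one screen” layout and emit the “X is to direction Y of Z” statements. The function names, the machine names, and the exact text layout of the output are my own assumptions for illustration; the output is only loosely modeled on a links section of a config file and is not taken from either program’s source.

```python
# Hypothetical sketch: turn a "four fields around one screen" layout
# (the QuickSynergy style) into "X is to direction Y of Z" link lines
# (the Synergy style). All names here are illustrative assumptions.

OPPOSITE = {"left": "right", "right": "left", "up": "down", "down": "up"}


def layout_to_links(center, neighbors):
    """Return (screen, side, target) links for a center machine.

    neighbors maps a side ("left", "right", "up", "down") to a machine
    name; empty or missing sides are skipped.
    """
    links = []
    for side in ("left", "right", "up", "down"):
        other = neighbors.get(side)
        if not other:
            continue
        # Forward link: moving off this side of the center reaches `other`.
        links.append((center, side, other))
        # Reverse link, so the pointer can come back the same way.
        links.append((other, OPPOSITE[side], center))
    return links


def render_links(links):
    """Render the links as text, grouped by screen (assumed format)."""
    by_screen = {}
    for screen, side, target in links:
        by_screen.setdefault(screen, []).append(f"        {side} = {target}")
    lines = ["section: links"]
    for screen, entries in by_screen.items():
        lines.append(f"    {screen}:")
        lines.extend(entries)
    lines.append("end")
    return "\n".join(lines)


if __name__ == "__main__":
    # One filled-in field: the Mac Mini sits to the right of the PowerBook.
    links = layout_to_links("powerbook", {"right": "macmini"})
    print(render_links(links))
```

The user fills in one field, and the program writes both directions of the relationship. That bookkeeping is exactly the kind of work the interface should be doing instead of the user.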

These are clearly two very different styles of GUI design (though I would strongly argue that a text field for editing a configuration file does not count as a “GUI”; it’s simply a command-line interface (CLI) inside a text field). QuickSynergy puts the user first, and is designed to let the user work naturally with his/her mental model of a keyboard/mouse layout. Synergy starts from the configuration file and slaps a UI on top of it. Thus, QuickSynergy comes closer to the user, while Synergy stays closer to the machine.

Synergy QuickSynergy Comparison.png

UI Design is not about letting users edit configuration files; it’s about letting them do what they started out to do. That a config file needs to be edited to make that happen is a side story.
