Simplified Twitter Microsyntax for the Haiti Earthquake

18 Jan 2010

In this post, I have typeset many more sentences in bold than I usually do, so readers can quickly skim through it.

I applaud the efforts of U. Colorado’s EPIC Group to help victims of the Haiti earthquake call for help using Twitter, and to make their tweets discoverable and actionable. I just performed a Twitter search for the terms #haiti -RT -http (which matches all tweets tagged #Haiti, except retweets and tweets containing links) to inspect some of the tweets that are directly related to happenings on the ground. As expected, they are only a minuscule percentage of the total number of tweets about #Haiti. Some syntax is thus sorely needed to achieve a decent signal-to-noise ratio and assist relief efforts.
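The search above can be approximated locally on a batch of already-collected tweets. This is only a rough sketch of what the query terms mean; it is not how Twitter's search is implemented, and the function name `matches_haiti_query` is made up for illustration.

```python
def matches_haiti_query(tweet: str) -> bool:
    """Rough local equivalent of the search '#haiti -RT -http'."""
    words = tweet.split()
    # '#haiti' must appear as a tag (case-insensitive, trailing punctuation ignored).
    has_tag = any(w.lower().strip(".,!?:;") == "#haiti" for w in words)
    # Retweets conventionally contain a standalone 'RT' token.
    is_retweet = "RT" in words
    # Any 'http' substring is treated as a link.
    has_link = "http" in tweet.lower()
    return has_tag and not is_retweet and not has_link


if __name__ == "__main__":
    samples = [
        "Need water at Delmas 33 #haiti",
        "RT @someone: Need water at Delmas 33 #haiti",
        "Donate here http://example.org #Haiti",
    ]
    for s in samples:
        print(matches_haiti_query(s), "-", s)
```

Only the first sample passes the filter; the other two are dropped as a retweet and a link, respectively.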

In my opinion, though, the current version of the tweet syntax seems too formal, too rigid and a tad too complicated for victims or rescuers on the ground. I am a programmer, and even I had trouble mentally parsing a few of the examples provided. We must keep in mind that Haiti is a bilingual or trilingual country (and none of its languages is English), so any syntactic terms used should preferably be semi-obvious to victims and rescuers who are not native speakers of English.

Roles of Microsyntax

  1. Make tweets discoverable: Microsyntax can assist local search-and-rescue efforts and unaffected Twitter users in determining if a tweet is actionable. This task is partly a Signal Detection Task and partly a Data Mining problem. In both situations, microsyntax can prove helpful: all that’s needed is a single tag that emphasizes that a particular tweet is actionable (versus not), e.g. #haiti #rescue (or #haitirescue, to avoid having to type a second # (hash) sign). This will greatly increase the sensitivity parameter d’ of the signal detection task.
  2. Make data mining easier: Once a tweet has been detected to be actionable, its contents must be parsed into a form that local efforts can take action upon. While it’s true that all the other proposed microsyntactic tags make it easier for applications to parse the data, this is at the cost of requiring users to learn new syntax. This seems to me a little too much to expect from victims of a recent calamity of this scale as well as from rescue workers with other higher priorities. Instead, as long as our tools can identify relevant tweets, computers should be able to perform the second task of parsing locations, names, and verbs from tags quite easily.
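For readers unfamiliar with d’, here is a minimal sketch of how the sensitivity index is conventionally computed in signal detection theory: the z-score of the hit rate minus the z-score of the false-alarm rate. The rates used below are invented purely for illustration and are not measurements of real tweets.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal distribution
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Made-up rates: how often actionable tweets are spotted (hits) versus
    # how often irrelevant tweets are mistaken for actionable ones (false alarms).
    untagged = d_prime(0.60, 0.40)
    tagged = d_prime(0.90, 0.10)
    print(f"d' without a rescue tag: {untagged:.2f}")
    print(f"d' with a rescue tag:    {tagged:.2f}")
```

A single dedicated tag helps only to the extent that it raises the hit rate and lowers the false-alarm rate, which is what a larger d’ expresses.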

Also, microsyntactic terms need not always be prefixed with # (hash) signs; these are often difficult to type on cell phone keyboards and, on some handsets, may hamper input methods such as T9. The intervening # signs also make tweets containing the proposed microsyntax harder to read for someone browsing through tweets.

To summarize, the proposed syntax imposes a heavy cognitive load on victims and search-and-rescue workers in order to make parsing easier for machines. However, the task of parsing details from tweets can also easily be performed by large numbers of humans, i.e. by crowdsourcing it to volunteers or via tools such as Amazon’s Mechanical Turk.

Simpler, Lighter Microsyntax

The following are examples of microsyntax that are more readable, yet also parseable by machines. All situations are based on the ones in the original proposed microsyntax. Most are directly based on the EPIC microsyntax, with a few simplifications.

  • Rule 1: Always write in the third-person. This takes care of part of the name problem.
  • Rule 2: Instead of using #loc for locations, use “at”. It’s much more natural and not much more difficult to parse.
  • Rule 3: Verbs are actionable. Not syntactic verbs, but English (or French or Haitian Creole) verbs. It’s a trivial task to populate a tool with a dictionary to detect all word forms correctly.
  • Rule 4: Anything that cannot be parsed ends up as the equivalent of the #info tag (see EPIC syntax).
  • Rule 5: The entire text of the tweet should always be available to a human, so whatever information was incompletely parsed can be understood manually, and optionally added to the parsed version by a human.

The general aim is to require as little syntax knowledge as possible, and to keep as close as possible to the natural way people write tweets.
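Rules 2 to 5 can be sketched as a small parser. This is a deliberately naive illustration, not a tested tool: the verb dictionary is a tiny stand-in, the name heuristic is simply "two or more capitalized words in a row", and the function and field names are made up for this sketch.

```python
import re

# Hypothetical, tiny verb dictionary; a real tool would load English,
# French and Haitian Creole word forms (Rule 3).
ACTION_VERBS = {"looking", "need", "needs", "offer", "offering"}

TAG = re.compile(r"#([A-Za-z]\w*)")  # matches '#haiti' but not '#12'
NAME = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")  # 2+ capitalized words
PLACE = re.compile(r"\bat\s+([^.,]+)")  # text after 'at', up to '.' or ','


def parse_tweet(text: str) -> dict:
    tags = [t.lower() for t in TAG.findall(text)]
    # Strip the tags and normalize whitespace; the rest is kept whole (Rule 5).
    body = " ".join(TAG.sub("", text).split())
    who = NAME.search(body)
    where = PLACE.search(body)
    words = re.findall(r"[a-z']+", body.lower())
    what = next((w for w in words if w in ACTION_VERBS), None)
    return {
        "actionable": "rescue" in tags or "haitirescue" in tags,
        "tags": tags,
        "what": what,                                  # Rule 3: first known verb
        "who": who.group(0) if who else None,          # fuzzy: capital letters
        "where": where.group(1).strip() if where else None,  # Rule 2: "at"
        "text": body,                                  # Rules 4 and 5: everything
    }


if __name__ == "__main__":
    tweet = ("#haiti #rescue Looking for Sherline Birotte aka Memen. Last seen at "
             "19 Ruelle Riviere College University of Porter, a 3 story school building")
    for field, value in parse_tweet(tweet).items():
        print(f"{field}: {value}")
```

The location boundary is crude (it stops at the first period or comma), which is exactly why Rule 5 keeps the full text available for a human to correct.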

Examples

TWEET-BEFORE: Sherline Birotte aka Memen. Last seen at 19 Ruelle Riviere College University of Porter a 3 story schol building
TWEET-AFTER: #haiti #ruok #name Sherline Birotte aka Memen. Last seen #loc 19 Ruelle Riviere College University of Porter #info a 3 story schol building
Simplified Microsyntax: #haiti #rescue Looking for Sherline Birotte aka Memen. Last seen at 19 Ruelle Riviere College University of Porter, a 3 story school building

This tells us:
What = Looking for someone.
Who = Sherline Birotte aka Memen (identified fuzzily based on initial capital letters)
Where = 19 Ruelle Riviere College University of Porter (automatically parsed based on “at”)
What else = “a 3 story school building” (i.e. everything else in the tweet)

TWEET-BEFORE: Mirna Nazaire lives in P-A-P at Bizoton 6#12. Entire neighborhood without food. People are dying.
TWEET-AFTER: #haiti #need #food #name Mirna Nazaire lives in #loc PAP at Bizoton 6 #12 #info neighborhood w/o food. People dying
Simplified Microsyntax: #haiti #rescue Mirna Nazaire at PAP at Bizoton 6#12 needs food. Entire neighborhood without food. People dying.

This tells us:
What = needs food. (automatically detected from the verb in the sentence.)
What do they need = food (automatically detected from the object in the sentence.)
Who = Mirna Nazaire (heuristically determined from initial capital letters.)
Where = PAP at Bizoton 6#12 (detected from microsyntax “at”)
What else = “Entire neighborhood without food. People dying.” (Rest of the tweet, unfiltered.)

TWEET-BEFORE: French hospital is now open and ready to receive the wounded at the french lycee in rue marcadieux bourdon
TWEET-AFTER: #haiti #offering #med #loc french lycee in rue marcadieux bourdon #num 30+ #info French hospital is open and ready 2 receive wounded
Simplified Microsyntax: #haiti #rescue French hospital ready to offer help to 30+ wounded at the french lycee in rue marcadieux bourdon

This tells us:
What: Hospital. Also, something to do with medical efforts. (no need to tag explicitly, we can infer that from ‘hospital’.)
Where: The french lycee in rue marcadieux bourdon. (Automatically parsed from microsyntax “at”.)
How many people: 30+. (It’s already a number, no need to state “#num” explicitly.)

These are just a few suggestions. I will be contacting the PIs (principal investigators) of the EPIC project directly with some of my recommendations, but please continue to follow their syntax until they recommend anything different. The current syntax proposal isn’t perfect, but it is more important to avoid fragmenting the tagspace.

Grad School 101

29 Aug 2009

This is a collection of tips prepared for a seminar series at Virginia Tech for Computer Science grad students in Fall 2008. It was compiled by Manas Tungare, with contributions from (in alphabetical order) Manuel Pérez-Quiñones, Rhonda Phillips, Pardha Pyla, Naren Ramakrishnan, Bill Schilit, and Andrea Wiggins.

1. Introduction

This document is extremely terse, so you don’t end up spending too much time reading it.

2. Getting Started

2.1. Classes

In your first few semesters of grad school, you will be required to take classes. They provide an opportunity to learn about new areas, and gain some depth in your area of focus. But grad school is not about getting an A in every class. Don’t ignore research in favor of getting good grades.

Be clear about the requirements of your program: you don’t want to discover that you missed taking a required class and have to take it towards the end of your Ph.D.

Quote from a professor (used with permission):

Students need to know what advisors do when writing a letter of recommendation. I get requests to write letters of recommendation, and all I can say is “s/he was in class frequently”. Students need to know this, not just to avoid that awkward moment of asking for a letter of recommendation that might not be the one they want, but also so this might serve as motivation for them to attend class. They can’t be shy. They need to be known by professors. Ask questions. Go to research group meetings. Read their papers, etc.

2.2. Choosing an Advisor

Find an advisor whose research interests closely match yours. This is obvious. What is not obvious is the next part: find an advisor whose work culture and personality match yours. You will be working with this person for the next several years, and if you do not enjoy a great working relationship with your advisor, things might get rough.

How do you find out about an advisor’s personality? Talk to their current students and ask around. You’ll need to read between the lines of such conversations, keeping in mind that because they work with that advisor, either (1) they genuinely love working with him/her or (2) they will not talk ill of him/her for fear of retribution.

Ask around: some advisors graduate students rapidly, while others are seldom around, so their students take longer to graduate. Some advisors will provide for frequent, short interactions (face-to-face, email, etc.); others will require you to have completed a significant amount of work before they agree to meet you in person. Some encourage independent work that is mentored sporadically; others may engage in more frequent collaborations. Hardly any will provide hand-holding; this is grad school.

Faculty members can be roughly classified in two very broad groups: senior versus junior. Junior faculty looking for tenure must publish prolifically, while senior faculty have more experience in the field; both are good qualities that will ultimately help you. Weigh these factors if you ever need to pick between one from each group.

Before entering into a long-term relationship with someone you’re interested in, a good way to figure out if you’re compatible is to … do an “Independent Study” with them. This is a no-commitment research-oriented one-on-one project. It’s OK to double up class projects or follow them up as independent studies – talk to the faculty member involved.

Understand what professors want/need from you: support for the primary area of research they are interested in, and willingness to take that research forward in meaningful ways. Funding for graduate students comes from research grants, and if you help write one, you may get the money that the grant brings in.

Here’s a quote from a faculty member (used with permission):

For example, I often email students things like “Hey, this conference call seems close to your work. What do you think?” A negative work student (which, I might add, I have learned to spot quickly and don’t often want working with me) will often reply “Interesting, do you really think we should publish there?” That is clearly the wrong answer. If I didn’t think we should publish there, then I wouldn’t have forwarded this to the student.

The other extreme student (which is typical of the students that I work with) replies: “Very interesting. I went to their website and looked at a couple of papers from last year. They definitely seem similar to my work. I have taken the liberty of jotting down some notes about possible papers that we could write. Let’s discuss these next time we meet. Thanks for sending it.”

2.3. Mentoring

Your advisor isn’t your only mentor; find a senior student in your program with similar interests (and, if possible, the same advisor) who is willing to give you some sage senior advice and help you avoid a few common pitfalls. Peer mentoring can be truly invaluable. Astronauts and professional athletes aren’t the only role models; find a high-performing person at the next higher rank from you and emulate their behavior.

Quote from a Ph.D. student:

When I was a Master’s student, I tried to adopt the practices of a successful PhD student, and it really paid off. As a PhD student, I try to meet the expectations for junior faculty so I’ll be better prepared for that role.

While a dissertation is a very individual activity, all grad students go through similar patterns (that’s why Ph.D. Comics is funny). Early on, cultivate a group of peers, including a few more advanced students, who have experienced the ups and downs of doing a Ph.D. This will help you survive the difficult parts and celebrate the high points along the way.

3. Research

3.1. Literature Review Basics

Any research project begins with a review of the existing literature in the field (or at least, that’s how it should be.) Be exhaustive while citing sources; do not open yourself up to claims of plagiarism, even if you are honestly pursuing your own research. Look up citations for stuff that seems like it may have been done before. Ask your peers (lab mates, professional colleagues at other institutions) if they are aware of any papers near your area.

Posting specific questions about related work to your social network is a good way to learn about work that might not have been widely cited. Some colleagues might offer you a sneak peek at unpublished work. Respect the confidentiality that is implicit in the sharing of such work.

3.2. Managing Research

Research requires as much management as pure effort. You must be able to manage your time, your resources, your sources, information you collect, information you come across, information you generate, samples of stuff you record from experiment participants, interview transcripts and recordings, videos, log files from computer software, etc. Make sure you store these items safely.

There is software to conduct every kind of research activity: pick the best tool for the job.

Word-processing programs that were once adequate for undergrad-level reports and essays do not scale up to the demands of academic publishing. Consider LaTeX for academic writing: it is free software available for all major operating systems, and it will not corrupt your documents, a common occurrence with proprietary software that reads and writes undocumented binary file formats. The effort required to gain familiarity with LaTeX pays for itself several times over during an academic career.

BibTeX is a tool and file format for managing bibliographies, and several applications exist that help manage BibTeX-formatted bibliographies. Keep a single library of bibliographic material starting from Day One. This will save you several hours when collating annotated literature to cite in a paper or article.

3.3. Communication

Do not underestimate the value of communication in research. Not only will you perform research by yourself, but you will also be required to talk about it with your advisor, discuss it with lab-mates, write about it in papers and articles, and present it at conferences.

Get feedback from your colleagues before communicating with a wider audience – many heads are better than one. You will be surprised at what other people will spot in your work that you had somehow managed to overlook. This applies equally to papers, talks, presentations, defenses and everything in between.

4. Conferences

Starting to write early on makes your writing better over time. Make your presence known in your research area: publish interesting stuff, but do not publish just for the sake of publishing something. Some call the latter the ‘least-publishable unit’ model of publishing, and many disagree with it. Submit meaningful, completed works of research to the appropriate conferences and journals.

Be proactive in submitting to conferences: do not wait for your advisor to come across a CFP (‘Call for Papers’). Subscribe to announcement lists in your field where a lot of CFPs get posted. Your advisor can help you locate these.

Attend conferences even if you’re not presenting during that particular year. It is important to stay abreast of research in your area. Often you will have to pay your own way to attend a conference in another country; this money is well-spent.

It is said that during conferences, hallways are where the action is, not the session rooms. That is where the networking happens. Meet people from your field: these are the same people whose work you build upon, who will build upon your work, and contribute to the same scholarly community. In addition, some of these may be interviewing you after you’re done, either as faculty search committee members or as industry researchers.

Attend smaller research meetings too: departmental get-togethers and seminars are not a distraction from your regular work, but a part of it. Don’t be left out!

If you’re in your second year or higher, prepare an elevator pitch. Be ready to talk to complete strangers about what you do. Prepare individual spiels for the following sets of audiences: your grandma/grandpa, your friend from high school who is pursuing a degree in fine arts, your friend from undergrad in Computer Science, your lab-mates, people in your research area, and finally, your advisor. These are sorted by increasing levels of awareness of the field and research experience. This means that you can dig into several levels of detail as the need arises (sometimes also known as the pyramid model of communication.)

4.1. Presenting at Conferences

A conference presentation is not a verbatim recitation of your paper. Often, the time available to presenters during a conference session is barely enough to whet the appetite of your audience and entice them to read the entire paper. Do not try to cram every single finding from your paper into the 20-minute talk.

Practice presentations before a crowd of your peers before you go to that big conference. There is no substitute for rehearsing. Do not ever read from your slides. Do not EVER read from your slides. There are numerous tips for how to be a better presenter and public speaker, far too many to include in this document.

4.2. Student Volunteer Opportunities

Check whether the conferences you are interested in waive registration fees and/or provide additional perks in exchange for volunteer service during the conference. If you feel you are capable, take up service positions and be of assistance to the community at large. Apply to be a Student Volunteer as soon as (or even before) you submit your paper. Often, deadlines for Student Volunteer applications are earlier than paper submission deadlines (though this varies by conference). If you are accepted, the conference organizers will pay a significant part of your travel expenses (registration, hotel, etc.).

‘SV’-ing is also a great opportunity to meet your peers. These are other students who will one day be your research colleagues, collaborators on grants, paper reviewers, co-authors and life-long friends. Meet them, keep in touch.

But remember that your academics come first; service should not come at the cost of your research output.

5. Internships

Internships offer students a perspective on where their skills might be useful outside of the academic realm. Even if you do not intend to pursue a career in industry, opting instead for an academic career, an internship provides a unique perspective into your own work and how it fits into the research community at large – something that a summer at school would not be able to provide.

Make sure you plan this with your advisor well in advance. Do not spring a surprise on him/her after you receive an offer and rent an apartment. Some advisors may prefer that you not go on an internship, but continue to work on your research so you may graduate earlier. Others encourage their students to pursue internships, while a few others may actively provide you leads for promising internship positions. Most research internships are gained either through your own research reputation, or through your advisor’s professional contacts. Thus, a conversation with your advisor on this topic should be held earlier rather than later.

Researchers from industry often scrutinize their intern candidates at conference presentations and other socio-professional venues. You might even receive a spot offer over a casual conversation with your would-be mentor if they are impressed with your work.

While the internship application process is not as rigorous as that for a full-time position, there are few positions and they’re filled as soon as a candidate with matching skills is found. Apply early.

Quote from a Ph.D. student:

Don’t be afraid to pursue opportunities. I was always intimidated when I saw solicitations for fellowship applications, travel funding, student grants, awards, etc, but I learned that it doesn’t hurt to apply, and sometimes you’re more qualified than you think you are.

Some research institutions offer internships that do not involve rigorous research, but instead utilize your prototype-building skills to push their research agenda. If you’re in your early years of the graduate program, these are fine ways to get you the proverbial foot in the door and a deeper insight into that institution’s research program.

While at an internship, do not completely ignore the research you conducted back at school. Advisors will understand that your current time demands must favor your current employer, but a three month hiatus in school research can lead to problems resuming it once you’re back.

6. Networking

Know your community, and try to get them to know you.

Solving an important research problem sitting in one’s windowless cubicle is good, but networking with peers at research conferences and other academic venues increases one’s chances of getting a real job.

Most academic networking happens at conferences. Make sure you carry business cards so you can exchange contact info easily if required. Though, quite often, you will end up exchanging cards with your peers – who also just got their cards printed! – you never know when you might meet someone who really would like to get in touch with you. Since most academic positions are filled via word-of-mouth, it’s always good to be easy to contact.

You may be invited to spontaneous group lunches or dinners with newly-made acquaintances, which often lead to excellent conversations to remember for a long time. After all, these are people from your own research field who have all congregated to exchange ideas with one another. (But of course, don’t be pushy or show up uninvited.)

If your advisor, committee members, or senior students are attending, they may introduce you to their acquaintances from other schools/companies.

Many advise removing your social networking profiles, such as Facebook or Twitter profiles. While this advice has some merit if your profile has objectionable content, that is likely not the most common case for this audience. As information grows, one’s social network is a valuable resource to be tapped into for relevant, up-to-date news from your field. Often, such nuggets might be found in your friends’ status updates. After several conferences I’ve been to, attendees have taken the initiative to start online groups to continue the conversation beyond the conference. It is a good idea to participate in these.

There are also networking conferences designed for specific audiences; if you are a woman, consider attending the Grace Hopper Conference. If you belong to a minority group in computer science, consider attending the Richard Tapia Celebration of Diversity in Computing. There are likely to be others in your own field as well.

7. Preparing for the Real World

7.1. Tenure Track (Academic Positions)

For an academic position, you will be evaluated on your ability to chart your own research agenda, and pursue it to fruition. This includes coming up with hypotheses, conducting experiments and publishing your results. As the first two steps are opaque to an external observer, you are thus judged primarily based on your publication output. Whether you like it or not, it’s a publish-or-perish world.

Always keep an updated copy of your résumé and CV available online. A CV (curriculum vitae) is a complete record of your relevant accomplishments and forms part of your application dossier for academic positions.

If your university supports student involvement in school governance, become involved in governance committees. This may seem like a distraction from your studies, but it is excellent preparation for a faculty position in which you will be expected to contribute in service roles in addition to research and teaching.

Do as a student what faculty do: write research grant proposals. You may start off by assisting your advisor in writing sections of their grant proposals that are related to your research. It is often possible to turn your dissertation proposal into a grant proposal. Even though students cannot officially be co-PIs (Principal Investigators) of a grant, the experience you will gain even from being a nameless contributor is priceless.

7.2. Industry Track (Research Laboratory Positions)

Industry values slightly different skills in budding researchers than does academia. Industry often runs on much faster timelines than academic research, especially in terms of bringing products from the laboratory to the mainstream. Thus, in addition to fundamental research, an industry researcher plays an important secondary role in “productizing” research ideas. This involves developing prototypes and working with development teams to implement these on a larger scale.

Some graduate students believe that a career in industry limits one’s freedom to choose what to work on. While it is true that an individual researcher’s broad research direction may need to align with the company’s strategic vision, there is often wide scope for researchers to define the specific problems they are interested in working on and pursue those with their team. It is often the case in academia (lamentably) that the choice of research direction is dictated by available funding opportunities.

8. Closing Statements

Take care of yourself first. Paper submissions, assignments and pending work can take their toll on your health if you ignore yourself. Make sure you eat well, on time, and maintain the energy that is the foundation of everything else.

Enjoy your time here. You’re doing this because you love it!

One-button Phone Number Sharing

10 Feb 2009

Send this Phone Number to the Current Caller

How often have you found yourself calling a friend to get the phone number of a mutual friend? And then having to hold the phone while your friend pulls up the contact list on their phone, then recites the number to you, and then you write it on paper because your phone won’t let you add contacts while you’re on a call, and then you misplace the number you wrote on paper, ad nauseam. Why isn’t there a single button that says “Send this Phone Number to the Current Caller”?

It’s a common problem. You’re out and about, and realize you need to call a specific person, but you don’t have their phone number (or more often, you have it on your desktop computer, or your laptop, but that doesn’t do you any good in the current situation.) So you decide that the best thing to do is to call a mutual friend and ask them.

When they receive a phone call from you, they’re fumbling to hold the call while they look in their address book. (That is, if they’re lucky, and if their phone actually lets them open the contact list while they’re on a call.) More often, what happens is that they tell you to hang up while they consult their address book. And then you have to hunt for a piece of scrap paper because your phone won’t let you add a number to the list like that.

What the world needs is a button next to each phone number in the contact list that appears only while you’re on a call. The button, when pressed, sends an SMS from you to the current caller containing the information from the contact record you just selected. It doesn’t have to be too fancy; a two-line VCF record should do nicely.

If the recipient’s phone understands this method of contact transfer, it can prompt the user and import it automatically. If not, the user can still read the SMS herself, and dial the number. No more paper, no more fumbling, no more “let me call you back”.
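To make the idea concrete, here is a minimal sketch of both ends of the exchange: building a small vCard for the selected contact and reading the number back out on the receiving phone. It assumes vCard 3.0, which requires the N and FN fields, so the record comes to six short lines rather than two. The function names and the sample contact are made up, and special characters in names are not escaped.

```python
SMS_LIMIT = 160  # characters in a single SMS using the GSM 7-bit alphabet


def contact_to_vcard(given: str, family: str, phone: str) -> str:
    """Build a minimal vCard 3.0 record for one contact (no escaping)."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{family};{given};;;",
        f"FN:{given} {family}",
        f"TEL;TYPE=CELL:{phone}",
        "END:VCARD",
    ]
    return "\r\n".join(lines) + "\r\n"


def extract_phone(vcard: str):
    """Receiving side: return the first TEL value, or None if there is none."""
    for line in vcard.splitlines():
        if line.upper().startswith("TEL"):
            return line.split(":", 1)[1]
    return None


if __name__ == "__main__":
    card = contact_to_vcard("Jane", "Doe", "+15405550123")
    print(card)
    print("Fits in one SMS:", len(card) <= SMS_LIMIT)
    print("Number to dial:", extract_phone(card))
```

A phone that does not understand the format still shows the text of the message, so the number remains readable by a human, as described above.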

It’s so easy, a caveman could do it. If only phones implemented it!

My Research Philosophy

8 Feb 2009

I wrote this recently, not as a blog post, but for another purpose. I figured I’d post it here like I do everything else.

Re•search: noun. Investigation or experimentation aimed at the discovery and interpretation of facts, revision of accepted theories or laws in the light of new facts, or practical application of such new or revised theories or laws.

—Merriam-Webster Dictionary.

The last part of that definition has always been the chief motivator for me in my research — practical application. While all research seeks to discover universal truths and deeper meaning, I strongly believe that researchers have a responsibility to contribute to society in other tangible ways as well.

Just as Gutenberg’s invention of the printing press in 1439 made literary works accessible to everyone, the Internet is likewise speeding up the propagation of knowledge now and will continue to do so in the decades to come. We are on the brink of a cultural revolution where ideas, prototypes, discussion and research know no boundaries of location or time. The Free Software Movement is promoting users’ freedom to understand and explore computer programs. The Creative Commons project encourages authors, scientists, artists and educators to distribute their creations under licenses that foster the sharing of ideas, encourage discussion, engender a culture of openness, and speed up innovation. This provides enormous opportunities for researchers to collaborate in real-time across institutions, countries and continents, and to serve the community by disseminating their research results via public blogs, videos, slides, prototypes and designs.

As a researcher in Human-Computer Interaction at the Dept. of Computer Science at Virginia Tech, I have developed several tools and prototypes that would be of benefit not just to researchers but also to computer users. It is by studying their habits that I designed these tools — to them, I owe these tools. I work in the area of Personal Information Management (PIM), and study how users access and manage information such as files, calendars, email messages, contacts and bookmarks on multiple devices. I release all such tools and software to the world at my web site under licenses that permit anyone to inspect the source code, build upon it, and benefit from it.

In the process of my research, I developed a program to access Google Calendar which now has over 25,000 users. A calendar converter program I wrote is used by an average of more than 200 users per day. During Sustainability Week 2008 at Virginia Tech, I released a Blacksburg Transit Schedule application for cell phones to encourage Blacksburg citizens to take the bus instead of driving. It is used by about 300 users every month and growing.

In the spirit of working on real products that are used by real people, I interned at Google three times during my Ph.D. (2005, 2006, 2007). In 2007, my project enabled users of Google Book Search to clip personalized content from books and embed it into their own web site or blog. This enables teachers to excerpt from literary classics for their class home page, literature scholars to debate the nuances of texts, and commentators to dissect parts of books. My intern work was covered by several news outlets, chief among them Google’s Corporate Blog.

Academia encourages published work (publish or perish, they say), while original contributions such as new ideas and untested directions are undervalued in the traditional ways of evaluating research. The Internet changes that too. Several times when I have come up with ideas that may or may not be viable research projects, I have written about them on my blog. The public scrutiny and invaluable feedback I’ve received made it easy to separate the wheat from the chaff. My advisor has always been supportive of those ideas that turned out to be promising research directions: the latest among them resulted in a paper that has been nominated for the ACM SIGCHI Student Research Competition 2009.

An area that I have recently been concerned about is the open publication of raw data sets. I perform human experiments which are reviewed by the Institutional Review Board (IRB) for ethical compliance. There is inherent tension between the privacy implications of human experiments and the Open Science dream of being able to publish all experimental data publicly so that others may analyze it in novel ways. I plan to investigate the ethical, moral and legal responsibilities of such an endeavor, recognizing that we as researchers owe two allegiances: to our experiment participants and to the scientific community, in that order.

I am happy to be a researcher at a time in our history when competitive collaboration trumps closed confidentiality. Science and innovation can only progress faster when information is freely shared among researchers, scholars and citizens.

Book-as-Blog: Encouraging Reading by Posting a Chapter at a Time

17
Dec
2008

I realized I haven’t picked up a book (a non-academic book, that is) in weeks, but I’ve read more than my fair share of blogs in that same time. I wonder if part of the reason is the longer time commitment required by a book. This prevents it from being read quickly and keeps it forever on my wish list. If so, then how about a service that breaks down books into blog-post-sized chunks and publishes them every few days?

The idea is inspired by (nay, stolen from) Kevin Kelly, who is reissuing his 10-year-old book as a blog (hat-tip to Seth Godin’s post on the topic). His reasons are different, though. The book is out of print, and is already available as a downloadable PDF from his web site. Making it available as a blog is just another way of spreading his ideas wider, which is a great idea.

But apart from that, I like the idea of chopping up a book into chapter-sized chunks and making them available to readers one at a time. Not for any economic reasons, but because attentional resources are so scarce these days. A few times during the day, I have some free time which I use to read a few blog posts. If I ever thought about picking up a book during these breaks, I wouldn’t do it, simply because of the (arguably artificial) time commitment issues it raises in my mind. But talk about a chapter-sized, or even smaller blog post, and I’d read it.

Of course, not all book content has an affordance for this kind of slicing and dicing. If it takes several minutes for a reader to re-establish context from the last blog post, the purpose is lost. Some authors would consider their books a work of art too precious(ssss) to split up into anything smaller. That’s also the reason why bands are often reluctant to sell singles instead of entire albums (apart from the record labels preferring to sell you 9 lame tracks bundled with 1 great track for $10 instead of $1, thank you very much.) But several non-fiction books could verily adapt to such a format.

The book-as-blog need not be free (as in no charge.) Sure, charge me for it. Implementation would be easy: charge me a micropayment and give me a secret watermarked feed URL. With so much new content licensed under Creative Commons licenses, it’s also possible to develop a web service that does this for liberally-licensed and public domain works. This is compatible with Creative Commons Attribution (BY), Attribution-ShareAlike (BY-SA), Attribution-Noncommercial (BY-NC), and Attribution-Noncommercial-ShareAlike (BY-NC-SA) licenses (but I’m not a lawyer, this is not legal advice, blah blah.)
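As a rough sketch of what a "secret watermarked feed URL" could look like, the snippet below derives a per-subscriber token with an HMAC, so a leaked URL can be traced back to the account it was issued to. The domain books.example.com, the URL layout, the secret and all function names are assumptions made up for illustration, not part of any existing service.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real service would load it from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def feed_token(subscriber_id: str, book_id: str) -> str:
    """Derive a per-subscriber token that cannot be forged without SECRET_KEY."""
    message = f"{subscriber_id}:{book_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def feed_url(subscriber_id: str, book_id: str) -> str:
    """Build the private feed URL handed to a paying subscriber (made-up layout)."""
    token = feed_token(subscriber_id, book_id)
    return f"https://books.example.com/feeds/{book_id}/{subscriber_id}/{token}.xml"


def is_valid(subscriber_id: str, book_id: str, token: str) -> bool:
    """Check a token from an incoming request using a constant-time comparison."""
    return hmac.compare_digest(feed_token(subscriber_id, book_id), token)
```

Because the token is tied to the subscriber ID, a feed URL that shows up somewhere it shouldn’t identifies whose subscription it came from.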

Maybe something like this will finally get me back to the several-books-a-month club I used to be a member of, until I discovered this newfangled shiny thing called the Internet.

Email should have Expiration Dates

2
Nov
2008

The entire idea behind this blog post has been summed up in the title, so all I need to do now is to explain why I think email should have expiration dates, and how that would make personal information management better.

Email, as we all know, started off as a way of sending short messages to colleagues within a department. It has since evolved into a monster of a tool that does everything it was never designed to do. The paradox is that it is exactly the kinds of messages that email was designed to handle that cause me the most trouble these days.

  1. I often receive email from my friends about meeting up for lunch. This is important, but only for that particular day (and even then, only if I receive it before lunch time).
  2. My research collaborators send me email when a paper submission deadline is near, with the draft attached to it. Those emails are not nearly as important after the deadline.
  3. My friends and I exchange travel plans over email, but is it as useful after the trip is done?

These are the kinds of messages I’m talking about: important but time-sensitive. Then there are others which are not really important, but simply one-time notifications that I can take action on and then forget (“bill is due in 2 days”, “X added you as a friend”, “your order was received”, “your package has shipped”, “free donuts in break room”, “we are not meeting today”, etc.)

Why do they linger on in my mailbox for years? They become indistinguishable from the really important email that I need to save for years, such as some very interesting and intelligent discussions I have had with others. Note that I’m not including spam in this discussion, because in my opinion, there are adequate spam-filtering tools circa 2008 that perform well enough for most users for the most part with an acceptable false positive rate. Not perfect, but acceptable.

The Keeping Problem

Email is no longer ephemeral: people hold on to their email for years. This is what results in the Keeping Problem in Personal Information Management: there is so much information coming at us that we don’t want to spend the time to decide what to keep and what to trash, so we end up keeping all of it. We hope we never have to do spring cleaning, and instead rely on search to find what we want.

Filing is not the answer

Many people file and tag their email, but is the cost of doing so (in time as well as attention) worth the payoff at the end? Consider the two alternatives: spending 10 minutes each day filing your email (about five hours a month), versus spending an hour a month looking for that one email. When you are swimming in a sea of email that shows no signs of abating, the second alternative quickly starts looking better.

Same needle, bigger haystack

The bigger the haystack grows, the harder it is to find the needle. The solution is to reduce the size of the haystack. Automatically. Most other solutions empower the user to filter, sort, file, tag and do other sorts of things to their email that do not scale very well. That’s where Email Expiration Dates come into play. For it to work, they need to be (1) defined and (2) honored.

Defining an Email Expiration Tag

Email expiration tags can be defined in several ways by several entities that handle the email message at some point of time in transit.

  1. By the sender of that email who cares about the recipients;
  2. By the email client (MUA) used by the sender, automatically inferring from certain common-sense words; e.g. subject contains lunch and body is less than 100 bytes;
  3. By the email server software that intelligently tags email based on common patterns seen across multiple users;
  4. By the recipient’s email client, based on heuristics;
  5. By the recipient’s email client, based on a user-defined rule set;
  6. Or explicitly by the recipient in a spring cleaning session.
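The second option in the list above (the sender’s mail client inferring an expiration from common-sense cues) can be sketched in a few lines. This is only an illustration of the "subject contains lunch and body is less than 100 bytes" rule; the function name and the 2 p.m. cut-off are assumptions of mine.

```python
from datetime import datetime, time
from typing import Optional


def infer_expiration(subject: str, body: str, sent_at: datetime) -> Optional[datetime]:
    """Guess an expiration time for an outgoing message, or None if unsure."""
    short_body = len(body.encode("utf-8")) < 100
    if "lunch" in subject.lower() and short_body:
        # Assumption: a lunch invitation is stale by 2 p.m. on the day it is sent.
        return datetime.combine(sent_at.date(), time(14, 0))
    return None
```

A real client would need many such rules, and should probably show the inferred date to the sender rather than apply it silently.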

Honoring an Email Expiration Tag : Fully standards-compliant

RFC 822 allows custom tags (Sec. 4.7.5). These are commonly referred to as X- headers, since the specification requires that all such tags be prefixed with “X-”. Many applications built on email make use of such tags: mailing lists use the X-List-* headers to specify the list name, subscribe URL and unsubscribe URL in a mail message. Spam filtering software such as SpamAssassin assigns a score to each email, saved as an X- header. Mail clients are free to interpret these tags as they see fit.
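To make the X- header idea concrete, here is a minimal sketch using Python’s standard email package. The header name "X-Expires" is a hypothetical choice for this example; no standard or mail client is known to me to define it this way.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime, parsedate_to_datetime

# "X-Expires" is a hypothetical header name chosen for this sketch.
EXPIRY_HEADER = "X-Expires"


def tag_with_expiration(msg: EmailMessage, expires_at: datetime) -> None:
    """Attach the expiration date in the standard email date format."""
    msg[EXPIRY_HEADER] = format_datetime(expires_at)


def is_expired(msg: EmailMessage, now: datetime) -> bool:
    """Messages without the header never expire."""
    raw = msg.get(EXPIRY_HEADER)
    if raw is None:
        return False
    return parsedate_to_datetime(str(raw)) <= now


msg = EmailMessage()
msg["Subject"] = "Free donuts in break room"
msg.set_content("First come, first served.")
tag_with_expiration(msg, datetime(2008, 11, 3, 17, 0, tzinfo=timezone.utc))
```

Clients that don’t understand the header simply ignore it, which is what makes this approach backward compatible.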

An expired email will not be automatically deleted if the user does not want it to be. This is important for archival purposes and to satisfy the stringent reporting requirements of the Sarbanes-Oxley Act. But now the user can make a one-button choice about whether expired emails should be deleted, archived, moved away or kept around.

With help from legitimate bulk email senders (not spammers)

Bulk mail such as Facebook notifications could have expiration dates set to “one week after receipt”. Bill reminders could set the expiration date to be “2 days past deadline” (and then send another notification if payment is not received by then.) Donut announcements could expire at the end of the day. Talk announcements could expire at the end of the talk.

Fixing the post-vacation blues

Returning from a vacation is no longer refreshing, as we are thinking about the sheer volume of email we need to process once we get home. If I was on vacation when the donuts were on the table, I should not be bothered about it when I return. Go away! If it’s an invitation to a talk that happened while I was away, I don’t need to hear about it now.

What will it take for adoption?

Defining a standard is no use if it isn’t used. The best way for such a solution to be adopted is for a major email provider to implement it themselves, perhaps in a limited beta. On the interface side, this requires two additions: one for sending, one for processing received messages. The widget at the sender’s end is simply a calendar picker, or a drop-down with relative dates (“tomorrow”, “next week”, etc.). At the receiving end, it’s a three-way radio button that lets users “Delete”, “Archive” or “Leave alone” expired messages.
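Both widgets are simple enough to sketch. The snippet below assumes a hypothetical drop-down with two relative dates and models the three-way choice as an enum; the names and the (subject, expiration) message representation are illustrative only.

```python
from datetime import datetime, timedelta
from enum import Enum

# Hypothetical labels for the sender-side drop-down.
RELATIVE_DATES = {
    "tomorrow": timedelta(days=1),
    "next week": timedelta(weeks=1),
}


class ExpiredAction(Enum):
    DELETE = "Delete"
    ARCHIVE = "Archive"
    LEAVE_ALONE = "Leave alone"


def resolve_expiration(choice: str, sent_at: datetime) -> datetime:
    """Turn the sender's drop-down choice into an absolute expiration date."""
    return sent_at + RELATIVE_DATES[choice]


def sort_mailbox(messages, action, now):
    """Split (subject, expires_at) pairs into inbox, archive and trash lists."""
    inbox, archive, trash = [], [], []
    for subject, expires_at in messages:
        expired = expires_at is not None and expires_at <= now
        if not expired or action is ExpiredAction.LEAVE_ALONE:
            inbox.append(subject)
        elif action is ExpiredAction.ARCHIVE:
            archive.append(subject)
        else:
            trash.append(subject)
    return inbox, archive, trash
```

Note that a message with no expiration date always stays in the inbox, whatever the user picks.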

Till then, it’s back to manual spring cleaning. Oh well.

Acknowledgments: I have had several stimulating discussions with my advisor, Manuel Pérez-Quiñones, and my colleague, Pardha Pyla, about our respective email filing strategies (discussions that mostly began as venting sessions). This idea no doubt borrows from my analysis and conclusions based on some of those conversations.

Why I love working here!

27
Aug
2008

When most professors have closed-door policies and need weeks of lead time before being able to schedule a meeting, here’s why I love working here!

Who's Online?

Separating Phone Numbers from Phones

17
Jan
2008

Last night, I left my cell phone in my car. As with most of my follies, I realized it a few oh-no-seconds after I got home, but only after I’d taken off my jacket, gloves, cap, shoes and socks. It was an unnecessary walk in below-zero temperatures, but it got me thinking about phones, identities, what’s wrong about it all, and how it could be made better.

The problem is this: phones and phone numbers are tightly coupled together [1]. No wonder people keep their phones close to their heart — their personal identity is locked in it. If I don’t carry my phone, there’s no way to answer calls that I receive at that phone number. I can perhaps check voicemail from another phone, but still cannot make and receive phone calls under my own phone number.

Now compare this to email: if you go on a vacation without your own laptop computer, it is still possible to “borrow” someone’s random computer and check your messages. The messages you send will have your ID (your email address) attached to them, and the people you interact with will have no idea what machine you used (and there is no need for them to know.)

Why can’t we have a phone identity (our phone number) separate from the device (our phone) that is used to access it? If I forget my phone in the car overnight, I should be able to just add my phone identity to the home phone. That way, all calls that would have been received by my handset in the car will now be received at my home phone, and callers/callees will not know a thing. The next morning, I would re-establish my identity on my cell phone, and things will be back to usual.

I’m not a big fan of call redirects: that puts a temporary bandage on the problem instead of actually solving it. I don’t want my identity routed to another identity: I want to be able to use my own identity wherever.

This would also open up the market for multiple-identity phones. A couple can add both their identities to a single home phone in the evening, while they carry individual cell phones during the day. Forgot your cell phone at home? No problem, just borrow a loaner phone from the office receptionist and use it all day long (just as you would borrow a loaner security badge if you forgot yours). It would also make it easy for a group of people to be able to respond to a single phone call, e.g. despatch services for emergencies. A group of doctors could share a single phone number. Whoever is on emergency call duty would add the group phone number to his/her cell phone, and remove it after the duty ends.

Historically, a phone number has been tied to a phone, mostly because of technical constraints, beginning with the days of the human-operated telephone exchange. Email has shown that identities (email addresses) can be independent of devices (computers), that many identities can share a device, and many devices can be used by a single identity.
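The many-to-many relationship between identities and devices is easy to model as a data structure, even if the infrastructure behind it is not. Here is a toy sketch; the class, the device names and the 555 numbers are all made up for illustration.

```python
from collections import defaultdict


class PhoneRegistry:
    """Tracks which phone numbers (identities) are active on which devices."""

    def __init__(self):
        self._devices_by_number = defaultdict(set)

    def attach(self, number: str, device: str) -> None:
        """Add an identity to a device, e.g. my number to the home phone."""
        self._devices_by_number[number].add(device)

    def detach(self, number: str, device: str) -> None:
        """Remove an identity from a device; a no-op if it was not attached."""
        self._devices_by_number[number].discard(device)

    def route_call(self, number: str) -> list:
        """Return every device that should ring for a call to this number."""
        return sorted(self._devices_by_number[number])

    def identities_on(self, device: str) -> list:
        """Return every number currently answered by this device."""
        return sorted(n for n, devs in self._devices_by_number.items() if device in devs)
```

With this, attaching "555-0100" to both "cell-in-car" and "home-phone" makes both ring, and detaching it from the cell phone the next morning does not affect any other identity on the home phone.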

It’s an easy conceptual step forward to move to the many-to-many model instead of the current one-to-one. But there is a tremendous amount of change required of the infrastructure, and it won’t be cheap. Since I don’t happen to be in the business of implementing it (at least not yet!), I’ll just write about this idea and hope that someone picks it up. Maybe someone will listen, and like it, and implement it.

Then I won’t have to walk out in the $#@*%$#^ snow to fetch a %$#%#$* cell phone.

[1] The more pedantic among us will point out that GSM phones keep the user’s identity on a SIM card, and CDMA phones maintain a single ID tied to the hardware serial number (ESN or MEID) of a phone. Although possible, that does not make swapping identities across phones easy: in the first case, you must have your current phone handy, which does not help solve my problem of having left the phone in the car overnight, and the second one requires a long phone call to the carrier to make the change. Neither is as quick or handy as the method I envision.

An awesome “prank” on the Virginia Tech campus

21
Oct
2007

I received the following email a few minutes ago, complete with fake headers and the works, and it is formatted exactly the same way as the regular email we get from these folks. It’s probably viral marketing for the upcoming game, Portal, releasing November 23, 2007. There are lots of references to it in the text.

1. UNDERGROUND HALLOWEEN ADVENTURE
2. BOBBY FISCHER – ENDED THE SOVIET CHESS HEGEMONY
3. SELECTING YOUR CABLE COMPANY IN BLACKSBURG
4. PI EATING CONTEST
5. POSSIBLE BAG BAN
6. DONALDSON-BROWN LOCKS TO BE CHANGED
7. ODD – OPEN DOOR DAY
8. MICROSOFT VISTA SERVICE PACK DEMO
9. WEEKLY SPEAKER SERIES
10. REGISTRATION FOR DEAN’S FORUM ON HEALTH, FOOD AND NUTRITION
11. STUDY PARTICIPANTS NEEDED

1. UNDERGROUND HALLOWEEN ADVENTURE
A Halloween tour of the steam tunnels beneath campus will be offered for the first time this year to four groups of eight people on Oct. 29th and 30th. Sign-up for each of the four tours will begin on Monday, October 22nd, and continue until all places are taken. Interested parties should contact Richard McCoy at 231-3200 for more information.

2. BOBBY FISCHER – ENDED THE SOVIET CHESS HEGEMONY
Monday, Oct 22, 5:30-7:00 in Williamsburg Rm, 7:00-8:00 in Haymarket Theater in Squires Center The man who ended the Soviet chess hegemony by defeating Boris Spassky will speak at Virginia Tech. A reception will precede his presentation at 7:00pm. Robert James “Bobby” Fischer is a United States-born chess Grandmaster who in 1972 became the only US-born chess player to become the official World Chess Champion. Fischer’s victory during the Cold War caused a great interest in chess and is responsible for the swelling of members of the World Chess Federation.

3. SELECTING YOUR CABLE COMPANY IN BLACKSBURG
Sometime between Tuesday, Oct 23 08:00am and next Friday, Nov 2 7:00pm in Room C in the GLC Are you interested in purchasing a subscription package from your local cable company? Presenters from NTC Communications Comcast Digital Cable and Cox Communications will talk about the different internet, phone and cable packages available and answer questions about rates and programming.

4. PI EATING CONTEST
Tuesday, Oct 23, 7:00pm in Room F in the GLC the VT Math club is sponsoring a Pi festival. Approximately 3,141 pies will be available for sampling. They will include but are not limited to Apple, Banoffee, Banana cream, Blackberry, Blueberry, Cheesecake, Cherry, Chestnut, Cream, Custard, Grape, Lemon meringue, Peach, Pecan, Pumpkin, and Rhubarb. In addition, at 7:30 there will also be a pie eating contest. The first contestant to eat an irrational number of pies will receive a hand-carved Penrose triangle.

5. POSSIBLE BAG BAN
Due to the heightened security of many university campuses, a possible ban of all bags on campus may be implemented in the next two weeks. Backpacks, duffels, shoulder-bags, and purses may soon join the list of items prohibited on campus. This measure has been proposed since it has been pointed out that bags may be able to conceal already illegal items. An unlikely supporter of the ban is the campus Health and Safety Department as it would also alleviate the troubling phenomenon of overweight book bags that commonly lead to health problems later in life. Acceptance of the proposal will be decided by the campus Board of Directors later this week.

6. DONALDSON-BROWN LOCKS TO BE CHANGED
It has come to the attention of university security personnel that many graduate students have access to the GLC 24 hours a day. In order to remedy this threat to campus security, all doors to the GLC will have their locks changed between Monday evening and Tuesday morning. In addition, Donaldson Brown dorm rooms will also have their locks changed on a short rotation. You may need to request a new room key from your Residential Fellow.

7. ODD – OPEN DOOR DAY
To help promote social interaction amongst the graduate students, Thurs, Oct 25, will be open door day. Graduate students on campus are encouraged to keep their door open and meet their neighbors as well as their Residential Fellow if they have not done so already. We are aware that the doors in the GLC rooms close on their own, this is why you have been provided with doorstops. Use them! Hopefully open door day will become more routine and no longer considered odd.

8. MICROSOFT VISTA SERVICE PACK DEMO
Wednesday, Oct 24, 6:00-7:00pm in McBryde 666, Microsoft will be giving an exclusive preview of service pack one for Vista. In response to the massive number of problems, compatibility, and stability issues in Vista, Microsoft has spent the past year fervently addressing these issues in the much anticipated service pack 1 (SP1). Representives from Microsoft will demonstrate the features and stability changes of SP1, such as the newly bolstered DRM software. This update and others in SP1 that will be demonstrated should help provide Vista users with new enhanced reduced functionality.

9. WEEKLY SPEAKER SERIES
Friday, Oct 26, 4:00-5:00pm in Room F in the GLC Faculty speaker: Dr. Henry Warren – Physics, on Structure of the Proton. Graduate students and faculty from across the university present weekly their teaching and research passions in a casual, coffee house atmosphere. Free coffee and pastries served from 3:45pm.

10. REGISTRATION FOR DEAN’S FORUM ON HEALTH, FOOD AND NUTRITION
Registration for the Nov 5 forum is now open. This forum will showcase health, food, and nutrition efforts in research, extension/outreach, and teaching currently underway at McDonalds, Kraft Foods, Monsanto, and LuthorCorp. Register by Sunday, Oct 28 if you plan on attending the event. Sponsors will showcase the health benefits of the latest developments in GMOs, growth hormones, preservatives, artificial sweeteners, hydrogenated oils, flavoring and texturizing food additives. For more information, including registration links, and to view the Forum agenda, please visit http://www.mcvideogame.com/index-eng.html

11. STUDY PARTICIPANTS NEEDED
A graduate student researcher working on behalf of Aperture Science is seeking highly-motivated individuals in good physical condition between the ages of 18-25 for her study. Participants will be asked to perform complex tasks. The entire study should last a minimum of 3 hours and moist, delicious cake will be served upon successful completion of the test. For further information or to sign up to participate, please contact Glados, glados@aperturescience.com

The Case for Decentralized Social Networks

3
Oct
2007

This article was originally written October 3, 2007 and published here before OpenSocial was announced. With this blog post, I’m moving it to my blog to avail of features such as commenting and cross-linking with other related posts. I have not edited it since the original writing; if I do, edits will appear as updates marked as such.

Social networks are currently walled gardens: you need an account on multiple social networking sites to be able to interact with all your friends. This article makes a case for opening up the core protocols that define person-to-person interaction (decentralized networks) and various aspects of your public personality (decentralized applications). It is possible to use a few well-known semantic Web protocols and microformats to break down the walls and make the Internet a true social network.

Social networking Web sites are currently walled gardens. If you’re on MySpace, you cannot communicate with Facebook users or Orkut users. Although the features provided by most sites are comparable, if not equivalent, one must have an account on each of these sites to interact with members from that community.

The Motivation

That is not how social networks work in the real world. I do not need to be a citizen of a country or a follower of a religion to converse with the members of that country or religion, respectively. OK, this is a far-fetched analogy, but consider email.

The Evolution of Email

Before email as we know it today was in wide-spread use, the earliest way to send a message to anyone using a computer was simply to drop a file in their home directory. You could thus only send a message to users of the same computer as you. Ray Tomlinson came up with the idea of addressing users using the “@” sign, so email could be sent across computers. In the opinion of Jon Postel, this was a nice hack that finally evolved to an IETF RFC.

Today, we are able to address email to anyone on any network that’s connected to the Internet. Their ISP, or operating system, or mail transfer agent (MTA) or mail user agent (MUA; commonly referred to as a mail client) has no bearing on whether they will receive our email or not. The diversity in the email ecosystem allows me to receive, download and view my email in exactly the way I want.

Fast forward to Instant Messaging

Instant messaging evolved similarly, with ICQ, AOL, Yahoo!, and Microsoft all developing their functionally-equivalent, but non-interoperable protocols for essentially the same task. You had to have an account with each of those providers to be able to talk to their users. Along came XMPP and Jabber, followed by the development of an IETF RFC for instant messaging, which has now found support in commercial products such as Google Talk. XMPP does not require users to have accounts on multiple servers; if you have an account on one XMPP server, you can chat with any user on any other XMPP network (provided other prerequisites such as authorization are met.)

Why this makes sense for Social Networking

Social networking is no longer one of the fringe activities on the Web. There are several Web sites that purportedly do the same thing (and I’m too lazy to list them all.) The point is, social networking is now becoming a conduit rather than a destination. Much of our online time is spent on social activities, and the importance of individual users and their individual contributions taken together is increasing.

So what would it look like?

A decentralized social network would let users sign up at whichever Host site they prefer (just as you can sign up with any email provider today.) They would be able to participate and interact with users of any other such Host site, with no additional signing up to do. They would be able to create a profile that best reflects their motivation in signing up: a college student may sign up at a Host that allows him to display his classes and academic interests, while a professional may choose a Host site that emphasizes her skills and experience. Applications running on any Host site will have access to the Friends List of the user account they are running under, even across Host sites. A user’s profile may even be fragmented across multiple Hosts, with each Host hosting a particular aspect, or type of content for the user.

Advantages for Users

  1. You don’t have to sign up at multiple sites.
  2. You can choose a Host site that best suits your personality. If you like a casual, “explosion-in-a-media-factory” look, sign up for MySpace. If you prefer a more professional look, go for Facebook. If you want to expose more professional data than personal data, LinkedIn is your Host. If you would like to express your affiliation with your company or non-profit, use their Host site as your primary home.
  3. If none of these suit you, just roll out your own Host site that hosts exactly one profile: yours. That won’t prevent you from being part of the larger social network.
  4. Since individual Hosts manage their users’ profiles, privacy can be controlled better. You will retain the choice to pick a service that best matches your privacy expectations.

Advantages for Developers

  1. When a new social networking site announces their own API and protocols, developers won’t have to scamper to port their existing app to it. They simply continue hosting it themselves, and the newcomer Host will simply talk the same language out-of-the-box.
  2. Developers will also have the flexibility of tailoring their interface whichever way they want — they will not be required to adhere to the strict interface guidelines of individual sites.
  3. Developers who rely heavily on eyeballs and advertising can continue to host their own content, not subject to a third party’s terms of service.
  4. Expert users may choose to develop apps for themselves. These one-offs will be easy to integrate with that particular user’s profile.

Advantages for Hosts

  1. Host sites will be able to position themselves in a market better, and distinguish themselves from other offerings in a better way than existing sites can.
  2. The network effect will no longer be the dominant reason for users picking one social networking site over another, and such sites will have to compete on real features and good design, rather than simply “because all my friends are here”.
  3. Closer ties between users and hosts will enable them to tailor their services to the particular class of users they attract.

A Change in Philosophy

A few things will need to be re-thought, because, in a decentralized network, there are better alternatives to existing ways of doing things.

Disseminating and aggregating specialized content

We have seen specific websites and companies excelling in managing different types of data. Flickr specializes in photos, YouTube in videos, Blogger in blog posts, Twitter in one-line “twits”. All-purpose sharing websites such as Facebook also let you upload and embed all these types of content. What is the use of duplicating this content on each individual social networking site?

In a decentralized network, my blog could stay at Blogger, my photos on Flickr, and my videos on YouTube. My personal profile is simply an aggregation of these multiple aspects of my personality. What’s more, to design my own profile, I could just pick and choose the “modules” I want from a palette of available syndication options. (In fact, my own website is already designed like that: content you see here is aggregated from Twitter, Flickr, and FeedBurner, plus a few hosted pages.)

Developers can concentrate on what they do best, and outsource the rest of it to experts in individual areas. Photo album designers will not have to reinvent Flickr, and video distributors can simply leech YouTube’s bandwidth for their hosting.

Profile information can be mashed up

A user’s profile information can easily be mashed up for quick one-off applications. For example, if I need to create a list of all my friends from a particular group to print greeting cards, I do not need to write an application, submit it to Facebook and wait for their approval. I simply deploy it to my own Host’s server and get done in the time it takes to write “SELECT * FROM Friends WHERE Group = ‘christmas-cards’” (oh, and I would totally pick a host that provides a SQL interface for social data!) I can have an address book that integrates with my web-based email client, that maintains an updated list of email addresses of all my friends, of course pulled from their individual profiles.
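For the curious, here is roughly what that one-off could look like against a Host that exposes a SQL interface. This is a sketch using an in-memory SQLite database; the Friends table, its columns and the sample rows are invented for illustration. One wrinkle: GROUP is a reserved word in SQL, so the column name has to be quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# GROUP is a reserved word in SQL, so the column name is quoted.
conn.execute('CREATE TABLE Friends (name TEXT, email TEXT, "Group" TEXT)')
conn.executemany(
    "INSERT INTO Friends VALUES (?, ?, ?)",
    [
        ("Alice", "alice@example.org", "christmas-cards"),
        ("Bob", "bob@example.org", "coworkers"),
        ("Carol", "carol@example.org", "christmas-cards"),
    ],
)


def friends_in_group(connection, group):
    """Return (name, email) pairs for one group, using a parameterized query."""
    rows = connection.execute(
        'SELECT name, email FROM Friends WHERE "Group" = ? ORDER BY name',
        (group,),
    )
    return rows.fetchall()
```

Calling friends_in_group(conn, "christmas-cards") returns the rows for Alice and Carol, ready to be fed to a mail-merge.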

Rethinking privacy and authorization

Authentication is easy (we’ll look at that soon.) Authorization is hard. But this is a problem that should be easy for public-key cryptography to solve. I’m not a cryptography expert, so anything I say here will be wrong. But I trust that if the experts put their mind to this, it shouldn’t be too bad to solve without having Alice, Bob and other alphabet-soup-inspired characters to make all their keys public.

Enhanced Search

To some, this may sound like a gross invasion of privacy, but in fact, deciding what information should be public, and making that publicly-accessible information searchable, are two different problems. Privacy gate-keepers at each Host will decide what content to make publicly accessible. Once that decision has been made, all the major search engines can index the public information (without having access to any of the private stuff.) Google made the Web searchable. A search engine for The One Social Network will make the world’s population searchable.

The Ground Work

A quick analysis of what’s required to make this happen makes us realize that much of the groundwork has already been laid.

Representing People and Relationships

The chief contribution of the recent boom in social networking is the recognition of the Person as a first-class entity on the Web. Earlier, the only way to represent a person on the Web was via her home-page. But that, too, was a static representation, largely disconnected from the activities and evolution of that person.

A recent push towards including semantic markup in Web pages has led to the development of microformats, a light-weight method of marking up entities within Web content in terms of loosely defined formats that do not interfere with the already-existing presentation duties of HTML. There is the hCard microformat defined for representing a person. The XHTML Friends Network establishes a format for indicating relationships among individuals on the Web. A lot of users and Host sites have made their pages XFN-Friendly, i.e., they have added semantic markup to the lists of their friends to indicate relationships.
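XFN expresses relationships as values of the rel attribute on ordinary links, which makes them easy to extract with nothing but the standard library. The sketch below is illustrative: the sample markup and URLs are invented, and a real parser would also need to filter out rel values that are not part of XFN (such as nofollow).

```python
from html.parser import HTMLParser


class XFNParser(HTMLParser):
    """Collects (url, relationship values) from links that carry a rel attribute."""

    def __init__(self):
        super().__init__()
        self.relationships = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        if rel and href:
            # rel holds space-separated values, e.g. "friend met".
            self.relationships.append((href, rel.split()))


# Invented sample markup in the style of an XFN-friendly friends list.
SAMPLE = """
<ul>
  <li><a href="http://alice.example.org/" rel="friend met">Alice</a></li>
  <li><a href="http://bob.example.org/" rel="colleague">Bob</a></li>
  <li><a href="http://example.org/about">About</a></li>
</ul>
"""

parser = XFNParser()
parser.feed(SAMPLE)
```

After feeding the sample, parser.relationships holds one entry each for Alice and Bob, and the plain "About" link is skipped.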

Representing Activities

Blogs and twits have emerged as easy ways for people to broadcast their activities to whoever is ready to lend an interested ear. There already are standards that help people share these activity logs in standard formats: Atom and RSS.
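Because these formats are standard, merging activity from several sites into one timeline takes very little code. Below is a minimal sketch for Atom only, with inline sample feeds standing in for real ones; the feed contents, source labels and function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def atom_entries(feed_xml: str, source: str):
    """Return (updated, source, title) tuples for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        updated = entry.findtext(f"{ATOM}updated", default="")
        entries.append((updated, source, title))
    return entries


def merged_timeline(feeds):
    """Merge entries from several feeds, newest first."""
    items = []
    for source, feed_xml in feeds.items():
        items.extend(atom_entries(feed_xml, source))
    # Assumption: timestamps share one format and UTC offset, so string order is time order.
    return sorted(items, reverse=True)


# Invented sample feeds standing in for a blog and a photo stream.
BLOG = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Blog post</title><updated>2008-11-02T10:00:00Z</updated></entry>
</feed>"""
PHOTOS = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Older photo</title><updated>2008-01-17T08:30:00Z</updated></entry>
  <entry><title>Newer photo</title><updated>2008-12-17T12:00:00Z</updated></entry>
</feed>"""
```

A real aggregator would fetch the feeds over HTTP, handle RSS as well, and parse the timestamps properly instead of relying on string order.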

Representing Personal Information

Again, microformats have been defined for such diverse things as user-posted reviews, calendar entries, résumés, addresses, geographical location information, with a whole lot of other discussions in progress. The mother of all social networking artifacts, tagging, has also been microformatized.

Communicating Across Diverse Websites

Many sites these days are opening up their APIs for external applications to access and modify users’ data over the Web. SOAP, XML-RPC and other, more formal protocols have given way to REST (Representational State Transfer) as a light-weight software architecture for distributed systems. With RESTful websites, it is easy for independent applications to modify data stored on servers: examples include Google’s GData APIs for many properties, Flickr’s API for accessing photos and metadata, Twitter’s API for posting twits, and many other services.

Distributed Authentication

Systems such as OpenID are emerging as viable standards for truly distributed authentication and identity management. There is no reason why an OpenID-based system cannot be used for the Network We Talked About. If we throw in the ability for Hosts to share authentication lists, that would make all Hosts available to all Users, and the question of having to “pick” a particular host may be moot.

What’s Missing?

Communication Protocols for Posting Messages Across Hosts

REST is here, but it only defines the transport architecture. A RESTful communication protocol will have to be developed for users to be able to post messages to other users on other Hosts. Nothing monumental, but just one thing that needs to be done.
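As a rough sketch of what such a protocol could look like, the snippet below builds (but does not send) a request that posts a message to a user on another Host. Everything specific here is an assumption of mine: the /users/<name>/messages resource layout, the JSON body, the example hostnames, and the function name build_message_request. Authorization is left out entirely.

```python
import json
from urllib.request import Request


def build_message_request(recipient_host, recipient, sender, text):
    """Builds (without sending) a POST to a hypothetical inbox resource."""
    # Assumed layout: each user exposes an inbox at /users/<name>/messages.
    url = f"https://{recipient_host}/users/{recipient}/messages"
    body = json.dumps({"from": sender, "text": text}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_message_request(
    "host-b.example", "bob", "alice@host-a.example", "Hello from another Host!"
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

The point is only that the message becomes a new resource under the recipient, created with a plain POST; a real protocol would also have to say how the receiving Host verifies the sender.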

Representing Groups Across Hosts

Groups of users will need a way to be recognized across Hosts. A simple way of doing this would be a naming scheme that stays unique across the network, much as Usenet group names have been. A lot of the lessons learned from the design of Usenet can be used here, because today’s social networks are very similar to Usenet, with a few other goodies thrown in.
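Here is a toy sketch of such a naming scheme, modeled loosely on dotted Usenet-style names. The validation rule (at least two lowercase, dot-separated components) and the GroupRegistry class are my own assumptions, not part of any existing standard.

```python
import re

# Assumed rule: dot-separated lowercase components, as in Usenet-style
# names such as "rec.photo.digital".
_COMPONENT = re.compile(r"^[a-z][a-z0-9-]*$")


def is_valid_group_name(name):
    """Checks a name against the assumed dotted-hierarchy rule."""
    parts = name.split(".")
    return len(parts) >= 2 and all(_COMPONENT.match(part) for part in parts)


class GroupRegistry:
    """Toy network-wide registry mapping a group name to its Hosts."""

    def __init__(self):
        self._groups = {}

    def join(self, name, host):
        if not is_valid_group_name(name):
            raise ValueError(f"invalid group name: {name!r}")
        # The same name on two Hosts refers to the same group.
        self._groups.setdefault(name, set()).add(host)

    def hosts_for(self, name):
        return sorted(self._groups.get(name, set()))


registry = GroupRegistry()
registry.join("rec.photo.digital", "host-a.example")
registry.join("rec.photo.digital", "host-b.example")
print(registry.hosts_for("rec.photo.digital"))
```

Because the name itself is the identity, two Hosts that carry "rec.photo.digital" are carrying the same group without any further coordination.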

Current Efforts

Although we are far from this vision, some sites (mainly Facebook) seem to have started on this path. The Facebook platform was a unique step in allowing developers to access users’ profile information. Facebook is still a walled garden, though. In part to increase traffic, they have also taken baby steps in making users’ profiles available to search engines. MySpace profile pages are still very un-crawl-able. Flickr, Upcoming, and other Yahoo! sites use microformats extensively. Facebook provides RSS feeds of user activity.

Although these are steps in the right direction, they are not enough. Hopefully, we will reach a critical mass of social networking sites that adopt an open social network policy. Till then, you can find me at my many online haunts.

You know what I did last Summer?

6
Sep
2007

This, covered at the Official Google Book Search Blog.

While it is easy to share links, photos, videos, and opinions on the Web, sharing books with your friends online used to be tough — and tougher even, to share individual clippings from a book. This summer, I worked with the Book Search team to add clip-sharing features to Google Book Search.

You can now highlight a section of text in any public domain book in Book Search, create a clip from it, and share it with the world. You can post your favorite clips to your blog along with a personal annotation, collect them in a Google Notebook, or share them with friends anywhere you decide to embed the link. Your clip looks exactly as it appears in the book, or if you prefer plain text, we have that too.

Also at the Official Google Blog, about collecting, sharing and discovering new books.

We’ve also launched a way to let users, select, copy and embed segments of public domain books (like the Newton quote) in any web page. We hope to make it as easy to blog and quote from a book as it is from any web page. Like many innovations at Google, a stellar summer intern worked on this.

Of course, no project is a single-person effort: Bill Schilit, my mentor; Nathan Naze, JavaScript God; Adam Mathes, Venu Vemula, and the rest of the Book Search team laid the foundation and were an integral part of this feature.

Public Transit as a Third Place

Filed under: Academic, HCI, Thoughts
15
Aug
2007

Public transit seems to share many of the characteristics of third places, as Ray Oldenburg calls them in The Great, Good Place. Buses and trains are full of people from all walks of life having random conversations, and they bring several of the same people together with amazing regularity.

Sitting at a café as I write this, and having used public transit for the three months of my internship at Google, I’ve wanted to pen these thoughts down for a long time. Every morning and every evening, I used to hang out with the same set of people. Sometimes a few fresh faces would make their way onto the bus; sometimes one of the regulars would sleep in late and miss their bus.

Whenever I happened to take a later bus than usual, some time around noon, the commuter crowd would have shrunk down to a trickle, and most passengers would be headed to finish off errands, or simply out and about the Bay Area. These passengers had an even greater rapport with the bus driver: I’ve been part of thoroughly engaging conversations with these people, whose names I do not know, and probably never will. For them, it was a natural group that had formed because of their respective travel habits.

Public transit is markedly absent in America, but it is alive and kicking in most other countries: I’ve seen it in Dublin, I’ve seen it in Montréal, I’ve seen it in Bombay. The local trains of Bombay are the lifeline of the working population. The frequency and timeliness of the trains are something to be proud of (regrettably, the same cannot be said of the rest of the population.) Thus, groups of commuters who travel by the same trains day in and day out form their own cliques. There’s even a name in the local lingo for it: “train friends.” Just as you have family friends and work colleagues, this is a part of your social life that stays with you for a significant part of your life. You don’t visit the homes of your train friends; you hardly talk shop with them; and you hardly meet them outside of the commuter context. But the place is a third place, after all.

As with other third places, which thrive in Europe and much of the rest of the world but are lacking in America, the “place” of public transit follows the same pattern. In the USA, commuters are holed up in their oil-fueled cars and vans and SUVs, all the while blaming the other guy for causing all the traffic jams on the 8-lane highways. It seems obvious that this causes at least a small increase in the stress levels of the driver (though I can’t be bothered to look up a citation for that right now.) Compare that to urban populations elsewhere that share conversations on a bus or a train.

Ray Oldenburg could probably add “keeping stress levels low” to the ways in which third places affect the daily lives of those who inhabit them.

A Meeting with the Father of the Internet

Filed under: Academic, Google, HCI
15
Jun
2007

It’s not often that one gets to be in a meeting with Vinton Cerf — who’s credited as the “Father of the Internet”, and holds the official job title of “Chief Internet Evangelist” at Google. (No, I’m not kidding.)

Vinton Cerf, Father of the Internet

So when I was invited to a research meeting with him, my mentor Bill Schilit, and others at Google, I was totally in awe. Of course I can’t discuss what we talked about, but the little kid in me was awe-struck enough to want to write a blog post simply mentioning it! ;)

I remember having seen him first about 9 years ago. I was a sophomore at Bombay, and I heard from the ACM community that Vint Cerf was to give a talk at SNDT, Churchgate. It was examination time, and as hard as I tried, I couldn’t get anyone else enthused enough to make the hour-long journey to listen to Vint Cerf’s talk. In an engineering school with over 800 students, I had expected to find at least a few takers. None. Nada. Zilch. Everyone was too concerned about their examinations to find time to listen to the Father of the Internet. I gave up, took the train, and went to the talk, all alone. It was totally worth it; I still recollect his ambitious Interplanetary Internet project from back when he was at MCI.

I had never imagined that 9 years later, I would be attending a meeting with him. It’s not a dream come true, because I had never even dreamt it would be possible to share a table with Vinton Cerf.

A Proposal to Integrate Site-Specific Search Boxes into Browser Chrome

14
May
2007

Why do I have to search for the search box on any site I visit, before I can type my query into it? Given that almost every well-designed site has a search field, and it has been recommended as a good usability practice since 2001, why is it sometimes hidden deep inside the layout? Here is a suggestion for a change in the browser UI that will enable users to find the search box faster. Even faster than other suggestions so far. It only involves a little semantic markup on the part of page authors and some redesign on the part of browser makers.

The search box within the browser is underused.

The search box is one of the few basic design patterns omnipresent on the Web. It is also a de facto usability practice to place this search box towards the top right corner of the page. Yet, every site has it at a slightly different location (and some, even towards the bottom.) The user needs to search visually for the search box, or at least glance around until she finds it.

During this entire time, the search box within the browser chrome typically lies unused. It has a default search engine defined where it directs all queries typed into it. Why can’t the currently loaded website take advantage of this in-built search field for its own site-restricted search?

How it would work:

When a user is browsing a site that supports this feature, any searches conducted using the browser search box will send the queries to the site in question, and be able to display search results directly. When no site is loaded, the queries are directed to the user’s default search engine, just as it is now.

This proposal does not require any significant changes to any markup language — all that is needed is to enhance the markup with semantic knowledge, and microformats are just the answer! Simply marking up a <form> element with the CSS class "search" should be enough to tell the browser that this is a search form that should be promoted to the browser’s search box chrome.
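To illustrate, here is a minimal sketch of what the browser side might do: look for a form marked with the class "search", and route the query either to that form or to the default engine. The default engine URL, the example page, and the names SearchFormFinder and search_url are placeholders of mine, and the sketch assumes the form submits via GET to an action without its own query string.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Placeholder for the user's default search engine (an assumption).
DEFAULT_ENGINE = "https://search.example/find"


class SearchFormFinder(HTMLParser):
    """Finds the first <form class="search"> and its text input name."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.field = None
        self._in_search_form = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "form" and "search" in classes and self.action is None:
            self.action = attr_map.get("action") or ""
            self._in_search_form = True
        elif tag == "input" and self._in_search_form and self.field is None:
            # An <input> with no type attribute defaults to a text field.
            if (attr_map.get("type") or "text") in ("text", "search"):
                self.field = attr_map.get("name")

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_search_form = False


def search_url(page_url, page_html, query):
    """Returns the URL the browser's search box should load for a query."""
    finder = SearchFormFinder()
    finder.feed(page_html)
    if finder.action is None or not finder.field:
        # No marked-up form on this page: fall back to the default engine.
        return DEFAULT_ENGINE + "?" + urlencode({"q": query})
    target = urljoin(page_url, finder.action)
    return target + "?" + urlencode({finder.field: query})


page = '<form class="search" action="/find"><input type="text" name="q"></form>'
print(search_url("http://shop.example/items/1", page, "red shoes"))
print(search_url("http://blog.example/", "<p>No form here</p>", "red shoes"))
```

The fallback branch is what keeps today’s behavior intact on sites that have not added the markup.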

Prior work on similar problems

HTML 3.2 (yes, 3.2) defines a link relationship type for search pages. Adding <a href="/search" rel="search"> indicates that the outgoing link is to a search page. Browsers currently don’t do much with this information (please correct me if you are aware of a browser that does something intelligent with this information). OpenSearch is (in their words) a simple format for the sharing of search results. Useful as it is, OpenSearch is more geared towards large-scale general search engines, and browser makers are adopting it as a standard for letting their users pick and install search plugins in browsers.

However, being able to customize the search field on a per-site basis with zero configuration on the part of the user is not addressed by any of these proposals.

Addressing Potential Criticism

A few critics might argue that such usage dilutes the purpose of the search field (“It’s meant for searching via a search engine, not on a per-site basis”) or be concerned about possible user confusion (“Is my query being sent to a search engine or to the site I’m now visiting?”).

Considering an intentionality-driven approach to design, this UI is perfectly aligned with the user’s intentions. If a page has been rendered in a browser window, the user’s intention is likely to search within that site. If the user wanted to perform a search using a search engine, there is always the possibility of loading a new tab and then searching via the same field.

Issues of Mode

This also brings up the question of whether such a UI is inherently mode-based. (Modes in a User Interface are said to exist when a single input can result in two or more possible outcomes, depending upon the state of the UI at that point. It is generally considered bad design to employ modes in a UI because it invariably leads to user confusion.) In this case, it is arguable whether or not this UI employs modes.

The user task is “to search”, and the current site can be considered a specialized search engine (the specialization is that it only searches within its own site). Given that a lot of sites employ site-limited versions of generalized search engines (e.g. Google’s Custom Search Engines), this notion is not much of a stretch. Hence, I argue that these are not two different modes, but two different search providers for the same box, just as current implementations offer users a choice among Google, Yahoo, Altavista and others.

Indicating the currently active search providers in browser chrome

It is also easy to indicate the destination of the search queries in a visually accessible format. Safari (for example) displays the word “Google” in the search field. When a site-search box is displayed in its place, it is trivial to display the name of the site instead of the word “Google”. It is also trivial to reuse the favicon for the same purpose.

Why not?

What do you think about this idea? Comments, suggestions, enhancements appreciated!

Can’t Speak The English

Filed under: Academic, Life, Stupid
13
May
2007

This is a gem from a long time ago, when I used to be a Graduate Teaching Assistant (GTA) at Virginia Tech. In all three semesters, I received mostly positive reviews from my students, and I’m told that my scores were pretty high for an undergraduate class to award a GTA. Well, anyway, there’s always the occasional negative review. Here’s one. Any other comment I make on it will ruin it for you, so I’ll just shut up. :)

Cant Speak The English

A Tale of Two Interfaces

23
Oct
2006

Synergy, a mouse and keyboard sharing utility, has proven insanely useful to us users of multiple machines on a single desk. Think of it as a software KVM switch, but minus the “V” (for video.) You can arrange multiple machines side-by-side and Synergy seamlessly moves the mouse pointer and keyboard input from one machine to another at desktop boundaries. It’s a great idea and a great tool.

I use QuickSynergy on my PowerBook and Mac Mini, but later happened to look at the official GUI client on my friend’s Windows laptop. It’s not often that a user interface provokes a blog post on a Monday morning, but this was it.

Here are the screenshots:

QuickSynergy on Mac OS X: the main window, the client tab, and the About dialog.

Synergy on Windows: the main screen, followed by Configuration, Options, Hot Keys, Advanced Options, Auto Start, Info, Log, Running Test, and Started.

You will notice that QuickSynergy has exactly one dialog box (with two tabs, one to use when running as a server, and another when running as a client) plus one About dialog. Synergy has a total of 9 dialog boxes (plus one About dialog.) The question I wish the developers had asked themselves is whether throwing in a dialog box for every single configurable parameter was the right thing to do. It seems like the UI Designer(s) simply gave up on trying to understand the users’ needs, and instead just threw everything out to the user: “here, now there’s a dialog box for every single line in the configuration file, go figure it all out.” In my opinion, that’s the designer shirking his or her responsibility of actually designing.

I wonder how many regular users would ever want to change some of the arcane options. And if there were a savvy user who wanted to, she could just edit the config file! Even as a Computer Science Ph.D. student, I have no idea what the “Relative Mouse Moves” option means, or why I should care about it. (If you say RTFM, that’s already the sign of a bad interface.)

Compare QuickSynergy’s main window on Mac OS X with Synergy’s Configuration dialog on Windows.

Notice how, in the configuration screen, QuickSynergy simply shows you one screen with four text fields on the four sides, whereas Synergy expects you to enter the positions as “Machine X is to Direction Y of Machine Z.” The first way is so much more natural, but guess why Synergy implements the second way? Because the configuration file is written that way.
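The two representations carry the same information, which is what makes the second UI so needless. As a small sketch (my own illustration, not code from either project), here is how the four-text-field layout could be translated mechanically into the “X is to Direction Y of Z” statements. The machine names and the function name layout_to_statements are hypothetical, and the output is plain English rather than Synergy’s actual configuration syntax.

```python
# Wording for each side, and the side seen from the other machine.
PHRASE = {
    "left": "to the left of",
    "right": "to the right of",
    "above": "above",
    "below": "below",
}
OPPOSITE = {"left": "right", "right": "left", "above": "below", "below": "above"}


def layout_to_statements(center, neighbors):
    """Turns a four-sided layout into relational statements.

    neighbors maps a side ("left", "right", "above", "below") to the name
    typed into that text field; empty or missing fields are skipped.
    """
    statements = []
    for side in ("left", "right", "above", "below"):
        other = neighbors.get(side)
        if not other:
            continue  # Empty text field: no machine on that side.
        statements.append(f"{other} is {PHRASE[side]} {center}")
        statements.append(f"{center} is {PHRASE[OPPOSITE[side]]} {other}")
    return statements


for line in layout_to_statements("powerbook", {"left": "macmini"}):
    print(line)
```

The user fills in one field, and the software derives both directions of the relationship; there is no reason to make the user type either of them out.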

These are clearly two very different styles of GUI design (though I would strongly argue that a text field for editing a configuration file does not count as a “GUI”; it’s simply a command-line interface (CLI) inside a text field.) QuickSynergy puts the user first, and is designed to let the user work naturally with his/her mental model of a keyboard/mouse layout. Synergy starts from the configuration file and slaps a UI on top of it. Thus, QuickSynergy comes closer to the user, while Synergy stays closer to the machine.

(Screenshot: Synergy and QuickSynergy comparison.)

UI Design is not about letting users edit configuration files; it’s about letting them do what they started out to do. That a config file needs to be edited to make that happen is a side story.

Presentations: The Good, The Bad and The Ugly

18
Sep
2006

I’ve been reading Presentation Zen and various related resources lately. I’ll credit Stanford Law School professor Larry Lessig with exposing me to “alternate” styles of presentation when he gave a talk at Google last Summer.

Some interesting quotes and links I picked up along the way, with credits.

“If someone that did not attend to [sic] my presentation can understand anything if I mail them my slides, I have made a really bad set of slides. Really bad.” — eirikso.com.

“What a computer is to me is it’s the most remarkable tool that we’ve ever come up with, and it’s the equivalent of a bicycle for our minds.” — Steve Jobs.

An excellent presentation from Seth Godin at the Gel Conference, on all things broken!

Jonathan Shewchuk’s tips for academic talks.

“Start-up a PowerPoint presentation and the average IQ of the room drops by 10 points.” – Anon

A suicide PowerPoint presentation featured on The Onion.

There’s Always One in the Pipeline

Filed under: Academic, Life
8
May
2006

Adaptive Hypermedia 2006

We just got our paper on creating a standardized representation of syllabi accepted to the workshop on Semantic Web Technologies for e-Learning at Adaptive Hypermedia 2006. Although the work took some time to warm up and get written, it was all quite nice at the end. The project is still beginning, though, and I’m sure we’ll have more on it soon. I’m not sure if I’ll make it to Dublin or not, but paper acceptance already makes it worth it!

With that paper out of the pipeline, we wrote one to submit to MobileHCI 2006. It’s a short paper based on our recent work on seamless task migration across platforms, and I hope that one gets in too.

… Which basically means that I need to start writing about the other things we’ve been working on, so the pipeline (of papers awaiting acceptance notification) is always non-empty!
