ReTweeting: Attribution for Discovery versus Attribution for Creation

During the past few months, I have found myself consuming more news and articles via recommendations from friends and those I follow on Twitter than via traditional source-based subscriptions (e.g. subscribing to specific feeds or newspapers). Social media discovery is here, and the best part of reTweeted links is that they have already been vetted by people I trust.

Often, I’m tempted to reTweet that content myself, or post it to Facebook, or share it via Google Reader. A few of these media keep attribution intact (e.g. Google Reader adds the “Shared by” metadata for each person in the chain that shared the content). Others, such as Twitter, are restricted by the length of the post, so the “RT @” list quickly gets too long and inevitably gets trimmed along the way.

But there’s no accepted practice for how this list should be trimmed. Should you keep the first Tweeter, even if that person is not the author of the content? (E.g. someone who read an NY Times article and tweeted about it.) Should you keep the last reTweeter, who was your direct link to the content in question? What about multiple Tweeters re-posting links to the same content, so that it’s not a tree any more, but a forest of links? (Imagine a directed graph with edges denoting “shared by X to Y”.)

The problem is that by including attribution about the process of discovery, we end up attaching higher value to discovery than creation. When someone reTweets a secondary source of information, attribution for the primary source is often trimmed away. This is especially bad for Creative Commons works that require attribution when re-posted, but is bad in general for any kind of work and for authors of that work.

I have come to the conclusion that although attribution for discovery is important, it’s hard to apply consistently in fixed-character-length media. It’s a completely different story in the case of original content generated by the tweeter himself/herself: e.g. one-liners, or authors tweeting links to their (longer) content. Attribution for original content is vastly more meaningful than attribution for promoting someone else’s content (although the value of that act is substantial as well).

So from now on, I will only attribute original content in my tweets and Facebook updates. My intention is not to discount the value of the source that shared the content with me, but instead to promote the original author of that content wherever possible.

HOWTO Use custom DNS redirects to save browser keystrokes

Given the recent interest in DNS and its role in the public infrastructure of the Internet, sparked by the release of Google Public DNS, here’s a hack that can help you save keystrokes in the browser while accessing your favorite sites. Instead of typing in “youtube.com” or “twitter.com”, you can just type “y” or “t”. If you’re looking for a map of San Francisco, CA, you can type “map/sf” and jump to the right place in Google Maps.

A bright bold blinking marquee disclaimer before we start: this is advanced territory. If you don’t know what sudo is and why 127.0.0.1 is special, be careful following these instructions because you may unintentionally destroy your ability to do anything at all on the Internet — including looking up instructions for getting unstuck. Also, these instructions only apply to Mac OS X and Linux, or other UNIX variants.

Redirect custom DNS hostnames to frequently-accessed sites

Your machine’s resolver consults the file /etc/hosts before it sends a request to a DNS server. The idea is to add new entries to that file, pointing short hostnames such as g and t to 127.0.0.1. Now, whenever you type g or t into your browser, the hostname will be matched from your /etc/hosts file, instead of receiving an NXDOMAIN reply (i.e., this domain does not exist) from an upstream DNS provider. Since the resulting HTTP request is received by your own machine, you can handle it however you want, including, but not limited to, redirecting the user to the intended destination.

This HOWTO assumes that Apache is installed and running on your system with PHP and mod_rewrite support.

Modify /etc/hosts

Open /etc/hosts in your favorite text editor, and add one line for each shortcut you’d like to set up. Leave everything else unchanged. (You will need to sudo edit this file.)

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1 localhost
127.0.0.1 c # for Calendar
127.0.0.1 f # for Facebook
127.0.0.1 g # for Google Search
127.0.0.1 m # for Mail
127.0.0.1 map # for Maps
127.0.0.1 t # for Twitter
127.0.0.1 w # for Wikipedia
127.0.0.1 y # for Yelp
127.0.0.1 yo # for YouTube

255.255.255.255	broadcasthost
::1             localhost
fe80::1%lo0	localhost

You can test that this change worked by typing the address (e.g. http://g/) into your browser. Instead of seeing a page that says that your browser “can’t find the server ‘g’”, you should now see a page saying that your server isn’t configured correctly, or welcome to Apache, or whatever you would see if you typed http://localhost/ instead. If that worked, proceed.
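If you would rather check the file itself than the browser, here is a minimal PHP sketch that lists which hostnames in a hosts file point at a given address. The function name hosts_pointing_to is made up for illustration, and the parsing covers only the simple "address name name # comment" layout shown above.

```php
<?php
// Sketch: list the hostnames in hosts-file text that map to a given address.
// The function name is an illustrative assumption, not part of any library.
function hosts_pointing_to(string $hostsText, string $address = '127.0.0.1'): array
{
    $names = [];
    foreach (preg_split('/\R/', $hostsText) as $line) {
        // Drop comments (everything after '#') and surrounding whitespace.
        $line = trim(preg_replace('/#.*$/', '', $line));
        if ($line === '') {
            continue;
        }
        // The first field is the address; the remaining fields are hostnames.
        $fields = preg_split('/\s+/', $line);
        if (array_shift($fields) === $address) {
            $names = array_merge($names, $fields);
        }
    }
    return $names;
}

// Usage: print the names that resolve to the loopback address on this machine.
if (is_readable('/etc/hosts')) {
    echo implode(' ', hosts_pointing_to(file_get_contents('/etc/hosts'))), "\n";
}
```

With the entries above in place, the printed list should include localhost and each of the shortcuts.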

Configure Apache to handle requests for unknown domains/URIs

Edit the following lines in /etc/apache2/httpd.conf. The following code shows an excerpt with lots of context around the line you need to edit. Locate the relevant section in your file.


#
# This should be changed to whatever you set DocumentRoot to.
#
<Directory "/Library/WebServer/Documents">
    #
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.2/mod/core.html#options
    # for more information.
    #
    Options +Indexes +FollowSymLinks +MultiViews

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   Options FileInfo AuthConfig Limit
    #
    # Change this from None to All. (Apache does not allow a comment on the
    # same line as a directive, so the note goes here.)
    AllowOverride All

    #
    # Controls who can get stuff from this server.
    #
    Order allow,deny
    Allow from all

</Directory>

Locate your Apache root directory. It’s usually /Library/WebServer/Documents on the Mac or /var/www in Ubuntu. If you’re unsure, check where it is by issuing the following command in a terminal: (assuming you’re running Apache 2.x)

grep "DocumentRoot" /etc/apache2/httpd.conf

In that directory, save the following file. It should be named exactly .htaccess. (That’s htaccess with a period at the beginning, so it’s a hidden file on UNIX.) Save it as /Library/WebServer/Documents/.htaccess on Mac OS X or /var/www/.htaccess on Ubuntu.

<IfModule mod_rewrite.c>

RewriteEngine on
RewriteBase /

RewriteCond    %{REQUEST_FILENAME} !-f
RewriteCond    %{REQUEST_FILENAME} !-d
RewriteRule    (.*) /index.php [L]

</IfModule>

The actual redirection script

Here’s the script that I use for redirection, but you can roll your own, and do anything with each request you receive. (If you do something phenomenally awesome, I’d love to hear about it in the comments.) As you can see, it’s customized to the sites I frequent, including location preferences: the Yelp shortcut, for example, takes me directly to Yelp San Francisco, with the search box preconfigured for SF.

Save this as /Library/WebServer/Documents/index.php (on Mac OS X) or as /var/www/index.php on Ubuntu.

<?php
  $uri = preg_replace('/^\//', '', $_SERVER['REQUEST_URI']);
  switch($_SERVER['SERVER_NAME']) {
    case 'c':
      redir('http://calendar.google.com/');
      break;
    case 'f':
      redir('http://facebook.com/');
      break;
    case 'g':
      redir('http://www.google.com/search?q=' . $uri);
      break;
    case 'm':
      redir('http://mail.google.com/');
      break;
    case 'map':
      redir('http://maps.google.com/?q=' . $uri);
      break;
    case 't':
      redir('http://twitter.com/');
      break;
    case 'w':
      if ('' === $uri) {
        redir('http://en.wikipedia.org/wiki/');
      } else {
        redir('http://en.wikipedia.org/wiki/Special:Search/' . $uri);
      }
      break;
    case 'y':
      if ('' === $uri) {
        redir('http://yelp.com/sf/');
      } else {
        redir('http://yelp.com/search?ns=1&find_loc=San%20Francisco,%20CA&find_desc=' . $uri);
      }
      break;
    case 'yo':
      if ('' === $uri) {
        redir('http://www.youtube.com/');
      } else {
        redir('http://www.youtube.com/results?search_query=' . $uri);
      }
      break;
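    default:
      // Not one of the shortcut hosts (e.g. localhost): say so instead of
      // silently serving a blank page.
      header('HTTP/1.1 404 Not Found');
      echo 'No shortcut is configured for this host.';
      break;
  }

  // Send the redirect and stop, so that nothing is output after the header.
  function redir($url) {
    header('Location: ' . $url);
    exit;
  }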
?>

That’s it. Now type your shortcuts into your browser instead of the longer URLs, and there you are. If you run into trouble, leave a comment and I’ll address it.
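If you roll your own, one option is to keep the shortcuts in a table instead of a switch. The following is only a sketch of that idea: the function name shortcut_target and the {q} placeholder are my own inventions, and the URLs are copied from the script above. It decodes and re-encodes the path, so a space typed in the address bar ends up as %20 in the target URL.

```php
<?php
// Sketch of a table-driven variant. shortcut_target() and the '{q}'
// placeholder are assumptions made for this example.
function shortcut_target(string $host, string $uri): ?string
{
    // Each host maps to [URL when no path is given, URL template for a query].
    $shortcuts = [
        't'  => ['http://twitter.com/', null],
        'w'  => ['http://en.wikipedia.org/wiki/',
                 'http://en.wikipedia.org/wiki/Special:Search/{q}'],
        'yo' => ['http://www.youtube.com/',
                 'http://www.youtube.com/results?search_query={q}'],
    ];
    if (!isset($shortcuts[$host])) {
        return null; // unknown host: the caller decides what to do
    }
    [$home, $search] = $shortcuts[$host];
    $query = ltrim($uri, '/');
    if ($query === '' || $search === null) {
        return $home;
    }
    // Normalize the encoding of whatever the browser sent.
    return str_replace('{q}', rawurlencode(rawurldecode($query)), $search);
}

// In index.php this could be wired up roughly as follows (left commented
// out so that the sketch has no side effects):
// $target = shortcut_target($_SERVER['SERVER_NAME'], $_SERVER['REQUEST_URI']);
// if ($target !== null) { header('Location: ' . $target); exit; }
echo shortcut_target('w', '/Domain Name System'), "\n";
```

Adding a shortcut then means adding one line to the table (and one line to /etc/hosts).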

The only downside of this approach

Redirecting involves an additional HTTP request to your machine, which introduces additional latency. The request, however, goes from your machine to itself, so there’s no network involved. Personally, I feel that typing the keystrokes I save would have taken longer than the extra round trip does. And you don’t lose anything if you set this up and don’t use it: just continue to type entire URLs and you will never pay a latency penalty.

Marathon Fundraising: A Noble Goal or Exploiting your Social Network?

20 Sep, 2009 — Thoughts

I’ve grown increasingly skeptical of organized marathons that request donations from one’s friends in order for runners to participate. Both goals on their own — personal fitness and charitable fundraising — are noble; it’s their marriage that seems unholy to me.

Personal Goals versus Charity:
It’s not like the runner in question is doing anything to directly help the populations in need. Let’s be frank, they’re out there fulfilling a personal fitness goal of running a marathon — which is commendable in its own right. I have tremendous respect for marathon runners’ endurance that lets them sustain 26.2 miles of running. It’s also a great way to meet other people with similar interests instead of running alone. For their part, the charitable organizations also do excellent work to solve the issues they’re committed to. Of course, any such work requires financial support and money must be raised to make their projects successful. Unfortunately, these sponsoring organizations have found a great way to exploit marathon runners’ zeal to fulfill their own fund-raising goals.

Consider these statements:

I’m running a marathon, would you donate to Organization X?

I go to the gym thrice a week, would you donate to Organization X?

Both sound absurd to me for the same reason. If I’m putting in effort towards a personal goal, what does donating have to do with it? The case would be different if, say, the person were actively working towards a humanitarian goal with direct benefit to the affected populations, and all they needed was a little financial support.

If they’d said:

I’m working on Project X for the people of Y, our budget was $A but we only have $B, would you donate to fill the gap?

… I’d gladly have contributed, knowing that (1) my friend is actively making a difference, not simply pursuing a personal goal, and (2) my friend’s active involvement in the organization gives me more reason to trust it (the notion of transitive trust). On the other hand, marathon runners typically have no interaction with the organizations under whose banner they run, except for training with their assigned trainers and running the marathon. I have not yet met a marathon runner who has also actively participated in the non-marathon activities of the organization that directly benefitted the served populations.

Exploiting Friendships: My chief objection to this arrangement is that it blatantly requires marathoners to exploit their friend connections. Charitable donations should be made with an honorable intent, not because not donating will piss off a friend — which is often what marathon donations end up being. Of course, the sponsoring organizations have hit upon a brilliant idea that fills their coffers, never mind the ethical implications of asking friends to donate because you pledged to fulfill a personal goal.

Here’s an excerpt from the Frequently Asked Questions web page of one such organization (link intentionally not provided). At least this organization provides this information upfront; others I surveyed did not have anything on their web sites, instead requiring users to submit personally identifiable information so that the organization could get in touch with them.

What if I cannot raise the pledge amount?

- Org X has to keep its pledge of raising more than its costs. In order to keep this pledge Org X makes to the community, we will secure your commitment in the form of a credit card. We will only charge it for the difference between the required minimum and the money you’ve raised. [...]

So, in effect, runners are simply trying to recoup their out-of-pocket participation costs by requesting donations from acquaintances. That doesn’t seem very charitable to me.

Overhead: One criterion I have for donating to charitable organizations is their level of overhead: what percentage of each $100 of contributions fails to make it to the served population? Overhead costs (sometimes also measured as Fundraising Efficiency) are genuine, and can never be zero; there will always be paperwork, publicity expenses and the like.

In this light, charitable organizations that spend money on marathons do not seem to me to be using their funds wisely. The counter-argument is that they’re spending on activities that generate more funds for them, so the net gain is positive, which I concede, begrudgingly. Still, I’d much rather this money be spent on their humanitarian mission than on training urban youth for marathons.

In closing: So that’s my point of view. I’ve had face-to-face discussions on this topic with several marathon runners, and I’ve been criticized as someone who doesn’t support any charitable giving (never mind the charities that I do believe in, and regularly donate to.) I’m sure many of the readers of this blog will disagree, and I welcome you to express your mind in the comments. But let’s be clear about one thing: I respect runners and I respect the work of charities. I just do not approve of the sneaky bundling of both these activities.

On-the-fly CSS Compression in PHP

8 Sep, 2009 — Design & Usability, HOWTO, Release

Web site optimization experts suggest that webmasters try to minimize the number and size of HTTP requests necessary to serve web pages. Web designers often use multiple CSS files because they are easier to manage, but this requires as many HTTP requests as there are CSS files.

This free script serves all your CSS files as a single HTTP resource, minified (by removing comments and extraneous whitespace), and gzip-compressed. It also asks browsers to cache the CSS content for at least a day before trying to fetch a new version.

The best part is that this does not pre-process the files, so it does not add any steps to your deployment process. It’s licensed free for commercial and non-commercial use, with attribution requested.
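To make the steps concrete, here is a rough sketch of the general technique. It is not the released script: the function names are invented for this example, the minifier is deliberately naive (it does not treat quoted strings specially), and gzencode assumes that PHP’s zlib extension is available.

```php
<?php
// Sketch only: NOT the released script. All function names are illustrative.

// Remove comments and extraneous whitespace from CSS text (naive version).
function minify_css(string $css): string
{
    $css = preg_replace('!/\*.*?\*/!s', '', $css);        // strip /* ... */ comments
    $css = preg_replace('/\s+/', ' ', $css);               // collapse runs of whitespace
    $css = preg_replace('/\s*([{};,])\s*/', '$1', $css);   // no spaces around { } ; ,
    $css = preg_replace('/:\s+/', ':', $css);              // no spaces after a colon
    return trim($css);
}

// Concatenate the given CSS files, in order, and minify the result.
function combine_css(array $paths): string
{
    $combined = '';
    foreach ($paths as $path) {
        $combined .= file_get_contents($path) . "\n";
    }
    return minify_css($combined);
}

// Send the CSS, gzip-compressed when the browser accepts it, cacheable for a day.
function serve_css(string $css): void
{
    header('Content-Type: text/css');
    header('Cache-Control: max-age=86400'); // 86400 seconds = 1 day
    if (strpos($_SERVER['HTTP_ACCEPT_ENCODING'] ?? '', 'gzip') !== false) {
        header('Content-Encoding: gzip');
        header('Vary: Accept-Encoding');
        $css = gzencode($css);
    }
    echo $css;
}

// Usage (hypothetical file names), e.g. in a file referenced from a <link> tag:
// serve_css(combine_css(['reset.css', 'layout.css']));
```

Because the files are combined on each request, nothing has to be regenerated at deployment time; the one-day cache header keeps repeat visits cheap.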

The query: Protocol

Update: I implemented this idea at http://queryprotocol.appspot.com. Comments, questions, and suggestions are welcome!

When trying to explain a concept to others over email, I often find myself linking to a search engine’s result pages for a specific query, instead of a single destination URL. These are non-navigational queries, and there is no single result that I expect to be the most important one. Instead, my intention is to provide the reader a variety of links on the topic so that they may draw their own conclusions, or solve their own problem; all they need is a nudge towards the right query term to use. If, over time, better search results are available for the same query, then future readers get the benefit of automatically updated results.

E.g. Q: Where can I find the latest numbers related to the spread of the Swine Flu?
A: Try [H1N1 update].

To do this today, I simply link to my favorite search engine, Google. But that does not seem fair to fans of other search engines: Bing, Yahoo!, Altavista, and others. I would prefer to use a notation that allows the reader to use their choice of search engine to obtain the results. Just as we specify our default browser and default email client, we should be able to pick our default search engine.

We have already solved the first two problems (picking default browsers and email clients) using protocol handlers in the operating system. When I pass around a link to a web page, starting with http://, I do not specify the browser it should open in. Your operating system determines that it’s a link to a Hypertext Transfer Protocol (HTTP) document, and invokes your default browser. Similarly, for emails, the mailto: protocol provides an application-agnostic way to invoke the user’s default email client to send an email.

It is easy to see how a query: protocol could be implemented similarly. To point you to the search results for a particular term, I would send you the following link: (don’t click on it, it won’t work — at least as of this writing.)

[h1n1 update]

The URL that the above links to is query:h1n1+update. Note there’s no HTTP protocol marker specified. If the OS wanted, it could provide local results as well. This means that the protocol extends seamlessly to Desktop Search as well.

Syntactically, this validates as a URI. Just as the mailto: protocol handler defines standard parameter names, subject, cc, and bcc, similar parameters can be standardized for the query: protocol. These may include corpus restricts (corpus={web, images, desktop, ...}), pagination controls (start=0, num=10), or domain restricts (site=manas.tungare.name).
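As a sketch of how a handler might read such a URI, the following PHP function splits it into search terms and parameters. The "?name=value&name=value" syntax is an assumption modelled on mailto:, since no syntax has been standardized, and the function name parse_query_uri is made up for this example.

```php
<?php
// Sketch: split a query: URI into search terms and parameters.
// The "?name=value" parameter syntax is an assumption modelled on mailto:.
function parse_query_uri(string $uri): ?array
{
    if (strncasecmp($uri, 'query:', 6) !== 0) {
        return null; // not a query: URI
    }
    // Everything before the first '?' is the search terms; the rest is parameters.
    $parts = explode('?', (string) substr($uri, 6), 2);
    $params = [];
    if (isset($parts[1])) {
        parse_str($parts[1], $params);
    }
    return ['terms' => urldecode($parts[0]), 'params' => $params];
}

// Usage: a domain-restricted query asking for ten results.
print_r(parse_query_uri('query:h1n1+update?site=manas.tungare.name&num=10'));
```

A handler could then map site, num and similar parameters onto whatever its search engine calls them.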

Implementation is simple: most operating systems and major browsers support external custom protocol handlers. They can be configured as follows:

Protocol Prefix: query
Application Name: /Path/to/Application

The application does not need to be very complicated. It’s a mere stub, which, depending upon the user’s preferred search engine, converts a URI of the form query:h1n1+update to http://google.com/search?q=h1n1+update or http://bing.com/search?q=h1n1+update and opens that link in the user’s default browser.
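The core of such a stub could look like the sketch below. The engine table and the function name search_url are assumptions for illustration; the two URL patterns are the ones mentioned above. Opening the resulting link in the default browser is platform-specific and left out here.

```php
<?php
// Sketch of the stub's core logic. The engine table and function name are
// assumptions; the two URL patterns are the ones given in the text.
function search_url(string $queryUri, string $engine = 'google'): ?string
{
    $engines = [
        'google' => 'http://google.com/search?q=',
        'bing'   => 'http://bing.com/search?q=',
    ];
    if (strncasecmp($queryUri, 'query:', 6) !== 0 || !isset($engines[$engine])) {
        return null; // not a query: URI, or an engine we do not know
    }
    // The terms are already form-encoded ("h1n1+update"), so pass them through.
    return $engines[$engine] . substr($queryUri, 6);
}

// Usage: the same shared link, resolved for a reader who prefers Bing.
echo search_url('query:h1n1+update', 'bing'), "\n";
```

The user’s preferred engine would come from a setting stored alongside the stub; supporting another engine means adding one more entry to the table.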

Eventually, if browsers understand the query: protocol, there is no need for the stub application, and users may be able to share and exchange queries and yet seek results using their favorite search engines.

(The opinions expressed in this blog post are solely my own, and may not reflect the opinions of my employer, Google.)
