Joshua Schachter, creator of del.icio.us, a "social bookmarks manager"

history

Joshua created del.icio.us in 2003 as the third version of a bookmarking system that started out "single player", as an organized text file, in 2001. He used that system for about two years, added searching by tags, and then made it public when his set of bookmarks had around ten thousand users.

Del.icio.us was built on the side, while Joshua worked on active trading systems at Morgan Stanley. It turned out simple and "ugly" because Joshua didn't know HTML or CSS that well.

Timeline:

Date        Milestone
Late 2004   15k users
Jan. 2005   30k users; approached by Yahoo, Google, and VCs
Mar. 2005   60k users; Union Square Ventures and Amazon invested; until then del.icio.us ran on a single machine
Dec. 2005   300k users; 60 servers (2 full racks in a data center in downtown NYC, not a great place)
Sep. 2006   1 million users
Dec. 2006   1.5 million users
Feb. 2007   2.0 million users

A lot of time was spent scaling and building up the system -- it is tough to evolve the backend and frontend at the same time. The team is now working on a top-to-bottom rewrite and has inherited the Yahoo! Photos backend team (about 40 people).

(For a more detailed account of the history of del.icio.us, see Founders at Work by Jessica Livingston.)

development process

  • Why 40 people? 1500 support requests a day, 6 people answer email.
  • 6 backend, 4 frontend engineers, another 5 or so on the del.icio.us Firefox toolbar.
  • Firefox doesn't give extension authors very much -- they had to reimplement a lot of UI and data store functionality, and the Firefox bookmark code is pretty old and tough to extend.

One of the challenges working for a large corporation is that if you have simple metrics, such as site traffic, managers push you to improve those metrics even if the result isn't necessarily aligned with the goals of the site.

For example, forums are known to increase traffic, yet del.icio.us doesn't have conversations. On the flip side, the Firefox extension actually decreases traffic to the site, yet it still serves a useful purpose.

motivation

Joshua sees del.icio.us as a kind of "global shared memory platform", a "knowledge store" that reduces the transaction cost of remembering something.

He started the project as something he would use himself: originally he pasted links into a file on a Unix box to share them between work and home. Bookmarking in browsers hasn't really changed since NCSA Mosaic, and folders make it hard to organize past a certain point.

Del.icio.us is a kind of "attention economy system" and thus appeals to those awash in information. They "see a lotta students", web folk.

Nowadays, del.icio.us is more than just bookmarking: it's an "attention stream". He might like to integrate ideas from tumblelogging.

architecture / systems

Why a new rewrite? Usability for new users. The old code was "terrifying Perl code" because, at the time, Joshua didn't see "engineering as a social process".

  • Current architecture uses a lot of materialized views, generated-after-demand caching. Can tolerate slop in a few places, for example, updating tag clouds or views.
  • Uses Yahoo tech for asynchronous inserts. Found that caching wasn't actually useful: it was slower than hitting the DB.
  • Single master setup -- (multi-master would mean that if one master fails, you still have to bounce the other masters).
  • About 200 million objects in the system for bookmarks, averaging 256 bytes apiece (roughly 50 GB of raw data). It's the indexing that's the expensive part, but the data partitions really well. Lots of redundant databases, individual shards. MySQL.
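The "materialized views" and "tolerate slop" points above can be illustrated with a small sketch. This is not del.icio.us code: the class name `TagCloudView`, the `load_tags` callable, and the `max_age_seconds` parameter are all hypothetical, and a real system would rebuild the view from the database rather than from an in-memory list. The idea shown is only that a derived view (here, tag counts) is rebuilt on demand and is allowed to be slightly stale in between.

```python
import time
from collections import Counter


class TagCloudView:
    """Derived tag counts, rebuilt on demand and allowed to go slightly stale."""

    def __init__(self, load_tags, max_age_seconds=300.0, clock=time.monotonic):
        self._load_tags = load_tags      # hypothetical callable returning an iterable of tags
        self._max_age = max_age_seconds  # how much "slop" readers are willing to tolerate
        self._clock = clock              # injectable clock, so the example is deterministic
        self._counts = None
        self._built_at = None
        self.rebuilds = 0

    def get(self):
        now = self._clock()
        stale = self._built_at is None or now - self._built_at > self._max_age
        if stale:
            # Rebuild the whole view from the source data.
            self._counts = Counter(self._load_tags())
            self._built_at = now
            self.rebuilds += 1
        return self._counts


# Usage with a fake clock: a new bookmark is not reflected until the view expires.
bookmark_tags = ["python", "recipes", "python"]
fake_now = [0.0]
view = TagCloudView(lambda: list(bookmark_tags), max_age_seconds=60.0,
                    clock=lambda: fake_now[0])

view.get()                          # first read builds the view
bookmark_tags.append("python")      # a write arrives
stale_read = view.get()["python"]   # still served from the old view
fake_now[0] = 61.0                  # the tolerated staleness window passes
fresh_read = view.get()["python"]   # the view is rebuilt and sees the write
```

The trade-off is the usual one: reads stay cheap and predictable, at the cost of a bounded delay before writes show up in the derived view.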

MySQL vs PostgreSQL is a useful interview question -- "zealotry is not that useful".

How do you handle bringing up new machines? Current architectures don't support this "insert new server operation" in a generic way for Internet apps.
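One reason adding a server is awkward can be shown with a toy experiment. The sketch below assumes the simplest possible partitioning scheme, hash-of-key modulo server count; the notes do not say how del.icio.us actually assigned data to shards, and the function names and key format are made up for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard using a stable hash and a plain modulo."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys
                if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


# Hypothetical user keys; going from 4 to 5 servers.
user_keys = [f"user{i}" for i in range(10_000)]
moved = fraction_moved(user_keys, 4, 5)
print(f"{moved:.0%} of keys change shard going from 4 to 5 servers")
```

With plain modulo placement, most keys land on a different shard after the change, so "insert new server" turns into a large data migration. Schemes such as consistent hashing or an explicit user-to-shard lookup table are the common ways to keep that movement small.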

"Librarians hate del.icio.us"

tagging

You can see patterns in tags, but the spread of tags is widening.

Most popular bookmark is Elise Bauer's recipe bookmarks.

People want alphabetization and some ask for a full tag calculus, but only 0.5% of queries are on two tags.
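For readers unfamiliar with what a two-tag query involves, here is a minimal sketch using an in-memory inverted index. The `TagIndex` class and its method names are hypothetical and say nothing about how del.icio.us stores tags; the point is only that a multi-tag query is a set intersection over per-tag posting sets.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index: tag -> set of bookmark ids."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, bookmark_id, tags):
        for tag in tags:
            self._by_tag[tag].add(bookmark_id)

    def with_all(self, *tags):
        """Return ids of bookmarks carrying every one of the given tags."""
        if not tags:
            return set()
        # Intersect starting from the smallest set to keep the work small.
        sets = sorted((self._by_tag.get(t, set()) for t in tags), key=len)
        result = set(sets[0])
        for s in sets[1:]:
            result &= s
        return result


index = TagIndex()
index.add(1, ["python", "tutorial"])
index.add(2, ["python", "recipes"])
index.add(3, ["recipes"])

both = index.with_all("python", "recipes")  # bookmarks tagged with both
```

A "full tag calculus" would extend this with union and negation, which is straightforward on sets but, going by the 0.5% figure above, rarely exercised by users.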

Joshua and co. are "continuing to evolve what the organizational calculus looks like". An open area for HCI research.

del.icio.us is great in terms of recency, where search engines tend to suck. For example, 2/3 of the stuff bookmarked in del.icio.us lately is not in search engines yet (!).

Flickr's "machine tags... the most unfortunate name ever".

competitors

Where did del.icio.us get it right over competitors? Open API and RSS feeds -- 2/3 of traffic comes from RSS, but RSS has the soft viral property of linking back to del.icio.us.

Make it easy for users to use the system without registering: support as many verbs as possible without signing in.

If he had to do it over, he would perhaps let users surf with just a cookie -- no sign-in -- and rope them in later to save their data.

operations

Pretty aggressively spammed, about 15%. Algorithmically removing stuff.

Lesson learned? "not telling users that you've caught them"
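One way to read "not telling users that you've caught them" is that a flagged account keeps seeing its own posts while everyone else stops seeing them. The sketch below illustrates that reading only; the notes do not describe how del.icio.us implemented it, and `Post`, `visible_posts`, and the `flagged_authors` set are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    url: str


def visible_posts(posts, flagged_authors, viewer=None):
    """Hide posts by flagged authors from everyone except the author."""
    return [p for p in posts
            if p.author not in flagged_authors or p.author == viewer]


posts = [
    Post("alice", "http://example.com/article"),
    Post("spammer", "http://example.com/pills"),
]
flagged = {"spammer"}

public_view = visible_posts(posts, flagged)                     # spam is gone
spammer_view = visible_posts(posts, flagged, viewer="spammer")  # nothing looks wrong
```

The spammer gets no signal that anything was removed, and so has less reason to create a new account or change tactics.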

Have a full test suite, run daily, check sometimes.

Big problem with Unicode! "ongoing pain" -- 70% of the time it's a Unicode issue.
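The notes don't say which Unicode problems came up, so the following is only an example of one common class of bug in a tagging system: the same visible tag arriving in two different encodings of the same characters. It uses Python's standard `unicodedata` module; `normalize_tag` is a hypothetical helper.

```python
import unicodedata

# The same visible tag, entered by two different clients.
composed = "caf\u00e9"     # "e with acute" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent


def normalize_tag(tag: str) -> str:
    """Bring a tag into one canonical form (NFC) before storing or comparing."""
    return unicodedata.normalize("NFC", tag)


raw_equal = composed == decomposed
normalized_equal = normalize_tag(composed) == normalize_tag(decomposed)
```

Without normalization the two strings compare unequal and have different lengths, so they would show up as two separate tags that look identical on screen.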

user experience

Now they have a UE (user experience) team, but originally designed from the gut.

Recently did a five-day usability testing lab with old and new users, affected naming and navigation.

Joshua mentioned a book by Temple Grandin, an autistic woman who designs humane animal processing facilities, noting that animals see the world very differently than the designers do. For UX design, this means that users often see the world very differently compared to the designers.

Rather than using personas, they would rather release a feature every week and see what people say about it. Front-end prototyping isn't a big problem.

users

"tools and not rules" -- people do some interesting things with del.icio.us, like simulating comment threads with recursive bookmark descriptions, or stuffing data.

  • They have about two writers for each reader, a quarter of a million bookmarks daily vs 5,000 for a site like Digg.
  • How do you grow your audience? Go for soft viral growth rather than invites: like email, the product should be useful in its own right, with high utility for the individual.
  • The Firefox extension was the biggest driver of traffic, followed by RSS; Flickr save-this also helps.
  • Cell-based user migration -- allows ops to migrate users out of a cell, upgrade cell, and migrate back.
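The cell-based migration in the last bullet can be sketched as three steps: drain the cell, upgrade it while empty, and move its users back. Everything here (`Cell`, `migrate`, `upgrade_cell`, the spare cell) is a hypothetical in-memory model; a real migration would copy user data between databases and handle writes arriving mid-move.

```python
class Cell:
    """A group of servers that hosts a set of users at a given software version."""

    def __init__(self, name, version):
        self.name = name
        self.version = version
        self.users = set()


def migrate(users, source, target):
    """Move the given users from one cell to another."""
    for user in list(users):
        source.users.discard(user)
        target.users.add(user)


def upgrade_cell(cell, spare, new_version):
    """Drain a cell into a spare, upgrade it while empty, then move users back."""
    residents = set(cell.users)
    migrate(residents, cell, spare)
    if cell.users:
        raise RuntimeError(f"cell {cell.name} is not empty; refusing to upgrade")
    cell.version = new_version
    migrate(residents, spare, cell)


cell_a = Cell("a", version=1)
cell_a.users.update({"u1", "u2"})
spare = Cell("spare", version=2)
spare.users.add("u9")  # the spare may already host its own users

upgrade_cell(cell_a, spare, new_version=2)
```

Only the users who originally lived in the cell are moved back, so the spare's own residents stay put.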

the future

  1. "the notion of a schema is a mistake" -- really want an object store with fast reads?
  2. what would you like to do in the future? "sleep"
  3. Flickr or Delicious shouldn't be 40 devs and a year, it should be 3 devs and a long weekend

Last modified May 14, 2007 10:14 pm