This is an archive of my old Stanford pages.
These pages are no longer updated.
To find out what I'm up to these days, check out
I am a Ph.D. student in the Computer Science Department, working with
as part of the InfoLab
Recently, I have been investigating collaborative tagging systems.
Tagging systems are based around "tags": (usually) single-word,
user-contributed keyword annotations.
The big difference between tags and traditional keyword annotations is that
users can contribute tags, whereas traditional annotations are usually added by authors or
This allows tagging to scale to massive and dynamic corpora on the web.
Popular examples of collaborative tagging systems include:
These systems work pretty well, but they also have some problems.
Tags have caveats for text corpora
(February 2007—February 2008)
Social bookmarking systems are a type of tagging system for URLs.
We looked at how these systems can impact web search.
We call this work:
"Can Social Bookmarking Improve Web Search?"
We found that tags are often redundant, though there are other features of
social bookmarking systems that make them valuable.
Users spam tag sites
When users can add anything they desire, they often add spam.
In recent work on Tag Spam, we looked at methods to
fight tag spam and models for attackers in tagging systems.
Tags are flat and disorganized
One big challenge is to decide how to organize and interface with tags.
We did some early work on organizing tags into Tag
Ultimately, the question is: how can we improve tags, and how do tags
compare to other types of metadata that scale to millions of users?
Gates Building, Room 424
353 Serra Mall
Stanford, CA 94305
cs-stanford-paul AT heymann.be