Visualizing Folksonomies using Machine Learning Algorithms

a collection of projects and ideas
CUtunes is looking better after another semester of work. Here is the updated documentation, and some screenshots. Notable new features are user profile pages, Flash-based visualizations of your musical neighborhood, and intelligent playlist creation in iTunes, which lets the user ask for a playlist that resembles a specified list of musical artists and CUtunes users.
Traditionally, metadata is thought of simply as keywords that describe some content. While the primary aim of folksonomic systems like the Del.icio.us bookmarking tool is to produce these keywords, they also produce a richer set of metadata. Because the keywords are contributed by many different individuals and then aggregated, useful information comes not only from the keyword itself but also from who applied that keyword to the content. This idea can be broadened into a general framework for producing a new layer of metadata: similarity between concepts. By analyzing the distributions of how users apply tags, how tags are applied to links, and how users pick content, we should be able to calculate the "distance" between tags, users, and content. This "distance" metric could then be used to construct a more powerful tool for browsing content, allowing the user to specify a query made up of keywords, content, or even other users. Furthermore, this metadata can be condensed into a lower-dimensional space and visualized in order to gain better insight into the relationships between the concepts themselves. (Full paper found here)
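To make the idea of a "distance" between tags concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the method from the paper: the bookmark triples are made-up sample data, the function names are hypothetical, and cosine distance over per-user tag counts is just one plausible choice of metric. Each tag is represented by how often each user applied it, so two tags come out "close" when the same people tend to use both.

```python
from collections import Counter, defaultdict
from math import sqrt

# Made-up (user, tag, link) triples standing in for real bookmarking data.
BOOKMARKS = [
    ("alice", "python", "link1"),
    ("alice", "programming", "link1"),
    ("bob", "python", "link2"),
    ("bob", "programming", "link2"),
    ("carol", "jazz", "link3"),
    ("carol", "music", "link3"),
    ("dave", "jazz", "link4"),
    ("dave", "programming", "link5"),
]


def tag_profiles(bookmarks):
    """Map each tag to a Counter of how many times each user applied it."""
    profiles = defaultdict(Counter)
    for user, tag, _link in bookmarks:
        profiles[tag][user] += 1
    return profiles


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty profile is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def nearest_tags(tag, profiles, k=3):
    """Return up to k (other_tag, distance) pairs, closest first."""
    distances = [
        (other, cosine_distance(profiles[tag], profiles[other]))
        for other in profiles
        if other != tag
    ]
    return sorted(distances, key=lambda pair: pair[1])[:k]


if __name__ == "__main__":
    profiles = tag_profiles(BOOKMARKS)
    for other, distance in nearest_tags("python", profiles):
        print(f"{other}: {distance:.3f}")
```

The same profile-and-distance pattern could be applied to users (profiled by the tags they use) or to links (profiled by the tags applied to them), which is what would allow a query to mix keywords, content, and other users.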
We live in an age flooded with information. New technologies are making many large, unstructured sets of information available. As this information becomes more available, it becomes more difficult to navigate without a guide. Now that a typical user can carry around 10,000 songs in his pocket, choosing which song to listen to becomes increasingly difficult. Now that a typical user can access 13 billion websites, how does a person know which sites are relevant to him?