Sunday, August 27, 2006

 

Launching BlogScope

The Internet has given everyone a platform for expressing opinions to a wide audience. Millions of people around the world use blogs to document their lives; they write about everything from movies to restaurants and from politics to personal life. This phenomenal popularity of blogs as a media form has made the blogosphere an invaluable source of information. But finding relevant and interesting content is not easy: there are millions of blogs, and much of the content is repetitive and noisy. For the past seven months, we have been working on how to assist users in their exploration of the blogosphere. The result is BlogScope.

BlogScope is a knowledge discovery and analysis tool for the blogosphere. To help users find interesting information in the large amount of text in the blogosphere, the BlogScope analysis engine offers tools such as popularity curves, correlated terms, popularity bursts, and comparison curves. These tools can be used to track trends and the temporal evolution of topics, investigate relationships between keywords, or even compare apples with oranges. For bloggers, BlogScope offers some interesting tools such as Summary Cloud. To learn more, read the about page. We also have a short 90-second Flash movie demonstrating the system.

We are currently tracking over 4 million blogs, with 25 million blog posts in our database. The number of posts is growing at a fast rate of a hundred thousand per day. The BlogScope codebase is growing at a comparably fast rate, and we plan to add many more cool features in the future, so keep checking back. We hope you like it, and please do send us your feedback.

This is a Flash demo of BlogScope and requires Flash Player.

Our promotional logos:

Last updated: 11th Sept, 2006.


Comments:
Hi,

I found your blog while looking for profiling in Eclipse/Tomcat and was fascinated by your BlogScope project. I've read the informational pages on the site, but I've got some questions you might be willing to answer. They relate to the libraries used and the reasons you chose them.

1. The graphs. I really like the graphs you generate. Do you use a particular library to make them? I've been using JFreeChart, but I haven't seen, or been able to create, graphs like these.

2. The database pool. Is there a specific reason for choosing this library over, for example, the DBCP from Jakarta Commons that is included in Tomcat?

3. The cache. Kind of the same question as above: is there a reason you use Whirlycache instead of managing a Map yourself?

regards,
Roderik
 
@roderik
I am happy to hear that you liked BlogScope. Here are the answers:
1. I have my own code for drawing the graphs using the standard java.awt.Graphics.
2. BlogScope has many other components that run outside the Tomcat container, e.g., the crawler, spam analyzer, and indexer. The DBPool library I am using is good, easy to use, and does everything I need.
3. I could have written a simple cache library myself in 400-500 lines of code, but why reinvent the wheel? Whirlycache is good enough with all the features that I need.
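
For readers wondering what the "managing a Map yourself" option from question 3 might look like, here is a minimal sketch of a bounded least-recently-used cache built on the standard java.util.LinkedHashMap. The class name SimpleLruCache and its methods are made up for illustration; this is neither BlogScope's code nor Whirlycache's API, and it leaves out most of what a full cache library adds on top.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical illustration only: a tiny thread-safe LRU cache.
 * Not taken from BlogScope and unrelated to Whirlycache's actual API.
 */
class SimpleLruCache<K, V> {
    private final Map<K, V> entries;

    SimpleLruCache(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most
        // recently accessed, so the "eldest" entry is the LRU one.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each put; evict once we exceed capacity.
                return size() > capacity;
            }
        };
    }

    /** Returns the cached value, or null if the key is absent. */
    synchronized V get(K key) {
        return entries.get(key);
    }

    /** Stores a value, evicting the least recently used entry if full. */
    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was touched more recently. Even this small version has to deal with eviction order and locking, which is the kind of detail an existing library already handles.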
 
Currently all the search results are from Blogspot blogs only. Is there any specific reason?
 
@abhijit: Our current crawler only fetches posts from blogspot.com, because those blogs ping weblogs.com and their XML feeds contain the complete text of each post (unlike WordPress). But we are planning to add more sites soon.
 
Good job Nilesh. Looks kinda cool to me!

-Suddha Kalyan Basu.
 