Are you a movie buff?
Especially of Bollywood movies? Then you must know a good deal about your favorite movies and their cast and crew. If you want to explore more than that, you’ll probably end up searching for details either on the internet or in a video library/museum.
Let us say you look for it on the internet. One of the sites you may refer to is IMDB, which provides information on movies, mostly Hollywood. But nowadays Bollywood flicks are also being listed.
Apart from this you may also visit online music sites like MusicIndiaOnline, which lists songs from different movies (dating back to the 1940s). So you can search for songs based on movies, actors, music directors, the year they were released and so on.
I think I have set the context. What if all this information were available to you visually? As in, all at once in some mesh/network form. Take a look at DiggLabs. The way they present real-time data of the diggs that users are making around the world, in a nicely categorized way, is just awesome. Though the information is coming from all over the internet, showing it in a compact form makes it more usable, and the learning curve is very gentle.
The adjoining image shows the users of digg (on the periphery) who are bookmarking links across the internet. The different colors show the major categories, and the shades show the hierarchy within each category. So it’s aptly said that an image is worth 10,000 words.
Now my idea here is to make such a visual system wherein all Bollywood information can be linked together and shown.
Since India has a huge population of Bollywood fans, presenting them the data in a new format could be a huge success in its own right. And yes, the platform or medium for presenting this data need not be limited to just a computer screen, but can be brought out to multi-touch displays. This will make the environment more immersive and engaging. Such installations can be used at movie libraries or museums or at the offices of production houses.
I have built a small prototype of this in Flash, which I’ll be uploading soon. The major task is building the database. Well, all the information is available on the internet. We need to write some scripts (crawlers) which can navigate the internet and collect information. I think MusicIndiaOnline is a nice source of information. Though limited, it’s very well segregated, hence easy to parse through and get the information.
Given is a small screen shot of the same.
If you click on one of the movies, it lists all of its songs and also gives the names of the actors and music director and the year of release in a brief description, which can be easily parsed. And the image is additional information you can have.
So the premise of this idea is to create a visual system of navigating through bollywood information, which is engaging, rich and usable.



October 6th, 2008 at 1:24 am
I digg for the first time!! Sounds like a great idea. The digg sphere is a truly innovative way of presenting content. I’d love to know the real-time architecture behind this rotating sphere of content!!!
October 6th, 2008 at 5:14 am
The client end is developed in Flash.
Flash can communicate with the back end in two different ways: HTTP request/response and a socket connection.
In the case of HTTP, Flash has to make a request at regular intervals to get the new content. This eats up a lot of network bandwidth and puts load on the server. Assume there are 50,000 requests coming in every 5 seconds. Though we have tough servers available these days, this kind of architecture can be a death knell.
A socket connection is cool. Once the socket connection is made, data can be both pushed and pulled. So Flash builds a socket connection with the server. The application on the server broadcasts the fresh content to all the socket connections it has. This saves a lot of network bandwidth and server load.
Hence I believe that this must be happening over a socket connection. As for getting the latest digg data at the back end, it’s kind of a black box for me.
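To put rough numbers on the comparison above, here is a minimal sketch in Python (not Flash). It works out the polling load from the figures in the comment and shows the push idea with a tiny broadcaster. The `Broadcaster` class and the function name are made up for illustration, `socket.socketpair()` only stands in for real client connections, and none of this is a claim about how Digg's back end actually works.

```python
import socket


def polling_requests_per_second(clients, interval_seconds):
    """Average request rate when every client polls once per interval."""
    return clients / interval_seconds


class Broadcaster:
    """Hypothetical push server: keeps open connections and writes to all of them."""

    def __init__(self):
        self.connections = []

    def register(self, conn):
        # In a real server this would be called for each accepted client socket.
        self.connections.append(conn)

    def broadcast(self, message):
        # One server-side event results in one send per connected client,
        # with no request coming from the client first.
        for conn in self.connections:
            conn.sendall(message)
```

With the numbers from the comment, `polling_requests_per_second(50000, 5)` comes to 10,000 requests per second, whether or not there is anything new to send. With push, the server only writes when fresh content arrives.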
October 7th, 2008 at 4:06 am
Hey, this sounds cool. I think we can create a similar architecture then.
I don’t have much knowledge of Flash but can easily work on the network part.
October 7th, 2008 at 4:21 am
Well then, here we go.
Tushar, which scripting language are you comfortable with? Take one of those and write a crawler as I said. I don’t know how it is done. Maybe I’ll search on the internet a bit. We can then host the app on silentideas itself in the lab section.
What say??
October 9th, 2008 at 4:16 am
Since you have an idea of how the crawler will work, you might have thought of the most suitable scripting language for it!
From my side, I think Perl will be a good candidate!
October 9th, 2008 at 4:12 pm
OK, if you are comfortable with Perl, can you write a script which can extract particular data from a set of web pages which have the same kind of HTML layout/structure?
Take for example the different movie pages on the MusicIndiaOnline website.
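As a rough illustration of what such a script could look like, here is a minimal sketch in Python (rather than Perl) using only the standard library. It assumes each page keeps its details in a table of label/value rows. That layout, the `SAMPLE_PAGE` markup and the names in it are invented for the example and are not the real MusicIndiaOnline structure.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def page_to_record(html):
    """Turn a page of two-column rows into a {label: value} dictionary."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) == 2}


# Invented sample page; a real page would be downloaded first.
SAMPLE_PAGE = """
<table>
  <tr><td>Movie</td><td>Example Movie</td></tr>
  <tr><td>Music Director</td><td>Example Composer</td></tr>
  <tr><td>Year</td><td>1975</td></tr>
</table>
"""
```

Because every page shares the same layout, the same `page_to_record` call can be run over each downloaded page to build up the database one record at a time.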
October 10th, 2008 at 12:39 am
Alright, here is my opinion. We discussed this application in the past, but not much. I think it’s a really great idea because there is also a business in this area. Two questions: is writing a crawler the efficient way to fetch data? And how about we make a Flash application and partner with some big website? Just some ideas on that front.
I really love what people do with Flash and digg. But I think an application like this (the one you posted) is not meant for the largest audience possible. It asks for more brain usage than a normal person would like to spend. In my experience there are times when I just need the information right in my face; I don’t want to interpret it from a different representation. And that’s why IMDB works really well, or boxofficemojo, which uses simple table representations. So if you are getting into a commercial application, you have to think about what works for the most people instead of what is a really clever idea.
October 10th, 2008 at 5:14 am
Tying up with a website/organization to get the data is surely the straightforward way. But that needs an agreement and a lot of talks, which means a lot of time. Why not use data that is already available on the internet? We ask the website for permission to use their data in this way and give them the application as a promotional item (with limited features), and that works as enough of an incentive for them to let us use their data.
When you are looking for something specific, getting the data straight is necessary. But when you are just browsing with the intention of discovering something, navigation and how you present the data become important. Anyway, we can have different views. But to start with, let’s have a small bit of code grabbing the data from the web.
October 14th, 2008 at 7:31 am
why is there so much silence??
where is everybody??
silent !!!!!!!!!!!!
October 14th, 2008 at 3:48 pm
It’s not that much silence, don’t worry. Tushar and I had a long discussion on this topic last weekend. And we felt that we should move further on this idea.
Now, I like your reasoning and the idea of an application that leads to discovering new information. A similar idea is implemented on websites like Amazon. Amazon generally tries to figure out what kind of stuff you like and what you might like, and shows you related items based on that.
In my previous comment what I wanted to figure out was what we really want to accomplish with this project. Are we targeting a particular audience or are we doing a complete experimental program which may or may not be audience oriented? And it would be great if we can come up with a system that can be applied to any information system that is similar to bollywood. But then again we should be clear on what we ultimately want from this project.
October 15th, 2008 at 5:22 am
There are two aspects of any project that we take up.
Learning and Business
I think we start with “learning”. But cautiously. We should not reinvent the wheel. We’ll start with something that’s unexplored and learn while designing and developing it.
And when I say unexplored there will be market and business for it. Just that we need to work around it to make it available.
So how do we start?
Where are the other guys? Why aren’t they contributing? Come on people, let’s look beyond. We need to slog for these things, but at least giving a bit will help. And one more thing: this is not restricted to our group only.
October 23rd, 2008 at 1:33 am
A few hours of research and reading have cleared up my understanding of web crawlers. In fact there is a lot of web crawler code and many ‘how to write’ strategies available on the internet.
My first phase of Web Crawler design:
Step 1. Source of the information (URL): Which web pages are going to be the initial source from which we (the Crawler) extract information? Your suggestions required!
Step 2. Identify the container and design a method to access its contents: Web pages are made up of different “containers” - tables, charts etc. (What’s etc.?? I don’t know any other than these :)). The Crawler then parses the container and retrieves the information of interest.
Step 3. Data or Information Processing: The information will be processed by the Crawler. (Information on the internet leads to other information via links, or more specifically hyperlinks. We will decide how to process links in the next phase.)
Step 4. Network connections: sockets
Once we have a Crawler ready that retrieves data, we start discussing how to present the contents, like the Digg architecture, at the initial level. And then we move to the next phase of Crawler design - there is still more to it.
What do you think?
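The steps above can be sketched roughly as follows, in Python, as one possible shape for the crawler loop and not a finished design. The seed URL is Step 1, the link extraction is a small part of Steps 2 and 3, and the network side of Step 4 is left out: the `fetch` argument is a stand-in that a real crawler would replace with an actual HTTP download. `FAKE_SITE` and its `example.test` URLs are invented so the sketch runs without touching any real website.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(seed_url, fetch, max_pages=10):
    """Breadth-first crawl that stays on the seed's host.

    fetch(url) must return the page HTML as a string, or None on failure.
    Returns the list of URLs visited, in order.
    """
    allowed_host = urlparse(seed_url).netloc
    frontier = deque([seed_url])  # pages waiting to be fetched
    seen = {seed_url}             # everything ever queued, to avoid repeats
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)  # resolve relative links
            if urlparse(absolute).netloc == allowed_host and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# Invented three-page site used in place of real downloads.
FAKE_SITE = {
    "http://example.test/": '<a href="/movie/1">One</a> <a href="/movie/2">Two</a>',
    "http://example.test/movie/1": '<a href="/">Home</a> <a href="http://other.test/">Away</a>',
    "http://example.test/movie/2": '<a href="/movie/1">One</a>',
}
```

Calling `crawl("http://example.test/", FAKE_SITE.get)` walks the three fake pages once each and ignores the link to the other host. Passing `fetch` in as a parameter keeps the queue logic separate from the networking, so the download part can be swapped in later.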
November 14th, 2008 at 6:11 am
Sorry friends for this late reply. I was caught up in some work.
So, as Tushar mentioned, let’s first decide on our source of information. If we crawl one particular website for information, its firewall may flag us as spam and restrict access (this came to my mind lately). Dumb servers may not notice, however. But while gathering data we don’t intend to create traffic for the host server and application. Hence we need to design the crawler so that it appears to be coming from different IPs and at some interval, rather than running in a loop.
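The "at some interval rather than running in a loop" part could look something like the minimal Python sketch below. It only covers spacing out requests and checking the site's robots.txt, and does not attempt the different-IP idea. The `Throttle` class, the delay value and the `silentideas-bot` user agent are all hypothetical names chosen for the example.

```python
import time
from urllib.robotparser import RobotFileParser


class Throttle:
    """Makes sure at least min_delay_seconds pass between two requests."""

    def __init__(self, min_delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay_seconds
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_request = None

    def wait(self):
        """Call right before each request; sleeps only if the last one was too recent."""
        now = self._clock()
        if self._last_request is not None:
            remaining = self.min_delay - (now - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_request = now


def allowed_by_robots(robots_txt, user_agent, url):
    """Check a URL against robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

In a crawler loop, `throttle.wait()` would go just before each download, and any URL for which `allowed_by_robots(...)` returns False would be skipped. That keeps the load on the host server low, which is the stated intent here.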