Archive for January 2011

In a previous post, we looked at how to consume a live streaming API from Twitter.  That was great and all, but you’re not going to become the next Twitter by consuming content — you need to have your own live streaming API.  In this post we’ll look at how to create a basic streaming API that others can open a streaming connection to and get real-time updates.

So, what groundbreaking web service will we be exposing to the masses?  We’ll be streaming the web server’s current datetime!  Ingenious, I know!  Hopefully, you’ll be able to adapt this to a slightly more engaging experience.

To start, let’s just take an ASP.NET MVC project and code the default controller action.  We clear all the headers of the response and set the Content-Type header to “application/json”.  Since we won’t be returning a view in the traditional sense of ASP.NET MVC, we don’t have to create an actual view.  Instead, we’ll loop for as long as the service is running: pause the thread for 1 second, then write the current timestamp to the response.
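As a rough, language-neutral illustration of that loop, here is a minimal sketch in Python rather than ASP.NET MVC. The names `timestamp_messages`, `StreamHandler`, and `serve`, the `datetime` field, and port 8000 are all assumptions made for this sketch, not part of the original project.

```python
import json
import time
from datetime import datetime
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def timestamp_messages(interval_seconds=1.0, limit=None):
    """Yield one JSON timestamp line per interval (forever when limit is None)."""
    sent = 0
    while limit is None or sent < limit:
        time.sleep(interval_seconds)  # stand-in for waiting on real data
        yield json.dumps({"datetime": datetime.now().isoformat()}) + "\n"
        sent += 1


class StreamHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: keeps the response open and writes one line per message."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # No Content-Length header: the body ends only when the connection closes.
        self.end_headers()
        try:
            for message in timestamp_messages():
                self.wfile.write(message.encode("utf-8"))
                self.wfile.flush()  # push each message out immediately
        except (BrokenPipeError, ConnectionResetError):
            pass  # the client went away; end this request quietly


def serve(port=8000):
    """Block and serve the stream; the port is an arbitrary choice for this sketch."""
    ThreadingHTTPServer(("127.0.0.1", port), StreamHandler).serve_forever()
```

Calling `serve()` blocks and streams one timestamp per second to every connected client. The important parts are the missing Content-Length and the write after every message; without them the client would see nothing until the response ended.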

On the consumer side, we’ll just refactor the code we had in the previous post to connect to the URL of the project we just created.  I’m running it on the local development machine so your port will likely be different.

You’ll notice we added some exception handling logic in case we lose our connection to our streaming API.  In this case, we’ll just keep trying to re-connect to the stream.  I’ll show you a sample of this in the last part of the post.
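The reconnect idea can be sketched roughly as follows, in Python rather than the C# console application. The function `consume_forever`, its parameters, and the two-second retry delay are assumptions for illustration only.

```python
import http.client
import time


def consume_forever(open_stream, handle_line, retry_delay=2.0, max_attempts=None):
    """Read lines from a stream, reconnecting whenever the connection fails.

    open_stream: zero-argument callable returning a file-like response.
    handle_line: called with each decoded line.
    max_attempts: None means keep retrying forever (hypothetical knob for testing).
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            with open_stream() as response:
                for raw_line in response:
                    handle_line(raw_line.decode("utf-8").rstrip("\n"))
        except (OSError, http.client.HTTPException) as error:
            # URLError and socket errors are OSError subclasses.
            print(f"connection lost ({error}); retrying in {retry_delay}s")
        time.sleep(retry_delay)  # also reached when the server closes the stream cleanly
```

A caller could pass something like `lambda: urllib.request.urlopen("http://localhost:8000/")` as `open_stream`, with the port adjusted to wherever the streaming project is running.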

Once we have all the bits in place, we run the ASP.NET MVC application first so it is ready to start handling requests.  Then, we create multiple instances of our consumer console application to simulate multiple clients connecting to the streaming API.  You’ll end up seeing something like this:

To simulate what would happen on the consumer side if there was a broken connection or other exception on the server side, I killed the local development server for a few seconds and then restarted it.  Below you can see that we recovered after a few exception handling loops.

Of course, this post was just intended to provide a possible option for implementing a streaming API with ASP.NET MVC.  For a more realistic scenario you would likely have a queue listener in place of our Thread.Sleep() call.  As messages are picked up from the queue, you could stream them out and immediately start listening for the next message on the queue.
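To make the queue-listener idea a little more concrete, here is a small Python sketch that uses an in-process `queue.Queue` as a stand-in for a real message queue. The function name `stream_from_queue` and the `None` stop marker are assumptions for this sketch.

```python
import json
import queue


def stream_from_queue(messages, write, stop_marker=None):
    """Block on the queue and write each message out as soon as it arrives."""
    while True:
        message = messages.get()    # blocks until a message is available
        if message is stop_marker:  # hypothetical shutdown signal for this sketch
            break
        write(json.dumps(message) + "\n")
```

Here `write` would be whatever pushes bytes to the open response. The blocking `get()` takes over the role of the one-second sleep: the loop wakes up only when there is something to send.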

In a follow-up post, I will demo this same concept using Node.js.

Hope this helps.

In a previous post, we saw how easy it was to start consuming the Twitter Streaming API and display the messages on a console window. In this post, we’ll take it a step further and try to derive some useful information about the activity stream on Twitter in real-time. I’ve given myself a hypothetical goal of deriving an answer to the following question:

Which Twitter users are mentioned most often when a given keyword is included in the tweet?

How useful is this information?  One idea could be to identify influencers in real time when you’re expecting a sudden increase in a keyword or hashtag for a sponsored event or TV spot.  If people are mentioning a particular user over and over again with your brand in it, you could connect with that user and help spread your message.

Let’s Get Started

For demonstration purposes, the components I’ll be building include:

  1. twitter_stream_db (SQL Server Database) – This will store the mention count for individual users.
  2. MSMQ – The original plan was to create a queue that would be sent messages from the Twitter Streaming API. I’m going to leave this for a future post since I’m just doing a POC at this point. If I were to even consider using this in production I would definitely develop a queuing system, but at this time it’s a bit of overkill.
  3. TwitterReader (console) – A small application that will read the Twitter API. The original plan was for it to drop messages into an MSMQ channel; in this POC it updates the database directly.
  4. TwitterWriter (console) – Originally planned as a small application that would read messages from MSMQ and update the database. No MSMQ as explained above, so no need to read off the queue. Again, I’ll write these components so we can scale in a later post. For now, consider this just a POC.

TwitterReader

This will be a slightly modified version of the console application we created in the previous post. Instead of writing to a console window, we’re going to parse the JSON objects using Json.net and then insert users mentioned in the tweet into a table using the SQL MERGE command.

Looking at the code, you’ll see we modified the stream URL slightly to include the keyword we want to track, and we send the JSON result to a new method, ParseJson.   In this case, let’s see which users are mentioned most often every time someone tweets something with the keyword “love” in it.   (I know it sounds corny but I needed something popular so that I can show off the results.)

We’re using Json.net’s Linq to Json feature to navigate to the user_mentions array. Once we have it, we just loop through all the users in the array and MERGE them into the database table through the stored procedure (see below).
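For readers who want to see the shape of that navigation, here is a hedged Python sketch using the standard `json` module instead of Json.net. It assumes the tweet carries its mentions under `entities.user_mentions`, each with `id` and `screen_name` fields; the function name `extract_mentions` is made up for this example.

```python
import json


def extract_mentions(raw_tweet):
    """Return (user_id, screen_name) pairs for every user mentioned in a tweet."""
    if not raw_tweet.strip():
        return []  # blank lines on the stream carry no tweet
    tweet = json.loads(raw_tweet)
    # Messages without an entities block (delete notices, for example) yield no mentions.
    mentions = tweet.get("entities", {}).get("user_mentions", [])
    return [(mention["id"], mention["screen_name"]) for mention in mentions]
```

Each returned pair is what would then be handed to the database step, one call per mentioned user.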

twitter_stream_db

For simplicity, I’m going to create a single table to store the data as it comes in.  The primary key is the Twitter user’s id since there should be only one record per user at any given time.

I used the new MERGE command to perform an “upsert” of the data.  If it’s the first time the user has been mentioned, it will perform an insert and set the mention_count to 1.  Otherwise, we’ll update the record by setting mention_count to mention_count + 1.

Since I’m using the MERGE command, I encapsulated it into a stored procedure as opposed to writing LINQ queries. The stored procedure receives a user id and twitter name and performs the insert/update logic.

The MERGE command is an incredibly useful feature introduced in SQL Server 2008.
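The insert-or-increment logic can be illustrated without SQL Server. The Python sketch below uses the built-in `sqlite3` module and an explicit UPDATE-then-INSERT in place of MERGE, so it is a swapped-in technique and not the stored procedure from the post. The table name `user_mentions` and the columns `user_id` and `screen_name` are assumptions; only `mention_count` comes from the text above.

```python
import sqlite3


def create_schema(conn):
    """Create the single mentions table, keyed on the Twitter user id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_mentions ("
        " user_id INTEGER PRIMARY KEY,"
        " screen_name TEXT NOT NULL,"
        " mention_count INTEGER NOT NULL)"
    )


def record_mention(conn, user_id, screen_name):
    """Insert the user with a count of 1, or bump the count if the row exists."""
    with conn:  # one transaction per upsert
        updated = conn.execute(
            "UPDATE user_mentions"
            " SET mention_count = mention_count + 1, screen_name = ?"
            " WHERE user_id = ?",
            (screen_name, user_id),
        )
        if updated.rowcount == 0:  # no existing row: first mention of this user
            conn.execute(
                "INSERT INTO user_mentions (user_id, screen_name, mention_count)"
                " VALUES (?, ?, 1)",
                (user_id, screen_name),
            )
```

With MERGE, the match test and both branches live in a single statement; the sketch only spells the two branches out separately so the logic is easy to follow.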

Parsing the JSON Result

As mentioned above,  we’re just going to parse the JSON object and iterate through the users mentioned in the tweet.  As we do that, we’ll pass the users into the stored procedure above and MERGE the data into the SQL table.


Important: Since we’re not using a queuing system, the rate at which we can process tweets will depend largely on the speed of our SQL stored procedure. If you’re considering something similar in a production environment, please implement a queuing system to handle the load.

Results

Once you start running the application, you’ll start to see Twitter ids and screen names appearing on the console window.  Let it run for a few minutes and, depending on the popularity of your search term, you should start to see some results.  You can then go to SQL Server Management Studio and run a simple query to get a view of the activity on Twitter for that keyword / user mention combination.
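The kind of query meant here is a plain sort on the counter. As a hedged Python sketch against `sqlite3` (standing in for SQL Server Management Studio), and assuming a table named `user_mentions` with `screen_name` and `mention_count` columns, it could look like this; the function name `top_mentions` is hypothetical.

```python
import sqlite3


def top_mentions(conn, limit=10):
    """Return the most-mentioned users as (screen_name, mention_count) rows."""
    cursor = conn.execute(
        "SELECT screen_name, mention_count FROM user_mentions"
        " ORDER BY mention_count DESC, screen_name ASC"
        " LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()
```

On SQL Server the same idea would be written with `SELECT TOP (n) ... ORDER BY mention_count DESC` instead of `LIMIT`.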

Hope you enjoyed this post and have some ideas for implementing something similar with your next social media and Twitter campaigns!

Several social networking sites including Twitter and Digg have implemented some form of a streaming API. This is an emerging pattern with many high-volume content providers. At first glance, a streaming API may seem to be more resource intensive than a traditional polling API, but as discussed here, it is actually much more streamlined.

Most of these server implementations use some sort of queuing system and a long-lived HTTP connection which clients use to have data delivered in near-realtime. In this post, I’m going to demonstrate just how easy it is to hook into the Twitter Streaming API.  Specifically, we’ll be using the statuses/filter method to consume any tweets which match a filter condition in realtime. I’m sure you can think of some neat ideas that can leverage this concept!

To demonstrate, let’s just create a console application in Visual Studio. We’re going to create a simple WebRequest to connect and just start reading the stream. Then, we’ll just print out the JSON result to the console window. I’ll leave it up to you to write some parsing logic to actually do something with the data.
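The read loop itself is tiny. Here is a rough Python equivalent of the idea, not the C# WebRequest code; `open_stream`, `read_stream`, and the 90-second timeout are assumptions for this sketch, and authentication is left out entirely.

```python
import urllib.request


def open_stream(url, timeout=90):
    """Open a long-lived HTTP connection (credentials omitted in this sketch)."""
    return urllib.request.urlopen(url, timeout=timeout)


def read_stream(response, on_message):
    """Call on_message for each non-empty line until the stream ends."""
    for raw_line in response:
        line = raw_line.decode("utf-8").strip()
        if line:  # skip blank keep-alive lines
            on_message(line)
```

Passing `print` as `on_message` reproduces the console behaviour described above: `read_stream(open_stream(url), print)`, where `url` is the filter endpoint with your tracking keyword.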

That’s all there is to it!   Your output on the console window will look like this:

What you do with the data is where the real magic happens. The last I heard Twitter was producing somewhere in the range of 1000 tweets per second and I’m sure it’s much higher than that now.  You will probably want to implement a message queuing system where you hand off the response as soon as possible for some other process to handle without blocking the incoming stream.
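As a small-scale picture of that hand-off, the Python sketch below uses an in-process queue and a worker thread. A real deployment would more likely use a separate queuing system and process; `start_worker`, the `None` stop signal, and the queue size are all assumptions for this sketch.

```python
import queue
import threading


def start_worker(handle, maxsize=10000):
    """Start a background thread that processes items placed on the returned queue."""
    pending = queue.Queue(maxsize=maxsize)  # put() blocks once the queue is full

    def run():
        while True:
            item = pending.get()
            if item is None:  # hypothetical stop signal for this sketch
                break
            handle(item)  # slow work (parsing, database writes) happens here

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return pending, thread
```

The stream reader then only calls `pending.put(line)` for each message and goes straight back to reading, so slow processing no longer stalls the incoming connection unless the queue fills up.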

In a future post I’ll describe some neat ideas for what you can do with the data from the stream.

Hope this helps.