In a previous post, we saw how easy it was to start consuming the Twitter Streaming API and display the messages on a console window. In this post, we’ll take it a step further and try to derive some useful information about the activity stream on Twitter in real time. I’ve given myself a hypothetical goal of deriving an answer to the following question:
Which Twitter users are mentioned most often when a given keyword is included in the tweet?
How useful is this information? One idea could be to identify influencers in real time when you’re expecting a sudden increase in a keyword or hashtag for a sponsored event or TV spot. If people keep mentioning a particular user in tweets that also mention your brand, you could connect with that user and ask them to help spread your message.
Let’s Get Started
For demonstration purposes, the components I’ll be building include:
- twitter_stream_db (SQL Server Database) – This will store the mention count for individual users
- MSMQ – The original plan was a queue that would be sent messages from the Twitter Streaming API. I’m going to leave this for a future post since I’m just doing a POC at this point. If I were to even consider using this in production I would definitely develop a queuing system, but at this time it’s a bit of overkill.
- TwitterReader (console) – A small application that will read the Twitter API. The original plan was to drop messages into an MSMQ channel; for this POC it writes straight to the database instead.
- TwitterWriter (console) – The original plan was a small application that would read messages from MSMQ and update the database. With no MSMQ, as explained above, there is no need to read off the queue. Again, I’ll write these components so we can scale in a later post. For now, consider this just a POC.
TwitterReader will be a slightly modified version of the console application we created in the previous post. Instead of writing to a console window, we’re going to parse the JSON objects using Json.net and then insert the users mentioned in each tweet into a table using the SQL MERGE command.
Looking at the code, you’ll see we modified the stream URL slightly to include the keyword we want to track, and we now send each JSON result to a new method, ParseJson. In this case, let’s see which users are mentioned most often every time someone tweets something with the keyword “love” in it. (I know it sounds corny but I needed something popular so that I can show off the results.)
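To make the URL change concrete, here is a minimal sketch in Python (used here only as a compact stand-in for the .NET console app). The base URL is an assumption: it is the classic statuses/filter streaming endpoint, which accepts a `track` parameter, and may not be the exact URL this application uses. The helper name `build_track_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the classic statuses/filter streaming endpoint. The exact URL
# used by the console application may differ.
STREAM_URL = "https://stream.twitter.com/1.1/statuses/filter.json"


def build_track_url(keyword: str) -> str:
    """Return the stream URL with the keyword passed as the `track` filter."""
    # urlencode takes care of escaping spaces and other special characters.
    return STREAM_URL + "?" + urlencode({"track": keyword})


print(build_track_url("love"))
# https://stream.twitter.com/1.1/statuses/filter.json?track=love
```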
We’re using Json.net’s LINQ to JSON feature to navigate to the user_mentions array. Once we have it, we just loop through all the users in the array and MERGE them into the database table through the stored procedure (see below).
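The navigation itself is simple enough to sketch. The Python below (a stand-in for the Json.net version, not the ParseJson method itself) assumes the standard tweet layout in which user_mentions sits under an `entities` object and each mention carries `id` and `screen_name`. The function name `extract_mentions` and the sample payload are hypothetical.

```python
import json
from typing import List, Tuple


def extract_mentions(raw_tweet: str) -> List[Tuple[int, str]]:
    """Return (user id, screen name) pairs for every user mentioned in a tweet."""
    tweet = json.loads(raw_tweet)
    # Some stream messages (delete notices, for example) carry no entities,
    # so fall back to an empty list instead of raising.
    mentions = tweet.get("entities", {}).get("user_mentions", [])
    return [(m["id"], m["screen_name"]) for m in mentions]


# Hypothetical, heavily trimmed payload for illustration only.
sample = json.dumps({
    "text": "@alice and @bob, love this!",
    "entities": {"user_mentions": [
        {"id": 101, "screen_name": "alice"},
        {"id": 202, "screen_name": "bob"},
    ]},
})

for user_id, screen_name in extract_mentions(sample):
    print(user_id, screen_name)
# 101 alice
# 202 bob
```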
For simplicity, I’m going to create a single table to store the data as it comes in. The primary key is the Twitter user’s id since there should be only one record per user at any given time.
I used the new MERGE command to perform an “upsert” of the data. If it’s the first time the user has been mentioned, it will perform an insert and set the mention_count to 1. Otherwise, we’ll update the record by setting mention_count to mention_count + 1.
Since I’m using the MERGE command, I encapsulated it into a stored procedure as opposed to writing LINQ queries. The stored procedure receives a user id and Twitter name and performs the insert/update logic.
The MERGE command is an incredibly useful feature introduced in SQL Server 2008.
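The upsert logic can be sketched without SQL Server. The Python below uses an in-memory SQLite database as a stand-in. SQLite has no MERGE, so the two branches are written out as an UPDATE followed by a conditional INSERT, and this is not the stored procedure itself. The table name `user_mentions`, the column names `user_id` and `screen_name`, and the function names are hypothetical. Only mention_count comes from the description above.

```python
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    # One row per Twitter user; the user's id is the primary key.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_mentions ("
        " user_id INTEGER PRIMARY KEY,"
        " screen_name TEXT NOT NULL,"
        " mention_count INTEGER NOT NULL)"
    )


def record_mention(conn: sqlite3.Connection, user_id: int, screen_name: str) -> None:
    # "Matched" branch: the user already has a row, so bump the counter.
    cur = conn.execute(
        "UPDATE user_mentions SET mention_count = mention_count + 1"
        " WHERE user_id = ?",
        (user_id,),
    )
    if cur.rowcount == 0:
        # "Not matched" branch: first mention, so insert with a count of 1.
        conn.execute(
            "INSERT INTO user_mentions (user_id, screen_name, mention_count)"
            " VALUES (?, ?, 1)",
            (user_id, screen_name),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_table(conn)
for uid, name in [(101, "alice"), (202, "bob"), (101, "alice")]:
    record_mention(conn, uid, name)

rows = conn.execute(
    "SELECT user_id, screen_name, mention_count FROM user_mentions ORDER BY user_id"
).fetchall()
print(rows)
# [(101, 'alice', 2), (202, 'bob', 1)]
```

MERGE expresses both branches in a single statement, which is what makes it a good fit for a stored procedure.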
Parsing the JSON Result
As mentioned above, we’re just going to parse the JSON object and iterate through the users mentioned in the tweet. As we do that, we’ll pass the users into the stored procedure above and MERGE the data into the SQL table.
Important: Since we’re not using a queuing system, the rate at which we can process tweets will depend largely on the speed of our SQL stored procedure. If you’re considering something similar in a production environment, please implement a queuing system to handle the load.
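To show why a queue helps, here is a rough Python sketch of the reader/writer split. It uses an in-process `queue.Queue` and one worker thread purely as a stand-in for MSMQ. Unlike a real message queue it is not durable and does not cross process boundaries. The names `run_pipeline` and `handle` are hypothetical.

```python
import queue
import threading


def run_pipeline(raw_messages, handle):
    """Feed messages through an in-memory queue to a single worker thread."""
    pending = queue.Queue()
    stop = object()  # sentinel that tells the worker to shut down

    def worker():
        while True:
            item = pending.get()
            if item is stop:
                break
            handle(item)  # the slow step, e.g. the database write

    writer = threading.Thread(target=worker)
    writer.start()
    # The reader only enqueues, so it never waits on the database.
    for message in raw_messages:
        pending.put(message)
    pending.put(stop)
    writer.join()


processed = []
run_pipeline(["tweet 1", "tweet 2", "tweet 3"], processed.append)
print(processed)
# ['tweet 1', 'tweet 2', 'tweet 3']
```

With this split, a burst of tweets lengthens the queue instead of stalling the stream reader.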
Once you start running the application, you’ll start to see Twitter ids and screen names appearing on the console window. Let it run for a few minutes and, depending on the popularity of your search term, you should start to see some results. You can then go to SQL Server Management Studio and run a simple query to get a view of the activity on Twitter for that keyword / user mention combination.
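As an illustration of that query, the Python sketch below runs a "top mentions" query against an in-memory SQLite table. The table and column names (`user_mentions`, `user_id`, `screen_name`) and the sample rows are hypothetical, made up only so the example is runnable. In SQL Server the same idea would use TOP rather than LIMIT.

```python
import sqlite3


def top_mentions(conn: sqlite3.Connection, limit: int = 10):
    """Return the most-mentioned users, highest count first."""
    return conn.execute(
        "SELECT screen_name, mention_count FROM user_mentions"
        " ORDER BY mention_count DESC, screen_name LIMIT ?",
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_mentions ("
    " user_id INTEGER PRIMARY KEY, screen_name TEXT, mention_count INTEGER)"
)
# Made-up rows for illustration only; these are not real results.
conn.executemany(
    "INSERT INTO user_mentions VALUES (?, ?, ?)",
    [(101, "alice", 7), (202, "bob", 12), (303, "carol", 3)],
)

print(top_mentions(conn, 2))
# [('bob', 12), ('alice', 7)]
```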
Hope you enjoyed this post and have some ideas for implementing something similar with your next social media and Twitter campaigns!