A Simple Twitter Filter Stream Parser

January 2, 2011

In this post, I provide an example of how to use the Twitter4J library to pull tweets that contain specific keywords or come from individual users. While this is a very simple example, I hope it helps newcomers to Twitter application development, because the simple functionality provided here is the basis for many interesting Twitter-related applications.

This example can also be run as-is by a non-developer to simply pull Twitter data into a text file.

The code for this example is posted on GitHub at https://github.com/drenz/TwitterFilterStreamParser

Running the Example

Install the required software to download and run the example

Clone the git repository

  • git clone git@github.com:drenz/TwitterFilterStreamParser.git

Update property files

  • Update properties/runtime.properties file to supply your Twitter screen name and password. Twitter requires authentication before opening a data stream.
  • Update the properties/keywords file to specify which keywords you’d like to receive tweets about. For each keyword you would like to track on Twitter, add the keyword on its own line. NOTE: keywords cannot have spaces.
  • Update the properties/users file to specify which Twitter users you’d like to receive tweets from. For each user you would like to track on Twitter, add the user’s Twitter ID number on its own line. You can find a user’s Twitter ID given their screen name at http://www.idfromuser.com/
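
To make the format of the keywords and users files concrete, here is a minimal sketch of a loader that reads one entry per line. It is not the code from the repository: the class and method names are hypothetical, and skipping blank lines is my assumption, since the format above only says one entry per line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the repository: parses the
// one-entry-per-line format of properties/keywords and properties/users.
final class FilterFileSketch {

    // Returns one keyword per non-blank line; rejects keywords containing whitespace.
    static List<String> parseKeywords(List<String> lines) {
        List<String> keywords = new ArrayList<>();
        for (String line : lines) {
            String keyword = line.trim();
            if (keyword.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            if (keyword.matches(".*\\s.*")) {
                throw new IllegalArgumentException("Keyword contains a space: " + keyword);
            }
            keywords.add(keyword);
        }
        return keywords;
    }

    // Returns one numeric Twitter ID per non-blank line.
    static List<Long> parseUserIds(List<String> lines) {
        List<Long> ids = new ArrayList<>();
        for (String line : lines) {
            String id = line.trim();
            if (id.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // Throws NumberFormatException if a screen name was entered instead of an ID.
            ids.add(Long.parseLong(id));
        }
        return ids;
    }

    public static void main(String[] args) throws IOException {
        List<String> keywords = parseKeywords(Files.readAllLines(Paths.get("properties/keywords")));
        List<Long> userIds = parseUserIds(Files.readAllLines(Paths.get("properties/users")));
        System.out.println("Tracking keywords: " + keywords);
        System.out.println("Following user IDs: " + userIds);
    }
}
```

Run from the project’s base directory, this prints the keywords and user IDs it found, which is a quick way to catch a keyword with a space or a screen name entered where an ID belongs.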

Run the application

  • In the project’s base directory, execute: ant run

Next Steps

While this example is functional, there are many things that should be done to create a solid application based on Twitter data. For instance, DontTweetThat (http://www.donttweetthat.com) uses the same basic principle to collect the tweets that it displays, but is refined to be robust, scalable and flexible.

Here is a brief list of things you may want to do while expanding on this example:

  • Handle exceptions that arise from potential network issues, Twitter hiccups or other problems that would stop the parser from collecting data
  • Store tweets in a database rather than a simple text file
  • Once a tweet is received, do some processing to filter out tweets that may not be of any value
  • Do data analysis on tweets you’ve stored
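
As one possible starting point for the first item, the sketch below retries a failed connection with exponential backoff, so the parser does not stop at the first network hiccup. Everything in it is an assumption for illustration: the class name, the delay values, and the idea of wrapping the connection attempt in a Callable are mine and do not come from the example code or from Twitter4J.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: retry a connection attempt with exponential backoff.
final class ReconnectSketch {

    static final long INITIAL_DELAY_MS = 1_000; // assumed starting delay
    static final long MAX_DELAY_MS = 60_000;    // assumed upper bound

    // Delay before retrying after failed attempt number `attempt` (0-based):
    // 1s, 2s, 4s, ... capped at MAX_DELAY_MS.
    static long backoffMillis(int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 16); // bounded shift, no overflow
        return Math.min(INITIAL_DELAY_MS << shift, MAX_DELAY_MS);
    }

    // Calls `connect` until it succeeds or maxAttempts attempts have failed,
    // sleeping between attempts. Rethrows the last failure if all attempts fail.
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(backoffMillis(attempt));
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Print the delay schedule rather than sleeping through it.
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("retry " + attempt + " waits " + backoffMillis(attempt) + " ms");
        }
        // A stand-in for opening the stream; a real application would open it here.
        String result = connectWithRetry(() -> "connected", 5);
        System.out.println(result);
    }
}
```

In a real application the Callable would open the filter stream, and you would probably also want to log each failure and tune the delays to whatever Twitter currently recommends.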

I intend to do posts on some of these improvements, so stay tuned!

Helpful Resources

Twitter4J: Great documentation and more examples for using the Twitter4J library
Twitter Developers site: Lots of info on the various ways to access Twitter’s data and best practices for implementing Twitter-based applications.
