sparkplot: creating sparklines in Python with matplotlib
Sparkplot is a Python module that uses the matplotlib plotting library to create sparklines.
Edward Tufte introduced sparklines in a sample chapter of his upcoming book "Beautiful Evidence". In his words, sparklines are small, high-resolution graphics embedded in a context of words, numbers, images. Sparklines are data-intense, design-simple, word-sized graphics.
The following examples of sparkline graphics were created with sparkplot.
Here is the Los Angeles Lakers' road to their NBA title in 2002. Wins are pictured with blue bars and losses with red bars. Note how easy it is to see the streaks for wins and losses. The Lakers' 2004 season was their last with Shaq, when they reached the NBA finals and lost to Detroit (note the last 3 losses which sealed their fate in the finals). Compare those days of glory with their abysmal 2005 performance, with only 2 wins in the last 21 games. Also note how the last graphic has a smaller width than the previous two graphics, a consequence of the Lakers not making the playoffs this year.
Here are the same sparklines in table format, to facilitate year-to-year comparison:
The southern oscillation is defined as the barometric pressure difference between Tahiti and the Darwin Islands at sea level. The southern oscillation is a predictor of El Nino, which in turn is thought to be a driver of world-wide weather. Specifically, repeated southern oscillation values less than -1 typically define an El Nino. Here is a sparkline for the southern oscillation from 1955 to 1992 (456 sample data points obtained from NIST). The sparkline is plotted with a horizontal span drawn along the x axis covering data values between -1 and 0, so that values less than -1 can be more clearly seen.
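The horizontal span described above can be sketched directly in matplotlib. This is a minimal illustration, not sparkplot's own code: the function name, figure size and colors are assumptions, and the data is a synthetic sine wave standing in for the NIST series.

```python
import math
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt


def plot_sparkline_with_span(values, out_path, low=-1.0, high=0.0):
    """Draw values as a thin line with a shaded band between low and high."""
    fig, ax = plt.subplots(figsize=(4.0, 0.5))
    ax.plot(range(len(values)), values, color="gray", linewidth=0.5)
    # axhspan shades the full width of the axes between two y values
    ax.axhspan(low, high, facecolor="lightgray", alpha=0.5)
    ax.axis("off")  # no ticks, labels or frame
    fig.savefig(out_path, dpi=100)
    plt.close(fig)
    return out_path


# Synthetic stand-in data (NOT the NIST series): 456 points of a sine wave
demo_values = [2.5 * math.sin(i / 7.0) for i in range(456)]
demo_path = os.path.join(tempfile.mkdtemp(), "oscillation_demo.png")
plot_sparkline_with_span(demo_values, demo_path)
```

With the band covering -1 to 0, any dip of the line below the band corresponds to a value under -1.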
Here is the per capita income in California from 1959 to 2003. And here is the "real" per capita income (adjusted for inflation) in California, from 1959 to 2003.
Here is the monthly distribution of messages sent to comp.lang.py from 1994 to 2004, plotted per year. Minimum and maximum values are shown with blue dots and labeled in the graphics.
|Year||Message distribution||Total messages|
There was an almost constant increase in the number of messages per year from 1994 to 2004, the only exception being 2004, when there were fewer messages than in 2002 and 2003.
Sparkplot installation and usage
2) Install matplotlib.
3) Check out the sparkplot module via subversion:
svn co http://svn.idyll.org/repos/sparkplot/ sparkplot
4) Prepare data file: sparkplot simplistically assumes that its input data file contains just 1 column of numbers.
5) Run sparkplot.py.
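Step 4 asks for a data file with a single column of numbers. As a rough sketch of what that layout looks like, the helpers below write and read one number per line. The function names and sample values are made up for illustration, and sparkplot's own parsing may differ.

```python
import os
import tempfile


def write_column(values, path):
    """Write one number per line."""
    with open(path, "w") as f:
        for v in values:
            f.write(f"{v}\n")


def read_column(path):
    """Read one number per line, skipping blank lines."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


# Made-up sample values, for illustration only
sample = [12.5, 13.0, 12.75, 14.25]
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
write_column(sample, path)
print(read_column(path))
```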
Here are some command-line examples to get you going:
- given only the input file and no other option, sparkplot.py will generate a gray sparkline with the first and last data points plotted in red
sparkplot.py -i CA_real_percapita_income.txt
The name of the output file is by default input_file_name_with_no_extension.png. It can be changed with the -o option. The plotting of the first and last data points can be disabled with the --noplot_first and --noplot_last options.
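For readers curious how such a default sparkline could be drawn, here is a minimal matplotlib sketch: a gray line with the first and last points in red, plus an output name derived from the input name. It is not taken from sparkplot's source. The function names, the `plot_first`/`plot_last` parameters (which only mirror the command-line options), and the figure size are assumptions.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt


def default_output_name(input_path):
    """Map 'data.txt' to 'data.png' (assumed naming rule)."""
    root, _ext = os.path.splitext(input_path)
    return root + ".png"


def draw_sparkline(values, out_path, plot_first=True, plot_last=True):
    """Gray line; first and last points optionally marked in red."""
    fig, ax = plt.subplots(figsize=(2.0, 0.4))
    xs = list(range(len(values)))
    ax.plot(xs, values, color="gray", linewidth=1.0)
    if plot_first:
        ax.plot([xs[0]], [values[0]], "r.", markersize=4)
    if plot_last:
        ax.plot([xs[-1]], [values[-1]], "r.", markersize=4)
    ax.axis("off")  # no ticks, labels or frame
    fig.savefig(out_path, dpi=100)
    plt.close(fig)
    return out_path


# Made-up values and a temporary path, for illustration only
values = [3.0, 3.4, 3.1, 3.9, 4.2, 4.0, 4.8]
input_path = os.path.join(tempfile.mkdtemp(), "income_demo.txt")
out = draw_sparkline(values, default_output_name(input_path))
```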
- given the input file and the --label_first --label_last --format=currency options, sparkplot.py will generate a gray sparkline with the first and last data points plotted in red and with the first and last data values displayed in a currency format
sparkplot.py -i CA_real_percapita_income.txt --label_first --label_last --format=currency
The currency symbol is $ by default, but it can be changed with the --currency option.
- given the input file and the --plot_min --plot_max --label_min --label_max --format=comma options, sparkplot.py will generate a gray sparkline with the first and last data points plotted in red, with the min. and max. data points plotted in blue, and with the min. and max. data values displayed in a 'comma' format (e.g. 23,456,789)
sparkplot.py -i clpy_1997.txt --plot_min --plot_max --label_min --label_max --format=comma
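The 'currency' and 'comma' label formats, and the lookup of the minimum and maximum points, can be approximated in plain Python. This is a sketch under assumptions: the helper names are hypothetical, labels are rounded to whole numbers, and sparkplot's exact output format may differ.

```python
def format_comma(value):
    """Group digits in thousands, e.g. 23456789 -> '23,456,789'."""
    return f"{value:,.0f}"


def format_currency(value, symbol="$"):
    """Prefix the comma-grouped value with a currency symbol."""
    return f"{symbol}{value:,.0f}"


def min_max_points(values):
    """Return ((index, value) of the minimum, (index, value) of the maximum).

    On ties, the first occurrence wins.
    """
    i_min = min(range(len(values)), key=values.__getitem__)
    i_max = max(range(len(values)), key=values.__getitem__)
    return (i_min, values[i_min]), (i_max, values[i_max])


# Made-up values, for illustration only
counts = [5, 2, 9, 2, 9]
lowest, highest = min_max_points(counts)
print(lowest, highest)
print(format_comma(23456789), format_currency(12345, symbol="$"))
```

The indices returned by `min_max_points` are what a plotting routine would need to place the blue dots and their labels.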
- given the input file and the --type=bars option, sparkplot.py will draw blue bars for the positive data values and red bars for the negative data values
sparkplot.py -i lakers2005.txt --type=bars
As a side note, I think bar plots look better when the data file contains a relatively large number of data points, and the variation of the data is relatively small. This type of plot works especially well for sports-related graphics, where wins are represented as +1 and losses as -1.
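The +1/-1 encoding and the blue/red bar coloring can be sketched as follows. This is an illustration rather than sparkplot's implementation. The function names are hypothetical, the results string is made up (not an actual Lakers record), and scaling the figure width with the number of games is an assumption.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt


def encode_results(results):
    """Turn a string such as 'WLW' into [1, -1, 1]."""
    mapping = {"W": 1, "L": -1}
    return [mapping[r] for r in results]


def bar_colors(values):
    """Blue for positive values, red for negative ones."""
    return ["blue" if v >= 0 else "red" for v in values]


def draw_bars(values, out_path):
    # Assumed sizing: more games give a wider graphic
    width_in = max(1.0, len(values) * 0.04)
    fig, ax = plt.subplots(figsize=(width_in, 0.4))
    ax.bar(range(len(values)), values, width=0.6, color=bar_colors(values))
    ax.axis("off")  # no ticks, labels or frame
    fig.savefig(out_path, dpi=100)
    plt.close(fig)
    return out_path


# Made-up sequence of wins and losses, for illustration only
games = encode_results("WWLWWWLLWL")
bars_path = draw_bars(games, os.path.join(tempfile.mkdtemp(), "games_demo.png"))
```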
- given the input file and the -t or --transparency option, sparkplot.py will generate a transparent background for the PNG image it produces
sparkplot.py -i CA_real_percapita_income.txt -t
produces a transparent-background image
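In matplotlib, a transparent PNG background is available through the `transparent` argument of `savefig`. The sketch below shows that call in isolation. Whether sparkplot uses this exact mechanism is an assumption, and the function name and sample values are made up.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt


def save_transparent_sparkline(values, out_path):
    fig, ax = plt.subplots(figsize=(2.0, 0.4))
    ax.plot(range(len(values)), values, color="gray", linewidth=1.0)
    ax.axis("off")  # no ticks, labels or frame
    # transparent=True makes the figure and axes backgrounds see-through
    fig.savefig(out_path, dpi=100, transparent=True)
    plt.close(fig)
    return out_path


# Made-up values, for illustration only
transparent_path = save_transparent_sparkline(
    [1.0, 1.2, 0.9, 1.5, 1.4],
    os.path.join(tempfile.mkdtemp(), "transparent_demo.png"),
)
```

A transparent background lets the sparkline sit on a colored Web page without a white box around it.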
For other sparkplot options, run:
Sparkplot is licensed under the Python Software Foundation license.
Author contact info
Email: grig at gheorghiu dot net
Web site: http://agiletesting.blogspot.com
Kudos to John Hunter, the creator of matplotlib. I found this module extremely powerful and versatile. For a nice introduction to matplotlib, see also John's talk at PyCon05.
Thanks to Mark Kuba for contributing patches to the sparkplot code.
I hope the sparkplot module will prove to be useful when you need to include sparkline graphics in your Web pages. All the caveats associated with alpha-level software apply :-) Let me know if you find it useful. I'm very much a beginner at using matplotlib, and as I become more acquainted with it I'll add more functionality to sparkplot.
Other sparkline implementations
- Joe Gregorio has a Python implementation of sparklines that uses data:URI to embed them into HTML pages
- There is also a Ruby implementation of the data:URI mechanism
- James Byers has a PHP implementation of sparklines using a GD backend
- A topic on the Ask E.T. forum on Edward Tufte's site is dedicated to various sparkline implementations