One of the questions we face when liberating government data is prioritization. Our end goal is to release all raw government data to the public in open, machine-readable formats, but which datasets should come first? Our approach is to go after the low-hanging fruit initially (data that is already available) and then measure public demand through an open source voting platform.
While community voting is a decent proxy for actual demand, we were curious what demand would look like in a mature environment, and more specifically what the frequency distribution of data consumption would be. So we reached out to the folks in DC who pioneered this space (special thanks to David Strigel, Program Manager – Citywide Data Warehouse).
What you see in the chart below isn’t too surprising: crime and 311 data are clear hits, accounting for 56% of total downloads. We also see that 80% of downloads come from 20% of datasets. The other notable figure is the number of downloads: 600,548 between 10/1/08 and 5/13/09, which doesn’t even take into account use in dashboards, reports, and applications.
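For anyone who wants to run the same kind of concentration check on their own download logs, here is a minimal sketch in Python. The dataset names and counts are made up for illustration and are not the DC figures; the function name `share_of_datasets_for` is also just a placeholder.

```python
def share_of_datasets_for(downloads, target=0.80):
    """Return the fraction of datasets needed to reach `target` share of all downloads.

    `downloads` maps a dataset name to its download count.
    """
    counts = sorted(downloads.values(), reverse=True)  # most popular first
    total = sum(counts)
    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return rank / len(counts)
    return 1.0


# Hypothetical counts, NOT the real DC numbers.
example_downloads = {
    "crime": 520, "311": 300, "permits": 50, "parcels": 40, "schools": 30,
    "parks": 20, "zoning": 15, "trees": 10, "signs": 10, "wards": 5,
}

top_share = share_of_datasets_for(example_downloads)
print(f"{top_share:.0%} of datasets account for 80% of downloads")
```

With these made-up numbers the two most popular datasets already cover 82% of downloads, so the function reports that 20% of the datasets are enough to reach the 80% mark.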
The large number of downloads is very encouraging for other municipalities and speaks to how much public demand exists for machine-readable datasets. This matters as we attempt to define measures for Gov 2.0 efforts that demonstrate value to our stakeholders.