Many Latin American countries publish open data—government data made freely available online in machine-readable formats and without license restrictions. However, there is a tremendous amount of variation in the quantity and type of datasets governments publish on national open data portals—central online repositories for open data that make it easier for users to find data. Despite the wide variation among the countries, the most popular datasets tend to be those that either provide transparency into government operations or offer information that citizens can use directly. As governments continue to update and improve their open data portals, they should take steps to ensure that they are publishing the datasets most valuable to their citizens.
To better understand this variation, we collected information about open data portals in 20 Latin American countries, including Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Ecuador, Mexico, Panama, Paraguay, Peru, and Uruguay. Not every Latin American country has an open data portal, although governments that do not operate a unified portal may still publish open data elsewhere. Four countries (Belize, Guatemala, Honduras, and Nicaragua) do not have open data portals. One country, El Salvador, does not have a government-run open data portal but does have a national open data portal (datoselsalvador.org) run by volunteers.
Below is a table of countries we analyzed, along with whether the country operates an open data portal and its score on the Open Data Barometer.
Table 1: Latin American Countries With Open Data Portals
* The Open Data Barometer is a global measure of how governments are publishing and using open data. Scores are out of a possible 100 points.
** The Bolivian government has a catalog of national statistics, but it does not refer to this site as an open data portal.
*** A non-government organization operates the open data portal in El Salvador.
Among the countries that do have open data portals, the number of datasets published varies greatly. Mexico, Colombia, and Brazil published some of the most extensive data catalogues, featuring approximately 20,000, 6,000, and 2,500 datasets, respectively. Other countries, such as Argentina, El Salvador, and Panama, each published fewer than 100 datasets on their portals. Countries also publish data in a wide variety of formats. In countries like Costa Rica and Peru, much of the published data was quantitative, such as government spending data and average prices of goods, and was available as XLS and CSV spreadsheet files. Meanwhile, data catalogues in countries such as Paraguay and Brazil included qualitative lists, such as government web domains, university classes, certified lawyers, and court experts. Many of these lists were published as HTML or PDF files. Governments should increasingly publish data in machine-readable formats with accompanying metadata, because such files are easier to manipulate, standardize, and analyze.
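One way a portal operator might gauge how much of a catalogue is machine-readable is to tally the file formats listed for its datasets. The short Python sketch below illustrates the idea under stated assumptions: the set of formats treated as machine-readable is an illustrative choice rather than an official standard, the function name is hypothetical, and the sample list is made up rather than drawn from any portal discussed here.

```python
from collections import Counter

# Assumption: which formats count as machine-readable is a judgment call;
# this set is illustrative, not an official standard.
MACHINE_READABLE = {"CSV", "XLS", "XLSX", "JSON", "XML"}


def format_breakdown(formats):
    """Count file formats and return (counts, machine-readable share)."""
    # Normalize labels so "csv", "CSV", and " csv " are counted together.
    normalized = [f.strip().upper() for f in formats if f and f.strip()]
    counts = Counter(normalized)
    total = sum(counts.values())
    readable = sum(n for fmt, n in counts.items() if fmt in MACHINE_READABLE)
    share = readable / total if total else 0.0
    return counts, share


# Made-up sample for illustration; not data from any real portal.
sample = ["csv", "XLS", "pdf", "html", "CSV", " pdf "]
counts, share = format_breakdown(sample)
print(counts.most_common())
print(f"machine-readable share: {share:.0%}")
```

Run periodically against a portal's own catalogue listing, a tally like this could show whether the share of HTML and PDF files is shrinking over time.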
For Bolivia, Colombia, Mexico, and Uruguay, whose open data portals report the most downloaded or most visited datasets, we recorded the most popular datasets. The portals in Argentina, Brazil, Chile, Costa Rica, Ecuador, El Salvador, Panama, Paraguay, and Peru did not provide this information. In many cases, these sites lacked the feature because they were either running an older version of CKAN, the open-source open data portal software, or running a current version on which page view tracking had not yet been implemented.
Software Used in Open Data Portals in Latin American Countries
Reviewing the most popular datasets in a country’s open data portal provides some insight into which datasets are most valuable. However, this metric has limitations. First, it might not count certain datasets. For example, some government open data may be available from another source, such as directly from a government agency’s website or from a copy hosted by a third party. In addition, the metric may overlook high-value datasets that have only a small group of users but provide those users with a large amount of value. Even so, certain patterns emerged among the portals that report their most popular datasets.
The most popular datasets in Bolivia and Mexico included census data and surveys about housing, agriculture, and economic indicators. In other countries, the most popular datasets provided information directly useful to citizens. Colombia, for example, publishes lists of approved medical institutions and TV channels, and Uruguay publishes bus timetables. Countries should consider what information is most useful to their citizens. Paraguay, for instance, publishes contact information for lawyers and translators.
Some open data portals had unique features that other countries may want to replicate. Chile sorted its datasets into folders that were arranged based on whether a particular dataset was published as a result of public requests for information, formal audits, or other accountability functions. Colombia is unique in that its open data portal also includes data visualizations. One of Colombia’s most popular datasets is an interactive map that pinpoints the location and address of a university headquarters, and another is a comparative bar graph about mobile telecommunication penetration rates in consecutive quarters. Other countries might find that one way to increase usage of their datasets is to make them more accessible to the average user through data visualizations.
There are many steps Latin American governments can take to improve open data in their countries. Nations without open data portals should create them, and those that already have them should continue to update them and publish more datasets to better serve their constituents. One way to do this is to monitor the popular datasets on other countries’ open data portals and, where applicable, ensure the government produces similar datasets. Those running open data portals should also routinely monitor search queries to see what users are looking for and, if users are looking for datasets that have not yet been posted, work with the relevant government agencies to make these datasets available.
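The search-query monitoring suggested above can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any particular portal works: it assumes the operator can export a plain list of search strings and a list of dataset titles, it uses simple substring matching where a real portal would rely on its own search engine, and the function name, threshold, and sample data are all hypothetical.

```python
from collections import Counter


def unmet_queries(search_log, dataset_titles, min_count=2):
    """Return (query, count) pairs for repeated searches that match no title."""
    titles = [t.lower() for t in dataset_titles]
    # Normalize queries so differences in case and spacing do not split counts.
    counts = Counter(q.strip().lower() for q in search_log if q.strip())
    return [
        (query, n)
        for query, n in counts.most_common()
        if n >= min_count and not any(query in title for title in titles)
    ]


# Hypothetical data for illustration only.
titles = ["Bus timetables 2016", "Government spending by ministry"]
log = [
    "bus timetables",
    "school locations",
    "School Locations",
    "crime",
    "spending",
    "school locations ",
]
print(unmet_queries(log, titles))
```

Queries that recur but match nothing in the catalogue are candidates to raise with the agency that holds the underlying data. The `min_count` threshold is there to filter out one-off searches.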
In summary, there are stark differences in the amount of data published, the format of the data, and the most popular datasets in open data portals in Latin America. However, in every country there is an appetite for data that either provides public accountability for government functions or supplies helpful information to citizens.
Reporting from Center for Data Innovation. Michael Steinberg is a Google policy fellow at the Center for Data Innovation, where he researches open data issues in government.