Week 2: Opening and using government data

Week 2: Opening and using government data > 2.1 What is open government data? > Video 2.1

  • I’m a researcher at Delft University of Technology, and in this video, we will be exploring open government data and its relationship to the open government movement.
  • In order to function, governments need to collect and process a considerable amount of data each day.
  • What is this data? How is it used? And what data is in the public’s interest? After this video you should be able to describe basic concepts related to Open Government Data, including a) its key elements, b) a commonly agreed definition, and c) examples of OGD.
  • As the name suggests, it is often stated that Open Government Data consists of three key elements: OPENNESS, GOVERNMENT, and DATA.
  • The combination of these three elements (Openness, Government and Data), together with a fourth, Usability, is what we focus on in this part of our course.
  • Although there is more to an open government, the release and use of open government data is an important element, since it may provide insight into what the government is doing.
  • OGD is data that governments and publicly funded research organizations actively publish on the internet for public reuse; the data can be accessed without restrictions and used without payment.
  • So which real-life examples of open government data do you know? In 2006, Dekkers and colleagues described examples of types of open government data: Open Government Data can be Geographic data, such as address information, aerial photos, buildings, cadastral information, geodetic networks, geology, hydrographical data and topographic information.
  • It can also be Legal data, which includes decisions of national, foreign and international courts, national legislation and treaties; Meteorological data, including climate data and models and weather forecasts; Social data, such as statistical data about the economy, employment, health, population and public administration; or Transport data.
  • A sixth type of open government data mentioned by Dekkers and colleagues is Business data, such as Chamber of commerce information, official business registers, patent and trademark information and public tender databases.
  • The data are often fragmented and provided at different places.

Week 2: Opening and using government data > 2.2 Open Government data processes > Video 2.2

  • So what are the processes for collecting, publishing, finding and using governmental data? And which actors are involved in these processes? In this video, we will be exploring these issues.
  • After this video, you should be able to a) describe the basic processes of the collection, publication, finding and use of open government data, and b) describe the actors involved in these processes.
  • Government agencies and publicly funded research organizations produce, collect and integrate large amounts of data each day.
  • Second, public agencies and publicly-funded research organizations decide whether they will open their data on the internet.
  • Governmental data is increasingly published on the internet, and is then referred to as open data.
  • Since open government data is provided through a large variety of portals, finding the data that someone is looking for can be challenging, especially if he or she does not know whether the data exists and which government organization creates or collects the data.
  • Open government data can be used in many different ways, for instance by cleansing, analyzing, visualizing, enriching, combining and linking it.
  • Data cleansing refers to detecting and correcting inaccurate or corrupt records in a dataset.
  • An analysis of a dataset could lead to new insights and understanding of the data, possibly by analyzing data in a way that was not done before.
  • Another important way of using open data is by combining data with other datasets or by linking it to other data, as this can reveal relationships and correlations between datasets.
  • In order to analyze or combine open datasets, the user needs to be able to interpret the data and understand the context in which it has been created.
  • In sum, we just explored four basic processes, namely those of collecting data, publishing data, finding data, and using data.
  • Now we turn to our second learning objective: which actors are involved in the processes of collecting, publishing, finding and using governmental data? First, we saw that data providers are involved, since they supply the governmental data to the public.
  • Data providers and data users are the two key actors.
  • There may be tensions between data users and data providers, since they have different interests.
  • Governmental organizations may have collected and published data in a format that is not preferred by open data users, which can be a barrier for using the data.
  • In sum, before the public can obtain insight from government data, this data needs to go through several processes of data collection, publication, finding and use.
  • Governments and publicly funded research organizations are involved as data publishers, and the public is involved as data users.
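
The cleansing and combining steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two datasets, their field names (`municipality`, `population`, `budget_eur`) and all values are invented, and the helper functions `cleanse` and `combine` are hypothetical, not part of any portal or library.

```python
# Hypothetical records; field names and values are invented for illustration.
raw_population = [
    {"municipality": " Delft ", "population": "103000"},
    {"municipality": "delft", "population": "103000"},  # duplicate, different casing
    {"municipality": "Leiden", "population": "n/a"},     # value that cannot be parsed
    {"municipality": "Gouda", "population": "73000"},
]
raw_budget = [
    {"municipality": "Delft", "budget_eur": "400000000"},
    {"municipality": "Gouda", "budget_eur": "250000000"},
]


def cleanse(records, key, numeric_field):
    """Normalize the key, drop rows with unparseable numbers, remove duplicates."""
    cleaned = {}
    for row in records:
        name = row[key].strip().title()
        try:
            value = int(row[numeric_field])
        except ValueError:
            continue  # skip records that cannot be corrected automatically
        # Keep the first occurrence of each normalized key.
        cleaned.setdefault(name, {key: name, numeric_field: value})
    return cleaned


def combine(left, right):
    """Join two cleansed datasets on their shared key."""
    return [{**left[k], **right[k]} for k in left if k in right]


population = cleanse(raw_population, "municipality", "population")
budget = cleanse(raw_budget, "municipality", "budget_eur")
combined = combine(population, budget)

# Combining the datasets allows a figure that neither dataset contains alone.
for row in combined:
    row["budget_per_capita"] = round(row["budget_eur"] / row["population"], 2)
```

Here the record for Leiden is dropped rather than repaired, which is a design choice: whether to discard, flag or correct a faulty record depends on how well the user understands the context in which the data was created.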

Week 2: Opening and using government data > 2.3 Considerations when opening government data > Video 2.3

  • Opening governmental data may seem simple and easy, but in fact it is not an easy job, and it requires several considerations.
  • Let us imagine that you are a civil servant who has collected a number of datasets, and you are considering opening the data.
  • Which aspects do you need to consider? The key considerations are related to embargo periods, data openness, data sensitivity and privacy, data quality and completeness, and to data documentation.
  • A potential risk of releasing such data is that it may be unclear who is responsible and accountable for the data release.
  • Second, is there an embargo period for the data? On the one hand, adopting a long embargo period reduces the risk of wrongfully publishing data, and data may become less sensitive over a longer time period.
  • Third, to what extent should openness be provided? In one respect, releasing governmental data may provide the public with more insight into what governmental processes encompass and what public agencies do.
  • Openness comes at a cost, since data providers need to put effort and resources into opening the data.
  • Each dataset may be secured in a different way, and data users may be provided with different “keys” to the data.
  • Fourth, is the data sensitive and is it legally allowed to open the dataset? It can be very difficult to determine the borderline between sensitive and non-sensitive data, especially since this needs to be done for each dataset individually.
  • The decision on the data’s sensitivity requires interpretation, and mistakes might be made.
  • Then the opened data implies some kind of bias and an unrealistic perspective may be created with the disclosed datasets.
  • Fifth, another consideration when opening government data concerns the quality and the completeness of the data.
  • Data users may comment on the data, try to increase the quality and this may create an incentive for the data publisher to improve the data.
  • Low-quality data may be reused, and decisions and conclusions may be based on this data.
  • To be able to use open government data, users need to have some information about the meaning of the data and the semantics need to be clear.
  • Adding considerable documentation to governmental datasets requires effort and time investments from the data provider, since this information often cannot be derived automatically from the data provider’s systems.
  • Some data has many benefits and hardly any disadvantages and can be opened without any discussion.
  • Is it more important that data are of high quality or is it more important just to publish the data and to let data users point out aspects of low quality? Is it more important to ensure that absolutely no datasets are published which are sensitive, and to remove all potentially sensitive variables? Or is it more important that the data is more useful, but might potentially be sensitive when combined with other data? These are important trade-offs.
  • In sum, opening government data is not easy, and there are many aspects that need to be considered when a public agency decides to open datasets.
  • The key considerations are related to embargo periods, data openness, data sensitivity and privacy, data quality and completeness, and to data documentation.
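
Two of the considerations above, the embargo period and the removal of potentially sensitive variables, can be expressed as a simple pre-release check. The following is a minimal sketch under assumptions: the list of sensitive field names, the embargo length and the function names are all hypothetical, and a real decision on sensitivity requires interpretation and legal review for each dataset.

```python
from datetime import date

# Hypothetical set of variables treated as sensitive in this sketch.
SENSITIVE_FIELDS = {"name", "date_of_birth", "address"}


def embargo_lifted(collected_on, embargo_days, today):
    """Return True once the embargo period has passed."""
    return (today - collected_on).days >= embargo_days


def strip_sensitive(records, sensitive_fields=SENSITIVE_FIELDS):
    """Return copies of the records without the flagged fields."""
    return [
        {k: v for k, v in row.items() if k not in sensitive_fields}
        for row in records
    ]


def prepare_release(records, collected_on, embargo_days, today):
    """Return releasable records, or None while the data is under embargo."""
    if not embargo_lifted(collected_on, embargo_days, today):
        return None
    return strip_sensitive(records)


# Invented example records.
records = [
    {"name": "A. Jansen", "postcode_area": "2628", "benefit_type": "housing"},
    {"name": "B. de Vries", "postcode_area": "2611", "benefit_type": "childcare"},
]

too_early = prepare_release(records, date(2024, 1, 1), 365, date(2024, 6, 1))
released = prepare_release(records, date(2024, 1, 1), 365, date(2025, 1, 1))
```

Removing flagged fields does not settle the trade-off discussed above: the remaining variables may still become sensitive when combined with other data, and every removed variable makes the dataset less useful.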

Week 2: Opening and using government data > 2.4 Open data portals > Video 2.4

  • After this video, you should be able to a) define open data portals, b) describe their key elements and forms, and c) give some examples of open data portals.
  • The portal is in between the end-user and the data provider.
  • Through open data portals, governmental organizations can give citizens and other open data users access to their data.
  • The portals can be owned and maintained by governments or by other actors, such as foundations, which may publish processed government data.
  • The data itself is usually not stored in the portal, but through the portal, the user can get access to the data, and to tools and services to use the data.
  • Citizens can find data through a variety of portals, and can then use the tools and services to analyze, visualize and otherwise use the data.
  • Existing open data portals differ with regard to their key elements and forms.
  • The existing portals also reach out to different user groups, have different objectives and publish different types of open data.
  • Some countries focus on economic benefits of publishing and using open data, on businesses and entrepreneurs as data users and on data that are expected to stimulate the development of innovative products and services.
  • Other countries focus more on objectives such as transparency and accountability, on researchers and citizens as data users, and on data that creates new insights and allows for holding governments accountable, such as budget data.
  • If user experience and high uptake are seen as important goals for making government data available, a more advanced portal may be developed.
  • Most portals only aim to make data searchable and findable.
  • The updates will take place at the location of the data itself, and the links to the data at the portal need to be updated occasionally.
  • Metadata, or data that describes the data, is often used to make government datasets searchable and findable through open data portals.
  • There are many metadata initiatives going on to make data more easily searchable and findable.
  • Technologies such as Application Programming Interfaces, or APIs, and Linked Data can be used to make data more easily searchable and findable.
  • APIs provide an interface to access the data, and Linked Data allows for relating chunks of data to each other.
  • Open data is made available through internet-based portals by governments from all over the world, and different forms of open data portals exist.
  • Various governmental agencies have their own open data portals, such as the portal of the Dutch meteorological institution, KNMI, about weather and climate.
  • This portal offers processed open data from sources such as government data portals and DBpedia, and it analyzes, corrects, interlinks and depicts the data.
  • In sum, many open data portals are available to make government data searchable and findable.
  • In the ideal situation, underlying tools and services can then be used to analyze, visualize and otherwise use the data, and to derive useful conclusions.
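
The idea that a portal holds metadata and links rather than the data itself can be illustrated with a small in-memory catalog. This is a sketch under assumptions: the `DatasetMetadata` fields, the publishers, the dataset titles and the `example.org` URLs are all invented, and the keyword search is deliberately naive compared with what a real portal offers.

```python
from dataclasses import dataclass


@dataclass
class DatasetMetadata:
    """Metadata describing a dataset; the data itself stays with the provider."""
    title: str
    publisher: str
    keywords: list
    url: str  # link to the location where the provider hosts the data


# Hypothetical catalog entries.
CATALOG = [
    DatasetMetadata(
        title="Hourly weather observations",
        publisher="Example Meteorological Institute",
        keywords=["weather", "climate", "temperature"],
        url="https://data.example.org/weather/hourly.csv",
    ),
    DatasetMetadata(
        title="Municipal budget 2023",
        publisher="Example Ministry of Finance",
        keywords=["budget", "spending", "transparency"],
        url="https://data.example.org/finance/budget-2023.csv",
    ),
    DatasetMetadata(
        title="Address register",
        publisher="Example Cadastre Agency",
        keywords=["geographic", "addresses", "buildings"],
        url="https://data.example.org/geo/addresses.csv",
    ),
]


def search(catalog, query):
    """Return entries whose title, publisher or keywords contain every query term."""
    terms = query.lower().split()
    results = []
    for entry in catalog:
        haystack = " ".join([entry.title, entry.publisher, *entry.keywords]).lower()
        if all(term in haystack for term in terms):
            results.append(entry)
    return results


weather_hits = search(CATALOG, "weather")
budget_hits = search(CATALOG, "municipal budget")
no_hits = search(CATALOG, "climate budget")
```

Because the catalog stores only links, the provider can update the data at its own location; the portal only needs to change an entry when a link or description becomes outdated.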

