Big Data is not a free lunch

Antti Vasanen,
Head of Information Services,
Regional Council of Southwest Finland,
Finland

Regional development has become an increasingly complex phenomenon. Regions are ever more interconnected, and the development of one region is affected by the development of others. Flows and interactions between regions now matter as much as the regions' internal characteristics. This is particularly true in city regions with highly developed urban functions, but similar processes take place in all kinds of regions.

As such, it is also increasingly important to understand the processes that affect regional development. Traditionally, indicators of regional development have included parameters such as population growth, employment rate or regional GDP. The growing complexity of regional development, however, calls for new data sources that capture the level of interaction, or the flows of people and information, between regions. This kind of data is often not readily available from traditional statistical sources and, therefore, new data sources such as big data are needed to understand regional interactions.

The need for new information sources gave rise to a recently finished ESPON project (https://www.espon.eu/big-data-corridors), whose goal was to enhance integrated policy development in the European growth corridors by strengthening the knowledge base of regional development. The aims of the project were to find and evaluate new available data sources for evidence-based policy making and to examine the potential of big data and location-based data mining to better inform comprehensive spatial policy in growth corridors.

In the project, the feasibility of using big data to understand flows and interactions in the context of growth corridors was approached through three case studies. The first case study used traffic measurement data to analyse the movement of vehicles, and hence people, within the E18 transport corridor. The second utilised EU project databases to analyse inter-regional interactions through project partnerships, and the third used mobile phone data to analyse everyday mobility in Estonia.

The results of the project were mixed. On the one hand, turning traffic measurement data into a feasible origin-destination matrix was not possible at the required level of detail, even though advanced modelling methodologies were used. Although the data included extremely detailed information on passing vehicles, it was not possible to extract origin-destination information on people's travel behaviour. Similarly, the analysis of the project partner network yielded limited results on inter-regional interactions because of limitations in the data, which over-emphasised interactions between certain regions.

On the other hand, the third case study resulted in a high-quality origin-destination matrix of the flows of people between Estonian territorial communities. Such data is valuable particularly because it provides precise information on the actual movements of people. Other similar data, such as commuting statistics, describe only a certain segment of all mobility and may be biased: commuting statistics, for instance in Finland, are based on the locations of people's homes and workplaces, not on the actual movements of the workforce.

From these examples, it becomes clear that the ability of big data to answer various societal questions is smaller than is often thought. Finding suitable data sources is difficult, and once suitable data is found, actually getting hold of it is usually not as simple as downloading it from the internet. And if one manages to obtain the needed data, the data itself requires tedious work to transform it into a format suitable for analysis.

As such, despite the hype around the potential of big data to solve problems that other data sources cannot, utilising big data is not a simple task. Although there is a huge and ever-increasing amount of data to be utilised, only a tiny fraction of it is useful and freely available for analysing regional development. And even this fraction may require a huge effort in data processing and analysis before actual results are obtained. Indeed, although not without potential, big data is surely not a free lunch.

Expert article 2642

Baltic Rim Economies 5/2019