22 June 2018
info:eu-repo/semantics/openAccess
Bernd Resch et al., "Generating Big Spatial Data on Firm Innovation Activity from Text-Mined Firm Websites", GI_Forum 2018, Volume 1, Elektronisches Publikationsportal der Österreichischen Akademie der Wissenschaften, ID: 10.1553/giscience2018_01_s82
Innovation is one of the major drivers of economic growth, and spatial processes of knowledge spillover play a vital role in it. Current practices in assessing firms’ innovation activity, including patent analysis and questionnaires, suffer from severe limitations. In this paper, we propose a novel approach to estimating firms’ innovation activity based on the texts on their websites. We use an automated web scraper to harvest text from the websites, then extract semantic topics with a self-learning, generative topic-modelling approach, and finally analyse these topics with an artificial neural network (ANN) to assess each firm’s level of innovation. This procedure results in a large-scale dataset that will be used for further spatial economic analysis of the distribution of innovative firms and of the processes that drive the development of innovation in firms.
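As a rough illustration of the first stage of such a pipeline, the sketch below turns the HTML of a single page into word counts that a topic model could consume. It is not the authors' implementation: the class `VisibleTextParser`, the function `page_to_token_counts`, the tokenisation rule and the sample page are all hypothetical, and the topic-modelling and ANN stages are deliberately left out.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class VisibleTextParser(HTMLParser):
    """Collects text that is not inside <script> or <style> elements."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def page_to_token_counts(html):
    """Turn one HTML page into lower-cased word counts (a bag of words)."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # Assumed tokenisation: runs of ASCII letters only.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical example page; a real run would fetch pages from firm websites.
sample_html = """
<html><head><style>p {color: red}</style></head>
<body><p>We develop new sensor technology.</p>
<script>var x = 1;</script>
<p>Our sensor research is patented.</p></body></html>
"""
counts = page_to_token_counts(sample_html)
```

Word counts of this kind, aggregated per firm, are a common input format for generative topic models; the topic weights obtained from such a model could then serve as input features for a classifier.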