Introduction

anHALytics is a project that aims to create an analytics platform for the HAL research archive and other scientific Open Access repositories, exploring various analytic aspects such as search/discovery, activity and collaboration statistics, trend/technology maps, knowledge graphs and data visualization.

The project has three main objectives:

  • to demonstrate that when the scientific publications produced by an institution are archived at a very high rate in an Open Access (OA) repository such as the HAL research archive, the repository becomes a reliable mirror of the scientific production of this institution. An OA repository can then be exploited, in particular, to better understand the evolution of science and technologies and to anticipate the strengths and needs of the institution. We explore this objective with Inria (the French National Institute for Research in Computer Science and Control), which exhibits an OA publication deposit rate of 80%.

  • to encourage all research institutions to implement an Open Access mandate, by illustrating the value of science analytics at the level of an institution with a state-of-the-art Open Source platform. Such an Open Access policy is the best solution for increasing the number of self-archived publications freely accessible in institutional or disciplinary repositories (Green Open Access).

  • finally, to integrate a set of state-of-the-art Machine Learning text mining tools that we have developed in recent years to automatically ingest, normalize, annotate, and index scientific documents. This transfer from research to production is an opportunity to improve the robustness and scalability of these research prototypes and to demonstrate their impact and value for innovative exploitation of scientific information.

Started in November 2014, the project is supported by an ADT Inria grant and Open Source contributions. anHALytics is also the starting point of an ISTEX (http://www.istex.fr/en) "use study" project ("chantier d'usage"), in which the ingestion, enrichment, indexing, aggregation and visualisation of a large amount of ISTEX documents will be experimented with until August 2017.

People

If you are interested in contributing to the project, please contact patrice.lopez@inria.fr.