[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 27 April – Jorge Silva

DaSSWeb – Data Science and Statistics Webinar
Tuesday, 27 April, 14:30
Speaker: Jorge Silva, University of Porto
Title: Bibliometrics: studying the present to make the best decisions for the future
Zoom link: videoconf-colibri.zoom.us/j/83875914782
Abstract: For several years, bibliometrics has been used to help researchers understand science and to aid decision-making processes that pave the way for future research (e.g. funding allocation). Due to the exponential growth in research in recent years, bibliometric tools have become even more important, and there is an increasing need for these tools to be coupled with large-scale databases. In this talk, I will focus on three bibliometric problems: publication similarity, author profiling and author ranking. Publication similarity is the task of measuring how similar two publications are. Author profiling is the problem of categorizing the knowledge of researchers. Author ranking is the task of estimating the scientific impact of researchers. For each of these problems, I will discuss its applications, research challenges and current state of the art, and present our line of work. In more detail, I will discuss how complex networks (structures that are widely used to model and study complex systems) offer a unique way to study relations and patterns in bibliographic data, and how this information can be used to improve bibliometric studies in the three problems mentioned.

***********************************************************************

[rede.APPIA] ECML/PKDD Discovery Challenges

ECML/PKDD Discovery Challenges

It is a great pleasure for us to announce three different Data Challenges associated with the 2021 edition of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD, 2021.ecmlpkdd.org/).

The challenges and their associated websites are listed here:

* Farfetch Fashion Recommendations Challenge:
www.ffrecschallenge.com/ecmlpkdd2021/

* Finding the planet in the noise: De-trending exoplanet light curves from non-linear noise:
www.ariel-datachallenge.space/

* Discover the mysteries of the Maya:
biasvariancelabs.github.io/maya_challenge/

All the challenges have just started (1st of April 2021) and will end on the 1st of July 2021. The papers associated with the best-performing methods are expected to be submitted by the 22nd of July 2022.

Detailed instructions are available on the web page of each Data Challenge.

We look forward to seeing your submission to one of the ongoing challenges.

All the best,

Paula Brito and Dino Ienco
Discovery Challenge Chairs

[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 13 April – Conceição Silva

DaSSWeb – Data Science and Statistics Webinar
Tuesday, 13 April, 17:00
Speaker: Conceição Silva, Católica Porto Business School & CEGE
Title: Benchmarking organizations: Lessons Learned, DEA, and Applications
Zoom link: videoconf-colibri.zoom.us/j/86416424581
Abstract:
The objective of this presentation is to introduce the concepts and techniques for benchmarking organizations. The presentation will be based on the large number of applications in which I have been involved over the past years, mainly in the public sector, including the benchmarking of schools, hospitals, primary health units and courts. Benchmarking can be undertaken through frontier techniques, and Data Envelopment Analysis (DEA) is the technique I have used most in my empirical applications. I will go into the basics of this technique for benchmarking purposes, pinpointing its advantages and the main difficulties and challenges involved in its implementation. The presentation will be based on real-case examples drawn from my own experience.
***********************************************************************

[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 30 Mar – Paulo Sousa

DaSSWeb – Data Science and Statistics Webinar
Tuesday, 30 March, 14:30
Speaker: Paulo Sousa, FEP, Univ. Porto
Title: Discovering the link between a Japanese management system and performance: a statistical approach
Zoom link: videoconf-colibri.zoom.us/j/88637402848
Abstract:
Lean production originated in Japan and gained wide attention from its inception. It is defined as a multi-dimensional system with the central objective of waste elimination through practices that minimize supplier, customer and internal variability. It is believed that there is a link between Lean production and superior performance. In the literature, however, the conclusions regarding Lean’s impact on performance are not entirely consistent. In this work, using a sample of 329 Portuguese enterprises and partial least squares structural equation modelling (PLS-SEM), we find that Lean has a positive impact on performance (operational, financial and market performance).
***********************************************************************

[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 16 Mar – Luís Torgo

DaSSWeb – Data Science and Statistics Webinar
Tuesday, 16 March, 17:00
Speaker: Luís Torgo, Dalhousie University, Canada & Fac. Sciences, Univ. Porto
Title: Time Series Forecasting: some challenges and possible solutions
Zoom link: videoconf-colibri.zoom.us/j/81896928976
Abstract:
With the widespread availability of a multitude of data collection devices measuring different properties, frequently in real time, time series forecasting is becoming increasingly important for many application domains. Approaches from many research disciplines (e.g. statistics, econometrics, machine learning) are available to practitioners and researchers. All these facts raise several challenges that we will discuss during this talk. We will describe alternative methods for correctly evaluating and comparing these approaches, thus facilitating the relevant task of model selection. We will also address some of the reasons why models perform rather differently across diverse application domains. Finally, we will discuss how approaches based on ensembles can help in overcoming some of these difficulties.
***********************************************************************

[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 23 Feb – Anabela Carneiro

DaSSWeb – Data Science and Statistics Webinar
Tuesday, 23 February, 14:30
Speaker: Anabela Carneiro (FEP, U.Porto and CEF.UP)
Title: Decomposition Methods in Economics to Assess which Covariates Matter
Zoom link: videoconf-colibri.zoom.us/j/81745596464
Abstract: It is very common in empirical research to estimate several regression models to check the robustness of the results or to evaluate how the estimate of the coefficient of interest changes as we add a set of covariates to a baseline model. For example, in explaining the sources of the gender wage gap, researchers very often estimate multiple wage equations in order to evaluate how the gender-dummy coefficient changes as individual and job characteristics are added to the model, and then attribute this difference to the new set of variables included in the model. This approach is not exempt from criticism, as the order in which covariates are added matters. In this seminar, I will present a decomposition technique, proposed by Gelbach (2016), that appeals to the omitted variable bias formula to unambiguously disentangle the contribution of each covariate to the change in the estimate of the coefficient of the variable under scrutiny. This procedure was applied to matched employer-employee data in Portugal to decompose the sources of the wage losses of displaced workers (Raposo, Portugal & Carneiro, 2021).
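
As a rough illustration of the omitted variable bias identity that this kind of decomposition relies on, here is a minimal Python sketch on synthetic data. It is not the speaker's code or data: the variable names (d, tenure, skill), the coefficients and the sample size are all made up for the example. The sketch fits a base model (intercept and a dummy d) and a full model (adding two covariates), then attributes the change in the coefficient on d to each added covariate as the product of (i) the coefficient on d in an auxiliary regression of that covariate on the base regressors and (ii) the covariate's coefficient in the full model. The contributions sum to the total change regardless of the order in which the covariates are listed.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic data (all values hypothetical): a dummy d and two covariates
# that are correlated with it.
random.seed(0)
n = 500
d = [1.0 if random.random() < 0.5 else 0.0 for _ in range(n)]
tenure = [2.0 * di + random.gauss(0, 1) for di in d]
skill = [-1.0 * di + random.gauss(0, 1) for di in d]
y = [1.0 - 0.1 * di + 0.5 * t + 0.3 * s + random.gauss(0, 1)
     for di, t, s in zip(d, tenure, skill)]

base_X = [[1.0, di] for di in d]
full_X = [[1.0, di, t, s] for di, t, s in zip(d, tenure, skill)]

b_base = ols(base_X, y)   # [intercept, d]
b_full = ols(full_X, y)   # [intercept, d, tenure, skill]

# Total change in the coefficient on d when both covariates are added.
gap_change = b_base[1] - b_full[1]

# Contribution of each covariate: auxiliary-regression coefficient on d
# times the covariate's coefficient in the full model.
contributions = {}
for name, column, beta in (("tenure", tenure, b_full[2]),
                           ("skill", skill, b_full[3])):
    gamma = ols(base_X, column)
    contributions[name] = gamma[1] * beta

print("change in coefficient on d:", gap_change)
for name, value in contributions.items():
    print("  due to", name + ":", value)
```

Because the identity holds exactly in sample, the printed contributions add up to the total change up to floating-point error; standard errors for the individual contributions are not covered by this sketch.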

[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 9 Feb – Susana Barbosa

DaSSWeb – Data Science and Statistics Webinar
Tuesday, 9 February, 14:30
Speaker: Susana Barbosa, INESC TEC
Title: Perspectives from an unwilling data scientist
Zoom link: videoconf-colibri.zoom.us/j/81316919512
Abstract:
In this talk, I will share my perspective on both basic concepts and practical aspects of getting information from data, aka data science practice. In terms of concepts, I will address the subtleties of prediction, explanation and understanding, particularly in the context of time series. Based on my experience with environmental data and field campaigns, I will discuss data collection and data management features and their relevance in terms of rational use of resources and greener computing. I will also outline future opportunities and challenges in exploring the complementary nature of AI-based and physics-based modeling for the study of complex systems, such as the Earth system. Finally, I will discuss the postnormal stage of data science and its implications, particularly the need for transparent, open and explainable data science.