DaSSWeb – Data Science and Statistics Webinar
Tuesday 9 November, 14:30
Speaker: Victor Lobo NOVA-Information Management School & Escola Naval
Title: Cartograms, and how to obtain them using Self-Organizing Maps and controlling their Magnification Effect.
Zoom link: videoconf-colibri.zoom.us/j/84889120945
Abstract: Cartograms are geographic representations in which the area of each region is distorted so as to be proportional to a given variable of interest. The most common ones are population cartograms, where regions with large populations are enlarged, squeezing the regions with smaller populations. To obtain a good cartogram, each region should occupy an area proportional to the variable of interest, and at the same time the necessary distortions should still allow users to recognize the map. Several examples of cartograms shall be given, together with the algorithms that produce them. Finally, a method based on Self-Organizing Maps, named CartoSOM, will be presented, together with a recent improvement based on a better estimation of the “Magnification Effect” of the SOM.
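As a rough illustration of the proportionality requirement described in the abstract, the sketch below computes the area each region would need in an ideal cartogram while keeping the total map area fixed. The region names and figures are invented for illustration, and this is not the CartoSOM algorithm, which the announcement does not specify.

```python
import math

def target_areas(areas, values):
    """Area each region should have so that area is proportional to the
    variable of interest, keeping the total map area unchanged."""
    total_area = sum(areas.values())
    total_value = sum(values.values())
    return {name: total_area * values[name] / total_value for name in areas}

def linear_scale_factors(areas, values):
    """Factor by which each region's linear dimensions would change
    (the square root of the target-to-current area ratio)."""
    targets = target_areas(areas, values)
    return {name: math.sqrt(targets[name] / areas[name]) for name in areas}

# Hypothetical regions: current map area (km^2) and population.
areas = {"A": 100.0, "B": 300.0, "C": 600.0}
population = {"A": 500_000, "B": 300_000, "C": 200_000}

targets = target_areas(areas, population)
factors = linear_scale_factors(areas, population)
```

A factor above 1 means the region must grow and a factor below 1 means it must shrink; a cartogram algorithm then has to realise these changes while keeping the map recognizable.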
***********************************************************************
-- This message was sent to the APPIA network, which comprises the members of APPIA. If you no longer wish to receive this type of message, please send an email to infos [at] appia [dot] pt
Author: Maria Paula Brito Silva
[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 26 Oct – Is there a free lunch in imbalanced learning?
DaSSWeb – Data Science and Statistics Webinar
Tuesday 26 October, 14:30
Speaker: Nuno Moniz
Researcher, INESC TEC; Invited Professor @ Faculty of Sciences, University of Porto; Invited Professor @ Faculty of Engineering, University of Porto
Title: Is there a free lunch in imbalanced learning?
Zoom link: videoconf-colibri.zoom.us/j/89767753105
Abstract: The ability to predict rare events remains one of the most challenging tasks to solve in machine learning. For almost three decades, research in imbalanced learning has produced many strategies to help in this endeavour. The most popular, resampling strategies, work by creating new data sets in which the original data is biased towards cases describing rare events, giving those cases a higher probability. Today, it would seem that both research and industry hold widespread assumptions about which are the “best” or “worst” strategies. In this talk, we will set up a face-to-face between theory and practice. First, we will leverage the concept of no free lunch to analyse whether we can assume there are resampling strategies more likely to be the best at solving imbalanced learning problems. Second, we will evaluate whether data characteristics can help us automatically decide which strategies are most likely to produce the best outcome on unseen data.
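To make the idea of a resampling strategy concrete, here is a minimal sketch of one of the simplest, random oversampling, which duplicates rare-class cases until the classes are balanced. It is an illustration only, not a method taken from the talk; the function name and the toy data are assumptions.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class cases until every class
    has as many cases as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_new, y_new = list(X), list(y)
    for label, count in counts.items():
        # Positions of the original cases carrying this label.
        indices = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - count):
            i = rng.choice(indices)
            X_new.append(X[i])
            y_new.append(label)
    return X_new, y_new

# Toy data: 8 common cases (label 0) and 2 rare cases (label 1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
```

The resampled set contains no new information, only repeated rare cases, which is one reason the question of which strategy works best is not obvious.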
[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 13 July – WATER CHALLENGES
DaSSWeb – Data Science and Statistics Webinar
Tuesday 13 July, 14:30
Speaker: Clara Cordeiro FCT, Universidade do Algarve and CEAUL
Title: WATER CHALLENGES
Zoom link: videoconf-colibri.zoom.us/j/83875914782
Abstract: Water is one of the world’s most important natural resources. However, in recent years, periods of drought attributed to climate change have affected many countries worldwide. Consequently, this natural resource has become limited in some regions globally, such as the Algarve region in Portugal. Water utilities in the region have felt responsible for raising consumers’ awareness about responsible water use. In this context, water utilities want to promote sustainable water use and reduce water consumption through a water metering policy. However, when it comes to the processing of actual metering data, several difficulties arise. This seminar will show the challenges proposed by water utilities of this region and the strategies used to overcome them.
***********************************************************************
[rede.APPIA] Data Science and Statistics Webinar – 29 June – Detection of Internet Traffic Redirection Attacks using Histogram PCA
DaSSWeb – Data Science and Statistics Webinar
Tuesday 29 June, 14:30
Speaker: M. Rosário Oliveira CEMAT and Mathematics Department, Instituto Superior Técnico, Univ. Lisboa
Title: Detection of Internet Traffic Redirection Attacks using Histogram Principal Component Analysis
Zoom link: videoconf-colibri.zoom.us/j/87373848710
Abstract: Internet security is a major concern for users and Internet Service Providers, since successful attacks can produce substantial damage. Illicit Internet traffic redirection causes man-in-the-middle attacks, in which a malicious agent secretly intercepts the traffic between two hosts connected to the Internet. The attack may be aimed at gaining access to sensitive information from the victim, monitoring its online activity, or causing network delay, among other motivations.
To identify traffic redirection attacks, we had access to measurements obtained from a worldwide distributed probing platform, designed to detect routing variations based on round-trip time (RTT) deviations inferred from multiple, dispersed geographic locations. At each timestamp, various measurements are collected and summarized by histograms. We propose anomaly detection methods based on histogram principal component analysis. To do so, we discuss how to define a weighted sum of histogram-valued data and how to use the data projected on the first histogram principal component to successfully detect traffic redirection attacks.
This is joint work with Ana Subtil, Eduardo Mendes, and Lina Oliveira.
***********************************************************************
[rede.APPIA] Data Science and Statistics Webinar – 15 June – Structural Equation Modelling in Tourism Management Research
DaSSWeb – Data Science and Statistics Webinar
Tuesday 15 June, 11:00
Speaker: Patrícia Pinto Faculty of Economics, University of Algarve
Title: Structural Equation Modelling in Tourism Management Research
Zoom link: videoconf-colibri.zoom.us/j/82189454885
Abstract: Structural Equation Modelling (SEM) is frequently used in the top hospitality and tourism journals, and its application has significantly increased since 2000. This seminar will provide a conceptual overview of SEM in tourism management research, emphasizing the recommended guidelines for its correct use. Motives for choosing the method, data requirements, model characteristics, and model assessment steps will be reviewed. Some advanced applications will also be identified and several practical applications will be presented. The seminar will end with a more detailed application of SEM by exploring an empirical article recently published in an influential journal in the Tourism Management field.
***********************************************************************
[rede.APPIA] Data Science and Statistics Webinar – 1 June – Model-based clustering of time series data
DaSSWeb – Data Science and Statistics Webinar
Tuesday 1 June, 14:30
Speaker: José G. Dias ISCTE Business School
Title: Model-based clustering of time series data
Zoom link: videoconf-colibri.zoom.us/j/84508964379
Abstract: In the digital age, data streams have been produced at an increasing pace from different sources, ranging from biometric devices (sensors) and stock market (high frequency) data to digital platforms (feeds, audio, video). This type of data, measured on one or more variables over time (or sequence), is called time series, panel, or more generally longitudinal data. Time-dependent modeling has been applied in many contexts: not only forecasting, but also outlier detection, matching, clustering, indexing, etc. This talk discusses the use of finite mixture models in time series clustering. First, I present the overall finite mixture framework. Then, a second level of analysis is added to model sequences within each observation. An application to COVID time series data illustrates the main concepts. The talk concludes with a brief discussion in the context of cross-sectional, dynamic clustering, and biclustering, with implications for density estimation, outlier detection, and measurement error modeling.
***********************************************************************
[rede.APPIA] DaSSWeb – Data Science and Statistics Webinar – 11 May – Info-metrics, by Pedro Macedo
DaSSWeb – Data Science and Statistics Webinar
Tuesday 11 May, 14:30
Speaker: Pedro Macedo University of Aveiro
Title: Info-metrics: some attractive tools for Data Science and Statistics
Zoom link: videoconf-colibri.zoom.us/j/83875914782
Abstract: Info-metrics is a research area at the intersection of statistics, information theory, computer science and decision analysis, where the maximum entropy principle, established by Edwin Jaynes in 1957, plays a fundamental role. The maximum entropy principle provides a tool for solving ill-posed problems that occur in diverse areas of science, and it can be seen as an extension of Bernoulli’s principle of insufficient reason. Some methodologies and related concepts from info-metrics will be briefly presented and illustrated in the webinar.
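As a small illustration of the maximum entropy principle mentioned in the abstract, the sketch below finds the most uniform distribution over the faces of a die that is consistent with a prescribed mean. It relies on the standard result that, under a mean constraint, the maximum entropy solution has the form p_i proportional to exp(lambda * x_i), and solves for lambda by bisection. The function name, search bounds, and the target mean of 4.5 are assumptions chosen for the example, not material from the webinar.

```python
import math

def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum entropy distribution over `values` with a given mean.
    The solution is p_i proportional to exp(lam * x_i); lam is found by
    bisection, since the mean increases monotonically with lam."""
    def solve(lam):
        # Subtract the largest exponent for numerical stability.
        shift = max(lam * v for v in values)
        weights = [math.exp(lam * v - shift) for v in values]
        z = sum(weights)
        probs = [w / z for w in weights]
        mean = sum(v * p for v, p in zip(values, probs))
        return mean, probs

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean, _ = solve(mid)
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return solve((lo + hi) / 2.0)[1]

faces = [1, 2, 3, 4, 5, 6]
loaded = maxent_distribution(faces, 4.5)  # skewed towards high faces
fair = maxent_distribution(faces, 3.5)    # no extra information: uniform
```

When the only constraint is the mean of a fair die (3.5), the result is the uniform distribution, which is how the principle generalizes the principle of insufficient reason.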
***********************************************************************