Big data learning with evolutionary algorithms

This page has been created to provide information about the tutorial: "Big Data Learning with Evolutionary Algorithms" given at CEC 2017: http://www.cec2017.org


Abstract

In the era of big data, recent advances in distributed technologies enable data mining techniques to discover unknown patterns or hidden relations in voluminous data faster. Extracting knowledge from big data is an interesting and challenging task that requires new paradigms for developing scalable algorithms. However, evolutionary models for machine learning and data mining cannot be straightforwardly adapted to the new space and time requirements. Hence, existing algorithms must be redesigned, or new ones developed, to take advantage of their capabilities in the big data context. Moreover, real-world big data problems pose several issues besides computational complexity, and big data mining techniques should be able to deal with challenges such as high dimensionality, class imbalance, and the lack of annotated samples, among others.
In the first part of this tutorial, we will provide a brief introduction to the big data problem, including MapReduce as the most representative programming paradigm, as well as an overview of recent technologies (Hadoop ecosystem, Spark). Then, we will dive into the field of big data analytics, explaining the challenges it poses for evolutionary models and introducing machine learning libraries such as Mahout, MLlib and FlinkML.
Afterwards, we will cover the main topic of CEC 2017: evolutionary models in the big data context. Several case studies will be presented on evolutionary instance selection/generation, feature selection/weighting and imbalanced data classification.
Finally, we will carry out a live demonstration with MLlib and the evolutionary models we have developed for imbalanced big data classification.

Material:




Short biography of the organizers:


Mikel Galar received the M.Sc. and Ph.D. degrees in Computer Science in 2009 and 2012, respectively, both from the Public University of Navarre, Pamplona, Spain. He is currently an assistant professor in the Department of Automatic and Computation at the Public University of Navarre. He is the author of 28 original articles published in international journals and 40 contributions to conferences. He is also a reviewer for more than 35 international journals. He received the IEEE Transactions on Fuzzy Systems Outstanding Paper Award 2013 (bestowed in 2016). His research interests are data mining, classification, big data learning, ensemble learning, evolutionary algorithms and fuzzy systems. He is a member of the European Society for Fuzzy Logic and Technology (EUSFLAT), the Spanish Association of Artificial Intelligence (AEPIA) and the IEEE.

Isaac Triguero received the M.Sc. and Ph.D. degrees in Computer Science from the University of Granada, Granada, Spain, in 2009 and 2014, respectively. He is currently an Assistant Professor in Data Science at the School of Computer Science of the University of Nottingham. He has published more than 25 international journal papers as well as more than 20 contributions to conferences. His research interests include data mining, data reduction, biometrics, optimization, evolutionary algorithms, semi-supervised learning, bioinformatics and big data learning.



Contact information:

Name: Mikel Galar

Email address: mikel.galar@unavarra.es

Affiliation: Public University of Navarre

Postal address: Department of Automatics and Computations, Public University of Navarre, 31006 Pamplona, Spain

Telephone number: +34 948 166040


Name: Isaac Triguero

Email address: Isaac.Triguero@nottingham.ac.uk

Affiliation: School of Computer Science, University of Nottingham.

Postal address: Jubilee Campus, Wollaton Road, Nottingham NG8 1BB, United Kingdom.

Telephone number: +44(0)115 8466416


(c) Copyright: Isaac Triguero Velázquez
