During the first day of the Why R? 2018 conference, 2nd July 2018, we are more than pleased to offer the following set of workshops. Below you can find the plan and descriptions. See you at the venue!
Please note that the RLadies workshop is a free accompanying event that will take place in a separate building. We invite all participants who are starting their journey with R!
The plan of workshops (rooms and hours) is available here: https://whyr2018.github.io/plan/workshop_plans/
Instructor: Piotr Sobczyk
Creating spatial data visualizations is one of the coolest elements in the R toolbox. In the workshop I will show you tips to follow, pitfalls to avoid and hacks that might be either one of them :) Starting from the basic plot function, we will move through ggplot2 and finish with R packages that wrap interactive JavaScript libraries. You will get familiar with the full process: from finding the right data, through processing it in R, to preparing the final plot.
I suggest installing the following packages in advance: c("dplyr", "tidyr", "sf", "ggplot2", "ggmap", "ggthemes", "animation", "leaflet", "rnaturalearth").
Shortly, a repo with the full workshop materials will be available.
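To give a taste of the ggplot2 stage of that process, here is a minimal sketch (our example, not the workshop material) drawing a world choropleth with sf, ggplot2 and rnaturalearth (the companion rnaturalearthdata package supplies the polygons):

library(sf)
library(ggplot2)
library(rnaturalearth)

# Country polygons as an sf object (needs rnaturalearthdata installed)
world <- ne_countries(scale = "medium", returnclass = "sf")

# A simple choropleth of Natural Earth population estimates
ggplot(world) +
  geom_sf(aes(fill = pop_est)) +
  scale_fill_viridis_c(trans = "log10") +
  theme_minimal()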
Instructors: Mikołaj Olszewski & Mikołaj Bogucki (iDash)
Preparing a slide deck with the results of your research seems quite straightforward. You produce all the plots and tables in R and just paste them into PowerPoint, right? Or you might have gone a bit further and already used RMarkdown with ioslides, Slidy or Beamer. Those technologies, however, have many drawbacks. Their default look is quite outdated, and it's hard to customise them so that each slide looks exactly as you want. This can be especially problematic for companies that need to follow strict brand guidelines.
This hands-on workshop will introduce participants to a different package, xaringan, that addresses these issues. It allows you to customise each slide entirely to suit the needs of the most demanding users. Since it also uses RMarkdown, it produces not only eye-catching but also reproducible slides. Moreover, it lets you preview your slides dynamically in RStudio, making your work much easier. It's also relatively easy to export the slide deck (natively HTML) to a pixel-perfect PDF.
Join us if you want to learn how to make your next R slide deck awesome!
We kindly ask participants to bring their own laptops with the following software installed:
The R packages required for the workshop can be installed by running the following script:
install.packages("xaringan")
install.packages("rmarkdown")
install.packages("leaflet")
install.packages("plotly")
install.packages("ggplot2")
install.packages("DT")
Instructor: Roman Popat (Jumping Rivers)
A quick introduction to creating interactive visualisations of data using shiny. The workshop will first make sure everyone is familiar with rmarkdown and htmlwidgets for creating a document with nice visualisations. We will then extend this knowledge by examining the shiny package for creating input/output bindings to interact with our R data structures. We will cover the basics of input and output for a shiny application and then explore creating our own page layouts. By the end of the workshop participants should feel comfortable getting started with creating their own shiny applications. Topics include:
recap/intro to markdown
input widgets and render functions
page layouts using shiny and shiny dashboard
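For orientation, the input/output binding idea fits in a few lines of shiny; a minimal sketch, not part of the workshop materials:

library(shiny)

ui <- fluidPage(
  sliderInput("n", "Number of points:", min = 10, max = 500, value = 100),
  plotOutput("scatter")
)

server <- function(input, output) {
  # renderPlot re-runs automatically whenever input$n changes
  output$scatter <- renderPlot({
    plot(rnorm(input$n), rnorm(input$n))
  })
}

shinyApp(ui, server)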
Tools for exploration, validation and explanation of complex machine learning models.
Instructor: Mateusz Staniak (Uniwersytet Wrocławski)
Complex machine learning models are frequently used in predictive modelling. There are many examples of random-forest-like or boosting-like models in medicine, finance, agriculture and other fields.
In this workshop we will show why and how to analyze the structure of such black-box models.
This will be a hands-on workshop in four parts. Each part consists of a short lecture followed by time for practice and discussion. Find the description of each part below.
Introduction: Here we will show what problems may arise from the blind application of black-box models. We will also show situations in which understanding a model's structure leads to model improvements, model stability and greater trust in the model. During the hands-on part we will fit a few complex models (like xgboost or randomForest) with the mlr package and discuss basic diagnostic tools for these models.
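For orientation, fitting such a model through mlr's unified interface looks roughly like this (a sketch; the Pima diabetes data from the mlbench package is our assumption, not necessarily the workshop's dataset):

library(mlr)

# Example data: Pima diabetes from the mlbench package
data(PimaIndiansDiabetes, package = "mlbench")

# Define the task and a random forest learner
task <- makeClassifTask(data = PimaIndiansDiabetes, target = "diabetes")
lrn <- makeLearner("classif.randomForest", predict.type = "prob")

# Fit the model and get a basic cross-validated diagnostic
mod <- train(lrn, task)
res <- resample(lrn, task, resampling = cv5, measures = auc)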
Conditional Explainers: In this part we will introduce techniques for understanding the marginal/conditional response of a model given one or two variables. We will cover PDP (Partial Dependence Plots) and ICE (Individual Conditional Expectation) plots for continuous variables and MPP (Merging Path Plots, from the factorMerger package) for categorical variables.
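A minimal sketch of PDP and ICE curves with the pdp package, using the Boston housing data that pdp ships (our example, not the workshop's):

library(pdp)
library(randomForest)

# Boston housing data bundled with pdp; cmedv is the response
data(boston, package = "pdp")
rf <- randomForest(cmedv ~ ., data = boston)

# Partial dependence of predicted price on lstat (the average effect)
pd <- partial(rf, pred.var = "lstat", train = boston)
plotPartial(pd)

# ICE: one curve per observation instead of the average
ice <- partial(rf, pred.var = "lstat", ice = TRUE, train = boston)
plotPartial(ice)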
Local Explainers: In this part we will introduce techniques that explain the key factors driving single model predictions. This covers Break Down plots for linear models (lm/glm) and tree-based models (randomForestExplainer, xgboostExplainer), along with model-agnostic approaches implemented in the live package (an extension of the LIME method).
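As a flavour of a local explanation, the breakDown package decomposes a single prediction into per-variable contributions; a minimal sketch on built-in data:

library(breakDown)

# A simple model on the built-in mtcars data
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Decompose the prediction for one observation
explanation <- broken(model, new_observation = mtcars[1, ])
plot(explanation)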
Global Explainers: In this part we will introduce tools for the global analysis of black-box models, such as variable importance plots, interaction importance plots and tools for model diagnostics.
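A sketch of the global workflow with DALEX (note that function names have shifted across DALEX releases: recent versions use model_parts(), older ones variable_importance()):

library(DALEX)
library(randomForest)

# A black-box regression model
rf <- randomForest(mpg ~ ., data = mtcars)

# Wrap the model in an explainer object
explainer <- explain(rf, data = mtcars[, -1], y = mtcars$mpg, label = "rf")

# Permutation-based variable importance
vi <- model_parts(explainer)
plot(vi)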
Packages that we will use include mlr (Bernd Bischl and others), DALEX (Przemysław Biecek), live (Mateusz Staniak and Przemysław Biecek), factorMerger (Agnieszka Sitko and Przemysław Biecek), pdp (Brandon Greenwell) and ALEPlot (Dan Apley).
Instructor: Tomasz Żółtak (Educational Research Institute, Warsaw, Poland)
Surveys often include sets of questions on the same subject, designed to create more general indicators of views, attitudes, knowledge or other characteristics of respondents. Such indicators allow for the synthesis of information, drawing more general conclusions and reducing random measurement errors. As continuous variables, they are also easier to use in further analysis.
However, the use of survey questions often involves a number of problems:
- answers are given on scales that can't be treated as continuous (e.g. a Likert scale);
- responses to the questions may depend on the way in which they are worded, e.g. respondents may react a little differently to negative statements;
- respondents may have different styles of answering questions, e.g. some may prefer more extreme answers than others;
- in self-assessment questionnaires some respondents may be inclined to give untruthful answers indicating a higher level of knowledge or skills.
Workshop participants will learn how to use R to:
- create scales based on sets of categorical variables using Categorical Exploratory/Confirmatory Factor Analysis (CEFA/CCFA) and IRT models;
- use models with bi-factor rotation to deal with different forms of asking questions;
- correct for differences in styles of answering questions asked using a Likert scale;
- correct self-assessment knowledge/skill indicators using fake items.
During the workshop the R packages "polycor", "mirt" and "lavaan" will be used, along with data from international surveys: ESS, PISA and PIAAC.
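As a small taste of the IRT part, fitting a graded response model to Likert-type items with mirt might look like this (a sketch using the Science data bundled with mirt, not the workshop's survey data):

library(mirt)

# Four Likert-type science attitude items shipped with mirt
data(Science)

# Unidimensional graded response model
fit <- mirt(Science, model = 1, itemtype = "graded")

# Factor scores: a continuous scale for further analysis
head(fscores(fit))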
Instructor: Bartłomiej Kraszewski (Forest Research Institute)
Remote sensing data from different sensors is a rich source of information for studying the natural environment and natural phenomena, and for monitoring extreme events such as floods. Analyses and products based on remote sensing data are often essential for supporting decision-making in cities, forestry and agriculture. Analyses of RS data are carried out over large areas, which calls for advanced tools for their processing, such as databases or programming languages. The R language is used more and more often for this type of analysis, and its toolbox in this area is still growing. R packages can be used for data analysis, processing and visualization.
The aim of the workshop is to present R packages that can be used to work with remote sensing data. During the course, packages for GIS analysis (rgdal, rgeos, sf), raster data processing (raster) and ALS data processing (lidR) will be used. We will show how to integrate these data sources to obtain new information for later analyses and modelling using machine learning. The entire workshop will be carried out as a simple project of remote sensing data analysis in a forest environment. The instructors will put emphasis on the practical use of the R packages they use in their daily work on large remote sensing projects (LIFE ForBioSensing and RemBioFor) carried out by the Forest Research Institute in Sękocin Stary.
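To illustrate the kind of data integration the workshop covers, a sketch combining raster and sf (the file paths are hypothetical placeholders):

library(raster)
library(sf)

# An elevation model and forest stand polygons (hypothetical paths)
dem <- raster("data/dem.tif")
stands <- st_read("data/stands.shp")

# Derive slope from the elevation model
slope <- terrain(dem, opt = "slope", unit = "degrees")

# Mean slope per stand: a new variable for later modelling
stands$mean_slope <- extract(slope, as(stands, "Spatial"),
                             fun = mean, na.rm = TRUE)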
Co-host of the workshop: Agnieszka Kamińska (Forest Research Institute)
Instructor: Roman Popat (Jumping Rivers)
This workshop will suit attendees who are already comfortable with creating shiny applications. We will explore how to add functionality to our app using JavaScript packages and code. No real JavaScript knowledge is required to get started if you are a confident R programmer, but the session will contain examples with written JavaScript. We will also look at how to deal with routines in a shiny application that take a long time to run, and how to provide a good experience for simultaneous users of your app. Finally, we will create a standalone web-served API for our R code and integrate it into a shiny application. Topics include:
adding functionality from javascript code
futures and promises for long running code
create and integrate with an external API
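As a preview of the long-running-task pattern, a minimal sketch with future and promises (shiny's render functions accept promises, so the slow work runs off the main process):

library(shiny)
library(promises)
library(future)
plan(multisession)

ui <- fluidPage(
  actionButton("go", "Run slow task"),
  textOutput("result")
)

server <- function(input, output, session) {
  output$result <- renderText({
    req(input$go)
    # Runs in a separate R process, keeping the app responsive
    future({
      Sys.sleep(5)
      "Finished!"
    })
  })
}

shinyApp(ui, server)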
Instructor: Michał Maj (Appsilon Data Science)
With the release of the R Keras package (https://keras.rstudio.com/) (by JJ Allaire and Francois Chollet) at the end of 2017 / beginning of 2018, the topic of artificial neural networks, and deep learning in particular, became red-hot within the R community.
In this workshop you will get answers to the following questions:
Please make sure to bring your laptop with an up-to-date version of R and RStudio installed, and set up Keras:
Set up Keras (make sure to install the required prerequisites before installing Keras using the commands below):
install.packages("keras")            # install the keras R package first
library(keras)
install_keras()                      # CPU version
# install_keras(tensorflow = "gpu")  # GPU version (recommended)
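Once Keras is set up, a first model is only a few lines; a sketch to preview the API, not the workshop material:

library(keras)

# A small fully connected network for 10-class classification
model <- keras_model_sequential() %>%
  layer_dense(units = 128, activation = "relu", input_shape = c(784)) %>%
  layer_dense(units = 10, activation = "softmax")

model %>% compile(
  optimizer = "adam",
  loss = "categorical_crossentropy",
  metrics = "accuracy"
)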