On the first day of the Why R? 2018 conference, 2nd July 2018, we are pleased to offer the following set of workshops. Below you can find the plan and descriptions. To schedule all workshops optimally, we will send a workshop preference form to all registered participants.

See you at the venue!

Please note that the RLadies workshop is a free accompanying event that will take place in a separate building. We invite all participants who are starting their journey with R!

Why R? 2018 workshops

Responding to analysis and communication: Data science the R way

Instructor: Tatjana Kecojevic (DataTeka)

Have you heard about R and would you like to see what it does and how it works? Then you should come to our workshop. Through a series of demonstrations and hands-on exercises you will learn some of the fundamental concepts of R and get to know the tools needed in a typical data science project. We will introduce you to graphical and numerical techniques for exploring the information concealed within a dataset. Once we have developed an understanding of our data, we will use R’s reproducible and interactive approach to tell our data story by creating reproducible R Markdown documents and Shiny web apps. You should bring a laptop with the latest versions of R and RStudio installed. All the material will be available for download.

The presentation for the workshop is available at: https://github.com/TanjaKec/DSRWay

iDash - Make your R slides awesome with xaringan

Instructor: Mikołaj Olszewski (iDash)

Preparing a slide deck with the results of your research seems quite straightforward. You produce all the plots and tables in R and just paste them into PowerPoint, right? Or you might have gone a bit further and already used RMarkdown with ioslides, Slidy or Beamer. Those technologies, however, have drawbacks. Their default look is quite outdated, and it’s hard to customise it and make each slide look exactly as you want. This can be especially problematic for companies that need to follow strict brand guidelines.

This hands-on workshop will introduce participants to a different package, xaringan, that addresses these issues. It lets you customise each slide entirely to suit the needs of the most demanding users. Since it also uses RMarkdown, the results are not only eye-catching but also reproducible. Moreover, you can preview your slides dynamically in RStudio, which makes your work much easier. It’s also relatively easy to export the slide deck (natively in HTML) to a pixel-perfect PDF.
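As a rough sketch of that workflow (not the workshop's own material), a xaringan deck is an R Markdown file whose output format is xaringan::moon_reader, with slides separated by a line containing only `---`. The file name, title and slide text below are placeholders made up for illustration.

```r
# Contents of a minimal deck; title and slide text are placeholders.
deck <- c(
  "---",
  "title: \"My results\"",
  "output: xaringan::moon_reader",
  "---",
  "",
  "# First slide",
  "",
  "The cars data set has `r nrow(cars)` rows.",
  "",
  "---",
  "",
  "# Second slide",
  "",
  "Plain Markdown text goes here."
)

# Write the deck to a temporary file (hypothetical file name).
slides_file <- file.path(tempdir(), "slides.Rmd")
writeLines(deck, slides_file)

if (interactive()) {
  # Render once to HTML; needs the rmarkdown and xaringan packages.
  rmarkdown::render(slides_file)
  # For a live preview in RStudio while editing, xaringan offers:
  # xaringan::inf_mr(slides_file)
}
```

Because the deck is ordinary R Markdown, inline R such as the row count above is re-evaluated on every render, which is what keeps the slides reproducible.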

Join us if you want to learn how to make your next R slide deck awesome!

Jumping Rivers - Shiny Basics

Instructor: Roman Popat (Jumping Rivers)

A quick introduction to creating interactive visualisations of data using shiny. The workshop will first make sure everyone is familiar with rmarkdown and htmlwidgets for creating a document with nice visualisations. We will then extend this knowledge by examining the shiny package for creating input-output bindings to interact with our R data structures. We will cover the basics of input and output for a shiny application and then explore creating our own page layouts. By the end of the workshop participants should feel comfortable getting started with creating their own shiny applications.

recap/intro to markdown

  • a quick introduction/refresher on rmarkdown for document styling
  • adding some interactive graphs through htmlwidgets

input widgets and render functions

  • extend a markdown document to run using shiny
  • adding input controls
  • using inputs to render output tables and graphs

page layouts using shiny and shiny dashboard

  • shiny and shinydashboard allow more control over page layouts
  • creating a layout with input and output “slots”
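To give a flavour of the topics listed above, here is a minimal sketch of a shiny app with one input control, one rendered output and a simple sidebar layout. It is an illustration only, not the workshop material; the input and output ids ("bins", "hist") are arbitrary names chosen for this example.

```r
library(shiny)

# UI: a layout with an input "slot" (sidebar) and an output "slot" (main panel)
ui <- fluidPage(
  titlePanel("Old Faithful eruptions"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("bins", "Approximate number of bins:",
                  min = 5, max = 50, value = 20)
    ),
    mainPanel(plotOutput("hist"))
  )
)

# Server: the render function re-runs whenever input$bins changes
server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(faithful$eruptions, breaks = input$bins,
         main = NULL, xlab = "Eruption time (minutes)")
  })
}

app <- shinyApp(ui, server)

# Launch the app only when run interactively
if (interactive()) runApp(app)
```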

DALEX - Descriptive mAchine Learning EXplanations

Tools for exploration, validation and explanation of complex machine learning models.

Instructor: Mateusz Staniak (Uniwersytet Wrocławski)

Complex machine learning models are frequently used in predictive modeling. There are many examples of random-forest-like or boosting-like models in medicine, finance, agriculture etc.

In this workshop we will show why and how one would analyze the structure of the black-box model.

This will be a hands-on workshop with four parts. In each part there will be a short lecture and then time for practice and discussion. Find the description for each part below.

  1. Introduction. Here we will show what problems may arise from blind application of black-box models. We will also show situations in which understanding a model's structure leads to model improvements, model stability and greater trust in the model. During the hands-on part we will fit a few complex models (like xgboost, randomForest) with the mlr package and discuss basic diagnostic tools for these models.

  2. Conditional Explainers. In this part we will introduce techniques for understanding the marginal/conditional response of a model given one or two variables. We will cover PDP (Partial Dependence Plots) and ICE (Individual Conditional Expectations) packages for continuous variables and MPP (Merging Path Plot from the factorMerger package) for categorical variables.

  3. Local Explainers. In this part we will introduce techniques that explain the key factors that drive single model predictions. This covers Break Down plots for linear models (lm / glm) and tree-based models (randomForestExplainer, xgboostExplainer) along with model-agnostic approaches implemented in the live package (an extension of the LIME method).

  4. Global Explainers. In this part we will introduce tools for global analysis of a black-box model, like variable importance plots, interaction importance plots and tools for model diagnostics.
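The three kinds of explainers can be sketched with the DALEX package as follows. This is an assumption-laden illustration: the function names model_parts(), model_profile() and predict_parts() come from DALEX 1.0 and later and may differ from those used at the workshop, and a plain linear model on the apartments data shipped with DALEX stands in for a real black-box model.

```r
library(DALEX)

# A linear model keeps the sketch light; any fitted model could be wrapped.
model <- lm(m2.price ~ ., data = apartments)

# Predictors only (the target column is passed separately as y)
predictors <- apartments[, names(apartments) != "m2.price"]

explainer <- explain(
  model,
  data    = predictors,
  y       = apartments$m2.price,
  label   = "lm",
  verbose = FALSE
)

# Global explainer: permutation-based variable importance
importance <- model_parts(explainer)

# Conditional explainer: partial dependence profile for one variable
profile <- model_profile(explainer, variables = "surface")

# Local explainer: break-down of a single prediction
breakdown <- predict_parts(explainer, new_observation = predictors[1, ])

if (interactive()) plot(importance)
```

Since the explainer only needs a model, data and a target, the same calls should work unchanged once the linear model is swapped for, say, a random forest.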

Packages that we will use include mlr (Bernd Bischl and others), DALEX (Przemysław Biecek), live (Mateusz Staniak and Przemysław Biecek), factorMerger (Agnieszka Sitko and Przemysław Biecek), pdp (Brandon Greenwell) and ALEPlot (Dan Apley).

Constructing scales from survey questions

Instructor: Tomasz Żółtak (Educational Research Institute (Warsaw, Poland))

Surveys often include sets of questions on the same subject, designed to create more general indicators of views, attitudes, knowledge or other characteristics of respondents. Such indicators allow for synthesis of information, drawing more general conclusions and reduction of random measurement errors. As continuous variables, they are also easier to use in further analysis.

However, the use of survey questions often involves a number of problems:

  • answers are given on scales that can’t be treated as continuous (e.g. a Likert scale);
  • responses to the questions may depend on the way in which they are worded, e.g. respondents may react a little differently to negative statements;
  • respondents may have different styles of answering questions, e.g. some may prefer more extreme answers than others;
  • in self-assessment questionnaires some respondents may be inclined to give untruthful answers indicating a higher level of knowledge or skills.

Workshop participants will learn how to use R to:

  • create scales based on sets of categorical variables using Categorical Exploratory/Confirmatory Factor Analysis (CEFA / CCFA) and IRT models;
  • use models with bi-factor rotation to deal with different forms of asking questions;
  • correct for differences in the style of answering questions asked using a Likert scale;
  • correct self-assessment knowledge/skill indicators using fake items.

During the workshop the R packages ‘polycor’, ‘mirt’ and ‘lavaan’ will be used along with data from international surveys: ESS, PISA and PIAAC.
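A minimal sketch of building a scale from Likert-type items might look like this. It is not the workshop material: the Science example data shipped with mirt stands in for the ESS, PISA and PIAAC data, and a one-factor graded response model is just one of the IRT models that could be used.

```r
library(mirt)     # IRT models; also ships the 'Science' example data
library(polycor)  # polychoric correlations

# 'Science' holds four survey items answered on a 4-point Likert scale.
items <- Science

# Polychoric correlation between two ordinal items
rho <- polychor(items$Comfort, items$Work)

# One-factor graded response model (an IRT model for ordered categories)
fit <- mirt(items, model = 1, itemtype = "graded", verbose = FALSE)

# Estimated scale scores, one row per respondent
scores <- fscores(fit)
head(scores)
```

The resulting scores form a continuous indicator that can be used in further analysis in place of the four separate ordinal items.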

From RS data to knowledge – Remote Sensing in R

Instructor: Bartłomiej Kraszewski (Forest Research Institute)

Remote sensing data from different sensors is a rich source of information for studying the natural environment and natural phenomena, and for monitoring extreme events, e.g. floods. Analyses and products based on remote sensing data are often essential for supporting decision-making processes in cities, forests and agriculture. Analyses of RS data are carried out for large areas, which requires advanced processing tools, e.g. databases or programming languages. The R language is used more and more often for this type of analysis, and its set of tools in this area is still growing. R packages can be used for data analysis, processing and visualization.

The aim of the workshop is to present R packages that can be used to work with remote sensing data. During the course, packages for GIS analysis (rgdal, rgeos, sf), raster data processing (raster) and ALS data processing (lidR) will be used. We will show how these data can be integrated to obtain new information for later analyses and modelling using machine learning. The entire workshop will be carried out as a simple project of remote sensing data analysis in the forest environment. The lecturers will put emphasis on the practical use of R packages that they use in their daily work in large remote sensing projects (LIFE ForBioSensing and RemBioFor) carried out by the Forest Research Institute in Sękocin Stary.

Co-host of the workshop: Agnieszka Kamińska from the Forest Research Institute

Jumping Rivers - Advanced Shiny

Instructor: Roman Popat (Jumping Rivers)

This workshop would suit attendees who are already comfortable with creating shiny applications. We will explore how to add functionality to our app using JavaScript packages and code. No real JavaScript knowledge is required to get started if you are a confident R programmer, but the session will contain examples with written JavaScript. We will then explore how one might deal with routines in a shiny application that take a long time to run, and how to provide a good experience for simultaneous users of your app. Finally, we will explore creating a standalone web-served API to our R code and integrating it into a shiny application.

adding functionality from javascript code

  • An introduction to using a javascript package with a shiny application
  • the basics of passing javascript values to a shiny app as inputs

futures and promises for long running code

  • An introduction to the wonderful promises package by RStudio’s Joe Cheng
  • promise pipes
  • With some small changes to your app, stop long running tasks from blocking the main application
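As an illustration of the idea (a sketch, not the workshop code), the slow part of a render function can be wrapped in a future and piped onwards with the promise pipe `%...>%`. The Sys.sleep() call is a stand-in for a long-running task, and the ids "n" and "avg" are arbitrary names for this example.

```r
library(shiny)
library(promises)
library(future)

# Run futures in background R sessions
plan(multisession)

ui <- fluidPage(
  numericInput("n", "Sample size:", value = 1e5, min = 1),
  textOutput("avg")
)

server <- function(input, output, session) {
  output$avg <- renderText({
    n <- input$n  # read reactive inputs before entering the future
    future({
      Sys.sleep(3)    # stand-in for a slow computation
      mean(rnorm(n))
    }, seed = TRUE) %...>%
      format(digits = 3)  # promise pipe: runs once the future resolves
  })
}

app <- shinyApp(ui, server)

if (interactive()) runApp(app)
```

The slow computation runs in a separate R process, so the main shiny process stays free to serve other sessions while it waits for the result.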

create and integrate with an external API

  • Plumber is a great R package for creating a REST API on top of your R code and functions. We will explore how to get up and running with serving our R functions as an API
  • Integrate our separate plumber API with our shiny app
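A small sketch of serving an R function with plumber is shown below. It assumes plumber 1.0 or later, where pr(), pr_get() and pr_run() are available; the function mean_of_draws(), the /mean route and port 8000 are hypothetical choices made for this example.

```r
library(plumber)

# The R function we want to expose (hypothetical example)
mean_of_draws <- function(n = 100) {
  n <- as.integer(n)  # query parameters arrive as character strings
  list(n = n, mean = mean(rnorm(n)))
}

# Build a router and attach the function to a GET endpoint
api <- pr()
api <- pr_get(api, "/mean", mean_of_draws)

# Start the API only when run interactively (blocks the R session)
if (interactive()) pr_run(api, port = 8000)

# From a separate shiny app, the endpoint could then be queried with e.g.
# jsonlite::fromJSON("http://localhost:8000/mean?n=50")
```

Keeping the API in its own process means the shiny app only makes an HTTP request, so the two can be deployed and scaled independently.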

Introduction to Deep Learning with Keras in R

Instructor: Michał Maj (Appsilon Data Science)

With the release of the R Keras package (https://keras.rstudio.com/) by JJ Allaire and François Chollet at the end of 2017 / beginning of 2018, the topic of artificial neural networks, and especially deep learning in R, became red-hot within the R community.

In this workshop you will get answers to the following questions:

  • What are fully connected and convolutional neural networks?
  • How to build a sequential model in Keras (keras_model_sequential() function)?
  • How to compile and fit neural networks in Keras (compile() and fit() functions)?
  • How to add regularization to neural networks (L1, L2, dropout)?
  • How to save and load existing models?
  • How to perform data ingestion and augmentation using generators?
  • How to use pre-trained models and perform fine-tuning?
  • How to use callbacks?

Please make sure to bring your laptop with up-to-date versions of R and RStudio, and install Keras:

Set up Keras (make sure to install the required prerequisites before installing Keras using the commands below):

install.packages("keras") # R interface to Keras
library(keras)
install_keras() # CPU version
# install_keras(tensorflow = "gpu") # GPU version (recommended)
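Once Keras is installed, several of the questions above (building a sequential model, compiling and fitting it, regularization, callbacks, saving and loading) can be sketched in a few lines. This is an illustration on synthetic data, not the workshop material; the layer sizes, dropout rate and file name are arbitrary, and the code assumes the keras R package with a working TensorFlow backend.

```r
library(keras)

# Synthetic data: 200 observations, 20 features, binary target
set.seed(42)
x <- matrix(rnorm(200 * 20), ncol = 20)
y <- as.numeric(rowSums(x[, 1:3]) > 0)

# Fully connected network with L2 regularization and dropout
model <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = "relu", input_shape = c(20),
              kernel_regularizer = regularizer_l2(0.001)) %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 1, activation = "sigmoid")

model %>% compile(
  optimizer = "adam",
  loss      = "binary_crossentropy",
  metrics   = "accuracy"
)

# Fit with an early-stopping callback on the validation loss
history <- model %>% fit(
  x, y,
  epochs = 10, batch_size = 32, validation_split = 0.2,
  callbacks = list(callback_early_stopping(monitor = "val_loss", patience = 2)),
  verbose = 0
)

# Save the model to disk and load it back
path <- file.path(tempdir(), "model.h5")
save_model_hdf5(model, path)
restored <- load_model_hdf5(path)
```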