R For Data Science: A free book by Garrett Grolemund and Hadley Wickham

Most people in the R community know Garrett Grolemund and Hadley Wickham. The two have written an open-source book, which is free to read online here:


On the first page, they write: “This is the website for ‘R for Data Science’. This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you’ll learn how to clean data and draw plots—and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You’ll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You’ll also learn how to manage cognitive ….”

A simple example of using ‘sprintf’

In this post, I show a very brief example of using ‘sprintf’ inside a user-defined function in R.
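As a rough illustration of the idea, here is a minimal sketch of a user-defined function that builds a message with sprintf(). The function name describe_vector and its inputs are hypothetical, chosen only for this sketch.

```r
# Hypothetical helper: formats a one-line summary of a numeric vector.
# %s inserts a string, %d an integer, %.2f a number with two decimals.
describe_vector <- function(name, x) {
  sprintf("%s has %d values with mean %.2f", name, length(x), mean(x))
}

msg <- describe_vector("height", c(1.5, 1.75, 2))
print(msg)
# [1] "height has 3 values with mean 1.75"
```

sprintf() returns the formatted string rather than printing it, so the result can be stored, passed to message(), or pasted into a longer report.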

Removing duplicates in R using ‘dplyr’ and ‘data.table’

In this post, I will show how to remove duplicate observations from a data frame.
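A minimal sketch of both approaches, assuming the dplyr and data.table packages are installed. The data frame and its column names (id, value) are made up for illustration.

```r
library(dplyr)
library(data.table)

# Toy data: the first two rows are identical, and id 3 appears twice
df <- data.frame(
  id    = c(1, 1, 2, 3, 3),
  value = c("a", "a", "b", "c", "d")
)

# dplyr: drop rows that are duplicated across all columns
dedup_all <- distinct(df)

# dplyr: keep only the first row for each id, retaining the other columns
dedup_id <- distinct(df, id, .keep_all = TRUE)

# data.table: unique() with `by` also keeps the first row per id
dt <- as.data.table(df)
dedup_dt <- unique(dt, by = "id")

nrow(dedup_all)  # 4
nrow(dedup_id)   # 3
nrow(dedup_dt)   # 3
```

Whether to deduplicate on all columns or on a key column depends on what counts as the same observation in your data.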


Package ‘tibble’ in R

What is the ‘tibble’ package?

According to Hadley Wickham, “Tibbles are a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not.

The name comes from dplyr: originally you created these objects with tbl_df(), which was most easily pronounced as ‘tibble diff’.”

Explore its similarities to and differences from data.frame. More info here: tibble
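To make the comparison concrete, here is a minimal sketch of two well-known differences, assuming the tibble package is installed. The column names are made up for illustration.

```r
library(tibble)

df <- data.frame(value = 1:3, label = c("a", "b", "c"))
tb <- tibble(value = 1:3, label = c("a", "b", "c"))

# Similarity: a tibble is still a data frame
is.data.frame(tb)          # TRUE

# Difference 1: selecting one column with [ ]
df_col <- df[, "value"]    # data.frame drops to a plain integer vector
tb_col <- tb[, "value"]    # tibble stays a one-column tibble
class(df_col)              # "integer"
class(tb_col)              # "tbl_df" "tbl" "data.frame"

# Difference 2: partial matching of column names with $
df_partial <- df$val                     # data.frame matches "value"
tb_partial <- suppressWarnings(tb$val)   # tibble returns NULL (with a warning)
```

The stricter tibble behaviour means a mistyped column name surfaces early instead of silently matching another column.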

mapply in R – an example

mapply() looks like an interesting function in R. Here is an example of what you can do with the mapply() function.

The results are:
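For illustration, here is a minimal sketch of mapply() with its output shown in comments. The inputs are made up for this sketch and are not the post's original example. mapply() calls a function once per position, taking the i-th element of each vector as the arguments.

```r
# Call rep(1, 3), rep(2, 2), rep(3, 1); the results have different
# lengths, so mapply() returns a list
reps <- mapply(rep, 1:3, 3:1)
str(reps)
# List of 3
#  $ : int [1:3] 1 1 1
#  $ : int [1:2] 2 2
#  $ : int 3

# Elementwise function of two arguments; equal-length results are
# simplified to a vector
powers <- mapply(function(base, exponent) base^exponent, 2:4, c(2, 3, 2))
powers
# [1]  4 27 16
```

When every call returns a value of the same length, mapply() simplifies the result to a vector or matrix; passing SIMPLIFY = FALSE always returns a list.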