Talks
Here are slides from a few talks I’ve given in recent years:
- Solving Big Data Problems With Apache Arrow (video): Some applications of the `arrow` R package provided by the user community. Presented at UseR 2021, July 2021.
- Bigger Data With Ease Using Apache Arrow (video): This talk examines the unique characteristics of the Arrow project that enable it to redefine what is possible in R. It also highlights some of the latest developments in the `arrow` R package, including how you can query and manipulate multi-file datasets, and it presents strategies for speeding up workflows by up to 100x. Presented at rstudio::global(2021) conference, January 2021.
- Fast Data Access With Apache Arrow (video): Overview of the Apache Arrow project and how it improves the speed and efficiency of data access, in many cases without your even being aware of it, plus a discussion of the growth of the Arrow community and Ursa Labs’s apprenticeship program. Chan-Zuckerberg Initiative Essential Open Source Software for Science Meeting, December 2020.
- Fast Data Access in R and Python With Apache Arrow: Reviews the latest features in the 2.0.0 release of Apache Arrow and builds up to an example of using the Flight RPC framework to ship data. Tutorial presented at the ODSC West 2020 conference, October 2020.
- Speeding Up Data Access With Apache Arrow (video): Overview of enhancements from 2020 in the `arrow` R package, plus some CSV reading benchmarks. Talk given at the New York (virtual) R Conference, August 2020.
- Accelerating Analytics With Apache Arrow (video): What Arrow is, what the latest developments in the `arrow` R package are, and how you can get involved. Talk given at rstudio::conf(2020), San Francisco.
- Publishing an R Package Repository with Public (Free!) Services: With just a couple of additional scripts, you can use Travis-CI and Appveyor to publish binary R packages to a repository hosted on Bintray that lets your users install the latest version of your package with `install.packages()`, all for free. August 2019.
- Wrapping Web APIs in R: How packaging and testing can allow you to spend less time dealing with the API and more time working with data in R. Highlights the use of the `httptest` and `skeletor` packages. Talk given at the August 2018 meeting of the Bay Area R User Meetup.
- Dr. Datascience, Or: How I Learned to Stop Munging and Love Tests: Lessons from test-driven development (TDD) for data scientists. Lecture given at the 2016 Big Dive data science workshop in Turin, Italy. This talk is a revised version of the one Mike Malecki and I gave at the 2016 New York R Conference.