Today's culture of "competencies" and "practices" assumes that people are taught a set of approaches and recipes for solving a given range of problems. What gets left out is the shelf life of those recipes: in effect they are cast into a monolith and replicated by the same person for years. Every so often you hear pronouncements about "best practices" that are some 30 years old and have outlived several paradigm shifts. Working with such a "best practice" feels like sitting in a time capsule.
Yes, this is mentally comfortable and saves the "specialist" energy. Yes, it creates a sense of stability. But doing high-quality, efficient work requires constantly adjusting and sharpening your tools.
R as of 2020 differs greatly from R of even 2018. Significant changes have been made to the base code itself to improve efficiency and stability (speed and memory consumption). The more dynamic part of the ecosystem, though, is the packages. It is worth reviewing your collection periodically in order to switch to more convenient and better-performing implementations. Since the previous publication, "A Gentleman's Set of R Packages for Automating Business Tasks" («Джентельменский набор пакетов R для автоматизации бизнес-задач»), the packages themselves have undergone serious upgrades, their range has expanded considerably, and the leaders have changed places many times.
It is no secret that mainstream does not mean maximally efficient or universal. Staying within the mainstream makes it very easy to miss packages that are real gems. R conferences such as useR!, Rconf, eRum and others are an especially convenient place to discover them.
Below is a list of general-purpose packages that prove very useful for everyday tasks (x packages out of more than 10K on CRAN). It often turns out that many of the newer ones are unknown to the people I talk to. I am publishing them as a collection for a consolidated overview, as a snapshot of July 2020. In most cases the links lead to a page listing the package's functions. I am sure everyone will find something useful here.
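The periodic review of one's package collection mentioned above can start from a plain inventory. The following is only a minimal sketch in base R, not a recommended procedure: it lists the installed packages and flags those built under an older major.minor release of R than the one currently running. The variable names (`inst`, `running`, `stale`) are illustrative, and "built under an older R" is just one possible heuristic for picking review candidates.

```r
# Base R only: no extra packages are assumed.
ip <- installed.packages()

# Keep the package name, its version and the R version it was built under.
inst <- data.frame(
  Package = unname(ip[, "Package"]),
  Version = unname(ip[, "Version"]),
  Built   = unname(ip[, "Built"]),
  stringsAsFactors = FALSE
)
inst <- inst[!is.na(inst$Built), , drop = FALSE]

# Running R version as "major.minor" (for example "4.0"), without the patch level.
running <- paste(R.version$major, sub("\\..*$", "", R.version$minor), sep = ".")

# Packages built under an older major.minor release of R than the running one:
# natural first candidates for a review or an update.
stale <- inst[package_version(inst$Built) < package_version(running), , drop = FALSE]

cat("Running R", running, "with", nrow(inst), "installed packages;",
    nrow(stale), "built under an older R release\n")
head(stale[order(stale$Built), ], 10)
```

From there, each flagged package can be compared against the alternatives in the lists below.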
R: EDA
- arsenal: An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
- compareGroups package
- DataExplorer: Automate Data Exploration and Treatment
- diff & compare
- funModeling quick-start
- gtsummary: Presentation-Ready Data Summary and Analytic Result Tables
- XanderHorn/autoEDA: Automated exploratory data analysis
- ekstroem/dataMaid: An R package for data screening
- rolkra/explore: R package that makes basic data exploration radically simple (interactive data exploration, reproducible data science)
- skimr (rOpenSci): Compact and Flexible Summaries of Data
- inspectdf: Inspection, Comparison and Visualisation of Data Frames
- rspivot: View data frames as Shiny pivot tables
- summarytools: R Package for quickly and neatly summarizing vectors and dataframes
- visdat: Preliminary Visualisation of Data
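As a small taste of what the EDA packages above automate, here is a hedged sketch using `skimr::skim()` on the built-in `iris` dataset. It assumes skimr may or may not be installed, so it falls back to base R `summary()` when the package is missing; the variable name `overview` is illustrative.

```r
# Assumption: skimr is optional here; base R is used when it is not installed.
overview <- if (requireNamespace("skimr", quietly = TRUE)) {
  skimr::skim(iris)   # per-column summaries, grouped by column type
} else {
  summary(iris)       # base R fallback
}

print(overview)
```

The other packages in this section follow a similar "one call on a data frame" pattern, though their function names and outputs differ.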
R: data_pkg
- arrangements: Fast Generators and Iterators for Permutations, Combinations, Integer Partitions and Compositions
- Arrow R Package: Integration to Apache Arrow
- batchtools: Tools for Computation on Batch Systems
- bigreadr: Read Large Text Files
- bupaR: Business Process Analysis in R
- business-science
  - anomalize: Tidy Anomaly Detection
  - sweep: Tidy Tools for Forecasting
  - tidyquant: Tidy Quantitative Financial Analysis
  - timetk: A Tool Kit for Working with Time Series in R
  - modeltime: The tidymodels extension for time series modeling
  - correlationfunnel: Speed Up Exploratory Data Analysis (EDA) with the Correlation Funnel
- carbonate: Interact with carbon.js
- checkmate: Fast and Versatile Argument Checks
- crrri: An interface with headless Chromium/Chrome
- data.table: Extension of `data.frame`
- datapasta: R Tools for Data Copy-Pasta
- dm: Relational Data Models
- dplyr: A Grammar of Data Manipulation
- docxtractr: Extract Tables from Microsoft Word Documents with R
- drake (rOpenSci): A Pipeline Toolkit for Reproducible Computation at Scale
- fake data
- forcats: Tools for Working with Categorical Variables (Factors)
- friendlyeval: A friendly interface to the tidy eval framework
- fs: Cross-Platform File System Operations Based on libuv
- fst: Lightning Fast Serialization of Data Frames for R
- fuzzyjoin: Join Tables Together on Inexact Matching
- gargle: Utilities for Working with Google APIs
- glue: Interpreted String Literals
- goodpress: From R Markdown to WordPress with a Modern Stack
- hts: Hierarchical and Grouped Time Series
- httr: Tools for Working with URLs and HTTP
- igraph R package
- imputeTS: Time Series Missing Value Imputation
- infer: Tidy Statistical Inference
- inferr: Tools for Inferential Statistics
- ipaddress: This package provides classes for working with IP addresses
- janitor: Simple Tools for Examining and Cleaning Dirty Data
- labelled: Manipulating Labelled Data
- lubridate: Make Dealing with Dates a Little Easier
- magrittr: A Forward-Pipe Operator for R
- naniar: Data Structures, Summaries, and Visualisations for Missing Data
- officedown: Enhanced R Markdown Format for Word and PowerPoint
- officer: Manipulation of Microsoft Word and PowerPoint Documents
- openxlsx: Read, Write and Edit xlsx Files
- palmerpenguins: Palmer Station LTER Penguins Data
- panelr: Regression Models and Utilities for Repeated Measures and Panel Data
- parsnip: A Common API to Modeling and Analysis Functions
- pillar: Coloured Formatting for Columns
- pointblank: Validation of Local and Remote Data Tables
- polite: Be Nice on the Web
- poorman: A copy of dplyr verbs using only base R
- purrr: Functional Programming Tools
- qdapRegex: A collection of regular expression tools associated with the qdap package
- qs: Quick serialization of R objects
- R6: Encapsulated Classes with Reference Semantics
- rainette: The Reinert Method for Textual Data Clustering
- rbin: Tools for Binning Data
- RcppSimdJSON: Rcpp Bindings for the simdjson Header Library
- re2r: Regular Expressions with re2
- readr: Read Rectangular Text Data
- readxl: Read Excel Files
- reprex: Prepare Reproducible Example Code via the Clipboard
- santoku: A Versatile Cutting Tool for R
- slider: Sliding Window Functions
- stringi: THE R package for… string/text processing
- stringr: Simple, Consistent Wrappers for Common String Operations
- text2vec
- tibble: Simple Data Frames
- tidyfast: Fast Tidying of Data based on data.table
- tidyfst: Tidy Verbs for Fast Data Manipulation
- tidylog: Logging for 'dplyr' and 'tidyr' Functions
- tidync: A Tidy Approach to NetCDF Data Exploration and Extraction
- tidyverts
  - Tidyverts landing
  - tidyverts/fable: Forecasting Models for Tidy Time Series
  - fabletools: Core Tools for Packages in the fable Framework
  - tidyverts/feasts: Feature Extraction and Statistics for Time Series
  - tidyverts/tsibble: Tidy Temporal Data Frames and Tools
  - tidyverts/tsibbledata: Example datasets for tsibble
  - sugrrants: Supporting Graphs for Analysing Time Series
- tidyxl: Read Untidy Excel Files
- tidyr: Tidy Messy Data
- tidyselect: Select from a Set of Strings
- trimmer: toolkit to trim R objects
- tsbox: Class-Agnostic Time Series
- usethis: Automate Package and Project Setup
- vctrs: Vector Helpers
- vroom: Read and Write Rectangular Text Data Quickly
- vtreat: A Statistically Sound data.frame Processor/Conditioner
- withr: Run Code With Temporarily Modified Global State
- ymlthis: Write YAML for R Markdown, bookdown, blogdown, and More
- xslt (rOpenSci): Extensible Style-Sheet Language Transformations
- xml2: Parse XML
R: algo_pkg
- bayesAB: Fast Bayesian Methods for AB Testing
- broom: Convert Statistical Objects into Tidy Tibbles
- bupaR: Business Process Analysis in R
- caracas: Computer Algebra
- DrWhy: Explain, Explore and Debug Predictive Machine Learning Models
- easystats
  - bayestestR: Understand and Describe Bayesian Models and Posterior Distributions
  - estimate: Estimate Effects, Contrasts and Means
  - insight: Easy Access to Model Information for Various Model Objects
  - parameters: Processing of Model Parameters
  - performance: Assessment of Regression Models Performance
  - report: Automated reporting of statistical models in R
  - see: Visualisation Toolbox for easystats and Extra Geoms, Themes and Color Palettes for ggplot2
- gratia: An R package for working with generalized additive models
- greta: Simple and Scalable Statistical Modelling in R
- ML
  - breakDown: Model Agnostic Explainers for Individual Predictions
  - C50: C5.0 Decision Trees and Rule-Based Models
  - The caret Package
  - DALEX: moDel Agnostic Language for Exploration and eXplanation
  - keras: R Interface to 'Keras'
  - lime: Local Interpretable Model-Agnostic Explanations
  - mlr3: Machine Learning in R
  - TensorFlow for R: TensorFlow for R Blog
- modelr: Modelling Functions that Work with the Pipe
- precisely: Estimate Sample Size Based on Precision Rather than Power
- propro: Build Probabilistic Process Models Using MCMC
- quanteda: Quantitative Analysis of Textual Data
- R: Optimisations in data.table
- sbmr: Fit and investigate Stochastic Block Models in R
- simmer: Discrete-Event Simulation for R
- stats
- tidybayes: Tidy Data and Geoms for Bayesian Models
- jonathan-g/datafsm: Learning Finite State Machine Models from Data with a Genetic Algorithm
- tidymodels
  - Tidymodels landing
  - embed: Extra Recipes for Encoding Categorical Predictors
  - corrr: Correlations in R
  - hardhat: Construct Modeling Packages
  - recipes: Preprocessing Tools to Create Design Matrices
  - rsample: General Resampling Infrastructure
  - rules: Model Wrappers for Rule-Based Models
  - tune: Tidy Tuning Tools
  - yardstick: Tidy Characterizations of Model Performance
R: vis_pkg
- Tables
- ggplot2 themes & colours
- ggthemes: Function reference
- ggthemes: demo
- cttobin/ggthemr: Themes for ggplot2.
- ghibli: Studio Ghibli Colour Palettes
- HCL Wizard — Somewhere over the Rainbow
- colorspace: Function reference
- Polychrome: Plots of Many Colors
- ggtext: Function reference
- ggplot2 extensions
- ggsci: Scientific Journal and Sci-Fi Themed Color Palettes for ggplot2
- hrbrthemes: Additional Themes, Theme Components and Utilities for 'ggplot2'
- paletteer: Comprehensive Collection of Color Palettes
- scico: Palettes for R based on the Scientific Colour-Maps
- Tol Color Schemes — The USGS OWI blog
- ceramic: Download Online Imagery Tiles
- cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'
- dataui: dataui and reactable
- daranzolin/d3rain: An htmlwidget bringing D3 drip to R
- DiagrammeR: Graph/Network Visualization
- DiagrammeR — Documentation
- echarts4r: Create Interactive Graphs with 'Echarts JavaScript' Version 4
- factoextra: Extract and Visualize the Results of Multivariate Data Analyses
- farver: High Performance Colour Space Manipulation
- ggalluvial: Alluvial Plots in ggplot2
- GGally: Extension to ggplot2
- ggalt: Extra Coordinate Systems, 'Geoms', Statistical Transformations, Scales and Fonts for 'ggplot2'
- gganimate: A Grammar of Animated Graphics
- ggchicklet: Create Chicklet (Rounded Segmented Column) Charts
- ggExtra: Add marginal histograms to ggplot2, and more ggplot2 enhancements
- gghighlight: Highlight Lines and Points in ggplot2
- ggiraph: Make ggplot2 Graphics Interactive
- ggfittext: Fit Text Inside a Box in ggplot2
- ggforce: Accelerating ggplot2
- gghdr: Visualisation of Highest Density Regions (HDR) using ggplot2
- ggnet2: network visualization with ggplot2
- ggnewscale: Multiple Fill and Colour Scales in ggplot2
- ggpage: Creates Page Layout Visualizations
- ggparty: Graphic Partying
- ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics
- ggpubr: ggplot2 Based Publication Ready Plots
- ggrepel: Automatically Position Non-Overlapping Text Labels with ggplot2
- ggridges: Ridgeline Plots in 'ggplot2'
- ggrough: Convert ggplot2 charts to roughjs
- ggsignif: Easily add significance brackets to your ggplots
- ggtextures: Drawing textured rectangles and bars with ggplot
- Highcharter
- Leaflet for R — Introduction
- mschart: Chart Generation for 'Microsoft Word' and 'Microsoft PowerPoint' Documents
- parcats: Interactive Parallel Categories Diagrams for easyalluvial
- patchwork: The Composer of Plots
- practicalgg: Practical ggplot2
- prettyunits: Pretty, human readable formatting of quantities
- processanimateR: Process Map Token Replay Animation
- r2d3: Interface to 'D3' Visualizations
- ragg: Graphic Devices Based on AGG
- rayshader: Create Maps and Visualize Data in 2D and 3D
- scales: Scale Functions for Visualization
- showtext: Using Fonts More Easily in R Graphs
- streamgraph: htmlwidget R Package
- trelliscopejs: Create Interactive Trelliscope Displays
- vega-lite
- volcano3D: Interactive Plotting of Three-Way Differential Expression Analysis
- raivokolde/pheatmap: Pretty heatmaps
- bs4Dash: A 'Bootstrap 4' Version of 'shinydashboard'
- The R Graph Gallery – Inspiration and Help with R Graphics
- gallery.htmlwidgets.org
- From data to Viz | Find the graphic you need
- tabplot: Visualization of large datasets
- JohnCoene/grapher: Interactive graphs
R: sys_pkg
- datadrivencv: An R package for building your CV with data
- R Documentation and manuals | R Documentation
- attempt: Tools for Defensive Programming
- bench: High Precision Timing of R Expressions
- callr: Call R from R
- conflicted: An Alternative Conflict Resolution Strategy
- r-lib/covr: Test coverage reports for R
- lrberge/dreamerr: Error Handling Made Easy, in R
- ellipsis: Tools for Working with ...
- emayili: Send email messages
- Environments: Reproducible Environments
- fastmap: Fast Implementation of a Key-Value Store
- fledge: Wings for Your R Packages
- git2r (rOpenSci): programmatic access to Git repositories
- here: A Simpler Way to Find Your Files
- r-lib/itdepends: itdepends provides tools to assess usage, measure weights, visualize proportions and assist removal of dependencies
- stedolan/jq: Wiki FAQ
- r-lib/later: Schedule an R function or formula to run after a specified period of time.
- lgr: A Fully Featured Logging Framework
- jimhester/lintr: Static Code Analysis for R
- loadtest: This package is to make load testing of APIs
- lobstr: Visualize R Data Structures with Trees
- logger: A Lightweight, Modern and Flexible Logging Utility
- pak: Another Approach to Package Installation
- pander: An R Pandoc Writer
- parallel computing
- HenrikBengtsson/doFuture: R package: doFuture — A Universal Foreach Parallel Adaptor using the Future API of the 'future' Package
- furrr: Apply Mapping Functions in Parallel using Futures
- HenrikBengtsson/future: R package: future: Unified Parallel and Distributed Processing in R for Everyone
- r-lib/progress: Progress bar in your R terminal
- progressr: An Inclusive, Unifying API for Progress Updates
- pins: Pin, Discover and Share Resources
- plumber: An API Generator for R
- precommit: Pre-commit Hooks for R
- processx: Execute and Control System Processes
- proffer: Profile R Code and Visualize with Pprof
- profvis: Interactive Visualizations for Profiling R Code
- projmgr: Task Tracking and Project Management with GitHub
- ps: List, Query, Manipulate System Processes
- rco: The R Code Optimizer
- remotes: R Package Installation from Remote Repositories, Including 'GitHub'
- redoc: Reversible Reproducible Documents
- renv: Project Environments
- reticulate: Interface to Python
- rcmdcheck: Run R CMD check from R and collect the results
- rlang: Functions for Base Types and Core R and Tidyverse Features
- rprojroot: Finding Files in Project Subdirectories
- styler: Non-Invasive Pretty Printing of R Code
- testthat: Unit Testing for R
- tinytest: A lightweight, no-dependency, but full-featured package for unit testing in R
- usethis: Automate Package and Project Setup
- r-lib/vdiffr: Visual regression testing and graphical diffing with testthat
- r-lib/waldo: Find Differences Between R Objects
R: shiny+Rmarkdown
- bootstraplib: Tools for styling shiny and rmarkdown from R via Bootstrap (3 or 4) Sass
- colourpicker: A colour picker tool for Shiny and for selecting colours in plots (in R)
- dreamRs/colorscale: Create a color scale from a single color
- details: Create Details HTML Tag for Markdown and Package Documentation
- excelR: A Wrapper of the 'JavaScript' Library 'jExcel'
- ezknitr (rOpenSci): Avoid the Typical Working Directory Pain When Using knitr
- flair: Highlight, Annotate, and Format your R Source Code
- golem: A Framework for Robust Shiny Applications
- grillade: Grid System for the Web
- metathis: HTML Metadata Tags for R Markdown and Shiny
- namer: The goal of namer is to name the chunks of R Markdown files
- reactR: React Helpers
- reactlog: Reactivity Visualizer for 'shiny'
- RinteRface
- rintrojs: Wrapper for the 'Intro.js' Library
- sass: Syntactically Awesome Style Sheets (Sass)
- shiny: R package that makes it easy to build interactive web apps straight from R
- datacamp/shinybones: A highly opinionated framework for building shiny dashboards.
- shinyglide: Glide Component for Shiny Applications
- shinyjqui: jQuery UI Interactions and Effects for Shiny
- shinymeta: Record and Expose Shiny App Logic using Metaprogramming
- shinytest: Test Shiny Apps
- sortable: Drag-and-Drop in shiny Apps with SortableJS
- thematic: Unified and Automatic Theming of ggplot2, lattice, and base R Graphics
- xaringanthemer: Custom xaringan CSS Themes
- JohnCoene/marker: Dynamically Highlight Text in Shiny
- CRAN — Package dqshiny
- CRAN — Package shinyFiles
- The YAML Fieldguide
- ymlthis: Write YAML for R Markdown, bookdown, blogdown, and More
Previous publication: "R Markdown. How to Produce a Report Under Uncertainty?" («R Markdown. Как сделать отчет в условиях неопределенности?»).