
R Code to Aggregate Data Using dplyr

Introduction to dplyr

dplyr is an R package designed for data manipulation. It provides a set of intuitive functions for filtering, summarizing, and transforming data. With dplyr, you can handle large datasets efficiently and perform complex operations with simple commands.

The package uses a clear syntax that makes data manipulation straightforward. Functions like `filter()`, `select()`, and `mutate()` help you easily modify and explore your data. dplyr integrates well with other tidyverse packages, creating a powerful toolkit for data analysis.

A key feature of dplyr code is chaining multiple operations together with the pipe operator (`%>%`), which dplyr re-exports from the magrittr package. The pipe passes the result of one step as the first argument of the next, so a sequence of data transformations reads top to bottom in a concise manner.
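
As a minimal sketch of what chaining buys you, the snippet below writes the same two-step computation twice: once as nested calls and once with the pipe. The `scores` data frame and its column names are made up for illustration.

```r
library(dplyr)

# Hypothetical data, for illustration only
scores <- data.frame(
  student = c("Ann", "Ben", "Cal", "Dee"),
  score   = c(72, 88, 95, 60)
)

# Nested form: has to be read from the inside out
nested <- arrange(filter(scores, score >= 70), desc(score))

# Piped form: reads top to bottom, one step per line
piped <- scores %>%
  filter(score >= 70) %>%
  arrange(desc(score))

# Check that both forms give the same rows in the same order
all.equal(nested, piped)
```

Both versions call the same functions with the same arguments; the pipe only changes how the calls are written.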

Loading dplyr

Before using dplyr, you need to load the package. Install it from CRAN if it’s not already installed.

# Install dplyr only if it is not already installed
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Load the dplyr package
library(dplyr)

Aggregating Data with group_by and summarize

The `group_by()` function groups data by one or more variables. The `summarize()` function then collapses each group to a single row containing the summary statistics you request.

# Sample data frame
data <- data.frame(
  category = c("A", "B", "A", "B", "A", "B"),
  value = c(10, 20, 15, 25, 10, 30)
)

# Aggregate data by category
aggregated_data <- data %>%
  group_by(category) %>%
  summarize(
    mean_value = mean(value),
    total_value = sum(value)
  )
aggregated_data

Output:

# A tibble: 2 × 3
  category mean_value total_value
  <chr>         <dbl>       <dbl>
1 A              11.7          35
2 B              25            75
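
Group sizes are often needed alongside means and totals. The sketch below adds a row count with `n()`, a dplyr helper that returns the number of rows in the current group, and then sorts the summary with `arrange()`. It rebuilds the same sample data so that it runs on its own; the names `with_counts` and `n_rows` are arbitrary.

```r
library(dplyr)

# Same sample data as above
data <- data.frame(
  category = c("A", "B", "A", "B", "A", "B"),
  value = c(10, 20, 15, 25, 10, 30)
)

with_counts <- data %>%
  group_by(category) %>%
  summarize(
    n_rows = n(),             # number of rows in each group
    total_value = sum(value)  # sum of value within each group
  ) %>%
  arrange(desc(total_value))  # largest total first
with_counts
```

Because `summarize()` returns an ordinary tibble, the result can be piped into further verbs such as `arrange()` or `filter()`.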

Using Multiple Aggregations

You can perform multiple aggregations in a single `summarize()` call. Each named argument becomes one column of the result, so several statistics are computed for every group at once.

# Aggregate data with multiple statistics
aggregated_data_multi <- data %>%
  group_by(category) %>%
  summarize(
    mean_value = mean(value),
    median_value = median(value),
    max_value = max(value),
    min_value = min(value)
  )
aggregated_data_multi

Output:

# A tibble: 2 × 5
  category mean_value median_value max_value min_value
  <chr>         <dbl>        <dbl>     <dbl>     <dbl>
1 A              11.7           10        15        10
2 B              25             25        30        20
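
When the same set of statistics is needed for one or more columns, `across()` can generate the summary columns instead of typing each one by hand. This is a sketch that assumes dplyr 1.0.0 or later, where `across()` is available; by default the output columns are named `<column>_<function>`. The name `stats_across` is arbitrary.

```r
library(dplyr)

# Same sample data as above
data <- data.frame(
  category = c("A", "B", "A", "B", "A", "B"),
  value = c(10, 20, 15, 25, 10, 30)
)

stats_across <- data %>%
  group_by(category) %>%
  summarize(
    # Apply each function in the named list to the value column
    across(value, list(mean = mean, median = median, max = max, min = min))
  )

# Column names follow the pattern <column>_<function>
names(stats_across)
stats_across
```

The first argument of `across()` also accepts several columns, for example `c(value, other_column)`, in which case every function is applied to every listed column.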

Aggregating with Multiple Grouping Variables

To aggregate data by multiple grouping variables, include all of them in `group_by()`. Each distinct combination of values then gets its own summary row, which allows for more detailed summaries.

# Sample data frame with additional grouping variable
data_multi <- data.frame(
  category = c("A", "B", "A", "B", "A", "B"),
  subcategory = c("X", "X", "Y", "Y", "X", "Y"),
  value = c(10, 20, 15, 25, 10, 30)
)

# Aggregate data by category and subcategory
aggregated_data_multi_group <- data_multi %>%
  group_by(category, subcategory) %>%
  summarize(
    mean_value = mean(value),
    total_value = sum(value),
    .groups = "drop"  # return an ungrouped tibble
  )
aggregated_data_multi_group

Output:

# A tibble: 4 × 4
  category subcategory mean_value total_value
  <chr>    <chr>            <dbl>       <dbl>
1 A        X                 10            20
2 A        Y                 15            15
3 B        X                 20            20
4 B        Y                 27.5          55
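
With more than one grouping variable, `summarize()` by default drops only the last grouping level and leaves the result grouped by the remaining ones. The `.groups` argument controls this. The sketch below compares `"drop_last"` (the default behavior in this situation) with `"drop"`, using `group_vars()` to inspect the grouping; it assumes dplyr 1.0.0 or later, and the names `kept` and `dropped` are arbitrary.

```r
library(dplyr)

# Same sample data as above
data_multi <- data.frame(
  category = c("A", "B", "A", "B", "A", "B"),
  subcategory = c("X", "X", "Y", "Y", "X", "Y"),
  value = c(10, 20, 15, 25, 10, 30)
)

# Drop only the last grouping variable (subcategory)
kept <- data_multi %>%
  group_by(category, subcategory) %>%
  summarize(total_value = sum(value), .groups = "drop_last")

# Drop all grouping and return a plain tibble
dropped <- data_multi %>%
  group_by(category, subcategory) %>%
  summarize(total_value = sum(value), .groups = "drop")

group_vars(kept)     # grouping variables still attached to the result
group_vars(dropped)  # no grouping variables left
```

Leftover grouping matters because later verbs such as `mutate()` or another `summarize()` operate per group, so dropping groups explicitly avoids surprises further down a pipeline.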

Example: Filtering and Arranging Data

Here’s an example of how to use dplyr to filter and arrange data. We will use a sample dataset to demonstrate these operations.

# Sample data frame
data <- data.frame(
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(23, 35, 29, 40, 31),
  salary = c(50000, 60000, 55000, 70000, 62000)
)

# Load dplyr package
library(dplyr)

# Filter and arrange the data
result <- data %>%
  filter(age > 30) %>%   # Filter to include only ages greater than 30
  arrange(desc(salary))  # Arrange in descending order of salary

result

Output:

     name age salary
1   David  40  70000
2     Eve  31  62000
3     Bob  35  60000

In this example, `filter()` keeps the rows where `age` is greater than 30, and `arrange(desc(salary))` sorts those rows from highest to lowest salary, so the top earners among people over 30 appear first.
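
Filtering also combines naturally with aggregation: filter first, then summarize what remains. As a sketch on the same sample data, the snippet below computes the head count and average salary of the people over 30. The names `over_30_summary`, `n_people`, and `mean_salary` are arbitrary.

```r
library(dplyr)

# Same sample data as above
data <- data.frame(
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(23, 35, 29, 40, 31),
  salary = c(50000, 60000, 55000, 70000, 62000)
)

over_30_summary <- data %>%
  filter(age > 30) %>%           # keep Bob, David, and Eve
  summarize(
    n_people = n(),              # number of remaining rows
    mean_salary = mean(salary)   # average salary of those rows
  )
over_30_summary
```

Without a preceding `group_by()`, `summarize()` treats the whole data frame as a single group and returns one row.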

Copyright 2025 Happiom. All Rights Reserved.