Use Filter in Dplyr Conditional on an If Statement in R

You could put the if statement in a braced pipeline step, or inside filter() itself:

library(dplyr)
y <- ""
data.frame(x = 1:5) %>% 
  {if (y=="") filter(., x>3) else filter(., x<3)} %>% 
  tail(1)

data.frame(x = 1:5) %>% 
  filter(if (y == "") x > 3 else x < 3) %>%
  tail(1)

or even store the tail of your pipe as a reusable function, in the vein of

mypipe <- . %>% tail(1) %>% print
data.frame(x = 1:5) %>% mypipe
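
To exercise both branches of the conditional, the filter can be wrapped in a small function. This is only a sketch: filter_by_flag is a hypothetical helper name, not something from the answer above, and it assumes the data has no column called y.

```r
library(dplyr)

df <- data.frame(x = 1:5)

# Hypothetical helper (not from the answer): the scalar `if` picks one
# condition, and filter() then applies that condition to every row.
filter_by_flag <- function(data, y) {
  data %>% filter(if (y == "") x > 3 else x < 3)
}

res_empty <- filter_by_flag(df, "")   # y is empty: rows with x > 3
res_other <- filter_by_flag(df, "a")  # y is non-empty: rows with x < 3
```

Because the if is evaluated once on a length-one condition, each call ends up as a plain filter on either x > 3 or x < 3.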

R: IF statement in dplyr::filter requires ELSE otherwise fails?

We can return TRUE in the else branch. A single TRUE is recycled across all rows, so when the condition is FALSE every row is kept, regardless of the values in the column being tested.

library(dplyr)
a <- NA
mtcars %>% filter(if(!is.na(a)) cyl == a else TRUE)

To answer your question: yes, the if needs an else part. Without it, if returns NULL when its condition is FALSE, and filter fails on that NULL. See this example:

num <- 2
a <- if(num > 1) 'yes'
a
#[1] "yes"
a <- if(num > 3) 'yes'
a
#NULL

Hence when you use

a <- NA
mtcars %>% filter(if(!is.na(a)) cyl == a)

What actually happens is

mtcars %>% filter(NULL)

which returns the same error message.
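
A minimal sketch of the safe pattern, assuming the built-in mtcars data; filter_cyl is a hypothetical helper name used only for illustration.

```r
library(dplyr)

# Hypothetical helper: filter mtcars by cyl only when `a` is supplied.
filter_cyl <- function(a) {
  mtcars %>% filter(if (!is.na(a)) cyl == a else TRUE)
}

all_rows  <- filter_cyl(NA)  # a is NA: the else branch returns TRUE, every row is kept
four_cyl  <- filter_cyl(4)   # a is 4: only rows with cyl == 4 are kept

# An `if` without `else` evaluates to NULL when its condition is FALSE
no_else <- if (FALSE) "yes"
is.null(no_else)
```

The else TRUE branch is what turns the "no filter requested" case into a no-op instead of a NULL condition.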

How to filter a grouped dataframe with a conditional statement using dplyr?

To count the number of unique values in each group, use n_distinct() and filter the rows based on that count.

library(dplyr)

df %>%
  group_by(country, year) %>%
  filter(if(n_distinct(version) == 2) version == 'versionA' else TRUE)


#  country   year version 
#  <fct>    <dbl> <fct>   
#1 country1  2011 versionA
#2 country2  2011 versionA
#3 country3  2011 versionB
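
The question's df is not shown here, so the following sketch builds a small hypothetical data frame with the same column names to make the grouped filter reproducible. Within each group the if sees only that group's rows, so n_distinct(version) is computed per country and year.

```r
library(dplyr)

# Hypothetical data: country1 has both versions, the other countries only one.
df <- data.frame(
  country = c("country1", "country1", "country2", "country3"),
  year    = 2011,
  version = c("versionA", "versionB", "versionA", "versionB"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(country, year) %>%
  # Groups with two versions keep only versionA; other groups keep all rows.
  filter(if (n_distinct(version) == 2) version == "versionA" else TRUE) %>%
  ungroup()
```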

Conditional filtering based on grouped data in R using dplyr

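Here's another method that selects rows directly with arithmetic rather than %in%. sign((group < 3) - 0.5) is 1 for groups below 3 and -1 otherwise, so the filter keeps positive col values in groups 1 and 2 and negative col values in the remaining groups.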

data %>% filter(col * sign((group < 3) - 0.5) > 0)
#> # A tibble: 76 x 3
#>    group  year    col
#>    <int> <int>  <dbl>
#>  1     2  1985  2.20 
#>  2     3  1986 -0.205
#>  3     4  1991 -2.10 
#>  4     3  1994 -0.113
#>  5     2  1997  1.90 
#>  6     1  2000  1.37 
#>  7     3  2002 -0.805
#>  8     4  2003 -0.535
#>  9     1  2004  0.792
#> 10     3  2006 -1.28 
#> # ... with 66 more rows
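
As a quick check, the arithmetic filter can be compared with an explicit logical condition. The data below is hypothetical and much smaller than the tibble in the output above; only the column names group and col are taken from it.

```r
library(dplyr)

# Hypothetical data with the same column names as the example above.
data <- data.frame(
  group = c(1, 2, 3, 4, 1, 3),
  col   = c(1.5, -0.2, -0.7, 0.4, -1.1, 2.0)
)

# (group < 3) - 0.5 is 0.5 for groups 1-2 and -0.5 otherwise,
# so sign() yields +1 or -1 and flips the sign of col for groups 3 and up.
by_math <- data %>% filter(col * sign((group < 3) - 0.5) > 0)

# The same selection written as plain logical conditions.
by_logic <- data %>% filter((group < 3 & col > 0) | (group >= 3 & col < 0))
```

The logical form is longer but arguably easier to read; the arithmetic form is mainly a compact trick.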

R filter rows such that one column is conditional on two other columns

Group by id and keep only the groups in which at least one row has n1 == 1 and at least one row has n2 == 1:

df %>%
  group_by(id) %>%
  filter(any(n1 == 1), any(n2 == 1))

# A tibble: 6 x 3
# Groups:   id [3]
  id        n1    n2
  <chr>  <dbl> <dbl>
1 firm a     1     0
2 firm b     1     0
3 firm e     1     0
4 firm a     0     1
5 firm e     0     1
6 firm b     0     1
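
The input df is not shown, so here is a sketch on hypothetical rows that reuses the column names id, n1 and n2. Since any() collapses each group to a single TRUE or FALSE, a group is kept or dropped as a whole.

```r
library(dplyr)

# Hypothetical data: "firm c" never has n2 == 1, so it should be dropped.
df <- data.frame(
  id = c("firm a", "firm b", "firm c", "firm a", "firm b", "firm c"),
  n1 = c(1, 1, 1, 0, 0, 0),
  n2 = c(0, 0, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(id) %>%
  # Both conditions are length-one per group and are combined with AND.
  filter(any(n1 == 1), any(n2 == 1)) %>%
  ungroup()
```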

Conditional filtering using tidyverse

As @docendo-discimus pointed out in the comments, the following solutions work. I also used rlang::has_name(., "a") in place of "a" %in% names(.).

This Q&A contains the original idea: Conditionally apply pipeline step depending on external value.

df1 %>% 
   filter(if (rlang::has_name(., "a")) a == 1 else TRUE)
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>% 
   filter(if (rlang::has_name(., "a")) a == 1 else TRUE)
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b

Or alternatively, by using {}:

df1 %>%
  {if (rlang::has_name(., "a")) filter(., a == 1L) else .}
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>%
  {if (rlang::has_name(., "a")) filter(., a == 1L) else .}
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b
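
The same idea can be written as an ordinary function, which avoids relying on the pipe's dot. This is a sketch only: df1 and df2 are reconstructed guesses that match the printed output above, and filter_if_has_a is a hypothetical helper name.

```r
library(dplyr)

# Hypothetical reconstructions of the two inputs.
df1 <- tibble(a = c(1L, 2L, 1L, 2L), b = c("a", "a", "b", "b"))
df2 <- tibble(b = c("a", "a", "b", "b"))

# Hypothetical helper: filter on `a` only when that column exists.
filter_if_has_a <- function(data) {
  if ("a" %in% names(data)) filter(data, a == 1L) else data
}

res1 <- filter_if_has_a(df1)  # column a exists: rows with a == 1 are kept
res2 <- filter_if_has_a(df2)  # no column a: data is returned unchanged
```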

if else with filter R

Your attempt was very close, but there appear to be some syntax issues. This should solve your problem:

library(tidyverse)

df1 <- data.frame(
  sample_id = c('SB024', '3666-01', '3666-01', '3666-02'), 
  FAO = c(100,50,3,5)
)

df1 %>%
  filter(ifelse(str_detect(sample_id, "3666"), FAO >=4, FAO >20))
#>   sample_id FAO
#> 1     SB024 100
#> 2   3666-01  50
#> 3   3666-02   5

df1 %>%
  filter(ifelse(str_detect(sample_id, "XXXX"), FAO >=4, FAO >20))
#>   sample_id FAO
#> 1     SB024 100
#> 2   3666-01  50

Created on 2021-11-05 by the reprex package (v2.0.1)
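
Unlike if, ifelse() is vectorised: the test is evaluated for every row and each row gets its own threshold. The sketch below makes that explicit and is not part of the reprex above; it swaps in base R's grepl() for stringr::str_detect() so that only dplyr is needed.

```r
library(dplyr)

df1 <- data.frame(
  sample_id = c("SB024", "3666-01", "3666-01", "3666-02"),
  FAO = c(100, 50, 3, 5),
  stringsAsFactors = FALSE
)

# Row-wise test: TRUE where sample_id contains "3666".
is_3666 <- grepl("3666", df1$sample_id, fixed = TRUE)

# ifelse() returns one logical per row: FAO >= 4 for 3666 samples, FAO > 20 otherwise.
keep <- ifelse(is_3666, df1$FAO >= 4, df1$FAO > 20)

res_ifelse <- df1 %>%
  filter(ifelse(grepl("3666", sample_id, fixed = TRUE), FAO >= 4, FAO > 20))

# The same selection written with plain logical operators.
res_logic <- df1 %>%
  filter((grepl("3666", sample_id, fixed = TRUE) & FAO >= 4) |
         (!grepl("3666", sample_id, fixed = TRUE) & FAO > 20))
```

A plain if would not work here, because its condition must be a single TRUE or FALSE rather than one value per row.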
