
The exact number in the middle of the data series in R

Here is how to locate the number in the middle of a data series in R. If there is an odd number of values, the median gives you that number directly. If there is an even number of values, the median returns the average of the two middle numbers. In other words, it will not be an exact number from the data. The task might sound simple at first, but, like rounding in R, it has a few subtleties.

 

Median in R

The median is the middle value of an ordered data series. To calculate it, there is the base R function median. If there is an odd number of values, the result is the exact number in the middle of the ordered data series.

thislist <- c(1, 11, 12, 17, 22)

median(thislist)

#[1] 12

Here is another example with a vector that contains an even number of values. In this case, the function median cannot return an exact number from the data. Instead, it averages the two middle values.

thislist <- c(1, 11, 12, 17, 22, 25)

median(thislist)

#[1] 14.5
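To see where 14.5 comes from, here is a minimal sketch that reproduces the calculation by hand for the same vector. The names sorted_values, n, and middle_pair are only illustrative.

```r
thislist <- c(1, 11, 12, 17, 22, 25)

sorted_values <- sort(thislist)
n <- length(sorted_values)

# With an even n, the two middle positions are n / 2 and n / 2 + 1
middle_pair <- sorted_values[c(n / 2, n / 2 + 1)]
middle_pair
#[1] 12 17

# median() returns the average of that pair
mean(middle_pair)
#[1] 14.5
```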

Apart from that, the median works well and is sometimes a better choice than the mean, for example when the data contain outliers. If there are NA values in your data, you can tell median to ignore them with na.rm, just as with mean, weighted mean, min, max, and other calculations.

thislist <- c(1, 11, 12, NA, 22, 25)

median(thislist, na.rm = TRUE)

#[1] 12
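For comparison, this short sketch shows what happens when na.rm is left at its default. It uses the same vector as above.

```r
thislist <- c(1, 11, 12, NA, 22, 25)

# Without na.rm, a single NA makes the result NA
median(thislist)
#[1] NA

# Dropping the NA first leaves five values, so the middle one is returned
median(thislist, na.rm = TRUE)
#[1] 12
```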

If you want to calculate the row-wise median, you can do that in the same way as with the row-wise min or max.
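One possible way to do this in base R is apply. The sketch below is an assumption about how it could look, not the only approach. The data frame and the column names a, b, c, and row_median are made up for illustration.

```r
# Hypothetical data frame with three numeric columns
df <- data.frame(a = c(1, 4, 7),
                 b = c(11, 5, 8),
                 c = c(12, 6, NA))

# apply() with MARGIN = 1 calls median() once per row;
# na.rm = TRUE is passed on to median()
df$row_median <- apply(df[, c("a", "b", "c")], 1, median, na.rm = TRUE)

# The row medians are 11, 5 and 7.5
df$row_median
```

The last row has only two non-missing values, so its median is again an average of the two.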

 

The exact number in the middle of the data series in R

As I mentioned earlier, if you have an even number of values, the median cannot give you the exact number in the middle of the data series.
If you want a number without averaging the two middle values, you have to decide which of those two it should be.

If you want the first of the two middle values, you can get that with a calculation like this.

thislist <- c(1, 11, 12, 17, 22, 25)

sort(thislist)[ceiling(length(thislist) / 2)]

#[1] 12

If you want the second of the two middle values, you can get that with a calculation like this.

thislist <- c(1, 11, 12, 17, 22, 25)

sort(thislist)[floor(length(thislist) / 2) + 1]

#[1] 17

Whichever one you choose, it also works correctly with an odd number of values. In that case, the result is the same as the one you get with the median.
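The two calculations can be wrapped into small helper functions and checked against both cases. This is a minimal sketch. The function names lower_middle and upper_middle are hypothetical, and the upper position is computed as floor(n / 2) + 1, which picks the second of the two middle values.

```r
# Hypothetical helpers: return an actual element of x, never an average
lower_middle <- function(x) {
  x <- sort(x)  # sort() also drops NA values by default
  x[ceiling(length(x) / 2)]
}

upper_middle <- function(x) {
  x <- sort(x)
  x[floor(length(x) / 2) + 1]
}

even_values <- c(1, 11, 12, 17, 22, 25)
odd_values  <- c(1, 11, 12, 17, 22)

lower_middle(even_values)
#[1] 12
upper_middle(even_values)
#[1] 17

# With an odd number of values both helpers agree with median()
lower_middle(odd_values) == median(odd_values)
#[1] TRUE
upper_middle(odd_values) == median(odd_values)
#[1] TRUE

# Cross-check with base R: quantile() type 1 returns an observed value
quantile(even_values, probs = 0.5, type = 1, names = FALSE)
#[1] 12
```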

You can also get the middle number by using dplyr.

df <- data.frame(thislist = c(1, 11, 12, 17, 22, 25)) 

library(dplyr) 

df %>%
  # sort by value first; arrange() without a column does not reorder rows
  arrange(thislist) %>%
  filter(row_number() == ceiling(n() / 2))

#  thislist
# 1       12

 

Filter the middle row in R

It might also be useful to locate the middle row of a data frame by using row numbers, regardless of the values in the column. That is a lot easier because row numbers always increase by 1.

In the same way as filtering the last or first row, you can use row numbers to find and filter the middle row in R.

df <- data.frame(thislist = c(17, 11, 22, 1, 12, 25)) 

library(dplyr) 

df %>%
  filter(row_number() == ceiling(n() / 2))

#  thislist
#1       22

Remember to arrange the data frame first if the order of the values is important.
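As a sketch of that point, here is the same unsorted data frame, sorted by value before filtering. The name middle_row is only illustrative.

```r
library(dplyr)

df <- data.frame(thislist = c(17, 11, 22, 1, 12, 25))

# Sort by value first, then keep the row at the lower middle position
middle_row <- df %>%
  arrange(thislist) %>%
  filter(row_number() == ceiling(n() / 2))

middle_row

#  thislist
#1       12
```

Without the arrange step, the same filter returns 22, which is the middle row but not the middle value.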




