I have the data.frame below. I want to add a column that classifies my data according to column 1 (h_no) in such a way that the first series of h_no 1,2,3,4 is class 1, the second series of h_no (1 to 7) is class 2, etc., as indicated in the last column.
Susanne Dreisigacker
8 Answers
You can add a column to your data using various techniques. The quotes below come from the 'Details' section of the relevant help text, [[.data.frame.

Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list.

The data.frame method for $, treats x as a list.

When [ and [[ are used with two indices (x[i, j] and x[[i, j]]) they act like indexing a matrix.

So if you don't specify whether you're working with columns or rows, the data.frame method assumes you mean columns. For your example, this should work:
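A minimal sketch of how these indexing modes can be used to read and add columns, assuming a made-up data frame df (only the h_no column name comes from the question; the values and the names new_a, new_b, and new_c are hypothetical):

```r
# Hypothetical data: only the h_no column name mirrors the question.
df <- data.frame(h_no = c(1:4, 1:7, 1:3), h_freq = (1:14) / 10)

# List-style access: a single index refers to columns.
col_a <- df[["h_no"]]   # plain vector
col_b <- df$h_no        # the same vector via $
sub   <- df["h_no"]     # a one-column data frame

# Matrix-style access: two indices refer to rows and columns.
first_rows <- df[1:4, "h_no"]

# Assigning to a column name that does not exist yet adds that column.
df$new_a      <- seq_len(nrow(df))   # list-style via $
df[["new_b"]] <- df$h_no * 2         # list-style via [[
df[, "new_c"] <- df$h_no > 3         # matrix-style via [ , ]
```

All three assignment forms append the new column at the end of the data frame.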
Samuel Spencer
Roman Luštrik
Easily: Your data frame is A
Then you get the column b.
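One way this could look, as a sketch that assumes every series in the first column starts at 1 (the data frame contents below are made up, and the final assignment to a column named class is an assumption):

```r
# Hypothetical stand-in for the question's data frame.
A <- data.frame(h_no = c(1:4, 1:7, 1:3))

b <- A[, 1]      # first column (h_no)
b <- b == 1      # TRUE wherever a new series starts
b <- cumsum(b)   # running count of series starts, i.e. the class number

A$class <- b     # attach the result as a new column
```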
A5C1D2H2I1M1N2O1R2T1
user1333396
If I understand the question correctly, you want to detect when the h_no doesn't increase and then increment the class. (I'm going to walk through how I solved this problem; there is a self-contained function at the end.)

Working
We only care about the h_no column for the moment, so we can extract that from the data frame.

We want to detect when h_no doesn't go up, which we can do by working out when the difference between successive elements is either negative or zero. R provides the diff function, which gives us the vector of differences.

Once we have that, it is a simple matter to find the ones that are non-positive:
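These three steps can be sketched as follows. The data frame dat and its values are made up for illustration; nonpos is the name used in the text, while h_no_diff is a hypothetical helper name.

```r
# Hypothetical data with three series: 1..4, 1..7, 1..3.
dat <- data.frame(h_no = c(1:4, 1:7, 1:3))

# Step 1: extract the column of interest.
h_no <- dat$h_no

# Step 2: differences between successive elements (one shorter than h_no).
h_no_diff <- diff(h_no)   # 1 1 1 -3 1 1 1 1 1 1 -6 1 1

# Step 3: TRUE wherever the value failed to increase.
nonpos <- h_no_diff <= 0
```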
In R, TRUE and FALSE are basically the same as 1 and 0, so if we get the cumulative sum of nonpos, it will increase by 1 in (almost) the appropriate spots. The cumsum function (which is basically the opposite of diff) can do this.

But there are two problems: the numbers are one too small, and we are missing the first element (there should be four in the first class).
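A small sketch of this step, assuming a made-up h_no vector (partial is a hypothetical name), makes both problems visible:

```r
# Hypothetical h_no values: series 1..4, 1..7, 1..3.
h_no <- c(1:4, 1:7, 1:3)
nonpos <- diff(h_no) <= 0

# Logical values count as 0/1 when summed.
n_true <- sum(c(TRUE, FALSE, TRUE))   # 2

# Cumulative sum steps up by 1 at each non-increase.
partial <- cumsum(nonpos)   # 0 0 0 1 1 1 1 1 1 1 2 2 2

# Problem 1: classes start at 0 instead of 1.
# Problem 2: only 13 values for 14 rows, so the first class has three entries, not four.
n_missing <- length(h_no) - length(partial)   # 1
```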
The first problem is simply solved: 1+cumsum(nonpos). And the second just requires adding a 1 to the front of the vector, since the first element is always in class 1.

Now we can attach it back onto our data frame with cbind (by using the class= syntax, we can give the column the class heading).

And data_w_classes now contains the result.

Final result
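Before compressing anything, the working steps can be collected in one place. This is a sketch under assumptions: dat and its values are made up, and classes is a hypothetical name; nonpos and data_w_classes follow the text.

```r
# Hypothetical data: only the h_no column name mirrors the question.
dat <- data.frame(h_no = c(1:4, 1:7, 1:3), h_freq = (1:14) / 10)

# TRUE wherever h_no fails to increase.
nonpos <- diff(dat$h_no) <= 0

# Shift classes up by one and prepend 1 for the first row.
classes <- c(1, 1 + cumsum(nonpos))

# Attach as a new column named "class".
data_w_classes <- cbind(dat, class = classes)
```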
We can compress the lines together and wrap it all up into a function to make it easier to use:
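A possible one-line version, sketched here with the illustrative function name classify (the name and the test data are assumptions, not taken from the answer):

```r
# Illustrative helper: appends a numeric class column based on h_no.
classify <- function(data) {
  cbind(data, class = c(1, 1 + cumsum(diff(data$h_no) <= 0)))
}

# Quick check on made-up data.
dat <- data.frame(h_no = c(1:4, 1:7, 1:3))
classified <- classify(dat)
```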
Or, since it makes sense for the class to be a factor, there is a variant that returns one. You use either function like:
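As a sketch of the factor variant and its use, assuming the hypothetical name classify_fac and made-up input data:

```r
# Illustrative helper: same logic, but the class column is a factor.
classify_fac <- function(data) {
  cbind(data, class = factor(c(1, 1 + cumsum(diff(data$h_no) <= 0))))
}

# Usage: pass the data frame and keep the returned copy.
dat <- data.frame(h_no = c(1:4, 1:7, 1:3))
result <- classify_fac(dat)
class_counts <- table(result$class)   # rows per class
```

A numeric version would be called the same way; only the type of the class column differs.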
(This method of solving the problem is good because it avoids explicit iteration, which is generally recommended in R, and avoids generating lots of intermediate vectors, lists, etc. And also it's kinda neat how it can be written on one line :) )
huon
In addition to Roman's answer, something like this might be even simpler. Note that I haven't tested it because I do not have access to R right now.
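A minimal sketch of such an approach, assuming a data frame named df with an h_no column in which every series starts at 1 (the data and the column name class are illustrative):

```r
# Hypothetical data: series 1..4, 1..7, 1..3.
df <- data.frame(h_no = c(1:4, 1:7, 1:3))

# Counter kept outside the function; <<- updates it from inside.
index <- 0
df$class <- sapply(df$h_no, function(x) {
  if (x == 1) index <<- index + 1   # a 1 marks the start of a new series
  index                             # class of the current value
})
```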
The function iterates over the values in h_no and always returns the category that the current value belongs to. If a value of 1 is detected, we increase the global variable index and continue.
Paul Hiemstra
user2759975
Approach based on identifying the number of groups (x in mapply) and the length of each group (y in mapply).
Ferroao
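The mapply-based approach just described could be sketched like this, assuming each group starts where h_no equals 1 (the data and the names starts and lens are hypothetical):

```r
# Hypothetical data: series 1..4, 1..7, 1..3.
dat <- data.frame(h_no = c(1:4, 1:7, 1:3))

# Row positions where a new group begins.
starts <- which(dat$h_no == 1)            # 1 5 12

# Length of each group: gap to the next start (or to the end of the data).
lens <- diff(c(starts, nrow(dat) + 1))    # 4 7 3

# Repeat group number x exactly y times, then flatten to one vector.
dat$class <- unlist(mapply(function(x, y) rep(x, y),
                           x = seq_along(starts),
                           y = lens,
                           SIMPLIFY = FALSE))
```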
I believe that using 'cbind' is the simplest way to add a column to a data frame in R. Below is an example:
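A short sketch with made-up data (the names df, new_column, and df_new are hypothetical, and the class values assume each series starts at 1):

```r
# Hypothetical data frame and a vector to attach to it.
df <- data.frame(h_no = c(1:4, 1:7, 1:3))
new_column <- cumsum(df$h_no == 1)

# cbind returns a new data frame with the extra column appended;
# the argument name becomes the column name.
df_new <- cbind(df, class = new_column)
```

Note that cbind does not modify df in place, so the result has to be assigned.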
Marco
Emanuele Catania
You can first add an empty column to your data.frame, then specify your conditions for the new column:
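A sketch of this idea, assuming the conditions are expressed on row positions (the data, the helper row, and the chosen row ranges are all made up for illustration):

```r
# Hypothetical data: series 1..4, 1..7, 1..3.
df <- data.frame(h_no = c(1:4, 1:7, 1:3))

# Step 1: add an empty (all NA) column.
df$class <- NA

# Step 2: fill it in wherever a condition holds.
row <- seq_len(nrow(df))
df$class[row <= 4] <- 1
df$class[row >= 5 & row <= 11] <- 2
df$class[row >= 12] <- 3
```

This requires knowing the boundaries of each series in advance, so it suits small data sets with a handful of conditions.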
Seyma Kalay