# How to match patient data for conditional logistic regression in R?

I have a dataset as shown below:

```
patient_id   pre.int.outcome post.int.outcome
302949               1            1
993564               0            1
993570               1            1
993575               0            1
993792               1            0
```

I want to run a conditional logistic regression (`clogit`) comparing each patient's pre- and post-intervention outcomes.

I understand that I need to get the data into this form:

```
strata            outcome
1                 1
1                 1
2                 0
2                 0
3                 0
3                 1
```

In this form, each stratum is one patient, with one row for each of that patient's two outcomes, but I'm not sure how to get there. Can anyone help, or point me to a source that explains it?

Edit: in the end, I used the `reshape` function to make the dataset "long" rather than "wide":

```
ds1 <- reshape(ds,
               varying = c('pre.int.outcome', 'post.int.outcome'),
               v.names = 'outcome',
               timevar = 'before_after',
               times = c(0, 1),
               direction = 'long')
```

I then sorted by `patient_id`, which I use as my "strata":

```
ds1 <- ds1[order(ds1$patient_id), ]
```
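With the data in long form, the model fit itself might look like the sketch below. It assumes the `survival` package (which provides `clogit` and `strata`) is installed, and it rebuilds `ds` from the five example rows above so that it runs on its own. The object name `fit` is only illustrative.

```r
library(survival)  # provides clogit() and strata()

# Example rows from the question (wide format: one row per patient)
ds <- data.frame(
  patient_id       = c(302949, 993564, 993570, 993575, 993792),
  pre.int.outcome  = c(1, 0, 1, 0, 1),
  post.int.outcome = c(1, 1, 1, 1, 0)
)

# Wide -> long: one row per patient per period (0 = pre, 1 = post)
ds1 <- reshape(ds,
               varying   = c("pre.int.outcome", "post.int.outcome"),
               v.names   = "outcome",
               timevar   = "before_after",
               times     = c(0, 1),
               direction = "long")
ds1 <- ds1[order(ds1$patient_id), ]

# Each patient is a stratum; before_after is the within-pair predictor
fit <- clogit(outcome ~ before_after + strata(patient_id), data = ds1)
summary(fit)
```

Patients whose outcome is the same before and after contribute nothing to a conditional fit, so only the discordant pairs drive the estimate. With five rows the result is of course very imprecise.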

Maybe this helps:

```
data.frame(strata = rep(1:nrow(df1), each = 2), outcome = c(t(df1[2:3])))
```
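To see what this one-liner produces, here is a small sketch applied to the five rows from the question. The `before_after` column is an assumption added for illustration (it is not part of the answer above). A model comparing pre and post needs some such indicator of which row in each pair is which.

```r
# Example data from the question (wide format: one row per patient)
df1 <- data.frame(
  patient_id       = c(302949, 993564, 993570, 993575, 993792),
  pre.int.outcome  = c(1, 0, 1, 0, 1),
  post.int.outcome = c(1, 1, 1, 1, 0)
)

# t() turns the two outcome columns into a 2 x n matrix; c() then reads it
# column by column, so each patient's pre and post values end up adjacent
long <- data.frame(
  strata  = rep(seq_len(nrow(df1)), each = 2),
  outcome = c(t(df1[2:3]))
)

# Assumed extra column (not in the original answer): 0 = pre, 1 = post
long$before_after <- rep(c(0, 1), times = nrow(df1))

long
```

Because the strata are just row indices, `patient_id` never appears in the long table.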

Based on akrun's comments and answer, here is a solution using `melt` from the `reshape2` package:

```
library(reshape2)

# I created dummy data to make sure my answer works.
# I assumed 4 intervention treatments, but this would also work with
# two treatments. With the dummy data, just make sure nObs/4 is an integer.
nObs = 100 # number of observations
# Note: data.frame() recycles the length-4 columns to nObs rows
d = data.frame(patient_id = 1:4,
               pre.int.outcome = rbinom(4, 1, 0.7),
               post.int.outcome = rbinom(4, 1, 0.5),
               intervention = rep(c("a", "b", "c", "d"), each = nObs/4))

# Melt the data as suggested by akrun
d2 = melt(d, id.vars = c("patient_id", "intervention"))

# Create a strata variable with paste
d2$strata = as.factor(paste(d2$patient_id, d2$variable))
# Relabel the levels as integers so that patient_id is no longer visible;
# useful if you are concerned about protecting PII
levels(d2$strata) = seq_len(nlevels(d2$strata))

# Last, clean up the data and create a third, "pretty" data.frame
d3 = d2[ , c("intervention", "value", "strata")]
#   intervention value strata
# 1            a     1      2
# 2            a     1      4
# 3            a     1      6
# 4            a     1      8
# 5            a     1      2
# 6            a     1      4

# I also throw in the logistic regression
myGLM = glm(value ~ intervention, data = d3, family = 'binomial')
summary(myGLM)
# prints lots of output to the screen ...

# or, if you need odds ratios
myGLM2 = glm(value ~ intervention - 1, data = d3, family = 'binomial')
exp(myGLM2$coef)
exp(confint(myGLM2))
# also prints lots of output to the screen ...
```

Edit: I added `intervention` based on comments from the OP. I also added `glm` to help her or him.
