
Converting a 4-dimensional wide table into long format

In my last article about converting a wide table into a long table using reshape’s melt function (I recommend reading it first), I promised to cover the 4-dimensional case soon. Here you go. I originally ran into this problem when exploring the official statistics on causes of death in Germany. The problem is that the pivoting tools of spreadsheet programs like Excel or Calc cannot be applied to a (wide) cross table. Other tools, such as reshape’s cast function, also expect the data in long format.

We will again use the melt function of R’s reshape package. I am not sure that the solution I chose is the most elegant way to do this. It also depends on the wide-formatted dimension having only a few distinct values, like gender in this case. If the wide dimension has too many values, one could have the script loop through them. If somebody knows a nifty one-step solution with melt, I’d be happy to hear about it.

As before, we first need to name the first two columns, which are already long, by entering the names in the second row right above the data, let’s say “year” and “type”. After exporting the sheet as CSV to wide.csv we end up with this (note that the first two fields of the first row are empty):

,,m,m,f,f
year,type,de,en,de,en
2010,A,5,5,1,12
[...]

And the R command sequence or script could look like this:

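library(reshape)  # provides melt() with the variable_name argument

data <- read.table('wide.csv',
    header=TRUE, sep=",", quote='"',
    skip=1,            # skip the first line of the CSV file (the gender row)
    check.names=FALSE  # keep the duplicate column names de, en, de, en
  )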

# melt as usual, but only the first four columns, which hold
# the male data. 'year' and 'type' are already long; the
# melted column names (de, en) end up in a field named 'nation'.

data_m <- melt(data[,1:4], 
    id=c("year","type"), 
    variable_name="nation"
  )

# now we add a new column to the now long data frame
# which is the gender

data_m$gender <- c('m')

# same as for the males, but this time we melt everything
# except the columns holding the male data (columns 3 and 4).

data_f<-melt(data[,c(1,2,5,6)], 
    id=c("year","type"), 
    variable_name="nation"
  )

data_f$gender <- c('f')

# so far the new long table is split into two halves. Let's
# stack them on top of each other (rbind appends rows).

data_final <- rbind(data_m, data_f)

# almost done. but so far the gender column is the last one
# after the value column. so we rearrange the columns.

data_final <- data_final[c("year","gender","nation","type","value")]

write.table(data_final, file = "long.csv", sep=",", row.names=F)

And here is what the result file long.csv should contain (shown schematically):

year, gender, nation, type, value
2010, m, de, A, 5
[...]
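
If the wide dimension had more values than just m and f, repeating the melt block by hand for each one would get tedious. The loop mentioned at the beginning could be sketched roughly as follows. This is not the script from above: to stay free of package dependencies it builds each long block with plain base R instead of melt, it reads the sample inline instead of from wide.csv, and the helper name long_block is made up for illustration. Only the single data row shown in this article is used.

```r
# Inline copy of the sample from this article (assumption: read from a
# string here; in practice the same calls would point at wide.csv).
csv_text <- ",,m,m,f,f
year,type,de,en,de,en
2010,A,5,5,1,12"

# First header row: the wide dimension (gender). The fields above the
# already-long columns are empty strings.
genders <- scan(text = csv_text, what = "", sep = ",", nlines = 1, quiet = TRUE)

# Second header row onwards: the table itself, duplicate names kept as-is.
data <- read.csv(text = csv_text, skip = 1, check.names = FALSE)

id_cols <- which(genders == "")  # columns without a gender label: year, type

# Hypothetical helper: turns the columns of one gender into long format.
long_block <- function(g) {
  cols <- which(genders == g)
  data.frame(
    year   = rep(data$year, times = length(cols)),
    type   = rep(data$type, times = length(cols)),
    nation = rep(names(data)[cols], each = nrow(data)),
    value  = unlist(data[cols], use.names = FALSE),
    gender = g,
    stringsAsFactors = FALSE
  )
}

# Loop over every distinct gender label and stack the blocks row-wise.
data_long <- do.call(rbind, lapply(unique(genders[-id_cols]), long_block))

# Same column order as in the script above.
data_long <- data_long[c("year", "gender", "nation", "type", "value")]
print(data_long)
```

For the sample row this yields four long rows, two per gender. The gender labels are taken from the first header line, so an additional value there would be picked up without touching the code.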

And that’s about it. Now we can re-import the data and start pivoting. I am going to explain how pivoting works in LibreOffice Calc in one of the next articles.
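
Before moving to a spreadsheet, the long table can also be checked with a quick pivot in R itself. The following is only a sketch using base R’s xtabs. It assumes the long data is read from an inline string rather than from long.csv, and it contains just the four long rows that follow from the single sample row shown above.

```r
# Inline stand-in for long.csv (assumption: only the rows derived from
# the one sample row of the wide table).
long_text <- "year,gender,nation,type,value
2010,m,de,A,5
2010,m,en,A,5
2010,f,de,A,1
2010,f,en,A,12"

long <- read.csv(text = long_text)

# Pivot: sum of 'value' with gender in rows and nation in columns.
pivot <- xtabs(value ~ gender + nation, data = long)
print(pivot)

# Totals per gender, collapsing the nation dimension.
print(rowSums(pivot))
```

Any other pair of dimensions can be cross-tabulated by changing the right-hand side of the formula, which is exactly what the long format makes possible.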

