Saturday, 18 May 2013

Student choices between SAS and R

I'm going to be writing a couple of posts looking back at a class I've taught that's just coming to an end (at the time of writing this I've got one more group presentation to see).

The class teaches SAS and R in parallel on our MSc course (if it's of interest, all the teaching materials are here).

I'll be blogging about this class as I taught it a bit differently to the usual "Students Listen - Teachers Lecture" style. I'll get back to that more in future posts (although a lot of what I've done is in my PCUTL module 2 portfolio).

The purpose of this post is to briefly discuss two questions from my class test that I feel give some (very shallow) information as to how the students experienced the course. (The class test was made up of 4 questions: q1 - a simple task to be performed in both SAS and R, q2 - a task in SAS, q3 - a task in R and finally q4 - a task in either language.)

The class is taught over 5 weeks:
  1. Week 1: Introduction and basic statistics
  2. Week 2: Data manipulation
  3. Week 3: Programming
  4. Week 4: Extras (for example we take a look at proc optmodel and ggplot2)
  5. Week 5: A 2-hour class test
The first question on the class test (you can see it here) asked the students to rank their enjoyment of each week (the purpose of this question was to give them a nice easy starting point). So a ranking of 1 implied a favorite week while a ranking of 5 implied the least favorite.

This plot shows the mean ranking given to each week:


This data and the following discussion should be taken with no implied rigour: I'm not analysing this too closely and also students might just have written down any sequence of rankings without thinking about things too much (this was after all a test).
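
For anyone curious how a "mean ranking per week" summary like this can be put together, here's a minimal sketch in Python. The rankings in it are made up purely for illustration (three hypothetical students, not my class data), and the function name `mean_rank_per_week` is just something I've picked for the example.

```python
from statistics import mean

# Hypothetical rankings (NOT the real class data): one row per student,
# one entry per week, where 1 = favourite week and 5 = least favourite.
rankings = [
    [2, 1, 3, 4, 5],
    [4, 2, 1, 3, 5],
    [3, 1, 2, 5, 4],
]


def mean_rank_per_week(rows):
    """Return the mean rank of each week (column) across all students."""
    # zip(*rows) turns the per-student rows into per-week columns.
    return [mean(week) for week in zip(*rows)]


mean_ranks = mean_rank_per_week(rankings)
for week, rank in enumerate(mean_ranks, start=1):
    print(f"Week {week}: mean rank {rank:.2f}")
```

A lower mean rank means a week was, on the whole, enjoyed more.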


First of all it does seem that the students enjoyed the class test the least (which I guess is to be expected).

Secondly, it looks like the first week was perhaps less enjoyed in general.

I think this is also to be expected: I taught the class in a way that I don't believe would have been familiar to the students (I tried to encourage them to teach each other and themselves), so perhaps that first week was just a bit too unfamiliar. I'll try to rectify that in future years (if only by pointing students to what students from previous years did).

Now for the second question.



By design I hope that the students learn how to carry out various programming tasks in SAS and R seeing the strengths and weaknesses of each language as they go. The last question of the class test (again here) involved a bit of data manipulation on small data sets and I believe that the main difficulty was that this question did not force a language on the students. In essence choosing the language was the most important point of the question.

My personal approach would have been to use R for this particular question but, interestingly, most students chose SAS. Some used a combination (often starting in SAS before realising that perhaps R was better suited, which led to a bit of a clumsy hybrid). On average SAS was used by 77% of the students for question 4 (some of whom managed the task very well!).

During the group presentations I've been asking students afterwards a "this does not count" question (i.e. making it clear that there's no wrong or right answer to this and that I'm simply interested/curious):
'If you were starting a consultancy company tomorrow but could use only one package, either SAS or R, which would you pick?'
The really pleasing thing is that almost all of the students miss the constraint in my question and immediately reply something like:
"It depends on the kind of consultancy we'll be doing."
When I reiterate the constraint (after telling them that that's the actual right answer :)), I'd say that a majority of students seem to prefer R. Perhaps the bias towards SAS in the class test was due to the conditions (time was short), but overall it's been nice to see that most students realise that it's about finding the right tool for a particular job.

I've yet to see the individual coursework that they'll be handing in this week, which is of a very similar format. I wonder which language they'll have picked for Q4...

EDIT: Here's the next post in this series (looking at choices between SAS and R during teaching presentations).