July 09, 2014

On Python as an Introductory Programming Language

Philip Guo’s post on Python replacing Java in introductory courses on programming has received extensive coverage on all the major geek news sites recently. It’s been on Slashdot, Hacker News, and many less popular sites. Unfortunately, much of the coverage, and to some extent the original post itself, is misleading. For example, here’s Slashdot’s summary of the article:

“Python has surpassed Java as the top language used to introduce U.S. students to programming and computer science, according to a recent survey posted by the Association for Computing Machinery (ACM). Eight of the top 10 computer science departments now use Python to teach coding, as well as 27 of the top 39 schools, indicating that it is the most popular language for teaching introductory computer science courses, according to Philip Guo, a computer science researcher who compiled the survey for ACM.”

This is actually a pretty good summary of what the tech press seems to have taken from the article. Happily, Guo provided the raw data on which he based his post, so we don’t have to accept a summary, or even Guo’s own interpretation of that data.

According to his data, the top 39 universities offer a grand total of 76 classes that introduce students to programming. Of those 76 classes, 27 are based on Python. Yet Guo’s title for the post is “Python is now the most popular introductory teaching language at top U.S. universities,” which seems, at best, debatable. This title implies that there’s some sort of consensus, yet 21 of these 39 universities offer at least two introductory programming classes in at least two programming languages. Here are some alternative titles that would be just as accurate as Philip’s original title:

The technology press misinterpreted and mis-summarized his article as only the Internet seems capable of doing. I strongly encourage anyone who saw an article about Python suddenly displacing Java as the top-university-approved language for computer programming to read Guo’s actual post. Doing so might raise questions about his methodology. For example, he limits his analysis to the top 39 graduate schools in computer science. It’s entirely possible that everything about that limit is wrong. First, the vast majority of people who learn to program aren’t attending a top-39 school. Yes, these schools are important, but are they really representative of the typical computer science education? Are they even representative of the best computer science education? How should we measure such a thing?

Second, these are rankings of graduate schools, but Philip is interested in their introductory undergraduate classes on programming. Is graduate ranking really the best way to determine consensus on how to introduce programming to undergrads? Graduate schools are arguably more interested in research than teaching. Balancing teaching and research is one of the biggest challenges large research universities face. Allegations of favoring research over teaching may cost the University of Texas president his job, possibly as soon as tomorrow. Texas is number nine in the rankings Guo used.

Third, these are computer science schools, but programming is also taught in schools of computer engineering, informatics, library science, and potentially other disciplines. How many introductory programming classes at top universities were left off the list simply because they weren’t taught in a school of computer science? Are courses in, say, a School of Computer Engineering more or less likely to be taught in a particular language? My undergraduate degree is in Computer Engineering from Purdue, which is a top 10 Computer Engineering school and a top 20 Computer Science school. The core curriculum for ECE at Purdue requires three classes that are primarily programming: two based on C and one based on Python.[1]

Choosing a language to use when introducing students to computer programming is an important decision, but it is not at all clear that there’s an obvious choice or an obvious consensus. I should probably also note here that of the languages listed in Guo’s post, Python would be my pick for an introductory language. Of course, this assumes that you have to pick one and only one programming language for an introductory course on programming, which is a fair assumption in my opinion. Guo linked to an article that provides a solid rationale for Python as an introductory language, much of which I agree with. If I opened up the candidates to languages not on Guo’s list, I would pick Ruby over Python, but that may be a story for another time. For now, I just wanted to say that it’s flat-out wrong to claim that “Academia has spoken, and Python wins.” Such a claim simply is not the definitive interpretation of the data Guo collected and reported, but it certainly seems to be the takeaway that otherwise respectable news sites are passing along to their readers.

  1. Of these three required courses, the first in the series has a prerequisite course based on C and taught in the School of Computer Science. I didn’t take that course. I think it was because I was in the Honors Engineering program, which introduced programming using both C and MATLAB. However, this was over a decade ago, and I simply can’t remember for certain.

This post was last updated on July 09, 2014