Wednesday, January 16, 2008

Teach The Hard Stuff

I read the Coding Horror entry on how to teach Computer Science the other day...
Coding Horror: How Should We Teach Computer Science?: "If we aren't teaching fundamental software engineering skills like deployment and source control in college today, we're teaching computer science the wrong way. What good is learning to write code in the abstract if you can't work on that code as a team in a controlled environment, and you can't deploy the resulting software? As so many computer science graduates belatedly figure out after landing their first real programming job, it isn't any good at all."
I scanned the comments, and most people talked about CS versus SE (software engineering), or about what they think is most important to teach. I agree with the narrow point that computer science is different from software engineering - the former teaches the theory and fundamentals of computers, languages, and programming, while the latter focuses on how to engineer software for the real world.

To a large extent, CS and SE are orthogonal. Knowing how to write code that compiles on 11 platforms and understanding why a regular expression cannot match nested parentheses are two completely different skills.
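
The parenthesis point is worth spelling out, so here's a minimal Python sketch (my example, not from the post). Recognizing balanced parentheses requires a counter that can grow without bound, and a true regular expression, being equivalent to a finite automaton, has no such counter. The loop below, by contrast, needs only one integer.

    def balanced(s):
        """Check that every '(' has a matching ')'.

        A regular expression can't do this: the language of balanced
        parentheses is context-free, not regular, because it requires
        unbounded counting. This loop needs only a single counter.
        """
        depth = 0
        for ch in s:
            if ch == '(':
                depth += 1
            elif ch == ')':
                depth -= 1
                if depth < 0:      # closed a paren that was never opened
                    return False
        return depth == 0          # every open paren was closed

    assert balanced("(a (b) (c (d)))")
    assert not balanced("(()")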

The difficulty with coming up with the "truly important" concepts to teach in school is that the software industry is too diverse to put in a nutshell. Take Jeff's idea that "computer science students should develop software under conditions as close as possible to the real world" - it just doesn't make sense. What real world?

I've, thankfully, been gainfully employed as a software engineer for 12 years, and I've never had to worry about software distribution - and plenty of other people have never had to think about it either. Why? Because it was taken care of by other people; I simply abstracted that problem away as something I didn't have to solve. So how would that lesson have helped me?

But even if you wanted to teach software distribution, what kind of distribution? Should students have to deal with a wide variety of PC builds (Windows 2000, Me, XP, Vista)? What about the Mac? And the dozens of Unix flavors? What about 32- versus 64-bit platforms? What would this really teach students? That distributing software is a pain. Fine, lesson taught, but why go through this for every class? Does everyone in your company know how to distribute the software you produce?

My point isn't for or against teaching software distribution. Almost any single topic under the CS or SE umbrella fails to apply to the whole software industry. No matter what you are taught, you will learn things you never use, and you will not learn things you need to know to do your job.

So, what should be taught?

This reminds me of the time I asked a professor what I should study in my last year or two - as I only had a couple of slots free for non-mandatory courses. I was specifically asking about taking math versus computer science classes (I was a double major), and he advised math. Why? Because math is hard.

So we should teach math?

Um.... not quite the point.

I think that school is a good time to learn the fundamentally difficult concepts. Hopefully you'll always be surrounded by intelligent, inquisitive peers and knowledgeable mentors, but chances are school is your best bet for being immersed in that kind of environment. Think of school as a petri dish: sure, mold can grow anywhere, but it grows really well in a petri dish.

So now software is mold?

University should teach fundamentals and fundamentally difficult concepts - things like data and procedural abstraction, algorithmic complexity, data structures, algorithms, and what makes up a computer (OS, hardware, compiler, network). Hmmm... sounds like the course requirements for my CS degree.

No software engineering basics?

That being said, it would have been a heck of a lot easier to do some of the coursework had I known more about how to use a text editor, version control, a debugger, logging, unit testing, or (gasp) an IDE. I would definitely have appreciated an introduction to those early on. But early on, the projects are so small that none of those tools matter - they would just get in the way. It's the classic chicken-and-egg problem.

So you want both CS and SE taught?

What I really want from a new college graduate is someone who can think for himself and has the intellectual curiosity to continually improve.

Version control, a build system, a test harness, a language, or a software distribution system is (IMO) much easier to learn than discerning algorithmic complexity. So if one has to choose between the two, I'd choose the more difficult subject: algorithmic complexity. Some of the comments argued that people tend never to learn version control on the job, so it should be taught in school. To which I would respond: someone who only ever learns it superficially in the real world would only have learned it well enough to pass the class anyway.
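
To make the contrast concrete, here's a small Python sketch (mine, not from the post or its comments). Both functions below return identical answers; seeing that one of them will choke on large inputs while the other won't takes analysis, not tooling, and that's exactly the discernment that's hard to pick up casually.

    def has_duplicate_quadratic(items):
        # Compare every pair: n*(n-1)/2 comparisons, so O(n^2) time.
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                if items[i] == items[j]:
                    return True
        return False

    def has_duplicate_linear(items):
        # One pass with a hash set: O(n) expected time, O(n) extra space.
        seen = set()
        for item in items:
            if item in seen:
                return True
            seen.add(item)
        return False

Checking a million items takes about a million steps one way and roughly half a trillion the other; no amount of version-control fluency will reveal that.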

Teach the hard stuff in school, and, where possible, teach software engineering skills to help the student thrive while in school.

Back to the idea of developing software at school in conditions that match the 'real world'. The real world is full of noise. I'd much rather learn the subject matter than be subjected to compiler bugs, platform differences, whiny customers, incomplete specifications, etc.

I guess it boils down to the first CS class I took in college. I loved the class, and I still think it's the best introductory CS class available. Why? Because in one semester you go from knowing next to nothing about CS to learning data structures, algorithms, and complexity analysis, and to writing an interpreter, a CPU simulator, and a compiler. One semester, and you're exposed to basically all of computer science - and not just superficially.

Scheme? You want to force me to (parse (all (the (parentheses))))?

Grow up, look past the language. Look at the concepts taught. If you want to rewrite the book using Ruby or whatever, be my guest. If the result is a class that still fits into a semester, more power to you. The point of using Scheme was that the language can be taught in one lecture, and after that you get to learn the interesting things. Try teaching meta-programming to C++ novices (hell, try teaching it to 99% of the software community).

Yeah, but it's just a toy interpreter/compiler/CPU...

If you actually look at what is taught, you have a supremely flexible interpreter, compiler, and CPU simulator, and you can vary the language's semantics, the compiler's optimizations, and the hardware configuration. You can explore the intricacies of each to whatever extent you desire.
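
I can't reprint the course's actual code, but a minimal sketch in the same spirit (Python here, purely illustrative) shows why a tiny evaluator invites that kind of experimentation: when the semantics live in a plain data table, changing the language is a one-line edit.

    # A toy evaluator in the spirit of that course (my sketch, not the
    # book's actual code). Expressions are nested tuples like
    # ('+', 1, ('*', 2, 3)).
    import operator

    OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

    def evaluate(expr, ops=OPS):
        if isinstance(expr, (int, float)):   # a literal evaluates to itself
            return expr
        op, *args = expr
        return ops[op](*(evaluate(a, ops) for a in args))

    print(evaluate(('+', 1, ('*', 2, 3))))           # prints 7

    # "Vary the semantics": reinterpret '+' as string concatenation
    # by swapping in a different operator table.
    weird = dict(OPS, **{'+': lambda a, b: str(a) + str(b)})
    print(evaluate(('+', 1, ('*', 2, 3)), weird))    # prints 16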

I've never seen that much information packed into such a concise form while remaining accessible to first-year students.

Whatever is taught to computer science students should be taught with a focus on that subject. Just like you should remove distractions that drag your developers down, you should remove distractions from the subjects taught in school. If you want to teach about developing software in the real world, then make a class or two that does that, or provide internships, or contribute to open source projects. If something doesn't help the student learn the subject, it gets in the way.
