The High Bit: Exploring a Programmer's First Programming Experiences

Motivation

Throughout this course we have been asking what makes a good programmer, and numerous very intelligent people have come in and tried to answer the question. They gave a plethora of useful advice and real-life experience, but just as you can't teach a klutz to be a star quarterback, you can't explain programming to someone who lacks the innate foundation to do it. This inspired us to ask not what makes a good programmer, but why. Or, more grammatically: why do some people "just get it" while others simply do not?

In order to answer this, we need to know why current programmers became programmers, and why non-programmers did not. What was it about their personality, mindset, and most importantly childhood that led them towards code instead of politics or ballet? What events, influences or circumstances enabled them to begin programming, and what kept them interested long enough that they could blossom into the beautiful, binary-driven flowers they are today?

By finding specific causes and correlations, we hope to exploit this research to make programming more available to the masses. We will get interested people started at a younger age, and extend the availability of programming to people who would not otherwise ever have been interested.

Approach

Our initial intention was to solicit the thoughts of as many programmers as possible. However, it became clear early on that interviewing individual programmers would simply not scale to the number of people we sought to reach; fifteen-minute interview slots invariably turned into hour-long sessions during which our interviewees would fondly recall their early projects, wax poetic about their favorite APIs, and give their two cents about the secret sauce which makes a good programmer. We therefore decided to create an online survey in order to gather the thoughts of a greater number and variety of people.

The first step in creating the survey was to distill our initial interviews into a set of interesting areas which we wished to explore further. We thus created a survey which consisted of twenty-four questions, inquiring about topics ranging from the age at which the person started programming to their favorite API (see Appendix A for a full listing of the questions on the survey). An initial concern was that most of the questions were to be answered by typing a response into an (intimidating) text box; we feared that the response rate would be low, as respondents would tire of inputting answers into these boxes. Fortunately, this proved not to be a problem, as roughly 80% of the people who began the survey completed it all the way through. It would certainly have been better to interview these people individually, but given the surprisingly high survey completion rate, we believe that the trade-off between individual interviewing and scale was a good one.

Soon after launching the survey, we saw a few trends in the data which indicated that we should slightly modify the structure of the questionnaire. For example, several users expressed frustration with not having any "progress feedback" on the page indicating how much of the survey they had completed; moreover, a few of the questions had ambiguous wording that needed to be corrected. We made a few such obvious improvements to the survey. In retrospect, a "dry-run" by a small group of people would have been a good idea, and we would certainly conduct such a dry-run if we ran a second round of the survey.

There were a few faults with the survey which we did not discover sufficiently early, but which should have been addressed. The leading wording of some of the questions created a bias against certain types of people filling out the survey. For example, the survey explicitly asked the respondent to "Tell us lovingly of your first 'baby' project." This question assumes that the respondent DID program in his or her free time and does not allow for easy admission that he or she did NOT have any such projects. A few respondents sheepishly admitted to this very fact on the survey. However, a couple of discussions with people who started to fill out the survey (but did not complete it) indicated that they were embarrassed to admit to this fact and thus abandoned the survey. This likely biased our data towards self-starting programmers who tended to program in their spare time, but we have no reliable way to determine this. It would have been useful to end the survey with a meta-survey in which respondents could give feedback about the nature of the questions, as this might have caught such biases earlier.

A final, rather surprising, sticking point was the lack of good, free online survey platforms. We looked at various possibilities and ended up settling on Zoomerang (www.zoomerang.com). Unfortunately, the free online version of this service has various arcane restrictions (such as a cap on the number of respondents per survey and an expiration date after which the results become inaccessible) which can only be removed by paying a $600 annual fee. The most troubling restriction was that the "basic" version disallowed data export in tabular form; one could only view the results as HTML pages. We therefore had to resort to rolling our own Python scripts to glean the data of interest from the Zoomerang result pages.
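To give a flavor of this kind of scraping, here is a minimal sketch using only Python's standard library. It is not the script we actually ran: the sample page, the table layout, and the names TableCellParser and extract_rows are all assumptions made for illustration, and real result pages would almost certainly need more careful handling.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of strings."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical stand-in for a saved result page (assumed layout).
SAMPLE_PAGE = """
<table>
  <tr><th>Respondent</th><th>Starting age</th></tr>
  <tr><td>1</td><td>12</td></tr>
  <tr><td>2</td><td> 16 </td></tr>
</table>
"""

rows = extract_rows(SAMPLE_PAGE)
for row in rows:
    print("\t".join(row))
```

Once the cells are in plain lists, writing them out in tabular form (for example with the csv module) is straightforward.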

It is important to note that our methodology is rather lacking in statistical rigor. We solicited survey responses by sending mail to a wide range of people whom we personally knew to be programmers and requested that they forward the email to other programmers they knew. Such a method of dispersal undoubtedly introduces bias, as ultimately respondents were friends and friends-of-friends. While proper statistical methodology would certainly have been helpful, we believe that our data provide a solid pilot for a larger study which should ultimately be undertaken.

Results

Trends

We received a total of 63 responses to the survey, split evenly between students, active developers, and "others." The last category encompasses a variety of roles, including project managers, UI designers, technical leads, and software testers. Figure 1 shows the breakdown of the roles among the responses.

Figure 1.

The age at which people began to program ranged from 6 to 18, with the full distribution shown in Figure 2. The data indicate that every person surveyed (except one) began programming before starting college! While this is a result we expected to see, it's telling that it scales so well to a larger population of programmers. Note that these results could have come about in part from the potential bias we described above (the survey was slanted towards people who were self-starting programmers as kids). That said, it still brings up questions of cause and effect: did these people become CS students and professional developers because they started so early (and thus had a head start), or did they start early because they were predisposed to the programming way of thinking? These are questions that this study couldn't answer. There are two spikes in this graph, one around twelve and another around sixteen. The latter spike is likely due to AP Computer Science, which is taught during sophomore or junior year at many high schools. Explanations for the former spike are more tenuous, but one possibility is the increased presence of programmable graphing calculators starting with Algebra in 6th and 7th grade.

Figure 2.

The effect of an adult influence on starting age is clearly shown in Figure 3. Children who had an engineering influence during their childhood began programming at 10.3 years old on average, while those without an influence averaged 14.7. Note that the one respondent who started programming at 6 is the son of one of the co-authors of vi: a strong influence, indeed! Our definition of an adult influence included two principal groups: (1) people whose parents were directly involved in the software industry, and (2) people who explicitly cited a parent, family member, or other adult who, while not necessarily technically trained, encouraged them to start tinkering with computers and/or programming. These results are striking, and we believe they represent one of the most important results of this study. They demonstrate that adults who push children towards programming can have a profound effect on starting age. On one hand, they suggest that teaching programming earlier in school is plausible and could get children programming earlier. On the other, they indicate that it's important to enable parents with little technical background to push their kids towards programming; this can be accomplished by creating inexpensive programming starter kits (such as Discover Programming for Macintosh, which a couple of respondents mentioned) which non-technical parents can give to their kids.
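The comparison behind these two averages is a simple grouped mean. The sketch below shows the arithmetic in Python; the (age, influence) pairs are invented purely for illustration and are not our survey data, and the function name mean_age_by_influence is likewise hypothetical.

```python
from statistics import mean

# Hypothetical (starting_age, had_adult_influence) pairs.
# These values are made up for illustration; they are NOT the survey data.
responses = [
    (6, True), (9, True), (11, True), (12, True),
    (13, False), (14, False), (16, False), (17, False),
]


def mean_age_by_influence(responses):
    """Average starting age, keyed by whether an adult influence was reported."""
    groups = {True: [], False: []}
    for age, influenced in responses:
        groups[influenced].append(age)
    # Skip empty groups so mean() is never called on an empty list.
    return {flag: mean(ages) for flag, ages in groups.items() if ages}


averages = mean_age_by_influence(responses)
print("with influence:   ", averages[True])
print("without influence:", averages[False])
```

With so few respondents in each group, a difference in means like this is suggestive rather than conclusive; a larger study would want a proper significance test.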

Figure 3.

In a quite surprising result, over a quarter of respondents began programming with QBasic. We believe that this is largely due to the ubiquity of QB during most of the nineties. Every DOS box shipped with QB, and so one could easily drop to a shell and start up a complete programming environment. Moreover, nearly 40% started on another form of Basic (TI, Wang, Visual, etc). The reasons for this Basic-bias are likely two-fold: (1) the simplicity of Basic enabled kids to start writing "interesting" programs quickly, and (2) many computers and devices exposed programmability via Basic. The former point is not nearly as valid today as it once was (since even Visual Basic is now a full-fledged object-oriented language), while the latter is definitely not true any longer (TI calculators are one of the few remaining devices that expose a simple-to-access programming interface). Note that the remainder of respondents began with miscellaneous other languages (C, Java, JavaScript, Algol 60, etc) but no single language was mentioned more than twice, meaning that no one language dominated this "other" category.

Figure 4.

Individual Responses

Certain words showed up frequently in the responses we received:

  • “tinker”
  • “fun”
  • “build your own world”
  • “accomplishment”
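A tally like this can be approximated with a few regular expressions. The following is a rough sketch, not the method we used (we read the responses by hand): the patterns, the function name theme_counts, and two of the three sample responses are assumptions made for illustration.

```python
import re
from collections import Counter

# One crude pattern per theme (assumed; tune these against real responses).
PATTERNS = {
    "tinker": re.compile(r"\btinker\w*", re.IGNORECASE),
    "fun": re.compile(r"\bfun\b", re.IGNORECASE),
    "build": re.compile(r"\bbuil[dt]\w*", re.IGNORECASE),
    "accomplishment": re.compile(r"\baccomplish\w*", re.IGNORECASE),
}


def theme_counts(responses):
    """Count how many responses mention each theme at least once."""
    counts = Counter()
    for text in responses:
        for theme, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[theme] += 1
    return counts


sample = [
    "It's all about building something.",  # quoted from a respondent
    "I loved to tinker; it was fun.",      # invented for illustration
    "Functions confused me at first.",     # invented; must not count as "fun"
]

counts = theme_counts(sample)
for theme in PATTERNS:
    print(theme, counts[theme])
```

Counting responses rather than raw occurrences keeps one enthusiastic respondent from dominating the tally.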

One of the most striking stories about an early programming experience involved a programmer who first began programming on his TI-82 calculator. Required to have one for class, he one day noticed that the manual contained various example programs. He copied them into his calculator, and soon was hooked. He wrote calculator programs which helped him solve his math homework in a quarter of the time, and then spent the remaining free time hacking on a 3D graphing application which he still cites as his favorite "baby project." This experience underscores the importance of having a ubiquitous programming platform with easily-accessible programming examples.

Many respondents spoke of enjoying programming because it allowed them to build their own world. One said, "making a machine do my bidding gave me a profound sense of pleasure, bordering euphoria." Another stated, "It's all about building something." And finally, another respondent declared that, “[Programming] lets me control my world.”

Another important point that emerged concerned the understanding of what programming is. One person said, "The barrier was finding step 1: what programming was," while another stated that "It can be hard to explain what exactly programming is with no prior conception of it." This result seems a bit odd to us, as we've been programming for years.

A final interesting point came from a parent who stated that he had little luck getting his kids to start programming in VB, as they simply found it too boring. However, his kids had taken to tinkering with microcontroller-based projects, even when the end result was something as simple as a blinking LED. While these kids are not programming per se, their reasons for being excited about microcontrollers are likely similar to the reasons that many of our respondents were excited about their early programming: it lets them "tinker," "build their own world," and have "fun."

Conclusions

In light of our results, we came up with three suggestions to help bring programming to a younger or less technical audience. These conclusions are strikingly similar to those articulated in The Little Coder's Predicament by _why, the author of a Ruby-based introductory programming environment called Hackety Hack. Of all the materials we read about starting kids on a diet of programming, _why is the one who most clearly "gets it."

Ubiquity

Many people began programming in math class on their TIs or other programmable calculators. With the constant presence of the devices, they were simply bound to eventually start exploring them. If we give programming interfaces more of a presence, we can exploit this effect. The same holds true for games. A large number of scriptable and moddable games affords frequent opportunities for kids to start programming by modifying the games with which they love to interact. The key insight here is that rather than starting kids in a toy environment far away from what they'd interact with on a normal basis, we let them program in environments with which they are already familiar. Yet this approach doesn't scale unless a large number of such environments (such as games) enable some sort of programming. Unfortunately, most of today's console games don't allow any sort of simple programmability; some include heavy-weight PC-based SDKs, but these are out of reach for the vast majority of kids. A gradual shift towards easy, ubiquitous console-based programmability is sure to go a long way. As one respondent put it, kids need a "gateway drug" to become engaged in serious programming; they start modding games or programming for a calculator, and before long they're hooked on programming as a whole!

Accessibility

A beginner's language must be intuitive and as easy to pick up as breathing. As somebody is realizing they want to customize or automate something, we need to clearly present to them the way to do it. We need programming to no longer be intimidating, and we need to convince people that it is not out of their reach.

Examples

Documentation is great for seasoned developers who know the terminology, but for someone just starting out, it is often not the best way to learn. Tutorials, too, teach someone little more than how to follow directions. Most important, many of those surveyed complained about how bored they were reading through books or tutorials as kids. We propose that people should learn to program from small examples that they can parse and figure out themselves as they use them to put together their own project. The operative word here is frankencode: taking various bits of code and stringing them together into something of one's own creation.

Future Work

Our efforts yielded insights into a wide range of possible future directions; some of these directions were ones we originally intended to pursue but couldn't, while others were ones we realized would be interesting to pursue during the course of the survey.

One of our early intentions in this study was to determine what sorts of programming primitives would be exciting to children today. While QB and Logo were exciting to those growing up in previous decades, today's children are ostensibly not nearly as excited by the prospect of writing computer programs that interact with the user via text-only input or which draw simple lines on the screen. Such is the claim, but we have no empirical evidence that this in fact is the case. As it turned out, our study could not measure the interests and programming habits of today's children, as there are difficulties with interviewing young children directly (especially at the scale we hoped to achieve). The only indication we had of the interests of today's children was via reports of the parents who responded to the survey, but due to a strong bias towards younger respondents, not many of them had children. The obvious future work, then, would be to undertake a systematic study of the way today's children interact with the computer technology around them, and to use this information to help determine what sort of programming environments would be most attractive to them.

In a similar vein, our study only involved those who were successful programmers (or, at the very least, those who self-identified as such). Another interesting study would involve interviewing non-programmers about their reasons for shying away from programming; furthermore, it would be even more fascinating to talk with proficient programmers who chose careers in non-technology sectors. Did their programming skills help them in their non-technical career?

One survey respondent suggested that people who started programming later in life speak programming with an "accent." While a compelling idea, it's not necessarily true. An interesting study would focus more on correlating a person's programming ability with their starting age, as well as categorizing how those who started programming later might differ in their abilities or approaches to problems. This is something we could not easily explore here, especially due to the difficulty of judging programming skill.

We need to integrate simple programming environments into existing platforms. Whether it is a scripting interface, a level editor, or a robust email filter, people should be shown that programming is available and useful to them. This sort of thinking presents an interesting direction in HCI research. How does one create user interfaces that afford programmability of the underlying mechanism? What sorts of interfaces are most useful (and least intimidating) to non-programmers?

Since many of the people who started programming early did so for fun, an interesting direction of study would be comparing how children get involved in various hobbies during middle school (programming being just one). This sort of study is much more deeply related to the field of psychology and would likely benefit from such expertise.

A final possible future direction, one which was inspired by comments indicating that a frequent initial hurdle to programming is realizing what programming IS, would be to conduct a broad survey judging the average person's perceptions about programming. What do they know about it? How can we teach people just the basics of what programming is? How do we create awareness of what software developers actually do? Such a survey would be useful in determining how best to explain programming to people in general. Note that this is important for both kids (how do you tell a kid what programming is?) AND for the adults around them, for an adult who doesn't have any notions about the act of programming will be hard-pressed to encourage a kid to do it.

References

Background Reading

  1. The Little Coder's Predicament: http://whytheluckystiff.net/articles/theLittleCodersPredicament.html
  2. An interesting thread: http://pluralsight.com/blogs/dbox/archive/2005/02/20/6009.aspx
  3. A Father's Reflections: http://davidbau.com/archives/2005/07/29/haaarg_world.html
  4. KPL: http://blogs.msdn.com/coding4fun/archive/2006/10/31/912456.aspx
  5. Weinberg, Gerald M. The Psychology of Computer Programming. Van Nostrand Reinhold: New York, 1971.

Existing Frameworks

  1. Scratch: http://scratch.mit.edu/
  2. Hackety Hack: http://hacketyhack.net/
  3. Alice: http://www.alice.org/

Appendices

Appendix A: Our Survey Questions

  1. What's your name and email?
  2. What's your job?
    1. Job? Pssh. I'm just a student.
    2. I mainly write code.
    3. I mainly test software.
    4. I manage developers and/or projects.
    5. Other, please specify
  3. Your First Programming Experiences. When'd you first start programming?
    1. Age?
    2. Year?
  4. How'd you first start programming?
    1. Learned in school
    2. Was taught by a family member (or other adult)
    3. Learned on my own initiative
    4. Some other way (please tell us!)
  5. What do your parents do for a living? Were there any influential adults during your early years who were developers or engineers?
  6. What was your first programming environment? (e.g., Logo, TI Basic, Amiga, QB)
  7. Describe the progression of your programming environments as you got older.
  8. Tell us lovingly of your first "baby" project (the one that still makes you beam with pride).
  9. What other programming did you do early on?
  10. Why did you keep programming?
  11. In retrospect, what were the barriers to starting early? Can they be obliterated?
  12. What appeals to you in programming? Why do you keep doing it to this day (if you do)?
  13. Describe your formal (i.e., school) exposure to programming.
  14. What's your real-world exposure to programming been?
  15. Which has helped more in becoming a good developer, and why? Having to redo it all, would you skip one of the two phases?
  16. Describe what experience has most shaped you as a programmer.
  17. Would you ever want to assume a non-technical role in your company? Why or why not?
  18. Wax poetic about your favorite API.
  19. How do we get kids to program? How do we convince them of its value?
  20. Bonus: If you have kids, what have you tried? Have you had success?
  21. How does one become a great developer?
  22. How do we convince normal people that changing programs to do what they want them to do is not out of reach?
  23. Bonus: What's your buy-out value? How much money would it take to stop you from ever programming again? Pick your favorite currency.
  24. Anything else you'd like to say?

Appendix B: Selected Favorite APIs

  • "Cairo is pretty nifty."
  • I have yet to have one that I've truly fallen in love with. They all seem to have their pointless annoyances somewhere...
  • Rather than picking an API, can I just say this: "i heart reflection"
  • The Rails stack... it's so nice. Talk about moving in the right direction.
  • I actually really like the .NET framework API: they spend a lot of effort in designing it so that its features are discoverable and common use-cases are easy. On the more esoteric side, Boost.MPL is an amazing library; it makes compile-time programming in C++ seem trivial.
  • I already said I'm a Ruby developer. Your database isn't big enough to store my soliloquy about appreciating beautiful code. That said, I like Rails, but I really love test/spec and BDD style tests. Also love Mocha which is for mocking and stubbing, it has wonderful DSL for such things. And who can forget Camping, the Mad Hatter's framework.
  • I found the Java class libraries in 1995 to be several steps above anything I had previously used. When .NET came along, they ripped off Java, fixed the most glaring problems, and didn't add too many idiocies. It was a thing of beauty that's slowly getting uglified with every release.
    The collections released in .NET
    Are better than most I've seen yet.
    Though stolen from Java.
    They're hotter than lava.
    On that my last nickel I'd bet.
  • Anything SOAP / XML -- if you've ever worked on the nasty EDI (ANSI X12) standards of the 60's and 70's, you will see how simple XML / XSL, SOAP, and other API frameworks have made it to integrate systems -- a dramatic improvement in life.
  • Can I just talk about the Python library? It has everything I generally want to do, and it's all laid out in a very unsurprising fashion. It all round rocks.
  • The Windows message pump.
  • This may be off topic as it is not an API but a programming technique that I love, but here it goes: I would never be able to write all that code that I did without using region-based memory allocation (note this was unmanaged code). The ability to divide all memory into several large "bags" and then throw away whole bags instead of individual pieces helped me tremendously.
  • Lisp is God. Rest are mere mortals.
  • System.DirectoryServices (programmatically access Active Directory and do fun computations about the org chart at Microsoft) or Microsoft.Office.Interop.Outlook (programmatically screw w/ Outlook, both on the desktop & on the phone)
  • Nice question. I think the Python C API is pretty good, although restricted enough to not be so difficult. OpenGL is a pretty sweet design for what it was designed for (something you learn even more when you implement it).
  • Google Maps API - love it
  • I really like the Java API. Its nice to have so many library classes at my disposal.
  • I love the STL. All haters are just misinformed. The containers and algorithms are almost completely orthogonal, but the right tradeoffs are made to allow for high-performance implementations. The consistency and generality of the iterator model are excellent. (Try writing a linear-time intersection with Java iterators. It is possible, but ugly.) There are a few omissions (like hash tables), but they have largely been addressed by TR1 and Boost. One of these days, I'll check out the Boost iterator/iterator range library, which, from the looks of it, keeps the consistency and generality while adding a healthy dose of convenience.
  • OCaml is beautiful in its combination of powerful functional programming, module system (namespaces + templated code), and static typing. I really love C++ templates in their power and flexibility, although the suffer from making compilation slow, code often unreadable, and many other shortcomings, but they are truly powerful in that you can basically implement your program run-time using meta-programming. Java is clean but castrated. Perl is quick, dirty, and ugly. Did I mention that OCaml is a beautiful language?
  • I really like the .Net API. It's consistent, intuitive, and flexible, but without being burdensome.
  • It doesn't matter what the API is, as long as I have IntelliSense :) ... and a decent set of results for all my Google queries :)
  • I love the standard C library. It has everything you need, as long as you do things in silly obfuscated ways to get there.
  • Meh. They're all bad.


Appendix C: Selected Buyout Values

  • 10 million US
  • $400k / year, inflation adjusted, for 60 years.
  • NaN
  • Infinite. It's what I do for fun :)
  • $100M (enough to fund all my dreams' projects)
  • $30 billion would suffice
  • $20,000,000
  • $24 million. Back to school for a Comp Lit PhD.
  • Cost of living for myself and my family plus a bil
  • No amount. I wouldn't be happy if I stopped.
  • 100 mil
  • 16 harems, being a starter for the A's and a donut
  • $500,000 USD per year annual salary
  • No amount. I wouldn't take $100 billion dollars.
  • Programming maybe, but not CS as a whole.
  • 150,000 rupees...
  • Probably surprisingly little.
  • Programming gives me enough money, don't need more
  • From ever programming, even for fun? No thanks.
  • cost of training+$20 mil to start building bridges
  • $30 million
  • $0 -- I don't program any more
  • I don't think I could be convinced. 100mil minimum
  • $10M
  • $1M plus health insurance for life.
  • Enough to retire on.
  • somewhere on the order of $100M
  • Hypothetical beyond reason. Maybe $1-10m USD.
  • equivalent to a lifetime of programming jobs :)

Early Project Work

Proposal Overview

This project will explore how early a wide range of developers started programming, including the conditions and environments under which they started programming, and attempt to draw broader conclusions about developers' early starts.

Project Milestones

Beginning of Week of May 21: Conduct exploratory interviews and develop survey which can be sent out to a large population of developers

End of Week of May 21: Send out the survey to as many people as possible (we have several contacts in various software companies)

Week of May 28: Analyze returned data and attempt to draw some interesting conclusions. Second survey phase, if necessary.

Project Progress

  • Completed several background interviews and background reading
  • Created survey (www.stanford.edu/~piotrek/survey/)
  • Wide spamming of survey completed
Last modified June 11, 2007 6:44 pm