When I was a kid, one of my favourite books was Richard Scarry's What Do People Do All Day? (Another of my favourites was Snow White. I liked them so much that, by the time I was four, my father refused to repeat them more than once every couple of days.)
Now that I look back at Richard Scarry, there are some gender stereotypes in there, where Daddy goes out to work and Mommy consumes ("Mommy loved her new earrings .... Grocer Cat bought a new dress for Mommy. She earned it by taking such good care of the house"), that maybe my parents didn't want me to read. But I digress. Slightly. Perhaps this interest in Richard Scarry was my initial foray into labo[u]r and economic history.
I am still interested in what people do all day. But these days it is more of an academic interest, and at the end of the year I have a conference paper due on the topic, specifically about what I learned from trying to classify and code half a million different occupations into something tractable enough for research. So here ends the fun discussion of children's books (with pictures!) and here begin some jottings towards a conference paper. Keep reading; it will be like a campfire sing-along ... with marshmallows at the end.
In all seriousness, I do this because just the thought that somebody who is not that interested in how to analyze occupations might read this is a useful discipline on my writing.
I think it's always useful to begin talking about "what do people do all day" with some [throat clearing] preliminaries about why it's an important topic. For better or worse, we all do a lot of work. Work affects everyone, and we should study things that affect large numbers of people. For now, the questions of whether that work is for pay, whether all that work is a good thing, or whether it could be done differently are set to one side. They're important but should be distinguished from the descriptive task of measuring and classifying what people do all day.
What to classify: When we're talking about classifying work, it is conventional to divide it up into occupation (the tasks and duties people perform), industry (the kind of business or activity carried on where they work), and what the American census calls "class of worker": loosely speaking, what kind of authority a person has in their job, whether they're an employer directing others, an employee, or working by and for themselves.
These standard divisions are useful, and these days when we want to find out about them we normally ask the right questions to do so. I think it's important to keep in mind two caveats.
The first is that there is some correlation between these three variables. People take that correlation as natural, and therefore conflate aspects of their job that researchers would like to distinguish. An example of this correlation and conflation is that some occupations are rarely found outside of certain industries. Farmers are never outside agriculture. But we cannot, conversely, carry this conflation forward. It is tempting to think that if we come across a nurse he's working in health and medicine, but there are enough examples of nurses employed in manufacturing and education and elsewhere that we should pause before doing so. At the very least, inferred guesses like this should be flagged in some way.
The second caveat is that the general public's appreciation of these distinctions between industry and occupation is not what social scientists would like it to be. For better or worse, people often have a clearer idea of what industry they (or their family members) are working in than precisely what they do. This caveat is probably somewhat related to the first one. It would be tempting to conclude that the general public should get with our academic program and understand this difference (or that we should study a different public), but regularities like this are interesting in their own right.
What I make of this observation that people tend to be clearer about industry than occupation is the following, which I offer tentatively as hypotheses rather than definitive statements.
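To make "flagged in some way" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `CodedResponse` fields, the `LIKELY_INDUSTRY` table and the `infer_industry` helper are hypothetical names, not part of any actual coding scheme.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical table of occupations whose industry we are willing to guess.
LIKELY_INDUSTRY = {"farmer": "agriculture", "nurse": "health"}


@dataclass
class CodedResponse:
    occupation: str
    industry: Optional[str] = None   # None means the respondent did not say
    industry_inferred: bool = False  # True when industry is our guess


def infer_industry(resp: CodedResponse) -> CodedResponse:
    """Fill in a missing industry from the occupation and flag it as a guess."""
    if resp.industry is None and resp.occupation in LIKELY_INDUSTRY:
        return CodedResponse(
            occupation=resp.occupation,
            industry=LIKELY_INDUSTRY[resp.occupation],
            industry_inferred=True,
        )
    # A stated industry is never overwritten.
    return resp
```

A nurse who reported working in education keeps that answer untouched. Only the silent cases get a guess, and anyone who distrusts the guess can filter on `industry_inferred`.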
You can see this variety in what comes under the same occupational title by reading the modern responses to questions which aim to elicit the specific tasks people do. All farm laborers are not the same. Neither are all lawyers.
While there are regularities in what people in these occupations do, they are not absolute. In casual conversation this probably doesn't matter too much, but for research it does. When we see "lawyer, in a law firm," that is as much as we know.
In short, ascribing characteristics to occupations is OK in social situations, but not so much in research. If we are going to ascribe something to an occupation—social status, for example—we should do it globally. Once we have classified all our data, we can re-classify it, simplify it, lump the professionals together, the clerks, the factory workers etc ...
Classification and coding. For the purposes of this discussion I take "classification" to be the somewhat abstract process of deciding what distinctions we are going to make between different responses (do we accept lawyer and attorney as the same job, for example), and "coding," the somewhat mechanical process of looking at a response (or group of responses) and typing a numeric code so that "criminal lawyer" and "defence lawyer" and "defending bad guys in court" all get code xxx and can be distinguished from lawyer's secretary and farmer.
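The mechanical part can be sketched as a lookup table. This is only an illustration in Python: the numeric codes are made up (the "xxx" above stands for whatever the scheme assigns), and `normalise` and `code_response` are hypothetical helper names.

```python
# Invented codes for illustration; the table itself is where the
# classification decisions live.
OCCUPATION_CODES = {
    "criminal lawyer": 101,
    "defence lawyer": 101,
    "defending bad guys in court": 101,
    "lawyer's secretary": 205,
    "farmer": 300,
}


def normalise(response: str) -> str:
    """Lower-case the response and collapse runs of whitespace."""
    return " ".join(response.lower().split())


def code_response(response: str):
    """Return the numeric code for a response, or None if it is not yet coded."""
    return OCCUPATION_CODES.get(normalise(response))
```

Deciding that the three lawyer responses share one code is the classification; applying the table to each response is the coding.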
As a practical matter, I think accuracy and consistency are better served by introducing new variables to capture new distinctions than by making codes longer. As I understand it, in the not so distant past disk space was a real concern, and having one variable of three digits that combined two ideas was genuinely better than two variables of two digits that kept them separate. But these days disk space and memory are trivially cheap, so distinct ideas should be kept distinct.
One of the challenges with coding is to stick to the literal text, and only code that. This is another way of saying that we can't ascribe [too much] when coding. For example, if someone says they are a custodian, we only know their occupation. It would be nice to know if they are a school custodian or a hospital custodian, but we don't know that.
As I mentioned, I have coded nearly half a million occupations in the space of a couple of years (with some help). How do you do that? As I noted above, occupation and industry are correlated. There are a lot of farmers who work in agriculture, and a lot of teachers who work in education. For responses like these it is most efficient to code occupation and industry at the same time. In other situations, particularly with manufacturing workers, there is some dependence of occupation on industry, but not as much. Often it was more efficient to code a group of industries together, based on keywords (specific products such as "cotton" or "timber", or descriptions of types of workplaces, such as "shop" or "mill" or "plant"), and then code the occupations based on keywords that distinguished tasks, or rough gradations in skill or authority.
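A small sketch of the difference, in Python. The packed layout (first digit industry, last two digits occupation) and all the code values are invented for illustration.

```python
def unpack(packed):
    """Split a hypothetical packed three-digit code into (industry, occupation)."""
    industry, occupation = divmod(packed, 100)
    return industry, occupation


# The same information kept as two separate variables.
records = [
    {"occupation": 12, "industry": 3},
    {"occupation": 12, "industry": 7},
    {"occupation": 40, "industry": 3},
]

# With separate variables, "everyone in occupation 12, whatever the industry"
# is a plain comparison, with no digit arithmetic to get wrong.
same_occupation = [r for r in records if r["occupation"] == 12]
```

With the packed code, every user has to remember the digit layout and repeat the `unpack` step. With separate variables, the layout is in the variable names.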
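The two-pass keyword approach might look something like this in Python. It is a rough sketch, not the procedure I actually used: the keyword tables, the labels and the `code_by_keywords` function are all hypothetical.

```python
# Hypothetical keyword tables; a real scheme would have many more entries.
INDUSTRY_KEYWORDS = {
    "cotton": "textiles",
    "timber": "lumber",
    "shop": "retail",
}
OCCUPATION_KEYWORDS = {
    "foreman": "supervisor",
    "weaver": "operative",
    "laborer": "unskilled",
}


def first_match(text, keywords):
    """Return the label of the first keyword that appears as a word in text."""
    words = text.lower().replace(",", " ").split()
    for keyword, label in keywords.items():
        if keyword in words:
            return label
    return None


def code_by_keywords(text):
    """Code industry first, then occupation, each from its own keyword table."""
    return {
        "industry": first_match(text, INDUSTRY_KEYWORDS),
        "occupation": first_match(text, OCCUPATION_KEYWORDS),
    }
```

A response such as "Foreman, cotton mill" picks up an industry from "cotton" and an occupation from "foreman". Anything that matches neither table comes back as `None` and is left for hand coding.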
It is not uncommon that when actually doing the coding, other distinctions or classifications that might be useful occur to us. For example, we might find that lawyers are unusually forthcoming on whether they are criminal or corporate lawyers. Rather than revising the coding scheme post hoc to incorporate this distinction it is better to flag the cases we want to retain extra information on, and revisit them once the first round of coding is complete.
An important choice in coding is whether to lump or split. Should we assume that "attorneys" and "lawyers" are the same, that "merchants" and "dealers" are the same? That a "sales clerk" and a "saleslady" are the same? Those are ones I can accept. But what about a "hammerman" and a "blacksmith"? Trickier. It does depend on the amount of data, and the time it takes to recode. In general, people making codes that others will use should probably err on the side of splitting rather than lumping. It is easy enough to lump later on to get a tractable number of categories for analysis, but discovering that apparently disparate groups have been lumped together is more frustrating.
These are [unfinished] reflections from the trenches, or just coming out of the trenches, of actually coding lots of data. What strikes me in looking at work across time in censuses and surveys is not the change but the stasis, at least in terminology. Despite changes in technology and in who is working, many of the terms we use to describe work today existed back then. There are, of course, new occupations that did not exist in 1880 or 1900. Aviation and computer programming are probably the most obvious. But look at the terms we use to describe occupations in aviation. Pilot and Captain. Straight out of the maritime industry.
In other words, the language of occupations has not really changed much, even though we know from closer studies of the workplace that what people in some occupations actually do has changed. Coding and classifying surveys of work can only be a starting point, a description and analysis of context, in the collective project of understanding what people do all day.
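Why lumping later is the cheap direction can be shown in a few lines of Python. The `LUMP` table and the broad labels are invented for the example. Whether a hammerman really belongs with a blacksmith is exactly the judgement call described above.

```python
from collections import Counter

# Hypothetical mapping from detailed (split) titles to broader groups.
LUMP = {
    "hammerman": "metal worker",
    "blacksmith": "metal worker",
    "attorney": "lawyer",
    "lawyer": "lawyer",
}


def lump(title):
    """Map a detailed title to its broad group; unmapped titles stay as they are."""
    return LUMP.get(title, title)


detailed = ["hammerman", "blacksmith", "attorney", "lawyer", "farmer"]
counts = Counter(lump(title) for title in detailed)
```

Going from split to lumped is one small table applied after the fact. Going the other way means returning to the original responses, because the lumped code no longer records which was which.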
Posted by robe0419 at March 16, 2006 1:38 PM