Tables—linking categorical and quantitative variables

Excerpt from A Gentle Introduction to Stata, Fourth Edition (pages 170-173)

Many research questions can be answered by tables that mix a categorical variable and a quantitative variable. We may want to compare income for people from different racial groups. A study may need to compare the length of incarceration for men and for women sentenced for the same crime. A drug company may want to compare the time it takes for its drug to take effect with the time it takes another drug.

Here we switch back to the 2006 General Social Survey, gss2006_chapter6.dta. We will use hrs1, hours a person worked last week, as our quantitative variable and compare men with women. Our hypothesis is that men spend more time working for pay each week than do women. We will use a formal test of significance in the next chapter, but here we will just show a table of the relationship. Select Statistics ⊲ Summaries, tables, and tests ⊲ Other tables ⊲ Flexible table of summary statistics to get to the dialog box in figure 6.4.
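Before opening the dialog box, it can help to load the data and look at the two variables from the Command window. The lines below are a minimal sketch: they assume the dataset file is named gss2006_chapter6.dta and sits in the current working directory, which the text does not state.

```stata
* Assumption: the file is in the current working directory
use gss2006_chapter6.dta, clear

* Compact overview of the categorical and the quantitative variable
codebook sex hrs1, compact

* Distribution of hours worked last week, ignoring sex for now
summarize hrs1, detail
```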

Figure 6.4. Summarizing a quantitative variable by categories of a categorical variable

The dialog box looks fairly complicated. Look closely at where we enter the variable names and statistics we want Stata to compute. The Row variable is the categorical variable, sex. Under Statistics, select the following from the menus: Mean, Standard deviation, and Count nonmissing. On the far right, there are places to enter variables.

We want hrs1 for the mean, standard deviation, and count. Click on Submit so that we can return to this dialog box at a later time. The resulting command is


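. table sex, contents(mean hrs1 sd hrs1 count hrs1)

----------------------------------------------
   Gender | mean(hrs1)    sd(hrs1)     N(hrs1)
----------+-----------------------------------
     MALE |    44.8767    14.27598       1,387
   FEMALE |    39.2034    13.60452       1,352
----------------------------------------------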

This table shows that the week before the survey, men spent on average over 5 more hours working for pay than did women: 44.88 hours for men versus 39.20 for women.
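The contents() option shown above belongs to the table syntax used before Stata 17; from Stata 17 on, the command was redesigned and each statistic is requested with its own statistic() option. The following sketch shows what should be an equivalent request in the newer syntax, together with tabstat, which works the same way in both old and new releases. The layout of the output will differ from the table printed above.

```stata
* Stata 17 and later: one statistic() option per summary statistic
table sex, statistic(mean hrs1) statistic(sd hrs1) statistic(count hrs1)

* Works in older and newer releases: mean, SD, and nonmissing N by sex
tabstat hrs1, by(sex) statistics(mean sd n)

* The gap between the two means reported in the text
display 44.8767 - 39.2034
```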

Now suppose that you want to see the effect of marital status and gender on hours spent working for pay. Go back to the dialog box, click on the checkbox next to Column variable, and add marital as the variable. Go to the Options tab and select Add row totals. The command for this is

. table sex marital, contents(mean hrs1 sd hrs1 count hrs1) row

I omitted the results here.
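For readers who want to produce the omitted results without the dialog box, two alternatives are sketched below. Neither is the command the text uses: tabulate with summarize() reports the mean, standard deviation, and frequency of hrs1 in each sex-by-marital cell, and the second line is the two-way form of the Stata 17 table syntax.

```stata
* Means, standard deviations, and frequencies of hrs1 in every cell
tabulate sex marital, summarize(hrs1)

* Stata 17 and later: rows are sex, columns are marital status
table (sex) (marital), statistic(mean hrs1) statistic(sd hrs1) statistic(count hrs1)
```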

These summary tables are useful, but to communicate to a lay audience, a graph often works best. We can create a bar chart showing the mean number of hours worked by gender and by marital status. We created one type of bar chart in chapter 5, where we used the dialog box for a histogram to create the bar chart. Now we are doing a bar chart for a summary statistic, mean hours worked, over a pair of categorical variables, sex and marital. To do this, access the dialog box for a bar chart by selecting Graphics ⊲ Bar chart. Under the Main tab, make sure that the first statistic is checked; by default, this is the mean. To the right of this, under Variables, type hrs1 to make the graph show the mean hours worked. Next click on the Categories tab. Check Group 1 and type sex as the Grouping variable. Check Group 2 and type marital as the Grouping variable. There is a button labeled Properties to the right of where you entered sex. Click on this button and change the Angle to 45 degrees. Click on Accept.
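The dialog choices made so far correspond roughly to the graph bar command below. This is a sketch of the equivalent syntax rather than the exact command the dialog box writes to the Results window. The first over() is the inner grouping, so the bars for men and women appear side by side within each marital status.

```stata
* Mean of hrs1 for each sex within each marital status;
* the sex labels are rotated 45 degrees, as set under Properties
graph bar (mean) hrs1, over(sex, label(angle(45))) over(marital)
```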

It would be nice to have the actual mean values for hours worked at the top of each bar, so switch to the Bars tab, and click on the radio button for Label with bar height. If we stop here, the numerical labels will have too many decimal places to look nice. We can fix this by giving the numerical labels a fixed format. Click on the Properties button in the section labeled Bar labels, and type the value %9.1f in the Format box. The Bar label properties dialog box is shown in figure 6.5.
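In command form, the bar labels and their display format are controlled by the blabel() option. The sketch below adds it to a graph bar command with the same grouping as described in the text; %9.1f rounds each label to one decimal place.

```stata
* blabel(bar) prints the bar height on each bar;
* format(%9.1f) shows it with one decimal place
graph bar (mean) hrs1, over(sex, label(angle(45))) over(marital) ///
    blabel(bar, format(%9.1f))
```

The /// line continuation works in a do-file; in the Command window, type the command on a single line.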


Figure 6.5. The Bar label properties dialog box

This effort produces a nice bar chart, but there are some things we can still improve. The marital status labels are too wide and run into each other. The label of the hours worked for pay could also be better. Open the Graph Editor and change the font of the labels of marital status and gender to size small. Then add the title Hours Worked Last Week and the subtitle By Sex and Marital Status. Your final bar chart is shown in figure 6.6.

[Bar chart. Title: Hours Worked Last Week. Subtitle: By Sex and Marital Status. Vertical axis: mean of hrs1, 0 to 50. Bars for MALE and FEMALE within each of MARRIED, WIDOWED, DIVORCED, SEPARATED, and NEVER MARRIED. Bar labels, left to right: 46.5, 38.5, 37.2, 31.5, 44.6, 42.2, 41.6, 38.5, 42.6, 39.7.]

Figure 6.6. Bar graph summarizing a quantitative variable by categories of a categorical variable
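The Graph Editor changes can also be written as options, which makes the chart reproducible from a do-file. The command below is a sketch under that assumption, not the author's own syntax. The ytitle() text is an illustrative choice for a clearer axis label; the printed figure keeps the default label, mean of hrs1.

```stata
* Full chart in one command (run from a do-file because of ///)
graph bar (mean) hrs1,                              ///
    over(sex, label(angle(45) labsize(small)))      ///
    over(marital, label(labsize(small)))            ///
    blabel(bar, format(%9.1f))                      ///
    ytitle("Mean hours worked last week")           ///
    title("Hours Worked Last Week")                 ///
    subtitle("By Sex and Marital Status")
```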
