Global poverty and impact evaluation
Economics 174: Spring 2024
Problem Set 2

This assignment is due Sunday, March 17th by 11:59pm. Late submissions are not accepted. This problem set is worth a total of 100 points. You may work in groups of up to four people, but you must write up (and understand) your own answers. Do not copy and paste answers or parts of answers from one another. Identical answers will receive a zero.

Please submit your completed assignment by uploading it to bCourses. To receive full credit, your submission should include three components:

(1) Your answers. Clearly state your name, section, and the names of any other students with whom you worked. When answering questions, make sure that you address all components of each question thoroughly and in complete sentences.
(2) A “do-file”. The do-file you submit should be a final version that contains only the code necessary to produce the answers (and not everything you tried).
(3) A “log file”, which will include all of the output.

You can submit your code separately, or attach your code to the end of your problem set and submit everything in one file. Additional instructions and materials on how to use the outreg2 command in Stata are posted on bCourses in the “PS2” folder. All the datasets you need to complete this assignment are also posted in the same folder.

Question 1: Randomized Evaluations (48 points)

This question uses an adapted dataset based on Muralidharan, Singh, and Ganimian’s 2019 paper “Disrupting Education? Experimental Evidence on Technology-Aided Instruction in India.” The paper is available online and the replication data are available on ICPSR. Download the adapted dataset from bCourses.

This project evaluated the impact of a center-based, technology-aided after-school educational program on math and Hindi performance among middle schoolers living in low-income neighborhoods in urban India. The technology-based curriculum was designed to be high-quality, adaptive, and engaging.
Approximately 600 middle schoolers were recruited to participate in the study. Half of these recruited students were randomly allocated by lottery to receive a voucher to participate in the program (treatment group), and half were not (control group). For the purposes of this question, you can assume that 100% of those assigned to treatment participated in the program and that 0% of those assigned to control participated in the program.

Below is a list of the variables included in the dataset, with a brief description of each. Note that BL refers to versions of variables collected at baseline (before the program began), while EL refers to variables collected at endline (at the conclusion of the program).

Variable                              Description
student_id                            Identification numbers that uniquely identify students
student_age                           Age of the student (collected at baseline)
student_female                        Indicator (0/1) for whether the student is female
student_grade                         Grade of the student (collected at baseline)
treatment                             Indicator (0/1) for treatment status of the student
BL_math_percent, EL_math_percent      Math score in percent correct
BL_hindi_percent, EL_hindi_percent    Hindi score in percent correct
BL_ses_index, EL_ses_index            Household wealth index

(a) (8 pts) Import the data and generate a table of summary statistics. What is the range of ages and grade levels within this sample? What are average scores for math and Hindi at baseline? At endline?

(b) First, you’ll check whether the treatment and control groups are balanced in terms of each of the following variables: (1) age, (2) sex, (3) household wealth index at baseline, (4) math scores at baseline, and (5) Hindi scores at baseline. (Hint: You’ll run a separate regression for each of these five variables. Use outreg2 or another command of your choice to output the tables.)
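As a rough illustration of the hint above, the loop below shows one way to run the same regression for several outcomes and collect the results in a single outreg2 table. It is only a sketch under stated assumptions: the file names ps2_q1.dta and balance.doc are hypothetical placeholders, the variable names assume that the spaces in the variable list stand for underscores, and outreg2 is a user-written command that may first need to be installed with "ssc install outreg2".

```stata
* Sketch only: file names below are hypothetical placeholders.
* outreg2 is user-written; install it once with: ssc install outreg2
use "ps2_q1.dta", clear

* Assumed variable names (confirm with -describe- before running)
local outcomes student_age student_female BL_ses_index BL_math_percent BL_hindi_percent

* The first pass creates the table; later passes add one column each
local action replace
foreach y of local outcomes {
    regress `y' treatment
    outreg2 using "balance.doc", `action' ctitle(`y')
    local action append
}
```

Switching the local macro from replace to append after the first pass avoids overwriting the table on every iteration, so each outcome ends up in its own column.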
(i) (10 pts) First (before you code up any regressions), write down the regression you plan to run for at least one of the variables. Which parameter represents the coefficient of interest? What do you expect the estimate of this parameter to be (positive, negative, zero) and why?
(ii) (10 pts) Use the reg command to run the regressions in Stata. Are there significant differences between the treatment and control groups for any of these five variables? What is the purpose of this exercise, and what are you able to conclude?

(c) Next, you’ll estimate the impact of the treatment on math and Hindi scores at endline.
(i) (10 pts) First, write the two regressions you will run to estimate these treatment effects. In words, what will each of the parameters capture?
(ii) (10 pts) Run the two regressions using the reg command and use the outreg2 command to produce tables containing the results of these two regressions. Interpret your results. What is the effect of the treatment on math scores and on Hindi scores? Are these estimated treatment effects statistically significant? Explain.

Question 2: The PACE-A Program (52 points)

For this question, we will analyze the dataset used in the paper titled “Bringing Education to Afghan Girls: A Randomized Controlled Trial of Village-Based Schools”. Please download the Stata dataset “Pset2-Q2.dta” from bCourses. Once you open the dataset, if you issue the command “describe, full” (minus the quotes), you will see the variables contained in the dataset.
They include the following:

Variable Name               Variable Label
f07_heads_child_cnt         Indicator set to one if the child is the son or daughter of the head of the house
f07_girl_cnt                Indicator set to one if the child is female, fall 2007
f07_age_cnt                 Child’s age, fall 2007
f07_age_head_cnt            Age of head of the household, fall 2007
f07_yrs_ed_head_cnt         Years of education of the head of the household, fall 2007
f07_jeribs_cnt              Number of jeribs of land owned by household, fall 2007
f07_num_sheep_cnt           Number of sheep and goats owned by the household, fall 2007
f07_duration_village_cnt    Length of time family has lived in the village, fall 2007
f07_farsi_cnt               Family speaks Farsi, fall 2007
f07_tajik_cnt               Family speaks Tajik, fall 2007
f07_farmer_cnt              Family head is a farmer, fall 2007
f07_num_ppl_hh_cnt          Number of people in the household, fall 2007
f07_test_observed           Indicator set to one if child took test in fall 2007 survey
treatment                   Indicator set to one if village group assigned to treatment
clustercode                 Village group ID
f07_formal_school           Indicator set to one if the child is enrolled in a formal school, fall 2007
f07_nearest_scl             Distance (miles) to the nearest non-community based school, fall 2007
f07_both_norma_total        Total normalized test score, fall 2007
s08_both_norma_total        Total normalized test score, spring 2008

1. (16pts) Create a table with two estimated regressions using only the control group. When you run each regression, be sure to include the clustering option “, cluster(clustercode)”. For example: “regress f07_heads_child_cnt treatment, cluster(clustercode)”. The option “cluster(clustercode)” is needed to account for the fact that treatment was randomized at the village group level.
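A minimal sketch of how a control-group restriction and the clustering option fit together in one command is shown below. It rests on a few assumptions: the variable names use underscores in place of the spaces in the variable list, the output file name q2_part1.doc is a hypothetical placeholder, and the two regressors are chosen only to keep the example short, not as the full set of controls.

```stata
* Sketch only: assumes Pset2-Q2.dta is in the working directory
use "Pset2-Q2.dta", clear

* Two illustrative regressors only, not the full control list
local xvars f07_girl_cnt f07_nearest_scl

* "if treatment == 0" restricts the sample to control village groups;
* cluster(clustercode) allows errors to be correlated within a village group
regress f07_formal_school `xvars' if treatment == 0, cluster(clustercode)

* Hypothetical output file; outreg2 is installed via: ssc install outreg2
outreg2 using "q2_part1.doc", replace ctitle(Enrollment 2007)
```

Storing the regressors in a local macro keeps the list identical across regressions that share the same right-hand side.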
1) (4pts) Regress enrollment in 2007 (f07_formal_school) on the following child and household controls:
f07_heads_child_cnt f07_girl_cnt f07_age_cnt f07_duration_village_cnt f07_farsi_cnt f07_tajik_cnt f07_farmer_cnt f07_age_head_cnt f07_yrs_ed_head_cnt f07_num_ppl_hh_cnt f07_jeribs_cnt f07_num_sheep_cnt f07_nearest_scl
2) (4pts) Regress test scores in 2007 (f07_both_norma_total) on the following child and household controls:
f07_heads_child_cnt f07_girl_cnt f07_age_cnt f07_duration_village_cnt f07_farsi_cnt f07_tajik_cnt f07_farmer_cnt f07_age_head_cnt f07_yrs_ed_head_cnt f07_num_ppl_hh_cnt f07_jeribs_cnt f07_num_sheep_cnt f07_nearest_scl
(4pts) Interpret the coefficients on “f07_girl_cnt” and “f07_nearest_scl” for both regressions. For each coefficient, can we reject the hypothesis that it is equal to zero? At what level of confidence?
(4pts) What is the predicted value of test scores for children who live 10 miles away from a formal school? Does this prediction extend beyond the support of the data?

2. (24pts) Let’s estimate the impact of the program. Create the following table using outreg2 or another command for outputting regression tables:
(3pts) Column 1: Estimate the impact of the program on enrollment (f07_formal_school)
(3pts) Column 2: Estimate the impact of the program on enrollment with child and household controls (the ones listed in part 1)
(3pts) Column 3: Estimate the impact of the program on test scores in 2007 (f07_both_norma_total)
(3pts) Column 4: Estimate the impact of the program on test scores in 2007 with child and household controls (the ones listed in part 1)
(3pts) Column 5: Estimate the impact of the program on test scores in 2008 (s08_both_norma_total)
(3pts) Column 6: Estimate the impact of the program on test scores in 2008 with child and household controls (the ones listed in part 1)
(6pts) Interpret the coefficients on the treatment indicator in the regressions estimated in columns 1, 3, and 5.
Are these coefficients statistically significant? If so, at what level of confidence? How do the controls affect the point estimate of the treatment effect? How do the controls affect the standard error of the treatment effect? What counterfactual assumption must we make to interpret these coefficients as causal?

3. (12pts) Let’s estimate the effects of the program by gender. Create the following table:
(2pts) Column 1: Estimate the impact of the program on enrollment, just for girls
(2pts) Column 2: Estimate the impact of the program on enrollment, just for boys
(4pts) Column 3: Estimate the impact of the program on enrollment, but now include an interaction term between treatment and the girl indicator. That is, estimate the following regression:

enrollment = α0 + α1 girl + α2 treatment + α3 (girl × treatment) + ϵ

where girl is an indicator for whether the child is a girl; treatment is an indicator for treatment; and enrollment is an indicator for whether the child is enrolled.
(4pts) Interpret the coefficients α0, α1, α2, and α3. How do these coefficients compare to the coefficients we estimated in columns 1 and 2?