
%!TEX program = xelatex

\documentclass[hidelinks, 12pt, article, oneside]{memoir}

\input{preamble supplementary materials bjps.tex}

\externaldocument{manuscript}

\begin{document}

\appendix

\pagenumbering{arabic}% resets `page` counter to 1
\renewcommand*{\thepage}{A\arabic{page}}
\renewcommand\thefigure{A\arabic{figure}}
\renewcommand\thetable{A\arabic{table}}
\renewcommand\thesection{A\arabic{section}}
\renewcommand\theequation{A\arabic{equation}}
\setcounter{figure}{0}
\setcounter{table}{0}
\setcounter{equation}{0}
\setsecnumdepth{subsection}

\tableofcontents

\clearpage

\section{Robustness to Different Measures of Election Timing}\label{appendixsection:alternativespecifications}

A potential concern is that my results are sensitive to the functional form used to specify election timing.  In this appendix, I re-estimate the results from Figures \ref{fig:gextensiveakhmedov} to \ref{fig:yearlycycle} using the number of years to the next election instead of individual electoral cycle dummies.

\subsection{Using A Continuous Measure of Distance to Elections}

I run a regression using the number of years to the next election.  The formal equation takes the form:

\begin{equation}
    Y_{i,t} = \beta_{1}\text{Years to Next Election}_{i,t} + \beta_{2}Y_{i,t-1} + Z_{i,t} + \gamma_{i} + \zeta_{t} + \mu_{i,t,d},\label{equation:continuous}
\end{equation}

where Years to Next Election$_{i,t}$ is a continuous variable measuring the number of years until the next legislative election for school $i$ in year $t$, and the rest of the equation is as in Equation \ref{equation:businesscycle}.
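
To see how this specification relates to the dummy specification, the following sketch may help.  It assumes, purely for illustration, that the electoral cycle dummies enter the main specification as a set of indicators for each value $k$ of years to the next election, with coefficients $\delta_{k}$; the symbols $\delta_{k}$ and $k$ are notation introduced here and are not taken from the main text.  Under that assumption, the continuous measure restricts the cycle coefficients to lie on a line:

\[
    \delta_{k} - \delta_{k'} = \beta_{1}\,(k - k') \quad \text{for any two values } k \text{ and } k' \text{ of years to the next election}.
\]

Read this way, the continuous specification trades the flexibility of the dummies for a single slope, $\beta_{1}$, that summarizes how the outcome changes with each additional year of distance from the next election.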

I present results for government schools in Table \ref{appendixtable:yearstoelectiongovernment} and for private schools in Table \ref{appendixtable:yearstoelectionprivate}.  The results using a continuous measure of years to election are largely the same as in the other specifications, but it is important to note that the results for private schools are significant in all specifications that include lagged dependent variables (columns 1-4 and 6-9). The point estimates are between one-third and one-half smaller than those for government schools, suggesting that if there is any effect for private schools in this specification, it is smaller and weaker.

\SingleSpacing
\input{data/output/tables/anyabsencegovernmentyears.tex}
\DoubleSpacing

\SingleSpacing
\input{data/output/tables/anyabsenceprivateyears.tex}
\DoubleSpacing

\clearpage

\section{Calculating the Fiscal Cost of Absence}\label{appendixsection:fiscalcost}

In this section, I calculate the total amount of money lost to absenteeism as a function of teachers' wages, and the total amount of money recovered in the year before an election relative to the highest level of absenteeism in the electoral cycle.  Table \ref{appendixtable:fiscalcost} reports how much money is lost in an average year and how much is recovered in the year before an election.

\SingleSpacing
\begin{table}[htbp]
\begin{threeparttable}
\caption{Calculating the Fiscal Recovery of Reduced Absenteeism\label{appendixtable:fiscalcost}}
\begingroup
\begin{tabular}{lrr}
\toprule
                                        & Highest Wage Estimate                                 & Lowest Wage Estimate \\
\midrule
Average Monthly Wage (\rupee)           & \input{data/output/text/highwage.tex}                 & \input{data/output/text/lowwage.tex} \\
Average Daily Wage (\rupee)             & \input{data/output/text/dailyhigh.tex}                & \input{data/output/text/dailylow.tex} \\
Average Absence per School (Days)       & \multicolumn{2}{c}{\input{data/output/text/meanabsence.tex}}                      \\
Average Number of Teachers per Year     & \multicolumn{2}{c}{\input{data/output/text/teachers.tex}}                         \\
Wages Lost in Average Year (\rupee)     & \input{data/output/text/wagelosshigh.tex}             & \input{data/output/text/wagelosslow.tex} \\
Wages Lost in Average Year (\$)         & \input{data/output/text/wagelossdollarshigh.tex}      & \input{data/output/text/wagelossdollarslow.tex} \\
Reduction in Absenteeism per School (Days)  & \multicolumn{2}{c}{\input{data/output/text/absencereduction.tex}} \\
Wage Recovery in Year Before Election (\rupee) & \input{data/output/text/wagerecoveryhigh.tex}         & \input{data/output/text/wagerecoverylow.tex} \\
Wage Recovery in Year Before Election (\$)     & \input{data/output/text/wagerecoveryhighdollar.tex}   & \input{data/output/text/wagerecoverylowdollar.tex} \\
Share of Wages Lost Recovered (\%)      & \multicolumn{2}{c}{\input{data/output/text/sharebudget.tex}}                       \\ 
\bottomrule
\end{tabular}
\endgroup
\end{threeparttable}
\end{table}
\DoubleSpacing

All costs are in 2010 prices.  The high wage estimate is taken from \cite{Muralidharan2016} and the low wage estimate is taken from \cite{Kingdon2010}. The mean level of absence is the mean government school absence in the entire DISE dataset, and the average number of teachers per year is the mean number of teachers in the DISE dataset across all years.  The wages lost due to absenteeism in the average year in Rupees (\rupee) is the average daily wage multiplied by the mean number of absences and by the mean number of teachers.  The wages lost in US dollars (\$) divides this amount by the exchange rate in 2010.  The reduction in absenteeism is calculated by taking the difference in the point estimates from ``1 Year After Election'' and ``1 Year from Election'' in Column 4 in Table \ref{appendixtable:gintensiveakhmedov}.  The wage recovery multiplies the reduction in absenteeism by the average daily wage and the average number of teachers, while the share of wages lost recovered divides the wage recovery by the wages lost.
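
The calculation described above can be written compactly.  The symbols below are shorthand introduced here for illustration and are not used elsewhere in the paper: $w$ is the average daily wage, $\bar{a}$ the mean absence per school in days, $\bar{n}$ the average number of teachers, $\Delta a$ the reduction in absenteeism per school in days, and $e$ the 2010 exchange rate, assumed here to be expressed in rupees per dollar.

\begin{align*}
    L &= w \times \bar{a} \times \bar{n}, & L^{\text{USD}} &= L / e, \\
    R &= w \times \Delta a \times \bar{n}, & R^{\text{USD}} &= R / e, \\
    \frac{R}{L} &= \frac{\Delta a}{\bar{a}}.
\end{align*}

Here $L$ is the wages lost in an average year and $R$ is the wage recovery in the year before an election.  Because the daily wage and the number of teachers cancel in the ratio, the share of lost wages recovered depends only on $\Delta a / \bar{a}$, which is consistent with the table reporting a single share for both wage estimates.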

\clearpage

\section{Consistency of Absence Measures Between Data Sources}

I rely on two different measures of teacher absenteeism in the paper: the DISE School Report Cards, which are self-reported by head teachers in schools, and the Indian Human Development Survey (IHDS) from 2011-12, which is independently collected by the National Council of Applied Economic Research through unannounced visits to schools.

I compare these two measures to a third, independently collected source, the data from \cite{Kremer2005}.  In Figure \ref{appendixfigure:absencecomparison}, I provide state-level scatter plots of the levels of absenteeism.  There is a strong positive relationship between the measures in the DISE School Report Cards and the World Bank data from \cite{Kremer2005} (Panel A).  This relationship is stronger when the outlier state of West Bengal is removed from the DISE School Report Cards data.

\begin{figure}[htbp]
\caption{Consistency of Absence Measures Between Data Sources\label{appendixfigure:absencecomparison}}
\centering
\begin{minipage}{6.5in}
    \includegraphics[keepaspectratio = true]{data/output/figures/absencecomparison.pdf}
    \tiny \emph{Notes}: Panel A plots absence in DISE against absence in Table 2 of \cite{Kremer2005}.  Panel B plots absence in the 2011-12 round of the IHDS against absence in Table 2 of \cite{Kremer2005}.  Panel C plots absence in DISE against absence in the 2011-12 round of the IHDS. Each point represents the state-level average of the probability that an individual teacher was absent on the day of the school visit.  For the DISE SRC data, this represents the teacher-level absence per working day. 
\end{minipage}
\end{figure}

The relationship between the IHDS and the other two measures is weaker: it is negative in both cases (Panels B \& C), and becomes null when West Bengal is removed from the DISE School Report Cards data (Panel C).

\clearpage

\section{``Cooking the Books'' or Reduced Absenteeism?}\label{appendixsection:ihdsdata}

This paper uses rich administrative data to address an important political question.  While the use of administrative data is increasingly common \citep{Lindgren2016, Gulzar2017}, I do not take the quality of this data at face value. Instead, I verify the quality of the data by triangulating against independently collected sources \citep{Herrera2007}.  On top of the data quality concerns that apply to other sources of data, administrative data suffers from the additional concern that bureaucrats have an incentive to misreport in ways that make their performance look better \citep{Martinez2022}.  Verified against other sources of data, however, administrative data provides great potential as it allows us to answer big questions at scale, especially as the data gathering capacity of states improves \citep{Jerven2013, Jensenius2017}.

A concern with the self-reported nature of the DISE school report cards is that head teachers and district-level officers might be ``cooking the books'', or falsely reporting lower rates of absenteeism, around elections to make bureaucratic effort look greater with no underlying change in behavior.  The empirical evidence of ``cooking the books'' would be substantively similar -- lower absenteeism in election years -- although the mechanisms would be different -- perceived pressure to modify government data rather than hold providers accountable. Reported absenteeism of 12 percent in the DISE data and 14 percent in the IHDS data in Table \ref{table:summarystatistics} is lower than absenteeism in independent audits conducted in 2005 \citep{Chaudhury2006, Muralidharan2016}, suggesting either a secular decline in absenteeism or a decline in the \emph{reporting} of absenteeism, the latter of which would be suggestive evidence of ``cooking the books''.

To separate whether schools are ``cooking the books'' or there is electoral pressure on teachers, I use the IHDS school surveys, an independent survey of schools conducted in 2011-2012, contemporaneously with the DISE school report cards data collection, and test whether I find similar patterns in this data.  The IHDS collects a broad range of demographic and socioeconomic data, and also conducts a school survey of the largest private and government school in each village in which it surveys households.  Visits to schools are unannounced and unexpected by school staff, and replicate the randomized audits in \cite{Chaudhury2006}.

I use the second wave of the Indian Human Development Survey (IHDS) for data on teacher absenteeism and reasons for absenteeism.  The IHDS survey is a nationally representative survey of 1,503 villages and 971 urban neighborhoods across India \citep{Desai2015}. The 2011-2012 round was the second round of a panel survey that surveyed the same villages in 2004-2005.  The survey asked a small number of questions about the largest government and private school in each surveyed village, and the next two largest schools, irrespective of whether they were public or private. For every teacher in the school, the surveyors checked if the teacher was present on the day of the survey, and, if they were absent, whether they were absent on officially sanctioned government work.  In this sense, the survey replicates the data collection in \cite{Chaudhury2006}, randomly auditing schools on staff absenteeism.

In Table \ref{table:absenceschooltype}, I look at the probability that a teacher is absent from the school on the day of the survey, absent from the school on the day of the survey to conduct official work, or present at the school and present for the interview, as a function of whether their school is a government or private school.  As expected, teachers are four percentage points more likely to be absent from a government school (columns 1-2), and about two percentage points more likely to be absent for officially sanctioned duty (columns 3-4).  Given that government schools have fewer teachers (Panel C of Table \ref{table:summarystatistics}), teachers in government schools are, conditional on being at the school, also more likely to be present at the IHDS interview, as there are likely fewer teachers available to answer the survey and schools are smaller (columns 5-6).  These results confirm expectations about how teacher absenteeism differs between government and private schools, with government schools showing consistently higher levels of absenteeism.

\SingleSpacing

\input{data/output/tables/ihdsabsencegovernment.tex}

\DoubleSpacing

Taking the DISE and IHDS data together, the two sources allow me to triangulate between two otherwise imperfect data sources \citep{Herrera2007}, and draw broader conclusions on the drivers of absence in Indian public service.  While the IHDS data effectively provides a randomized audit of a select number of government and private schools in the country, designed to be representative of the country as a whole, the DISE data allows me to generalize the results to \emph{all} schools in India.  The consistency in the results between the two sources provides support that the electoral cycles I discover are real and not the product of data quality, manipulation by teachers or school officials, or a result of self-reporting.

A key question surrounding data quality is how self-reported data provided by organizations like DISE compare to independent evaluations of absence from random audits such as in \cite{Banerjee2006, Chaudhury2006}.  The levels of absence found in this paper are much lower than absence found by independent evaluations of service worker absenteeism from other papers in India.  Average levels of absence self-reported in the DISE dataset reach 13 percent for the \emph{year}, far lower than the 25 percent absence on any given day recorded in the random spot checks of \cite{Chaudhury2006}.  For education in India, the DISE data serves as the only comparable source of data available to the government and broader public, and is used by the former to assess the state of schools.  While the data is almost certainly biased downwards, it does have important implications for decision-making as this is the dataset used by policymakers.  Finding similar results in the IHDS data adds confidence that the self-reported data is not being systematically manipulated in election years and that we are seeing real decreases in absenteeism.  Across all specifications, results are comparable in direction between both sources of data.

This is a larger problem for any study in the social sciences that relies on administrative data.  In contexts of low capacity, low attention, or poor measurement, the quality of this data may deviate from common sense understandings or reality.  This paper provides one path forward: triangulating administrative data against high-quality data from other sources \citep{Herrera2007}.  The benefits of using administrative data are too great to ignore, but we should be cautious in how we employ it and in the conclusions we draw from it, ensuring that it is verified through other means.  Here, I verify results from an administrative dataset against a second source of data collected by a different organization for a different purpose, with different incentives in data collection.  This approach is available to other students of political science interested in employing large administrative datasets.

\subsection{Absenteeism in By-Elections}

Another concern about ``cooking the books'' is that politicians and teachers could have time to coordinate on either actions or the reporting of absenteeism before an election.  For example, politicians and district-level officers could coordinate ex-post, reporting lower levels of absenteeism for an election year after the election has taken place.

If this were the case, we would see similar effects in elections on the regular five-year cycle and in by-elections, which are held because a seat has been vacated, often because of the sudden death of the sitting MLA.  By-elections are held quickly, often with only months of campaigning, but the winner serves the remainder of the term until the next election.  I can leverage by-elections to see if politicians are ``cooking the books'' with education officials, as they could alter reported levels of absenteeism after the election.  Given the short campaign period and unexpected election timing, we should \emph{not} see an electoral cycle in absenteeism around by-elections if politicians are not cooking the books, but we should see one if they are cooking the books ex-post.

Below, I provide the same tables and figures I show in the main manuscript for all by-elections using data from DISE, together with the elections immediately preceding and following each by-election.  There is a much smaller number of by-elections and of schools in constituencies holding by-elections (see the footers to Tables \ref{appendixtable:gextensiveakhmedovbe}, \ref{appendixtable:gintensiveakhmedovbe}, \ref{appendixtable:pextensiveakhmedovbe}, and \ref{appendixtable:pintensiveakhmedovbe}). The standard errors for ``2 or more'' years to the next election are large, as by-elections often cut the time to the next election to one year or less.

\begin{figure}[htbp]
\caption{Absence in a School Year over the Electoral Cycle in Government Schools During By-Elections\label{fig:gbeextensiveakhmedov}}
\centering
\begin{minipage}{6.5in}
    \includegraphics[keepaspectratio = true]{data/output/figures/anyabsencecyclegovernmentbyelection.pdf}
    \tiny \emph{Notes:} In Panel A, the dependent variable is a dummy variable that takes the value of one if a school reports any teacher absenteeism in that year. In Panel B, the dependent variable is the log number of absences per teacher. The regression includes controls for the number of teachers in a school, a dummy for whether the school is in a rural area, and year and school fixed effects.  The lines represent 95\% confidence intervals with standard errors clustered at the constituency-year level.  There are \input{data/output/text/nobsgovernmentbe.tex} school-year observations and \input{data/output/text/nschoolsgovernmentbe.tex} total schools in Panel A, and \input{data/output/text/nobsgovernmentbelog.tex} school-year observations and \input{data/output/text/nschoolsgovernmentbelog.tex} total schools in Panel B. The election year mean is \input{data/output/text/electionyearmeandummycyclegovernmentbyelection.tex} in Panel A and \input{data/output/text/electionyearmeanlogcyclegovernmentbyelection.tex} in Panel B.  Panel A corresponds to Column 9 in Table \ref{appendixtable:gextensiveakhmedovbe} and Panel B corresponds to Column 9 in Table \ref{appendixtable:gintensiveakhmedovbe}.\\
    \emph{Data Source:} District Information System for Education (DISE) School Report Cards.
\end{minipage}
\end{figure}

If politicians were ``cooking the books'' ex-post, we would see similar effects for by-elections as we do in the main set of results using the full set of elections. Politicians would be able to coordinate with education officials at the state or national level to change data, which we see no evidence of here. At the same time, the constrained campaigning window in by-elections, which are often held within a couple of months of being declared, adds confidence in the results, as we should not expect politicians to be able to exert pressure over the entirety of a constituency within such a short time period.

\begin{figure}[htbp]
\caption{Absence in a School Year over the Electoral Cycle in Private Schools During By-Elections\label{fig:pbeextensiveakhmedov}}
\centering
\begin{minipage}{6.5in}
    \includegraphics[keepaspectratio = true]{data/output/figures/anyabsencecycleprivatebyelection.pdf}
    \tiny \emph{Notes:} In Panel A, the dependent variable is a dummy variable that takes the value of one if a school reports any teacher absenteeism in that year. In Panel B, the dependent variable is the log number of absences per teacher. The regression includes controls for the number of teachers in a school, a dummy for whether the school is in a rural area, and year and school fixed effects.  The lines represent 95\% confidence intervals with standard errors clustered at the constituency-year level.  There are \input{data/output/text/nobsprivatebe.tex} school-year observations and \input{data/output/text/nschoolsprivatebe.tex} total schools in Panel A, and \input{data/output/text/nobsprivatebelog.tex} school-year observations and \input{data/output/text/nschoolsprivatebelog.tex} total schools in Panel B. The election year mean is \input{data/output/text/electionyearmeandummycycleprivatebyelection.tex} in Panel A and \input{data/output/text/electionyearmeanlogcycleprivatebyelection.tex} in Panel B.  Panel A corresponds to Column 9 in Table \ref{appendixtable:pextensiveakhmedovbe} and Panel B corresponds to Column 9 in Table \ref{appendixtable:pintensiveakhmedovbe}.\\
    \emph{Data Source:} District Information System for Education (DISE) School Report Cards.
\end{minipage}
\end{figure}

\input{data/output/tables/anyabsencecyclegovernmentbyelection.tex}

\input{data/output/tables/logabsencecyclegovernmentbyelection.tex}

\input{data/output/tables/anyabsencecycleprivatebyelection.tex}

\input{data/output/tables/logabsencecycleprivatebyelection.tex}

\clearpage

\section{Matching Schools to Assembly Constituencies}\label{appendixsection:locationmatching}

As the DISE school report card data does not identify which Assembly Constituency a school is located in, I use geographic information on the school to place the school in an Assembly Constituency.  This matching proceeded in four steps:

\begin{enumerate}\itemsep -2pt
    \item Using the precise location of the school
    \item Using the location of the village in which the school is located
    \item Using data from \cite{Adukia2019b} (AAN) to cross-reference unmatched schools to geographic locations
    \item Using the postal pincode of the school to match the school to the Assembly Constituency
\end{enumerate}

I provide a further description of each step below, and a matching rate table in Table \ref{appendixtable:matchrate}.  The overall match rate for both public and private schools was \input{data/output/text/matchrate.tex}\%.

\SingleSpacing
\begin{table}[h]
\begin{threeparttable}
\scriptsize
\caption{Matching Rate by Matching Strategy\label{appendixtable:matchrate}}
\centering
\input{data/output/tables/matchratetable.tex}
\begin{tablenotes}
    \tiny \item \emph{Notes}: The Matching Strategy column identifies the strategy used to match schools to assembly constituencies.  The Remaining Unmatched columns report the number of schools left to match after all previous matching strategies.  The Number Matched columns report how many schools were matched using that particular matching strategy.  The Match Rate columns report the percentage of schools matched of all remaining schools left to match.  The Overall Match Rate columns report the percentage of schools matched by that strategy out of all the schools in the data set for government and private schools respectively.
\end{tablenotes}
\end{threeparttable}
\end{table}
\DoubleSpacing
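
As a sketch of how the columns in Table \ref{appendixtable:matchrate} relate to one another, let $N$ denote the total number of schools of a given type, $U_{s}$ the number of schools still unmatched before strategy $s$ is applied, and $M_{s}$ the number of schools matched by strategy $s$.  This notation is introduced here for illustration only, and the sketch assumes that every school is still unmatched before the first strategy is applied.  Following the table notes:

\begin{align*}
    U_{1} &= N, & U_{s+1} &= U_{s} - M_{s}, \\
    \text{Match Rate}_{s} &= 100 \times \frac{M_{s}}{U_{s}}, & \text{Overall Match Rate}_{s} &= 100 \times \frac{M_{s}}{N}.
\end{align*}

Under these definitions, the overall match rates of the four strategies sum to the percentage of schools matched by any strategy, whereas the strategy-specific match rates do not, because each is computed over a shrinking pool of unmatched schools.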

\subsection{School GIS}

The Government of India provides georeferenced information on many schools in India at \url{https://schoolgis.nic.in/}. I scraped the site and merged the locations with the DISE school report cards using the school code provided in each data set.  I then used a spatial join with Assembly Constituency shapefiles to identify the Assembly Constituency in which the school was located.

\subsection{Village Codes}

Next, the first nine digits of each school's school code identify the village in which the school is located.  For unmatched schools located in a village with a matched school, I coded the school as located in the same Assembly Constituency.

\subsection{\cite{Adukia2019b}}

\cite{Adukia2019b} provide a crosswalk between DISE village codes and Census of India village codes.  For any remaining unmatched schools, I use the nine-digit DISE village code to match schools to Census villages.  Then, I use village-level shapefiles to spatially join villages to Assembly Constituencies, and code schools as being in the Assembly Constituency in which their village is located.

\subsection{Postal Pincodes}

Finally, for the remaining unmatched schools, I use the postal pincode reported for each school in the school report cards data.  I geo-reference these pincodes using Google Maps and take the centroid of each pincode.  Using the latitude and longitude of the centroid, I place the pincode in an Assembly Constituency and code the school as being in that Assembly Constituency.

\subsection{Differences Between Matched and Unmatched Schools}

I then test for differences between matched and unmatched schools.  Figure \ref{appendixfig:matchingratetest} plots the differences in means. Given the large sample size and high match rate, most variables show statistically significant differences, although the differences are substantively small.  For example, rural schools are eight percent more likely to be matched, but given that \input{data/output/text/matchrate.tex}\% of schools are matched and of those, 86\% are rural, this means that 83\% of schools in the population are rural.  These differences are unlikely to lead to systematic bias in the results.

\begin{figure}[htbp]
\caption{Difference in Means Between Schools Matched to their Assembly Constituency and Unmatched Schools\label{appendixfig:matchingratetest}}
\centering
\begin{minipage}{6.5in}
    \includegraphics[keepaspectratio = true]{data/output/figures/matchingratetest.pdf}
    \tiny
    \emph{Notes:} Each point estimate is the difference in means, from a t-test, between schools I was able to place in an assembly constituency and unmatched schools.  Continuous variables are standardized by subtracting the mean value and dividing by two standard deviations, which puts them on a scale comparable to binary variables \citep{Gelman2008a}. The plot is ordered from largest to smallest value.
\end{minipage}
\end{figure}

\clearpage

\section{Systematic Measurement Error}

Given the lower levels of absenteeism reported in the DISE data relative to data collected independently by the World Bank in \cite{Chaudhury2006} and in the IHDS \citep{Desai2015}, the DISE data is likely to contain some level of measurement error. A greater concern for the identification strategy in this paper is that this measurement error and underreporting may be systematically related to the treatment, occurring only in pre-election years, when there are greater incentives for politicians and education administrators to underreport absenteeism.

Here, I explore how large the systematic measurement error would have to be to overturn the results in the paper. To do so, I conduct a placebo test in which I replace pre-election year absenteeism, and only pre-election year absenteeism, for every school within a randomly sampled set of constituencies.  I sample \{10, 20, 30, 40, 50, 60, 70, 80, 90, 100\} constituencies, drawing each sample size 200 times.  For every school within the sampled constituencies, I replace pre-election absenteeism with either 1 (for any absence reported), the median value observed in that school, or the maximum value observed in that school (Panels A, B, and C of Figure \ref{fig:placebotest} respectively). Figure \ref{fig:placebotest} reports the mean of the point estimates and 95\% confidence intervals across the 200 draws for each sample size.

\begin{figure}[htbp]
\caption{Placebo Test Increasing Pre-Election Absenteeism in Schools within an Increasing Number of Constituencies\label{fig:placebotest}}
\centering
\begin{minipage}{6.5in}
    \includegraphics[keepaspectratio = true]{data/output/figures/placebotest.pdf}
    \tiny \emph{Notes:} The placebo test increases the pre-election year absenteeism for all schools in a given number of constituencies. The y-axis reports the number of randomly sampled constituencies in which all the schools have their pre-election year absenteeism replaced, increasing from 10 to 100 constituencies. I estimate Equation \ref{equation:businesscycle} 200 times for each level of constituencies, randomly sampling the number of constituencies that have their absenteeism increased without replacement. I present the mean value of the pre-election year point estimate from these 200 estimations. The dependent variable in Panel A is a dummy that takes the value of 1 if the school reports any absenteeism in a year and 0 otherwise. For the randomly sampled number of constituencies I set the pre-election year dependent variable to 1 for all schools. The dependent variable in Panels B and C is the logged number of absences in a school-year. In Panel B, for the randomly sampled number of constituencies, I set the pre-election year dependent variable to the within-school median. In Panel C, for the randomly sampled number of constituencies, I set the pre-election year dependent variable to the within-school maximum.\\
    \emph{Data Source:} District Information System for Education.
\end{minipage}
\end{figure}

For example, if a school within a randomly sampled constituency reported absenteeism rates of \{12, 8, 6, \textcolor{red}{0}, 10, 9, 8, 10, \textcolor{red}{9}, 5, 7, 9\} (pre-election years in red), the resulting values in the placebo test would be:

\begin{enumerate} \itemsep -2pt
    \item[Panel A:] \{1, 1, 1, \textcolor{red}{1}, 1, 1, 1, 1, \textcolor{red}{1}, 1, 1, 1\}
    \item[Panel B:] \{12, 8, 6, \textcolor{red}{8.5}, 10, 9, 8, 10, \textcolor{red}{8.5}, 5, 7, 9\}
    \item[Panel C:] \{12, 8, 6, \textcolor{red}{12}, 10, 9, 8, 10, \textcolor{red}{12}, 5, 7, 9\}
\end{enumerate}
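
The replacement rule can also be written formally.  In the following sketch, $S$ denotes the set of randomly sampled constituencies, $c(i)$ the constituency of school $i$, and $P_{i}$ the set of pre-election years for school $i$; these symbols are introduced here for illustration and do not appear elsewhere in the paper.  The placebo outcome $\tilde{Y}_{i,t}$ is

\[
    \tilde{Y}_{i,t} =
    \begin{cases}
        r_{i} & \text{if } c(i) \in S \text{ and } t \in P_{i}, \\
        Y_{i,t} & \text{otherwise},
    \end{cases}
\]

where $r_{i} = 1$ in Panel A, $r_{i}$ is the median of the values observed in school $i$ in Panel B, and $r_{i}$ is the maximum of the values observed in school $i$ in Panel C.  In the example above, the sorted values are \{0, 5, 6, 7, 8, 8, 9, 9, 9, 10, 10, 12\}, so the within-school median is $(8 + 9)/2 = 8.5$ and the within-school maximum is $12$.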

Between 800 and 1,000 constituencies hold elections in any given year, so this represents between 1 and 12.5 percent of all constituencies in an election year. The strong assumption here is that if a politician is working with education officials to misreport absenteeism in the year before an election, they would misreport absenteeism for \emph{all} the schools in their constituency, and that the observed value was misreported from the highest possible value ever seen in that school.\footnote{This is a similar bounding exercise to Manski bounds \citep{Horowitz2000a}.}  As constituencies hold an average of three elections in the data, I am setting the pre-election absenteeism rate for three sets of elections in these models. Panels A and C represent hard tests of the measurement error, assuming that \emph{all} the schools within the randomly sampled constituencies misreport absenteeism from the maximum possible value observed in the data, while Panel B presents a bound based on the median level reported in the school.

In Panels A and C, between 20 and 30 constituencies (between 2 and 4 percent of the data) have to misreport absenteeism, and the misreported absenteeism has to be at the maximum level observed in a school, for the results to become insignificant.  In Panel B, no number of sampled constituencies overturns the results. This is a strict test and suggests that the level of measurement error would have to be high to overturn the results.

\clearpage

\section{Full Results Tables}

This section provides the full results tables for the figures presented in Figures \ref{fig:gextensiveakhmedov} to \ref{figure:testscorecycle}.  The results for Panel A of Figure \ref{fig:gextensiveakhmedov} are presented in Column 9 of Table \ref{appendixtable:gextensiveakhmedov}.  The results for Panel B of Figure \ref{fig:gextensiveakhmedov} are presented in Column 9 of Table \ref{appendixtable:gintensiveakhmedov}.  Irrespective of the specification, we still see evidence of an electoral cycle in government schools, with point estimates ranging from a 2.3 to a 3.8 percentage point reduction in absenteeism in the years before elections.  The difference between the point estimate for one year before the election and one year after the election is also significant across all specifications.

\SingleSpacing

\input{data/output/tables/anyabsencecyclegovernment.tex}

\input{data/output/tables/logabsencecyclegovernment.tex}
\DoubleSpacing

The results for Panel A of Figure \ref{fig:pextensiveakhmedov} are presented in Column 9 of Table \ref{appendixtable:pextensiveakhmedov}. The results for Panel B of Figure \ref{fig:pextensiveakhmedov} are presented in Column 4 of Table \ref{appendixtable:pintensiveakhmedov}.  Columns 1 to 4 and 6 to 9 provide robustness checks to modeling choices with and without year and school fixed effects, and Columns 5 and 10 run the analysis without the lagged dependent variable.  All specifications include a dummy for rural schools, and Columns 6 to 10 include the number of teachers in a school as a measure of the size of the school.  Columns 1 to 5 do not control for the number of teachers in the school in case politicians also manipulate the number of teachers in a school around an electoral cycle.  The results are substantively similar in all specifications.  Like the results in the main body of the paper, we do not see evidence of an electoral cycle in private schools.

\SingleSpacing
\input{data/output/tables/anyabsencecycleprivate.tex}

\input{data/output/tables/logabsencecycleprivate.tex}
\DoubleSpacing

Table \ref{table:ihdsabsencecyclegovernment} presents the analysis from Panel A of Figure \ref{fig:yearlycycle} in Column 1, and Panel B of Figure \ref{fig:yearlycycle} in Column 2.

\SingleSpacing
\input{data/output/tables/ihdsabsencecyclegovernment.tex}
\DoubleSpacing

Table \ref{table:ihdsabsencecycleprivate} presents the analysis from Panel A of Figure \ref{fig:yearlycycleprivate} in Column 1, and Panel B of Figure \ref{fig:yearlycycleprivate} in Column 2.

\SingleSpacing
\input{data/output/tables/ihdsabsencecycleprivate.tex}
\DoubleSpacing

Table \ref{table:testscorecycle} presents the full results, in table form, for Figure \ref{figure:testscorecycle}.

\SingleSpacing
\input{data/output/tables/ihdstestscores.tex}
\DoubleSpacing

\clearpage

\subsection*{Full Results Tables for Administrative Visits}

In Tables \ref{table:administrative} and \ref{table:smc}, I present the full set of results for the effects of the electoral cycle on administrative visits (Table \ref{table:administrative}) and SMC meetings (Table \ref{table:smc}).  Column 9 in both tables represents the results presented in Panels A and B of Figure \ref{fig:alternativechannels} respectively.

\SingleSpacing
\input{data/output/tables/administrative.tex}

\input{data/output/tables/smc.tex}
\DoubleSpacing

\clearpage

\renewcommand\bibname{Supplementary Materials References}
\bibliography{absence}
\bibliographystyle{apsr}

\end{document}