*** empty log message ***

author N. Brouard <brouard@ined.fr>

Wed, 16 Jun 2004 12:05:30 +0000 (12:05 +0000)

committer N. Brouard <brouard@ined.fr>

Wed, 16 Jun 2004 12:05:30 +0000 (12:05 +0000)
diff --git a/html/doc/imach.htm b/html/doc/imach.htm

new file mode 100644 (file)

index 0000000..f11e984
--- /dev/null
+++ b/html/doc/imach.htm
@@ -0,0 +1,1311 @@
+<!-- $Id$ -->\r
+<html>\r
+\r
+<head>\r
+<meta http-equiv="Content-Type"\r
+content="text/html; charset=iso-8859-1">\r
+<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">\r
+<title>Computing Health Expectancies using IMaCh</title>\r
+<!-- Changed by: Agnes Lievre, 12-Oct-2000 -->\r
+<html>\r
+\r
+<head>\r
+<meta http-equiv="Content-Type"\r
+content="text/html; charset=iso-8859-1">\r
+<title>IMaCh</title>\r
+</head>\r
+\r
+<body bgcolor="#FFFFFF">\r
+\r
+<hr size="3" color="#EC5E5E">\r
+\r
+<h1 align="center"><font color="#00006A">Computing Health\r
+Expectancies using IMaCh</font></h1>\r
+\r
+<h1 align="center"><font color="#00006A" size="5">(a Maximum\r
+Likelihood Computer Program using Interpolation of Markov Chains)</font></h1>\r
+\r
+<p align="center">&nbsp;</p>\r
+\r
+<p align="center"><a href="http://www.ined.fr/"><img\r
+src="logo-ined.gif" border="0" width="151" height="76"></a><img\r
+src="euroreves2.gif" width="151" height="75"></p>\r
+\r
+<h3 align="center"><a href="http://www.ined.fr/"><font\r
+color="#00006A">INED</font></a><font color="#00006A"> and </font><a\r
+href="http://euroreves.ined.fr"><font color="#00006A">EUROREVES</font></a></h3>\r
+\r
+<p align="center"><font color="#00006A" size="4"><strong>Version\r
+0.8a, May 2002</strong></font></p>\r
+\r
+<hr size="3" color="#EC5E5E">\r
+\r
+<p align="center"><font color="#00006A"><strong>Authors of the\r
+program: </strong></font><a href="http://sauvy.ined.fr/brouard"><font\r
+color="#00006A"><strong>Nicolas Brouard</strong></font></a><font\r
+color="#00006A"><strong>, senior researcher at the </strong></font><a\r
+href="http://www.ined.fr"><font color="#00006A"><strong>Institut\r
+National d'Etudes Démographiques</strong></font></a><font\r
+color="#00006A"><strong> (INED, Paris) in the &quot;Mortality,\r
+Health and Epidemiology&quot; Research Unit </strong></font></p>\r
+\r
+<p align="center"><font color="#00006A"><strong>and Agnès\r
+Lièvre<br clear="left">\r
+</strong></font></p>\r
+\r
+<h4><font color="#00006A">Contribution to the mathematics: C. R.\r
+Heathcote </font><font color="#00006A" size="2">(Australian\r
+National University, Canberra).</font></h4>\r
+\r
+<h4><font color="#00006A">Contact: Agnès Lièvre (</font><a\r
+href="mailto:lievre@ined.fr"><font color="#00006A"><i>lievre@ined.fr</i></font></a><font\r
+color="#00006A">) </font></h4>\r
+\r
+<hr>\r
+\r
+<ul>\r
+    <li><a href="#intro">Introduction</a> </li>\r
+    <li><a href="#data">On what kind of data can it be used?</a></li>\r
+    <li><a href="#datafile">The data file</a> </li>\r
+    <li><a href="#biaspar">The parameter file</a> </li>\r
+    <li><a href="#running">Running Imach</a> </li>\r
+    <li><a href="#output">Output files and graphs</a> </li>\r
+    <li><a href="#example">Example</a> </li>\r
+</ul>\r
+\r
+<hr>\r
+\r
+<h2><a name="intro"><font color="#00006A">Introduction</font></a></h2>\r
+\r
+<p>This program computes <b>Healthy Life Expectancies</b> from <b>cross-longitudinal\r
+data</b> using the methodology pioneered by Laditka and Wolf (1).\r
+Within the family of Health Expectancies (HE), Disability-free\r
+life expectancy (DFLE) is probably the most important index to\r
+monitor. In low mortality countries, there is a fear that when\r
+mortality declines, the increase in DFLE is not proportionate to\r
+the increase in total Life expectancy. This case is called the <em>Expansion\r
+of morbidity</em>. Most of the data collected today, in\r
+particular by the international <a href="http://www.reves.org">REVES</a>\r
+network on Health expectancy, and most HE indices based on these\r
+data, are <em>cross-sectional</em>. It means that the information\r
+collected comes from a single cross-sectional survey: people of\r
+various ages (but mostly old people) are surveyed on their health\r
+status at a single date. The proportion of people disabled at each\r
+age can then be measured at that date. This age-specific\r
+prevalence curve is then used to distinguish, within the\r
+stationary population (which, by definition, is the life table\r
+estimated from the vital statistics on mortality at the same\r
+date), the disabled population from the disability-free\r
+population. Life expectancy (LE) (or total population divided by\r
+the yearly number of births or deaths of this stationary\r
+population) is then decomposed into DFLE and DLE. This method of\r
+computing HE is usually called the Sullivan method (from the name\r
+of the author who first described it).</p>\r
+\r
+<p>Age-specific proportions of disabled people are very difficult\r
+to forecast because each proportion corresponds to the historical\r
+conditions of the cohort: it is the result of the past flows into\r
+and out of disability up to today. The age-specific intensities\r
+(or incidence rates) of entering disability or recovering good\r
+health reflect current conditions and can therefore be used at\r
+each age to forecast the future of the cohort. For example, if a\r
+country is improving its prosthesis technology, the incidence of\r
+recovering the ability to walk will be higher at each (old) age,\r
+but the prevalence of disability will only slightly reflect the\r
+improvement, because the prevalence is mostly affected by the\r
+history of the cohort and not by recent period effects. To measure\r
+the period improvement we have to simulate the future of a cohort\r
+of new-borns entering or leaving the disability state, or dying,\r
+at each age according to the incidence rates measured today on\r
+different cohorts. The proportion of people disabled at each age\r
+in this simulated cohort will be much lower (using the example of\r
+an improvement) than the proportions observed at each age in a\r
+cross-sectional survey. This new prevalence curve introduced in a\r
+life table will give a much more current and realistic HE level\r
+than the Sullivan method, which mostly measures the history of\r
+health conditions in the country.</p>\r
+\r
+<p>Therefore, the main question is how to measure incidence rates\r
+from cross-longitudinal surveys. This is the goal of the IMaCh\r
+program. From your data and using IMaCh you can estimate period\r
+HE and not only Sullivan's HE. The standard errors of the HE are\r
+also computed.</p>\r
+\r
+<p>A cross-longitudinal survey consists of a first survey\r
+(&quot;cross&quot;) where individuals of different ages are\r
+interviewed on their health status or degree of disability. At\r
+least a second wave of interviews (&quot;longitudinal&quot;)\r
+should measure each individual's new health status. Health\r
+expectancies are computed from the transitions observed between\r
+waves and are computed for each degree of severity of disability\r
+(number of life states). The more degrees you consider, the more\r
+time is necessary to reach the Maximum Likelihood of the\r
+parameters involved in the model. Considering only two states of\r
+disability (disabled and healthy) is generally enough, but the\r
+computer program also works with more health statuses.<br>\r
+<br>\r
+The simplest model is the multinomial logistic model where <i>pij</i>\r
+is the probability to be observed in state <i>j</i> at the second\r
+wave conditional on being observed in state <em>i</em> at the first\r
+wave. Therefore a simple model is: log<em>(pij/pii)= aij +\r
+bij*age+ cij*sex,</em> where '<i>age</i>' is age and '<i>sex</i>'\r
+is a covariate. The advantage claimed by this computer program is\r
+that if the delay between waves is not identical for each\r
+individual, or if some individuals missed an interview, the\r
+information is not rounded or lost, but taken into account using\r
+an interpolation or extrapolation. <i>hPijx</i> is the\r
+probability to be observed in state <i>j</i> at age <i>x+h</i>\r
+conditional on being observed in state <i>i</i> at age <i>x</i>. The\r
+delay '<i>h</i>' can be split into an exact number (<i>nh*stepm</i>)\r
+of unobserved intermediate states. This elementary transition (by\r
+month, quarter, semester or year) is modeled as a\r
+multinomial logistic. The <i>hPx</i> matrix is simply the matrix\r
+product of <i>nh*stepm</i> elementary matrices and the\r
+contribution of each individual to the likelihood is simply <i>hPijx</i>.\r
+<br>\r
+</p>\r
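+\r
+<p>As an illustration of the interpolation idea, the Python sketch below\r
+builds one elementary transition matrix from the multinomial logistic\r
+model and chains such matrices over a delay. It is not taken from the\r
+IMaCh source: the function names are hypothetical, age is assumed to be\r
+expressed in years, and the coefficients are the monthly estimates quoted\r
+later in this manual.</p>\r

```python
import math

# Monthly coefficients (a_ij, b_ij) of log(p_ij / p_ii) = a_ij + b_ij * age,
# copied from the monthly estimates quoted later in this manual.
PARAMS = {
    (1, 2): (-15.029768, 0.124347),
    (1, 3): (-8.472981, 0.036599),
    (2, 1): (-1.472527, -0.038394),
    (2, 3): (-6.553602, 0.029856),
}


def elementary_matrix(age, params=PARAMS, nlstate=2, ndeath=1):
    """Transition matrix over one elementary step starting at `age` (years)."""
    size = nlstate + ndeath
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(1, size + 1):
        if i > nlstate:
            matrix[i - 1][i - 1] = 1.0  # absorbing state: nobody leaves it
            continue
        # Odds p_ij / p_ii for every destination j other than i.
        odds = {j: math.exp(a + b * age)
                for (start, j), (a, b) in params.items() if start == i}
        p_stay = 1.0 / (1.0 + sum(odds.values()))
        matrix[i - 1][i - 1] = p_stay
        for j, value in odds.items():
            matrix[i - 1][j - 1] = value * p_stay
    return matrix


def multiply(left, right):
    """Plain matrix product of two square matrices given as lists of rows."""
    size = len(left)
    return [[sum(left[r][k] * right[k][c] for k in range(size))
             for c in range(size)] for r in range(size)]


def transition_over(age, months, stepm=1):
    """Chain the elementary matrices covering `months` months from `age`."""
    size = len(elementary_matrix(age))
    result = [[1.0 if r == c else 0.0 for c in range(size)]
              for r in range(size)]
    for step in range(months // stepm):
        result = multiply(result, elementary_matrix(age + step * stepm / 12.0))
    return result


# Row 0: a person healthy at 70 is healthy, disabled or dead two years later.
two_years = transition_over(70.0, 24)
print(two_years[0])
```

+<p>Because the product is rebuilt for each individual delay, two people\r
+interviewed 23 and 25 months apart simply get 23 and 25 elementary steps\r
+in this sketch; no rounding to a common interval is needed.</p>\r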
+\r
+<p>The program presented in this manual is a quite general\r
+program named <strong>IMaCh</strong> (for <strong>I</strong>nterpolated\r
+<strong>MA</strong>rkov <strong>CH</strong>ain), designed to\r
+analyse transition data from longitudinal surveys. The first step\r
+is the estimation of the parameters of a model of transition\r
+probabilities between an initial status and a final status. From\r
+there, the computer program produces indicators such as observed\r
+and stationary prevalence, life expectancies and their variances,\r
+and graphs. Our transition model consists of absorbing and\r
+non-absorbing states with the possibility of return across the\r
+non-absorbing states. The main advantage of this package,\r
+compared to other programs for the analysis of transition data\r
+(for example Proc Catmod of SAS<sup>®</sup>), is that the whole\r
+individual information is used even if an interview is missing, a\r
+status or a date is unknown, or the delay between waves is\r
+not identical for each individual. The program can be run\r
+according to several parameters: selection of a sub-sample, number\r
+of absorbing and non-absorbing states, number of waves taken into\r
+account (the user inputs the first and the last interview), a\r
+tolerance level for the maximization function, the periodicity of\r
+the transitions (we can compute annual, quarterly or monthly\r
+transitions), and covariates in the model. It works on Windows or\r
+on Unix.<br>\r
+</p>\r
+\r
+<hr>\r
+\r
+<p>(1) Laditka, Sarah B. and Wolf, Douglas A. (1998), &quot;New\r
+Methods for Analyzing Active Life Expectancy&quot;. <i>Journal of\r
+Aging and Health</i>. Vol 10, No. 2. </p>\r
+\r
+<hr>\r
+\r
+<h2><a name="data"><font color="#00006A">On what kind of data can\r
+it be used?</font></a></h2>\r
+\r
+<p>The minimum data required for a transition model is the\r
+recording of a set of individuals interviewed at a first date and\r
+interviewed again at least one more time. From the\r
+observations of an individual, we obtain a follow-up over time of\r
+the occurrence of a specific event. In this documentation, the\r
+event is related to health status at older ages, but the program\r
+can be applied to many longitudinal studies in different\r
+contexts. To build the data file explained in the next section,\r
+you must have the month and year of each interview and the\r
+corresponding health status. In order to get the age, the date of\r
+birth (month and year) is also required (missing values are\r
+allowed for the month). The date of death (month and year) is\r
+important information, also required if the individual is dead.\r
+Shorter steps (e.g. a month) will more closely take into account\r
+the survival time after the last interview.</p>\r
+\r
+<hr>\r
+\r
+<h2><a name="datafile"><font color="#00006A">The data file</font></a></h2>\r
+\r
+<p>In this example, 8,000 people have been interviewed in a\r
+cross-longitudinal survey of 4 waves (1984, 1986, 1988, 1990).\r
+Some people missed 1, 2 or 3 interviews. Health statuses are\r
+healthy (1) and disabled (2). The survey is not a real one. It is\r
+a simulation of the American Longitudinal Survey on Aging. An\r
+individual is in the disability state if he or she failed one of\r
+four ADLs (Activities of Daily Living, like bathing, eating,\r
+walking). Therefore, even if the individuals interviewed in the\r
+sample are virtual, the information brought by this sample is\r
+close to the situation of the United States. Sex is not recorded\r
+in this sample.</p>\r
+\r
+<p>Each line of the data set (named <a href="data1.txt">data1.txt</a>\r
+in this first example) is an individual record whose fields are: </p>\r
+\r
+<ul>\r
+    <li><b>Index number</b>: positive number (field 1) </li>\r
+    <li><b>First covariate</b>: positive number (field 2) </li>\r
+    <li><b>Second covariate</b>: positive number (field 3) </li>\r
+    <li><a name="Weight"><b>Weight</b></a>: positive number\r
+        (field 4) . In most surveys individuals are weighted\r
+        according to the stratification of the sample.</li>\r
+    <li><b>Date of birth</b>: coded as mm/yyyy. Missing dates are\r
+        coded as 99/9999 (field 5) </li>\r
+    <li><b>Date of death</b>: coded as mm/yyyy. Missing dates are\r
+        coded as 99/9999 (field 6) </li>\r
+    <li><b>Date of first interview</b>: coded as mm/yyyy. Missing\r
+        dates are coded as 99/9999 (field 7) </li>\r
+    <li><b>Status at first interview</b>: positive number.\r
+        Missing values are coded -1. (field 8) </li>\r
+    <li><b>Date of second interview</b>: coded as mm/yyyy.\r
+        Missing dates are coded as 99/9999 (field 9) </li>\r
+    <li><strong>Status at second interview</strong>: positive\r
+        number. Missing values are coded -1. (field 10) </li>\r
+    <li><b>Date of third interview</b>: coded as mm/yyyy. Missing\r
+        dates are coded as 99/9999 (field 11) </li>\r
+    <li><strong>Status at third interview</strong>: positive\r
+        number. Missing values are coded -1. (field 12) </li>\r
+    <li><b>Date of fourth interview</b>: coded as mm/yyyy.\r
+        Missing dates are coded as 99/9999 (field 13) </li>\r
+    <li><strong>Status at fourth interview</strong>: positive\r
+        number. Missing values are coded -1. (field 14) </li>\r
+    <li>etc</li>\r
+</ul>\r
+\r
+<p>&nbsp;</p>\r
+\r
+<p>If your longitudinal survey does not include information about\r
+weights or covariates, you must fill the column with a number\r
+(e.g. 1) because a missing field is not allowed.</p>\r
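+\r
+<p>The record layout listed above can be read with a few lines of Python.\r
+This is only a sketch under assumptions: fields are taken to be separated\r
+by blanks, the sample line is invented for illustration and is not copied\r
+from data1.txt, and the helper names are hypothetical.</p>\r

```python
def parse_month_year(text):
    """Split 'mm/yyyy'; the missing codes 99 and 9999 become None."""
    month, year = (int(part) for part in text.split("/"))
    return (None if month == 99 else month, None if year == 9999 else year)


def parse_record(line, ncovcol=2):
    """Turn one blank-separated record into a dictionary of named fields."""
    fields = line.split()
    record = {
        "id": int(fields[0]),
        "covariates": [float(value) for value in fields[1:1 + ncovcol]],
        "weight": float(fields[1 + ncovcol]),
        "birth": parse_month_year(fields[2 + ncovcol]),
        "death": parse_month_year(fields[3 + ncovcol]),
        "interviews": [],
    }
    # The remaining fields come in (date, status) pairs, one pair per wave.
    rest = fields[4 + ncovcol:]
    for date_text, status_text in zip(rest[0::2], rest[1::2]):
        status = int(status_text)
        record["interviews"].append(
            (parse_month_year(date_text), None if status == -1 else status))
    return record


# Invented record: still alive, third interview missed.
sample = "1 0 1 1.0 6/1910 99/9999 5/1984 1 6/1986 2 99/9999 -1 4/1990 2"
record = parse_record(sample)
print(record["birth"], record["interviews"])
```

+<p>Keeping the missing codes as explicit None values makes it easy to\r
+check a file before a run, for instance to count how many interviews\r
+were missed.</p>\r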
+\r
+<hr>\r
+\r
+<h2><font color="#00006A">Your first example parameter file</font><a\r
+href="http://euroreves.ined.fr/imach"></a><a name="uio"></a></h2>\r
+\r
+<h2><a name="biaspar"></a>#Imach version 0.8a, May 2002,\r
+INED-EUROREVES </h2>\r
+\r
+<p>This is a comment. Comments start with a '#'.</p>\r
+\r
+<h4><font color="#FF0000">First uncommented line</font></h4>\r
+\r
+<pre>title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4</pre>\r
+\r
+<ul>\r
+    <li><b>title=</b> 1st_example is the title of the run. </li>\r
+    <li><b>datafile=</b> data1.txt is the name of the data set.\r
+        Our example is a six-year follow-up survey. It consists\r
+        of a baseline followed by 3 reinterviews. </li>\r
+    <li><b>lastobs=</b> 8600 the program is able to run on a\r
+        subsample where the last observation number is lastobs.\r
+        It can be set to a number larger than the real number of\r
+        observations (e.g. 100000). In this example, maximisation\r
+        will be done on the first 8600 records. </li>\r
+    <li><b>firstpass=1</b> , <b>lastpass=4 </b>In case of more\r
+        than two interviews in the survey, the program can be run\r
+        on selected transition periods. firstpass=1 means the\r
+        first interview included in the calculation is the\r
+        baseline survey. lastpass=4 means that the information\r
+        brought by the 4th interview is taken into account.</li>\r
+</ul>\r
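+\r
+<p>For checking a parameter file before a run, a line of this kind can be\r
+split into its key=value pairs. The Python helper below is a hypothetical\r
+convenience and not the parser used by IMaCh; it assumes that pairs are\r
+separated by blanks and that values contain no blanks.</p>\r

```python
def parse_parameter_line(line):
    """Return the key=value pairs of one parameter line as a dictionary.

    Comment lines (starting with '#') and blank lines give an empty result.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return {}
    return dict(token.split("=", 1) for token in stripped.split())


first_line = ("title=1st_example datafile=data1.txt lastobs=8600 "
              "firstpass=1 lastpass=4")
settings = parse_parameter_line(first_line)
print(settings["datafile"], int(settings["lastobs"]))
```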
+\r
+<p>&nbsp;</p>\r
+\r
+<h4><a name="biaspar-2"><font color="#FF0000">Second uncommented\r
+line</font></a></h4>\r
+\r
+<pre>ftol=1.e-08 stepm=1 ncovcol=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0</pre>\r
+\r
+<ul>\r
+    <li><b>ftol=1e-8</b> Convergence tolerance on the function\r
+        value in the maximisation of the likelihood. Choosing a\r
+        suitable value for ftol is difficult. 1e-8 is a suitable\r
+        value for a 32-bit computer.</li>\r
+    <li><b>stepm=1</b> Time unit in months for interpolation.\r
+        Examples:<ul>\r
+            <li>If stepm=1, the unit is a month </li>\r
+            <li>If stepm=3, the unit is a quarter (trimester)</li>\r
+            <li>If stepm=12, the unit is a year </li>\r
+            <li>If stepm=24, the unit is two years</li>\r
+            <li>... </li>\r
+        </ul>\r
+    </li>\r
+    <li><b>ncovcol=2</b> Number of covariate columns in the\r
+        datafile which precede the date of birth. Here you can\r
+        put variables that will not necessarily be used during the\r
+        run. It is not the number of covariates that will be\r
+        specified by the model. The 'model' syntax describes the\r
+        covariates to take into account. </li>\r
+    <li><b>nlstate=2</b> Number of non-absorbing (alive) states.\r
+        Here we have two alive states: disability-free is coded 1\r
+        and disability is coded 2. </li>\r
+    <li><b>ndeath=1</b> Number of absorbing states. The absorbing\r
+        state death is coded 3. </li>\r
+    <li><b>maxwav=4</b> Number of waves in the datafile.</li>\r
+    <li><a name="mle"><b>mle</b></a><b>=1</b> Option for the\r
+        Maximum Likelihood Estimation. <ul>\r
+            <li>If mle=1 the program does the maximisation and\r
+                the calculation of health expectancies </li>\r
+            <li>If mle=0 the program only does the calculation of\r
+                the health expectancies. </li>\r
+        </ul>\r
+    </li>\r
+    <li><b>weight=0</b> Possibility to add weights. <ul>\r
+            <li>If weight=0 no weights are included </li>\r
+            <li>If weight=1 the maximisation integrates the\r
+                weights which are in field <a href="#Weight">4</a></li>\r
+        </ul>\r
+    </li>\r
+</ul>\r
+\r
+<h4><font color="#FF0000">Covariates</font></h4>\r
+\r
+<p>Intercept and age are systematically included in the model.\r
+Additional covariates can be included with the command: </p>\r
+\r
+<pre>model=<em>list of covariates</em></pre>\r
+\r
+<ul>\r
+    <li>if<strong> model=. </strong>then no covariates are\r
+        included</li>\r
+    <li>if <strong>model=V1</strong> the model includes the first\r
+        covariate (field 2)</li>\r
+    <li>if <strong>model=V2 </strong>the model includes the\r
+        second covariate (field 3)</li>\r
+    <li>if <strong>model=V1+V2 </strong>the model includes the\r
+        first and the second covariate (fields 2 and 3)</li>\r
+    <li>if <strong>model=V1*V2 </strong>the model includes the\r
+        product of the first and the second covariate (fields 2\r
+        and 3)</li>\r
+    <li>if <strong>model=V1+V1*age</strong> the model includes\r
+        the product covariate*age</li>\r
+</ul>\r
+\r
+<p>In this example, we have two covariates in the data file\r
+(fields 2 and 3). The number of covariates included in the data\r
+file between the id and the date of birth is ncovcol=2 (it was\r
+named ncov in versions prior to 0.8). If you have 3 covariates in\r
+the datafile (fields 2, 3 and 4), you will set ncovcol=3. Then\r
+you can run the programme with a new parametrisation taking into\r
+account the third covariate. For example, <strong>model=V1+V3 </strong>estimates\r
+a model with the first and third covariates. More complicated\r
+models can be used, but they will take more time to converge. With\r
+a simple model (no covariates), the programme estimates 8\r
+parameters. Adding covariates increases the number of parameters:\r
+12 for <strong>model=V1, </strong>16 for <strong>model=V1+V1*age\r
+</strong>and 20 for <strong>model=V1+V2+V3.</strong></p>\r
+\r
+<h4><font color="#FF0000">Guess values for optimization</font><font\r
+color="#00006A"> </font></h4>\r
+\r
+<p>You must write the initial guess values of the parameters for\r
+optimization. The number of parameters, <em>N</em>, depends on the\r
+number of absorbing states and non-absorbing states and on the\r
+number of covariates. <br>\r
+<em>N</em> is given by the formula <em>N</em>=(<em>nlstate</em> +\r
+<em>ndeath</em>-1)*<em>nlstate</em>*<em>ncovmodel</em>&nbsp;. <br>\r
+<br>\r
+Thus in the simple case with 2 covariates (the model is log\r
+(pij/pii) = aij + bij * age where intercept and age are the two\r
+covariates), and 2 health degrees (1 for disability-free and 2\r
+for disability) and 1 absorbing state (3), you must enter 8\r
+initial values, a12, b12, a13, b13, a21, b21, a23, b23. You can\r
+start with zeros as in this example, but if you have a more\r
+precise set (for example from an earlier run) you can enter it\r
+and it will speed up the convergence.<br>\r
+Each of the four lines starts with indices &quot;ij&quot;: <b>ij\r
+aij bij</b> </p>\r
+\r
+<blockquote>\r
+    <pre># Guess values of aij and bij in log (pij/pii) = aij + bij * age\r
+12 -14.155633  0.110794 \r
+13  -7.925360  0.032091 \r
+21  -1.890135 -0.029473 \r
+23  -6.234642  0.022315 </pre>\r
+</blockquote>\r
+\r
+<p>or, to simplify (in most cases it converges, but there is no\r
+guarantee!): </p>\r
+\r
+<blockquote>\r
+    <pre>12 0.0 0.0\r
+13 0.0 0.0\r
+21 0.0 0.0\r
+23 0.0 0.0</pre>\r
+</blockquote>\r
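+\r
+<p>The parameter count and the block of zero guesses can be generated\r
+rather than typed by hand. The Python sketch below applies the formula\r
+for <em>N</em> given above; the function names are hypothetical, and\r
+<em>ncovmodel</em> is taken to count the intercept and age as well, as in\r
+the text.</p>\r

```python
def count_parameters(nlstate, ndeath, ncovmodel):
    """N = (nlstate + ndeath - 1) * nlstate * ncovmodel."""
    return (nlstate + ndeath - 1) * nlstate * ncovmodel


def zero_guess_lines(nlstate, ndeath, ncovmodel):
    """One 'ij 0.0 0.0 ...' line per transition from alive state i to j != i."""
    lines = []
    for i in range(1, nlstate + 1):
        for j in range(1, nlstate + ndeath + 1):
            if j != i:
                lines.append(f"{i}{j} " + " ".join(["0.0"] * ncovmodel))
    return lines


# Intercept and age only: ncovmodel = 2, hence 8 parameters on 4 lines.
print(count_parameters(2, 1, 2))
print("\n".join(zero_guess_lines(2, 1, 2)))
```

+<p>With ncovmodel set to 3, 4 and 5 the same formula gives the 12, 16\r
+and 20 parameters quoted for model=V1, model=V1+V1*age and\r
+model=V1+V2+V3.</p>\r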
+\r
+<p>In order to speed up the convergence you can make a first run\r
+with a large stepm, e.g. stepm=12 or 24, and then decrease the stepm\r
+until stepm=1 month. If stepm is the new, shorter step and the\r
+previous, longer step is a multiple of it, n . stepm,\r
+then the following approximation holds: </p>\r
+\r
+<pre>aij(stepm) = aij(n . stepm) - ln(n)\r
+</pre>\r
+\r
+<p>and </p>\r
+\r
+<pre>bij(stepm) = bij(n . stepm) .</pre>\r
+\r
+<p>For example if you already ran for a 6 months interval and\r
+got:<br>\r
+</p>\r
+\r
+<pre># Parameters\r
+12 -13.390179  0.126133 \r
+13  -7.493460  0.048069 \r
+21   0.575975 -0.041322 \r
+23  -4.748678  0.030626 \r
+</pre>\r
+\r
+<p>If you now want to get the monthly estimates, you can guess\r
+the aij by subtracting ln(6) = 1.7917<br>\r
+and running<br>\r
+</p>\r
+\r
+<pre>12 -15.18193847  0.126133 \r
+13 -9.285219469  0.048069\r
+21 -1.215784469 -0.041322\r
+23 -6.540437469  0.030626\r
+</pre>\r
+\r
+<p>and get<br>\r
+</p>\r
+\r
+<pre>12 -15.029768 0.124347 \r
+13 -8.472981 0.036599 \r
+21 -1.472527 -0.038394 \r
+23 -6.553602 0.029856 \r
+\r
+The rescaled guesses are closer to these results than the 6-month\r
+estimates were. The approximation is probably useful only for very\r
+small intervals, and we do not have enough experience to know\r
+whether it speeds up the convergence.\r
+</pre>\r
+\r
+<pre>         -ln(12)= -2.484\r
+ -ln(6/1)=-ln(6)= -1.791\r
+ -ln(3/1)=-ln(3)= -1.0986\r
+-ln(12/6)=-ln(2)= -0.693\r
+</pre>\r
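+\r
+<p>The rescaling of the guesses can be written in a few lines of Python.\r
+The sketch below reproduces the 6-month example: each aij is shifted by\r
+-ln(n) and each bij is kept unchanged. The function name and the\r
+dictionary layout are hypothetical.</p>\r

```python
import math


def rescale_guesses(params, n):
    """Guesses for a step n times shorter: a_ij - ln(n), b_ij unchanged."""
    return {ij: (a - math.log(n), b) for ij, (a, b) in params.items()}


# Estimates obtained with a 6-month step, as quoted in the text.
six_month = {
    "12": (-13.390179, 0.126133),
    "13": (-7.493460, 0.048069),
    "21": (0.575975, -0.041322),
    "23": (-4.748678, 0.030626),
}

monthly_guess = rescale_guesses(six_month, 6)
for ij, (a, b) in monthly_guess.items():
    print(f"{ij} {a:.6f} {b:.6f}")
```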
+\r
+<h4><font color="#FF0000">Guess values for computing variances</font></h4>\r
+\r
+<p>This is an output if <a href="#mle">mle</a>=1. But it can be\r
+used as an input to get the various output data files (Health\r
+expectancies, stationary prevalence etc.) and figures without\r
+rerunning the rather long maximisation phase (mle=0). </p>\r
+\r
+<p>The scales are small values for the evaluation of numerical\r
+derivatives. These derivatives are used to compute the hessian\r
+matrix of the parameters, that is the inverse of the covariance\r
+matrix, and the variances of health expectancies. Each line\r
+consists of indices &quot;ij&quot; followed by the initial scales\r
+(zero to simplify) associated with aij and bij. </p>\r
+\r
+<ul>\r
+    <li>If mle=1 you can enter zeros:</li>\r
+    <li><blockquote>\r
+            <pre># Scales (for hessian or gradient estimation)\r
+12 0. 0. \r
+13 0. 0. \r
+21 0. 0. \r
+23 0. 0. </pre>\r
+        </blockquote>\r
+    </li>\r
+    <li>If mle=0 you must enter a covariance matrix (usually\r
+        obtained from an earlier run).</li>\r
+</ul>\r
+\r
+<h4><font color="#FF0000">Covariance matrix of parameters</font></h4>\r
+\r
+<p>This is an output if <a href="#mle">mle</a>=1. But it can be\r
+used as an input to get the various output data files (Health\r
+expectancies, stationary prevalence etc.) and figures without\r
+rerunning the rather long maximisation phase (mle=0). <br>\r
+Each line starts with indices &quot;ijk&quot; followed by the\r
+covariances between aij and bij:<br>\r
+</p>\r
+\r
+<pre>\r
+   121 Var(a12) \r
+   122 Cov(b12,a12)  Var(b12) \r
+          ...\r
+   232 Cov(b23,a12)  Cov(b23,b12) ... Var (b23) </pre>\r
+\r
+<ul>\r
+    <li>If mle=1 you can enter zeros. </li>\r
+    <li><pre># Covariance matrix\r
+121 0.\r
+122 0. 0.\r
+131 0. 0. 0. \r
+132 0. 0. 0. 0. \r
+211 0. 0. 0. 0. 0. \r
+212 0. 0. 0. 0. 0. 0. \r
+231 0. 0. 0. 0. 0. 0. 0. \r
+232 0. 0. 0. 0. 0. 0. 0. 0.</pre>\r
+    </li>\r
+    <li>If mle=0 you must enter a covariance matrix (usually\r
+        obtained from an earlier run). </li>\r
+</ul>\r
+\r
+<h4><font color="#FF0000">Age range for calculation of stationary\r
+prevalences and health expectancies</font></h4>\r
+\r
+<pre>agemin=70 agemax=100 bage=50 fage=100</pre>\r
+\r
+<pre>\r
+Once we have obtained the estimated parameters, the program is able\r
+to calculate stationary prevalence, transition probabilities\r
+and life expectancies at any age. The choice of age range is useful\r
+for extrapolation. In our data file, ages vary from age 70 to\r
+102. It is possible to get extrapolated stationary prevalence by\r
+age ranging from agemin to agemax.\r
+\r
+\r
+Setting bage=50 (begin age) and fage=100 (final age) makes\r
+the program compute life expectancy from age 'bage' to age\r
+'fage'. As we use a model, we can usefully compute life\r
+expectancy on a wider age range than the age range of the data.\r
+But the model can be rather wrong on much larger intervals.\r
+The program is limited to around 120 for the upper age!\r
+</pre>\r
+\r
+<ul>\r
+    <li><b>agemin=</b> Minimum age for calculation of the\r
+        stationary prevalence </li>\r
+    <li><b>agemax=</b> Maximum age for calculation of the\r
+        stationary prevalence </li>\r
+    <li><b>bage=</b> Minimum age for calculation of the health\r
+        expectancies </li>\r
+    <li><b>fage=</b> Maximum age for calculation of the health\r
+        expectancies </li>\r
+</ul>\r
+\r
+<h4><a name="Computing"><font color="#FF0000">Computing</font></a><font\r
+color="#FF0000"> the observed prevalence</font></h4>\r
+\r
+<pre>begin-prev-date=1/1/1984 end-prev-date=1/6/1988 estepm=1</pre>\r
+\r
+<pre>\r
+The statements 'begin-prev-date' and 'end-prev-date' select\r
+the period in which we calculate the observed prevalences\r
+in each state. In this example, the prevalences are calculated on\r
+survey data collected between 1 January 1984 and 1 June 1988. \r
+</pre>\r
+\r
+<ul>\r
+    <li><strong>begin-prev-date= </strong>Starting date\r
+        (day/month/year)</li>\r
+    <li><strong>end-prev-date= </strong>Final date\r
+        (day/month/year)</li>\r
+    <li><strong>estepm= </strong>Unit (in months). We compute the\r
+        life expectancy from trapezoids spaced every estepm\r
+        months. This is mainly to measure the difference between\r
+        two models: for example if stepm=24 months pijx are given\r
+        only every 2 years and by summing them we are calculating\r
+        an estimate of the Life Expectancy assuming a linear\r
+        progression in between and thus overestimating or\r
+        underestimating according to the curvature of the\r
+        survival function. If, for the same date, we estimate the\r
+        model with stepm=1 month, we can keep estepm to 24 months\r
+        to compare the new estimate of Life expectancy with the\r
+        same linear hypothesis. A more precise result, taking\r
+        into account a more precise curvature will be obtained if\r
+        estepm is as small as stepm.</li>\r
+</ul>\r
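+\r
+<p>The effect of estepm can be seen on a toy example. The Python sketch\r
+below sums trapezoids under a survival curve with a coarse and a fine\r
+spacing. The exponential survival curve is an arbitrary assumption chosen\r
+because its exact area is known; it is not produced by IMaCh, and the\r
+function names are hypothetical.</p>\r

```python
import math


def trapezoid_life_expectancy(survival, horizon_years, estepm):
    """Area under `survival` from 0 to the horizon, trapezoids every estepm months."""
    step = estepm / 12.0
    count = int(round(horizon_years / step))
    total = 0.0
    for k in range(count):
        left = survival(k * step)
        right = survival((k + 1) * step)
        total += 0.5 * (left + right) * step  # linear progression in between
    return total


def survival(years):
    """Hypothetical survival curve with a mean survival time of 10 years."""
    return math.exp(-years / 10.0)


coarse = trapezoid_life_expectancy(survival, 30, 24)  # estepm = 24 months
fine = trapezoid_life_expectancy(survival, 30, 1)     # estepm = 1 month
exact = 10.0 * (1.0 - math.exp(-3.0))                 # exact area over 30 years
print(coarse, fine, exact)
```

+<p>This curve is convex, so the straight segments lie above it and the\r
+coarse spacing overestimates the area; with a concave curve the error\r
+would go the other way.</p>\r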
+\r
+<h4><font color="#FF0000">Population- or status-based health\r
+expectancies</font></h4>\r
+\r
+<pre>pop_based=0</pre>\r
+\r
+<p>The program computes status-based health expectancies, i.e.\r
+health expectancies which depend on your initial health state.\r
+If you are healthy your healthy life expectancy (e11) is higher\r
+than if you were disabled (e21, with e11 &gt; e21).<br>\r
+To compute a healthy life expectancy independent of the initial\r
+status we have to weight e11 and e21 according to the probability\r
+to be in each state at initial age or, in other words, according\r
+to the proportion of people in each state.<br>\r
+We prefer computing a 'pure' period healthy life expectancy based\r
+only on the transition forces. Then the weights are simply the\r
+stationary prevalences or 'implied' prevalences at the initial\r
+age.<br>\r
+Some other people would like to use the cross-sectional\r
+prevalences (the &quot;Sullivan prevalences&quot;) observed at\r
+the initial age during a period of time <a href="#Computing">defined\r
+just above</a>. <br>\r
+</p>\r
+\r
+<ul>\r
+    <li><strong>pop_based= 0 </strong>Health expectancies are\r
+        computed at each age from stationary prevalences\r
+        'expected' at this initial age.</li>\r
+    <li><strong>pop_based= 1 </strong>Health expectancies are\r
+        computed at each age from cross-sectional 'observed'\r
+        prevalence at this initial age. As all the population is\r
+        not observed at the same exact date we define a short\r
+        period where the observed prevalence is computed.</li>\r
+</ul>\r
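+\r
+<p>The weighting described above amounts to a weighted mean of e11 and\r
+e21. The Python sketch below shows it with invented numbers: the\r
+expectancies and both prevalences are illustrative values, not program\r
+output, and the function name is hypothetical.</p>\r

```python
def healthy_life_expectancy(e11, e21, prevalence_healthy):
    """Weight e11 and e21 by the proportion of people in each initial state."""
    return prevalence_healthy * e11 + (1.0 - prevalence_healthy) * e21


e11, e21 = 10.0, 6.0  # invented status-based healthy life expectancies

# pop_based=0: weights are the stationary ('implied') prevalences.
period_based = healthy_life_expectancy(e11, e21, prevalence_healthy=0.75)

# pop_based=1: weights are the cross-sectional ('observed') prevalences.
population_based = healthy_life_expectancy(e11, e21, prevalence_healthy=0.5)

print(period_based, population_based)
```

+<p>Only the weights differ between the two options; e11 and e21\r
+themselves come from the same estimated transition model.</p>\r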
+\r
+<h4><font color="#FF0000">Prevalence forecasting (Experimental)</font></h4>\r
+\r
+<pre>starting-proj-date=1/1/1989 final-proj-date=1/1/1992 mov_average=0 </pre>\r
+\r
+<p>Prevalence and population projections are only available if\r
+the interpolation unit is a month, i.e. stepm=1, and if there are\r
+no covariates. The programme estimates the prevalence in each\r
+state at a precise date expressed in day/month/year. The\r
+programme computes one forecasted prevalence a year from a\r
+starting date (1 January 1989 in this example) to a final date\r
+(1 January 1992). The statement mov_average computes\r
+smoothed forecasted prevalences with a five-age moving average\r
+centered at the mid-age of the five-age period. <br>\r
+</p>\r
+\r
+<ul>\r
+    <li><strong>starting-proj-date</strong>= starting date\r
+        (day/month/year) of forecasting</li>\r
+    <li><strong>final-proj-date= </strong>final date\r
+        (day/month/year) of forecasting</li>\r
+    <li><strong>mov_average</strong>= smoothing with a five-age\r
+        moving average centered at the mid-age of the five-age\r
+        period. The command<strong> mov_average</strong> takes\r
+        value 1 if the prevalences are smoothed and 0 otherwise.</li>\r
+</ul>\r
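+\r
+<p>A five-age moving average centered at the mid-age can be sketched as\r
+follows in Python. How IMaCh treats the youngest and oldest ages is not\r
+stated here, so this sketch simply assumes that ages without two\r
+neighbours on each side are left unsmoothed; the function name and the\r
+prevalence values are hypothetical.</p>\r

```python
def five_age_moving_average(prevalence_by_age):
    """Replace each value by the mean over ages age-2 .. age+2 when available."""
    smoothed = dict(prevalence_by_age)
    for age in prevalence_by_age:
        window = [age + offset for offset in range(-2, 3)]
        # Assumption: edge ages lacking a full window keep their raw value.
        if all(other in prevalence_by_age for other in window):
            smoothed[age] = sum(prevalence_by_age[other] for other in window) / 5.0
    return smoothed


# Invented forecasted prevalences of disability by age.
raw = {70: 0.10, 71: 0.14, 72: 0.11, 73: 0.17, 74: 0.18, 75: 0.25}
smoothed = five_age_moving_average(raw)
print(smoothed[72], smoothed[73])
```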
+\r
+<h4><font color="#FF0000">Last uncommented line : Population\r
+forecasting </font></h4>\r
+\r
+<pre>popforecast=0 popfile=pyram.txt popfiledate=1/1/1989 last-popfiledate=1/1/1992</pre>\r
+\r
+<p>This command is available if the interpolation unit is a\r
+month, i.e. stepm=1, and if popforecast=1. From a data file\r
+including age and number of persons alive at the precise date\r
+'popfiledate', you can forecast the number of persons\r
+in each state until date 'last-popfiledate'. In this\r
+example, the popfile <a href="pyram.txt"><b>pyram.txt</b></a>\r
+includes real data which are the Japanese population in 1989.<br>\r
+</p>\r
+\r
+<ul type="disc">\r
+    <li class="MsoNormal"\r
+    style="TEXT-ALIGN: justify; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l10 level1 lfo36; tab-stops: list 36.0pt"><b>popforecast=\r
+        0 </b>Option for population forecasting. If\r
+        popforecast=1, the programme does the forecasting<b>.</b></li>\r
+    <li class="MsoNormal"\r
+    style="TEXT-ALIGN: justify; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l10 level1 lfo36; tab-stops: list 36.0pt"><b>popfile=\r
+        </b>name of the population file</li>\r
+    <li class="MsoNormal"\r
+    style="TEXT-ALIGN: justify; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l10 level1 lfo36; tab-stops: list 36.0pt"><b>popfiledate=</b>\r
+        date of the population given in the population file</li>\r
+    <li class="MsoNormal"\r
+    style="TEXT-ALIGN: justify; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l10 level1 lfo36; tab-stops: list 36.0pt"><b>last-popfiledate</b>=\r
+        date of the last population projection&nbsp;</li>\r
+</ul>\r
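+\r
+<p>The following Python sketch shows one way to read such a line\r
+and to check the two conditions stated above (stepm=1 and\r
+popforecast=1). It is only an illustration: the helper names are\r
+hypothetical, the value of stepm is assumed to come from an earlier\r
+line of the parameter file, and this is not how the programme\r
+itself parses its input.</p>\r

```python
from datetime import date

def parse_param_line(line):
    """Split 'key=value key=value ...' into a dict of strings."""
    return dict(item.split("=", 1) for item in line.split())

def parse_dmy(text):
    """Parse a day/month/year date such as 1/1/1989."""
    day, month, year = (int(part) for part in text.split("/"))
    return date(year, month, day)

line = "popforecast=0 popfile=pyram.txt popfiledate=1/1/1989 last-popfiledate=1/1/1992"
params = parse_param_line(line)
start = parse_dmy(params["popfiledate"])
end = parse_dmy(params["last-popfiledate"])

stepm = 1  # assumed to be read from an earlier line of the parameter file
will_forecast = stepm == 1 and params["popforecast"] == "1"
print(start, end, will_forecast)
```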
+\r
+<hr>\r
+\r
+<h2><a name="running"></a><font color="#00006A">Running Imach\r
+with this example</font></h2>\r
+\r
+<pre>We assume that you typed in your <a href="biaspar.imach">1st_example\r
+parameter file</a> as explained <a href="#biaspar">above</a>. \r
+\r
+To run the program you should either:\r
+</pre>\r
+\r
+<ul>\r
+    <li>click on the imach.exe icon and enter the name of the\r
+        parameter file which is for example <a\r
+        href="C:\usr\imach\mle\biaspar.imach">C:\usr\imach\mle\biaspar.imach</a>\r
+    </li>\r
+    <li>You can also locate the biaspar.imach icon in <a\r
+        href="C:\usr\imach\mle">C:\usr\imach\mle</a> and drag\r
+        it with the mouse onto the imach window. </li>\r
+    <li>With recent versions (0.7 and higher), if you set up Windows\r
+        to recognize the &quot;.imach&quot; extension, you\r
+        can right-click the biaspar.imach icon and either edit\r
+        the parameter file with Notepad or execute it with\r
+        imach. </li>\r
+</ul>\r
+\r
+<pre>The time to converge depends on the step unit that you used (a\r
+1-month step is CPU-intensive), on the number of cases, and on the\r
+number of variables.\r
+\r
+\r
+The program outputs many files. Most of them are files which\r
+will be plotted for better understanding.\r
+\r
+</pre>\r
+\r
+<hr>\r
+\r
+<h2><a name="output"><font color="#00006A">Output of the program\r
+and graphs</font> </a></h2>\r
+\r
+<p>Once the optimization is finished, some graphics can be made\r
+with a grapher. We use Gnuplot, an interactive plotting\r
+program that is copyrighted but freely distributed. A gnuplot reference\r
+manual is available <a href="http://www.gnuplot.info/">here</a>. <br>\r
+When the run is finished, the user should enter a character\r
+for plotting and output editing. <br>\r
+These characters are:<br>\r
+</p>\r
+\r
+<ul>\r
+    <li>'c' to restart the program from the beginning.</li>\r
+    <li>'e' opens the <a href="biaspar.htm"><strong>biaspar.htm</strong></a>\r
+        file to edit the output files and graphs. </li>\r
+    <li>'g' to graph again</li>\r
+    <li>'q' for exiting.</li>\r
+</ul>\r
+\r
+<h5><font size="4"><strong>Results files </strong></font><br>\r
+<br>\r
+<font color="#EC5E5E" size="3"><strong>- </strong></font><a\r
+name="Observed prevalence in each state"><font color="#EC5E5E"\r
+size="3"><strong>Observed prevalence in each state</strong></font></a><font\r
+color="#EC5E5E" size="3"><strong> (and at first pass)</strong></font><b>:\r
+</b><a href="prbiaspar.txt"><b>prbiaspar.txt</b></a><br>\r
+</h5>\r
+\r
+<p>The first line is the title and displays each field of the\r
+file. The first column is age. Fields 2 and 6 are the\r
+proportions of individuals in states 1 and 2 respectively, as\r
+observed during the first exam. The other fields are the numbers of\r
+people in states 1, 2 or more. The number of columns increases if\r
+the number of states is higher than 2.<br>\r
+The beginning of the file is </p>\r
+\r
+<pre># Age Prev(1) N(1) N Age Prev(2) N(2) N\r
+70 1.00000 631 631 70 0.00000 0 631\r
+71 0.99681 625 627 71 0.00319 2 627 \r
+72 0.97125 1115 1148 72 0.02875 33 1148 </pre>\r
+\r
+<p>It means that at age 70, the prevalence in state 1 is 1.000\r
+and in state 2 is 0.000. At age 71 the number of individuals in\r
+state 1 is 625 and in state 2 is 2, hence the total number of\r
+people aged 71 is 625+2=627. <br>\r
+</p>\r
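+\r
+<p>The relation between the counts and the prevalences in these\r
+rows can be checked with a few lines of Python. This is only a\r
+sketch on the three rows quoted above; the variable names are\r
+ours.</p>\r

```python
rows = """70 1.00000 631 631 70 0.00000 0 631
71 0.99681 625 627 71 0.00319 2 627
72 0.97125 1115 1148 72 0.02875 33 1148"""

recomputed = {}
for row in rows.splitlines():
    age, prev1, n1, n, _age2, prev2, n2, _n = row.split()
    p1 = int(n1) / int(n)  # proportion observed in state 1
    p2 = int(n2) / int(n)  # proportion observed in state 2
    recomputed[int(age)] = (p1, p2)
    # the values printed in the file should match up to rounding
    assert abs(p1 - float(prev1)) < 1e-5 and abs(p2 - float(prev2)) < 1e-5

for age, (p1, p2) in recomputed.items():
    print(age, f"{p1:.5f}", f"{p2:.5f}")
```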
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Estimated parameters and\r
+covariance matrix</b></font><b>: </b><a href="rbiaspar.txt"><b>rbiaspar.txt</b></a></h5>\r
+\r
+<p>This file contains all the maximisation results: </p>\r
+\r
+<pre> -2 log likelihood= 21660.918613445392\r
+ Estimated parameters: a12 = -12.290174 b12 = 0.092161 \r
+                       a13 = -9.155590  b13 = 0.046627 \r
+                       a21 = -2.629849  b21 = -0.022030 \r
+                       a23 = -7.958519  b23 = 0.042614  \r
+ Covariance matrix: Var(a12) = 1.47453e-001\r
+                    Var(b12) = 2.18676e-005\r
+                    Var(a13) = 2.09715e-001\r
+                    Var(b13) = 3.28937e-005  \r
+                    Var(a21) = 9.19832e-001\r
+                    Var(b21) = 1.29229e-004\r
+                    Var(a23) = 4.48405e-001\r
+                    Var(b23) = 5.85631e-005 \r
+ </pre>\r
+\r
+<p>By substitution of these parameters in the regression model,\r
+we obtain the elementary transition probabilities:</p>\r
+\r
+<p><img src="pebiaspar1.gif" width="400" height="300"></p>\r
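+\r
+<p>The substitution can be sketched in Python as follows. The\r
+sketch assumes that the regression model is the multinomial logit\r
+ln(pij/pii) = aij + bij*age, and that the resulting probabilities\r
+refer to one estimation step; the function name is hypothetical.\r
+It illustrates the formula and is not the code used by the\r
+program.</p>\r

```python
import math

# Estimated parameters quoted above: (a_ij, b_ij) for each transition i -> j
params = {
    1: {2: (-12.290174, 0.092161), 3: (-9.155590, 0.046627)},
    2: {1: (-2.629849, -0.022030), 3: (-7.958519, 0.042614)},
}

def transition_probs(origin, age):
    """Elementary transition probabilities out of `origin` at a given age,
    assuming log(p_ij / p_ii) = a_ij + b_ij * age (multinomial logit)."""
    odds = {dest: math.exp(a + b * age) for dest, (a, b) in params[origin].items()}
    denom = 1.0 + sum(odds.values())
    probs = {dest: value / denom for dest, value in odds.items()}
    probs[origin] = 1.0 / denom  # probability of staying in the same state
    return probs

for origin in (1, 2):
    probs = transition_probs(origin, 70)
    print(origin, {dest: round(p, 5) for dest, p in sorted(probs.items())})
```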
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Transition probabilities</b></font><b>:\r
+</b><a href="pijrbiaspar.txt"><b>pijrbiaspar.txt</b></a></h5>\r
+\r
+<p>Here are the transition probabilities Pij(x, x+nh) where nh\r
+is a multiple of 2 years. The first column is the starting age x\r
+(from age 50 to 100), the second is age (x+nh) and the others are\r
+the transition probabilities p11, p12, p13, p21, p22, p23. For\r
+example, line 5 of the file is: </p>\r
+\r
+<pre> 100 106 0.02655 0.17622 0.79722 0.01809 0.13678 0.84513 </pre>\r
+\r
+<p>and this means: </p>\r
+\r
+<pre>p11(100,106)=0.02655\r
+p12(100,106)=0.17622\r
+p13(100,106)=0.79722\r
+p21(100,106)=0.01809\r
+p22(100,106)=0.13678\r
+p23(100,106)=0.84513 </pre>\r
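+\r
+<p>A line of this file can be labelled and checked in Python as\r
+sketched below. The column order is the one described above; the\r
+check that each origin state's probabilities sum to one (up to\r
+rounding) is our own addition.</p>\r

```python
labels = ["p11", "p12", "p13", "p21", "p22", "p23"]
line = " 100 106 0.02655 0.17622 0.79722 0.01809 0.13678 0.84513 "

fields = line.split()
start_age, end_age = int(fields[0]), int(fields[1])
probs = dict(zip(labels, map(float, fields[2:])))

# Each origin state's probabilities should sum to 1, up to rounding.
row1 = probs["p11"] + probs["p12"] + probs["p13"]
row2 = probs["p21"] + probs["p22"] + probs["p23"]
print(start_age, end_age, probs)
print(f"{row1:.5f} {row2:.5f}")
```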
+\r
+<h5><font color="#EC5E5E" size="3"><b>- </b></font><a\r
+name="Stationary prevalence in each state"><font color="#EC5E5E"\r
+size="3"><b>Stationary prevalence in each state</b></font></a><b>:\r
+</b><a href="plrbiaspar.txt"><b>plrbiaspar.txt</b></a></h5>\r
+\r
+<pre>#Prevalence\r
+#Age 1-1 2-2\r
+\r
+#************ \r
+70 0.90134 0.09866\r
+71 0.89177 0.10823 \r
+72 0.88139 0.11861 \r
+73 0.87015 0.12985 </pre>\r
+\r
+<p>At age 70 the stationary prevalence is 0.90134 in state 1 and\r
+0.09866 in state 2. This stationary prevalence differs from the\r
+observed prevalence, and this is the point. The observed prevalence\r
+at age 70 results from the incidence of disability, incidence of\r
+recovery and mortality which occurred in the past of the cohort.\r
+Stationary prevalence results from a simulation with current\r
+incidences and mortality (estimated from this cross-longitudinal\r
+survey). It is the best predictive value of the prevalence in the\r
+future if &quot;nothing changes in the future&quot;. This is\r
+exactly what demographers do with a life table. Life expectancy\r
+is the expected mean survival time if the observed mortality rates\r
+(incidence of mortality) &quot;remain constant&quot; in the\r
+future. </p>\r
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Standard deviation of\r
+stationary prevalence</b></font><b>: </b><a\r
+href="vplrbiaspar.txt"><b>vplrbiaspar.txt</b></a></h5>\r
+\r
+<p>The stationary prevalence has to be compared with the observed\r
+prevalence by age. But both are statistical estimates and are\r
+subject to stochastic errors due to the size of the sample, the\r
+design of the survey, and, for the stationary prevalence, to the\r
+model used and fitted. It is possible to compute the standard\r
+deviation of the stationary prevalence at each age.</p>\r
+\r
+<h5><font color="#EC5E5E" size="3">-Observed and stationary\r
+prevalence in state (2=disabled) with confidence interval</font>:<b>\r
+</b><a href="vbiaspar21.htm"><b>vbiaspar21.gif</b></a></h5>\r
+\r
+<p>This graph shows the stationary prevalence in state (2)\r
+with its confidence interval in red. The green curve is the\r
+observed prevalence (or proportion of individuals in state (2)).\r
+Without discussing the results (it is not the purpose here), we\r
+observe that the green curve lies somewhat below the stationary\r
+prevalence. This suggests an increase of the disability prevalence\r
+in the future.</p>\r
+\r
+<p><img src="vbiaspar21.gif" width="400" height="300"></p>\r
+\r
+<h5><font color="#EC5E5E" size="3"><b>-Convergence to the\r
+stationary prevalence of disability</b></font><b>: </b><a\r
+href="pbiaspar11.gif"><b>pbiaspar11.gif</b></a><br>\r
+<img src="pbiaspar11.gif" width="400" height="300"> </h5>\r
+\r
+<p>This graph plots the conditional transition probabilities from\r
+an initial state (1=healthy in red at the bottom, or 2=disabled in\r
+green on top) at age <em>x </em>to the final state 2=disabled at\r
+age <em>x+h. </em>Conditional means conditional on being alive\r
+at age <em>x+h</em>, which has probability <em>hP11x</em> + <em>hP12x</em>\r
+from state 1 and <em>hP21x</em> + <em>hP22x</em> from state 2. The\r
+curves <em>hP12x/(hP11x</em> + <em>hP12x) </em>and <em>hP22x/(hP21x</em>\r
++ <em>hP22x) </em>converge with <em>h </em>to the <em>stationary\r
+prevalence of disability</em>. In order to get the stationary\r
+prevalence at age 70 we should start the process at an earlier\r
+age, e.g. 50. If the disability state is defined by severe\r
+disability criteria with little chance of recovery, then the\r
+incidence of recovery is low and the time to convergence is\r
+probably longer. But we do not have experience of this yet.</p>\r
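+\r
+<p>The convergence can be illustrated with a toy Python example.\r
+It is a sketch under strong simplifications: the one-step\r
+transition probabilities below are hypothetical and are held\r
+constant with age, whereas in the model they depend on age. For\r
+each initial state, the code computes the probability of being\r
+disabled after <em>h</em> steps among those still alive.</p>\r

```python
# Hypothetical one-step transition probabilities, held constant with age
# (a simplification: in the model they depend on age).
# States: 1 = healthy, 2 = disabled, 3 = dead.
P = {1: {1: 0.90, 2: 0.06, 3: 0.04},
     2: {1: 0.20, 2: 0.65, 3: 0.15}}

def disabled_share_among_survivors(start, steps):
    """P(state 2 after `steps` steps | alive), starting from state `start`."""
    alive = {1: 0.0, 2: 0.0}
    alive[start] = 1.0
    for _ in range(steps):
        alive = {j: alive[1] * P[1][j] + alive[2] * P[2][j] for j in (1, 2)}
    return alive[2] / (alive[1] + alive[2])

for h in (1, 5, 20, 60):
    print(h, round(disabled_share_among_survivors(1, h), 5),
          round(disabled_share_among_survivors(2, h), 5))
```

+<p>With these numbers the two curves start far apart and come\r
+together as <em>h</em> grows, whatever the initial state.</p>\r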
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Life expectancies by age\r
+and initial health status with standard deviation</b></font><b>: </b><a\r
+href="erbiaspar.txt"><b>erbiaspar.txt</b></a></h5>\r
+\r
+<pre># Health expectancies \r
+# Age 1-1 (SE) 1-2 (SE) 2-1 (SE) 2-2 (SE)\r
+70 10.4171 (0.1517)    3.0433 (0.4733)    5.6641 (0.1121)    5.6907 (0.3366)\r
+71 9.9325 (0.1409)    3.0495 (0.4234)    5.2627 (0.1107)    5.6384 (0.3129)\r
+72 9.4603 (0.1319)    3.0540 (0.3770)    4.8810 (0.1099)    5.5811 (0.2907)\r
+73 9.0009 (0.1246)    3.0565 (0.3345)    4.5188 (0.1098)    5.5187 (0.2702)\r
+</pre>\r
+\r
+<pre>For example 70 10.4171 (0.1517) 3.0433 (0.4733) 5.6641 (0.1121) 5.6907 (0.3366) means:\r
+e11=10.4171 e12=3.0433 e21=5.6641 e22=5.6907 </pre>\r
+\r
+<pre><img src="expbiaspar21.gif" width="400" height="300"><img\r
+src="expbiaspar11.gif" width="400" height="300"></pre>\r
+\r
+<p>For example, the life expectancy of a healthy individual at age 70\r
+is 10.42 years in the healthy state and 3.04 years in the disability state\r
+(13.46 years in total). If the individual was disabled at age 70, the life\r
+expectancy is shorter: 5.66 years in the healthy state and 5.69 years in the\r
+disability state (11.35 years in total). The total life expectancy is a\r
+weighted mean of both, 13.46 and 11.35; the weight is the proportion\r
+of people disabled at age 70. In order to get a pure period index\r
+(i.e. based only on incidences) we use the <a\r
+href="#Stationary prevalence in each state">computed or\r
+stationary prevalence</a> at age 70 (i.e. computed from\r
+incidences at earlier ages) instead of the <a\r
+href="#Observed prevalence in each state">observed prevalence</a>\r
+(for example at first exam) (<a href="#Health expectancies">see\r
+below</a>).</p>\r
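+\r
+<p>As a sketch, a line of this file can be split into estimates\r
+and standard errors in Python, and the two totals quoted above can\r
+be recomputed. The regular expression and the variable names are\r
+ours.</p>\r

```python
import re

line = "70 10.4171 (0.1517)    3.0433 (0.4733)    5.6641 (0.1121)    5.6907 (0.3366)"

age = int(line.split()[0])
# each pair reads "estimate (standard error)"
pairs = re.findall(r"([\d.]+)\s+\(([\d.]+)\)", line)
labels = ["e11", "e12", "e21", "e22"]
e = {name: float(est) for name, (est, _se) in zip(labels, pairs)}
se = {name: float(err) for name, (_est, err) in zip(labels, pairs)}

e1_total = e["e11"] + e["e12"]  # total life expectancy if healthy at 70
e2_total = e["e21"] + e["e22"]  # total life expectancy if disabled at 70
print(age, f"{e1_total:.2f}", f"{e2_total:.2f}")
```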
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Variances of life\r
+expectancies by age and initial health status</b></font><b>: </b><a\r
+href="vrbiaspar.txt"><b>vrbiaspar.txt</b></a></h5>\r
+\r
+<p>For example, the covariances of life expectancies Cov(ei,ej)\r
+at age 50 are (line 3) </p>\r
+\r
+<pre>   Cov(e1,e1)=0.4776  Cov(e1,e2)=0.0488=Cov(e2,e1)  Cov(e2,e2)=0.0424</pre>\r
+\r
+<h5><font color="#EC5E5E" size="3"><b>-Variances of one-step\r
+probabilities </b></font><b>: </b><a href="probrbiaspar.txt"><b>probrbiaspar.txt</b></a></h5>\r
+\r
+<p>For example, at age 65</p>\r
+\r
+<pre>   p11=9.960e-001 standard deviation of p11=2.359e-004</pre>\r
+\r
+<h5><font color="#EC5E5E" size="3"><b>- </b></font><a\r
+name="Health expectancies"><font color="#EC5E5E" size="3"><b>Health\r
+expectancies</b></font></a><font color="#EC5E5E" size="3"><b>\r
+with standard errors in parentheses</b></font><b>: </b><a\r
+href="trbiaspar.txt"><font face="Courier New"><b>trbiaspar.txt</b></font></a></h5>\r
+\r
+<pre>#Total LEs with variances: e.. (std) e.1 (std) e.2 (std) </pre>\r
+\r
+<pre>70 13.26 (0.22) 9.95 (0.20) 3.30 (0.14) </pre>\r
+\r
+<p>Thus, at age 70 the total life expectancy, e..=13.26 years, is\r
+the weighted mean of e1.=13.46 and e2.=11.35 by the stationary\r
+prevalences at age 70, which are 0.90134 in state 1 and 0.09866 in\r
+state 2, respectively (their sum is equal to one). e.1=9.95 is the\r
+disability-free life expectancy at age 70 (it is again a weighted\r
+mean of e11 and e21). Similarly, e.2=3.30 is the life expectancy at age\r
+70 to be spent in the disability state.</p>\r
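+\r
+<p>The weighted means can be recomputed from the figures quoted in\r
+this document, as in the Python sketch below. Because the inputs\r
+are the rounded values printed in the files, the recomputed total\r
+may differ from the published 13.26 in the last digit.</p>\r

```python
# Stationary prevalence at age 70 (from plrbiaspar.txt)
w1, w2 = 0.90134, 0.09866
# Health expectancies at age 70 (from erbiaspar.txt)
e11, e12, e21, e22 = 10.4171, 3.0433, 5.6641, 5.6907

e_dot1 = w1 * e11 + w2 * e21   # disability-free life expectancy
e_dot2 = w1 * e12 + w2 * e22   # life expectancy with disability
e_total = e_dot1 + e_dot2      # same as w1*(e11+e12) + w2*(e21+e22)
print(f"{e_dot1:.2f} {e_dot2:.2f} {e_total:.2f}")
```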
+\r
+<h5><font color="#EC5E5E" size="3"><b>-Total life expectancy by\r
+age and health expectancies in states (1=healthy) and (2=disabled)</b></font><b>:\r
+</b><a href="ebiaspar1.gif"><b>ebiaspar1.gif</b></a></h5>\r
+\r
+<p>This figure represents the health expectancies and the total\r
+life expectancy with the confidence interval shown as dashed curves. </p>\r
+\r
+<pre>        <img src="ebiaspar1.gif" width="400" height="300"></pre>\r
+\r
+<p>Standard deviations (obtained from the information matrix of\r
+the model) of these quantities are very useful.\r
+Cross-longitudinal surveys are costly and do not involve huge\r
+samples, generally a few thousand people; therefore it is very\r
+important to have an idea of the standard deviation of our\r
+estimates. It has been a big challenge to compute the Health\r
+Expectancy standard deviations. Do not be confused: life expectancy\r
+is, like any expected value, the mean of a distribution; but here\r
+we are not computing the standard deviation of the distribution,\r
+but the standard deviation of the estimate of the mean.</p>\r
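+\r
+<p>Since these are standard deviations of estimates, they can be\r
+turned into approximate confidence intervals. The Python sketch\r
+below assumes a normal sampling distribution and the usual 1.96\r
+multiplier for a 95% interval; this is our illustration and not\r
+necessarily the rule used to draw the curves of the graphs.</p>\r

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval, assuming a normal sampling distribution."""
    return estimate - z * std_error, estimate + z * std_error

# Values at age 70 from trbiaspar.txt: estimate and (standard error)
for name, est, se in [("e..", 13.26, 0.22), ("e.1", 9.95, 0.20), ("e.2", 3.30, 0.14)]:
    low, high = confidence_interval(est, se)
    print(f"{name}: {est:.2f}  [{low:.2f}, {high:.2f}]")
```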
+\r
+<p>Our health expectancy estimates vary according to the sample\r
+size (and the standard deviations give confidence intervals of\r
+the estimates) but also according to the model fitted. Let us\r
+explain this in more detail.</p>\r
+\r
+<p>Choosing a model means at least two kinds of choices. First we\r
+have to decide the number of disability states. Second we have to\r
+design, within the logit model family, the model: variables,\r
+covariables, confounding factors etc. to be included.</p>\r
+\r
+<p>The more disability states we have, the better our demographic\r
+description of the disability process, but the smaller the number of\r
+transitions between each pair of states and the higher the noise in the\r
+measurement. We do not have enough experience with the various\r
+models to summarize the advantages and disadvantages, but it is\r
+important to say that even if we had huge and unbiased samples,\r
+the total life expectancy computed from a cross-longitudinal\r
+survey varies with the number of states. If we define only two\r
+states, alive or dead, we find the usual life expectancy where it\r
+is assumed that at each age, people are at the same risk of dying.\r
+If we differentiate the alive state into healthy and\r
+disabled, and as the mortality from the disability state is higher\r
+than the mortality from the healthy state, we are introducing\r
+heterogeneity in the risk of dying. The total mortality at each\r
+age is the weighted mean of the mortality in each state by the\r
+prevalence in each state. Therefore if the proportion of people\r
+at each age and in each state is different from the stationary\r
+equilibrium, there is no reason to find the same total mortality\r
+at a particular age. Life expectancy, even if it is a very useful\r
+tool, relies on a very strong hypothesis of homogeneity of the\r
+population. Our main purpose is not to measure differential\r
+mortality but to measure the expected time in a healthy or\r
+disability state in order to maximise the former and minimize the\r
+latter. But the differential in mortality complicates the\r
+measurement.</p>\r
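+\r
+<p>The weighted-mean argument can be made concrete with a small\r
+Python sketch. The two death probabilities below are hypothetical\r
+and only serve to show that the total mortality at a given age\r
+changes with the prevalence of disability, even when the\r
+state-specific mortalities are fixed.</p>\r

```python
# Hypothetical one-year death probabilities by health state
q_healthy, q_disabled = 0.02, 0.10

def total_mortality(prev_disabled):
    """Population death probability as a prevalence-weighted mean."""
    return (1 - prev_disabled) * q_healthy + prev_disabled * q_disabled

for prev in (0.05, 0.10, 0.20):
    print(prev, round(total_mortality(prev), 4))
```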
+\r
+<p>Incidences of disability or recovery are not affected by the\r
+number of states if these states are independent. But incidence\r
+estimates depend on the specification of the model. The more\r
+covariates we add to the logit model, the better the model, but\r
+some covariates are not well measured and some are confounding\r
+factors, as in any statistical model. The procedure to &quot;fit\r
+the best model&quot; is similar to logistic regression, which itself is\r
+similar to regression analysis. We have not gone that far yet because\r
+we also have a severe limitation, which is the speed of\r
+convergence. On a Pentium III, 500 MHz, even the simplest model,\r
+estimated by month on 8,000 people, may take 4 hours to converge.\r
+Also, the program is not yet a statistical package that permits\r
+a simple specification of the variables and the model to take into\r
+account in the maximisation. The current program only allows\r
+adding simple variables like age+sex or age+sex+ age*sex and will\r
+never be general enough. But what should be remembered is that\r
+incidences, or probabilities of change from one state to another, are\r
+affected by the variables specified in the model.</p>\r
+\r
+<p>Also, the age range of the people interviewed is linked to\r
+the age range of the life expectancy which can be estimated by\r
+extrapolation. If your sample ranges from age 70 to 95, you can\r
+clearly estimate a life expectancy at age 70 and trust your\r
+confidence interval, which is mostly based on your sample size.\r
+But if you want to estimate the life expectancy at age 50, you\r
+have to rely on your model, and fitting a logistic model on an age\r
+range of 70-95 and estimating probabilities of transition outside\r
+this age range, say at age 50, is very dangerous. At least you\r
+should remember that the confidence intervals given by the\r
+standard deviations of the health expectancies hold under the\r
+strong assumption that your model is the 'true model', which is\r
+probably not the case.</p>\r
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Copy of the parameter\r
+file</b></font><b>: </b><a href="orbiaspar.txt"><b>orbiaspar.txt</b></a></h5>\r
+\r
+<p>This copy of the parameter file can be useful to re-run the\r
+program while saving the old output files. </p>\r
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Prevalence forecasting</b></font><b>:\r
+</b><a href="frbiaspar.txt"><b>frbiaspar.txt</b></a></h5>\r
+\r
+<p\r
+style="TEXT-ALIGN: justify; tab-stops: 45.8pt 91.6pt 137.4pt 183.2pt 229.0pt 274.8pt 320.6pt 366.4pt 412.2pt 458.0pt 503.8pt 549.6pt 595.4pt 641.2pt 687.0pt 732.8pt">First,\r
+we have estimated the observed prevalence between 1/1/1984 and\r
+1/6/1988. The mean date of interview (weighted average of the\r
+interviews performed between 1/1/1984 and 1/6/1988) is estimated\r
+to be 13/9/1985, as written at the top of the file. Then we\r
+forecast the probability to be in each state. </p>\r
+\r
+<p\r
+style="TEXT-ALIGN: justify; tab-stops: 45.8pt 91.6pt 137.4pt 183.2pt 229.0pt 274.8pt 320.6pt 366.4pt 412.2pt 458.0pt 503.8pt 549.6pt 595.4pt 641.2pt 687.0pt 732.8pt">Example,\r
+at date 1/1/1989 : </p>\r
+\r
+<pre class="MsoNormal"># StartingAge FinalAge P.1 P.2 P.3\r
+# Forecasting at date 1/1/1989\r
+  73 0.807 0.078 0.115</pre>\r
+\r
+<p\r
+style="TEXT-ALIGN: justify; tab-stops: 45.8pt 91.6pt 137.4pt 183.2pt 229.0pt 274.8pt 320.6pt 366.4pt 412.2pt 458.0pt 503.8pt 549.6pt 595.4pt 641.2pt 687.0pt 732.8pt">Since\r
+the minimum age is 70 on 13/9/1985, the youngest forecasted\r
+age is 73. This means that a person aged 70 on 13/9/1985\r
+has a probability of 0.807 to be in state 1 at age 73 on 1/1/1989.\r
+Similarly, the probability to be in state 2 is 0.078 and the\r
+probability to have died is 0.115. Then, on 1/1/1989, the\r
+prevalence of disability at age 73 is estimated to be\r
+0.078/(0.807+0.078)=0.088.</p>\r
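+\r
+<p>The prevalence of 0.088 is obtained by restricting the\r
+forecast to survivors. A short Python check on the three\r
+probabilities of the example (variable names are ours):</p>\r

```python
p_healthy, p_disabled, p_dead = 0.807, 0.078, 0.115

# Prevalence is measured among survivors only.
prevalence_disabled = p_disabled / (p_healthy + p_disabled)
print(round(prevalence_disabled, 3))
```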
+\r
+<h5><font color="#EC5E5E" size="3"><b>- Population forecasting</b></font><b>:\r
+</b><a href="poprbiaspar.txt"><b>poprbiaspar.txt</b></a></h5>\r
+\r
+<pre># Age P.1 P.2 P.3 [Population]\r
+# Forecasting at date 1/1/1989 \r
+75 572685.22 83798.08 \r
+74 621296.51 79767.99 \r
+73 645857.70 69320.60 </pre>\r
+\r
+<pre># Forecasting at date 1/1/1990 \r
+76 442986.68 92721.14 120775.48\r
+75 487781.02 91367.97 121915.51\r
+74 512892.07 85003.47 117282.76 </pre>\r
+\r
+<p>From the population file, we estimate the number of people in\r
+each state. At age 73, 645857 persons are in state 1 and 69320\r
+are in state 2. One year later, at age 74, 512892 are in state 1,\r
+85003 are in state 2 and 117282 have died before 1/1/1990.</p>\r
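+\r
+<p>A simple consistency check on these figures: each cohort\r
+counted on 1/1/1989 should be found again one year later, one year\r
+older, once the deaths (P.3) are added. The Python sketch below\r
+verifies this on the rows quoted above; the dictionary layout is\r
+ours.</p>\r

```python
# age -> (state 1, state 2) on 1/1/1989
pop_1989 = {73: (645857.70, 69320.60),
            74: (621296.51, 79767.99),
            75: (572685.22, 83798.08)}
# age -> (state 1, state 2, dead) on 1/1/1990
pop_1990 = {74: (512892.07, 85003.47, 117282.76),
            75: (487781.02, 91367.97, 121915.51),
            76: (442986.68, 92721.14, 120775.48)}

for age, counts in sorted(pop_1989.items()):
    before = sum(counts)             # cohort size on 1/1/1989
    after = sum(pop_1990[age + 1])   # same cohort, deaths included
    print(age, f"{before:.2f}", f"{after:.2f}")
```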
+\r
+<hr>\r
+\r
+<h2><a name="example"></a><font color="#00006A">Trying an example</font></h2>\r
+\r
+<p>Now that you know how to run the program, it is time to test it\r
+on your own computer. Try for example a parameter file named <a\r
+href="..\mytry\imachpar.imach">imachpar.imach</a> which is a copy\r
+of <font size="2" face="Courier New">mypar.imach</font> included\r
+in the subdirectory of imach, <font size="2" face="Courier New">mytry</font>.\r
+Edit it to change the name of the data file to <font size="2"\r
+face="Courier New">..\data\mydata.txt</font> if you don't want to\r
+copy it to the same directory. The file <font face="Courier New">mydata.txt</font>\r
+is a smaller file of 3,000 people but still with 4 waves. </p>\r
+\r
+<p>Click on the imach.exe icon to open a window. Answer the\r
+question: '<strong>Enter the parameter file name:</strong>'</p>\r
+\r
+<table border="1">\r
+    <tr>\r
+        <td width="100%"><strong>IMACH, Version 0.8a</strong><p><strong>Enter\r
+        the parameter file name: ..\mytry\imachpar.imach</strong></p>\r
+        </td>\r
+    </tr>\r
+</table>\r
+\r
+<p>Most of the data files or image files generated will use the\r
+'imachpar' string in their names. The running time is about 2-3\r
+minutes on a Pentium III. If the execution worked correctly, the\r
+output files are created in the current directory, and should be\r
+the same as the mypar files initially included in the directory <font\r
+size="2" face="Courier New">mytry</font>.</p>\r
+\r
+<ul>\r
+    <li><pre><u>Output on the screen</u> The output screen looks like <a\r
+href="imachrun.LOG">this Log file</a>\r
+#\r
+\r
+title=MLE datafile=..\data\mydata.txt lastobs=3000 firstpass=1 lastpass=3\r
+ftol=1.000000e-008 stepm=24 ncovcol=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0</pre>\r
+    </li>\r
+    <li><pre>Total number of individuals= 2965, Agemin = 70.00, Agemax= 100.92\r
+\r
+Warning, no any valid information for:126 line=126\r
+Warning, no any valid information for:2307 line=2307\r
+Delay (in months) between two waves Min=21 Max=51 Mean=24.495826\r
+<font face="Times New Roman">These lines give some warnings on the data file and also some raw statistics on frequencies of transitions.</font>\r
+Age 70 1.=230 loss[1]=3.5% 2.=16 loss[2]=12.5% 1.=222 prev[1]=94.1% 2.=14\r
+ prev[2]=5.9% 1-1=8 11=200 12=7 13=15 2-1=2 21=6 22=7 23=1\r
+Age 102 1.=0 loss[1]=NaNQ% 2.=0 loss[2]=NaNQ% 1.=0 prev[1]=NaNQ% 2.=0 </pre>\r
+    </li>\r
+</ul>\r
+\r
+<p>&nbsp;</p>\r
+\r
+<ul>\r
+    <li>Maximisation with the Powell algorithm. 8 directions are\r
+        given corresponding to the 8 parameters. Convergence\r
+        can take rather long.<br>\r
+        <font size="1" face="Courier New"><br>\r
+        Powell iter=1 -2*LL=11531.405658264877 1 0.000000000000 2\r
+        0.000000000000 3<br>\r
+        0.000000000000 4 0.000000000000 5 0.000000000000 6\r
+        0.000000000000 7 <br>\r
+        0.000000000000 8 0.000000000000<br>\r
+        1..........2.................3..........4.................5.........<br>\r
+        6................7........8...............<br>\r
+        Powell iter=23 -2*LL=6744.954108371555 1 -12.967632334283\r
+        <br>\r
+        2 0.135136681033 3 -7.402109728262 4 0.067844593326 <br>\r
+        5 -0.673601538129 6 -0.006615504377 7 -5.051341616718 <br>\r
+        8 0.051272038506<br>\r
+        1..............2...........3..............4...........<br>\r
+        5..........6................7...........8.........<br>\r
+        #Number of iterations = 23, -2 Log likelihood =\r
+        6744.954042573691<br>\r
+        # Parameters<br>\r
+        12 -12.966061 0.135117 <br>\r
+        13 -7.401109 0.067831 <br>\r
+        21 -0.672648 -0.006627 <br>\r
+        23 -5.051297 0.051271 </font><br>\r
+        </li>\r
+    <li><pre><font size="2">Calculation of the hessian matrix. Wait...\r
+12345678.12.13.14.15.16.17.18.23.24.25.26.27.28.34.35.36.37.38.45.46.47.48.56.57.58.67.68.78\r
+\r
+Inverting the hessian to get the covariance matrix. Wait...\r
+\r
+#Hessian matrix#\r
+3.344e+002 2.708e+004 -4.586e+001 -3.806e+003 -1.577e+000 -1.313e+002 3.914e-001 3.166e+001 \r
+2.708e+004 2.204e+006 -3.805e+003 -3.174e+005 -1.303e+002 -1.091e+004 2.967e+001 2.399e+003 \r
+-4.586e+001 -3.805e+003 4.044e+002 3.197e+004 2.431e-002 1.995e+000 1.783e-001 1.486e+001 \r
+-3.806e+003 -3.174e+005 3.197e+004 2.541e+006 2.436e+000 2.051e+002 1.483e+001 1.244e+003 \r
+-1.577e+000 -1.303e+002 2.431e-002 2.436e+000 1.093e+002 8.979e+003 -3.402e+001 -2.843e+003 \r
+-1.313e+002 -1.091e+004 1.995e+000 2.051e+002 8.979e+003 7.420e+005 -2.842e+003 -2.388e+005 \r
+3.914e-001 2.967e+001 1.783e-001 1.483e+001 -3.402e+001 -2.842e+003 1.494e+002 1.251e+004 \r
+3.166e+001 2.399e+003 1.486e+001 1.244e+003 -2.843e+003 -2.388e+005 1.251e+004 1.053e+006 \r
+# Scales\r
+12 1.00000e-004 1.00000e-006\r
+13 1.00000e-004 1.00000e-006\r
+21 1.00000e-003 1.00000e-005\r
+23 1.00000e-004 1.00000e-005\r
+# Covariance\r
+  1 5.90661e-001\r
+  2 -7.26732e-003 8.98810e-005\r
+  3 8.80177e-002 -1.12706e-003 5.15824e-001\r
+  4 -1.13082e-003 1.45267e-005 -6.50070e-003 8.23270e-005\r
+  5 9.31265e-003 -1.16106e-004 6.00210e-004 -8.04151e-006 1.75753e+000\r
+  6 -1.15664e-004 1.44850e-006 -7.79995e-006 1.04770e-007 -2.12929e-002 2.59422e-004\r
+  7 1.35103e-003 -1.75392e-005 -6.38237e-004 7.85424e-006 4.02601e-001 -4.86776e-003 1.32682e+000\r
+  8 -1.82421e-005 2.35811e-007 7.75503e-006 -9.58687e-008 -4.86589e-003 5.91641e-005 -1.57767e-002 1.88622e-004\r
+# agemin agemax for lifexpectancy, bage fage (if mle==0 ie no data nor Max likelihood).\r
+\r
+\r
+agemin=70 agemax=100 bage=50 fage=100\r
+Computing prevalence limit: result on file 'plrmypar.txt' \r
+Computing pij: result on file 'pijrmypar.txt' \r
+Computing Health Expectancies: result on file 'ermypar.txt' \r
+Computing Variance-covariance of DFLEs: file 'vrmypar.txt' \r
+Computing Total LEs with variances: file 'trmypar.txt' \r
+Computing Variance-covariance of Prevalence limit: file 'vplrmypar.txt' \r
+End of Imach\r
+</font></pre>\r
+    </li>\r
+</ul>\r
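+\r
+<p>The diagonal of the covariance matrix printed in this log gives\r
+the variances of the eight parameters, so their standard errors\r
+are the square roots. The Python sketch below pairs them with the\r
+estimates of the &quot;# Parameters&quot; block; the parameter\r
+names a12, b12, etc. follow the order of that block, and the ratio\r
+estimate/standard error is only a rough indication that assumes\r
+asymptotic normality.</p>\r

```python
import math

# Estimates from the "# Parameters" block (a then b for 12, 13, 21, 23)
estimates = [-12.966061, 0.135117, -7.401109, 0.067831,
             -0.672648, -0.006627, -5.051297, 0.051271]
# Diagonal of the "# Covariance" block: last entry of rows 1 to 8
variances = [5.90661e-01, 8.98810e-05, 5.15824e-01, 8.23270e-05,
             1.75753e+00, 2.59422e-04, 1.32682e+00, 1.88622e-04]

names = ["a12", "b12", "a13", "b13", "a21", "b21", "a23", "b23"]
std_errors = [math.sqrt(v) for v in variances]
for name, est, se in zip(names, estimates, std_errors):
    # est / se is a rough Wald-type ratio
    print(f"{name}: {est: .6f}  se={se:.6f}  ratio={est / se: .2f}")
```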
+\r
+<p><font size="3">Once the run is finished, the program\r
+asks for a character:</font></p>\r
+\r
+<table border="1">\r
+    <tr>\r
+        <td width="100%"><strong>Type e to edit output files, g\r
+        to graph again, c to start again, and q for exiting:</strong></td>\r
+    </tr>\r
+</table>\r
+\r
+<p><font size="3">First you should enter <strong>e </strong>to\r
+edit the master file mypar.htm. </font></p>\r
+\r
+<ul>\r
+    <li><u>Output files</u> <br>\r
+        <br>\r
+        - Copy of the parameter file: <a href="ormypar.txt">ormypar.txt</a><br>\r
+        - Gnuplot file name: <a href="mypar.gp.txt">mypar.gp.txt</a><br>\r
+        - Observed prevalence in each state: <a\r
+        href="prmypar.txt">prmypar.txt</a> <br>\r
+        - Stationary prevalence in each state: <a\r
+        href="plrmypar.txt">plrmypar.txt</a> <br>\r
+        - Transition probabilities: <a href="pijrmypar.txt">pijrmypar.txt</a><br>\r
+        - Life expectancies by age and initial health status\r
+        (estepm=24 months): <a href="ermypar.txt">ermypar.txt</a>\r
+        <br>\r
+        - Parameter file with estimated parameters and the\r
+        covariance matrix: <a href="rmypar.txt">rmypar.txt</a> <br>\r
+        - Variance of one-step probabilities: <a\r
+        href="probrmypar.txt">probrmypar.txt</a> <br>\r
+        - Variances of life expectancies by age and initial\r
+        health status (estepm=24 months): <a href="vrmypar.txt">vrmypar.txt</a><br>\r
+        - Health expectancies with their variances: <a\r
+        href="trmypar.txt">trmypar.txt</a> <br>\r
+        - Standard deviation of stationary prevalences: <a\r
+        href="vplrmypar.txt">vplrmypar.txt</a> <br>\r
+        No population forecast: popforecast = 0 (instead of 1) or\r
+        stepm = 24 (instead of 1) or model=. (instead of .)<br>\r
+        <br>\r
+        </li>\r
+    <li><u>Graphs</u> <br>\r
+        <br>\r
+        -<a href="../mytry/pemypar1.gif">One-step transition\r
+        probabilities</a><br>\r
+        -<a href="../mytry/pmypar11.gif">Convergence to the\r
+        stationary prevalence</a><br>\r
+        -<a href="..\mytry\vmypar11.gif">Observed and stationary\r
+        prevalence in state (1) with the confidence interval</a> <br>\r
+        -<a href="..\mytry\vmypar21.gif">Observed and stationary\r
+        prevalence in state (2) with the confidence interval</a> <br>\r
+        -<a href="..\mytry\expmypar11.gif">Health\r
+        expectancies by age and initial health state (1)</a> <br>\r
+        -<a href="..\mytry\expmypar21.gif">Health\r
+        expectancies by age and initial health state (2)</a> <br>\r
+        -<a href="..\mytry\emypar1.gif">Total life expectancy by\r
+        age and health expectancies in states (1) and (2).</a> </li>\r
+</ul>\r
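+\r
+<p>The text output files listed above all follow the same naming\r
+pattern: a prefix, the base name of the parameter file, and the\r
+extension .txt. The Python sketch below rebuilds the names from\r
+that pattern; the prefixes are read off the list above, while the\r
+function name and the short descriptions are ours.</p>\r

```python
# Prefixes as they appear in the list of output files above
PREFIXES = {
    "copy of the parameter file": "or",
    "observed prevalence": "pr",
    "stationary prevalence": "plr",
    "transition probabilities": "pijr",
    "life expectancies by initial state": "er",
    "estimated parameters and covariance": "r",
    "variance of one-step probabilities": "probr",
    "variances of life expectancies": "vr",
    "health expectancies with variances": "tr",
    "std. deviation of stationary prevalence": "vplr",
}

def output_files(base):
    """Expected text output file names for a parameter file base name."""
    return {what: f"{prefix}{base}.txt" for what, prefix in PREFIXES.items()}

for what, name in output_files("mypar").items():
    print(f"{name:18s} {what}")
```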
+\r
+<p>This software has been partly funded by <a\r
+href="http://euroreves.ined.fr">Euro-REVES</a>, a concerted\r
+action from the European Union. It will be copyrighted\r
+identically to a GNU software product, i.e. program and software\r
+can be distributed freely for non-commercial use. Sources are not\r
+widely distributed today. You can get them by asking us with a\r
+simple justification (name, email, institute) at <a\r
+href="mailto:brouard@ined.fr">mailto:brouard@ined.fr</a> and <a\r
+href="mailto:lievre@ined.fr">mailto:lievre@ined.fr</a> .</p>\r
+\r
+<p>The latest version (0.8a of May 2002) can be accessed at <a\r
+href="http://euroreves.ined.fr/imach">http://euroreves.ined.fr/imach</a><br>\r
+</p>\r
+</body>\r
+</html>\r