<!-- $Id: imach.htm,v 1.12 2002/03/13 17:27:44 brouard Exp $ -->
    2: <html>
    3: 
    4: <head>
    5: <meta http-equiv="Content-Type"
    6: content="text/html; charset=iso-8859-1">
    7: <meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">
    8: <title>Computing Health Expectancies using IMaCh</title>
    9: <!-- Changed by: Agnes Lievre, 12-Oct-2000 -->
   17: </head>
   18: 
   19: <body bgcolor="#FFFFFF">
   20: 
   21: <hr size="3" color="#EC5E5E">
   22: 
   23: <h1 align="center"><font color="#00006A">Computing Health
   24: Expectancies using IMaCh</font></h1>
   25: 
   26: <h1 align="center"><font color="#00006A" size="5">(a Maximum
   27: Likelihood Computer Program using Interpolation of Markov Chains)</font></h1>
   28: 
   29: <p align="center">&nbsp;</p>
   30: 
   31: <p align="center"><a href="http://www.ined.fr/"><img
   32: src="logo-ined.gif" border="0" width="151" height="76"></a><img
   33: src="euroreves2.gif" width="151" height="75"></p>
   34: 
   35: <h3 align="center"><a href="http://www.ined.fr/"><font
   36: color="#00006A">INED</font></a><font color="#00006A"> and </font><a
   37: href="http://euroreves.ined.fr"><font color="#00006A">EUROREVES</font></a></h3>
   38: 
   39: <p align="center"><font color="#00006A" size="4"><strong>Version
   40: 0.8, March 2002</strong></font></p>
   41: 
   42: <hr size="3" color="#EC5E5E">
   43: 
   44: <p align="center"><font color="#00006A"><strong>Authors of the
   45: program: </strong></font><a href="http://sauvy.ined.fr/brouard"><font
   46: color="#00006A"><strong>Nicolas Brouard</strong></font></a><font
   47: color="#00006A"><strong>, senior researcher at the </strong></font><a
   48: href="http://www.ined.fr"><font color="#00006A"><strong>Institut
   49: National d'Etudes Démographiques</strong></font></a><font
   50: color="#00006A"><strong> (INED, Paris) in the &quot;Mortality,
   51: Health and Epidemiology&quot; Research Unit </strong></font></p>
   52: 
   53: <p align="center"><font color="#00006A"><strong>and Agnès
   54: Lièvre<br clear="left">
   55: </strong></font></p>
   56: 
   57: <h4><font color="#00006A">Contribution to the mathematics: C. R.
   58: Heathcote </font><font color="#00006A" size="2">(Australian
   59: National University, Canberra).</font></h4>
   60: 
   61: <h4><font color="#00006A">Contact: Agnès Lièvre (</font><a
   62: href="mailto:lievre@ined.fr"><font color="#00006A"><i>lievre@ined.fr</i></font></a><font
   63: color="#00006A">) </font></h4>
   64: 
   65: <hr>
   66: 
   67: <ul>
   68:     <li><a href="#intro">Introduction</a> </li>
   69:     <li><a href="#data">On what kind of data can it be used?</a></li>
   70:     <li><a href="#datafile">The data file</a> </li>
   71:     <li><a href="#biaspar">The parameter file</a> </li>
   72:     <li><a href="#running">Running Imach</a> </li>
   73:     <li><a href="#output">Output files and graphs</a> </li>
    <li><a href="#example">Example</a> </li>
   75: </ul>
   76: 
   77: <hr>
   78: 
   79: <h2><a name="intro"><font color="#00006A">Introduction</font></a></h2>
   80: 
   81: <p>This program computes <b>Healthy Life Expectancies</b> from <b>cross-longitudinal
   82: data</b> using the methodology pioneered by Laditka and Wolf (1).
   83: Within the family of Health Expectancies (HE), Disability-free
   84: life expectancy (DFLE) is probably the most important index to
   85: monitor. In low mortality countries, there is a fear that when
   86: mortality declines, the increase in DFLE is not proportionate to
   87: the increase in total Life expectancy. This case is called the <em>Expansion
   88: of morbidity</em>. Most of the data collected today, in
   89: particular by the international <a href="http://www.reves.org">REVES</a>
   90: network on Health expectancy, and most HE indices based on these
   91: data, are <em>cross-sectional</em>. It means that the information
collected comes from a single cross-sectional survey: people of
various ages (but mostly old people) are surveyed on their health
status at a single date. The proportion of people disabled at each
age can then be measured at that date. This age-specific
prevalence curve is then used to distinguish, within the
stationary population (which, by definition, is the life table
estimated from the vital statistics on mortality at the same
date), the disabled population from the disability-free
  100: population. Life expectancy (LE) (or total population divided by
  101: the yearly number of births or deaths of this stationary
  102: population) is then decomposed into DFLE and DLE. This method of
  103: computing HE is usually called the Sullivan method (from the name
  104: of the author who first described it).</p>
  105: 
<p>Age-specific proportions of people disabled are very difficult
to forecast because each proportion corresponds to the historical
conditions of the cohort: it is the result of the past flows into
disability and out of it (recovery) up until today. The
age-specific intensities (or incidence rates) of entering
disability or recovering good health reflect current conditions
and can therefore be used at each age to forecast the future of
this cohort. For example, if a country is improving its prosthesis
technology, the incidence of recovering the ability to walk will
be higher at each (old) age, but the prevalence of disability will
reflect this improvement only slightly, because prevalence is
mostly shaped by the history of the cohort and not by recent
period effects. To measure the period improvement we have to
simulate the future of a cohort of new-borns entering or leaving
the disability state, or dying, at each age according to the
incidence rates measured today on different cohorts. The
proportion of people disabled at each age in this simulated cohort
will be much lower (using the example of an improvement) than the
proportions observed at each age in a cross-sectional survey. This
new prevalence curve, introduced in a life table, will give a much
more up-to-date and realistic HE level than the Sullivan method,
which mostly measures the history of health conditions in the
country.</p>
  129: 
<p>Therefore, the main question is how to measure incidence rates
from cross-longitudinal surveys. This is the goal of the IMaCh
program. From your data and using IMaCh, you can estimate period
HE and not only Sullivan's HE. The standard errors of the HE are
also computed.</p>
  135: 
<p>A cross-longitudinal survey consists of a first survey
(&quot;cross&quot;) where individuals of different ages are
interviewed on their health status or degree of disability. At
least a second wave of interviews (&quot;longitudinal&quot;)
should measure each individual's new health status. Health
expectancies are computed from the transitions observed between
waves and are computed for each degree of severity of disability
(number of life states). The more degrees you consider, the more
time is necessary to reach the maximum likelihood of the
parameters involved in the model. Considering only two states of
disability (disabled and healthy) is generally enough, but the
computer program also works with more health statuses.<br>
  148: <br>
The simplest model is the multinomial logistic model where <i>pij</i>
is the probability of being observed in state <i>j</i> at the second
wave conditional on being observed in state <em>i</em> at the first
wave. Therefore a simple model is: log<em>(pij/pii)= aij +
bij*age+ cij*sex,</em> where '<i>age</i>' is age and '<i>sex</i>'
is a covariate. The advantage claimed for this computer program is
that if the delay between waves is not identical for each
individual, or if an individual missed an interview, the
information is not rounded or lost, but is taken into account
using an interpolation or extrapolation. <i>hPijx</i> is the
probability of being observed in state <i>j</i> at age <i>x+h</i>
conditional on the observed state <i>i</i> at age <i>x</i>. The
delay '<i>h</i>' can be split into an exact number (<i>nh*stepm</i>)
of unobserved intermediate states. This elementary transition (by
month, quarter, semester or year) is modeled as a multinomial
logistic. The <i>hPx</i> matrix is simply the matrix product of
<i>nh*stepm</i> elementary matrices, and the contribution of each
individual to the likelihood is simply <i>hPijx</i>.
  167: <br>
  168: </p>
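<p>To make the interpolation concrete, here is a minimal C sketch of
this computation. It is not the IMaCh source code: the function names
(step_matrix, hPx) and the parameter values used in main are purely
illustrative. It builds the elementary multinomial-logit matrix at a
given age for the simple model log(pij/pii) = aij + bij*age with two
living states and death, and multiplies nh of them, the age advancing
by stepm months at each step:</p>

<pre>#include &lt;stdio.h>
#include &lt;math.h>

#define NS 3   /* states: 0=healthy, 1=disabled, 2=dead (absorbing) */

/* One elementary transition matrix p[i][j] for a step of stepm months,
   from the logit parameters of log(pij/pii) = aij + bij*age. */
void step_matrix(double p[NS][NS], double a[NS][NS], double b[NS][NS],
                 double age)
{
    for (int i = 0; i &lt; NS; i++) {
        if (i == NS - 1) {                       /* death is absorbing */
            for (int j = 0; j &lt; NS; j++) p[i][j] = (i == j);
            continue;
        }
        double denom = 1.0;                      /* contribution of pii */
        for (int j = 0; j &lt; NS; j++)
            if (j != i) denom += exp(a[i][j] + b[i][j] * age);
        for (int j = 0; j &lt; NS; j++)
            p[i][j] = (j == i) ? 1.0 / denom
                               : exp(a[i][j] + b[i][j] * age) / denom;
    }
}

/* hPx: product of nh elementary matrices; the age moves by stepm months
   (stepm/12 years) from one elementary step to the next. */
void hPx(double out[NS][NS], double a[NS][NS], double b[NS][NS],
         double age, int nh, int stepm)
{
    double step[NS][NS], tmp[NS][NS];
    step_matrix(out, a, b, age);
    for (int s = 1; s &lt; nh; s++) {
        step_matrix(step, a, b, age + s * stepm / 12.0);
        for (int i = 0; i &lt; NS; i++)
            for (int j = 0; j &lt; NS; j++) {
                tmp[i][j] = 0.0;
                for (int k = 0; k &lt; NS; k++)
                    tmp[i][j] += out[i][k] * step[k][j];
            }
        for (int i = 0; i &lt; NS; i++)
            for (int j = 0; j &lt; NS; j++) out[i][j] = tmp[i][j];
    }
}

int main(void)
{
    /* Purely illustrative parameter values, not estimates from a real run */
    double a[NS][NS] = {{0, -12.3, -9.2}, {-2.6, 0, -8.0}, {0, 0, 0}};
    double b[NS][NS] = {{0, 0.09, 0.05}, {-0.02, 0, 0.04}, {0, 0, 0}};
    double p[NS][NS];
    hPx(p, a, b, 70.0, 24, 1);   /* 24 monthly steps: from age 70 to age 72 */
    printf("hP11=%.4f hP12=%.4f hP13=%.4f\n", p[0][0], p[0][1], p[0][2]);
    return 0;
}</pre>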
  169: 
  170: <p>The program presented in this manual is a quite general
  171: program named <strong>IMaCh</strong> (for <strong>I</strong>nterpolated
  172: <strong>MA</strong>rkov <strong>CH</strong>ain), designed to
analyse transition data from longitudinal surveys. The first step
is the estimation of the parameters of a model of transition
probabilities between an initial status and a final status. From
there, the computer program produces some indicators such as
observed and stationary prevalences, life expectancies and their
variances, and graphs. Our transition model consists of absorbing
and non-absorbing states, with the possibility of return among the
non-absorbing states. The main advantage of this package, compared
to other programs for the analysis of transition data (for
example, Proc Catmod of SAS<sup>®</sup>), is that the whole
individual information is used even if an interview is missing, a
status or a date is unknown, or the delay between waves is not
identical for each individual. The program can be executed
according to parameters: selection of a sub-sample, number of
absorbing and non-absorbing states, number of waves taken into
account (the user inputs the first and the last interview), a
tolerance level for the maximisation function, the periodicity of
the transitions (we can compute annual, quarterly or monthly
transitions), and covariates in the model. It works on Windows or
on Unix.<br>
  193: </p>
  194: 
  195: <hr>
  196: 
  197: <p>(1) Laditka, Sarah B. and Wolf, Douglas A. (1998), &quot;New
  198: Methods for Analyzing Active Life Expectancy&quot;. <i>Journal of
  199: Aging and Health</i>. Vol 10, No. 2. </p>
  200: 
  201: <hr>
  202: 
  203: <h2><a name="data"><font color="#00006A">On what kind of data can
  204: it be used?</font></a></h2>
  205: 
<p>The minimum data required for a transition model is the
recording of a set of individuals interviewed at a first date and
interviewed again at least one other time. From the observations
of an individual, we obtain a follow-up over time of the
occurrence of a specific event. In this documentation, the event
is related to health status at older ages, but the program can be
applied to many longitudinal studies in different contexts. To
build the data file described in the next section, you must have
the month and year of each interview and the corresponding health
status. But in order to compute age, the date of birth (month and
year) is required (a missing value is allowed for the month). The
date of death (month and year) is an important piece of
information, also required if the individual has died. Shorter
steps (e.g. a month) will more closely take into account the
survival time after the last interview.</p>
  221: 
  222: <hr>
  223: 
  224: <h2><a name="datafile"><font color="#00006A">The data file</font></a></h2>
  225: 
  226: <p>In this example, 8,000 people have been interviewed in a
  227: cross-longitudinal survey of 4 waves (1984, 1986, 1988, 1990).
Some people missed 1, 2 or 3 interviews. Health statuses are
healthy (1) and disabled (2). The survey is not a real one: it is
a simulation of the American Longitudinal Survey on Aging. The
disability state is defined as being unable to perform one of four
ADLs (Activities of Daily Living, such as bathing, eating or
walking). Therefore, even if the individuals interviewed in the
sample are virtual, the information brought by this sample is
close to the situation of the United States. Sex is not recorded
in this sample.</p>
  237: 
  238: <p>Each line of the data set (named <a href="data1.txt">data1.txt</a>
in this first example) is an individual record whose fields are: </p>
  240: 
  241: <ul>
    <li><b>Index number</b>: positive number (field 1) </li>
    <li><b>First covariate</b>: positive number (field 2) </li>
    <li><b>Second covariate</b>: positive number (field 3) </li>
    <li><a name="Weight"><b>Weight</b></a>: positive number
        (field 4). In most surveys individuals are weighted
        according to the stratification of the sample.</li>
    <li><b>Date of birth</b>: coded as mm/yyyy. Missing dates are
        coded as 99/9999 (field 5) </li>
    <li><b>Date of death</b>: coded as mm/yyyy. Missing dates are
        coded as 99/9999 (field 6) </li>
    <li><b>Date of first interview</b>: coded as mm/yyyy. Missing
        dates are coded as 99/9999 (field 7) </li>
    <li><b>Status at first interview</b>: positive number.
        Missing values are coded -1 (field 8) </li>
    <li><b>Date of second interview</b>: coded as mm/yyyy.
        Missing dates are coded as 99/9999 (field 9) </li>
    <li><b>Status at second interview</b>: positive number.
        Missing values are coded -1 (field 10) </li>
    <li><b>Date of third interview</b>: coded as mm/yyyy. Missing
        dates are coded as 99/9999 (field 11) </li>
    <li><b>Status at third interview</b>: positive number.
        Missing values are coded -1 (field 12) </li>
    <li><b>Date of fourth interview</b>: coded as mm/yyyy.
        Missing dates are coded as 99/9999 (field 13) </li>
    <li><b>Status at fourth interview</b>: positive number.
        Missing values are coded -1 (field 14) </li>
    <li>etc.</li>
  269: </ul>
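<p>For instance, a record following this layout could look like the
line below (a made-up illustration, not an actual line of <a
href="data1.txt">data1.txt</a>): an individual with both covariates
equal to 1, a weight of 1, born in June 1921, not dead, observed
healthy at the first three waves and disabled at the fourth.</p>

<pre>1 1 1 1 06/1921 99/9999 05/1984 1 06/1986 1 05/1988 1 06/1990 2</pre>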
  270: 
  271: <p>&nbsp;</p>
  272: 
<p>If your longitudinal survey does not include information about
  274: weights or covariates, you must fill the column with a number
  275: (e.g. 1) because a missing field is not allowed.</p>
  276: 
  277: <hr>
  278: 
  279: <h2><font color="#00006A">Your first example parameter file</font><a
  280: href="http://euroreves.ined.fr/imach"></a><a name="uio"></a></h2>
  281: 
  282: <h2><a name="biaspar"></a>#Imach version 0.8, March 2002,
  283: INED-EUROREVES </h2>
  284: 
  285: <p>This is a comment. Comments start with a '#'.</p>
  286: 
  287: <h4><font color="#FF0000">First uncommented line</font></h4>
  288: 
  289: <pre>title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4</pre>
  290: 
  291: <ul>
    <li><b>title=</b> 1st_example is the title of the run. </li>
    <li><b>datafile=</b>data1.txt is the name of the data set.
        Our example is a six-year follow-up survey. It consists
        of a baseline interview followed by 3 re-interviews. </li>
    <li><b>lastobs=</b> 8600 The program is able to run on a
        subsample whose last observation number is lastobs.
        It can be set to a number bigger than the real number of
        observations (e.g. 100000). In this example, the
        maximisation will be done on the first 8600 records. </li>
    <li><b>firstpass=1</b> , <b>lastpass=4 </b>In case of more
        than two interviews in the survey, the program can be run
        on selected transition periods. firstpass=1 means the
        first interview included in the calculation is the
        baseline survey. lastpass=4 means that the information
        brought by the 4th interview is taken into account.</li>
  307: </ul>
  308: 
  309: <p>&nbsp;</p>
  310: 
  311: <h4><a name="biaspar-2"><font color="#FF0000">Second uncommented
  312: line</font></a></h4>
  313: 
  314: <pre>ftol=1.e-08 stepm=1 ncovcol=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0</pre>
  315: 
  316: <ul>
  317:     <li><b>ftol=1e-8</b> Convergence tolerance on the function
  318:         value in the maximisation of the likelihood. Choosing a
  319:         correct value for ftol is difficult. 1e-8 is a correct
        value for a 32-bit computer.</li>
  321:     <li><b>stepm=1</b> Time unit in months for interpolation.
  322:         Examples:<ul>
  323:             <li>If stepm=1, the unit is a month </li>
            <li>If stepm=3, the unit is a quarter (trimester)</li>
  325:             <li>If stepm=12, the unit is a year </li>
  326:             <li>If stepm=24, the unit is two years</li>
  327:             <li>... </li>
  328:         </ul>
  329:     </li>
  330:     <li><b>ncovcol=2</b> Number of covariate columns in the datafile
  331:     which precede the date of birth. Here you can put variables that
    won't necessarily be used during the run. It is not the number of
    covariates that will be specified by the model. The 'model'
    syntax describes the covariates to be taken into account. </li>
  335:     <li><b>nlstate=2</b> Number of non-absorbing (alive) states.
  336:         Here we have two alive states: disability-free is coded 1
  337:         and disability is coded 2. </li>
  338:     <li><b>ndeath=1</b> Number of absorbing states. The absorbing
  339:         state death is coded 3. </li>
  340:     <li><b>maxwav=4</b> Number of waves in the datafile.</li>
  341:     <li><a name="mle"><b>mle</b></a><b>=1</b> Option for the
  342:         Maximisation Likelihood Estimation. <ul>
  343:             <li>If mle=1 the program does the maximisation and
  344:                 the calculation of health expectancies </li>
  345:             <li>If mle=0 the program only does the calculation of
  346:                 the health expectancies. </li>
  347:         </ul>
  348:     </li>
  349:     <li><b>weight=0</b> Possibility to add weights. <ul>
  350:             <li>If weight=0 no weights are included </li>
  351:             <li>If weight=1 the maximisation integrates the
  352:                 weights which are in field <a href="#Weight">4</a></li>
  353:         </ul>
  354:     </li>
  355: </ul>
  356: 
  357: <h4><font color="#FF0000">Covariates</font></h4>
  358: 
  359: <p>Intercept and age are systematically included in the model.
  360: Additional covariates can be included with the command: </p>
  361: 
  362: <pre>model=<em>list of covariates</em></pre>
  363: 
  364: <ul>
  365:     <li>if<strong> model=. </strong>then no covariates are
  366:         included</li>
  367:     <li>if <strong>model=V1</strong> the model includes the first
  368:         covariate (field 2)</li>
  369:     <li>if <strong>model=V2 </strong>the model includes the
  370:         second covariate (field 3)</li>
  371:     <li>if <strong>model=V1+V2 </strong>the model includes the
  372:         first and the second covariate (fields 2 and 3)</li>
  373:     <li>if <strong>model=V1*V2 </strong>the model includes the
  374:         product of the first and the second covariate (fields 2
  375:         and 3)</li>
  376:     <li>if <strong>model=V1+V1*age</strong> the model includes
        the first covariate and its product with age (V1*age)</li>
  378: </ul>
  379: 
  380: <p>In this example, we have two covariates in the data file
  381: (fields 2 and 3). The number of covariates included in the data file
  382: between the id and the date of birth is ncovcol=2 (it was named ncov
in versions prior to 0.8). If you have 3 covariates in the datafile
(fields 2, 3 and 4), you will set ncovcol=3. Then you can run the
programme with a new parametrisation taking into account the
third covariate. For example, <strong>model=V1+V3 </strong>estimates
a model with the first and third covariates. More complicated
models can be used, but they will take more time to converge. With
a simple model (no covariates), the programme estimates 8
parameters. Adding covariates increases the number of parameters:
12 for <strong>model=V1, </strong>16 for <strong>model=V1+V1*age
</strong>and 20 for <strong>model=V1+V2+V3.</strong></p>
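<p>These parameter counts follow from the formula given in the next
section, <em>N</em>=(<em>nlstate</em>+<em>ndeath</em>-1)*<em>nlstate</em>*<em>ncovmodel</em>,
where ncovmodel counts the intercept and age plus the covariates of
the 'model' statement. As a check for this example (nlstate=2,
ndeath=1, hence 4 parameters per term of ncovmodel):</p>

<pre>model=.           ncovmodel=2 (intercept, age)     N = 2*2*2 =  8
model=V1          ncovmodel=3 (+ V1)               N = 2*2*3 = 12
model=V1+V1*age   ncovmodel=4 (+ V1, V1*age)       N = 2*2*4 = 16
model=V1+V2+V3    ncovmodel=5 (+ V1, V2, V3)       N = 2*2*5 = 20</pre>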
  393: 
  394: <h4><font color="#FF0000">Guess values for optimization</font><font
  395: color="#00006A"> </font></h4>
  396: 
<p>You must write the initial guess values of the parameters for
the optimization. The number of parameters, <em>N</em>, depends on
the numbers of absorbing and non-absorbing states and on the
number of covariates. <br>
<em>N</em> is given by the formula <em>N</em>=(<em>nlstate</em> +
<em>ndeath</em>-1)*<em>nlstate</em>*<em>ncovmodel</em>&nbsp;. <br>
<br>
Thus, in the simple case with 2 covariates (the model is log
(pij/pii) = aij + bij * age, where intercept and age are the two
covariates), 2 health degrees (1 for disability-free and 2
for disability) and 1 absorbing state (3), you must enter 8
initial values: a12, b12, a13, b13, a21, b21, a23, b23. You can
start with zeros as in this example, but if you have a more
precise set (for example from an earlier run) you can enter it,
and it will speed up the convergence.<br>
  412: Each of the four lines starts with indices &quot;ij&quot;: <b>ij
  413: aij bij</b> </p>
  414: 
  415: <blockquote>
  416:     <pre># Guess values of aij and bij in log (pij/pii) = aij + bij * age
  417: 12 -14.155633  0.110794 
  418: 13  -7.925360  0.032091 
  419: 21  -1.890135 -0.029473 
  420: 23  -6.234642  0.022315 </pre>
  421: </blockquote>
  422: 
<p>or, to simplify (in most cases it converges, but there is no
guarantee!): </p>
  425: 
  426: <blockquote>
  427:     <pre>12 0.0 0.0
  428: 13 0.0 0.0
  429: 21 0.0 0.0
  430: 23 0.0 0.0</pre>
  431: </blockquote>
  432: 
<p> In order to speed up the convergence you can make a first run with
a large stepm, i.e. stepm=12 or 24, and then decrease stepm until
stepm=1 month. If stepm is the new, shorter step and the old step is
a multiple of it (old step = n * stepm), then the following
approximation holds: 
  438: <pre>aij(stepm) = aij(n . stepm) - ln(n)
  439: </pre> and
  440: <pre>bij(stepm) = bij(n . stepm) .</pre>
  441: 
<p> For example, if you already ran with a 6-month interval and
got:<br>
  444:  <pre># Parameters
  445: 12 -13.390179  0.126133 
  446: 13  -7.493460  0.048069 
  447: 21   0.575975 -0.041322 
  448: 23  -4.748678  0.030626 
  449: </pre>
If you now want to get the monthly estimates, you can guess the aij by
subtracting ln(6) = 1.7917<br> and running<br>
  452: <pre>12 -15.18193847  0.126133 
  453: 13 -9.285219469  0.048069
  454: 21 -1.215784469 -0.041322
  455: 23 -6.540437469  0.030626
  456: </pre>
and get<br>
<pre>12 -15.029768 0.124347 
13 -8.472981 0.036599 
21 -1.472527 -0.038394 
23 -6.553602 0.029856 
</pre>
which shows that the guessed values were already close to the final
estimates. The approximation is probably useful only for very small
intervals, and we don't have enough experience to know whether it
will speed up the convergence or not. Some useful values of -ln(n)
for common step changes are:
  466: <pre>         -ln(12)= -2.484
  467:  -ln(6/1)=-ln(6)= -1.791
  468:  -ln(3/1)=-ln(3)= -1.0986
  469: -ln(12/6)=-ln(2)= -0.693
  470: </pre>
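<p>A small C sketch of this conversion (only a sketch, not part of
IMaCh; the array layout is illustrative), applying the approximation
to the 6-month estimates above in order to print monthly starting
values:</p>

<pre>#include &lt;stdio.h>
#include &lt;math.h>

int main(void)
{
    /* 6-month estimates quoted above, one line per transition ij */
    const char *ij[4] = {"12", "13", "21", "23"};
    double a6[4] = {-13.390179, -7.493460, 0.575975, -4.748678};
    double b6[4] = { 0.126133,   0.048069, -0.041322, 0.030626};
    int n = 6;   /* old step = 6 x new step (6 months -> 1 month) */

    /* aij shifts by -ln(n); bij is left unchanged */
    for (int k = 0; k &lt; 4; k++)
        printf("%s %12.6f %9.6f\n", ij[k], a6[k] - log((double) n), b6[k]);
    return 0;
}</pre>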
  471: 
  472: <h4><font color="#FF0000">Guess values for computing variances</font></h4>
  473: 
  474: <p>This is an output if <a href="#mle">mle</a>=1. But it can be
  475: used as an input to get the various output data files (Health
  476: expectancies, stationary prevalence etc.) and figures without
  477: rerunning the rather long maximisation phase (mle=0). </p>
  478: 
  479: <p>The scales are small values for the evaluation of numerical
  480: derivatives. These derivatives are used to compute the hessian
  481: matrix of the parameters, that is the inverse of the covariance
  482: matrix, and the variances of health expectancies. Each line
consists of indices &quot;ij&quot; followed by the initial scales
  484: (zero to simplify) associated with aij and bij. </p>
  485: <ul> <li>If mle=1 you can enter zeros:</li>
  486: <blockquote><pre># Scales (for hessian or gradient estimation)
  487: 12 0. 0. 
  488: 13 0. 0. 
  489: 21 0. 0. 
  490: 23 0. 0. </pre>
  491: </blockquote>
  492:     <li>If mle=0 you must enter a covariance matrix (usually
  493:         obtained from an earlier run).</li>
  494: </ul>
  495: 
  496: <h4><font color="#FF0000">Covariance matrix of parameters</font></h4>
  497: 
  498: <p>This is an output if <a href="#mle">mle</a>=1. But it can be
  499: used as an input to get the various output data files (Health
  500: expectancies, stationary prevalence etc.) and figures without
  501: rerunning the rather long maximisation phase (mle=0). <br>
  502: Each line starts with indices &quot;ijk&quot; followed by the
  503: covariances between aij and bij:<br>
  504: <pre>
  505:    121 Var(a12) 
  506:    122 Cov(b12,a12)  Var(b12) 
  507:           ...
  508:    232 Cov(b23,a12)  Cov(b23,b12) ... Var (b23) </pre>
  509: <ul>
  510:     <li>If mle=1 you can enter zeros. </li>
  511:     <pre># Covariance matrix
  512: 121 0.
  513: 122 0. 0.
  514: 131 0. 0. 0. 
  515: 132 0. 0. 0. 0. 
  516: 211 0. 0. 0. 0. 0. 
  517: 212 0. 0. 0. 0. 0. 0. 
  518: 231 0. 0. 0. 0. 0. 0. 0. 
  519: 232 0. 0. 0. 0. 0. 0. 0. 0.</pre>
  520:     <li>If mle=0 you must enter a covariance matrix (usually
  521:         obtained from an earlier run). </li>
  522: </ul>
  523: 
  524: <h4><font color="#FF0000">Age range for calculation of stationary
  525: prevalences and health expectancies</font></h4>
  526: 
  527: <pre>agemin=70 agemax=100 bage=50 fage=100</pre>
  528: 
<br>Once we have obtained the estimated parameters, the program is
able to calculate stationary prevalences, transition probabilities
and life expectancies at any age. The choice of the age range is
useful for extrapolation. In our data file, ages vary from 70 to
102. It is possible to get extrapolated stationary prevalences by
age, ranging from agemin to agemax.

<br>Setting bage=50 (begin age) and fage=100 (final age) makes
the program compute life expectancies from age 'bage' to age
'fage'. As we use a model, we can usefully compute life
expectancies on a wider age range than the age range of the data.
But the model can be rather wrong on much larger intervals.
The program is limited to an upper age of around 120!
  542: <ul>
  543:     <li><b>agemin=</b> Minimum age for calculation of the
  544:         stationary prevalence </li>
  545:     <li><b>agemax=</b> Maximum age for calculation of the
  546:         stationary prevalence </li>
  547:     <li><b>bage=</b> Minimum age for calculation of the health
  548:         expectancies </li>
  549:     <li><b>fage=</b> Maximum age for calculation of the health
  550:         expectancies </li>
  551: </ul>
  552: 
  553: <h4><a name="Computing"><font color="#FF0000">Computing</font></a><font
  554: color="#FF0000"> the observed prevalence</font></h4>
  555: 
  556: <pre>begin-prev-date=1/1/1984 end-prev-date=1/6/1988 </pre>
  557: 
<br>Statements 'begin-prev-date' and 'end-prev-date' allow you to
select the period over which we calculate the observed prevalences
in each state. In this example, the prevalences are calculated on
survey data collected between 1 January 1984 and 1 June 1988. 
  562: <ul>
  563:     <li><strong>begin-prev-date= </strong>Starting date
  564:         (day/month/year)</li>
  565:     <li><strong>end-prev-date= </strong>Final date
  566:         (day/month/year)</li>
  567: </ul>
  568: 
  569: <h4><font color="#FF0000">Population- or status-based health
  570: expectancies</font></h4>
  571: 
  572: <pre>pop_based=0</pre>
  573: 
<p>The program computes status-based health expectancies, i.e.
health expectancies which depend on your initial health state.
If you are healthy, your healthy life expectancy (e11) is higher
than if you were disabled (e21, with e11 &gt; e21).<br>
To compute a healthy life expectancy independent of the initial
status, we have to weight e11 and e21 according to the probability
of being in each state at the initial age or, in other words,
according to the proportion of people in each state.<br>
We prefer computing a 'pure' period healthy life expectancy based
only on the transition forces. Then the weights are simply the
stationary prevalences or 'implied' prevalences at the initial
age.<br>
Some other people would prefer to use the cross-sectional
prevalences (the &quot;Sullivan prevalences&quot;) observed at
the initial age during a period of time <a href="#Computing">defined
just above</a>. </p>
  590: 
  591: <ul>
  592:     <li><strong>popbased= 0 </strong>Health expectancies are
  593:         computed at each age from stationary prevalences
  594:         'expected' at this initial age.</li>
  595:     <li><strong>popbased= 1 </strong>Health expectancies are
        computed at each age from the cross-sectional 'observed'
        prevalence at this initial age. As the whole population is
        not observed at the same exact date, we define a short
        period over which the observed prevalence is computed.</li>
  600: </ul>
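<p>In symbols, writing w1(x) and w2(x) for the weights at the initial
age x (the stationary prevalences when popbased=0, the observed
prevalences when popbased=1), the weighted expectancies are simply:</p>

<pre>e.j(x) = w1(x)*e1j(x) + w2(x)*e2j(x)     for j=1,2
e..(x) = e.1(x) + e.2(x)</pre>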
  601: 
  602: <h4><font color="#FF0000">Prevalence forecasting ( Experimental)</font></h4>
  603: 
  604: <pre>starting-proj-date=1/1/1989 final-proj-date=1/1/1992 mov_average=0 </pre>
  605: 
<p>Prevalence and population projections are only available if
the interpolation unit is a month, i.e. stepm=1, and if there are
no covariates. The programme estimates the prevalence in each
state at a precise date expressed as day/month/year. The
programme computes one forecasted prevalence per year from a
starting date (1 January 1989 in this example) to a final date
(1 January 1992). The statement mov_average allows computing
smoothed forecasted prevalences with a five-age moving average
centered at the mid-age of the five-age period. <br>
  615: 
  616: <ul>
  617:     <li><strong>starting-proj-date</strong>= starting date
  618:         (day/month/year) of forecasting</li>
  619:     <li><strong>final-proj-date= </strong>final date
  620:         (day/month/year) of forecasting</li>
  621:     <li><strong>mov_average</strong>= smoothing with a five-age
  622:         moving average centered at the mid-age of the five-age
  623:         period. The command<strong> mov_average</strong> takes
  624:         value 1 if the prevalences are smoothed and 0 otherwise.</li>
  625: </ul>
  626: 
  627: <h4><font color="#FF0000">Last uncommented line : Population
  628: forecasting </font></h4>
  629: 
  630: <pre>popforecast=0 popfile=pyram.txt popfiledate=1/1/1989 last-popfiledate=1/1/1992</pre>
  631: 
  632: <p>This command is available if the interpolation unit is a
  633: month, i.e. stepm=1 and if popforecast=1. From a data file
  634: including age and number of persons alive at the precise date
  635: &#145;popfiledate&#146;, you can forecast the number of persons
  636: in each state until date &#145;last-popfiledate&#146;. In this
  637: example, the popfile <a href="pyram.txt"><b>pyram.txt</b></a>
  638: includes real data which are the Japanese population in 1989.<br>
  639: 
<ul>
    <li><b>popforecast=0 </b>Option for population forecasting. If
        popforecast=1, the programme does the forecasting.</li>
    <li><b>popfile= </b>name of the population file</li>
    <li><b>popfiledate=</b> date of the population file</li>
    <li><b>last-popfiledate</b>= date of the last population
        projection&nbsp;</li>
</ul>
  655: 
  656: <hr>
  657: 
  658: <h2><a name="running"></a><font color="#00006A">Running Imach
  659: with this example</font></h2>
  660: 
We assume that you typed in your <a href="biaspar.imach">1st_example
parameter file</a> as explained <a href="#biaspar">above</a>. 
<br>To run the program you should either:
<ul> <li> click on the imach.exe icon and enter
the name of the parameter file, which is for example <a
href="C:\usr\imach\mle\biaspar.imach">C:\usr\imach\mle\biaspar.imach</a>
<li> locate the biaspar.imach icon in 
<a href="C:\usr\imach\mle">C:\usr\imach\mle</a> with your mouse and drag it
onto the imach window
<li> with the latest versions (0.7 and higher), if you have set up Windows to
recognise the ".imach" extension, you can right-click the
biaspar.imach icon and either edit the parameter file with Notepad or
execute it with imach.
</ul>  
  675: 
  676: The time to converge depends on the step unit that you used (1
  677: month is cpu consuming), on the number of cases, and on the
  678: number of variables.
  679: 
  680: <br>The program outputs many files. Most of them are files which
  681: will be plotted for better understanding.
  682: 
  683: <hr>
  684: 
  685: <h2><a name="output"><font color="#00006A">Output of the program
  686: and graphs</font> </a></h2>
  687: 
  688: <p>Once the optimization is finished, some graphics can be made
  689: with a grapher. We use Gnuplot which is an interactive plotting
  690: program copyrighted but freely distributed. A gnuplot reference
  691: manual is available <a href="http://www.gnuplot.info/">here</a>. <br>
When the run is finished, the user should enter a character
for plotting and output editing.

<br>These characters are:<br>

<ul>
    <li>'c' to start the program again from the beginning.</li>
    <li>'e' opens the <a href="biaspar.htm"><strong>biaspar.htm</strong></a>
        file to edit the output files and graphs. </li>
    <li>'q' to exit.</li>
  702: </ul>
  703: 
  704: <h5><font size="4"><strong>Results files </strong></font><br>
  705: <br>
  706: <font color="#EC5E5E" size="3"><strong>- </strong></font><a
  707: name="Observed prevalence in each state"><font color="#EC5E5E"
  708: size="3"><strong>Observed prevalence in each state</strong></font></a><font
  709: color="#EC5E5E" size="3"><strong> (and at first pass)</strong></font><b>:
  710: </b><a href="prbiaspar.txt"><b>prbiaspar.txt</b></a><br>
  711: </h5>
  712: 
  713: <p>The first line is the title and displays each field of the
file. The first column is age. Fields 2 and 6 are the
proportions of individuals in states 1 and 2 respectively, as
observed during the first exam. The other fields are the numbers
of people in states 1, 2 or more. The number of columns increases
if the number of states is higher than 2.<br>
  719: The header of the file is </p>
  720: 
  721: <pre># Age Prev(1) N(1) N Age Prev(2) N(2) N
  722: 70 1.00000 631 631 70 0.00000 0 631
  723: 71 0.99681 625 627 71 0.00319 2 627 
  724: 72 0.97125 1115 1148 72 0.02875 33 1148 </pre>
  725: 
  726: <p>It means that at age 70, the prevalence in state 1 is 1.000
and in state 2 is 0.000. At age 71 the number of individuals in
  728: state 1 is 625 and in state 2 is 2, hence the total number of
  729: people aged 71 is 625+2=627. <br>
  730: </p>
  731: 
  732: <h5><font color="#EC5E5E" size="3"><b>- Estimated parameters and
  733: covariance matrix</b></font><b>: </b><a href="rbiaspar.txt"><b>rbiaspar.imach</b></a></h5>
  734: 
  735: <p>This file contains all the maximisation results: </p>
  736: 
  737: <pre> -2 log likelihood= 21660.918613445392
  738:  Estimated parameters: a12 = -12.290174 b12 = 0.092161 
  739:                        a13 = -9.155590  b13 = 0.046627 
  740:                        a21 = -2.629849  b21 = -0.022030 
  741:                        a23 = -7.958519  b23 = 0.042614  
  742:  Covariance matrix: Var(a12) = 1.47453e-001
  743:                     Var(b12) = 2.18676e-005
  744:                     Var(a13) = 2.09715e-001
  745:                     Var(b13) = 3.28937e-005  
  746:                     Var(a21) = 9.19832e-001
  747:                     Var(b21) = 1.29229e-004
  748:                     Var(a23) = 4.48405e-001
  749:                     Var(b23) = 5.85631e-005 
  750:  </pre>
  751: 
  752: <p>By substitution of these parameters in the regression model,
  753: we obtain the elementary transition probabilities:</p>
  754: 
  755: <p><img src="pebiaspar1.gif" width="400" height="300"></p>
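<p>As a worked illustration of this substitution (values rounded),
take age 70 and the monthly step (stepm=1) used in this example; with
death coded 3, the multinomial logit gives:</p>

<pre>a12 + b12*70 = -12.290174 + 0.092161*70 = -5.839
a13 + b13*70 =  -9.155590 + 0.046627*70 = -5.892
p12(70) = exp(-5.839)/(1 + exp(-5.839) + exp(-5.892)) = 0.0029 (approx.)
p13(70) = exp(-5.892)/(1 + exp(-5.839) + exp(-5.892)) = 0.0027 (approx.)
p11(70) = 1 - p12(70) - p13(70) = 0.9944 (approx.)</pre>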
  756: 
  757: <h5><font color="#EC5E5E" size="3"><b>- Transition probabilities</b></font><b>:
  758: </b><a href="pijrbiaspar.txt"><b>pijrbiaspar.txt</b></a></h5>
  759: 
<p>Here are the transition probabilities Pij(x, x+nh), where nh
  761: is a multiple of 2 years. The first column is the starting age x
  762: (from age 50 to 100), the second is age (x+nh) and the others are
  763: the transition probabilities p11, p12, p13, p21, p22, p23. For
  764: example, line 5 of the file is: </p>
  765: 
  766: <pre> 100 106 0.02655 0.17622 0.79722 0.01809 0.13678 0.84513 </pre>
  767: 
  768: <p>and this means: </p>
  769: 
  770: <pre>p11(100,106)=0.02655
  771: p12(100,106)=0.17622
  772: p13(100,106)=0.79722
  773: p21(100,106)=0.01809
  774: p22(100,106)=0.13678
p23(100,106)=0.84513 </pre>
  776: 
  777: <h5><font color="#EC5E5E" size="3"><b>- </b></font><a
  778: name="Stationary prevalence in each state"><font color="#EC5E5E"
  779: size="3"><b>Stationary prevalence in each state</b></font></a><b>:
  780: </b><a href="plrbiaspar.txt"><b>plrbiaspar.txt</b></a></h5>
  781: 
  782: <pre>#Prevalence
  783: #Age 1-1 2-2
  784: 
  785: #************ 
  786: 70 0.90134 0.09866
  787: 71 0.89177 0.10823 
  788: 72 0.88139 0.11861 
  789: 73 0.87015 0.12985 </pre>
  790: 
<p>At age 70 the stationary prevalence is 0.90134 in state 1 and
0.09866 in state 2. This stationary prevalence differs from the
observed prevalence, and this is the key point. The observed
prevalence at age 70 results from the incidences of disability and
of recovery and from the mortality which occurred in the past of
the cohort. The stationary prevalence results from a simulation
with current incidences and mortality (estimated from this
cross-longitudinal survey). It is the best predictive value of the
prevalence in the future if &quot;nothing changes in the
future&quot;. This is exactly what demographers do with a life
table. Life expectancy is the expected mean time to survive if the
observed mortality rates (incidence of mortality) &quot;remain
constant&quot; in the future. </p>
  804: 
  805: <h5><font color="#EC5E5E" size="3"><b>- Standard deviation of
  806: stationary prevalence</b></font><b>: </b><a
  807: href="vplrbiaspar.txt"><b>vplrbiaspar.txt</b></a></h5>
  808: 
<p>The stationary prevalence has to be compared with the observed
prevalence by age. But both are statistical estimates and are
subject to stochastic errors due to the size of the sample, the
design of the survey and, for the stationary prevalence, to the
model used and fitted. It is possible to compute the standard
deviation of the stationary prevalence at each age.</p>
  815: 
<h5><font color="#EC5E5E" size="3">- Observed and stationary
prevalence in state (2=disabled) with the confidence interval</font>:<b>
  818: </b><a href="vbiaspar21.htm"><b>vbiaspar21.gif</b></a></h5>
  819: 
  820: <p>This graph exhibits the stationary prevalence in state (2)
  821: with the confidence interval in red. The green curve is the
  822: observed prevalence (or proportion of individuals in state (2)).
  823: Without discussing the results (it is not the purpose here), we
  824: observe that the green curve is rather below the stationary
  825: prevalence. It suggests an increase of the disability prevalence
  826: in the future.</p>
  827: 
  828: <p><img src="vbiaspar21.gif" width="400" height="300"></p>
  829: 
  830: <h5><font color="#EC5E5E" size="3"><b>-Convergence to the
  831: stationary prevalence of disability</b></font><b>: </b><a
  832: href="pbiaspar11.gif"><b>pbiaspar11.gif</b></a><br>
  833: <img src="pbiaspar11.gif" width="400" height="300"> </h5>
  834: 
<p>This graph plots the conditional transition probabilities from
an initial state (1=healthy in red at the bottom, or 2=disabled in
green on top) at age <em>x </em>to the final state 2=disabled<em> </em>at
age <em>x+h. </em>Conditional means conditional on being alive
at age <em>x+h</em>, which has probability <i>hP12x</i> + <em>hP22x</em>. The
curves <i>hP12x/(hP12x</i> + <em>hP22x) </em>and <i>hP22x/(hP12x</i>
+ <em>hP22x) </em>converge with <em>h </em>to the <em>stationary
prevalence of disability</em>. In order to get the stationary
prevalence at age 70 we should start the process at an earlier
age, i.e. 50. If the disability state is defined by severe
disability criteria with only a small chance of recovery, then the
incidence of recovery is low and the time to convergence is
probably longer. But we don't have much experience with this yet.</p>
  848: 
  849: <h5><font color="#EC5E5E" size="3"><b>- Life expectancies by age
  850: and initial health status</b></font><b>: </b><a
  851: href="erbiaspar.txt"><b>erbiaspar.txt</b></a></h5>
  852: 
  853: <pre># Health expectancies 
  854: # Age 1-1 1-2 2-1 2-2 
  855: 70 10.9226 3.0401 5.6488 6.2122 
  856: 71 10.4384 3.0461 5.2477 6.1599 
  857: 72 9.9667 3.0502 4.8663 6.1025 
  858: 73 9.5077 3.0524 4.5044 6.0401 </pre>
  859: 
  860: <pre>For example 70 10.4227 3.0402 5.6488 5.7123 means:
  861: e11=10.4227 e12=3.0402 e21=5.6488 e22=5.7123</pre>
  862: 
  863: <pre><img src="expbiaspar21.gif" width="400" height="300"><img
  864: src="expbiaspar11.gif" width="400" height="300"></pre>
  865: 
  866: <p>For example, life expectancy of a healthy individual at age 70
  867: is 10.42 in the healthy state and 3.04 in the disability state
(=13.46 years). If he was disabled at age 70, his life expectancy
  869: will be shorter, 5.64 in the healthy state and 5.71 in the
  870: disability state (=11.35 years). The total life expectancy is a
  871: weighted mean of both, 13.46 and 11.35; weight is the proportion
  872: of people disabled at age 70. In order to get a pure period index
  873: (i.e. based only on incidences) we use the <a
  874: href="#Stationary prevalence in each state">computed or
  875: stationary prevalence</a> at age 70 (i.e. computed from
  876: incidences at earlier ages) instead of the <a
  877: href="#Observed prevalence in each state">observed prevalence</a>
  878: (for example at first exam) (<a href="#Health expectancies">see
  879: below</a>).</p>
  880: 
  881: <h5><font color="#EC5E5E" size="3"><b>- Variances of life
  882: expectancies by age and initial health status</b></font><b>: </b><a
  883: href="vrbiaspar.txt"><b>vrbiaspar.txt</b></a></h5>
  884: 
  885: <p>For example, the covariances of life expectancies Cov(ei,ej)
  886: at age 50 are (line 3) </p>
  887: 
  888: <pre>   Cov(e1,e1)=0.4776  Cov(e1,e2)=0.0488=Cov(e2,e1)  Cov(e2,e2)=0.0424</pre>
  889: 
  890: <h5><font color="#EC5E5E" size="3"><b>- </b></font><a
  891: name="Health expectancies"><font color="#EC5E5E" size="3"><b>Health
  892: expectancies</b></font></a><font color="#EC5E5E" size="3"><b>
  893: with standard errors in parentheses</b></font><b>: </b><a
  894: href="trbiaspar.txt"><font face="Courier New"><b>trbiaspar.txt</b></font></a></h5>
  895: 
  896: <pre>#Total LEs with variances: e.. (std) e.1 (std) e.2 (std) </pre>
  897: 
  898: <pre>70 13.26 (0.22) 9.95 (0.20) 3.30 (0.14) </pre>
  899: 
<p>Thus, at age 70 the total life expectancy, e..=13.26 years, is
the weighted mean of e1.=13.46 and e2.=11.35, weighted by the
stationary prevalences at age 70, which are 0.90134 in state 1 and
0.09866 in state 2, respectively (their sum is equal to one).
e.1=9.95 is the disability-free life expectancy at age 70 (it is
again a weighted mean of e11 and e21). e.2=3.30 is the life
expectancy at age 70 to be spent in the disability state.</p>
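<p>As a check of this weighting (values rounded):</p>

<pre>e..(70) = 0.90134*13.46 + 0.09866*11.35 = 12.13 + 1.12 = 13.25</pre>

<p>which matches e..=13.26 up to rounding; likewise
e.1 + e.2 = 9.95 + 3.30 = 13.25.</p>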
  907: 
  908: <h5><font color="#EC5E5E" size="3"><b>-Total life expectancy by
  909: age and health expectancies in states (1=healthy) and (2=disable)</b></font><b>:
  910: </b><a href="ebiaspar1.gif"><b>ebiaspar1.gif</b></a></h5>
  911: 
<p>This figure represents the health expectancies and the total
life expectancy with the confidence intervals as dashed curves. </p>
  914: 
  915: <pre>        <img src="ebiaspar1.gif" width="400" height="300"></pre>
  916: 
  917: <p>Standard deviations (obtained from the information matrix of
  918: the model) of these quantities are very useful.
  919: Cross-longitudinal surveys are costly and do not involve huge
samples, generally a few thousand; therefore it is very
  921: important to have an idea of the standard deviation of our
  922: estimates. It has been a big challenge to compute the Health
Expectancy standard deviations. Don't be confused: life expectancy
  924: is, as any expected value, the mean of a distribution; but here
  925: we are not computing the standard deviation of the distribution,
  926: but the standard deviation of the estimate of the mean.</p>
  927: 
<p>Our health expectancy estimates vary according to the sample
size (and the standard deviations give confidence intervals of
the estimate) but also according to the model fitted. Let us
explain this in more detail.</p>
  932: 
<p>Choosing a model means at least two kinds of choices. First we
have to decide the number of disability states. Second we have to
design, within the logit model family, the model: variables,
covariates, confounding factors etc. to be included.</p>
  937: 
<p>The more disability states we have, the better our demographic
description of the disability process, but the smaller the number
of transitions between each state and the higher the noise in the
measurement. We do not have enough experience with the various
models to summarize their advantages and disadvantages, but it is
important to say that even if we had huge and unbiased samples,
the total life expectancy computed from a cross-longitudinal
survey varies with the number of states. If we define only two
states, alive or dead, we find the usual life expectancy, where it
is assumed that at each age people are at the same risk of dying.
If we differentiate the alive state into healthy and disabled,
and as the mortality from the disability state is higher than the
mortality from the healthy state, we are introducing
heterogeneity in the risk of dying. The total mortality at each
age is the mean of the mortality in each state weighted by the
prevalence in each state. Therefore, if the proportion of people
at each age and in each state is different from the stationary
equilibrium, there is no reason to find the same total mortality
at a particular age. Life expectancy, even if it is a very useful
tool, relies on a very strong hypothesis of homogeneity of the
population. Our main purpose is not to measure differential
mortality but to measure the expected time spent in the healthy or
disability states, in order to maximise the former and minimise
the latter. But the differential in mortality complicates the
measurement.</p>
  963: 
<p>Incidences of disability or recovery are not affected by the
number of states if these states are independent. But incidence
estimates do depend on the specification of the model. The more
covariates we add to the logit model, the better the model, but
some covariates are not well measured and some are confounding
factors, as in any statistical model. The procedure to &quot;fit
the best model&quot; is similar to logistic regression, which
itself is similar to regression analysis. We haven't gone that far
yet because we also have a severe limitation, which is the speed of
convergence. On a Pentium III, 500 MHz, even the simplest model,
estimated by month on 8,000 people, may take 4 hours to converge.
Also, the program is not yet a statistical package that would
permit a simple specification of the variables and the model to be
taken into account in the maximisation. The current program only
allows adding simple variables like age+sex or age+sex+age*sex,
and will never be fully general. But what is to be remembered is
that the incidences, or probabilities of change from one state to
another, are affected by the variables specified in the model.</p>
  982: 
<p>Also, the age range of the people interviewed determines the
age range of the life expectancies which can be estimated by
extrapolation. If your sample ranges from age 70 to 95, you can
clearly estimate a life expectancy at age 70 and trust your
confidence interval, which is mostly based on your sample size;
but if you want to estimate the life expectancy at age 50, you
have to rely on your model, and fitting a logistic model on an age
range of 70-95 and estimating probabilities of transition outside
this age range, say at age 50, is very dangerous. At the least,
you should remember that the confidence intervals given by the
standard deviations of the health expectancies hold under the
strong assumption that your model is the 'true model', which is
probably not the case.</p>
  996: 
  997: <h5><font color="#EC5E5E" size="3"><b>- Copy of the parameter
  998: file</b></font><b>: </b><a href="orbiaspar.txt"><b>orbiaspar.txt</b></a></h5>
  999: 
 1000: <p>This copy of the parameter file can be useful to re-run the
 1001: program while saving the old output files. </p>
 1002: 
 1003: <h5><font color="#EC5E5E" size="3"><b>- Prevalence forecasting</b></font><b>:
 1004: </b><a href="frbiaspar.txt"><b>frbiaspar.txt</b></a></h5>
 1005: 
<p>First,
we have estimated the observed prevalence between 1/1/1984 and
1/6/1988. The mean date of interview (weighted average of the
interviews performed between 1/1/1984 and 1/6/1988) is estimated
to be 13/9/1985, as written at the top of the file. Then we
forecast the probability of being in each state. </p>
 1013: 
<p>Example, at date 1/1/1989: </p>
 1017: 
 1018: <pre class="MsoNormal"># StartingAge FinalAge P.1 P.2 P.3
 1019: # Forecasting at date 1/1/1989
 1020:   73 0.807 0.078 0.115</pre>
 1021: 
<p>Since
the minimum age is 70 on 13/9/1985, the youngest forecasted
age is 73. This means that a person aged 70 on 13/9/1985
has a probability of 0.807 of being in state 1 at age 73 on
1/1/1989. Similarly, the probability of being in state 2 is 0.078
and the probability of having died is 0.115. Then, on 1/1/1989,
the prevalence of disability at age 73 is estimated to be 0.088.</p>
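<p>The 0.088 figure is the forecasted prevalence of disability among
the survivors at that age, obtained by renormalising over the two
living states:</p>

<pre>prevalence of disability at age 73 on 1/1/1989 = P.2/(P.1 + P.2)
                                               = 0.078/(0.807 + 0.078) = 0.088 (approx.)</pre>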
 1030: 
 1031: <h5><font color="#EC5E5E" size="3"><b>- Population forecasting</b></font><b>:
 1032: </b><a href="poprbiaspar.txt"><b>poprbiaspar.txt</b></a></h5>
 1033: 
 1034: <pre># Age P.1 P.2 P.3 [Population]
 1035: # Forecasting at date 1/1/1989 
 1036: 75 572685.22 83798.08 
 1037: 74 621296.51 79767.99 
 1038: 73 645857.70 69320.60 </pre>
 1039: 
<pre># Forecasting at date 1/1/1990 
 1041: 76 442986.68 92721.14 120775.48
 1042: 75 487781.02 91367.97 121915.51
 1043: 74 512892.07 85003.47 117282.76 </pre>
 1044: 
<p>From the population file, we estimate the number of people in
each state. At age 73, 645857 persons are in state 1 and 69320
are in state 2. One year later, at age 74, 512892 are in state 1,
85003 are in state 2 and 117282 have died before 1/1/1990.</p>
 1049: 
 1050: <hr>
 1051: 
 1052: <h2><a name="example"></a><font color="#00006A">Trying an example</font></h2>
 1053: 
<p>Since you know how to run the program, it is time to test it
on your own computer. Try, for example, the parameter file named <a
href="..\mytry\imachpar.imach">imachpar.imach</a>, which is a copy of <font
size="2" face="Courier New">mypar.imach</font> included in the
imach subdirectory <font size="2" face="Courier New">mytry</font>.
Edit it to change the name of the data file to <font size="2"
face="Courier New">..\data\mydata.txt</font> if you don't want to
copy it into the same directory. The file <font face="Courier New">mydata.txt</font>
is a smaller file of 3,000 people, but still with 4 waves. </p>
 1063: 
<p>Click on the imach.exe icon to open a window. Answer the
question: '<strong>Enter the parameter file name:</strong>'</p>
 1066: 
 1067: <table border="1">
 1068:     <tr>
 1069:         <td width="100%"><strong>IMACH, Version 0.8</strong><p><strong>Enter
 1070:         the parameter file name: ..\mytry\imachpar.imach</strong></p>
 1071:         </td>
 1072:     </tr>
 1073: </table>
 1074: 
<p>Most of the data files or image files generated will include
the string 'imachpar' in their names. The running time is about 2-3
minutes on a Pentium III. If the execution worked correctly, the
output files are created in the current directory and should be
the same as the mypar files initially included in the directory <font
size="2" face="Courier New">mytry</font>.</p>
 1081: 
 1082: <ul>
 1083:     <li><pre><u>Output on the screen</u> The output screen looks like <a
 1084: href="imachrun.LOG">this Log file</a>
 1085: #
 1086: 
 1087: title=MLE datafile=..\data\mydata.txt lastobs=3000 firstpass=1 lastpass=3
 1088: ftol=1.000000e-008 stepm=24 ncovcol=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0</pre>
 1089:     </li>
 1090:     <li><pre>Total number of individuals= 2965, Agemin = 70.00, Agemax= 100.92
 1091: 
 1092: Warning, no any valid information for:126 line=126
 1093: Warning, no any valid information for:2307 line=2307
 1094: Delay (in months) between two waves Min=21 Max=51 Mean=24.495826
 1095: <font face="Times New Roman">These lines give some warnings on the data file and also some raw statistics on frequencies of transitions.</font>
 1096: Age 70 1.=230 loss[1]=3.5% 2.=16 loss[2]=12.5% 1.=222 prev[1]=94.1% 2.=14
 1097:  prev[2]=5.9% 1-1=8 11=200 12=7 13=15 2-1=2 21=6 22=7 23=1
 1098: Age 102 1.=0 loss[1]=NaNQ% 2.=0 loss[2]=NaNQ% 1.=0 prev[1]=NaNQ% 2.=0 </pre>
 1099:     </li>
 1100: </ul>
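
<p>Our reading of these frequency lines, taking age 70 as an
example: of the 230 interviews observed in state 1, 8 had no
valid follow-up ('1-1=8'), 200 stayed in state 1 at the next
wave, 7 moved to state 2 and 15 died; loss[1] is the share with
no valid follow-up, and prev[1] is the share of state 1 among the
interviews with a known follow-up. A small illustrative
computation (Python, not IMaCh code):</p>

<pre># Reading of the frequency statistics printed for age 70 (illustrative only).
n1, unknown1 = 230, 8            # "1.=230", "1-1=8": state 1, of which 8 with unknown follow-up
n2, unknown2 = 16, 2             # "2.=16",  "2-1=2"
loss1 = unknown1 / n1            # 0.035 -> "loss[1]=3.5%"
loss2 = unknown2 / n2            # 0.125 -> "loss[2]=12.5%"
known1, known2 = n1 - unknown1, n2 - unknown2    # 222 and 14
prev1 = known1 / (known1 + known2)               # 0.941 -> "prev[1]=94.1%"
print(f"{loss1:.1%} {loss2:.1%} {prev1:.1%}")
</pre>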
 1101: 
 1102: <p>&nbsp;</p>
 1103: 
 1104: <ul>
    <li>Maximisation with the Powell algorithm. 8 directions are
        given, corresponding to the 8 parameters. Reaching
        convergence can take rather long.<br>
 1108:         <font size="1" face="Courier New"><br>
 1109:         Powell iter=1 -2*LL=11531.405658264877 1 0.000000000000 2
 1110:         0.000000000000 3<br>
 1111:         0.000000000000 4 0.000000000000 5 0.000000000000 6
 1112:         0.000000000000 7 <br>
 1113:         0.000000000000 8 0.000000000000<br>
 1114:         1..........2.................3..........4.................5.........<br>
 1115:         6................7........8...............<br>
 1116:         Powell iter=23 -2*LL=6744.954108371555 1 -12.967632334283
 1117:         <br>
 1118:         2 0.135136681033 3 -7.402109728262 4 0.067844593326 <br>
 1119:         5 -0.673601538129 6 -0.006615504377 7 -5.051341616718 <br>
 1120:         8 0.051272038506<br>
 1121:         1..............2...........3..............4...........<br>
 1122:         5..........6................7...........8.........<br>
 1123:         #Number of iterations = 23, -2 Log likelihood =
 1124:         6744.954042573691<br>
 1125:         # Parameters<br>
 1126:         12 -12.966061 0.135117 <br>
 1127:         13 -7.401109 0.067831 <br>
 1128:         21 -0.672648 -0.006627 <br>
 1129:         23 -5.051297 0.051271 </font><br>
 1130:         </li>
 1131:     <li><pre><font size="2">Calculation of the hessian matrix. Wait...
 1132: 12345678.12.13.14.15.16.17.18.23.24.25.26.27.28.34.35.36.37.38.45.46.47.48.56.57.58.67.68.78
 1133: 
 1134: Inverting the hessian to get the covariance matrix. Wait...
 1135: 
 1136: #Hessian matrix#
 1137: 3.344e+002 2.708e+004 -4.586e+001 -3.806e+003 -1.577e+000 -1.313e+002 3.914e-001 3.166e+001 
 1138: 2.708e+004 2.204e+006 -3.805e+003 -3.174e+005 -1.303e+002 -1.091e+004 2.967e+001 2.399e+003 
 1139: -4.586e+001 -3.805e+003 4.044e+002 3.197e+004 2.431e-002 1.995e+000 1.783e-001 1.486e+001 
 1140: -3.806e+003 -3.174e+005 3.197e+004 2.541e+006 2.436e+000 2.051e+002 1.483e+001 1.244e+003 
 1141: -1.577e+000 -1.303e+002 2.431e-002 2.436e+000 1.093e+002 8.979e+003 -3.402e+001 -2.843e+003 
 1142: -1.313e+002 -1.091e+004 1.995e+000 2.051e+002 8.979e+003 7.420e+005 -2.842e+003 -2.388e+005 
 1143: 3.914e-001 2.967e+001 1.783e-001 1.483e+001 -3.402e+001 -2.842e+003 1.494e+002 1.251e+004 
 1144: 3.166e+001 2.399e+003 1.486e+001 1.244e+003 -2.843e+003 -2.388e+005 1.251e+004 1.053e+006 
 1145: # Scales
 1146: 12 1.00000e-004 1.00000e-006
 1147: 13 1.00000e-004 1.00000e-006
 1148: 21 1.00000e-003 1.00000e-005
 1149: 23 1.00000e-004 1.00000e-005
 1150: # Covariance
 1151:   1 5.90661e-001
 1152:   2 -7.26732e-003 8.98810e-005
 1153:   3 8.80177e-002 -1.12706e-003 5.15824e-001
 1154:   4 -1.13082e-003 1.45267e-005 -6.50070e-003 8.23270e-005
 1155:   5 9.31265e-003 -1.16106e-004 6.00210e-004 -8.04151e-006 1.75753e+000
 1156:   6 -1.15664e-004 1.44850e-006 -7.79995e-006 1.04770e-007 -2.12929e-002 2.59422e-004
 1157:   7 1.35103e-003 -1.75392e-005 -6.38237e-004 7.85424e-006 4.02601e-001 -4.86776e-003 1.32682e+000
 1158:   8 -1.82421e-005 2.35811e-007 7.75503e-006 -9.58687e-008 -4.86589e-003 5.91641e-005 -1.57767e-002 1.88622e-004
 1159: # agemin agemax for lifexpectancy, bage fage (if mle==0 ie no data nor Max likelihood).
 1160: 
 1161: 
 1162: agemin=70 agemax=100 bage=50 fage=100
 1163: Computing prevalence limit: result on file 'plrmypar.txt' 
 1164: Computing pij: result on file 'pijrmypar.txt' 
 1165: Computing Health Expectancies: result on file 'ermypar.txt' 
 1166: Computing Variance-covariance of DFLEs: file 'vrmypar.txt' 
 1167: Computing Total LEs with variances: file 'trmypar.txt' 
 1168: Computing Variance-covariance of Prevalence limit: file 'vplrmypar.txt' 
 1169: End of Imach
 1170: </font></pre>
 1171:     </li>
 1172: </ul>
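
<p>One quick use of the covariance matrix printed above: its
diagonal terms are the variances of the 8 estimated parameters
(read here, as an assumption, in the order in which the
parameters are printed: intercept then age slope for transitions
12, 13, 21 and 23), so their square roots are the standard
errors. An illustrative reading in Python:</p>

<pre>import math

# Diagonal of the covariance matrix shown above = variances of the parameters.
# Assumed order: intercept and age slope for transitions 12, 13, 21, 23.
variances = [5.90661e-1, 8.98810e-5, 5.15824e-1, 8.23270e-5,
             1.75753e+0, 2.59422e-4, 1.32682e+0, 1.88622e-4]
std_errors = [math.sqrt(v) for v in variances]
print([round(s, 4) for s in std_errors])
# e.g. the age slope of transition 12, estimated at 0.135117,
# has a standard error of about 0.0095
</pre>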
 1173: 
<p><font size="3">Once the run is finished, the program asks for
a character:</font></p>
 1176: 
 1177: <table border="1">
 1178:     <tr>
 1179:         <td width="100%"><strong>Type e to edit output files, c
 1180:         to start again, and q for exiting:</strong></td>
 1181:     </tr>
 1182: </table>
 1183: 
 1184: <p><font size="3">First you should enter <strong>e </strong>to
 1185: edit the master file mypar.htm. </font></p>
 1186: 
 1187: <ul>
 1188:     <li><u>Outputs files</u> <br>
 1189:         <br>
 1190:         - Observed prevalence in each state: <a
        href="..\mytry\prmypar.txt">prmypar.txt</a> <br>
 1192:         - Estimated parameters and the covariance matrix: <a
        href="..\mytry\rmypar.txt">rmypar.txt</a> <br>
 1194:         - Stationary prevalence in each state: <a
 1195:         href="..\mytry\plrmypar.txt">plrmypar.txt</a> <br>
 1196:         - Transition probabilities: <a
 1197:         href="..\mytry\pijrmypar.txt">pijrmypar.txt</a> <br>
 1198:         - Copy of the parameter file: <a
 1199:         href="..\mytry\ormypar.txt">ormypar.txt</a> <br>
 1200:         - Life expectancies by age and initial health status: <a
 1201:         href="..\mytry\ermypar.txt">ermypar.txt</a> <br>
 1202:         - Variances of life expectancies by age and initial
 1203:         health status: <a href="..\mytry\vrmypar.txt">vrmypar.txt</a>
 1204:         <br>
 1205:         - Health expectancies with their variances: <a
 1206:         href="..\mytry\trmypar.txt">trmypar.txt</a> <br>
 1207:         - Standard deviation of stationary prevalence: <a
 1208:         href="..\mytry\vplrmypar.txt">vplrmypar.txt</a><br>
 1209:         - Prevalences forecasting: <a href="frmypar.txt">frmypar.txt</a>
 1210:         <br>
 1211:         - Population forecasting (if popforecast=1): <a
 1212:         href="poprmypar.txt">poprmypar.txt</a> <br>
 1213:         </li>
 1214:     <li><u>Graphs</u> <br>
 1215:         <br>
 1216:         -<a href="../mytry/pemypar1.gif">One-step transition probabilities</a><br>
 1217:         -<a href="../mytry/pmypar11.gif">Convergence to the stationary prevalence</a><br>
        -<a href="..\mytry\vmypar11.gif">Observed and stationary prevalence in state (1) with the confidence interval</a> <br>
        -<a href="..\mytry\vmypar21.gif">Observed and stationary prevalence in state (2) with the confidence interval</a> <br>
 1220:         -<a href="..\mytry\expmypar11.gif">Health life expectancies by age and initial health state (1)</a> <br>
 1221:         -<a href="..\mytry\expmypar21.gif">Health life expectancies by age and initial health state (2)</a> <br>
 1222:         -<a href="..\mytry\emypar1.gif">Total life expectancy by age and health expectancies in states (1) and (2).</a> </li>
 1223: </ul>
 1224: 
<p>This software has been partly funded by <a
href="http://euroreves.ined.fr">Euro-REVES</a>, a concerted
action of the European Union. It will be copyrighted
identically to a GNU software product, i.e. the program and its
sources can be distributed freely for non-commercial use. Sources
are not widely distributed today. You can get them by asking us,
with a simple justification (name, email, institute), at <a
href="mailto:brouard@ined.fr">brouard@ined.fr</a> and <a
href="mailto:lievre@ined.fr">lievre@ined.fr</a>.</p>
 1234: 
<p>The latest version (0.8 of March 2002) can be accessed at <a
href="http://euroreves.ined.fr/imach">http://euroreves.ined.fr/imach</a><br>
</p>
 1238: </body>
 1239: </html>
