--- imach096d/doc/imach.htm 2000/12/28 18:49:54 1.1 +++ imach096d/doc/imach.htm 2002/03/04 10:01:45 1.5 @@ -1,1024 +1,2131 @@ - - - - - -Computing Health Expectancies using IMaCh - - - - -
- -

Computing Health -Expectancies using IMaCh

- -

(a Maximum -Likelihood Computer Program using Interpolation of Markov Chains)

- -

 

- -

- -

INED and EUROREVES

- -

March -2000

- -
- -

Authors of the -program: Nicolas Brouard, senior researcher at the Institut -National d'Etudes Démographiques (INED, Paris) in the "Mortality, -Health and Epidemiology" Research Unit

- -

and Agnès -Lièvre
-

- -

Contribution to the mathematics: C. R. -Heathcote (Australian -National University, Canberra).

- -

Contact: Agnès Lièvre (lievre@ined.fr)

- -
- - - -
- -

Introduction

- -

This program computes Healthy Life Expectancies from cross-longitudinal -data. Within the family of Health Expectancies (HE), -Disability-free life expectancy (DFLE) is probably the most -important index to monitor. In low mortality countries, there is -a fear that when mortality declines, the increase in DFLE is not -proportionate to the increase in total Life expectancy. This case -is called the Expansion of morbidity. Most of the data -collected today, in particular by the international REVES network on Health -expectancy, and most HE indices based on these data, are cross-sectional. -This means that the information collected comes from a single -cross-sectional survey: people of various ages (but mostly old -people) are surveyed on their health status at a single date. -The proportion of people disabled at each age can then be measured -at that date. This age-specific prevalence curve is then used to -distinguish, within the stationary population (which, by -definition, is the life table estimated from the vital statistics -on mortality at the same date), the disabled population from the -disability-free population. Life expectancy (LE) (or total -population divided by the yearly number of births or deaths of -this stationary population) is then decomposed into DFLE and DLE. -This method of computing HE is usually called the Sullivan method -(from the name of the author who first described it).

- -

Age-specific proportions of disabled people are very difficult -to forecast because each proportion corresponds to historical -conditions of the cohort and is the result of the historical -flows of entering disability and recovering in the past until -today. The age-specific intensities (or incidence rates) of -entering disability or recovering good health reflect -current conditions and therefore can be used at each age to -forecast the future of this cohort. For example if a country is -improving its technology of prosthesis, the incidence of -recovering the ability to walk will be higher at each (old) age, -but the prevalence of disability will only slightly reflect an -improvement because the prevalence is mostly affected by the history -of the cohort and not by recent period effects. To measure the -period improvement we have to simulate the future of a cohort of -new-borns entering or leaving at each age the disability state or -dying according to the incidence rates measured today on -different cohorts. The proportion of people disabled at each age -in this simulated cohort will be much lower (using the example of -an improvement) than the proportions observed at each age in a -cross-sectional survey. This new prevalence curve introduced in a -life table will give a much more current and realistic HE level -than the Sullivan method, which mostly measures the history of -health conditions in this country.

- -

Therefore, the main question is how to measure incidence rates -from cross-longitudinal surveys. This is the goal of the IMaCh -program. From your data and using IMaCh you can estimate period -HE and not only Sullivan's HE. The standard errors of the HE -are also computed.

- -

A cross-longitudinal survey consists of a first survey -("cross") where individuals of different ages are -interviewed on their health status or degree of disability. At -least a second wave of interviews ("longitudinal") -should measure each individual's new health status. Health -expectancies are computed from the transitions observed between -waves and are computed for each degree of severity of disability -(number of life states). The more degrees you consider, the more time is -necessary to reach the Maximum Likelihood of the parameters -involved in the model. Considering only two states of disability -(disabled and healthy) is generally enough but the computer -program works also with more health statuses.
-
-The simplest model is the multinomial logistic model where pij -is the probability to be observed in state j at the second -wave conditional on being observed in state i at the first -wave. Therefore a simple model is: log(pij/pii)= aij + -bij*age+ cij*sex, where 'age' is age and 'sex' -is a covariate. The advantage that this computer program claims -is that if the delay between waves is not identical for -each individual, or if some individual missed an interview, the -information is not rounded or lost, but taken into account using -an interpolation or extrapolation. hPijx is the -probability to be observed in state j at age x+h -conditional on the observed state i at age x. The -delay 'h' can be split into an exact number (nh*stepm) -of unobserved intermediate states. This elementary transition (by -month, quarter, semester or year) is modeled as a -multinomial logistic. The hPx matrix is simply the matrix -product of nh*stepm elementary matrices and the -contribution of each individual to the likelihood is simply hPijx. -
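The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the program itself: the function names are hypothetical, the coefficients are the guess values quoted later in this manual, and it is assumed here that they apply to a monthly step with age expressed in years and evaluated at the start of each step.

```python
import math

def elementary_matrix(age, params, nlstate=2, ndeath=1):
    """One-step transition matrix from log(pij/pii) = aij + bij*age.

    params maps (i, j) to (aij, bij) for each live state i and each j != i.
    States are numbered from 1; the last ndeath states are absorbing.
    """
    n = nlstate + ndeath
    matrix = []
    for i in range(1, n + 1):
        if i > nlstate:
            # Absorbing state: it is never left.
            matrix.append([1.0 if j == i else 0.0 for j in range(1, n + 1)])
            continue
        # Ratios pij/pii; the reference category j == i has ratio 1.
        ratios = [
            1.0 if j == i
            else math.exp(params[(i, j)][0] + params[(i, j)][1] * age)
            for j in range(1, n + 1)
        ]
        total = sum(ratios)
        matrix.append([r / total for r in ratios])
    return matrix

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def h_step_matrix(age, steps, step_years, params):
    """Product of `steps` elementary matrices, starting at `age`."""
    n = len(elementary_matrix(age, params))
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(steps):
        result = matmul(result, elementary_matrix(age + k * step_years, params))
    return result

# Guess values quoted in this manual (assumed monthly step, age in years).
PARAMS = {
    (1, 2): (-14.155633, 0.110794),
    (1, 3): (-7.925360, 0.032091),
    (2, 1): (-1.890135, -0.029473),
    (2, 3): (-6.234642, 0.022315),
}

# Two years between waves, split into 24 monthly steps from age 70.
hP = h_step_matrix(70.0, 24, 1.0 / 12.0, PARAMS)
for row in hP:
    print(["%.5f" % p for p in row])
```

Each row of the resulting matrix sums to one, and the row of the absorbing state stays at (0, 0, 1), which is a quick sanity check on any implementation of this product.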
-

- -

The program presented in this manual is a quite general -program named IMaCh (for Interpolated -MArkov CHain), designed to -analyse transition data from longitudinal surveys. The first step -is the estimation of the parameters of a transition probabilities model -between an initial status and a final status. From there, the -computer program produces some indicators such as observed and -stationary prevalence, life expectancies and their variances and -graphs. Our transition model consists of absorbing and -non-absorbing states with the possibility of return across the -non-absorbing states. The main advantage of this package, -compared to other programs for the analysis of transition data -(for example: Proc Catmod of SAS®) is that the whole -individual information is used even if an interview is missing, a -status or a date is unknown or when the delay between waves is -not identical for each individual. The program can be executed -according to parameters: selection of a sub-sample, number of -absorbing and non-absorbing states, number of waves taken into -account (the user inputs the first and the last interview), a -tolerance level for the maximization function, the periodicity of -the transitions (we can compute annual, quarterly or monthly -transitions), covariates in the model. It works on Windows or on -Unix.
-

- -
- -

On what kind of data can -it be used?

- -

The minimum data required for a transition model is the -recording of a set of individuals interviewed at a first date and -interviewed again at least one more time. From the -observations of an individual, we obtain a follow-up over time of -the occurrence of a specific event. In this documentation, the -event is related to health status at older ages, but the program -can be applied to many longitudinal studies in different -contexts. To build the data file explained in the next section, -you must have the month and year of each interview and the -corresponding health status. But in order to get age, the date of -birth (month and year) is required (a missing value is allowed for the -month). The date of death (month and year) is -also required if the individual is dead. Shorter -steps (e.g. a month) will more closely take into account the -survival time after the last interview.

- -
- -

The data file

- -

In this example, 8,000 people have been interviewed in a -cross-longitudinal survey of 4 waves (1984, 1986, 1988, 1990). -Some people missed 1, 2 or 3 interviews. Health statuses are -healthy (1) and disabled (2). The survey is not a real one. It is -a simulation of the American Longitudinal Survey on Aging. An -individual is in the disability state if he or she missed one of four -ADLs (activities of daily living, like bathing, eating, walking). -Therefore, even if the individuals interviewed in the sample are -virtual, the information brought with this sample is close to the -situation of the United States. Sex is not recorded in this -sample.

- -

Each line of the data set (named data1.txt -in this first example) is an individual record whose fields are:

- - - -

 

- -

If your longitudinal survey does not include information about -weights or covariates, you must fill the column with a number -(e.g. 1) because a missing field is not allowed.

- -
- -

Your first example parameter file

- -

#Imach version 0.63, February 2000, -INED-EUROREVES

- -

This is a comment. Comments start with a '#'.

- -

First uncommented line

- -
title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4
- - - -

 

- -

Second uncommented -line

- -
ftol=1.e-08 stepm=1 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0
- - - -

Guess values for optimization

- -

You must write the initial guess values of the parameters for -optimization. The number of parameters, N depends on the -number of absorbing states and non-absorbing states and on the -number of covariates.
-N is given by the formula N=(nlstate + -ndeath-1)*nlstate*ncov .
-
-Thus in the simple case with 2 covariates (the model is log -(pij/pii) = aij + bij * age where intercept and age are the two -covariates), and 2 health degrees (1 for disability-free and 2 -for disability) and 1 absorbing state (3), you must enter 8 -initial values, a12, b12, a13, b13, a21, b21, a23, b23. You can -start with zeros as in this example, but if you have a more -precise set (for example from an earlier run) you can enter it -and it will speed up the optimization.
-Each of the four lines starts with indices "ij":
-
-ij aij bij

- -
-
# Guess values of aij and bij in log (pij/pii) = aij + bij * age
-12 -14.155633  0.110794 
-13  -7.925360  0.032091 
-21  -1.890135 -0.029473 
-23  -6.234642  0.022315 
-
- -

or, to simplify:

- -
-
12 0.0 0.0
-13 0.0 0.0
-21 0.0 0.0
-23 0.0 0.0
-
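As a quick check of the formula N=(nlstate + ndeath-1)*nlstate*ncov, the short Python sketch below counts the guess values and lists the "ij" indices in the order used in this example. The function names are hypothetical and only serve the illustration.

```python
def n_parameters(nlstate, ndeath, ncov):
    """Number of guess values: N = (nlstate + ndeath - 1) * nlstate * ncov."""
    return (nlstate + ndeath - 1) * nlstate * ncov

def transition_labels(nlstate, ndeath):
    """Indices "ij" of the modelled transitions: i is a live state, j != i."""
    labels = []
    for i in range(1, nlstate + 1):
        for j in range(1, nlstate + ndeath + 1):
            if j != i:
                labels.append(f"{i}{j}")
    return labels

# Values of the example parameter file: nlstate=2, ndeath=1, ncov=2.
print(n_parameters(2, 1, 2))      # 8
print(transition_labels(2, 1))    # ['12', '13', '21', '23']
```

With ncov=2 (intercept and age), each of the four lines carries two values, which gives the 8 initial values of the example.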
- -

Guess values for computing variances

- -

This is an output if mle=1. But it can be -used as an input to get the various output data files (Health -expectancies, stationary prevalence etc.) and figures without -rerunning the rather long maximisation phase (mle=0).

- -

The scales are small values for the evaluation of numerical -derivatives. These derivatives are used to compute the Hessian -matrix of the parameters, that is the inverse of the covariance -matrix, and the variances of health expectancies. Each line -consists of indices "ij" followed by the initial scales -(zero to simplify) associated with aij and bij.

- - - -
-
# Scales (for hessian or gradient estimation)
-12 0. 0. 
-13 0. 0. 
-21 0. 0. 
-23 0. 0. 
-
- - - -

Covariance matrix of parameters

- -

This is an output if mle=1. But it can be -used as an input to get the various output data files (Health -expectancies, stationary prevalence etc.) and figures without -rerunning the rather long maximisation phase (mle=0).

- -

Each line starts with indices "ijk" followed by the -covariances between aij and bij:

- -
-   121 Var(a12) 
-   122 Cov(b12,a12)  Var(b12) 
-          ...
-   232 Cov(b23,a12)  Cov(b23,b12) ... Var (b23) 
- - - -
-
# Covariance matrix
-121 0.
-122 0. 0.
-131 0. 0. 0. 
-132 0. 0. 0. 0. 
-211 0. 0. 0. 0. 0. 
-212 0. 0. 0. 0. 0. 0. 
-231 0. 0. 0. 0. 0. 0. 0. 
-232 0. 0. 0. 0. 0. 0. 0. 0.
-
- - - -

last -uncommented line

- -
agemin=70 agemax=100 bage=50 fage=100
- -

Once the estimated parameters are obtained, the program is able -to calculate stationary prevalence, transition probabilities -and life expectancies at any age. Choice of age ranges is useful -for extrapolation. In our data file, ages vary from age 70 to -102. Setting bage=50 and fage=100 makes the program compute -life expectancy from age bage to age fage. As we use a model, we -can compute life expectancy on a wider age range than the age -range from the data. But the model can be rather wrong on big -intervals.

- -

Similarly, it is possible to get extrapolated stationary -prevalence by age ranging from agemin to agemax.
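To check a parameter file before a long run, the "key=value" lines described above can be read with a small script. The Python sketch below is a hypothetical helper, not the parser used by the program: it assumes that comment lines start with '#', that settings are whitespace-separated key=value tokens, and it simply skips the lines of guess values, scales and covariances.

```python
def read_settings(text):
    """Collect key=value settings from the text of a parameter file."""
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        for token in stripped.split():
            if "=" in token:
                key, _, value = token.partition("=")
                settings[key] = value
    return settings

EXAMPLE = """\
#Imach version 0.63, February 2000, INED-EUROREVES
title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4
ftol=1.e-08 stepm=1 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0
12 0.0 0.0
agemin=70 agemax=100 bage=50 fage=100
"""

settings = read_settings(EXAMPLE)
print(settings["datafile"], settings["stepm"], settings["bage"], settings["fage"])
```

Values are kept as strings on purpose; converting them (for example `float(settings["ftol"])`) is left to the caller so that the sketch makes no assumption about the type of each setting.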

- - - -
- -

Running Imach -with this example

- -

We assume that you entered your 1st_example -parameter file as explained above. To -run the program you should click on the imach.exe icon and enter -the name of the parameter file which is for example C:\usr\imach\mle\biaspar.txt -(you also can click on the biaspar.txt icon located in
-C:\usr\imach\mle and put it with -the mouse on the imach window).
-

- -

The time to converge depends on the step unit that you used (1 -month is cpu consuming), on the number of cases, and on the -number of variables.

- -

The program outputs many files. Most of them are files which -will be plotted for better understanding.

- -
- -

Output of the program -and graphs

- -

Once the optimization is finished, some graphics can be made -with a grapher. We use Gnuplot which is an interactive plotting -program copyrighted but freely distributed. Imach outputs the -source of a gnuplot file, named 'graph.gp', which can be directly -input into gnuplot.
-When the run is finished, the user should enter a character -for plotting and output editing.

- -

These characters are:

- - - -
Results files
-
-- Observed prevalence in each state (and at first pass): -prbiaspar.txt
-
- -

The first line is the title and displays each field of the -file. The first column is age. The fields 2 and 6 are the -proportion of individuals in states 1 and 2 respectively as -observed during the first exam. Other fields are the numbers of -people in states 1, 2 or more. The number of columns increases if -the number of states is higher than 2.
-The header of the file is

- -
# Age Prev(1) N(1) N Age Prev(2) N(2) N
-70 1.00000 631 631 70 0.00000 0 631
-71 0.99681 625 627 71 0.00319 2 627 
-72 0.97125 1115 1148 72 0.02875 33 1148 
- -
# Age Prev(1) N(1) N Age Prev(2) N(2) N
-    70 0.95721 604 631 70 0.04279 27 631
- -

It means that at age 70, the prevalence in state 1 is 1.000 -and in state 2 is 0.00 . At age 71 the number of individuals in -state 1 is 625 and in state 2 is 2, hence the total number of -people aged 71 is 625+2=627.
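The arithmetic behind one line of this file can be reproduced as follows. This is a minimal Python sketch with a hypothetical function name; it only divides the count in each state by the total count at that age.

```python
def observed_prevalence(counts):
    """Proportion of individuals in each state, from the counts at one age."""
    total = sum(counts)
    return [count / total for count in counts]

# Age 71 in the file above: 625 individuals in state 1 and 2 in state 2.
prevalence_71 = observed_prevalence([625, 2])
print(["%.5f" % p for p in prevalence_71])   # ['0.99681', '0.00319']
```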
-

- -
- Estimated parameters and -covariance matrix: rbiaspar.txt
- -

This file contains all the maximisation results:

- -
 Number of iterations=47
- -2 log likelihood=46553.005854373667  
- Estimated parameters: a12 = -12.691743 b12 = 0.095819 
-                       a13 = -7.815392   b13 = 0.031851 
-                       a21 = -1.809895 b21 = -0.030470 
-                       a23 = -7.838248  b23 = 0.039490  
- Covariance matrix: Var(a12) = 1.03611e-001
-                    Var(b12) = 1.51173e-005
-                    Var(a13) = 1.08952e-001
-                    Var(b13) = 1.68520e-005  
-                    Var(a21) = 4.82801e-001
-                    Var(b21) = 6.86392e-005
-                    Var(a23) = 2.27587e-001
-                    Var(b23) = 3.04465e-005 
- 
- -
- Transition probabilities: -pijrbiaspar.txt
- -

Here are the transition probabilities Pij(x, x+nh) where nh -is a multiple of 2 years. The first column is the starting age x -(from age 50 to 100), the second is age (x+nh) and the others are -the transition probabilities p11, p12, p13, p21, p22, p23. For -example, line 5 of the file is:

- -
 100 106 0.03286 0.23512 0.73202 0.02330 0.19210 0.78460 
- -

and this means:

- -
p11(100,106)=0.03286
-p12(100,106)=0.23512
-p13(100,106)=0.73202
-p21(100,106)=0.02330
-p22(100,106)=0.19210 
-p23(100,106)=0.78460 
- -
- Stationary prevalence in each state: -plrbiaspar.txt
- -
#Age 1-1 2-2 
-70 0.92274 0.07726 
-71 0.91420 0.08580 
-72 0.90481 0.09519 
-73 0.89453 0.10547
- -

At age 70 the stationary prevalence is 0.92274 in state 1 and -0.07726 in state 2. This stationary prevalence differs from -observed prevalence. Here is the point. The observed prevalence -at age 70 results from the incidence of disability, incidence of -recovery and mortality which occurred in the past of the cohort. -Stationary prevalence results from a simulation with actual -incidences and mortality (estimated from this cross-longitudinal -survey). It is the best predictive value of the prevalence in the -future if "nothing changes in the future". This is -exactly what demographers do with a Life table. Life expectancy -is the expected mean time to survive if observed mortality rates -(incidence of mortality) "remains constant" in the -future.

- -
- Standard deviation of -stationary prevalence: vplrbiaspar.txt
- -

The stationary prevalence has to be compared with the observed -prevalence by age. But both are statistical estimates and -subject to stochastic errors due to the size of the sample, the -design of the survey, and, for the stationary prevalence, to the -model used and fitted. It is possible to compute the standard -deviation of the stationary prevalence at each age.

- -
Observed and stationary -prevalence in state (2=disabled) with the confidence interval: -vbiaspar2.gif
- -


-This graph exhibits the stationary prevalence in state (2) with -the confidence interval in red. The green curve is the observed -prevalence (or proportion of individuals in state (2)). Without -discussing the results (it is not the purpose here), we observe -that the green curve is rather below the stationary prevalence. -It suggests an increase of the disability prevalence in the -future.

- -

- -
Convergence to the -stationary prevalence of disability: pbiaspar1.gif
-
- -

This graph plots the conditional transition probabilities from -an initial state (1=healthy in red at the bottom, or 2=disabled in -green on top) at age x to the final state 2=disabled at -age x+h. Conditional means conditional on being alive -at age x+h which is hP12x + hP22x. The -curves hP12x/(hP12x + hP22x) and hP22x/(hP12x -+ hP22x) converge with h, to the stationary -prevalence of disability. In order to get the stationary -prevalence at age 70 we should start the process at an earlier -age, e.g. 50. If the disability state is defined by severe -disability criteria with only a small chance of recovery, then the -incidence of recovery is low and the time to convergence is -probably longer. But we don't have experience yet.

- -
- Life expectancies by age -and initial health status: erbiaspar.txt
- -
# Health expectancies 
-# Age 1-1 1-2 2-1 2-2 
-70 10.7297 2.7809 6.3440 5.9813 
-71 10.3078 2.8233 5.9295 5.9959 
-72 9.8927 2.8643 5.5305 6.0033 
-73 9.4848 2.9036 5.1474 6.0035 
- -
For example 70 10.7297 2.7809 6.3440 5.9813 means:
-e11=10.7297 e12=2.7809 e21=6.3440 e22=5.9813
- -
- -

For example, life expectancy of a healthy individual at age 70 -is 10.73 in the healthy state and 2.78 in the disability state -(=13.51 years). If he were disabled at age 70, his life expectancy -would be shorter, 6.34 in the healthy state and 5.98 in the -disability state (=12.32 years). The total life expectancy is a -weighted mean of both, 13.51 and 12.32; weight is the proportion -of people disabled at age 70. In order to get a pure period index -(i.e. based only on incidences) we use the computed or -stationary prevalence at age 70 (i.e. computed from -incidences at earlier ages) instead of the observed prevalence -(for example at first exam) (see -below).

- -
- Variances of life -expectancies by age and initial health status: vrbiaspar.txt
- -

For example, the covariances of life expectancies Cov(ei,ej) -at age 50 are (line 3)

- -
   Cov(e1,e1)=0.4667  Cov(e1,e2)=0.0605=Cov(e2,e1)  Cov(e2,e2)=0.0183
- -
- Health -expectancies -with standard errors in parentheses: trbiaspar.txt
- -
#Total LEs with variances: e.. (std) e.1 (std) e.2 (std) 
- -
70 13.42 (0.18) 10.39 (0.15) 3.03 (0.10)70 13.81 (0.18) 11.28 (0.14) 2.53 (0.09) 
- -

Thus, at age 70 the total life expectancy, e..=13.42 years is -the weighted mean of e1.=13.51 and e2.=12.32 by the stationary -prevalence at age 70 which are 0.92274 in state 1 and 0.07726 in -state 2, respectively (the sum is equal to one). e.1=10.39 is the -Disability-free life expectancy at age 70 (it is again a weighted -mean of e11 and e21). e.2=3.03 is also the life expectancy at age -70 to be spent in the disability state.
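The weighted means quoted above can be verified from the figures printed earlier in this manual (the e11, e12, e21, e22 values at age 70 and the stationary prevalence at age 70). The Python sketch below is only a check of that arithmetic; the variable and function names are hypothetical.

```python
# Health expectancies at age 70, indexed by (initial state, state occupied).
E = {(1, 1): 10.7297, (1, 2): 2.7809, (2, 1): 6.3440, (2, 2): 5.9813}
# Stationary prevalence at age 70, used as weights on the initial state.
W = {1: 0.92274, 2: 0.07726}

def expectancy_in_state(e, w, j):
    """e.j: time spent in state j, averaged over the initial states."""
    return sum(w[i] * e[(i, j)] for i in w)

e_dot_1 = expectancy_in_state(E, W, 1)   # disability-free life expectancy
e_dot_2 = expectancy_in_state(E, W, 2)   # life expectancy with disability
e_total = e_dot_1 + e_dot_2              # total life expectancy e..

print(round(e_total, 2), round(e_dot_1, 2), round(e_dot_2, 2))  # 13.42 10.39 3.03
```

The three printed values agree with the first set of figures given for age 70 in trbiaspar.txt.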

- -
Total life expectancy by -age and health expectancies in states (1=healthy) and (2=disable): -ebiaspar.gif
- -

This figure represents the health expectancies and the total -life expectancy with the confidence interval in dashed curve.

- -
        
- -

Standard deviations (obtained from the information matrix of -the model) of these quantities are very useful. -Cross-longitudinal surveys are costly and do not involve huge -samples, generally a few thousand; therefore it is very -important to have an idea of the standard deviation of our -estimates. It has been a big challenge to compute the Health -Expectancy standard deviations. Do not be confused: life expectancy -is, as any expected value, the mean of a distribution; but here -we are not computing the standard deviation of the distribution, -but the standard deviation of the estimate of the mean.

- -

Our health expectancy estimates vary according to the sample -size (and the standard deviations give confidence intervals of -the estimate) but also according to the model fitted. Let us -explain it in more detail.

- -

Choosing a model means at least two kinds of choices. First we -have to decide the number of disability states. Second we have to -design, within the logit model family, the model: variables, -covariables, confounding factors etc. to be included.

- -

The more disability states we have, the better is our demographic -approach to the disability process, but the smaller is the number of -transitions between each state and the higher is the noise in the -measurement. We do not have enough experience with the various -models to summarize the advantages and disadvantages, but it is -important to say that even if we had huge and unbiased samples, -the total life expectancy computed from a cross-longitudinal -survey varies with the number of states. If we define only two -states, alive or dead, we find the usual life expectancy where it -is assumed that at each age, people are at the same risk to die. -If we are differentiating the alive state into healthy and -disabled, and as the mortality from the disability state is higher -than the mortality from the healthy state, we are introducing -heterogeneity in the risk of dying. The total mortality at each -age is the weighted mean of the mortality in each state by the -prevalence in each state. Therefore if the proportion of people -at each age and in each state is different from the stationary -equilibrium, there is no reason to find the same total mortality -at a particular age. Life expectancy, even if it is a very useful -tool, has a very strong hypothesis of homogeneity of the -population. Our main purpose is not to measure differential -mortality but to measure the expected time in a healthy or -disability state in order to maximise the former and minimize the -latter. But the differential in mortality complicates the -measurement.

- -

Incidences of disability or recovery are not affected by the -number of states if these states are independent. But incidence -estimates are dependent on the specification of the model. The more -covariates we add in the logit model the better is the model, but -some covariates are not well measured, some are confounding -factors like in any statistical model. The procedure to "fit -the best model" is similar to logistic regression which itself is -similar to regression analysis. We have not yet gone that far because -we also have a severe limitation which is the speed of the -convergence. On a Pentium III, 500 MHz, even the simplest model, -estimated by month on 8,000 people may take 4 hours to converge. -Also, the program is not yet a statistical package, which permits -a simple writing of the variables and the model to take into -account in the maximisation. The current program allows only to -add simple variables without interaction terms, like age+sex but -without age+sex+ age*sex . This can be done from the source code -(you have to change three lines in the source code) but will -never be general enough. But what should be remembered is that -incidences or probability of change from one state to another is -affected by the variables specified in the model.

- -

Also, the age range of the people interviewed has a link with -the age range of the life expectancy which can be estimated by -extrapolation. If your sample ranges from age 70 to 95, you can -clearly estimate a life expectancy at age 70 and trust your -confidence interval which is mostly based on your sample size, -but if you want to estimate the life expectancy at age 50, you -must rely on your model, and fitting a logistic model on an age -range of 70-95 and estimating probabilities of transition out of -this age range, say at age 50, is very dangerous. At least you -should remember that the confidence interval given by the -standard deviation of the health expectancies is under the -strong assumption that your model is the 'true model', which is -probably not the case.

- -
- Copy of the parameter -file: orbiaspar.txt
- -

This copy of the parameter file can be useful to re-run the -program while saving the old output files.

- -
- -

Trying an example

- -

Since you know how to run the program, it is time to test it -on your own computer. Try for example on a parameter file named imachpar.txt which is a -copy of mypar.txt -included in the subdirectory of imach, mytry. Edit it to change the name of -the data file to ..\data\mydata.txt -if you don't want to copy it on the same directory. The file mydata.txt is a smaller file of 3,000 -people but still with 4 waves.

- -

Click on the imach.exe icon to open a window. Answer the -question: 'Enter the parameter file name:'

- - - - - -
IMACH, Version 0.63

Enter - the parameter file name: ..\mytry\imachpar.txt

-
- -

Most of the data files or image files generated will use the -'imachpar' string in their name. The running time is about 2-3 -minutes on a Pentium III. If the execution worked correctly, the -output files are created in the current directory, and should be -the same as the mypar files initially included in the directory mytry.

- - - -

 

- - - -

Once the run is finished, the program -requires a character:

- - - - - -
Type g for plotting (available - if mle=1), e to edit output files, c to start again,

and - q for exiting:

-
- -

First you should enter g to -make the figures and then you can edit all the results by typing e. -

- - - -

This software has been partly funded by Euro-REVES, a concerted -action from the European Union. It will be copyrighted -identically to a GNU software product, i.e. program and software -can be distributed freely for non commercial use. Sources are not -widely distributed today. You can get them by asking us with a -simple justification (name, email, institute) mailto:brouard@ined.fr and mailto:lievre@ined.fr .

- -

Latest version (0.63 of 16 march 2000) can be accessed at http://euroreves.ined.fr/imach
-

- - + + + + + + + +Computing Health Expectancies using IMaCh + + + + + + + + + + + + + +
+ +

Computing Health +Expectancies using IMaCh

+ +

(a Maximum +Likelihood Computer Program using Interpolation of Markov Chains)

+ +

 

+ +

+ +

INED and EUROREVES

+ +

Version 0.7, +February 2002

+ +
+ +

Authors of +the program: Nicolas +Brouard, senior researcher at the Institut National d'Etudes +Démographiques (INED, Paris) in the +"Mortality, Health and Epidemiology" Research Unit

+ +

and Agnès +Lièvre
+

+ +

Contribution to the mathematics: C. R. Heathcote (Australian +National University, Canberra).

+ +

Contact: Agnès Lièvre (lievre@ined.fr) +

+ +
+ + + +
+ +

Introduction

+ +

This program computes Healthy +Life Expectancies from cross-longitudinal data using +the methodology pioneered by Laditka and Wolf (1). Within the +family of Health Expectancies (HE), Disability-free life +expectancy (DFLE) is probably the most important index to +monitor. In low mortality countries, there is a fear that when +mortality declines, the increase in DFLE is not proportionate to +the increase in total Life expectancy. This case is called the Expansion +of morbidity. Most of the data collected today, in +particular by the international REVES +network on Health expectancy, and most HE indices based on these +data, are cross-sectional. This means that the information +collected comes from a single cross-sectional survey: people of +various ages (but mostly old people) are surveyed on their health +status at a single date. The proportion of people disabled at each +age can then be measured at that date. This age-specific +prevalence curve is then used to distinguish, within the +stationary population (which, by definition, is the life table +estimated from the vital statistics on mortality at the same +date), the disabled population from the disability-free +population. Life expectancy (LE) (or total population divided by +the yearly number of births or deaths of this stationary +population) is then decomposed into DFLE and DLE. This method of +computing HE is usually called the Sullivan method (from the name +of the author who first described it).

+ +

Age-specific proportions of disabled +people are very difficult to forecast because each proportion +corresponds to historical conditions of the cohort and is the +result of the historical flows of entering disability and +recovering in the past until today. The age-specific intensities +(or incidence rates) of entering disability or recovering good +health reflect current conditions and therefore can be +used at each age to forecast the future of this cohort. For +example if a country is improving its technology of prosthesis, +the incidence of recovering the ability to walk will be higher at +each (old) age, but the prevalence of disability will only +slightly reflect an improvement because the prevalence is mostly +affected by the history of the cohort and not by recent period +effects. To measure the period improvement we have to simulate +the future of a cohort of new-borns entering or leaving at each +age the disability state or dying according to the incidence +rates measured today on different cohorts. The proportion of +people disabled at each age in this simulated cohort will be much +lower (using the example of an improvement) than the proportions +observed at each age in a cross-sectional survey. This new +prevalence curve introduced in a life table will give a much more +current and realistic HE level than the Sullivan method, which +mostly measures the history of health conditions in this country.

+ +

Therefore, the main question is how +to measure incidence rates from cross-longitudinal surveys. This +is the goal of the IMaCh program. From your data and using IMaCh +you can estimate period HE and not only Sullivan's HE. The +standard errors of the HE are also computed.

+ +

A cross-longitudinal survey +consists of a first survey ("cross") where individuals +of different ages are interviewed on their health status or +degree of disability. At least a second wave of interviews +("longitudinal") should measure each individual's new +health status. Health expectancies are computed from the +transitions observed between waves and are computed for each +degree of severity of disability (number of life states). The more +degrees you consider, the more time is necessary to reach the Maximum +Likelihood of the parameters involved in the model. Considering +only two states of disability (disabled and healthy) is generally +enough but the computer program works also with more health +statuses.
+
The simplest model is the multinomial logistic model where pij is the probability to be observed in state j at the second wave conditional on being observed in state i at the first wave. Therefore a simple model is: log(pij/pii) = aij + bij*age + cij*sex, where 'age' is age and 'sex' is a covariate. The advantage claimed by this program is that if the delay between waves is not identical for each individual, or if some individuals missed an interview, the information is not rounded or lost but is taken into account using interpolation or extrapolation. hPijx is the probability to be observed in state j at age x+h conditional on the observed state i at age x. The delay 'h' can be split into an exact number (nh*stepm) of unobserved intermediate states. This elementary transition (by month, quarter, semester or year) is modelled as a multinomial logistic. The hPx matrix is simply the matrix product of nh*stepm elementary matrices, and the contribution of each individual to the likelihood is simply hPijx.
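As a rough illustration of the last point, the matrix product can be sketched in Python. The helper names (`mat_mul`, `multiply_steps`) and the two numeric matrices are hypothetical and chosen only for illustration; they are not taken from IMaCh. State 3 is assumed to be the absorbing state (death).

```python
# Sketch only: hPx as the product of elementary transition matrices.
# The matrices below are made-up values, not IMaCh output.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def multiply_steps(elementary):
    """Chain the elementary matrices in age order to obtain hPx."""
    n = len(elementary[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for step in elementary:
        result = mat_mul(result, step)
    return result

# Two hypothetical one-step matrices; rows are the origin state
# (1 = healthy, 2 = disabled, 3 = dead), columns the destination state.
step_a = [[0.90, 0.07, 0.03],
          [0.20, 0.70, 0.10],
          [0.00, 0.00, 1.00]]
step_b = [[0.88, 0.08, 0.04],
          [0.18, 0.70, 0.12],
          [0.00, 0.00, 1.00]]

hP = multiply_steps([step_a, step_b])

# Likelihood contribution of an individual observed healthy (1) at age x
# and disabled (2) at age x+h: the entry in row 1, column 2.
contribution = hP[0][1]
```

Each row of `hP` still sums to one, and the last row stays (0, 0, 1) because death is absorbing.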

+ +

The program presented in this manual is a fairly general program named IMaCh (for Interpolated MArkov CHain), designed to analyse transition data from longitudinal surveys. The first step is the estimation of the parameters of a model of transition probabilities between an initial status and a final status. From there, the program produces indicators such as observed and stationary prevalence, life expectancies and their variances, and graphs. Our transition model consists of absorbing and non-absorbing states, with the possibility of return across the non-absorbing states. The main advantage of this package, compared to other programs for the analysis of transition data (for example Proc Catmod of SAS(r)), is that all the individual information is used even if an interview is missing, a status or a date is unknown, or the delay between waves is not identical for each individual. The program can be run according to several parameters: selection of a sub-sample, number of absorbing and non-absorbing states, number of waves taken into account (the user inputs the first and the last interview), a tolerance level for the maximization function, the periodicity of the transitions (we can compute annual, quarterly or monthly transitions), and covariates in the model. It works on Windows or on Unix.

+ +
+ +

(1) Laditka, Sarah B. and Wolf, Douglas A. (1998), "New +Methods for Analyzing Active Life Expectancy". Journal of +Aging and Health. Vol 10, No. 2.

+ +
+ +

On what kind of data can it be used?

+ +

The minimum data required for a transition model is the recording of a set of individuals interviewed at a first date and interviewed again at least one more time. From the observations of an individual, we obtain a follow-up over time of the occurrence of a specific event. In this documentation, the event is related to health status at older ages, but the program can be applied to many longitudinal studies in different contexts. To build the data file explained in the next section, you must have the month and year of each interview and the corresponding health status. In order to get the age, the date of birth (month and year) is also required (a missing value is allowed for the month). The date of death (month and year) is important information and is required if the individual is dead. Shorter steps (e.g. a month) will take the survival time after the last interview into account more closely.

+ +
+ +

The data file

+ +

In this example, 8,000 people have been interviewed in a cross-longitudinal survey of 4 waves (1984, 1986, 1988, 1990). Some people missed 1, 2 or 3 interviews. Health statuses are healthy (1) and disabled (2). The survey is not a real one. It is a simulation of the American Longitudinal Survey on Aging. An individual is in the disability state if he or she failed one of four ADLs (activities of daily living, like bathing, eating, walking). Therefore, even if the individuals interviewed in the sample are virtual, the information carried by this sample is close to the situation of the United States. Sex is not recorded in this sample.

+ +

Each line of the data set (named data1.txt in this first example) is an individual record whose fields are:

+ + + +

 

+ +

If your longitudinal survey does not include information about weights or covariates, you must fill the column with a number (e.g. 1) because a missing field is not allowed.

+ +
+ +

Your first example parameter file

+ +

#Imach version 0.7, February 2002, +INED-EUROREVES

+ +

This is a comment. Comments start with a '#'.

+ +

First uncommented line

+ +
title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4
+ + + +

 

+ +

Second +uncommented line

+ +
ftol=1.e-08 stepm=1 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0
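The two uncommented lines above are sequences of key=value tokens. A minimal sketch of how such a line could be read into a dictionary follows; `parse_parameter_line` is a hypothetical helper written for this manual, not part of IMaCh, and it assumes that values never contain spaces.

```python
# Sketch only: split a "key=value" parameter line into a dictionary.
# parse_parameter_line is a hypothetical helper, not part of IMaCh.

def parse_parameter_line(line):
    """Return {key: value} for whitespace-separated key=value tokens."""
    params = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens without '='
            params[key] = value
    return params

first = parse_parameter_line(
    "title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4")
second = parse_parameter_line(
    "ftol=1.e-08 stepm=1 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0")

# Values are kept as strings; convert them where a number is expected.
nlstate = int(second["nlstate"])
ftol = float(second["ftol"])
```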
+ + + +

Covariates

+ +

Intercept +and age are systematically included in the model. Additional +covariates can be included with the command

+ +
model=list of covariates
+ + + +

Guess +values for optimisation

+ +

You must write the initial guess values of the parameters for optimisation. The number of parameters, N, depends on the number of absorbing and non-absorbing states and on the number of covariates.
N is given by the formula N = (nlstate + ndeath - 1) * nlstate * ncov.

Thus in the simple case with 2 covariates (the model is log (pij/pii) = aij + bij * age, where intercept and age are the two covariates), 2 health degrees (1 for disability-free and 2 for disability) and 1 absorbing state (3), you must enter 8 initial values: a12, b12, a13, b13, a21, b21, a23, b23. You can start with zeros as in this example, but if you have a more precise set (for example from an earlier run) you can enter it and it will speed up the convergence.
Each of the four lines starts with the indices "ij", followed by aij and bij: ij aij bij

+ +
# Guess values of aij and bij in log (pij/pii) = aij + bij * age
+ +
12 -14.155633  0.110794 
+ +
13  -7.925360  0.032091 
+ +
21  -1.890135 -0.029473 
+ +
23  -6.234642  0.022315 
+ +

or, +to simplify:

+ +
12 0.0 0.0
+ +
13 0.0 0.0
+ +
21 0.0 0.0
+ +
23 0.0 0.0
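The count N and the "ij" indices of the guess-value lines can be checked with a short sketch. The function names are hypothetical; the ordering of the indices is assumed to follow the listings above (12, 13, 21, 23).

```python
# Sketch only: number of parameters and indices of the guess-value lines.
# Function names are illustrative, not IMaCh identifiers.

def number_of_parameters(nlstate, ndeath, ncov):
    """N = (nlstate + ndeath - 1) * nlstate * ncov, as given in the text."""
    return (nlstate + ndeath - 1) * nlstate * ncov

def guess_line_indices(nlstate, ndeath):
    """Indices "ij": i is a non-absorbing state, j any other state."""
    total = nlstate + ndeath
    return [f"{i}{j}"
            for i in range(1, nlstate + 1)
            for j in range(1, total + 1)
            if i != j]

n = number_of_parameters(nlstate=2, ndeath=1, ncov=2)
indices = guess_line_indices(nlstate=2, ndeath=1)
```

With nlstate=2, ndeath=1 and ncov=2 this gives 8 parameters spread over four lines, two values (aij, bij) per line.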
+ +

Guess +values for computing variances

+ +

This +is an output if mle=1. But it can be used as +an input to get the various output data files (Health +expectancies, stationary prevalence etc.) and figures without +rerunning the rather long maximisation phase (mle=0).

+ +

The scales are small values used for the evaluation of numerical derivatives. These derivatives are used to compute the Hessian matrix of the parameters, which is the inverse of the covariance matrix, and the variances of health expectancies. Each line consists of the indices "ij" followed by the initial scales (zero to simplify) associated with aij and bij.

+ + + +
# Scales (for hessian or gradient estimation)
+ +
12 0. 0. 
+ +
13 0. 0. 
+ +
21 0. 0. 
+ +
23 0. 0. 
+ + + +

Covariance +matrix of parameters

+ +

This +is an output if mle=1. But it can be used as +an input to get the various output data files (Health +expectancies, stationary prevalence etc.) and figures without +rerunning the rather long maximisation phase (mle=0).

+ +

Each +line starts with indices "ijk" followed by the +covariances between aij and bij:

+ +
 
+ +
   121 Var(a12) 
+ +
   122 Cov(b12,a12)  Var(b12) 
+ +
          ...
+ +
   232 Cov(b23,a12)  Cov(b23,b12) ... Var (b23) 
+ + + +
# Covariance matrix
+ +
121 0.
+ +
122 0. 0.
+ +
131 0. 0. 0. 
+ +
132 0. 0. 0. 0. 
+ +
211 0. 0. 0. 0. 0. 
+ +
212 0. 0. 0. 0. 0. 0. 
+ +
231 0. 0. 0. 0. 0. 0. 0. 
+ +
232 0. 0. 0. 0. 0. 0. 0. 0.
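Because the covariance matrix is symmetric, only its lower triangle is written, one more value per line. A minimal sketch of how such lines could be expanded into a full matrix is given below; the helper names and the three sample rows are hypothetical and do not come from an IMaCh run.

```python
# Sketch only: rebuild a symmetric matrix from lower-triangular lines.
# Helper names and sample values are illustrative.

def parse_covariance_lines(lines):
    """Drop comments and the leading "ijk" index, keep the numbers."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        rows.append([float(v) for v in fields[1:]])
    return rows

def full_covariance(lower_rows):
    """Mirror the lower triangle to obtain the full symmetric matrix."""
    n = len(lower_rows)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower_rows):
        if len(row) != i + 1:
            raise ValueError(f"row {i + 1} should hold {i + 1} values")
        for j, value in enumerate(row):
            full[i][j] = value
            full[j][i] = value
    return full

sample = [
    "# Covariance matrix",
    "121 0.5",
    "122 -0.007 0.0001",
    "131 0.09 -0.001 0.6",
]
cov = full_covariance(parse_covariance_lines(sample))
```

The diagonal of `cov` then holds the variances and the off-diagonal entries the covariances.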
+ + + +

Age +range for calculation of stationary prevalences and health +expectancies

+ +
agemin=70 agemax=100 bage=50 fage=100
+ +

Once we have obtained the estimated parameters, the program is able to calculate stationary prevalence, transition probabilities and life expectancies at any age. The choice of age range is useful for extrapolation. In our data file, ages vary from 70 to 102. Setting bage=50 and fage=100 makes the program compute life expectancy from age bage to age fage. As we use a model, we can compute life expectancy on a wider age range than the age range of the data. But the model can be rather wrong over large intervals.

+ +

Similarly, +it is possible to get extrapolated stationary prevalence by age +ranging from agemin to agemax.

+ + + +

Computing the observed prevalence

+ +
begin-prev-date=1/1/1984 end-prev-date=1/6/1988 
+ +

The statements 'begin-prev-date' and 'end-prev-date' select the period over which the observed prevalences in each state are calculated. In this example, the prevalences are calculated on survey data collected between 1 January 1984 and 1 June 1988.

+ + + +

Population- +or status-based health expectancies

+ +
pop_based=0
+ +

The user can choose between population-based and status-based health expectancies. If pop_based=0, status-based health expectancies are computed; if pop_based=1, the programme computes population-based health expectancies. Health expectancies are weighted averages of the health expectancies conditional on the initial state. For a status-based index, the weights are the cross-sectional prevalences observed between two dates, as previously explained, whereas for a population-based index, the weights are the stationary prevalences.

+ +

Prevalence +forecasting

+ +
starting-proj-date=1/1/1989 final-proj-date=1/1/1992 mov_average=0 
+ +

Prevalence and population projections are available only if the interpolation unit is a month, i.e. stepm=1. The programme estimates the prevalence in each state at a precise date expressed in day/month/year. The programme computes one forecasted prevalence per year from a starting date (1 January 1989 in this example) to a final date (1 January 1992). The statement mov_average makes it possible to compute smoothed forecasted prevalences with a five-age moving average centred at the mid-age of the five-age period.
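A centred five-age moving average can be sketched as follows. This is only an illustration of the smoothing idea: the function name, the sample prevalences and the choice to leave the first and last two ages unsmoothed are assumptions, not a description of what IMaCh does at the edges of the age range.

```python
# Sketch only: centred moving average over single years of age.
# Edge ages are left unchanged here, which is an assumption.

def centred_moving_average(values, window=5):
    """Replace each interior value by the mean of the window centred on it."""
    half = window // 2
    smoothed = list(values)
    for i in range(half, len(values) - half):
        smoothed[i] = sum(values[i - half:i + half + 1]) / window
    return smoothed

# Hypothetical prevalences of disability at seven consecutive ages.
raw = [0.10, 0.12, 0.11, 0.15, 0.17, 0.16, 0.20]
smooth = centred_moving_average(raw)
```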

+ + + +

Last +uncommented line : Population forecasting

+ +
popforecast=0 popfile=pyram.txt popfiledate=1/1/1989 last-popfiledate=1/1/1992
+ +

This command is available if the interpolation unit is a month, i.e. stepm=1, and if popforecast=1. From a data file giving, by age, the number of persons alive at the precise date ‘popfiledate’, you can forecast the number of persons in each state until the date ‘last-popfiledate’. In this example, the popfile pyram.txt contains real data, namely the Japanese population in 1989.

+ + + +
+ +

Running Imach with this example

+ +

We assume that you have entered your 1st_example parameter file as explained above. To run the program, click on the imach.exe icon and enter the name of the parameter file, for example C:\usr\imach\mle\biaspar.txt (you can also click on the biaspar.txt icon located in C:\usr\imach\mle and drag it with the mouse onto the imach window).

+ +

The time to converge depends on the step unit that you use (a step of 1 month is CPU-intensive), on the number of cases, and on the number of variables.

+ +

The +program outputs many files. Most of them are files which will be +plotted for better understanding.

+ +
+ +

Output of the program and graphs

+ +

Once the optimization is finished, some graphs can be made with a plotting program. We use Gnuplot, which is an interactive plotting program that is copyrighted but freely distributed. A gnuplot reference manual is available here.
When the run is finished, the user should enter a character for plotting and output editing.

+ +

These +characters are:

+ + + +
Results +files
+
+- Observed +prevalence in each state (and at first pass): +prbiaspar.txt
+ +

The first line is the title and names each field of the file. The first column is age. Fields 2 and 6 are the proportions of individuals in states 1 and 2 respectively, as observed during the first exam. The other fields are the numbers of people in each state (N(1), N(2)) and the total N. The number of columns increases if the number of states is higher than 2.
The header of the file is

+ +
# Age Prev(1) N(1) N Age Prev(2) N(2) N
+ +
70 1.00000 631 631 70 0.00000 0 631
+ +
71 0.99681 625 627 71 0.00319 2 627 
+ +
72 0.97125 1115 1148 72 0.02875 33 1148 
+ +

It means that at age 70 the prevalence in state 1 is 1.000 and in state 2 is 0.000. At age 71 the number of individuals in state 1 is 625 and in state 2 is 2, hence the total number of people aged 71 is 625+2=627.
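The prevalences in the listing can be recomputed from the counts, as in the following sketch. The counts are copied from the three lines shown above; the function name is illustrative.

```python
# Sketch only: observed prevalence = count in the state / total count.
# Counts (N(1), N(2)) are copied from the prbiaspar.txt excerpt above.

counts = {70: (631, 0), 71: (625, 2), 72: (1115, 33)}

def observed_prevalence(n1, n2):
    """Return the proportions of people in states 1 and 2."""
    total = n1 + n2
    return n1 / total, n2 / total

prevalence = {age: observed_prevalence(n1, n2)
              for age, (n1, n2) in counts.items()}
```

Rounded to five decimals, these proportions reproduce the Prev(1) and Prev(2) columns of the file.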

+ +
- +Estimated parameters and covariance matrix: rbiaspar.txt
+ +

This +file contains all the maximisation results:

+ +
 -2 log likelihood= 21660.918613445392
+ +
 Estimated parameters: a12 = -12.290174 b12 = 0.092161 
+ +
                       a13 = -9.155590  b13 = 0.046627 
+ +
                       a21 = -2.629849  b21 = -0.022030 
+ +
                       a23 = -7.958519  b23 = 0.042614  
+ +
 Covariance matrix: Var(a12) = 1.47453e-001
+ +
                    Var(b12) = 2.18676e-005
+ +
                    Var(a13) = 2.09715e-001
+ +
                    Var(b13) = 3.28937e-005  
+ +
                    Var(a21) = 9.19832e-001
+ +
                    Var(b21) = 1.29229e-004
+ +
                    Var(a23) = 4.48405e-001
+ +
                    Var(b23) = 5.85631e-005 
+ +
 
+ +

By +substitution of these parameters in the regression model, we +obtain the elementary transition probabilities:
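A minimal sketch of this substitution, assuming the model log(pij/pii) = aij + bij*age with age expressed in years, is given below. The parameter values are those listed in the rbiaspar.txt excerpt; the function name and the data layout are hypothetical. Since the probabilities of each row sum to one, pii = 1/(1 + sum over j of exp(aij + bij*age)).

```python
import math

# Estimated parameters (aij, bij) copied from the excerpt above.
params = {
    "12": (-12.290174, 0.092161),
    "13": (-9.155590, 0.046627),
    "21": (-2.629849, -0.022030),
    "23": (-7.958519, 0.042614),
}

def elementary_probabilities(age, params, nlstate=2, ndeath=1):
    """Rows i = 1..nlstate of the elementary transition matrix at a given age."""
    total = nlstate + ndeath
    matrix = []
    for i in range(1, nlstate + 1):
        # Odds pij/pii for every destination j different from i.
        odds = {}
        for j in range(1, total + 1):
            if j != i:
                a, b = params[f"{i}{j}"]
                odds[j] = math.exp(a + b * age)
        denom = 1.0 + sum(odds.values())
        matrix.append([(1.0 if j == i else odds[j]) / denom
                       for j in range(1, total + 1)])
    return matrix

p = elementary_probabilities(70, params)
p11, p12, p13 = p[0]
p21, p22, p23 = p[1]
```

These are probabilities over one elementary step (stepm), not over the interval between two waves.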

+ +

+ +
- +Transition probabilities: pijrbiaspar.txt
+ +

Here are the transition probabilities Pij(x, x+nh) where nh is a multiple of 2 years. The first column is the starting age x (from age 50 to 100), the second is age (x+nh) and the others are the transition probabilities p11, p12, p13, p21, p22, p23. For example, line 5 of the file is:

+ +
 100 106 0.02655 0.17622 0.79722 0.01809 0.13678 0.84513 
+ +

and +this means:

+ +
p11(100,106)=0.02655
+ +
p12(100,106)=0.17622
+ +
p13(100,106)=0.79722
+ +
p21(100,106)=0.01809
+ +
p22(100,106)=0.13678
+ +
p23(100,106)=0.84513 
+ +
- +Stationary +prevalence in each state: plrbiaspar.txt
+ +
#Prevalence
+ +
#Age 1-1 2-2
+ +
 
+ +
#************ 
+ +
70 0.90134 0.09866
+ +
71 0.89177 0.10823 
+ +
72 0.88139 0.11861 
+ +
73 0.87015 0.12985 
+ +

At age 70 the stationary prevalence is 0.90134 in state 1 and 0.09866 in state 2. This stationary prevalence differs from the observed prevalence, and this is the key point. The observed prevalence at age 70 results from the incidence of disability, the incidence of recovery and the mortality which occurred in the past of the cohort. Stationary prevalence results from a simulation with current incidences and mortality (estimated from this cross-longitudinal survey). It is the best predictive value of the prevalence in the future if "nothing changes in the future". This is exactly what demographers do with a life table. Life expectancy is the expected mean survival time if observed mortality rates (incidence of mortality) "remain constant" in the future.

+ +
- +Standard deviation of stationary prevalence: vplrbiaspar.txt
+ +

The stationary prevalence has to be compared with the observed prevalence by age. But both are statistical estimates and are subject to stochastic errors due to the size of the sample, the design of the survey and, for the stationary prevalence, the model used and fitted. It is possible to compute the standard deviation of the stationary prevalence at each age.

+ +
-Observed and stationary prevalence in state (2=disabled) with the confidence interval: vbiaspar21.gif
+ +

This +graph exhibits the stationary prevalence in state (2) with the +confidence interval in red. The green curve is the observed +prevalence (or proportion of individuals in state (2)). Without +discussing the results (it is not the purpose here), we observe +that the green curve is rather below the stationary prevalence. +It suggests an increase of the disability prevalence in the +future.

+ +

+ +
-Convergence to the stationary prevalence of disability: pbiaspar11.gif
+
+ +

This graph plots the conditional transition probabilities from an initial state (1=healthy in red at the bottom, or 2=disabled in green on top) at age x to the final state 2=disabled at age x+h. Conditional means conditional on being alive at age x+h, which has probability hP11x + hP12x for an individual initially healthy and hP21x + hP22x for an individual initially disabled. The curves hP12x/(hP11x + hP12x) and hP22x/(hP21x + hP22x) converge with h to the stationary prevalence of disability. In order to get the stationary prevalence at age 70 we should start the process at an earlier age, e.g. 50. If the disability state is defined by severe disability criteria with only a small chance of recovery, then the incidence of recovery is low and the time to convergence is probably longer. But we do not have experience of this yet.
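The conditional curves can be illustrated with the line of pijrbiaspar.txt quoted earlier (x=100, x+h=106). The sketch assumes that conditioning on survival means dividing by the sum of the probabilities of the two live states in the same row; the function name is hypothetical.

```python
# Sketch only: share of the disabled among survivors, by initial state.
# Probabilities copied from the pijrbiaspar.txt line for ages 100 -> 106.

row_from_healthy = (0.02655, 0.17622, 0.79722)   # p11, p12, p13
row_from_disabled = (0.01809, 0.13678, 0.84513)  # p21, p22, p23

def disabled_share_among_survivors(row):
    """P(state 2 at x+h | alive at x+h) for one row of hPx."""
    p_healthy, p_disabled, _p_dead = row
    return p_disabled / (p_healthy + p_disabled)

from_healthy = disabled_share_among_survivors(row_from_healthy)
from_disabled = disabled_share_among_survivors(row_from_disabled)
```

The two shares come out close to each other (roughly 0.87 and 0.88), which is what convergence towards a common stationary prevalence looks like after a 6-year delay at these very old ages.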

+ +
- +Life expectancies by age and initial health status: erbiaspar.txt
+ +
# Health expectancies 
+ +
# Age 1-1 1-2 2-1 2-2 
+ +
70 10.9226 3.0401 5.6488 6.2122 
+ +
71 10.4384 3.0461 5.2477 6.1599 
+ +
72 9.9667 3.0502 4.8663 6.1025 
+ +
73 9.5077 3.0524 4.5044 6.0401 
+ +
For example 70 10.9226 3.0401 5.6488 6.2122 means:
+ +
e11=10.9226 e12=3.0401 e21=5.6488 e22=6.2122
+ +
+ +

For example, the life expectancy of a healthy individual at age 70 is 10.92 years in the healthy state and 3.04 in the disability state (13.96 years in total). If he was disabled at age 70, his life expectancy will be shorter: 5.65 in the healthy state and 6.21 in the disability state (11.86 years in total). The total life expectancy is a weighted mean of both, 13.96 and 11.86; the weight is the proportion of people disabled at age 70. In order to get a pure period index (i.e. based only on incidences) we use the computed or stationary prevalence at age 70 (i.e. computed from incidences at earlier ages) instead of the observed prevalence (for example at first exam) (see below).

+ +
- +Variances of life expectancies by age and initial health status: vrbiaspar.txt
+ +

For +example, the covariances of life expectancies Cov(ei,ej) at age +50 are (line 3)

+ +
   Cov(e1,e1)=0.4776  Cov(e1,e2)=0.0488=Cov(e2,e1)  Cov(e2,e2)=0.0424
+ +
- +Health expectancies with +standard errors in parentheses: trbiaspar.txt
+ +
#Total LEs with variances: e.. (std) e.1 (std) e.2 (std) 
+ +
70 13.76 (0.22) 10.40 (0.20) 3.35 (0.14) 
+ +

Thus, at age 70 the total life expectancy, e..=13.76 years, is the weighted mean of e1.=13.96 and e2.=11.86, the weights being the stationary prevalences at age 70, which are 0.90134 in state 1 and 0.09866 in state 2 respectively (the sum is equal to one). e.1=10.40 is the disability-free life expectancy at age 70 (it is again a weighted mean, of e11 and e21). Likewise, e.2=3.35 is the life expectancy at age 70 to be spent in the disability state.
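The weighted means can be checked numerically. The sketch below uses the age-70 values of erbiaspar.txt and plrbiaspar.txt quoted earlier; variable names are illustrative.

```python
# Sketch only: health expectancies at age 70 as prevalence-weighted means.
# eij copied from erbiaspar.txt, weights from plrbiaspar.txt (age 70).

e11, e12, e21, e22 = 10.9226, 3.0401, 5.6488, 6.2122
w1, w2 = 0.90134, 0.09866  # stationary prevalence in states 1 and 2

e1_total = e11 + e12  # total LE of an initially healthy individual
e2_total = e21 + e22  # total LE of an initially disabled individual

e_dot1 = w1 * e11 + w2 * e21            # disability-free life expectancy
e_dot2 = w1 * e12 + w2 * e22            # life expectancy with disability
e_total = w1 * e1_total + w2 * e2_total  # total life expectancy
```

Rounded to two decimals, these reproduce the 10.40, 3.35 and 13.76 years reported in trbiaspar.txt.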

+ +
-Total life expectancy by age and health expectancies in states (1=healthy) and (2=disabled): ebiaspar1.gif
+ +

This figure represents the health expectancies and the total life expectancy, with the confidence interval as a dashed curve.

+ +
        
+ +

Standard deviations (obtained from the information matrix of the model) of these quantities are very useful. Cross-longitudinal surveys are costly and do not involve huge samples, generally a few thousand individuals; therefore it is very important to have an idea of the standard deviation of our estimates. It has been a big challenge to compute the health expectancy standard deviations. Don't be confused: life expectancy is, like any expected value, the mean of a distribution; but here we are not computing the standard deviation of the distribution, but the standard deviation of the estimate of the mean.

+ +

Our health expectancy estimates vary according to the sample size (and the standard deviations give confidence intervals of the estimate) but also according to the model fitted. Let us explain this in more detail.
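As an illustration of how a standard deviation translates into a confidence interval, the sketch below uses the age-70 line of trbiaspar.txt quoted earlier. It assumes a normal approximation with z=1.96 for a 95% interval; this is a common convention, not necessarily the rule used by IMaCh for its graphs.

```python
# Sketch only: 95% confidence interval under a normal approximation.
# Estimates and standard errors copied from the trbiaspar.txt line for age 70.

def confidence_interval(estimate, std_error, z=1.96):
    """Return (lower, upper) = estimate -/+ z * std_error."""
    margin = z * std_error
    return estimate - margin, estimate + margin

total_le = confidence_interval(13.76, 0.22)   # e..
healthy_le = confidence_interval(10.40, 0.20)  # e.1
disabled_le = confidence_interval(3.35, 0.14)  # e.2
```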

+ +

Choosing a model means at least two kinds of choices. First we have to decide the number of disability states. Second we have to design, within the logit model family, the model itself: variables, covariates, confounding factors etc. to be included.

+ +

The more disability states we have, the better our demographic description of the disability process, but the smaller the number of transitions between each pair of states and the higher the noise in the measurement. We do not yet have enough experience with the various models to summarize their advantages and disadvantages, but it is important to say that even if we had huge and unbiased samples, the total life expectancy computed from a cross-longitudinal survey varies with the number of states. If we define only two states, alive or dead, we find the usual life expectancy, where it is assumed that at each age people are at the same risk of dying. If we split the alive state into healthy and disabled, and as mortality from the disability state is higher than mortality from the healthy state, we introduce heterogeneity in the risk of dying. The total mortality at each age is the mean of the mortality in each state weighted by the prevalence in each state. Therefore, if the proportion of people at each age and in each state differs from the stationary equilibrium, there is no reason to find the same total mortality at a particular age. Life expectancy, even if it is a very useful tool, relies on a very strong hypothesis of homogeneity of the population. Our main purpose is not to measure differential mortality but to measure the expected time in a healthy or disabled state, in order to maximise the former and minimise the latter. But the differential in mortality complicates the measurement.

+ +

Incidences of disability or recovery are not affected by the number of states if these states are independent. But incidence estimates depend on the specification of the model. The more covariates we add to the logit model, the better the model fits, but some covariates are not well measured and some are confounding factors, as in any statistical model. The procedure to "fit the best model" is similar to logistic regression, which itself is similar to regression analysis. We have not gone that far yet because we also have a severe limitation, which is the speed of convergence. On a 500 MHz Pentium III, even the simplest model, estimated by month on 8,000 people, may take 4 hours to converge. Also, the program is not yet a statistical package that permits a simple specification of the variables and the model to be used in the maximisation. The current program only allows simple variables to be added, like age+sex or age+sex+age*sex, and will never be general enough. But what should be remembered is that incidences, or probabilities of change from one state to another, are affected by the variables specified in the model.

+ +

Also, the age range of the people interviewed is linked to the age range of the life expectancy which can be estimated by extrapolation. If your sample ranges from age 70 to 95, you can clearly estimate a life expectancy at age 70 and trust your confidence interval, which is mostly based on your sample size. If you want to estimate the life expectancy at age 50, you have to rely on your model, but fitting a logistic model on an age range of 70-95 and estimating transition probabilities outside this age range, say at age 50, is very dangerous. At the very least you should remember that the confidence intervals given by the standard deviation of the health expectancies hold under the strong assumption that your model is the 'true model', which is probably not the case.

+ +
- +Copy of the parameter file: orbiaspar.txt
+ +

This +copy of the parameter file can be useful to re-run the program +while saving the old output files.

+ +
- +Prevalence forecasting: frbiaspar.txt
+ +

First, we have estimated the observed prevalence between 1/1/1984 and 1/6/1988. The mean date of interview (weighted average of the interviews performed between 1/1/1984 and 1/6/1988) is estimated to be 13/9/1985, as written at the top of the file. Then we forecast the probability to be in each state.

+ +

Example, +at date 1/1/1989 :

+ +

# StartingAge FinalAge P.1 P.2 P.3

+ +

# Forecasting at date 1/1/1989

+ +

73 0.807 0.078 0.115

+ +

Since the minimum age is 70 on 13/9/1985, the youngest forecasted age is 73. This means that a person aged 70 on 13/9/1985 has a probability of 0.807 of being in state 1 at age 73 on 1/1/1989. Similarly, the probability of being in state 2 is 0.078 and the probability of having died is 0.115. Then, on 1/1/1989, the prevalence of disability at age 73 is estimated to be 0.088.
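The 0.088 figure is the probability of state 2 divided by the probability of being alive, as the following sketch shows. The three probabilities are copied from the forecast line above; variable names are illustrative.

```python
# Sketch only: forecasted prevalence of disability among survivors.
# P.1, P.2, P.3 copied from the frbiaspar.txt line for age 73 on 1/1/1989.

p_healthy, p_disabled, p_dead = 0.807, 0.078, 0.115

# Prevalence is measured among people still alive at the forecast date.
alive = p_healthy + p_disabled
disability_prevalence = p_disabled / alive
```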

+ +
- +Population forecasting: poprbiaspar.txt
+ +
# Age P.1 P.2 P.3 [Population]
+ +
# Forecasting at date 1/1/1989 
+ +
75 572685.22 83798.08 
+ +
74 621296.51 79767.99 
+ +
73 645857.70 69320.60 
+ +
# Forecasting at date 1/1/1990
+ +
76 442986.68 92721.14 120775.48
+ +
75 487781.02 91367.97 121915.51
+ +
74 512892.07 85003.47 117282.76 
+ +
 
+ +

From the population file, we estimate the number of people in each state. At age 73, 645857 persons are in state 1 and 69320 are in state 2. One year later, 512892 are still in state 1, 85003 are in state 2 and 117282 died before 1/1/1990.
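A quick consistency check on these figures: the people alive at a given age on 1/1/1989 should all be found, one year older, in state 1, state 2 or dead on 1/1/1990. The sketch below uses the numbers of the excerpt above; names and the tolerance are illustrative.

```python
# Sketch only: each cohort is conserved between the two forecast dates.
# Numbers copied from the poprbiaspar.txt excerpt above.

at_1989 = {  # age: (state 1, state 2)
    73: (645857.70, 69320.60),
    74: (621296.51, 79767.99),
    75: (572685.22, 83798.08),
}
at_1990 = {  # age: (state 1, state 2, dead)
    74: (512892.07, 85003.47, 117282.76),
    75: (487781.02, 91367.97, 121915.51),
    76: (442986.68, 92721.14, 120775.48),
}

def cohort_is_conserved(age, tolerance=0.01):
    """Compare the cohort aged `age` in 1989 with age+1 in 1990."""
    return abs(sum(at_1989[age]) - sum(at_1990[age + 1])) < tolerance

checks = {age: cohort_is_conserved(age) for age in at_1989}
```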

+ +
 
+ +
+ +

Trying an example

+ +

Since you know how to run the program, it is time to test it on your own computer. Try for example a parameter file named imachpar.txt which is a copy of mypar.txt included in the subdirectory of imach, mytry. Edit it to change the name of the data file to ..\data\mydata.txt if you don't want to copy it to the same directory. The file mydata.txt is a smaller file of 3,000 people but still with 4 waves.

+ +

Click on the imach.exe icon to open a window. Answer the question: 'Enter the parameter file name:'

+ + + + + +
IMACH, + Version 0.7

Enter + the parameter file name: ..\mytry\imachpar.txt

+
+ +

Most of the data files or image files generated will use the 'imachpar' string in their name. The running time is about 2-3 minutes on a Pentium III. If the execution worked correctly, the output files are created in the current directory, and should be the same as the mypar files initially included in the directory mytry.

+ +
·                Output on the screen: the output screen looks like this (Log file)
+ +
 
+ +
#title=MLE datafile=..\data\mydata.txt lastobs=3000 firstpass=1 lastpass=3
+ +
ftol=1.000000e-008 stepm=24 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0
+ +
Total number of individuals= 2965, Agemin = 70.00, Agemax= 100.92
+ +
 
+ +
Warning, no any valid information for:126 line=126
+ +
Warning, no any valid information for:2307 line=2307
+ +
Delay (in months) between two waves Min=21 Max=51 Mean=24.495826
+ +
These lines give some warnings on the data file and also some raw statistics on frequencies of transitions.
+ +
Age 70 1.=230 loss[1]=3.5% 2.=16 loss[2]=12.5% 1.=222 prev[1]=94.1% 2.=14
+ +
 prev[2]=5.9% 1-1=8 11=200 12=7 13=15 2-1=2 21=6 22=7 23=1
+ +
Age 102 1.=0 loss[1]=NaNQ% 2.=0 loss[2]=NaNQ% 1.=0 prev[1]=NaNQ% 2.=0 
+ + + +
·                Calculation of the hessian matrix. Wait...
+ +
12345678.12.13.14.15.16.17.18.23.24.25.26.27.28.34.35.36.37.38.45.46.47.48.56.57.58.67.68.78
+ +
 
+ +
Inverting the hessian to get the covariance matrix. Wait...
+ +
 
+ +
#Hessian matrix#
+ +
3.344e+002 2.708e+004 -4.586e+001 -3.806e+003 -1.577e+000 -1.313e+002 3.914e-001 3.166e+001 
+ +
2.708e+004 2.204e+006 -3.805e+003 -3.174e+005 -1.303e+002 -1.091e+004 2.967e+001 2.399e+003 
+ +
-4.586e+001 -3.805e+003 4.044e+002 3.197e+004 2.431e-002 1.995e+000 1.783e-001 1.486e+001 
+ +
-3.806e+003 -3.174e+005 3.197e+004 2.541e+006 2.436e+000 2.051e+002 1.483e+001 1.244e+003 
+ +
-1.577e+000 -1.303e+002 2.431e-002 2.436e+000 1.093e+002 8.979e+003 -3.402e+001 -2.843e+003 
+ +
-1.313e+002 -1.091e+004 1.995e+000 2.051e+002 8.979e+003 7.420e+005 -2.842e+003 -2.388e+005 
+ +
3.914e-001 2.967e+001 1.783e-001 1.483e+001 -3.402e+001 -2.842e+003 1.494e+002 1.251e+004 
+ +
3.166e+001 2.399e+003 1.486e+001 1.244e+003 -2.843e+003 -2.388e+005 1.251e+004 1.053e+006 
+ +
# Scales
+ +
12 1.00000e-004 1.00000e-006
+ +
13 1.00000e-004 1.00000e-006
+ +
21 1.00000e-003 1.00000e-005
+ +
23 1.00000e-004 1.00000e-005
+ +
# Covariance
+ +
  1 5.90661e-001
+ +
  2 -7.26732e-003 8.98810e-005
+ +
  3 8.80177e-002 -1.12706e-003 5.15824e-001
+ +
  4 -1.13082e-003 1.45267e-005 -6.50070e-003 8.23270e-005
+ +
  5 9.31265e-003 -1.16106e-004 6.00210e-004 -8.04151e-006 1.75753e+000
+ +
  6 -1.15664e-004 1.44850e-006 -7.79995e-006 1.04770e-007 -2.12929e-002 2.59422e-004
+ +
  7 1.35103e-003 -1.75392e-005 -6.38237e-004 7.85424e-006 4.02601e-001 -4.86776e-003 1.32682e+000
+ +
  8 -1.82421e-005 2.35811e-007 7.75503e-006 -9.58687e-008 -4.86589e-003 5.91641e-005 -1.57767e-002 1.88622e-004
+ +
# agemin agemax for lifexpectancy, bage fage (if mle==0 ie no data nor Max likelihood).
+ +
 
+ +
 
+ +
agemin=70 agemax=100 bage=50 fage=100
+ +
Computing prevalence limit: result on file 'plrmypar.txt' 
+ +
Computing pij: result on file 'pijrmypar.txt' 
+ +
Computing Health Expectancies: result on file 'ermypar.txt' 
+ +
Computing Variance-covariance of DFLEs: file 'vrmypar.txt' 
+ +
Computing Total LEs with variances: file 'trmypar.txt' 
+ +
Computing Variance-covariance of Prevalence limit: file 'vplrmypar.txt' 
+ +
End of Imach
+ +

Once the run is finished, the program requires a character:

+ + + + + +
Type + e to edit output files, c to start again, and q for + exiting:
+ +

First +you should enter e to edit the master file +mypar.htm.

+ + + +

This software has been partly funded by Euro-REVES, a concerted action from the European Union. It will be copyrighted identically to a GNU software product, i.e. program and software can be distributed freely for non-commercial use. Sources are not widely distributed today. You can get them by asking us with a simple justification (name, email, institute) mailto:brouard@ined.fr and mailto:lievre@ined.fr .

+ +

Latest +version (0.7 of February 2002) can be accessed at http://euroreves.ined.fr/imach

+ +