From ca8142362c3432a71ab5f2a832cff5d177b15da5 Mon Sep 17 00:00:00 2001
From: "N. Brouard"
Date: Wed, 16 Jun 2004 23:41:19 +0000
Subject: [PATCH] *** empty log message ***

---
 html/ChangeLog  |    4 +
 html/README.htm | 2131 +++++++++++++++++++++++++++++++++++++++++++++++
 html/index.htm  |   92 +-
 3 files changed, 2183 insertions(+), 44 deletions(-)
 create mode 100644 html/ChangeLog
 create mode 100644 html/README.htm

diff --git a/html/ChangeLog b/html/ChangeLog
new file mode 100644
index 0000000..1a4e0f7
--- /dev/null
+++ b/html/ChangeLog
@@ -0,0 +1,4 @@
+2004-06-16  Brouard Nicolas
+
+  * doc/biaspar.htm (Repository): New add
+
diff --git a/html/README.htm b/html/README.htm
new file mode 100644
index 0000000..c9d27b6
--- /dev/null
+++ b/html/README.htm
@@ -0,0 +1,2131 @@
+Computing Health Expectancies using IMaCh

+ +

Computing Health +Expectancies using IMaCh

+ +

(a Maximum +Likelihood Computer Program using Interpolation of Markov Chains)

+ +

INED and EUROREVES

+ +

Version 0.7, +February 2002

+ +

Authors of +the program: Nicolas +Brouard, senior researcher at the Institut National d'Etudes +Démographiques (INED, Paris) in the +"Mortality, Health and Epidemiology" Research Unit

+ +

and Agnès +Lièvre
+

+ +

Contribution to the mathematics: C. R. Heathcote (Australian +National University, Canberra).

+ +

Contact: Agnès Lièvre (lievre@ined.fr) +

+ +

Introduction
On what kind of data can it be used?
The data file
The parameter file
Running Imach
Output files and graphs
Example

+ +

Introduction

+ +

This program computes Healthy Life Expectancies from cross-longitudinal data using the methodology pioneered by Laditka and Wolf (1). Within the family of Health Expectancies (HE), Disability-free life expectancy (DFLE) is probably the most important index to monitor. In low mortality countries, there is a fear that when mortality declines, the increase in DFLE is not proportionate to the increase in total Life expectancy. This case is called the Expansion of morbidity. Most of the data collected today, in particular by the international REVES network on Health expectancy, and most HE indices based on these data, are cross-sectional. This means that the information collected comes from a single cross-sectional survey: people of various ages (but mostly old people) are surveyed on their health status at a single date. The proportion of people disabled at each age can then be measured at that date. This age-specific prevalence curve is then used to distinguish, within the stationary population (which, by definition, is the life table estimated from the vital statistics on mortality at the same date), the disabled population from the disability-free population. Life expectancy (LE) (or total population divided by the yearly number of births or deaths of this stationary population) is then decomposed into DFLE and DLE (life expectancy with disability). This method of computing HE is usually called the Sullivan method (from the name of the author who first described it).
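The Sullivan decomposition described above can be sketched in a few lines. This is only an illustration: the abridged life table and the prevalence figures are invented, and the function name `sullivan` is hypothetical (it is not part of IMaCh).

```python
def sullivan(person_years, prevalence, survivors_at_start):
    """Split life expectancy into disability-free (DFLE) and disabled (DLE) parts.

    person_years[k]    : person-years lived in age interval k of the stationary population
    prevalence[k]      : cross-sectional proportion disabled in age interval k
    survivors_at_start : life table survivors at the starting age
    """
    dfle = sum(L * (1.0 - p) for L, p in zip(person_years, prevalence)) / survivors_at_start
    dle = sum(L * p for L, p in zip(person_years, prevalence)) / survivors_at_start
    return dfle, dle

# Invented stationary population from age 70, in three age intervals.
person_years = [8000.0, 4500.0, 1200.0]
prevalence = [0.10, 0.30, 0.60]
dfle, dle = sullivan(person_years, prevalence, survivors_at_start=1000.0)
le = dfle + dle  # total life expectancy at the starting age
print(f"LE={le:.2f} DFLE={dfle:.2f} DLE={dle:.2f}")
```

Because the weights are observed prevalences, the result reflects the past history of each cohort, which is the limitation discussed in the next paragraph.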

+ +

Age-specific proportions of disabled people are very difficult to forecast because each proportion corresponds to the historical conditions of the cohort: it is the result of the past flows of entering disability and recovering, up to today. The age-specific intensities (or incidence rates) of entering disability or recovering good health reflect current conditions and can therefore be used at each age to forecast the future of this cohort. For example, if a country is improving its prosthesis technology, the incidence of recovering the ability to walk will be higher at each (old) age, but the prevalence of disability will only slightly reflect the improvement, because the prevalence is mostly affected by the history of the cohort and not by recent period effects. To measure the period improvement we have to simulate the future of a cohort of new-borns entering or leaving the disability state, or dying, at each age according to the incidence rates measured today on different cohorts. The proportion of people disabled at each age in this simulated cohort will be much lower (using the example of an improvement) than the proportions observed at each age in a cross-sectional survey. This new prevalence curve introduced in a life table will give a more current and realistic HE level than the Sullivan method, which mostly measures the history of health conditions in this country.

+ +

Therefore, the main question is how to measure incidence rates from cross-longitudinal surveys. This is the goal of the IMaCh program. From your data and using IMaCh you can estimate period HE and not only Sullivan's HE. The standard errors of the HE are also computed.

+ +

A cross-longitudinal survey consists of a first survey ("cross") where individuals of different ages are interviewed on their health status or degree of disability. At least a second wave of interviews ("longitudinal") should measure the new health status of each individual. Health expectancies are computed from the transitions observed between waves and are computed for each degree of severity of disability (number of life states). The more degrees you consider, the more time is necessary to reach the Maximum Likelihood of the parameters involved in the model. Considering only two states of disability (disabled and healthy) is generally enough, but the computer program also works with more health statuses.
+
The simplest model is the multinomial logistic model where pij is the probability of being observed in state j at the second wave conditional on being observed in state i at the first wave. A simple model is therefore log(pij/pii) = aij + bij*age + cij*sex, where 'age' is age and 'sex' is a covariate. The advantage claimed for this computer program is that if the delay between waves is not identical for each individual, or if some individuals missed an interview, the information is not rounded or lost, but taken into account using an interpolation or extrapolation. hPijx is the probability of being observed in state j at age x+h conditional on being observed in state i at age x. The delay 'h' can be split into a whole number (nh*stepm) of unobserved intermediate states. This elementary transition (by month, quarter, semester or year) is modeled as a multinomial logistic. The hPx matrix is simply the matrix product of nh*stepm elementary matrices and the contribution of each individual to the likelihood is simply hPijx.
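The construction of hPx as a product of elementary matrices can be illustrated as follows. This is a sketch under stated assumptions, not the program's actual code: it assumes age is expressed in years, that each elementary step advances age by stepm/12, and that no covariate other than age is used. The function names are hypothetical; the coefficients are the guess values shown later in this manual.

```python
import math

# Guess values (a_ij, b_ij) taken from the example parameter file in this manual.
GUESS = {
    (1, 2): (-14.155633, 0.110794),
    (1, 3): (-7.925360, 0.032091),
    (2, 1): (-1.890135, -0.029473),
    (2, 3): (-6.234642, 0.022315),
}

def elementary_matrix(age, params, nlstate=2, ndeath=1):
    """One-step transition matrix built from log(pij/pii) = aij + bij*age."""
    n = nlstate + ndeath
    matrix = []
    for i in range(1, nlstate + 1):
        odds = {j: math.exp(params[(i, j)][0] + params[(i, j)][1] * age)
                for j in range(1, n + 1) if j != i}
        pii = 1.0 / (1.0 + sum(odds.values()))
        matrix.append([pii if j == i else pii * odds[j] for j in range(1, n + 1)])
    for d in range(nlstate + 1, n + 1):  # absorbing states are never left
        matrix.append([1.0 if j == d else 0.0 for j in range(1, n + 1)])
    return matrix

def matmul(a, b):
    """Plain matrix product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def h_step_matrix(age, n_steps, stepm, params):
    """hPx: product of n_steps elementary matrices, each spanning stepm months."""
    n = len(elementary_matrix(age, params))
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n_steps):
        result = matmul(result, elementary_matrix(age + k * stepm / 12.0, params))
    return result

# Two years between interviews, interpolated month by month from age 70.
hP = h_step_matrix(70.0, n_steps=24, stepm=1, params=GUESS)
print([round(p, 5) for p in hP[0]])  # hP11, hP12, hP13
```

An individual observed in state i at age x and in state j two years later would contribute the entry hP[i-1][j-1] to the likelihood in this sketch.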

+ +

The program presented in this manual is a quite general program named IMaCh (for Interpolated MArkov CHain), designed to analyse transition data from longitudinal surveys. The first step is the estimation of the parameters of a model of transition probabilities between an initial status and a final status. From there, the computer program produces some indicators such as observed and stationary prevalence, life expectancies and their variances, and graphs. Our transition model consists of absorbing and non-absorbing states with the possibility of return across the non-absorbing states. The main advantage of this package, compared to other programs for the analysis of transition data (for example Proc Catmod of SAS®), is that the whole individual information is used even if an interview is missing, a status or a date is unknown, or the delay between waves is not identical for each individual. The program can be executed according to parameters: selection of a sub-sample, number of absorbing and non-absorbing states, number of waves taken into account (the user inputs the first and the last interview), a tolerance level for the maximization function, the periodicity of the transitions (we can compute annual, quarterly or monthly transitions), and covariates in the model. It works on Windows or on Unix.

+ +

(1) Laditka, Sarah B. and Wolf, Douglas A. (1998), "New +Methods for Analyzing Active Life Expectancy". Journal of +Aging and Health. Vol 10, No. 2.

+ +

On what kind of data can it be used?

+ +

The minimum data required for a transition model is the recording of a set of individuals interviewed at a first date and interviewed again at least once more. From the observations of an individual, we obtain a follow-up over time of the occurrence of a specific event. In this documentation, the event is related to health status at older ages, but the program can be applied to many longitudinal studies in different contexts. To build the data file explained in the next section, you must have the month and year of each interview and the corresponding health status. In order to get age, the date of birth (month and year) is also required (a missing value is allowed for the month). The date of death (month and year) is an important piece of information, also required if the individual is dead. Shorter steps (e.g. a month) will more closely take into account the survival time after the last interview.

+ +

The data file

+ +

In this example, 8,000 people have been interviewed in a cross-longitudinal survey of 4 waves (1984, 1986, 1988, 1990). Some people missed 1, 2 or 3 interviews. Health statuses are healthy (1) and disabled (2). The survey is not a real one. It is a simulation of the American Longitudinal Survey on Aging. An individual is in the disability state if he or she fails one of four ADLs (Activities of Daily Living, like bathing, eating, walking). Therefore, even if the individuals interviewed in the sample are virtual, the information brought by this sample is close to the situation of the United States. Sex is not recorded in this sample.

+ +

Each line of the data set (named data1.txt in this first example) is an individual record whose fields are:

+ +

Index number: positive number (field 1)
First covariate: positive number (field 2)
Second covariate: positive number (field 3)
Weight: positive number (field 4). In most surveys individuals are weighted according to the stratification of the sample.
Date of birth: coded as mm/yyyy. Missing dates are coded as 99/9999 (field 5)
Date of death: coded as mm/yyyy. Missing dates are coded as 99/9999 (field 6)
Date of first interview: coded as mm/yyyy. Missing dates are coded as 99/9999 (field 7)
Status at first interview: positive number. Missing values are coded -1. (field 8)
Date of second interview: coded as mm/yyyy. Missing dates are coded as 99/9999 (field 9)
Status at second interview: positive number. Missing values are coded -1. (field 10)
Date of third interview: coded as mm/yyyy. Missing dates are coded as 99/9999 (field 11)
Status at third interview: positive number. Missing values are coded -1. (field 12)
Date of fourth interview: coded as mm/yyyy. Missing dates are coded as 99/9999 (field 13)
Status at fourth interview: positive number. Missing values are coded -1. (field 14)
etc

+ +

If your longitudinal survey does not include information about weights or covariates, you must fill the column with a number (e.g. 1) because a missing field is not allowed.
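To make the record layout concrete, here is a minimal reading sketch. It assumes that fields are separated by whitespace (the manual does not state the separator), the sample record is invented, and the function names are hypothetical.

```python
def parse_date(text):
    """Return (month, year); a part is None when coded as missing (99 or 9999)."""
    month, year = (int(part) for part in text.split("/"))
    return (None if month == 99 else month, None if year == 9999 else year)

def parse_record(line):
    """Split one individual record into named fields (layout described above)."""
    fields = line.split()
    record = {
        "index": int(fields[0]),
        "covariates": [float(fields[1]), float(fields[2])],
        "weight": float(fields[3]),
        "birth": parse_date(fields[4]),
        "death": parse_date(fields[5]),
        "waves": [],
    }
    # The remaining fields come in (interview date, status) pairs, one per wave.
    for k in range(6, len(fields) - 1, 2):
        status = int(fields[k + 1])
        record["waves"].append((parse_date(fields[k]), None if status == -1 else status))
    return record

# Invented record: alive, third interview missed.
sample = "1 0 0 1.0 6/1910 99/9999 5/1984 1 6/1986 2 99/9999 -1 4/1990 2"
record = parse_record(sample)
print(record["birth"], record["death"], record["waves"])
```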

+ +

Your first example parameter file

+ +

#Imach version 0.7, February 2002, +INED-EUROREVES

+ +

This is a comment. Comments start with a '#'.

+ +

First uncommented line

+ +

title=1st_example datafile=data1.txt lastobs=8600 firstpass=1 lastpass=4

+ +

title= + 1st_example is title of the run.
datafile=data1.txt is the name of the data set. Our example is a six-year follow-up survey. It consists of a baseline followed by 3 reinterviews.
lastobs=8600 the program is able to run on a subsample where the last observation number is lastobs. It can be set to a number larger than the real number of observations (e.g. 100000). In this example, maximisation will be done on the first 8600 records.
firstpass=1, lastpass=4 When the survey has more than two interviews, the program can be run on selected transition periods. firstpass=1 means the first interview included in the calculation is the baseline survey. lastpass=4 means that the information brought by the 4th interview is taken into account.

+ +

Second +uncommented line

+ +

ftol=1.e-08 stepm=1 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0

+ +

ftol=1e-8 Convergence tolerance on the function value in the maximisation of the likelihood. Choosing a correct value for ftol is difficult. 1e-8 is a correct value for a 32-bit computer.
stepm=1 + Time unit in months for interpolation. Examples:
- If + stepm=1, the unit is a month
- If stepm=4, the unit is four months (a third of a year)
- If + stepm=12, the unit is a year
- If + stepm=24, the unit is two years
- ... +
+
ncov=2 Number of covariates in the datafile. The intercept and the age parameter count as 2 covariates.
nlstate=2 + Number of non-absorbing (alive) states. Here we have two + alive states: disability-free is coded 1 and disability + is coded 2.
ndeath=1 + Number of absorbing states. The absorbing state death is + coded 3.
maxwav=4 + Number of waves in the datafile.
mle=1 Option for the Maximum Likelihood Estimation.
- If + mle=1 the program does the maximisation and the + calculation of health expectancies
- If + mle=0 the program only does the calculation of + the health expectancies.
+
weight=0 + Possibility to add weights.
- If + weight=0 no weights are included
- If + weight=1 the maximisation integrates the weights + which are in field 4
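A line of settings such as the second uncommented line is a sequence of key=value pairs. The sketch below shows one way to read it in a script of your own; it is not how IMaCh itself parses the file, and the function name is hypothetical.

```python
def parse_settings(line):
    """Turn 'key=value key=value ...' into a dictionary of strings."""
    settings = {}
    for token in line.split():
        key, _, value = token.partition("=")
        settings[key] = value
    return settings

line = "ftol=1.e-08 stepm=1 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0"
settings = parse_settings(line)

ftol = float(settings["ftol"])          # convergence tolerance
stepm = int(settings["stepm"])          # interpolation unit in months
steps_per_year = 12 // stepm            # number of elementary steps in one year
print(ftol, stepm, steps_per_year)
```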
+

+ +

Covariates

+ +

Intercept +and age are systematically included in the model. Additional +covariates can be included with the command

+ +

model=list of covariates

+ +

if + model=. then no covariates are included
if + model=V1 the model includes the first + covariate (field 2)
if + model=V2 the model includes the second + covariate (field 3)
if + model=V1+V2 the model includes the first + and the second covariate (fields 2 and 3)
if + model=V1*V2 the model includes the + product of the first and the second covariate (fields 2 + and 3)
if + model=V1+V1*age the model includes the + product covariate*age

+ +

Guess +values for optimisation

+ +

You must write the initial guess values of the parameters for optimisation. The number of parameters, N, depends on the number of absorbing and non-absorbing states and on the number of covariates.
N is given by the formula N=(nlstate + ndeath-1)*nlstate*ncov.

Thus in the simple case with 2 covariates (the model is log (pij/pii) = aij + bij * age where intercept and age are the two covariates), 2 health states (1 for disability-free and 2 for disability) and 1 absorbing state (3), you must enter 8 initial values: a12, b12, a13, b13, a21, b21, a23, b23. You can start with zeros as in this example, but if you have a more precise set (for example from an earlier run) you can enter it and it will speed up the convergence.
Each of the four lines starts with the indices "ij": ij aij bij
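The count N and the layout of the guess lines can be checked with a short script. This is a sketch with hypothetical helper names; it simply applies the formula above and writes one zero per covariate.

```python
def transitions(nlstate, ndeath):
    """Index pairs "ij" for each transition from an alive state i to another state j."""
    return [f"{i}{j}"
            for i in range(1, nlstate + 1)
            for j in range(1, nlstate + ndeath + 1)
            if j != i]

def n_parameters(nlstate, ndeath, ncov):
    """N = (nlstate + ndeath - 1) * nlstate * ncov."""
    return (nlstate + ndeath - 1) * nlstate * ncov

def zero_guess_block(nlstate, ndeath, ncov):
    """Lines 'ij 0.0 0.0 ...' with one zero per covariate."""
    return [" ".join([ij] + ["0.0"] * ncov) for ij in transitions(nlstate, ndeath)]

print(n_parameters(nlstate=2, ndeath=1, ncov=2))  # 8
for guess_line in zero_guess_block(nlstate=2, ndeath=1, ncov=2):
    print(guess_line)
```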

+ +

# Guess values of aij and bij in log (pij/pii) = aij + bij * age

+ +

12 -14.155633  0.110794

+ +

13  -7.925360  0.032091

+ +

21  -1.890135 -0.029473

+ +

23  -6.234642  0.022315

+ +

or, +to simplify:

+ +

12 0.0 0.0

+ +

13 0.0 0.0

+ +

21 0.0 0.0

+ +

23 0.0 0.0

+ +

Guess +values for computing variances

+ +

This +is an output if mle=1. But it can be used as +an input to get the various output data files (Health +expectancies, stationary prevalence etc.) and figures without +rerunning the rather long maximisation phase (mle=0).

+ +

The scales are small values for the evaluation of numerical derivatives. These derivatives are used to compute the hessian matrix of the parameters, that is the inverse of the covariance matrix, and the variances of health expectancies. Each line consists of the indices "ij" followed by the initial scales (zero to simplify) associated with aij and bij.

+ +

If + mle=1 you can enter zeros:

+ +

# Scales (for hessian or gradient estimation)

+ +

12 0. 0.

+ +

13 0. 0.

+ +

21 0. 0.

+ +

23 0. 0.

+ +

If + mle=0 you must enter a covariance matrix (usually + obtained from an earlier run).

+ +

Covariance +matrix of parameters

+ +

Each +line starts with indices "ijk" followed by the +covariances between aij and bij:

+ +

   121 Var(a12)

+ +

   122 Cov(b12,a12)  Var(b12)

+ +

...

+ +

   232 Cov(b23,a12)  Cov(b23,b12) ... Var (b23)

+ +

If + mle=1 you can enter zeros.

+ +

# Covariance matrix

+ +

121 0.

+ +

122 0. 0.

+ +

131 0. 0. 0.

+ +

132 0. 0. 0. 0.

+ +

211 0. 0. 0. 0. 0.

+ +

212 0. 0. 0. 0. 0. 0.

+ +

231 0. 0. 0. 0. 0. 0. 0.

+ +

232 0. 0. 0. 0. 0. 0. 0. 0.

+ +

If + mle=0 you must enter a covariance matrix (usually + obtained from an earlier run).
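The covariance block is lower triangular: the line for the k-th parameter holds k values. A small sketch (hypothetical helper name) that generates the all-zero block for any number of states and covariates:

```python
def zero_covariance_block(nlstate, ndeath, ncov):
    """Build the '# Covariance matrix' block filled with zeros.

    Each line starts with the indices "ijk" (transition i -> j, k-th coefficient)
    and carries one more value than the previous line.
    """
    lines = ["# Covariance matrix"]
    count = 0
    for i in range(1, nlstate + 1):
        for j in range(1, nlstate + ndeath + 1):
            if j == i:
                continue
            for k in range(1, ncov + 1):
                count += 1
                lines.append(f"{i}{j}{k} " + " ".join(["0."] * count))
    return lines

block = zero_covariance_block(nlstate=2, ndeath=1, ncov=2)
print("\n".join(block))
```

For nlstate=2, ndeath=1 and ncov=2 this reproduces the eight lines 121 to 232 shown above.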

+ +

Age +range for calculation of stationary prevalences and health +expectancies

+ +

agemin=70 agemax=100 bage=50 fage=100

+ +

Once we have obtained the estimated parameters, the program is able to calculate stationary prevalence, transition probabilities and life expectancies at any age. The choice of age range is useful for extrapolation. In our data file, ages vary from 70 to 102. Setting bage=50 and fage=100 makes the program compute life expectancy from age bage to age fage. As we use a model, we can compute life expectancy on a wider age range than the age range of the data. But the model can be rather wrong on large intervals.

+ +

Similarly, +it is possible to get extrapolated stationary prevalence by age +ranging from agemin to agemax.

+ +

agemin= + Minimum age for calculation of the stationary prevalence
agemax= + Maximum age for calculation of the stationary prevalence
bage= + Minimum age for calculation of the health expectancies
fage= + Maximum age for calculation of the health expectancies

+ +

Computing the observed prevalence

+ +

begin-prev-date=1/1/1984 end-prev-date=1/6/1988

+ +

The statements 'begin-prev-date' and 'end-prev-date' select the period over which the observed prevalences in each state are calculated. In this example, the prevalences are calculated on survey data collected between 1 January 1984 and 1 June 1988.

+ +

begin-prev-date= + Starting date (day/month/year)
end-prev-date= + Final date (day/month/year)

+ +

Population- +or status-based health expectancies

+ +

pop_based=0

+ +

The user can choose between population-based and status-based health expectancies. If pop_based=0, status-based health expectancies are computed, and if pop_based=1, the programme computes population-based health expectancies. Health expectancies are weighted averages of the health expectancies conditional on each initial state. For a status-based index, the weights are the cross-sectional prevalences observed between two dates, as previously explained, whereas for a population-based index, the weights are the stationary prevalences.

+ +

Prevalence +forecasting

+ +

starting-proj-date=1/1/1989 final-proj-date=1/1/1992 mov_average=0

+ +

Prevalence and population projections are available only if the interpolation unit is a month, i.e. stepm=1. The programme estimates the prevalence in each state at a precise date expressed in day/month/year. The programme computes one forecasted prevalence per year from a starting date (1 January 1989 in this example) to a final date (1 January 1992). The statement mov_average computes smoothed forecasted prevalences with a five-age moving average centred at the mid-age of the five-age period.

+ +

starting-proj-date= + starting date (day/month/year) of forecasting
final-proj-date= + final date (day/month/year) of forecasting
mov_average= + smoothing with a five-age moving average centred at the + mid-age of the five-age period. The command + mov_average takes value 1 if the prevalences are + smoothed and 0 otherwise.
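As an illustration of a five-age centred moving average, consider the sketch below. The prevalence values are invented, and the treatment of the two youngest and two oldest ages (left unsmoothed here) is an assumption: the manual does not say how the program handles the ends of the age range.

```python
def five_age_moving_average(values):
    """Replace each value by the mean of the five ages centred on it.

    The first two and last two ages lack a full five-age window and are
    left unchanged (an assumption made for this sketch).
    """
    smoothed = list(values)
    for k in range(2, len(values) - 2):
        smoothed[k] = sum(values[k - 2:k + 3]) / 5.0
    return smoothed

# Invented forecasted prevalences for seven consecutive ages.
prevalence_by_age = [0.10, 0.12, 0.11, 0.15, 0.14, 0.18, 0.17]
smoothed = five_age_moving_average(prevalence_by_age)
print([round(p, 3) for p in smoothed])
```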

+ +

Last +uncommented line : Population forecasting

+ +

popforecast=0 popfile=pyram.txt popfiledate=1/1/1989 last-popfiledate=1/1/1992

+ +

This command is available if the interpolation unit is a month, i.e. stepm=1, and if popforecast=1. From a data file giving, for each age, the number of persons alive at the precise date ‘popfiledate’, you can forecast the number of persons in each state until the date ‘last-popfiledate’. In this example, the popfile pyram.txt contains real data: the Japanese population in 1989.

+ +

popforecast= + 0 Option for population forecasting. If + popforecast=1, the programme does the forecasting.
popfile= + name of the population file
popfiledate= date of the population given in the population file
last-popfiledate= + date of the last population projection

+ +

Running Imach with this example

+ +

We assume that you entered your 1st_example parameter file as explained above. To run the program you should click on the imach.exe icon and enter the name of the parameter file, which is for example C:\usr\imach\mle\biaspar.txt (you can also click on the biaspar.txt icon located in C:\usr\imach\mle and drag it with the mouse onto the imach window).

+ +

The time to converge depends on the step unit that you used (1 month is CPU-intensive), on the number of cases, and on the number of variables.

+ +

The +program outputs many files. Most of them are files which will be +plotted for better understanding.

+ +

Output of the program and graphs

+ +

Once the optimization is finished, some graphics can be made with a grapher. We use Gnuplot, an interactive plotting program that is copyrighted but freely distributed. A gnuplot reference manual is available here.
When the run is finished, the user should enter a character for plotting and output editing.

+ +

These +characters are:

+ +

'c' + to start again the program from the beginning.
'e' + opens the biaspar.htm + file to edit the output files and graphs.
'q' + for exiting.

+ +

Results +files
+
+- Observed +prevalence in each state (and at first pass): +prbiaspar.txt

+ +

The first line is the title and displays each field of the file. The first column is age. Fields 2 and 6 are the proportions of individuals in states 1 and 2 respectively as observed during the first exam. The other fields are the numbers of people in states 1, 2 or more. The number of columns increases if the number of states is higher than 2.
The header of the file is

+ +

# Age Prev(1) N(1) N Age Prev(2) N(2) N

+ +

70 1.00000 631 631 70 0.00000 0 631

+ +

71 0.99681 625 627 71 0.00319 2 627

+ +

72 0.97125 1115 1148 72 0.02875 33 1148

+ +

It +means that at age 70, the prevalence in state 1 is 1.000 and in +state 2 is 0.00 . At age 71 the number of individuals in state 1 +is 625 and in state 2 is 2, hence the total number of people aged +71 is 625+2=627.
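The prevalences in this file are simply the counts in each state divided by the total at that age. A short sketch (hypothetical function name) rebuilding the three lines shown above from the counts N(1) and N(2):

```python
def format_row(age, n1, n2):
    """Format one line as: Age Prev(1) N(1) N Age Prev(2) N(2) N."""
    total = n1 + n2
    return (f"{age} {n1 / total:.5f} {n1} {total} "
            f"{age} {n2 / total:.5f} {n2} {total}")

# Counts in state 1 and state 2 at ages 70, 71 and 72, taken from the lines above.
counts = [(70, 631, 0), (71, 625, 2), (72, 1115, 33)]
for age, n1, n2 in counts:
    print(format_row(age, n1, n2))
```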

+ +

- +Estimated parameters and covariance matrix: rbiaspar.txt

+ +

This +file contains all the maximisation results:

+ +

 -2 log likelihood= 21660.918613445392

+ +

 Estimated parameters: a12 = -12.290174 b12 = 0.092161

+ +

                       a13 = -9.155590  b13 = 0.046627

+ +

                       a21 = -2.629849  b21 = -0.022030

+ +

                       a23 = -7.958519  b23 = 0.042614

+ +

 Covariance matrix: Var(a12) = 1.47453e-001

+ +

                    Var(b12) = 2.18676e-005

+ +

                    Var(a13) = 2.09715e-001

+ +

                    Var(b13) = 3.28937e-005

+ +

                    Var(a21) = 9.19832e-001

+ +

                    Var(b21) = 1.29229e-004

+ +

                    Var(a23) = 4.48405e-001

+ +

                    Var(b23) = 5.85631e-005

+ +

By +substitution of these parameters in the regression model, we +obtain the elementary transition probabilities:

+ +

- +Transition probabilities: pijrbiaspar.txt

+ +

Here are the transition probabilities Pij(x, x+nh) where nh is a multiple of 2 years. The first column is the starting age x (from age 50 to 100), the second is age (x+nh) and the others are the transition probabilities p11, p12, p13, p21, p22, p23. For example, line 5 of the file is:

+ +

 100 106 0.02655 0.17622 0.79722 0.01809 0.13678 0.84513

+ +

and +this means:

+ +

p11(100,106)=0.02655

+ +

p12(100,106)=0.17622

+ +

p13(100,106)=0.79722

+ +

p21(100,106)=0.01809

+ +

p22(100,106)=0.13678

+ +

p23(100,106)=0.84513

+ +

- +Stationary +prevalence in each state: plrbiaspar.txt

+ +

#Prevalence

+ +

#Age 1-1 2-2

+ +

#************

+ +

70 0.90134 0.09866

+ +

71 0.89177 0.10823

+ +

72 0.88139 0.11861

+ +

73 0.87015 0.12985

+ +

At age 70 the stationary prevalence is 0.90134 in state 1 and 0.09866 in state 2. This stationary prevalence differs from the observed prevalence. Here is the point. The observed prevalence at age 70 results from the incidence of disability, the incidence of recovery and the mortality which occurred in the past of the cohort. Stationary prevalence results from a simulation with current incidences and mortality (estimated from this cross-longitudinal survey). It is the best predictive value of the prevalence in the future if "nothing changes in the future". This is exactly what demographers do with a Life table. Life expectancy is the expected mean survival time if observed mortality rates (incidence of mortality) "remain constant" in the future.

+ +

- +Standard deviation of stationary prevalence: vplrbiaspar.txt

+ +

The stationary prevalence has to be compared with the observed prevalence by age. But both are statistical estimates and subject to stochastic errors due to the size of the sample, the design of the survey and, for the stationary prevalence, the model used and fitted. It is possible to compute the standard deviation of the stationary prevalence at each age.

+ +

-Observed and stationary prevalence in state (2=disabled) with the confidence interval: vbiaspar21.gif

+ +

This +graph exhibits the stationary prevalence in state (2) with the +confidence interval in red. The green curve is the observed +prevalence (or proportion of individuals in state (2)). Without +discussing the results (it is not the purpose here), we observe +that the green curve is rather below the stationary prevalence. +It suggests an increase of the disability prevalence in the +future.

+ +

-Convergence +to the stationary prevalence of disability: pbiaspar11.gif
+

+ +

This graph plots the conditional transition probabilities from an initial state (1=healthy in red at the bottom, or 2=disabled in green on top) at age x to the final state 2=disabled at age x+h. Conditional means conditional on being alive at age x+h, which has probability hP11x + hP12x when starting from state 1 and hP21x + hP22x when starting from state 2. The curves hP12x/(hP11x + hP12x) and hP22x/(hP21x + hP22x) converge with h to the stationary prevalence of disability. In order to get the stationary prevalence at age 70 we should start the process at an earlier age, e.g. 50. If the disability state is defined by severe disability criteria with only a small chance of recovery, then the incidence of recovery is low and the time to convergence is probably longer. But we don't have experience yet.
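The convergence shown in this graph can be imitated with the estimated parameters listed in rbiaspar.txt. The sketch below is an illustration only: it assumes monthly steps, age in years, and a model with intercept and age only, and its function names are hypothetical. It is not guaranteed to reproduce the program's figures exactly.

```python
import math

# Estimated parameters (a_ij, b_ij) as reported in rbiaspar.txt above.
PARAMS = {(1, 2): (-12.290174, 0.092161), (1, 3): (-9.155590, 0.046627),
          (2, 1): (-2.629849, -0.022030), (2, 3): (-7.958519, 0.042614)}

def step(age):
    """Elementary transition probabilities from the two alive states at a given age."""
    rows = []
    for i in (1, 2):
        odds = {j: math.exp(PARAMS[(i, j)][0] + PARAMS[(i, j)][1] * age)
                for j in (1, 2, 3) if j != i}
        stay = 1.0 / (1.0 + sum(odds.values()))
        rows.append([stay if j == i else stay * odds[j] for j in (1, 2, 3)])
    return rows

def disability_share(start_age, end_age, initial_state, stepm=1):
    """Proportion disabled among survivors at end_age, given the state at start_age."""
    alive = [1.0, 0.0] if initial_state == 1 else [0.0, 1.0]
    n_steps = int(round((end_age - start_age) * 12 / stepm))
    for k in range(n_steps):
        p = step(start_age + k * stepm / 12.0)
        alive = [alive[0] * p[0][0] + alive[1] * p[1][0],
                 alive[0] * p[0][1] + alive[1] * p[1][1]]
    return alive[1] / (alive[0] + alive[1])

from_healthy = disability_share(50, 70, initial_state=1)
from_disabled = disability_share(50, 70, initial_state=2)
print(f"{from_healthy:.5f} {from_disabled:.5f}")
```

The closer the two printed values, the less the prevalence at age 70 depends on the state occupied at age 50.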

+ +

- +Life expectancies by age and initial health status: erbiaspar.txt

+ +

# Health expectancies

+ +

# Age 1-1 1-2 2-1 2-2

+ +

70 10.9226 3.0401 5.6488 6.2122

+ +

71 10.4384 3.0461 5.2477 6.1599

+ +

72 9.9667 3.0502 4.8663 6.1025

+ +

73 9.5077 3.0524 4.5044 6.0401

+ +

For example 70 10.9226 3.0401 5.6488 6.2122 means:

+ +

e11=10.9226 e12=3.0401 e21=5.6488 e22=6.2122

+ +

For example, the life expectancy of a healthy individual at age 70 is 10.92 in the healthy state and 3.04 in the disability state (=13.96 years). If he was disabled at age 70, his life expectancy will be shorter, 5.65 in the healthy state and 6.21 in the disability state (=11.86 years). The total life expectancy is a weighted mean of both, 13.96 and 11.86; the weight is the proportion of people disabled at age 70. In order to get a pure period index (i.e. based only on incidences) we use the computed or stationary prevalence at age 70 (i.e. computed from incidences at earlier ages) instead of the observed prevalence (for example at first exam) (see below).

+ +

- +Variances of life expectancies by age and initial health status: vrbiaspar.txt

+ +

For +example, the covariances of life expectancies Cov(ei,ej) at age +50 are (line 3)

+ +

   Cov(e1,e1)=0.4776  Cov(e1,e2)=0.0488=Cov(e2,e1)  Cov(e2,e2)=0.0424

+ +

- +Health expectancies with +standard errors in parentheses: trbiaspar.txt

+ +

#Total LEs with variances: e.. (std) e.1 (std) e.2 (std)

+ +

70 13.76 (0.22) 10.40 (0.20) 3.35 (0.14)

+ +

Thus, at age 70 the total life expectancy, e..=13.76 years, is the weighted mean of e1.=13.96 and e2.=11.86 by the stationary prevalences at age 70, which are 0.90134 in state 1 and 0.09866 in state 2, respectively (the sum is equal to one). e.1=10.40 is the Disability-free life expectancy at age 70 (it is again a weighted mean of e11 and e21). Likewise, e.2=3.35 is the life expectancy at age 70 to be spent in the disability state.
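These weighted means can be checked by hand or with a few lines of code, using the age-70 values of erbiaspar.txt and the stationary prevalences of plrbiaspar.txt quoted earlier. The variable names are only illustrative.

```python
# Health expectancies at age 70 by initial state i and state occupied j (erbiaspar.txt).
e = {(1, 1): 10.9226, (1, 2): 3.0401, (2, 1): 5.6488, (2, 2): 6.2122}
# Stationary prevalences at age 70 (plrbiaspar.txt), used as weights.
w = {1: 0.90134, 2: 0.09866}

e_dot1 = sum(w[i] * e[(i, 1)] for i in (1, 2))  # disability-free life expectancy
e_dot2 = sum(w[i] * e[(i, 2)] for i in (1, 2))  # life expectancy with disability
e_total = e_dot1 + e_dot2                        # total life expectancy

print(f"{e_total:.2f} {e_dot1:.2f} {e_dot2:.2f}")  # 13.76 10.40 3.35
```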

+ +

-Total +life expectancy by age and health expectancies in states +(1=healthy) and (2=disable): ebiaspar1.gif

+ +

This figure represents the health expectancies and the total life expectancy, with the confidence interval as a dashed curve.

+ +

Standard deviations (obtained from the information matrix of the model) of these quantities are very useful. Cross-longitudinal surveys are costly and do not involve huge samples, generally a few thousand individuals; therefore it is very important to have an idea of the standard deviation of our estimates. It has been a big challenge to compute the Health Expectancy standard deviations. Do not be confused: life expectancy is, as any expected value, the mean of a distribution; but here we are not computing the standard deviation of the distribution, but the standard deviation of the estimate of the mean.
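A standard deviation of the estimate translates into an approximate confidence interval. The sketch below assumes a normal approximation with the usual 1.96 multiplier for a 95% interval; the manual does not state which multiplier the program's graphs use, and the function name is hypothetical.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Approximate interval: estimate plus or minus z standard errors."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

# Total life expectancy at age 70 and its standard error, from trbiaspar.txt.
low, high = confidence_interval(13.76, 0.22)
print(f"{low:.2f} to {high:.2f}")  # 13.33 to 14.19
```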

+ +

Our health expectancy estimates vary according to the sample size (and the standard deviations give confidence intervals of the estimate) but also according to the model fitted. Let us explain this in more detail.

+ +

Choosing a model means at least two kinds of choices. First we have to decide the number of disability states. Second we have to design, within the logit model family, the model itself: variables, covariables, confounding factors etc. to be included.

+ +

The more disability states we have, the better our demographic approach to the disability process, but the smaller the number of transitions between each pair of states and the higher the noise in the measurement. We do not have enough experience with the various models to summarize the advantages and disadvantages, but it is important to say that even if we had huge and unbiased samples, the total life expectancy computed from a cross-longitudinal survey varies with the number of states. If we define only two states, alive or dead, we find the usual life expectancy, where it is assumed that at each age people are at the same risk of dying. If we differentiate the alive state into healthy and disabled, and as the mortality from the disability state is higher than the mortality from the healthy state, we introduce heterogeneity in the risk of dying. The total mortality at each age is the weighted mean of the mortality in each state by the prevalence in each state. Therefore if the proportion of people at each age and in each state is different from the stationary equilibrium, there is no reason to find the same total mortality at a particular age. Life expectancy, even if it is a very useful tool, relies on a very strong hypothesis of homogeneity of the population. Our main purpose is not to measure differential mortality but to measure the expected time in a healthy or disability state in order to maximise the former and minimise the latter. But the differential in mortality complicates the measurement.

+ +

Incidences of disability or recovery are not affected by the number of states if these states are independent. But incidence estimates depend on the specification of the model. The more covariates we add to the logit model, the better the model is, but some covariates are not well measured and some are confounding factors, as in any statistical model. The procedure to "fit the best model" is similar to logistic regression, which itself is similar to regression analysis. We have not yet gone that far because we also have a severe limitation, which is the speed of convergence. On a Pentium III, 500 MHz, even the simplest model, estimated by month on 8,000 people, may take 4 hours to converge. Also, the program is not yet a statistical package that permits a simple writing of the variables and the model to take into account in the maximisation. The current program only allows adding simple variables like age+sex or age+sex+age*sex and will never be general enough. But what is to be remembered is that incidences, or probabilities of change from one state to another, are affected by the variables specified in the model.

+ +

Also, the age range of the people interviewed is linked to the age range over which life expectancy can be estimated by extrapolation. If your sample ranges from age 70 to 95, you can clearly estimate a life expectancy at age 70 and trust your confidence interval, which is mostly based on your sample size. But if you want to estimate the life expectancy at age 50, you have to rely on your model, and fitting a logistic model on an age range of 70-95 and then estimating transition probabilities outside this age range, say at age 50, is very dangerous. At least you should remember that the confidence interval given by the standard deviation of the health expectancies holds under the strong assumption that your model is the 'true model', which is probably not the case.

+ +

- +Copy of the parameter file: orbiaspar.txt

+ +

This +copy of the parameter file can be useful to re-run the program +while saving the old output files.

+ +

- +Prevalence forecasting: frbiaspar.txt

+ +

First, we have estimated the observed prevalence between 1/1/1984 and 1/6/1988. The mean date of interview (weighted average of the interviews performed between 1/1/1984 and 1/6/1988) is estimated to be 13/9/1985, as written at the top of the file. Then we forecast the probability of being in each state.

+ +

Example, at date 1/1/1989:

+ +

# StartingAge FinalAge P.1 P.2 P.3

+ +

# Forecasting at date 1/1/1989

+ +

73 0.807 0.078 0.115

+ +

Since the minimum age is 70 on 13/9/1985, the youngest forecasted age is 73. This means that a person aged 70 on 13/9/1985 has a probability of 0.807 of being in state 1 at age 73 on 1/1/1989. Similarly, the probability of being in state 2 is 0.078 and the probability of having died is 0.115. Then, on 1/1/1989, the prevalence of disability at age 73 is estimated to be 0.088, i.e. 0.078/(0.807+0.078).
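The arithmetic behind these figures can be checked with a short Python sketch. It only reuses the numbers quoted for age 73; the conversion of days to years with 365.25 and the variable names are assumptions made for the illustration, not the program's own computation.

```python
from datetime import date

# Row quoted for age 73 in the forecast at date 1/1/1989.
p_state1, p_state2, p_dead = 0.807, 0.078, 0.115

# Prevalence of disability is computed among survivors only.
prevalence_disability = p_state2 / (p_state1 + p_state2)

# Youngest forecasted age: minimum age 70 at the mean interview date,
# plus the time elapsed until the forecast date (assumed 365.25 days/year).
mean_interview = date(1985, 9, 13)
forecast_date = date(1989, 1, 1)
elapsed_years = (forecast_date - mean_interview).days / 365.25
youngest_age = int(70 + elapsed_years)   # age in completed years

print(round(prevalence_disability, 3), youngest_age)
```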

+ +

- +Population forecasting: poprbiaspar.txt

+ +

# Age P.1 P.2 P.3 [Population]

+ +

# Forecasting at date 1/1/1989

+ +

75 572685.22 83798.08

+ +

74 621296.51 79767.99

+ +

73 645857.70 69320.60

+ +

# Forecasting at date 1/1/1990

+ +

76 442986.68 92721.14 120775.48

+ +

75 487781.02 91367.97 121915.51

+ +

74 512892.07 85003.47 117282.76

+ +

From the population file, we estimate the number of people in each state. At age 73, 645857 persons are in state 1 and 69320 are in state 2. One year later, 512892 are still in state 1, 85003 are in state 2 and 117282 died before 1/1/1990.
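As a consistency check on this reading of the file, the following Python sketch follows each cohort from 1/1/1989 to 1/1/1990. The figures are copied from the two forecasts shown; the dictionary layout and the helper `follow_cohort` are hypothetical and only serve the illustration.

```python
# Rows copied from the population forecasts: age -> counts per column.
pop_1989 = {
    73: (645857.70, 69320.60),
    74: (621296.51, 79767.99),
    75: (572685.22, 83798.08),
}
pop_1990 = {
    74: (512892.07, 85003.47, 117282.76),
    75: (487781.02, 91367.97, 121915.51),
    76: (442986.68, 92721.14, 120775.48),
}


def follow_cohort(age):
    """Return the cohort size on 1/1/1989, the total accounted for one
    year later (state 1 + state 2 + dead) and the prevalence of state 2
    among survivors on 1/1/1990."""
    start = sum(pop_1989[age])
    state1, state2, dead = pop_1990[age + 1]
    return start, state1 + state2 + dead, state2 / (state1 + state2)


for age in sorted(pop_1989):
    start, end, prevalence = follow_cohort(age)
    print(age, round(start, 2), round(end, 2), round(prevalence, 4))
```

For each age the two totals agree, so the third column of the 1/1/1990 forecast accounts for the members of the cohort who died during the year.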

+ +

Trying an example

+ +

Now that you know how to run the program, it is time to test it on your own computer. Try, for example, a parameter file named imachpar.txt, which is a copy of mypar.txt included in mytry, a subdirectory of imach. Edit it to change the name of the data file to ..\data\mydata.txt if you don't want to copy the data file into the same directory. The file mydata.txt is a smaller file of 3,000 people but still with 4 waves.

+ +

Click on the imach.exe icon to open a window. Answer the question: 'Enter the parameter file name:'

+ + + + + +

IMACH, + Version 0.7

Enter + the parameter file name: ..\mytry\imachpar.txt

+ +

Most of the data files or image files generated will have the 'imachpar' string in their names. The running time is about 2-3 minutes on a Pentium III. If the execution worked correctly, the output files are created in the current directory and should be the same as the mypar files initially included in the directory mytry.

+ +

· Output on the screen: the output screen looks like this log file.

+ +

#title=MLE datafile=..\data\mydata.txt lastobs=3000 firstpass=1 lastpass=3

+ +

ftol=1.000000e-008 stepm=24 ncov=2 nlstate=2 ndeath=1 maxwav=4 mle=1 weight=0

+ +

Total number of individuals= 2965, Agemin = 70.00, Agemax= 100.92

+ +

Warning, no any valid information for:126 line=126

+ +

Warning, no any valid information for:2307 line=2307

+ +

Delay (in months) between two waves Min=21 Max=51 Mean=24.495826

+ +

These lines give some warnings on the data file and also some raw statistics on frequencies of transitions.

+ +

Age 70 1.=230 loss[1]=3.5% 2.=16 loss[2]=12.5% 1.=222 prev[1]=94.1% 2.=14

+ +

 prev[2]=5.9% 1-1=8 11=200 12=7 13=15 2-1=2 21=6 22=7 23=1

+ +

Age 102 1.=0 loss[1]=NaNQ% 2.=0 loss[2]=NaNQ% 1.=0 prev[1]=NaNQ% 2.=0
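The percentages on the "Age 70" line can be reproduced from its raw counts with a small Python sketch. The interpretation used here is an assumption inferred from the numbers themselves: `1-1` and `2-1` are taken to count people whose next status is unknown (lost), and `prev` is taken over people with a known next status. The helper `summarize` is hypothetical.

```python
# Transition counts copied from the "Age 70" line of the screen output.
counts = {"1-1": 8, "11": 200, "12": 7, "13": 15,
          "2-1": 2, "21": 6, "22": 7, "23": 1}


def summarize(counts, state):
    """Return (total, lost, known) for people starting in `state`."""
    lost = counts[f"{state}-1"]   # assumed: next status unknown
    known = sum(counts[f"{state}{dest}"] for dest in (1, 2, 3))
    return known + lost, lost, known


total1, lost1, known1 = summarize(counts, 1)
total2, lost2, known2 = summarize(counts, 2)

loss1 = 100 * lost1 / total1
loss2 = 100 * lost2 / total2
prev1 = 100 * known1 / (known1 + known2)
prev2 = 100 * known2 / (known1 + known2)

print(total1, round(loss1, 1), total2, round(loss2, 1))
print(known1, round(prev1, 1), known2, round(prev2, 1))
```

Under this reading the sketch gives back 230, 3.5%, 16 and 12.5% for the first group of figures and 222, 94.1%, 14 and 5.9% for the second. At age 102 every count is zero, which is why the program prints NaNQ for the ratios.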

+ +

Maximisation with the Powell algorithm. 8 directions are given, corresponding to the 8 parameters. Convergence can take rather long.
+
+ Powell iter=1 -2*LL=11531.405658264877 1 0.000000000000 2 + 0.000000000000 3
+ 0.000000000000 4 0.000000000000 5 0.000000000000 6 + 0.000000000000 7
+ 0.000000000000 8 0.000000000000
+ 1..........2.................3..........4.................5.........
+ 6................7........8...............
+ Powell iter=23 -2*LL=6744.954108371555 1 -12.967632334283 +
+ 2 0.135136681033 3 -7.402109728262 4 0.067844593326
+ 5 -0.673601538129 6 -0.006615504377 7 -5.051341616718
+ 8 0.051272038506
+ 1..............2...........3..............4...........
+ 5..........6................7...........8.........
+ #Number of iterations = 23, -2 Log likelihood = + 6744.954042573691
+ # Parameters
+ 12 -12.966061 0.135117
+ 13 -7.401109 0.067831
+ 21 -0.672648 -0.006627
+ 23 -5.051297 0.051271
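To give a feel for what the estimated parameters mean, here is a minimal Python sketch that turns them into transition probabilities at a given age. It assumes a multinomial logit of the form log(p_ij / p_ii) = a_ij + b_ij * age, with the first printed value taken as the intercept and the second as the age slope; this form and the function `transition_probs` are assumptions made for the illustration and are not confirmed by this section of the text.

```python
import math

# (intercept, age slope) per transition, copied from the "# Parameters" lines.
params = {
    (1, 2): (-12.966061, 0.135117),
    (1, 3): (-7.401109, 0.067831),
    (2, 1): (-0.672648, -0.006627),
    (2, 3): (-5.051297, 0.051271),
}


def transition_probs(origin, age):
    """Probabilities of being in each state one step later, starting from
    `origin`, assuming log(p_ij / p_ii) = a_ij + b_ij * age."""
    odds = {dest: math.exp(a + b * age)
            for (start, dest), (a, b) in params.items() if start == origin}
    denom = 1.0 + sum(odds.values())
    probs = {dest: value / denom for dest, value in odds.items()}
    probs[origin] = 1.0 / denom   # staying in the origin state
    return probs


p_from_healthy = transition_probs(1, 70)
print({dest: round(p, 4) for dest, p in sorted(p_from_healthy.items())})
```

Under these assumptions, the probabilities of moving from state 1 to state 2 or to death at age 70 come out at roughly 3% and 6%, the same order of magnitude as the raw frequencies printed for age 70 (7 and 15 out of 222).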

+ +

·                Calculation of the hessian matrix. Wait...

+ +

12345678.12.13.14.15.16.17.18.23.24.25.26.27.28.34.35.36.37.38.45.46.47.48.56.57.58.67.68.78

+ +

Inverting the hessian to get the covariance matrix. Wait...

+ +

#Hessian matrix#

+ +

3.344e+002 2.708e+004 -4.586e+001 -3.806e+003 -1.577e+000 -1.313e+002 3.914e-001 3.166e+001

+ +

2.708e+004 2.204e+006 -3.805e+003 -3.174e+005 -1.303e+002 -1.091e+004 2.967e+001 2.399e+003

+ +

-4.586e+001 -3.805e+003 4.044e+002 3.197e+004 2.431e-002 1.995e+000 1.783e-001 1.486e+001

+ +

-3.806e+003 -3.174e+005 3.197e+004 2.541e+006 2.436e+000 2.051e+002 1.483e+001 1.244e+003

+ +

-1.577e+000 -1.303e+002 2.431e-002 2.436e+000 1.093e+002 8.979e+003 -3.402e+001 -2.843e+003

+ +

-1.313e+002 -1.091e+004 1.995e+000 2.051e+002 8.979e+003 7.420e+005 -2.842e+003 -2.388e+005

+ +

3.914e-001 2.967e+001 1.783e-001 1.483e+001 -3.402e+001 -2.842e+003 1.494e+002 1.251e+004

+ +

3.166e+001 2.399e+003 1.486e+001 1.244e+003 -2.843e+003 -2.388e+005 1.251e+004 1.053e+006

+ +

# Scales

+ +

12 1.00000e-004 1.00000e-006

+ +

13 1.00000e-004 1.00000e-006

+ +

21 1.00000e-003 1.00000e-005

+ +

23 1.00000e-004 1.00000e-005

+ +

# Covariance

+ +

  1 5.90661e-001

+ +

  2 -7.26732e-003 8.98810e-005

+ +

  3 8.80177e-002 -1.12706e-003 5.15824e-001

+ +

  4 -1.13082e-003 1.45267e-005 -6.50070e-003 8.23270e-005

+ +

  5 9.31265e-003 -1.16106e-004 6.00210e-004 -8.04151e-006 1.75753e+000

+ +

  6 -1.15664e-004 1.44850e-006 -7.79995e-006 1.04770e-007 -2.12929e-002 2.59422e-004

+ +

  7 1.35103e-003 -1.75392e-005 -6.38237e-004 7.85424e-006 4.02601e-001 -4.86776e-003 1.32682e+000

+ +

  8 -1.82421e-005 2.35811e-007 7.75503e-006 -9.58687e-008 -4.86589e-003 5.91641e-005 -1.57767e-002 1.88622e-004

+ +
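The covariance matrix above is printed as a lower triangle, one row per parameter. A short Python sketch, assuming the rows follow the order of the 8 parameters in the Powell output, recovers the standard error of each parameter as the square root of the diagonal term and the correlation between two parameters. The helper `correlation` is hypothetical.

```python
import math

# Lower triangle copied from the "# Covariance" block (row index omitted).
lower_triangle = """\
5.90661e-001
-7.26732e-003 8.98810e-005
8.80177e-002 -1.12706e-003 5.15824e-001
-1.13082e-003 1.45267e-005 -6.50070e-003 8.23270e-005
9.31265e-003 -1.16106e-004 6.00210e-004 -8.04151e-006 1.75753e+000
-1.15664e-004 1.44850e-006 -7.79995e-006 1.04770e-007 -2.12929e-002 2.59422e-004
1.35103e-003 -1.75392e-005 -6.38237e-004 7.85424e-006 4.02601e-001 -4.86776e-003 1.32682e+000
-1.82421e-005 2.35811e-007 7.75503e-006 -9.58687e-008 -4.86589e-003 5.91641e-005 -1.57767e-002 1.88622e-004
"""

rows = [[float(x) for x in line.split()] for line in lower_triangle.splitlines()]

# Standard error of parameter i is the square root of the i-th diagonal term.
std_errors = [math.sqrt(row[i]) for i, row in enumerate(rows)]


def correlation(i, j):
    """Correlation between parameters i and j (0-based, i > j)."""
    return rows[i][j] / (std_errors[i] * std_errors[j])


print([round(se, 4) for se in std_errors])
print(round(correlation(1, 0), 4))
```

The first two parameters are very strongly negatively correlated, which is common for the intercept and age slope of a logit fitted on ages far from zero.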

# agemin agemax for lifexpectancy, bage fage (if mle==0 ie no data nor Max likelihood).

+ +

agemin=70 agemax=100 bage=50 fage=100

+ +

Computing prevalence limit: result on file 'plrmypar.txt'

+ +

Computing pij: result on file 'pijrmypar.txt'

+ +

Computing Health Expectancies: result on file 'ermypar.txt'

+ +

Computing Variance-covariance of DFLEs: file 'vrmypar.txt'

+ +

Computing Total LEs with variances: file 'trmypar.txt'

+ +

Computing Variance-covariance of Prevalence limit: file 'vplrmypar.txt'

+ +

End of Imach

+ +

Once the run is finished, the program waits for a character:

+ + + + + +

Type + e to edit output files, c to start again, and q for + exiting:

+ +

First +you should enter e to edit the master file +mypar.htm.

+ +

Output files
+
+ - Observed prevalence in each state: pmypar.txt
+ - Estimated parameters and the covariance matrix: rmypar.txt
+ - Stationary prevalence in each state: plrmypar.txt
+ - Transition probabilities: pijrmypar.txt
+ - Copy of the parameter file: ormypar.txt
+ - Life expectancies by age and initial health status: ermypar.txt
+ - Variances of life expectancies by age and initial + health status: vrmypar.txt +
+ - Health expectancies with their variances: trmypar.txt
+ - Standard deviation of stationary prevalence: vplrmypar.txt
+ - Prevalence forecasting: frmypar.txt +
+ - Population forecasting (if popforecast=1): poprmypar.txt
Graphs +
+
+ -One-step transition + probabilities
+ -Convergence to the + stationary prevalence
+ -Observed and stationary + prevalence in state (1) with the confidence interval
+ -Observed and stationary + prevalence in state (2) with the confidence interval
+ -Health life + expectancies by age and initial health state (1)
+ -Health life + expectancies by age and initial health state (2)
+ -Total life expectancy by + age and health expectancies in states (1) and (2).
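The output file names listed above all follow the same pattern: a short prefix followed by the name of the parameter file. The following Python sketch builds the expected list for a given parameter file. The prefixes and descriptions are copied from the list; the function `expected_outputs` is hypothetical and not part of IMaCh.

```python
from pathlib import PureWindowsPath

# Prefix -> description, as given in the list of output files.
OUTPUT_PREFIXES = {
    "p": "Observed prevalence in each state",
    "r": "Estimated parameters and the covariance matrix",
    "plr": "Stationary prevalence in each state",
    "pijr": "Transition probabilities",
    "or": "Copy of the parameter file",
    "er": "Life expectancies by age and initial health status",
    "vr": "Variances of life expectancies by age and initial health status",
    "tr": "Health expectancies with their variances",
    "vplr": "Standard deviation of stationary prevalence",
    "fr": "Prevalence forecasting",
    "popr": "Population forecasting (if popforecast=1)",
}


def expected_outputs(parameter_file):
    """Map each expected output file name to its description."""
    # Windows-style paths such as ..\mytry\imachpar.txt are handled explicitly.
    stem = PureWindowsPath(parameter_file).stem
    return {f"{prefix}{stem}.txt": description
            for prefix, description in OUTPUT_PREFIXES.items()}


for name, description in expected_outputs(r"..\mytry\imachpar.txt").items():
    print(f"{name:20s} {description}")
```

Such a list can be used after a run to check that every expected file was created in the current directory.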

+ +

This software has been partly funded by Euro-REVES, a concerted action of the European Union. It will be copyrighted identically to a GNU software product, i.e. the program and software can be distributed freely for non-commercial use. Sources are not widely distributed today. You can get them by asking us, with a simple justification (name, email, institute), at mailto:brouard@ined.fr and mailto:lievre@ined.fr .

+ +

The latest version (0.7 of February 2002) can be accessed at http://euroreves.ined.fr/imach

+ + diff --git a/html/index.htm b/html/index.htm index 28bd56a..41f794e 100644 --- a/html/index.htm +++ b/html/index.htm @@ -47,35 +47,26 @@ National University, Canberra). href="mailto:lievre@ined.fr">lievre@ined.fr) -

Main publication concerning the method is -Lièvre A., Brouard N. and Heathcote Ch. (2003) Estimating Health Expectancies -from Cross-longitudinal surveys. Mathematical Population Studies.- 10(4), pp. 211-248 +

Main publication concerning the method is +Lièvre A., Brouard N. and Heathcote Ch. (2003) Estimating Health Expectancies +from Cross-longitudinal surveys. Mathematical Population Studies.- 10(4), pp. 211-248

Installation

Since the program produces many output files, we suggest to -have a separate directory for imach.
+have a separate directory for imach. A classical installation for Windows is on Program Files\imach

Uncompress the zip file imach.zip - into an empty directory, imach, for - example.
Different sub-directories are created:
- Download latest imach-0.97b.exe + and execute it .
- Different sub-directories are created: +
  - doc: most of the documentation. The main document is doc/imach.htm .
  - bin:
    - imach.exe the - executable for Windows 95/98/NT compiled - with gcc from cygwin.
    - graph.gp: an output file which is used by - the grapher, gnuplot, described next:
    -
  - data: Here are two data files:
    Here are also two data files:
    - data1.txt which is the main data file on which the @@ -85,15 +76,14 @@ have a separate directory for imach.
      smaller data file which you can use for your own trial.
    +
  - bin:
    - imach.exe the + executable for Windows 95/98/NT compiled + with gcc from cygwin.
    - gnuplot, the grapher use by IMaCh and described next
@@ -118,21 +107,28 @@ have a separate directory for imach.
here to access to the detailed documentation

This software have been partly granted by Euro-REVES, a concerted -action from the European Union. Our work is copyrighted as -a GNU software product, i.e. program and software -can be distributed freely for non commercial use, but actually some sources are -not widely distributed today because they borrow some code from the book "Numerical Recipes in C" which is copyrighted. If you are an owner of theses sources you can get our source by asking us with -a simple justification (name, email, Institute) mailto:imach-dev@listes.ined.fr
+href="http://euroreves.ined.fr">Euro-REVES, a concerted action +from the European Union. In 2003-2004 it has been granted by the +French Institute on Longevity. Our work is copyrighted as a GNU +software product, i.e. program and software can be distributed freely +for non commercial use, but actually some sources are not widely +distributed today because they borrow some codes from the book +"Numerical Recipes in C" which is copyrighted. If you are an owner of +theses sources you can get our sources by asking us with a simple +justification (name, email, Institute) mailto:imach-dev@listes.ined.fr +
-
Today we are two developpers only but we already use a private CVS server. The CVS server will be freely accessible as soon as we have replaced "Numerical Recipes in C maximization routines" with equivalent routines from the new GNU scientific library. +
Today we are two developpers only but we already use a private CVS +server. The CVS server will be freely accessible as soon as we have +replaced "Numerical Recipes in C maximization routines" with +equivalent routines from the new GNU scientific library.
Latest documentation can be accessed at http://euroreves.ined.fr/imach
-Imach version (0.63 of 16 march 2000) can be downloaded in zip -file http://euroreves.ined.fr/imach/imach.zip
-Imach version 0.64 May 2001 can be downloaded in zip file http://euroreves.ined.fr/imach/imach.zip +
Imach version 0.64 May 2001 can be downloaded in zip file http://euroreves.ined.fr/imach/imach-0-64.zip
Imach version 0.8 March 2002 can be downloaded in zip file http://euroreves.ined.fr/imach/imach-08a.zip
-Imach latest version 0.96d February 2004 can be downloaded in zip file http://euroreves.ined.fr/imach/imach-096d.zip +
+Imach version 0.97b of June 2004 can be downloaded as a setup.exe file + +http://euroreves.ined.fr/imach/imach-0.97b-setup.exe. The IMaCh +program and gnuplot will be installed in the directory that you want +(usually in Program Files).
-
New (March 2002). We set up a public mailing list of IMaCh's users. -You can subscribe -by sending a mail to imach-users-subscribe@listes.ined.fr (and unsubscribe with imach-users-unsubscribe@listes.ined.fr +
We set up a public mailing list of IMaCh's users. You can +subscribe by sending a mail to imach-users-subscribe@listes.ined.fr +(and unsubscribe with imach-users-unsubscribe@listes.ined.fr
-- 2.43.0
