The National Health and Nutrition Examination Survey (NHANES) is a research program designed to assess the health and nutritional status of adults and children in the United States. What makes the survey unique is that it combines interviews with physical examinations. It is run by the U.S. Centers for Disease Control and Prevention (CDC), which is responsible for providing health statistics to the nation.
NHANES interviews cover demographic, socioeconomic, dietary, and health-related questions. The examination component includes medical, dental, and physiological measurements, as well as laboratory tests performed by trained medical personnel.
In recent years, the number of papers published from public databases has grown year after year. NHANES data are of high quality, and the number of NHANES papers published each year is even higher, including many on novel composite indicators.
Today's article reproduces a paper on composite indicators in the NHANES database. It includes all of the code, and the processed data are provided as well.
1. Introduction to the reproduced article
The article reproduced today was published in Nutrition, Metabolism and Cardiovascular Diseases (IF = 3.9) under the title "Association of Life's Essential 8 with all-cause and cardiovascular mortality among US adults: a prospective cohort study from the NHANES 2005-2014".
In short, the study examines the relationship between Life's Essential 8 and all-cause and cardiovascular mortality in US adults, using a prospective cohort from NHANES 2005-2014.
Life's Essential 8
Life's Essential 8 (LE8) is one of the most frequently used composite indicators in NHANES papers in recent years. It has eight components: diet, physical activity, nicotine exposure (smoking), sleep health, BMI, blood lipids, blood glucose, and blood pressure. Each component has a new scoring algorithm (0-100 points), and the components are finally combined into a new composite cardiovascular health (CVH) score (0-100 points).
An overall score of < 50, 50-79, and ≥ 80 points indicates low, moderate, and high cardiovascular health, respectively.
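The scoring rule above can be sketched in a few lines of R. This is a minimal illustration, not the article's code: the component scores are made-up values, the column names are hypothetical, and the composite is assumed to be the unweighted mean of the eight component scores.

```r
# Hypothetical component scores (0-100) for three participants; values are made up.
le8 <- data.frame(
  diet              = c(40, 80, 25),
  physical_activity = c(60, 100, 0),
  nicotine          = c(100, 100, 25),
  sleep             = c(70, 90, 40),
  bmi               = c(30, 100, 15),
  lipids            = c(60, 80, 40),
  glucose           = c(60, 100, 30),
  blood_pressure    = c(50, 75, 25)
)

# Composite CVH score: unweighted mean of the eight components (assumption).
le8$cvh_score <- rowMeans(le8)

# Cut points from the text: < 50 low, 50-79 moderate, >= 80 high.
# right = FALSE makes the intervals [-Inf, 50), [50, 80), [80, Inf).
le8$cvh_group <- cut(
  le8$cvh_score,
  breaks = c(-Inf, 50, 80, Inf),
  right  = FALSE,
  labels = c("Low", "Moderate", "High")
)

print(le8[, c("cvh_score", "cvh_group")])
```

Using half-open intervals means a non-integer score such as 79.5 is still classed as moderate; adjust the breaks if the scores are rounded first.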
Data used in the article
The NHANES variables used in the study are shown in the table below. The variables used in this reproduction are consistent with those in the article.
Reply "241 reproduction article" to the "Medicine** with statistical analysis" account to get all the code and data.
2. Reproduction in R
The statistical methods covered in this reproduction are:
Baseline difference analysis; Kaplan-Meier (KM) curves; Cox regression with multiple models to control for confounding; trend analysis (P for trend); restricted cubic spline (RCS) curves.
Data import and preprocessing
First, we import the processed data extracted from the NHANES database. The reproduction dataset contains 19,481 participants (the original article had n = 23,110), so the sample size differs somewhat. Please focus on how the statistical methods are used rather than on matching the exact numbers.
Baseline difference analysis
The baseline table in this reproduction is built with the tableone package. Here "myvars" lists all variables in the baseline table. Some of them are categorical and must be specified through "catvars"; otherwise categorical data will be summarized as if they were quantitative.
tab2 and tab3 show two ways of producing the descriptive statistics. tab2 does not specify a grouping variable, so only the distribution of each variable is shown. tab3 uses "strata =" to specify the grouping variable and, in addition to the distributions, adds a comparison of differences between groups.
In addition, "showAllLevels = TRUE" displays the results for every level of each categorical variable. Quantitative variables listed in "nonnormal =" are analyzed as skewed distributions; if all quantitative variables are skewed, this can be written concisely as "nonnormal = TRUE".
Finally, the baseline table is saved to the working directory; here we save it in CSV format.
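The steps above can be sketched as follows. This is a minimal example on simulated data, not the article's dataset: the variables age, sex, and pir and the output file name are hypothetical, while myvars, catvars, tab2, and tab3 follow the names used in the text.

```r
library(tableone)

# Simulated stand-in for the processed NHANES data (all values are made up).
set.seed(241)
n <- 300
dat <- data.frame(
  age  = round(rnorm(n, 50, 15)),
  sex  = sample(c(1, 2), n, replace = TRUE),   # category stored as a number
  pir  = rexp(n, rate = 0.5),                  # skewed continuous variable
  cvh1 = sample(c("Low", "Moderate", "High"), n, replace = TRUE)
)

myvars  <- c("age", "sex", "pir")  # all variables shown in the baseline table
catvars <- c("sex")                # variables to be treated as categorical

# tab2: overall description only; tab3: description by group plus tests.
tab2 <- CreateTableOne(vars = myvars, factorVars = catvars, data = dat)
tab3 <- CreateTableOne(vars = myvars, factorVars = catvars,
                       strata = "cvh1", data = dat)

# showAllLevels prints every level of each categorical variable;
# nonnormal names the variables summarized as median [IQR].
out <- print(tab3, showAllLevels = TRUE, nonnormal = "pir",
             printToggle = FALSE)

# Save to the working directory (file name is hypothetical).
write.csv(out, file = "baseline_table.csv")
```

Setting printToggle = FALSE returns the formatted table as a character matrix without printing it, which is convenient when the only goal is to export it.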
The results in CSV format are displayed as follows:
Draw a KM curve
The KM curve is drawn with the survival and survminer packages. If a log-rank test is needed separately, the survdiff function does it directly.
The last line of the output reads p < 2e-16; the conventional way to report this is P < 0.001.
Here the survfit function builds the model and the ggsurvplot function handles the drawing and styling of the figure. Many parameters can be adjusted; they are annotated in the code, and you can change them as needed.
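A minimal sketch of this workflow on simulated data is shown below. The data frame, the variable names time, status, and cvh1, and the axis and legend labels are all assumptions for illustration, not the article's own code.

```r
library(survival)
library(survminer)

# Simulated data: higher CVH group -> lower hazard (values are made up).
set.seed(241)
n <- 300
dat <- data.frame(
  cvh1 = factor(sample(c("Low", "Moderate", "High"), n, replace = TRUE),
                levels = c("Low", "Moderate", "High"))
)
rate <- c(Low = 0.10, Moderate = 0.06, High = 0.03)[as.character(dat$cvh1)]
event_time <- rexp(n, rate)
dat$time   <- pmin(event_time, 10)             # administrative censoring at 10
dat$status <- as.integer(event_time <= 10)     # 1 = event, 0 = censored

# Log-rank test; the P value is recomputed from the chi-square statistic.
lr <- survdiff(Surv(time, status) ~ cvh1, data = dat)
p_logrank <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# Report very small P values as "P < 0.001" rather than in scientific notation.
p_label <- if (p_logrank < 0.001) "P < 0.001" else sprintf("P = %.3f", p_logrank)

# Fit and plot the KM curves.
fit <- survfit(Surv(time, status) ~ cvh1, data = dat)
p <- ggsurvplot(
  fit, data = dat,
  pval         = p_label,                  # text printed on the plot
  risk.table   = TRUE,                     # number-at-risk table under the plot
  conf.int     = FALSE,
  xlab         = "Follow-up time (years)",
  legend.title = "CVH"
)
print(p)
```

Passing a preformatted string to pval keeps the reported value in the conventional "P < 0.001" form instead of the raw number.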
The resulting plot:
Cox regression
Here the survival package is used to fit the regression models. The autoReg package can tidy the output into a three-line table that is more intuitive and concise, and it also lets you customize the regression approach: "uni = TRUE" outputs the univariable results, and "threshold" defines the P-value threshold at which a screened variable enters the multivariable regression.
Finally, the rrtable package is used to export the result to Word; the Word file is also saved in the working directory.
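The multi-model idea can be sketched with the survival package alone. The example below uses simulated data and hypothetical covariates (age, sex, education); which covariates enter each model is an assumption, and the autoReg and rrtable formatting and export steps are left out.

```r
library(survival)

# Simulated data (all values are made up).
set.seed(241)
n <- 500
dat <- data.frame(
  cvh1 = factor(sample(c("Low", "Moderate", "High"), n, replace = TRUE),
                levels = c("Low", "Moderate", "High")),
  age  = rnorm(n, 50, 15),
  sex  = factor(sample(c("Male", "Female"), n, replace = TRUE)),
  education = factor(sample(c("Below HS", "HS", "Above HS"), n, replace = TRUE))
)
lp <- 0.03 * (dat$age - 50) - 0.4 * (as.integer(dat$cvh1) - 1)
event_time <- rexp(n, rate = 0.05 * exp(lp))
dat$time   <- pmin(event_time, 10)
dat$status <- as.integer(event_time <= 10)

# Model 1: crude; model 2: + age and sex; model 3: + education.
model1 <- coxph(Surv(time, status) ~ cvh1, data = dat)
model2 <- update(model1, . ~ . + age + sex)
model3 <- update(model2, . ~ . + education)

# Collect HR, 95% CI, and P value for the exposure terms of one model.
hr_table <- function(fit, label) {
  s    <- summary(fit)
  rows <- grep("^cvh1", rownames(s$coefficients))
  data.frame(
    model = label,
    term  = rownames(s$coefficients)[rows],
    HR    = s$conf.int[rows, "exp(coef)"],
    lower = s$conf.int[rows, "lower .95"],
    upper = s$conf.int[rows, "upper .95"],
    p     = s$coefficients[rows, "Pr(>|z|)"],
    row.names = NULL
  )
}

res <- rbind(hr_table(model1, "Model 1"),
             hr_table(model2, "Model 2"),
             hr_table(model3, "Model 3"))
print(res)
```

Stacking the exposure rows from each model side by side makes it easy to see whether the HR for the CVH groups stays stable as confounders are added.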
Taking model 3 as an example, the R output is as follows:
Trend analysis (P for trend)
There are two ways to calculate P for trend:
Include the ordinal or quantitative independent variable directly in the regression analysis, or use the median of each group as the value that represents that group in the trend analysis.
One of these is the more recommended approach. The two methods are demonstrated separately below:
Method 1: Include the ordinal or quantitative independent variable directly in the regression analysis
Here as.numeric(cvh1) converts the original categorical variable CVH1 into a numeric variable that enters the regression model directly; the rest of the code is the same as for ordinary Cox regression.
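A minimal sketch of method 1 on simulated data follows. The data and the adjustment for age are assumptions for illustration; only as.numeric(cvh1) comes from the text.

```r
library(survival)

# Simulated data (all values are made up).
set.seed(241)
n <- 500
dat <- data.frame(
  cvh1 = factor(sample(c("Low", "Moderate", "High"), n, replace = TRUE),
                levels = c("Low", "Moderate", "High")),  # levels in rank order
  age  = rnorm(n, 50, 15)
)
lp <- 0.03 * (dat$age - 50) - 0.4 * (as.integer(dat$cvh1) - 1)
event_time <- rexp(n, rate = 0.05 * exp(lp))
dat$time   <- pmin(event_time, 10)
dat$status <- as.integer(event_time <= 10)

# as.numeric() on a factor returns the level codes 1, 2, 3, so the factor
# levels must already be in the intended order.
fit_trend1 <- coxph(Surv(time, status) ~ as.numeric(cvh1) + age, data = dat)

# The P value of the numeric term is the P for trend.
p_trend1 <- summary(fit_trend1)$coefficients["as.numeric(cvh1)", "Pr(>|z|)"]
print(p_trend1)
```

If cvh1 is stored as character rather than as a factor, as.numeric() would return NA, so it is worth checking the column type before fitting.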
Results:
Method 2: Use the within-group median as the value for each group
Compared with method 1, there is one extra step before the regression analysis: the median of each group is computed and substituted for the group label. In the reproduction data, for example, CVH1 has 3 categories; replacing each category with its within-group median turns it into a special three-valued variable.
Because the data transformation uses the pipe operator %>%, the dplyr package must be loaded. After the transformation, the regression is run in the same way as in method 1, except that the transformed CVH3 is used instead of the original CVH1.
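A minimal sketch of method 2 on simulated data is shown below. The continuous score le8_score, the adjustment for age, and the lowercase column names cvh1 and cvh3 are assumptions for illustration.

```r
library(survival)
library(dplyr)

# Simulated data (all values are made up).
set.seed(241)
n <- 500
dat <- data.frame(
  le8_score = runif(n, 20, 100),   # hypothetical continuous CVH score
  age       = rnorm(n, 50, 15)
)
dat$cvh1 <- cut(dat$le8_score, breaks = c(-Inf, 50, 80, Inf), right = FALSE,
                labels = c("Low", "Moderate", "High"))
lp <- 0.03 * (dat$age - 50) - 0.02 * (dat$le8_score - 60)
event_time <- rexp(n, rate = 0.05 * exp(lp))
dat$time   <- pmin(event_time, 10)
dat$status <- as.integer(event_time <= 10)

# Replace each category with the median score of its own group.
dat <- dat %>%
  group_by(cvh1) %>%
  mutate(cvh3 = median(le8_score)) %>%
  ungroup()

# cvh3 now takes only three distinct values and enters the model as numeric.
fit_trend2 <- coxph(Surv(time, status) ~ cvh3 + age, data = dat)
p_trend2 <- summary(fit_trend2)$coefficients["cvh3", "Pr(>|z|)"]
print(p_trend2)
```

Unlike the level codes 1, 2, 3, the group medians preserve the actual spacing between categories, which is why the two methods can give slightly different P values.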
Taking model 3 as an example, the R output is as follows:
The results differ slightly from those of method 1, but the direction of the effect is generally consistent.
Plot the RCS curve
There are again two ways to draw the curve in R: the plotRCS package, or a combination of the rms and ggplot2 packages.
The plotRCS package makes it relatively easy to draw RCS plots and its parameters are easy to understand, but it gives less control over the finer details.
The resulting image shows:
1. Calculate the P value for nonlinearity and the HR values
2. Draw the RCS image
The ggplot2 package is much more flexible for drawing the image, for example when adding reference lines. "geom_hline" sets the position of a reference line on the vertical axis, and "linetype = 2" makes the line dashed. In the same way, "geom_vline" sets the position of a reference line on the horizontal axis; here "xintercept" should be the value of the variable at which HR = 1, which has to be looked up in the HR table calculated in the previous step.
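The two steps can be sketched with rms and ggplot2 as follows. This is a minimal example on simulated data: the variable le8_score, the choice of 4 knots, the adjustment for age, and the axis labels are assumptions, and the reference value is the datadist default (the median) rather than anything taken from the article.

```r
library(survival)
library(rms)
library(ggplot2)

# Simulated data (all values are made up).
set.seed(241)
n <- 800
dat <- data.frame(le8_score = runif(n, 20, 100), age = rnorm(n, 50, 15))
lp <- -0.02 * (dat$le8_score - 60) + 0.03 * (dat$age - 50)
event_time <- rexp(n, rate = 0.05 * exp(lp))
dat$time   <- pmin(event_time, 10)
dat$status <- as.integer(event_time <= 10)

# rms needs the data distribution registered before Predict() is called.
dd <- datadist(dat)
options(datadist = "dd")

# Step 1: fit the spline model, get P for nonlinearity and the HR table.
fit <- cph(Surv(time, status) ~ rcs(le8_score, 4) + age, data = dat,
           x = TRUE, y = TRUE)
an <- anova(fit)
p_nonlinear <- an[grep("Nonlinear", rownames(an))[1], "P"]

# ref.zero = TRUE centers the curve so that HR = 1 at the reference value.
hr_df <- as.data.frame(Predict(fit, le8_score, fun = exp, ref.zero = TRUE))

# Value of the exposure where the HR is closest to 1 (for the vertical line).
ref_x <- hr_df$le8_score[which.min(abs(hr_df$yhat - 1))]

# Step 2: draw the RCS curve with dashed reference lines.
p <- ggplot(hr_df, aes(x = le8_score, y = yhat)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
  geom_line() +
  geom_hline(yintercept = 1, linetype = 2) +      # HR = 1
  geom_vline(xintercept = ref_x, linetype = 2) +  # exposure value at HR = 1
  labs(x = "LE8 score", y = "HR (95% CI)")
print(p)
```

Converting the Predict() result to a plain data frame first keeps ggplot() from dispatching to the plotting method that rms defines for its own objects.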
The resulting image shows:
3. Reproduction on the Storm Statistics platform
The Storm Statistics platform is a statistical analysis platform based on R that offers fast, accurate, and simple operation. It provides menu-based operation for many statistical analysis methods, and the reproduction of this article gives a comprehensive tour of them. (Search for the "Storm Statistics" platform.)
Baseline difference analysis
Once you are on the Storm Statistics platform, click "Storm Smart Statistics" > "Regression controls for confounding bias" > "The multi-model approach controlled for confounding bias". This module is a one-stop solution for both baseline difference analysis and Cox regression multi-model construction.
Follow the on-screen prompts step by step, and a three-line baseline table appears on the right; the operation is very convenient.
The P values are consistent with the R results, and more statistics are displayed than in R.
Cox regression
Controlling for confounding bias with multiple models means building model 1, model 2, model 3, and so on, adjusting step by step for different confounders and observing how the P value of the core exposure changes.
At present, the Storm Statistics platform can build up to 4 multivariable models. In the "Multiple models control for confounding bias" module, select the regression model and then select the regression variables in turn; the results for model 1 are shown on the right.
Models 2 and 3 are then built in turn by following the prompts on the interface, and the final multi-model three-line table is generated directly on the right, which saves drawing the table and filling in the data yourself. The resulting HR values, 95% CIs, and P values are consistent with those from R.
Draw a KM curve
To plot the KM curve on Storm Statistics, go to "Storm Smart Statistics" > Survival analysis > Survival analysis full set. After importing the data, select the variables in turn in the "Survival curve vs. survival time" module, and the KM curve is generated on the right.
The remaining steps let you fine-tune the graph. The squeezed font in the risk table is only a display problem and returns to normal in the final output.
Trend analysis (P for trend)
Calculating P for trend requires transforming the data in advance, for example with the within-group median transformation. Then run the regression analysis on the platform with the quantified CVH1 as the core variable; the P value obtained is the P for trend.
Plot the RCS curve
To draw the RCS curve, you need to enter a different module: click "Storm Smart Statistics" > Xiaobai draws a beautiful statistical chart > Draw RCS curves with one click.
After re-importing the data, select the variables in turn following the prompts to get the RCS image. The P values are exactly the same as in R. This module also provides the R code, so after completing the analysis you can copy it for verification; HD output is supported as well.
This is the end of the reproduction. If you are interested in the code or the data, reply "241 reproduction article" to the "Medicine** with statistical analysis" account to get the full set of R code and practice data.