(50g) Working Multiple linear regression in English
04-20-2022, 03:50 AM
Post: #13
My code, with comments and references:
--- Installation ---

STOre the code below as directory MLR, using INOUT (see the end of this post for INOUT's description). You can call the directory anything else you like. In the directory you will find a program called MLR. MLR takes two arguments, X and Y. X is the design matrix: its first column is all 1s, and columns 2 to p+1 (p = number of independent variables) hold the values of the independent variables. Y is the response matrix, which contains the value of the dependent variable for each observation. MLR stores all calculated values into OUTPUT and also puts some important ones on the stack. It is similar to what R's lm (linear model) object does. RESET clears the contents of OUTPUT. Directory SAMPLE contains a sample X and Y matrix.

--- Code begin, cut this line ---
DIR
OUTPUT DIR END
SAMPLE DIR
Y [[ 10.2 ] [ 15.8 ] [ 19 ] [ 43.3 ] [ 1.02 ] [ 57.3 ]]
X [[ 1 1 1 1 ] [ 1 2 1 2 ] [ 1 4 2 1 ] [ 1 3 4 6 ] [ 1 0 0 0 ] [ 1 9 6 5 ]]
END
MLR
\<< 0. \-> X Y j
\<< OUTPUT
X TRAN DUP X * INV SWAP * Y * 'b' STO
X TRAN DUP X * INV X SWAP * SWAP * 'H' STO
X SIZE OBJ\-> DROP 1. - 'p' STO 'n' STO
n IDN 'I' STO
{ n n } 1. CON 'J' STO
H n INV J * - Y * Y TRAN SWAP * ABS 'SSR' STO
Y TRAN I H - Y * * ABS 'SSE' STO
n p - 1. - 'dfE' STO
p 'dfR' STO
n 1. - 'dfT' STO
SSR dfR / SSE dfE / / 'F' STO
dfR dfE F UTPF 'pF' STO
SSR SSE + 'SSTO' STO
SSR SSTO / 'R2' STO
1. SSE dfE / SSTO dfT / / - 'adjR2' STO
SSE dfE / '\Gs2' STO
\Gs2 X TRAN X * INV * 'C' STO
{ 'p+1.' 1. } 0. CON 'pi' STO
pi 1. p 1. + FOR j { j 1. } dfE b j GET C { j j } GET \v/ / ABS UTPT 2. * PUT NEXT
'pi' STO
b "b" \->TAG F "F" \->TAG pF "pF" \->TAG R2 "R2" \->TAG adjR2 "adjR2" \->TAG pi "p(i)" \->TAG
UPDIR
\>>
\>>
RESET
\<< 'OUTPUT' DUP PGDIR CRDIR \>>
END
--- Code end, cut this line ---

--- MLR code (same as above), this time with comments ---

This is just for informational purposes; you won't be able to input this code with the ### comments into the HP 50g.

MLR
\<< 0. \-> X Y j
### X is the design matrix. Its first column is all 1s; the remaining columns are the values of X1, X2, X3, ..., Xp.
### Each row of the X design matrix represents one observation of the predictors.
### Y is the response column vector, which has to be represented as an n x 1 matrix for the HP 50g's matrix algebra to work.
\<< OUTPUT
### Enter the OUTPUT directory. All values are stored there.
X TRAN DUP X * INV SWAP * Y * 'b' STO
### Calculate the regression's b hat coefficients (sometimes called the beta hat column vector): b = (X'X)^(-1) X'Y.
X TRAN DUP X * INV X SWAP * SWAP * 'H' STO
### Calculate the H ("hat") matrix: H = X (X'X)^(-1) X'.
X SIZE OBJ\-> DROP 1. - 'p' STO 'n' STO
### p is the number of predictor variables, which is the number of columns of the X design matrix minus 1.
### n is the number of observations, which is the number of rows of X (the same as the number of rows of the Y response matrix).
n IDN 'I' STO
### Create an n x n identity matrix.
{ n n } 1. CON 'J' STO
### Create an n x n matrix of 1s.
H n INV J * - Y * Y TRAN SWAP * ABS 'SSR' STO
### SSR = Y'(H - J/n)Y is the regression sum of squares. Sometimes SSR is denoted as SSM (sum of squares of the model).
Y TRAN I H - Y * * ABS 'SSE' STO
### SSE = Y'(I - H)Y is the error (residual) sum of squares.
n p - 1. - 'dfE' STO
### Degrees of freedom for the errors (residuals) = n - (p+1). The +1 comes from the fact that the intercept is estimated too.
p 'dfR' STO
### Degrees of freedom for the regression, sometimes referred to as the model.
n 1. - 'dfT' STO
### Total degrees of freedom.
SSR dfR / SSE dfE / / 'F' STO
### Calculate the F statistic for the entire regression model: F = (SSR/dfR) / (SSE/dfE).
dfR dfE F UTPF 'pF' STO
### Calculate the upper-tail probability (p value) of F with dfR and dfE degrees of freedom for the entire regression model.
SSR SSE + 'SSTO' STO
### SSTO is the total sum of squares (sometimes denoted as SST).
SSR SSTO / 'R2' STO
### Calculate R^2.
1. SSE dfE / SSTO dfT / / - 'adjR2' STO
### Calculate adjusted R^2 = 1 - (SSE/dfE) / (SSTO/dfT).
SSE dfE / '\Gs2' STO
### Calculate sigma^2, the estimated error variance.
\Gs2 X TRAN X * INV * 'C' STO
### Calculate the covariance matrix of the coefficients = sigma^2 * (X'X)^(-1).
{ 'p+1.' 1. } 0. CON 'pi' STO
### Create a (p+1) x 1 matrix of zeros to hold the p value of each regression coefficient's Student t test.
pi 1. p 1. + FOR j { j 1. } dfE b j GET C { j j } GET \v/ / ABS UTPT 2. * PUT NEXT
### For each regression coefficient b_j, get t_j = b_j / sqrt(C_jj) (C_jj are the diagonal values of the C matrix),
### then calculate the two-tailed p value from the Student t distribution with dfE degrees of freedom: 2 * UTPT(dfE, |t_j|).
### ABS keeps the result valid when b_j is negative, because UTPT returns only the upper-tail probability.
'pi' STO
### Store the p value of each regression coefficient into the pi column vector.
b "b" \->TAG F "F" \->TAG pF "pF" \->TAG R2 "R2" \->TAG adjR2 "adjR2" \->TAG pi "p(i)" \->TAG
### Put the important results on the stack, tagged.
UPDIR
### Go back up to the parent (MLR) directory.
\>>
\>>

--- INOUT ---

Use INOUT from https://www.hpmuseum.org/forum/thread-13941.html to load the directory code above.

'IN'
Code:
@ 7-bit ascii string -> calc object
\<< \->STR 3 TRANSIO RCLF SIZE 3 > #2F34Dh #3016Bh IFTE SYSEVAL + STR\-> \>>

'OUT'
Code:
@ Calc object -> 7-bit ascii string
\<< STD 64 STWS \->STR 3 TRANSIO RCLF SIZE 3 > #2F34Fh #2FEC9h IFTE SYSEVAL \>>

--- References ---

https://online.stat.psu.edu/stat462/node/132/
https://www.stat.purdue.edu/~boli/stat51...topic3.pdf
https://www.robots.ox.ac.uk/~fwood/teach...ure_12.pdf , esp. page "Quadratic forms"
https://github.com/SurajGupta/r-source/b...ats/R/lm.R , esp. the summary() function
http://users.stat.umn.edu/~helwig/notes/mlr-Notes.pdf
https://scholar.princeton.edu/sites/defa...slides.pdf
http://www.stat.uchicago.edu/~yibi/teach...es/MLR.pdf
http://193.6.12.228/uigtk/uise/gtknappal...appali.pdf , esp. section 4.1
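
--- Cross-check on a PC (optional) ---

If you want to sanity-check the calculator's output away from the HP 50g, the quantities MLR stores can be recomputed in a few lines of plain Python. This is only a sketch, not part of the calculator program: the helper names (transpose, matmul, inverse, mlr) are my own, it uses only the standard library, and it skips pF and the per-coefficient p values because the standard library has no F or Student t distribution. SSR and SSE are computed from the fitted values and residuals, which is equivalent to the quadratic forms Y'(H - J/n)Y and Y'(I - H)Y when the model has an intercept column. The data are the SAMPLE X and Y from the listing above.

```python
# Cross-check sketch (assumption: helper names are illustrative, not from the post).

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mlr(X, Y):
    n, cols = len(X), len(X[0])
    p = cols - 1                              # number of predictors
    Xt = transpose(X)
    XtX_inv = inverse(matmul(Xt, X))          # (X'X)^(-1)
    b = matmul(matmul(XtX_inv, Xt), Y)        # b = (X'X)^(-1) X'Y
    fitted = matmul(X, b)                     # same as H*Y
    ybar = sum(row[0] for row in Y) / n
    SSE = sum((y[0] - f[0]) ** 2 for y, f in zip(Y, fitted))
    SSR = sum((f[0] - ybar) ** 2 for f in fitted)
    SSTO = SSR + SSE
    dfR, dfE, dfT = p, n - p - 1, n - 1
    sigma2 = SSE / dfE
    # t_j = b_j / sqrt(C_jj), with C = sigma^2 * (X'X)^(-1)
    t = [b[j][0] / (sigma2 * XtX_inv[j][j]) ** 0.5 for j in range(cols)]
    return {
        "b": b, "SSR": SSR, "SSE": SSE, "SSTO": SSTO,
        "dfR": dfR, "dfE": dfE, "dfT": dfT,
        "F": (SSR / dfR) / (SSE / dfE),
        "R2": SSR / SSTO,
        "adjR2": 1.0 - (SSE / dfE) / (SSTO / dfT),
        "sigma2": sigma2, "t": t,
    }

# SAMPLE data from the post
Y = [[10.2], [15.8], [19], [43.3], [1.02], [57.3]]
X = [[1, 1, 1, 1], [1, 2, 1, 2], [1, 4, 2, 1],
     [1, 3, 4, 6], [1, 0, 0, 0], [1, 9, 6, 5]]

result = mlr(X, Y)
for name in ("b", "F", "R2", "adjR2", "t"):
    print(name, result[name])
```

The printed b, F, R2 and adjR2 should agree with the tagged values MLR leaves on the stack for SAMPLE, up to rounding. The t values are what MLR feeds into UTPT to get p(i).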