Network autoregression model with grouped factor structures*

2023-12-05 12:20:18,

Acta Scientiarum Naturalium Universitatis Sunyatseni (中山大学学报(自然科学版)(中英文)), 2023, Issue 5

School of Data Science, Fudan University, Shanghai 200433, China

Abstract: Network autoregression and factor models are effective methods for modeling network time series data. In this study, we propose a network autoregression model with a factor structure that incorporates a latent group structure to address nodal heterogeneity within the network. An iterative algorithm is employed to minimize a least-squares objective function, allowing for simultaneous estimation of both the parameters and the group structure. To determine the unknown numbers of groups and factors, a penalized information criterion (PIC) is introduced. Additionally, statistical inference for the estimated parameters is presented. To assess the validity of the proposed estimation and inference procedures, we conduct extensive numerical studies. We also demonstrate the utility of our model using a stock dataset obtained from the Chinese A-Share stock market.

Key words: network autoregression; factor structure; heterogeneity; latent group structure; network time series

1 Introduction

Modeling network time series data is a crucial undertaking in numerous fields, as the popularity of network data continues to grow. Network data modeling has a broad range of applications, including social network analysis (Lewis et al., 2008), financial risk management (Matthews, 2013), and spatial-temporal data modeling, among others.

To model network time series data, Zhu et al. (2017) proposed a network autoregression model to capture the dynamic relationships between network nodes. While useful, this model is limited by its assumption of homogeneous regression coefficients across all nodes in the network. To allow for greater heterogeneity, a possible extension is to incorporate a group structure among the regression coefficients. This approach is related to recent literature on group panel data models. For example, Lin et al. (2012) studied linear panel data models with parameter heterogeneity and employed group-specific slope coefficients to capture the group effect. They used a threshold-based method for parameter estimation and group membership assignment. Liu et al. (2020) extended this approach to both linear and nonlinear panel data models with group structure and proposed an iterative estimation algorithm that ensures consistency, provided that the number of groups is properly specified. For network data, Zhu et al. (2022) proposed a group network autoregression (GNAR) model that characterizes the network interaction effect.

The GNAR model, while able to flexibly model time series data collected from a large number of network nodes, is still limited in its ability to fully capture the complex dynamic patterns exhibited by all network nodes. One potential solution to this issue is to incorporate a factor model structure into the GNAR model. The factor model has been extensively studied in the literature and has a wide range of applications. For example, Fama et al. (1993) proposed the well-known five-factor model, which includes bond-market and stock-market factors to explain returns on stocks and bonds. Fan et al. (2011) used approximate factor models for high-dimensional covariance estimation, assuming that the error terms have a sparse covariance matrix, and used an adaptive thresholding technique for estimation. Ando et al. (2017) proposed a grouped factor model and applied it to cluster high-dimensional time series of financial data to capture the co-movement of stocks. Other related works include Hou et al. (2015), Bai (2003), and Stock et al. (2011). Combining the GNAR model with the grouped factor model is a natural way to enhance the explanatory power for complex time series collected from network nodes.

In this work, we propose a network autoregression model with a factor structure, which introduces a latent group structure to account for nodal heterogeneity and leverages the factor structure to model complex dynamic patterns. To estimate the parameters and the group structure simultaneously, we use an iterative algorithm. To identify the appropriate numbers of groups and factors, we introduce a penalized information criterion (PIC) for model selection. Furthermore, we provide a statistical inference procedure for the estimated parameters, and validate the estimation and inference procedures through extensive numerical studies. Finally, we demonstrate the practical usefulness of the proposed model through an empirical analysis using data collected from the Chinese A-Share market.

The rest of this paper is organized as follows. Section 2 introduces the model and notations, the algorithm used to estimate the model, the procedures used for model selection, and the statistical inference. Section 3 presents the results of simulation studies and an empirical data analysis. Finally, Section 4 offers concluding remarks.

2 Network autoregression with factor structure

2.1 Model and notations

Consider a network with $N$ nodes indexed as $i = 1,\dots,N$. For the $i$th node, let $Y_{it} \in \mathbb{R}$ be the dynamic response collected at the $t$th time point for $1 \le t \le T$. For instance, $Y_{it}$ can be the weekly return of the $i$th stock in the $t$th week. To characterize the network relationship among the nodes, we employ an adjacency matrix $A = (a_{ij}) \in \mathbb{R}^{N \times N}$. Specifically, $a_{ij} = 1$ implies that the $i$th node follows the $j$th node, and $a_{ij} = 0$ otherwise. Following convention, we let $a_{ii} = 0$. Let $W = (w_{ij}) = (a_{ij}/n_i) \in \mathbb{R}^{N \times N}$ be the row-normalized adjacency matrix, where $n_i = \sum_{j} a_{ij}$ is the out-degree of the $i$th node. In addition, associated with the $i$th node at time $t$, we collect a set of covariates $x_{it} \in \mathbb{R}^p$.
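As a small illustration of the row-normalized adjacency matrix, the following sketch builds $W$ from a toy adjacency matrix. The function name `row_normalize` and the handling of nodes with zero out-degree (their rows are left as zeros) are assumptions made for this example and are not specified in the text.

```python
import numpy as np

def row_normalize(A):
    """Return W with w_ij = a_ij / n_i, where n_i is the out-degree of node i."""
    A = np.asarray(A, dtype=float).copy()
    np.fill_diagonal(A, 0.0)            # convention: a_ii = 0
    out_deg = A.sum(axis=1)             # n_i = sum_j a_ij
    W = np.zeros_like(A)
    nonzero = out_deg > 0               # assumption: rows with n_i = 0 stay all-zero
    W[nonzero] = A[nonzero] / out_deg[nonzero, None]
    return W

# Toy network: node 0 follows nodes 1 and 2, node 1 follows node 0, node 2 follows nobody.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 0, 0]])
W = row_normalize(A)
```

Every row of `W` that belongs to a node with at least one outgoing edge sums to one, so $\sum_j w_{ij} Y_{j(t-1)}$ is the average lagged response of the nodes that node $i$ follows.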

To characterize the nodes' heterogeneity patterns, we introduce a group structure among the nodes. Specifically, let $g_i \in [G]$ be the group membership of the $i$th node. To model the dynamic pattern of $Y_{it}$, we consider the following autoregression model by embedding the observed network structure information,

$$Y_{it} = \sum_{j=1}^{N} \beta_{g_i g_j} w_{ij} Y_{j(t-1)} + \nu_{g_i} Y_{i(t-1)} + x_{it}^\top \gamma_{g_i} + \varepsilon_{it}, \qquad (1)$$

where $\varepsilon_{it}$ is a noise term. Here $\beta_{g_i g_j}$ captures the network effect between nodes $i$ and $j$, $\nu_{g_i}$ is the momentum effect of the node, and $\gamma_{g_i}$ is the regression coefficient for the exogenous covariates. Although model (1) characterizes the network dependence among the nodes, it does not account for the common factors that drive nodes within the same group. To address this issue, we follow Ando et al. (2017) and introduce a grouped factor structure for the noise term, which is given by

$$\varepsilon_{it} = \lambda_{g_i,i}^\top f_{g_i,t} + e_{it}.$$

Here $f_{g,t} \in \mathbb{R}^{r_g}$ is a group-specific factor with factor loading $\lambda_{g,i} \in \mathbb{R}^{r_g}$. Lastly, $e_{it}$ is an independent and identically distributed noise term with mean zero and variance $\sigma^2$.
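To make the roles of the network effect, the momentum effect, the covariate effect and the group-specific factors concrete, the following sketch simulates data from a model of this form. It is only an illustration: the function name `simulate_gnar_factor`, the zero initial state, the common factor number `r` for all groups and the parameter values are hypothetical choices, not the settings used in the paper.

```python
import numpy as np

def simulate_gnar_factor(W, groups, beta, nu, gamma, r, T, sigma=1.0, seed=0):
    """Simulate responses Y (T+1 x N) and covariates X (T x N x p).

    groups : (N,) integer group labels in {0,...,G-1}
    beta   : (G, G) network effects, nu : (G,) momentum effects,
    gamma  : (G, p) covariate effects, r : factors per group (assumed equal).
    """
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    G, p = gamma.shape
    X = rng.standard_normal((T, N, p))        # covariates x_it
    F = rng.standard_normal((G, T, r))        # group-specific factors f_{g,t}
    Lam = rng.standard_normal((N, r))         # loadings lambda_{g_i,i}
    B = beta[groups][:, groups] * W           # B_ij = beta_{g_i g_j} * w_ij
    Y = np.zeros((T + 1, N))                  # Y[0] is a zero initial state (assumption)
    for t in range(1, T + 1):
        factor_part = np.einsum("ik,ik->i", F[groups, t - 1], Lam)
        eps = factor_part + sigma * rng.standard_normal(N)
        Y[t] = (B @ Y[t - 1]                  # network effect
                + nu[groups] * Y[t - 1]       # momentum effect
                + np.einsum("ik,ik->i", X[t - 1], gamma[groups])  # covariates
                + eps)                        # factor structure plus idiosyncratic noise
    return Y, X

# Hypothetical toy inputs with N = 3 nodes and G = 2 groups.
W = np.array([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])
groups = np.array([0, 1, 0])
beta = np.array([[0.1, 0.2], [-0.1, 0.1]])
nu = np.array([0.2, -0.1])
gamma = np.array([[0.5, 1.0], [1.0, -0.5]])
Y, X = simulate_gnar_factor(W, groups, beta, nu, gamma, r=2, T=50)
```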

2.2 Model estimation

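We first discuss estimation when the number of groups $G$ and the numbers of group-specific factors $r_g$ are given. Let $\beta = (\beta_{gg'}: g, g' \in [G]) \in \mathbb{R}^{G \times G}$, $\nu = (\nu_g: g \in [G])^\top$, $\gamma = (\gamma_1,\dots,\gamma_G)^\top$, and collect the parameters in $\Theta = \{\beta, \nu, \gamma\}$. In addition, let $\Lambda_g = (\lambda_{g,i}: g_i = g)^\top \in \mathbb{R}^{N_g \times r_g}$ and $F_g = (f_{g,1},\dots,f_{g,T})^\top \in \mathbb{R}^{T \times r_g}$, where $N_g$ is the number of nodes in the $g$th group. Lastly, let $\mathcal{G} = (g_i: i \in [N])^\top$ collect the group memberships of all nodes. To estimate the parameters, we minimize the following least-squares objective function,

$$Q(\Theta, \mathcal{G}, \Lambda_1, F_1, \dots, \Lambda_G, F_G) = \sum_{g=1}^{G} \sum_{i \in C_g} \sum_{t} \Big( Y_{it} - \sum_{j=1}^{N} \beta_{g g_j} w_{ij} Y_{j(t-1)} - \nu_g Y_{i(t-1)} - x_{it}^\top \gamma_g - \lambda_{g,i}^\top f_{g,t} \Big)^2, \qquad (2)$$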

where Cg={i:gi=g}.

Lastly, given $\{\Theta, \Lambda_1, F_1, \dots, \Lambda_G, F_G\}$, we update the group memberships $\mathcal{G}$ by minimizing the sum of squared residuals among all $G$ possible groups for each node $i$, sequentially from 1 to $N$. Specifically, with the group memberships of all nodes except node $i$ denoted by $\mathcal{G}_{(-i)} = (g_j: j \neq i)$, we assign node $i$ to the group that has the smallest sum of squared residuals, which takes the form

We summarize the algorithm described above, for a given number of groups $G$ and given numbers of group-specific factors $r_g$ ($g = 1,\dots,G$), in Algorithm 1. With respect to the implementation of the algorithm, we have the following remarks.

Algorithm 1

(i) Fix the number of groups $G$ and the numbers of group-specific factors $r_g$ for $g = 1,\dots,G$. Initialize the group memberships $\mathcal{G}^{(0)} = (g_i^{(0)}: i \in [N])^\top$ and the parameters $\Theta_g^{(0)} = (\beta_{g\cdot}^{(0)}, \nu_g^{(0)}, \gamma_g^{(0)})^\top$, $g = 1,\dots,G$.

(ii) For $m = 1, 2, \dots$, repeat the following iterations until convergence.

(a) Given $\mathcal{G}^{(m-1)} = (g_i^{(m-1)}: i \in [N])^\top$ and $\Theta_g^{(m-1)} = (\beta_{g\cdot}^{(m-1)}, \nu_g^{(m-1)}, \gamma_g^{(m-1)})^\top$, $g = 1,\dots,G$, update the estimates of the group-specific factors and factor loadings $\{\Lambda_1^{(m)}, F_1^{(m)}, \dots, \Lambda_G^{(m)}, F_G^{(m)}\}$ via the principal components estimates.

(b) Given $\mathcal{G}^{(m-1)} = (g_i^{(m-1)}: i \in [N])^\top$ and $\{\Lambda_1^{(m)}, F_1^{(m)}, \dots, \Lambda_G^{(m)}, F_G^{(m)}\}$, update the parameters $\Theta_g^{(m)} = (\beta_{g\cdot}^{(m)}, \nu_g^{(m)}, \gamma_g^{(m)})^\top$, $g = 1,\dots,G$, by (2).

(c) Given $\{\Lambda_1^{(m)}, F_1^{(m)}, \dots, \Lambda_G^{(m)}, F_G^{(m)}\}$ and $\Theta_g^{(m)} = (\beta_{g\cdot}^{(m)}, \nu_g^{(m)}, \gamma_g^{(m)})^\top$, $g = 1,\dots,G$, update $\mathcal{G}^{(m)} = (g_i^{(m)}: i \in [N])^\top$ by (3).

Remark 1 To obtain a reliable estimation result, it is necessary to set appropriate initial values of the group memberships and parameters in Algorithm 1, as the objective function is not convex. Specifically, we suggest the following approach. First, for each node $i$, we ignore the factor structure and the network structure, and only consider $Y_{i(t-1)}$ and $x_{it}$ on the right-hand side of (1). Using least squares, we obtain a node-wise estimate $\hat\alpha_i = (\hat\nu_i, \hat\gamma_i^\top)^\top$. Then we apply the $K$-means algorithm to the $\hat\alpha_i$ to cluster the $N$ nodes into $G$ groups. This gives the initial value of the group memberships $\mathcal{G}^{(0)} = (g_i^{(0)}: i \in [N])^\top$. Next, given the initial group memberships, we obtain a rough estimate of the parameters $\Theta_g^{(0)}$ ($g = 1,\dots,G$) by ignoring the factor structure. These serve as the initial values for Algorithm 1. In our simulation studies, these initial values lead to fast convergence and accurate estimates.
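The initialization in Remark 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `nodewise_ols` and `kmeans` are hypothetical, the plain Lloyd-type $K$-means below is a simple stand-in for whatever $K$-means implementation is actually used, and the toy data are generated without network or factor effects purely to exercise the code.

```python
import numpy as np

def nodewise_ols(Y, X):
    """Regress Y_it on (Y_{i,t-1}, x_it) separately for each node.

    Y : (T+1, N) responses, X : (T, N, p) covariates. Returns (N, 1+p) estimates.
    """
    T, N, p = X.shape
    alpha = np.empty((N, 1 + p))
    for i in range(N):
        Z = np.column_stack([Y[:-1, i], X[:, i, :]])   # regressors for node i
        alpha[i], *_ = np.linalg.lstsq(Z, Y[1:, i], rcond=None)
    return alpha

def kmeans(points, G, n_iter=100, seed=0):
    """Plain Lloyd iterations; returns one integer label per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=G, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(G)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: two well-separated groups of 10 nodes each (hypothetical values).
rng = np.random.default_rng(1)
N, T, p = 20, 200, 2
true_groups = np.repeat([0, 1], N // 2)
nu = np.array([0.5, -0.5])[true_groups]
gamma = np.array([[1.0, 1.0], [-1.0, -1.0]])[true_groups]
X = rng.standard_normal((T, N, p))
Y = np.zeros((T + 1, N))
for t in range(1, T + 1):
    Y[t] = nu * Y[t - 1] + (X[t - 1] * gamma).sum(axis=1) + rng.standard_normal(N)

alpha_hat = nodewise_ols(Y, X)            # node-wise (nu_i, gamma_i) estimates
init_groups = kmeans(alpha_hat, G=2)      # initial group memberships
```

The labels returned by $K$-means are only defined up to a permutation of the group indices, so they should be compared with any reference grouping as a partition rather than label by label.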

Remark 2 The principal components estimate of the factors requires the eigen-decomposition of a $T \times T$ matrix, which can be computationally expensive when the time span $T$ is large. To address this issue, Bai et al. (2002) proposed an alternative approach for estimating the factor structure. Specifically, it first estimates the factor loadings $\Lambda_g$ subject to the normalization $\Lambda_g^\top \Lambda_g / N_g = I_{r_g}$; the estimate is $\sqrt{N_g}$ times the first $r_g$ eigenvectors of the corresponding $N_g \times N_g$ matrix. Given this loading estimate, the estimate for $F_g$ then follows in closed form. This alternative approach can speed up the estimation of the factor structure when $N_g < T$.
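The two principal components computations mentioned in Remark 2 can be compared with a short sketch. It assumes that the factor structure is extracted from a $T \times N_g$ matrix of partial residuals, here called `R` (a hypothetical name), and uses the standard normalizations $F^\top F / T = I$ on the time side and $\Lambda^\top \Lambda / N_g = I$ on the node side; the exact matrices used in the paper are not recoverable from the text.

```python
import numpy as np

def pc_time_side(R, r):
    """Factors from the T x T matrix R R^T, normalized so that F^T F / T = I."""
    T = R.shape[0]
    _, vecs = np.linalg.eigh(R @ R.T)          # eigenvalues in ascending order
    F = np.sqrt(T) * vecs[:, ::-1][:, :r]      # eigenvectors of the r largest eigenvalues
    Lam = R.T @ F / T
    return F, Lam

def pc_node_side(R, r):
    """Loadings from the N_g x N_g matrix R^T R, normalized so that Lam^T Lam / N_g = I."""
    n = R.shape[1]
    _, vecs = np.linalg.eigh(R.T @ R)
    Lam = np.sqrt(n) * vecs[:, ::-1][:, :r]
    F = R @ Lam / n
    return F, Lam

# Toy residual matrix with a two-factor structure plus noise (hypothetical sizes).
rng = np.random.default_rng(0)
T, n, r = 60, 15, 2
R = rng.standard_normal((T, r)) @ rng.standard_normal((r, n)) \
    + 0.1 * rng.standard_normal((T, n))

F1, Lam1 = pc_time_side(R, r)
F2, Lam2 = pc_node_side(R, r)
common1 = F1 @ Lam1.T      # fitted common component, time-side computation
common2 = F2 @ Lam2.T      # fitted common component, node-side computation
```

Both routes give the same fitted common component (the best rank-$r$ approximation of `R`), although the factors and loadings themselves differ by a rotation and scaling. The node-side route only decomposes an $N_g \times N_g$ matrix, which is the cheaper option when $N_g < T$.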

Remark 3 As pointed out by Zhu et al. (2022), the group membership $g_i$ of node $i$ enters the objective function (2) not only through node $i$ itself, but also through all the nodes that follow node $i$. As a result, the group membership updating equation (3) consists of two parts. The first part is the loss function of node $i$ itself; the second part is the total loss function of the nodes in $\{j: a_{ji} = 1\}$. Due to the sparse structure of real-world networks, the updating procedure can be conducted efficiently.

2.3 Group and factor number estimation

In practice, the number of groups $G$ and the numbers of group-specific factors $r_g$ are usually unknown and need to be determined before running the estimation algorithm. Motivated by Ando et al. (2017), we adopt a penalized information criterion (PIC) for this purpose. The PIC is formulated as follows,

2.4 Statistical inference

In this section, we provide a statistical inference procedure for the estimated parameters $\hat\Theta$. As implied by the recent results of Ando et al. (2017) and Zhu et al. (2022), the group memberships can be consistently estimated when $G = G_0$, where $G_0$ is the true group number. Therefore, in this subsection, we discuss the statistical inference of $\hat\Theta$ when the true group memberships are given, i.e., $\mathcal{G} = \mathcal{G}_0$, where $\mathcal{G}_0$ denotes the true memberships. For the $g$th group, we define $F_g = (f_{g,t}: t \in [T])^\top \in \mathbb{R}^{T \times r_g}$ and introduce the projection matrix

Then the least squares estimator for $\Theta_g$ given $F = \{F_1,\dots,F_G\}$ can be expressed as

3 Numerical studies

3.1 Simulation

3.1.1 Simulation models and algorithm implementation

To demonstrate the finite sample performance of the proposed method, we conduct simulation studies with various network structures and parameter settings using model (1). Specifically, the covariates $x_{it}$, the group-specific factors $f_{g,t}$, and the factor loadings $\lambda_{g,i}$ are all independently and identically sampled from the standard normal distribution $N(0,1)$. We investigate the following two network structures.

1) Stochastic Block Model (SBM). In the SBM network structure, the nodes are partitioned into $K$ communities. Nodes from the same community have a higher probability of being connected than nodes from different communities. Specifically, we set $P(a_{ij} = 1) = 2\log(N)/N$ if node $i$ and node $j$ are from the same community, and $P(a_{ij} = 1) = \log(N)/N$ otherwise. For the network sizes $N = 100, 200, 300$, we set the number of communities as $K = 5, 10, 15$, respectively.
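A network of this kind can be generated with a few lines of code. In the sketch below, the function name `sbm_adjacency`, the uniform random assignment of nodes to communities and the independent sampling of each directed edge are assumptions; the text only fixes the two connection probabilities.

```python
import numpy as np

def sbm_adjacency(N, K, seed=0):
    """Directed SBM with P(a_ij = 1) = 2 log(N)/N within and log(N)/N between communities."""
    rng = np.random.default_rng(seed)
    community = rng.integers(0, K, size=N)                  # assumption: uniform assignment
    same = community[:, None] == community[None, :]
    prob = np.where(same, 2 * np.log(N) / N, np.log(N) / N)
    A = (rng.random((N, N)) < prob).astype(int)             # independent directed edges
    np.fill_diagonal(A, 0)                                  # a_ii = 0
    return A, community

A_sbm, community = sbm_adjacency(N=100, K=5)
```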

2) Power-law Distribution Network (POW). In the POW network, the in-degrees of the nodes follow a power-law distribution. Specifically, for each node $i$, we generate the in-degree $d_i$ from $P(d_i = d) = c\,d^{-\alpha}$, where $c$ is a normalizing constant and $\alpha$ is the exponent parameter. We set $\alpha = 2.5$ as suggested by Clauset et al. (2009). This network captures the characteristic that the majority of nodes have few followers while a small percentage of nodes have a large number of followers.
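The power-law network can be sketched in the same way. The truncation of the in-degree support to $\{1,\dots,N-1\}$ and the uniform random choice of followers for each node are assumptions made here for illustration; the function name `power_law_adjacency` is hypothetical.

```python
import numpy as np

def power_law_adjacency(N, alpha=2.5, seed=0):
    """Draw in-degrees from P(d) = c * d^(-alpha) and pick followers uniformly at random."""
    rng = np.random.default_rng(seed)
    support = np.arange(1, N)                       # assumption: d in {1, ..., N-1}
    pmf = support.astype(float) ** (-alpha)
    pmf /= pmf.sum()                                # normalizing constant c
    in_deg = rng.choice(support, size=N, p=pmf)
    A = np.zeros((N, N), dtype=int)
    for i in range(N):
        others = np.delete(np.arange(N), i)         # a node cannot follow itself
        followers = rng.choice(others, size=in_deg[i], replace=False)
        A[followers, i] = 1                         # a_ji = 1: node j follows node i
    return A, in_deg

A_pow, in_deg = power_law_adjacency(N=200)
```

Because $a_{ji} = 1$ means that node $j$ follows node $i$, the in-degree of node $i$ is the $i$th column sum of the adjacency matrix.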

For each network structure, we consider two group settings with true numbers of groups $G_0 = 2$ and $G_0 = 3$, and sample the true memberships of the nodes from a multinomial distribution with $(\pi_1, \pi_2) = (0.5, 0.5)$ for $G_0 = 2$ and $(\pi_1, \pi_2, \pi_3) = (0.3, 0.3, 0.4)$ for $G_0 = 3$, respectively. For convenience, we set the number of factors $r_g^0 = 2$ for $g \in [G_0]$ and $p = 2$ for the dimension of $x_{it}$ across all group settings; the true parameters are specified in Table 1 for $G_0 = 2$ and $G_0 = 3$, respectively. Here $r_g^0$ refers to the true factor number in the $g$th group. To compare the model performance under different types of noise terms, we follow Zou et al. (2017) and consider three scenarios in which the noise terms $e_{it}$ are sampled from different distributions. In scenario 1, the noise terms are i.i.d. samples from the standard normal distribution. In scenario 2, the noise terms are i.i.d. samples from the normal mixture $\xi \cdot N(0, 5/9) + (1 - \xi) \cdot N(0, 5)$, where $P(\xi = 1) = 0.9$ and $P(\xi = 0) = 0.1$. In scenario 3, the noise terms are i.i.d. samples from the standardized exponential distribution $\mathrm{Exp}(1) - 1$. We consider network sizes $N = 100, 200, 300$ and time spans $T = 100, 200, 300$, and run each model setting $B = 200$ times.
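All three noise distributions have mean zero and unit variance; for the mixture, $0.9 \cdot 5/9 + 0.1 \cdot 5 = 1$. The sketch below draws from each scenario; the function name `sample_noise` and the sample size are illustrative choices.

```python
import numpy as np

def sample_noise(scenario, size, rng):
    """Draw i.i.d. noise terms e_it for scenario 1, 2 or 3."""
    if scenario == 1:                               # standard normal
        return rng.standard_normal(size)
    if scenario == 2:                               # mixture xi*N(0,5/9) + (1-xi)*N(0,5)
        xi = rng.random(size) < 0.9                 # P(xi = 1) = 0.9
        sd = np.where(xi, np.sqrt(5 / 9), np.sqrt(5.0))
        return sd * rng.standard_normal(size)
    if scenario == 3:                               # standardized exponential Exp(1) - 1
        return rng.exponential(1.0, size) - 1.0
    raise ValueError("scenario must be 1, 2 or 3")

rng = np.random.default_rng(0)
samples = {s: sample_noise(s, 200_000, rng) for s in (1, 2, 3)}
```

Scenario 2 is heavy-tailed and scenario 3 is skewed, so the three scenarios differ in shape while sharing the same first two moments.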

Table 1 True parameters for $G_0 = 2$ and $G_0 = 3$

3.1.2 Performance measurements and simulation results

We use the proposed PIC to select the number of groups $G$ and the numbers of group-specific factors $r_g$; the criterion depends on the value of $C$. We set the candidates for $C$ as $C = 0.1k$, $k = 0, 1, \dots, 20$. To identify an appropriate value of $C$ using $V_C^2$, we set the sub-sample sizes $T^{(s)} = T - s \times 10$ with $s = 0, 1, \dots, 4$. For each model setting, we first repeat the calculation of $V_C^2$ 20 times to observe its behavior. Figure 1 plots the $V_C^2$ values against $C$ under the POW network with $N = 200$, $T = 200$, $G_0 = 2$ and $r_g^0 = 2$ for $g = 1, 2$. The plot reveals that both extremely small and extremely large values of $C$ result in fluctuations of the $V_C^2$ values. However, values of $C$ between 0.4 and 1.2 yield stable $V_C^2$ values in the 20 simulations. To facilitate our analysis, we set $C = 1$ for all model settings in the 200 simulation runs. This choice leads to satisfactory numerical results that are sufficient for illustrating the model's performance.

Fig.1 Behavior of the values of $V_C^2$ for different values of $C$

We first evaluate the performance of the proposed model selection method. In the $b$th simulation round, the estimated group number and factor numbers are denoted as $\hat G^{(b)}$ and $\hat r_g^{(b)}$, respectively. To measure the accuracy of the estimated group numbers, we calculate the percentages of simulation rounds in which the group number is under-estimated, correctly estimated, and over-estimated. Next, we evaluate the accuracy of the factor number estimation. For convenience, we only evaluate the cases in which the group number is correctly estimated. Similar to the previous definitions, we define the corresponding percentages for $\hat r_g$ over the $B_0$ rounds in which the group number is correctly estimated. The percentages (%) of under-, correctly and over-estimated numbers of groups $\hat G$ and numbers of factors $\hat r_g$ for the SBM network and the POW network are summarized in Tables 2-4 for scenarios 1 to 3, respectively. We first look at Table 2 for scenario 1. When the group structure is relatively simple ($G_0 = 2$), the proposed model selection method selects the correct model nearly all the time, except for a few cases of over-estimated numbers of groups when the network size $N$ and time span $T$ are both small ($N = 100$, $T = 100$). When $G_0 = 3$, the group structure is more complex and the proposed model selection method has both under- and over-estimation cases regarding $\hat G$ and $\hat r_g$. However, one can clearly observe that as the network size $N$ and time span $T$ increase, the selection accuracy increases significantly. This indicates that our proposed model selection method can consistently select the true model.
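The under-, correct and over-estimation percentages described above amount to simple counting over simulation rounds. The sketch below uses made-up estimates purely to show the bookkeeping, with the factor-number rates computed only over the rounds in which the group number is correct; the function name `selection_rates` and all numbers are hypothetical.

```python
import numpy as np

def selection_rates(estimates, truth):
    """Percentages (%) of under-, correctly and over-estimated values."""
    est = np.asarray(estimates)
    under = 100 * np.mean(est < truth)
    correct = 100 * np.mean(est == truth)
    over = 100 * np.mean(est > truth)
    return under, correct, over

# Made-up estimates from B = 5 rounds with true G0 = 3 and true r_1 = 2.
G0, r0 = 3, 2
G_hat = np.array([2, 3, 3, 3, 4])
r1_hat = np.array([2, 2, 1, 2, 3])        # estimated factor number of group 1 per round

group_rates = selection_rates(G_hat, G0)

# Factor-number accuracy is only evaluated on the B0 rounds with a correct group number.
keep = G_hat == G0
B0 = int(keep.sum())
factor_rates = selection_rates(r1_hat[keep], r0)
```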

Table 2 The percentages (%) of under-, correctly and over-estimated numbers of groups $\hat G$ and numbers of factors $\hat r_g$ in scenario 1

Table 3 The percentages (%) of under-, correctly and over-estimated numbers of groups $\hat G$ and numbers of factors $\hat r_g$ in scenario 2

Table 4 The percentages (%) of under-, correctly and over-estimated numbers of groups $\hat G$ and numbers of factors $\hat r_g$ in scenario 3

The model selection results in scenarios 2 and 3 are quite similar to the results in scenario 1. Specifically, scenarios 2 and 3 have relatively higher selection accuracy for the number of groups $\hat G$, but slightly lower accuracy for the numbers of group-specific factors $\hat r_g$. However, as $N$ and $T$ get larger, the accuracy improves in scenarios 2 and 3 in the same way as in scenario 1. This indicates the robustness of our proposed model selection method against different types of noise.

The RMSE (×1000) for the estimation of $\beta_0$, $\nu_0$ and $\gamma_0$, as well as their CP (%, in parentheses), and the $\mathrm{AC}_{\mathrm{mem}}$ (%) for the SBM network and the POW network are summarized in Tables 5-7 for scenarios 1 to 3, respectively. We first look at Table 5 for scenario 1. For the membership estimation, the proposed estimation algorithm achieves very high accuracy, even when the network size $N$ and time span $T$ are small. As for the parameter estimation, the RMSE of all estimated parameters decreases steadily as the network size $N$ and time span $T$ increase. Lastly, regarding the statistical inference, the CP values in parentheses are all very close to the nominal level of 95%, although the gap between the CPs and 95% is on the whole larger when $G = 3$ than when $G = 2$.

Table 5 The results for RMSE, CP and the $\mathrm{AC}_{\mathrm{mem}}$ (%) in scenario 1

Table 6 The results for RMSE, CP and the $\mathrm{AC}_{\mathrm{mem}}$ (%) in scenario 2

Table 7 The results for RMSE, CP and the $\mathrm{AC}_{\mathrm{mem}}$ (%) in scenario 3

The model estimation results in scenarios 2 and 3 are also very similar to those in scenario 1, except that the membership estimation accuracy is slightly worse (by less than 1%). Based on these results, we conclude that our proposed estimation algorithm can consistently estimate the model parameters and the memberships, and that the validity of the statistical inference procedure in Section 2.4 is well supported.

3.2 An empirical case study with stocks from Chinese A-Share Market

In this section, we utilize the proposed model to analyze the returns of stocks from the Chinese A-Share market.We begin by presenting a brief description of the data.Next, we employ the PIC criterion to determine the appropriate number of groups and factors for the data.Finally, we conduct a model analysis using the proposed method.

3.2.1 Data description

In this study, we collect $N = 766$ stocks from the Chinese A-Share market after performing necessary data cleaning procedures. Specifically, we select those stocks that have complete return and covariate information over the years 2016 to 2021 from the China Stock Market and Accounting Research (CSMAR) database, which leads to $T = 72$ months in total. The response variable $Y_{it}$ is the standardized return of stock $i$ in month $t$.

First, since stocks from the same industry tend to exhibit higher correlation (Chan et al., 1999), we construct the stock network $A$ based on their industry classification. Specifically, $a_{ij} = 1$ if stock $i$ and stock $j$ belong to the same industry, and $a_{ij} = 0$ otherwise. The diagonal elements of $A$ are set to 0. In addition, we collect $p = 4$ covariates from the financial statements of the firms, which are SIZE (the logarithm of the firm's market value), BM (book-to-market ratio), CASH (cash flow of the firm) and LEV (the firm's leverage ratio). These covariates are standardized to have mean 0 and standard deviation 1. We include these covariates because they have significant potential explanatory power for stock returns (Chan et al., 1998), and they are widely used in the existing literature (Zou et al., 2017; Fan et al., 2022).

We first conduct a basic descriptive analysis of the data. We visualize the histogram of the average return of each stock $i = 1,\dots,N$ in Figure 2. The majority of the stocks have an average return between -1% and 3%, with only a small proportion of stocks having an average return over 3% or less than -1%. Next, we plot the time series of the average stock return for $t = 1,\dots,T$ in Figure 3. We observe two peaks, occurring in the third month of 2016 and in the second month of 2019. The $N = 766$ stocks come from 47 unique industries. The pharmaceutical industry has the highest representation, with 83 stocks.

Fig.2 Histogram of the average returns of the $N = 766$ stocks

3.2.2 Model estimation and evaluation

Next we fit our model to the stock data. To this end, we consider subsample sizes $T^{(s)} = T - s$, where $s = 0, 1, \dots, 5$, and compute $V_C^2$ for $C = 0.1k$, where $k = 0, 1, \dots, 20$. The resulting values of $V_C^2$ are plotted in Figure 4. We observe that $V_C^2$ is relatively small in the range $C \in [1.1, 2.0]$. Therefore, we choose $C = 1.2$ for the subsequent analysis.

We then use the PIC to select the group number and factor numbers. This results in $\hat G = 6$ and $(\hat r_1, \dots, \hat r_6) = (1, 1, 1, 1, 1, 1)$.

Table 8 presents the estimation results. We first examine the group membership assignment, where we observe that each of the six groups has a size of around 100, with Group 1 having the largest size of 178 stocks and Group 5 having the smallest size of 70 stocks. The stock allocation across the six groups is relatively balanced, with each group having one group-specific factor. These findings suggest a strong grouped factor structure in the stock market, where the stock returns are influenced by their respective group-specific factors. Moving on to the group-wise network effects, we observe that most groups have negative within-group network effects, indicating a negative correlation among the stocks within each group. Regarding the momentum effects, all groups except Group 5 have negative coefficients, which reflects the well-known mean-reversion phenomenon in the stock market. Finally, we examine the covariate effects. SIZE has a positive coefficient in all six groups, while BM has a negative coefficient in all six groups. CASH has a positive coefficient in Groups 1 and 5, and a negative coefficient in the other groups. LEV has a small negative coefficient in Groups 1 and 2, a negative coefficient in Groups 3 and 4, and a positive coefficient in Groups 5 and 6. In summary, the Chinese A-share market exhibits a strong group structure in stock returns, and the behavior of the returns varies significantly depending on the group to which a stock belongs.

Fig.3 The time series of average stock returns over $T = 72$ months

Fig.4 Behavior of the values of $V_C^2$ for different values of $C$, calculated from subsamples of the real data

Table 8 The estimated parameters and their standard errors (in parentheses), the group size, and the number of group-specific factors for each group

4 Conclusion

In this work, we study the network autoregression model with a factor structure. It introduces a latent group structure so that it can characterize the nodes' heterogeneity patterns. We propose an iterative algorithm to estimate the parameters and the group structure simultaneously. A penalized information criterion (PIC) is developed to select the number of groups and the numbers of group-specific factors. We also perform statistical inference on the estimated parameters and conduct a number of simulation studies to examine the finite-sample performance. Lastly, we apply our model to a real stock dataset to demonstrate its usefulness.

Here we briefly discuss some future research topics. First, we assume in our model that the same group structure applies to both the parameters and the factors. It would be interesting to allow for different group structures for the regression coefficients and the latent factors. Second, a rigorous theoretical investigation of the statistical inference remains to be provided. Lastly, besides the group-specific factors, the inclusion of common factors may improve the explanatory power of the model.
