The multidimensional nD‐GRAS method: Applications for the projection of multiregional input–output frameworks and valuation matrices
2021; Elsevier BV; Volume: 100; Issue: 6; Language: English
10.1111/pirs.12625
ISSN: 1435-5957
Authors: Juan Manuel Valderas Jaramillo, José M. Rueda‐Cantuche
Topic(s): Climate Change Policy and Economics
Abstract: We present a multidimensional generalization of the GRAS method (nD-GRAS) for the estimation of multiple matrices in an integrated framework. The potential applications of this method in regional and multi-regional input–output analyses based on national/regional accounts frameworks are many. We provide two real applications: a 3D-GRAS that estimates a use table at basic prices jointly with valuation matrices for Denmark, and a 4D-GRAS for estimating intercountry input–output tables with OECD data. We show that higher-dimensional GRAS methods provide more consistent and accurate estimates than those with a lower number of dimensions. We provide the analytical closed-form solution and the RAS-like algorithm for easy operationalization.
Multiple variations of biproportional techniques have been applied to the field of input–output analysis since Leontief's (1941) pioneering work, in which he used a biproportional technique to identify sources of inter-temporal change in the cells of a series of input–output tables (Lahr & de Mesnard, 2004). Broadly speaking, these include the RAS-family methods (Bacharach, 1965, 1970) and their extensions. A recent overview and large compilation of these methods can be found in Chapter 18 of the UN Handbook on supply, use and input–output tables with extensions and applications (United Nations, 2018). The idea for the nD-GRAS method stems from the fact that, in practical situations, it is often not enough to impose constraints on the row-wise or column-wise sums of a matrix (as in standard GRAS); instead, a matrix representing multiregional information has to be rearranged into an array of larger dimension, with constraints imposed on all the dimensions of the array. For instance, in multiregional frameworks, where national IOTs are split using information on bilateral exports and imports, the corresponding national use tables of imports might serve as constraints for balancing a multiregional IOT. Another practical situation that requires more than two dimensions is the estimation of use tables at basic prices together with valuation matrices, that is, trade and transport margins tables (TTM) and taxes less subsidies on products tables (TLS), so that they are consistent with the use tables at purchasers' prices.
We can estimate each of those tables independently with a GRAS method, but the sum of TTM, TLS and the use table at basic prices would then equal the initial use table at purchasers' prices only by chance. The main contribution of this paper is the derivation of an analytical closed-form solution to the GRAS method in a multidimensional framework with an arbitrary number of dimensions, together with an algorithm that handles these problems in an accessible way. The bi-dimensional case is the standard GRAS method (2D-GRAS in our terminology). As described in the next section, the problems addressed by the nD-GRAS method can also be embedded and solved within the KRAS framework; nonetheless, among other differences, our approach can easily be made operational in a RAS-like algorithm. Last but not least, another contribution of this paper is to show that projections based on a higher number of dimensions, apart from ensuring the global coherence of the estimations, lead to better performance and greater accuracy than independent projections based on a lower number of dimensions. For instance, the 3D slices resulting from a 4D-GRAS projection generally yield a better fit than the results of a 3D-GRAS applied independently to each of the 3D slices of the 4D array. We will also see that this statement holds for the 3D-GRAS method and the projections of 2D matrices.
This paper is organized as follows. The next section frames the theoretical background of our work within the latest related literature. Section 3 introduces how several important applications can be embedded in our multidimensional approach. Section 4 contains the theory of the nD-GRAS, and the set-up and solution of the optimization algorithm. Section 5 explores some conditions regarding feasibility issues and convergence. Section 6 provides two examples for the 3D-GRAS and 4D-GRAS methods.
Finally, a concluding section provides a summary of the main findings and theoretical contributions.
The RAS method and other related biproportional techniques fall under the category of what is known in other fields as iterative proportional fitting procedures (IPFP). In a bi-dimensional context, Deming and Stephan (1940) used similar methods for the estimation of contingency tables. Other early applications of these methods can be found in the work of Sheleikhovskii (Bregman, 1967) and Kruithof (Lahr & de Mesnard, 2004). The generalization to three-dimensional contingency tables was made by Deming (1943), and to larger multidimensional contingency tables by Darroch (1962) and Ireland and Kullback (1968). A good summary of these techniques, their implementation and basic literature references can be found in the documentation of the R package "mipfp" developed by Barthelemy et al. (2018). We can also find other examples in the recent literature concerning the multidimensional generalization of the RAS-family methods. Tilanus (1976) first introduced this approach in an algorithmic way, generalizing the biproportional algorithm of RAS to four dimensions. Oosterhaven et al. (1986) introduced a method for estimating an interregional input–output system in a bi-dimensional RAS set-up where the regional cells must add up to a national figure. The approach followed by Oosterhaven et al. (1986) is similar to the multiregional GRAS (MR-GRAS) method developed by Temursho, Oosterhaven, and Cardenete (2020). Both methods constitute a bi-dimensional set-up where the additional national constraint provides a sort of third dimension. The MR-GRAS method includes an additional set of constraints (different from the typical row-wise and column-wise sum constraints) in a multiregional framework, using the same bi-dimensional objective function as the standard GRAS method.
In the MR-GRAS method, the third set of linear restrictions operates across any non-overlapping subsets of elements in the multiregional IOT that must add up to a given total. As Temursho, Oosterhaven, and Cardenete (2020) show, this approach can be adapted for updating inter-national/regional or global SUTs, where the third-dimension constraint is introduced so that the interregional blocks add up to a given figure. The MR-GRAS approach has been used extensively to create a baseline scenario called PIRAMID for the 2018 Global Energy and Climate Outlook (Rey Los Santos et al., 2018; Temursho, Cardenete, et al., 2020), projecting national IOTs in a multiregional context for future years while ensuring the consistency of the projections with National Accounts. In principle, the approaches of Oosterhaven et al. (1986) and Temursho, Oosterhaven, and Cardenete (2020) are not truly three-dimensional, because the authors use a bi-dimensional set-up with constraints over these two dimensions; hence, the third dimension is not taken into account as such. Nonetheless, by introducing this additional set of restrictions, the methods of Oosterhaven et al. (1986) and Temursho, Oosterhaven, and Cardenete (2020) produce a three-dimensional solution. The 3D-GRAS can be solved in terms of the approach of Temursho, Oosterhaven, and Cardenete, since it is possible to formulate a three-dimensional array as a standard matrix, which makes the two approaches correspond. However, as Temursho, Oosterhaven, and Cardenete (2020) rightly mention, the nD-GRAS method is more general, since it accommodates additional sets of non-overlapping restrictions that account for other dimensions. Because of this generality, we describe it in this paper in a geometrical and intuitive way.
We also show a solution algorithm that makes the efficient implementation of this method straightforward, regardless of the number of dimensions and without requiring the use of aggregation matrices. The nD-GRAS method was developed within Eurostat's FIGARO project (Remond-Tiedrez & Rueda-Cantuche, 2019) and it is widely used in the construction of the European inter-country supply, use and input–output tables (SUIOTs). Other examples of RAS-like algorithm approaches can be found in Gilchrist and St. Louis (1999, 2004) and Lenzen et al. (2006). The TRAS algorithm introduced in the papers of Gilchrist and St. Louis sets up a sort of three-dimensional RAS that accounts for additional aggregation constraints apart from row-wise and column-wise sums. The "cRAS" algorithm proposed by Lenzen et al. (2006) generalizes the bi-dimensional RAS algorithm to account for additional aggregation constraints. In all these papers, as in Tilanus (1976), only the resolution algorithm is presented, without mathematical proof, as a practical way of obtaining the estimated matrix that meets all the desired constraints. Another three-dimensional approach, in which valuation matrices and inputs at basic and purchasers' prices are jointly estimated, is that of Dalgaard and Gysting (2004). In our opinion, the algorithm proposed by these authors does not fall into the category of multidimensional algorithms, as there is no multidimensional constraint requiring the valuation matrices and the basic prices table to add up cell-wise to the purchasers' prices table. Instead, the target of this algorithm is the balancing of GDP on the output side and the demand side, respecting the outputs provided by the supply table. The basic prices and valuation matrices are derived individually using a sort of proportional adjustment and a RAS-like algorithm for balancing, and the purchasers' prices table is the sum of all of them. The algorithm continues until the desired balance is achieved.
Holý and Šafr (2020) introduced a multidimensional RAS method (DRAS) that reformulates the bi-dimensional RAS problem in a purely multidimensional set-up that deals with only non-negative arrays. In Holý and Šafr (2020), the method was introduced only in an algorithmic way, similar to Tilanus (1976). Holý and Šafr (2020) prove that this algorithm is the solution of the cross-entropy optimization model in a multidimensional generalization. The DRAS is a particular case of the nD-GRAS when no negative elements exist in the initial matrix. They apply the DRAS in a three-dimensional set-up for the estimation of regional, quarterly and domestic/imported input–output (industry-by-industry) tables of the Czech Republic. Their results show that the addition of a third dimension, apart from ensuring the consistency of national totals among the different layers (either quarterly disaggregation, regional disaggregation or domestic/imported split) allows more accuracy than the standard estimates in terms of the overall input–output structure. The application of the multidimensional RAS method to Isard's interregional input–output model also shows better results than the standard RAS method. Another decisive development in the field of input–output projections is the continuation of methods developed from the KRAS method introduced by Lenzen et al. (2009). In the KRAS method, the vectorization of the target matrix allows the formulation of the optimization problem in a unidimensional set-up. This vectorization allows all potential constraints to be embedded, such as linear constraints on arbitrarily sized and shaped subsets of matrix elements either with unity or non-unity coefficients, in a generalized formulation. The KRAS method also incorporates other features such as the reliability of the information supplied by the constraints and the autonomous management of potential conflicts with external data in the case of inconsistent constraints. 
The KRAS method is at the core of multiple applications that deal with multidimensional problems. Geschke et al. (2011) describe a tool with a custom data processing language (AISHA) for the construction of a series of contingency tables. This tool performs the estimation of large-dimension contingency tables. Geschke et al. (2011) also describe how a multidimensional representation can be vectorized in a one-to-one correspondence with a unidimensional vector. This methodology is implemented in Lenzen et al. (2013) in the construction of the EORA MRIO database, where an eight-tiered hierarchy is used. This does not actually imply an eight-dimensional problem in the way an 8D-GRAS method would work. That is, even though all the elements can be embedded in an 8D array, the constraints do not add up over all eight dimensions together in an array. Nonetheless, some of the constraints represent 3D or 4D aggregations in an identical way to the 3D-GRAS and 4D-GRAS problems. Additional features of this methodology include the possibility of using optimization functions other than the GRAS optimization function, among them quadratic programming or barrier and penalty functions (Geschke et al., 2011; Lenzen et al., 2012). In addition, different solvers prepared to deal with large-scale and sparse matrices, including parallelization techniques and Cimmino algorithms (Geschke et al., 2019; Lenzen, Geschke, et al., 2014), are available. Besides, this framework includes the possibility of dealing with tailored regional and sectoral aggregations defined as a subset of a fixed classification (Lenzen, Geschke, et al., 2014). This methodological proposal, deriving from the KRAS method paper, goes one step further with the development of virtual laboratories introduced in Lenzen et al. (2012, 2017), Lenzen, Geschke, et al. (2014) and Geschke and Hadjikakou (2017).
In these virtual laboratories, researchers can assemble their own MRIO versions in a collaborative research environment using cloud-computing platforms, enabling a multitude of input–output applications in carbon, water and ecological footprints, life-cycle assessments and trend or key driver analyses. Undoubtedly, the problems addressed by the nD-GRAS method can be embedded and solved in this KRAS framework, as long as a feasible solution exists, using the same objective function, a set of constraints with unity coefficients and no conflicting information. This applies not only to the nD-GRAS method but also to the rest of the multidimensional methods mentioned in this paper, such as those proposed by Holý and Šafr (2020) or Temursho, Oosterhaven, and Cardenete (2020). Even though the advantages of the extended KRAS methodology and virtual labs proposals are obvious, other aspects also have to be taken into account. The nD-GRAS proposal has the advantage of having a closed-form solution, at the cost of using only linear restrictions with unity coefficients. Finding a solution also involves providing non-conflicting exogenous information and a well-posed prior matrix. However, the mathematical derivation of the solution of the multidimensional problem and the associated algorithm enable a large variety of problems to be solved without specialized IT hardware. On the other hand, the multiple features of the KRAS methodology that have to be handled in the process, such as reliability and the clearing up of conflicting information, as well as the size of the elements involved in the estimation problem in terms of regions, products and sectoral disaggregation, can be a barrier to entry for users. Besides, the vectorization of the target matrix usually leads to an optimization problem that requires the management of very large and very sparse matrices, which are especially demanding in terms of computing performance.
In addition, the operationalization of some aspects of these methods, such as the construction of the constraints matrix, requires the use of automation and "data mining, processing, and reclassification procedures as much as possible" (Lenzen et al., 2012, p. 8376). Optimization algorithms are another way to address the kind of problems covered by the nD-GRAS. Jackson and Murray (2004) provide an excellent review bridging iterative techniques and optimization algorithms. The recent proliferation of RAS-based approaches already mentioned in this article may give the impression that these RAS extensions sacrifice simplicity for capability (KRAS is probably a good example of this). Although the generalization of the GRAS problem to multiple dimensions may seem complex, the existence of an analytical solution and an algorithm for its implementation makes it easy to deal with. Besides, this algorithm can be implemented efficiently and easily in widely available free software such as R. By contrast, more complex numerical optimization techniques may require commercial software with embedded high-performance solvers. However, it is also true that these optimization techniques offer greater scope for setting complex constraints over any number of dimensions or subsets of coefficients, and for trying different weighting patterns, penalties and more complex objective functions. In the case of infeasibilities, in iterative processes like RAS-based approaches the reason for the non-convergence of the algorithm is easier to trace than in the complex algorithms underlying optimization routines (see Temursho, Cardenete, et al., 2020). It is also possible to inspect some qualitative aspects of the relationship between the prior and the targets (as we will see in Section 5) or to check the paths followed by the updating factors in the iterative process.
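The idea of checking the paths followed by the updating factors can be illustrated with a minimal sketch of the simplest case: a classical bi-dimensional RAS on a non-negative prior. This is not the GRAS method with negative entries, nor the authors' implementation; the function name `ras_with_history`, the tolerance and the toy numbers are assumptions made for illustration only.

```python
import numpy as np

def ras_with_history(prior, row_totals, col_totals, tol=1e-10, max_iter=1000):
    """Classical 2D RAS on a non-negative prior (illustrative sketch).

    Returns the balanced matrix and the list of (row factors, column factors)
    applied at each iteration, so their paths can be inspected afterwards.
    """
    x = np.asarray(prior, dtype=float).copy()
    u = np.asarray(row_totals, dtype=float)
    v = np.asarray(col_totals, dtype=float)
    history = []
    for _ in range(max_iter):
        # Row step: scale every row towards its target total.
        row_sums = x.sum(axis=1)
        r = np.divide(u, row_sums, out=np.ones_like(u), where=row_sums > 0)
        x = x * r[:, None]
        # Column step: scale every column towards its target total.
        col_sums = x.sum(axis=0)
        s = np.divide(v, col_sums, out=np.ones_like(v), where=col_sums > 0)
        x = x * s[None, :]
        history.append((r.copy(), s.copy()))
        # Stop once no factor changes the matrix any more.
        if max(np.abs(r - 1.0).max(), np.abs(s - 1.0).max()) < tol:
            break
    return x, history

# Toy data (hypothetical): both sets of targets share the grand total 10.
balanced, path = ras_with_history([[1.0, 2.0], [3.0, 4.0]], [4.0, 6.0], [5.0, 5.0])
```

In a feasible problem the factors stored in `path` settle towards one; a factor that keeps drifting away from one points at the row or column whose target conflicts with the prior.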
Before introducing the multidimensional GRAS method, we illustrate in Figure 1 how we can frame several practical situations into this methodology. Going from the simpler to the more general, we start with some 3D-GRAS examples. A multiregional framework is fertile ground for multiple 3D-GRAS applications. Let us assume that we have a matrix of a multiregional input–output framework. In this matrix, row elements (products or industries) are usually arranged by the countries/regions of the multiregional framework. The same applies column-wise. For instance, if we think of a matrix representing bilateral trade for multiple regions (i.e., the domestic part is voided), rows will denote exports of products by destination partner/user, and columns will denote imports of products by country of origin. Figure 1 shows a matrix that schematically represents either a standard multiregional use framework or a bilateral international trade matrix (in the latter case with the elements of the main diagonal block, shaded light grey, set to zero). If we concentrate our attention on one trading partner, say region k (shaded in Figure 1), we see a description of region k's imported products by region of origin. In fact, each element of this column block represents, for region k, the imports of a given product by a given user coming from a given region of origin (the main diagonal block being equal to zero implies that these elements are zero whenever the region of origin coincides with the destination region). In this context, the column block of region k could be embedded in a 3D array, as portrayed in the upper right-hand side of Figure 1. If we aggregate by region of origin, the result is a typical import use matrix of a national input–output framework, as described in Figure 1. The dark grey elements of Figure 1 depict this aggregation graphically. Many practical situations in multiregional frameworks fall into this representation.
One example is balancing a multiregional framework where the totals of products and users by region are known (i.e., a typical biproportional balancing set-up). If an additional set of constraints is added (e.g., requiring that the sum of the same elements by region adds up to a known total), then this becomes a 3D generalization of biproportional balancing. In addition, we can estimate a set of regional input–output tables where regional target vectors are known and all the regional tables add up to a national one. We can solve these problems straightforwardly with the 3D-GRAS method. This is also the situation when estimating a framework consisting of a use table at basic prices jointly with its valuation matrices. As we can see in Figure 2, the use table at basic prices, the trade and transport margin matrix and the taxes less subsidies matrix set up a 3D system with the use table at purchasers' prices being the 3D target. We may also think of the third dimension as the territorial dimension of the problem, with the other two dimensions being products and industries. We can also consider time as a third dimension. In this situation, for instance, the aggregated table would be an annual table, and the third dimension would account for a temporal disaggregation (e.g., quarterly). The same applies to the breakdown of total coefficients into domestic and imported figures (see Holý & Šafr, 2020). When solving this problem, one can interpret it as performing three bi-dimensional GRAS optimizations simultaneously: one for every layer of the system of matrices. Nonetheless, these bi-dimensional GRAS problems are not treated independently; on the contrary, the third-dimension constraint (the cell-wise sum of the basic prices matrix plus the valuation matrices adding up to the desired purchasers' prices values) ensures a global approach connecting all the problems into a single integrated problem that performs all the balancing simultaneously.
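The three families of constraints in this 3D system can be made concrete with a small sketch. The numbers below are invented for illustration (two products, two users) and the helper `margins_3d` is a hypothetical name; the layers are stacked along the third axis in the order basic prices, TTM, TLS.

```python
import numpy as np

# Hypothetical toy layers (products x users); negative entries are allowed,
# e.g. margins reallocated away from a trade row, or subsidies.
basic = np.array([[10.0, 20.0], [30.0, 40.0]])
ttm = np.array([[1.0, 2.0], [-1.0, -2.0]])
tls = np.array([[0.5, 1.0], [-0.5, 2.0]])

# 3D array with axes (product, user, layer).
system = np.stack([basic, ttm, tls], axis=2)

def margins_3d(x):
    """Return the three sets of 2D margins of a 3D array."""
    return {
        "rows_by_layer": x.sum(axis=1),  # product totals of each layer
        "cols_by_layer": x.sum(axis=0),  # user totals of each layer
        "cellwise": x.sum(axis=2),       # cell-wise sum across layers
    }

m = margins_3d(system)
purchasers = m["cellwise"]  # use table at purchasers' prices

# Adjusting one layer on its own breaks the cell-wise identity.
adjusted = system.copy()
adjusted[:, :, 0] *= 1.1
still_consistent = np.allclose(adjusted.sum(axis=2), purchasers)
```

Here `still_consistent` is `False`: once a layer is rescaled independently of the others, the layers no longer add up cell-wise to the purchasers' prices table, which is the inconsistency the joint third-dimension constraint is meant to rule out.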
A practical application of this approach is illustrated in Section 6. It will always be possible to project multidimensional arrays in a bi-dimensional set-up (see Figure 1, where the whole multicountry framework constitutes a bi-dimensional representation of a 4D contingency table). However, as the number of dimensions increases, it becomes more complex to define a set of aggregator matrices to meet all constraints across all dimensions. The inclusion of a new dimension implies a new, different set of independent multipliers (i.e., one cell can only be used once to meet a constraint for one specific dimension and it cannot be used again to meet others). The geometrical intuition makes the implementation of this feature easier, and so we describe it in this paper. Finally, we can also find an example of the application of the 4D-GRAS technique in the estimation of a global multiregional input–output framework. First, it is very important to note that, as already mentioned, even though we have expressed a multiregional framework in a bi-dimensional matrix, this must not hide the fact that it is a 4D array, since every element has to be described by four sub-indexes. Hence, the multiregional framework should be expressed as a four-index array, with one sub-index each for the origin dimension, the destination dimension, the product dimension and the user dimension. It is important to note the difference between a 4D projection and the estimation of a 4D array using 3D projections. The bi-dimensional table represented in Figure 1 is essentially a 4D array. Conditioning on a column block, such as a given destination region, yields a 3D array, that is, a 3D slice of the full 4D array, as represented in Figure 1. Hence, it is possible to split the 4D array into a set of independent 3D arrays, one for every destination region.
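The correspondence between the bi-dimensional layout and the underlying 4D array can be sketched as follows. The array sizes, the random values and the variable names are assumptions for illustration; the axis order (origin, destination, product, user) follows the description above.

```python
import numpy as np

R, P, U = 3, 2, 2  # hypothetical numbers of regions, products and users
rng = np.random.default_rng(0)

# 4D array indexed by (origin, destination, product, user).
Z = rng.uniform(1.0, 10.0, size=(R, R, P, U))

# Bi-dimensional layout: rows are (origin, product) pairs and
# columns are (destination, user) pairs.
M = Z.transpose(0, 2, 1, 3).reshape(R * P, R * U)

# 3D slice for one destination region k, taken directly from the 4D array...
k = 1
slice_3d = Z[:, k, :, :]  # axes (origin, product, user)

# ...and the same slice recovered from the column block of the 2D layout.
block = M[:, k * U:(k + 1) * U].reshape(R, P, U)

# Aggregating the slice over the origin gives a product-by-user table for k.
by_origin_total = slice_3d.sum(axis=0)
```

`block` and `slice_3d` hold the same numbers, which shows that a column block of the bi-dimensional table is a 3D slice of the 4D array. In this sketch the domestic block is not zeroed, so `by_origin_total` includes domestic flows; setting `Z[k, k]` to zero first would make it an import use matrix.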
As long as the bi-dimensional margins of these 3D slices are known, we could perform independent 3D-GRAS methods for each destination region (i.e., selecting a different slice every time). At the end, we will have an estimation of the full 4D array. However, the result is not necessarily consistent with all of the 3D margins of the 4D array. Exactly the same situation arises in a 3D array. Recalling the example of the use table at basic prices and the valuation matrices described in Figure 2, if all the marginal vectors of these matrices are known, it is possible to perform a 2D-GRAS method to estimate each of these bi-dimensional slices of the 3D array independently. However, this does not imply that the result is consistent with the use table at purchasers' prices. Hence, the global integrated estimation in a higher dimension adds consistency to all the elements in the system and, as we will see, this has a positive impact on the quality of the estimations. Another advantage of the nD-GRAS method is that it does not require full knowledge of the constraint values, although we have assumed in this paper that they are all known across all dimensions. If some constraint values are missing, the nD-GRAS method skips them in a natural way. For instance, this would be similar to implementing a standard bi-dimensional GRAS where some of the column and/or row totals are unknown. The solution algorithm can be executed leaving such row and/or column totals free.
In this section, we present the n-dimension generalization of the GRAS technique. For the sake of simplicity, and to make the problem easy to understand from a mathematical point of view, we initially present the method for a tridimensional array (3D-GRAS), and we then generalize it to larger dimensions.
As in a standard GRAS set-up, the 3D-GRAS problem consists of three elements: the data (prior and constraints), the problem to be solved and the model that sets up the problem in a mathematical way. As usual, a prior tridimensional array is necessary. Any slice of the 3D array along, say, the third dimension is a bidimensional matrix. We need two vectors as constraints for every slice, which can easily be arranged into two bidimensional matrices. A third matrix, which constitutes the constraints in the third dimension (the regional aggregated matrix), is also required. The algorithm continues sequentially, with every new iteration repeating steps 1 to 3. This is represented in the flow diagram of Figure 3, which is indexed by the iteration number. This solution of the 3D-GRAS method generalizes the concept of biproportion (see De Mesnard, 1994) to a tridimensional proportion, but in a standard set-up like the standard GRAS, with positive and negative figures. This is described in Figure 4, where an element of the array (assumed to be positive in this example) is updated by a factor that is the product of three factors. The convergence algorithm is depicted in the flow diagram of Figure 5, which represents the iterative procedure for finding a solution until the tolerance margin is achieved. The solution algorithm can be implemented very efficiently in R, since it is very easy to operationalize. We provide the R scripts for the 3D-GRAS and 4D-GRAS methods upon request. They can easily be generalized to any dimension using the same sequence of updating factors, one dimension at a time, in each step.
The convergence of the nD-GRAS method is guaranteed as long as the optimization problem is well defined and a solution exists. This is the case because the target function in (1) is a sum of strictly convex functions and hence strictly convex.
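The cycle of three updating factors described above can be sketched for the special case of a non-negative prior, where the procedure reduces to a three-dimensional iterative proportional fitting (the case the text above associates with the DRAS of Holý and Šafr). This is a minimal illustration, not the authors' R script: it does not implement the GRAS treatment of negative entries, and the function name `ipf_3d`, the argument names and the stopping rule are assumptions.

```python
import numpy as np

def ipf_3d(prior, u, v, w, tol=1e-9, max_iter=10000):
    """3D proportional fitting on a non-negative prior (illustrative sketch).

    prior : array (I, J, K)
    u     : (I, K) targets for sums over the second axis, per layer
    v     : (J, K) targets for sums over the first axis, per layer
    w     : (I, J) targets for the cell-wise sums over the layers
    Returns (estimate, iterations used, converged flag).
    """
    x = np.asarray(prior, dtype=float).copy()
    if np.any(x < 0):
        raise ValueError("this sketch only covers non-negative priors")
    # Each constraint set is paired with the axis that is summed out.
    constraints = [(1, np.asarray(u, dtype=float)),
                   (0, np.asarray(v, dtype=float)),
                   (2, np.asarray(w, dtype=float))]
    # Necessary (not sufficient) check: all sets share one grand total.
    totals = [target.sum() for _, target in constraints]
    if not np.allclose(totals, totals[0]):
        raise ValueError("constraint sets must share the same grand total")
    for iteration in range(1, max_iter + 1):
        worst = 0.0
        for axis, target in constraints:
            current = x.sum(axis=axis)
            # Updating factor for this dimension; cells with a zero sum stay put.
            factor = np.divide(target, current,
                               out=np.ones_like(current), where=current > 0)
            x = x * np.expand_dims(factor, axis=axis)
            worst = max(worst, float(np.abs(factor - 1.0).max()))
        # Converged when a full cycle leaves the array essentially unchanged.
        if worst < tol:
            return x, iteration, True
    return x, max_iter, False
```

A possible usage is to draw the three margin matrices from a known positive array, start from a different positive prior, and check that the returned estimate reproduces all three sets of margins. Each cell ends up multiplied by the accumulated product of one factor per dimension, which is the tridimensional analogue of a biproportion.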
All the constraints are linear functions, and hence convex even though not strictly convex. The Lagrangian function in (2) is also a (strictly) convex function, since it is a sum of strictly convex and convex functions. Hence, our algorithm will converge to this solution if it exists. Besides, if a solution exists, given the characteristics of the optimization function and the constraints, this solution is unique according to Chiang (1984). One of the main advantages of using analytical solutions and simple iterative algorithms is the possibility of controlling the resolution of the problem and, therefore, of fixing potential infeasibilities. As a result, there are some basic necessary conditions concerning feasibility that are important to highlight. These are presented as follows. First, it is important to note that the constraints must add up to the same grand total in every dimension; otherwise, the problem would be infeasible. Bacharach (1970) proved this necessary condition for bi-dimensional non-negative matrices. This necessary condition of Bacharach remains valid in the multidimensional generalization, since no solution would exist otherwise. Second, the number of null elements in the prior is another important issue regarding convergence, since the existence of zeros reduces the degrees of freedom available to the system for finding a solution. However, if zeros happen to split the array into two independent sub-arrays, then the first condition identified by Bacharach (1970) can be applied independently to each sub-array. Bacharach (1970) also provided for n