{
"Modeling complex collaboration network for service-oriented software based on execution behaviors": [
"Zhang Xizhe et al. [13] proposed a service dynamic behavior growth and evolution model that takes into account the characteristics of service be havior. Based on the service structure data of real services, and centering on the parameter association relationship between services, the service structure network was established through parameter matching. This paper is modeled by the deterministic relationship between parameters, and it is not suitable for characterizing the uncertain correlation of cloud services.\n"
],
"A generalized Mandelbrot set based on distance ratio": [
"M-J sets) play an important role in fractal the-ory [Falconer, 1990]. With the aid of computer, we recognized the complexity of the M-J setsdirectly and got some instructional conclusions byobserving their images. Therefore, the study ofalgorithms to calculate the M-J sets has been ofgreat interest. A great deal of research has beendone on the escape time algorithm, a classicalalgorithm to create the M-J sets [Yuan et al., 2007; Liu et al., 2011; Deng et al., 2009]. Moreover, other algorithms include the inverse function iterative algorithm [Yang et al., 2006; Sun, 2007], distance ratio iteration method [Zhang et al., 2006], deviation plot method [Andreadis & Karakasidis, 2010], three-step algorithm [Pastor et al., 2011], cycle detecting method and the improved timeescape algorithm [Sun & Wang, 2009 a, 2009 b], etc.\n"
],
"A Fast Algorithm for Generalized Arc Consistency of the Alldifferent Constraint.": [
"We choose the CSP solver as the core automated reasoning engine, because formula (1) and (9) can be optimized as the global constraint \u2018Alldifferent\u2019, and a lot of works are devoted to improving the reasoning efficiency of \u2018Alldifferent\u2019 constraints for CSP solvers [1,12,32].\n"
],
"Community Extraction Algorithm for Large-Scale Online Social Networks": [
"In order to improve the teaching mode of the sports teaching assistant platform in colleges and universities, a new idea of constructing a new type of sports teachingassistant information platform in colleges and universities based on Internet of Things and mobile platform technology is put forward by using Internet of Thingstechnology and wireless platform technology, which provides a new attempt to use advanced information technology to assist sports teaching. 1-3 In the past, most of the sports network teaching platforms was built by static database and staticresources. The sports network teaching platform based on the Internet of Thingsis directly oriented to the actual classroom teaching, collecting and storing real-time classroom data, which is a new change in the data acquisition of the original information-based assisted teaching mode. 4-6\nBased on the sensor technology of the Internet of Things, remote sensing technology can transform students or teachers in teaching activities into \u2018living points\u2019, which is equivalent to storing the process of movement in a digital way andpreparing data for future simulation, analysis and modelling. 7,8 This lays a data foundation for reproducing the actual teaching activity mode, and the transfor-mation of this mode is relative to the original teaching mode. The model is a new change. Dynamic self-feedback perfect system, using the data of teaching theory to build theoretical teaching reference model, using the dynamic data collected fromreal-time sports classroom teaching activities to build the model, will get the actualsports teaching model, and then compare the two models, will get the difference between the theory and the actual model, according to the difference, we can analyze the problem and guide the improvement. 9-11 Teaching activities, thus continuously self-feedback, iteration and improvement, will form the dynamicinteraction and feedback optimization of theoretical model and practical model. The data source of this dynamic model is collected by the Internet of Things, which is also the important significance of the Internet of Things and data modelling and simulation. From the actual teaching activities, evaluating the actual teachingactivities and improving the actual teaching activities, with the theoretical modeland the actual dynamic model, we can formulate a scientific and reasonableevaluation index system according to the actual situation, which can be used to guide and evaluate the teaching activities in each classroom, and even to carefully study the learning activities of each student, and grasp the important index data inthe teaching activities. 12,13 Speaking with the actual data and making improvement plans based on the actual data are very helpful to guide every student and every teaching activity.\n"
],
"Combining multiple clustering methods based on core group": [
"The algorithms for cluster ensembles consist of two main components. The first component is the method for generatingthe base clusterings (the individual clusterings in the ensemble), including the source of diversity (differences among the base clusterings). One possibility is to use different clusteringalgorithms for each base clustering. This is the approach in [9], where the goal is to identify \u201crobust\u201d clusters, each definedas a set of patterns that are in the same cluster in every baseclustering. When a single algorithm is employed to produceall the base clusterings, the most common choices includeexpectation maximization (EM), as in [10] and [11], andk-means, as in [12]-[14]. Both EM and k-means have the builtin source of diversity from different initializations and, as in [12], different numbers of clusters in the base clusterings. Otherpossible sources of diversity include different orderings of thepatterns for online clustering algorithms [15] and differentlinkage types for hierarchical agglomeration [16]. The differentsubsampling of data has been studied as well [17]. The relationbetween the degree of diversity and the quality of the combinedclustering is the subject of [18].\nFor this purpose, we introduce here the concept of core groups, defined as subsets of Xthat satisfy the following condition: Two patterns x iandxjbelong to the same core group if and only if \u03bbi=\u03bbj. L e t\u039bbe the set of all the different W ANG: CA-TREE: HIERARCHICAL STRUCTURE FOR CLUSTER ENSEMBLES 689 TABLE I CORE GROUP LABEL VECTORS FOR FIG. 1 label vectors in Xgiven the base clusterings. Since each core group has a unique label vector, the number of core groupsis simply /bardbl\u039b/bardbl, which is the cardinality of \u039b. It is easy to understand that /bardbl\u039b/bardbl\u2264N. The number of core groups depends on the distribution of patterns, as well as the algorithms andparameters used to generate the base clusterings. The procedurefor identifying the core groups from the base clusterings isincluded in the next section. We note here that the concept ofcore groups has been previously mentioned in [16], althoughthe method there for identifying the core groups is very differentfrom ours and is only designed to work with a particular clusterensemble algorithm.\n"
],
"A new position updating algorithm for moving objects": [
"The main contribution of the present paper is a Kalman filter method that can be used to estimate a moving object\u2019s position and velocity. The estimated values can be used to predict the moving object\u2019s future positions. For future predictions, most techniques use some mathematical formulas of motion derivedfrom recent movements. However, an object\u2019s movements are not simple. In this regard, recent studies have used objects\u2019 trajectory patterns for predicting their future positions. Jeung, Liu, Shen, and Zhou (2008) introduced a hybrid prediction model in which both mathematical formulas and trajectory patterns are considered. Wang, Wang, Wang, Lv, and Zhang (2006) proposed an asynchronous position updating algorithm for continuous queries regarding moving objects; the algorithm was found to improve accuracy, reduce the communication cost, and balance the server load more evenly. An example of a continuous query is \u2018\u2018All vehicles that appear in the region R in the next 10 min. \u2019\u2019\n",
"Updateofuser'spositionhasalsobeenaddressedin relationtodifferenttypesofqueries (location-dependent and continuousqueries). Wangetal. [4] proposeanasynchronous positionupdatingalgorithmthatenhancesthequery accuracy, reducesthecommunication cost, andbalancesthe serverload. Updatingtheuser'spositionisrelatedtosafe region, dependingonthetypeofcontinuousquery. Lamet al. [5] proposeanadaptivemonitoringmethod (AMM) for generationofupdates. Athresholdbounds, upperandlower, aredefined, andfromthemtheactualupdatethresholdfora movingobjectisgenerated.\n"
],
"The pan-cancer landscape of netrin family reveals potential oncogenic biomarkers": [
"For the use case with Smul TCan, we chose the netrin family of axon guidance molecules and their receptors from the DCC and UNC-5 families [21] since their involvement in cancer progression has gained recent attention. Netrins promote cell survival, proliferation, and differentiation; they are also involved in migration, invasion, and angio-genesis [22]. In a study of netrin mutation, expression, and methylation profiles across TCGA-PANCAN, several netrins were identified as potential diagnostic markers for endocrine tumors [23]. However, an approach focusing on multivariate, multi-cancer survival analysis remains unexplored with this gene family. The gene expression dataset of netrins and their receptors (netrins-receptors) consisted of the six netrins NTN 1, NTN 2, NTN 3, NTN 4, NTN 5, NTNG 1, and NTNG 2; the receptors of the UNC-5 family UNC 5 A, UNC 5 B, UNC 5 C, and UNC 5 D; and receptors DCC and DSCAM [21]. Altogether the dataset comprised twelve genes.\nFig. 6. Screenshot of the cumulative hazard plot of KIRC for DSS indicates distinct prognostic outcomes for low and high-risk sample groups. A. Ozhan et al. Computers in Biology and Medicine 137 (2021)1047938 findings indicated unique HR GSs for this gene set, especially in neural and renal cancers such as LGG and KIRC (see also Supplementary Fig. S 10). In addition, the best subsets of netrins-receptors identified with lasso could significantly differentiate between high-and low-risk prognostic outcomes. Given the role of netrins and their receptors in the nervous system as axon guidance molecules, their involvement in LGG is expectable. Our results with kidney cancers KIRP and KIRC seem to corroborate those by Hao et al. [23], who studied the impact of netrins on survival via methylation studies, e. g., for NTN 4 and NTNG 1. However, our unique multivariate approach also allows for understanding the proportional contribution of genes within the netrins-receptors gene set on survival. Therefore, the netrins-receptors and their best subsets can be analyzed further experimentally in renal cancers and LGG to determine their biological role in these disorders.\n"
],
"A robust hierarchical clustering algorithm and its application in 3D model retrieval": [
"Benchmark [7], which costs too much and is not suitable for dealing with the numerous models or varying-time models. Because the core research in 3 D model retrieval is feature extraction, the work of organizing 3 D model database is not much. However, most of the existing methods for organizing 3 D model database are based on traditional clustering algorithm. Meanwhile, they do not take new incremental models into account. For example, [8,9] focus on agglomerative hierarchical clustering algorithm to classify 3 D models. The algorithm called ASCAR in [8] is developed based on CURE and a distance-based outlier detection, which can stop clustering automatically. CAOIC+Kmeans proposed in [9] makes use of outlier information and adopts the concept of core group to reduce the influence of parameters. However, the number of clusters still needs to be predefined. Based on above work, a platform with several traditional clustering technology is developed in [10]. But the clustering results are not very good. [11] proposes a new feature vector Multi-Fourier Spectra Descriptor (MFSD), augmented with spectra clustering, whose core work still is the feature extraction. [12] proposes a mutual information distance measurement to perform the similarity comparison of 3 D models. And the statistical learning algorithm, support vector machine (SVM) is employed to solve the 3 D model classification problem. III. A DYNAMIC CLUSTERING ALGORITHM BASED ON ARTIFICIAL IMMUNE SYSTEM\nIn this section, the algorithm, called AIS-based in brief due to space, will be compared with both ASCAR [8] and CAIOC+Kmeans [9]. Firstly, a general comparison will be drawn from their performance and characteristics as shown in Table III. Then the experimental comparisons are listed in\n"
],
"A service mining approach for time estimation of composite web services": [
"In [16], the performance of WSs at different levels (i. e., user level, network level, hardware resource level and software design level) is analyzed. The user level isthe one corresponding to the execution time of the WS, which is calculated by R. Angarita et al. / Electronic Notes in Theoretical Computer Science 302 (2014)5-2812 the mean value during a certain period by invoking the actual WS. T h e r ei sa l s o researchintheestimationoftheexecutiontimefor CWSs. I n [25], atimeestimation method for CWSsis proposed based on mining historical execution information of the component WSsin the dynamic environment. It takes into account the WS execution time and the network transmission time.\n",
"In this section, we characterize the CWS and the environment in terms of execution time and user outputs. Themodel considers the Qo S of component WSs, the execution state at the failure moment, and the effect of the recoverystrategies on the global CWS Qo S. We only consider execu-tion time in this study, but the model can be easily extended to consider other Qo S parameters. Qo S values describe nonfunctional WScharacteristics (e. g., executiontime, price, re-liability), but they can vary during the WS lifecycle. Hence, there exist several techniques to keep these Qo S parameter values as most as possible updated. Execution time estima-tion for WSs can be done with analytic, simulation, or test based techniques [10,11,24]. 4.1 Decision Criteria\n"
],
"An algorithm for Distilling the skeletons of the works of Chinese Calligraphy": [
"Skeleton Extracting We choose an effective thinning algorithm based on binary images, which fit for distilling the skeletons of works of Chinese calligraphy and paintings [44]. The skeletons produced by this algorithm nor only have good symmetry and connectedness, but it records the width of each pixel on the skeletons. This is an effective method except for pin lines which is not allowed. To solve this problem, we use the method in [35] to eliminate these pin lines as Figure 3 shown. (a) (b) (c) (d)\n",
"Image thinning means extracting the image skeleton. Skelet on has important image target geometrical features; if how to quickly get the non-distortion of the binary image is the applied premise, then we can analyze the shape of the image target, compress information, extract feature, pattern recognition and so on. Generally speaking, skeleton mainly has three characteristics: continuity, minimum width is 1 and centralness [7].\n"
],
"A division based composite service selection approach": [
"In order to improve the convergence performance and speed of the algorithms, researchers tried to combine the his-torical data with these methods. Zhang et al. [55] p r o p o s e da Int J Adv Manuf Technolnew selection approach by recording the execution information of composite services into a log repository and then discovering knowledge of execution sequential patterns by datamining. In [58], composite service selection was performed by taking the divided composite service dots as selection units, and the dots were the set of efficient composite service exe-cution instances extracted from the log repository. Kang et al. [13] presented a Web service recommendation approach based on users \u2019interests and Qo S preferences mined from the service usage history. Nevertheless, these studies just concentrat-ed on the impact of usage historical data on the service composition and service recommendation, only a few of whichcombined service prior knowledge with the design of algorithm model and searching strategy. In this paper, we divide the candidate service set based on apriority to reduce thesearch space.\n"
],
"An active and opportunistic service replacement algorithm orienting transactional composite service dynamic adaptation": [
"Accordingly, it is efficient to set up backup methods for the unstable/failed services and replace them with the backup ones when frequent service composition interruptions occur in dynamic network. A fresh concept of service interruption index and a framework for service composition recovery based on this concept is proposed [8,9]. According to the frequency and duration of service interruptions, they set up a comprehensive evaluation mechanism to reduce the frequency of the service interruptions and prolong the survival time of the service composition. However, because more paths of service composition have been selected, the load of the network has been increased. Zhou et al [10] have proposed a recovery strategy for service composition use backup services, but the study has not been able to point out the exactly ways to realize and further, also lacks of Qo S guarantee. Based on \u201ctransaction support\u201d, an active and opportunistic replacement algorithm [11] is put forward according to the feature of transaction of web service. However, the research only focused on simple situation of service composition interruption (e. g., the failure of single task), and could not well satisfy the actual demands of service composition recovery in dynamic network.\n",
"In the following simulation experiment, the e fficiency of the proposed method and another service substitution algorithms (Li et al. 2011; Yin, Zhang, and Zhang 2010) is compared. Assume that the service clusters have been built. Many services are generated by the method in the reference (Tan, Fan, and Zhou 2009). The experiments are implemented in C# programming language. The database of the simulation experiment is SQL Server 2008. A computer with 64-bit CPU, 2.59 GHz and 3.98 G memory is used.\nThefirst experiment compares atomic service substitution of the proposed algorithm with the one in the reference (Li et al. 2011), denoted by \u2018Li-Algorithm \u2019, the one in the reference (Yin, Zhang, and Zhang 2010), denoted by \u2018Yin-Algorithm \u2019, and the experimental results are shown in Figure 6.\n"
],
"XML clustering based on common neighbor": [
"In clustering XML documents, the problem of measuring structural similarity among XML documents was generally addressedin the literature from the perspective of tree/graph-matching (Lv et al., 2006; Flesca et al., 2002; Tagarelli and Greco, 2006;\n",
"Because of their rich and flexible format that can be widely used in various applications, the size of in formation collections increases rapidly. The data mining tech niques, which have used to apply with text of relational da tabase, need to be adapted and invented in properly XML min ing [20]. Basically, an XML document is known as semistructure data. This semi-structure data format can be seen as trees, the complex data structure. Based on trad itional data mining techniques, there are many researches s tudied, adapted and invented new paradigm and algorithm in XML mining for examples, decision trees in XML document s [21] [22], tree pre-processing [23] [24], tree clus tering [20] [23] [25], extracting association rules from XML da ta [26] [27] [28], and also tree pattern matching.\n"
],
"Strategy for community control of complex networks": [
"In Gaoet al. \u2019s study [39], a greedy algorithm has been developed to identify steering nodes which are sufficient for target control. Several further studies developed algorithms to identify steering nodes for target controllability by reducing the number of steering nodes or considering realistic constraints. Instead of using the greedy algorithm, Zhang et al. [52] developed an algorithm which elaborately rearranges the matching order of the nodes such that the required number of steering nodes for target control can be significantly reduced. The comparison results on model generated networks and real networks indicate that the proposed algorithm outperforms Gao et al. \u2019s algorithm [39]. Because the functions of network systems intensively depend ontheconnectionsbetweennodes, Liu et al. [53] investigated the target controllability of giant connected components of directed networks by selecting target nodes from giant connected components, which are the connected components of networks that have constant fractions of nodes in networks. In the study, the relationships between the number ofsteering nodes for controlling giant connected components and the parameters of model generated networks are explored. Piao et al. [54] considered controlling a subnetwork called target community of a complex network when the whole topological structure of the network is not available. They argued that though a target community is controllable with steering nodes identified by structural controllability analysis, determining input control signals that are able toachieveagivencontrolgoalcanbe verydifficult.\n"
],
"An observer deployment algorithm for locating the diffusion source timely in social network": [
"There have also been some works on onlinesensors selectiontobeabletotakeintoaccountthespreadevolvingdynamics [32,34,35]. However, weconsidersuchapproachto be beyond the scope of our study because it introduces new issues and challenges.\n"
],
"Altering indispensable proteins in controlling directed human protein interaction network": [
"Furthermore, among the key regulators in PCa, 17 regulators (ACCT 4, EIF 3 A, HSPA 5, NOP 2, RANBP 2, RPL 11, RPL 15, RPL 19, RPL 23 A, RPL 3, RPL 5, RPL 6, RPLP 0, RPS 11, RPS 8, RPSA and SNU 13) are essential genes responsible for survival in human [35,37]. Since the highest enrichment of essential genes was obser ved among the top 100 highest degree nodes, hubs (90%), largest number of kinless hubs (16 out of total 29) in PCa PPI network were also found among these top 100 hubs (Fig. 7). Therefore, the high degree nodes of PCa PPI network corresponded to majority of the kinless hubs and essential genes indicating their importance in the sustenance of the system and ultimately survival of the PCa cells [5,14]. Loss of such important kinless and connector hubs from the PCa PPI network corresponding to the essential gen es may induce a shift in the interactions between the proteins and ultimately disintegration of the functional/pathway modules rendering the PCa cells difficult to survive. Hence, kinless hubs are the most important hubs in PCa PPI network which could be p otential target candidates for development of drugs against PCa. Mutations in such modular hubs and dysregulation of their expressions could alter the interactions between proteins affecting the multi protein complex formations and signalling pathways dist urbing the dynamics of the system leading to tumour development in PCa and other related diseases [61]. Thus, these modular hubs could serve as the driver nodes of the PPI networks and perturbing them could induce changes in the network dynamics altering t he phenotypic characteristic of the biological system [62].\n",
"Bipartite networks describe various systems with complex structure, as diverse as social science, Internet technologies and bi-ological systems. In a pioneer work of 1941, a bipartite network was applied in the Southern Women dataset representing the par-ticipant of 18 women in 14 social events over 9 months [1]. In recent advances of social science, user-content bipartite networks are employed to identify in\ufb02uential users in online social networks [2,3]. Bipartite networks are particularly applied in Internet tech-nologies such as job-server load balancing in data center [4], de-tecting network traffic anomalies [5,6], and user-item interactions in recommendation systems [7,8]. In addition, bipartite networks successfully describe various biological systems ranging from drug-target interactions [9], gene regulation interactions [10], protein-protein interactions [11] and metabolic pathways [12]. In ecolog-ical systems, bipartite networks are widely employed to represent consumer-resource (such as prey-predator) and mutualistic (such as plant-pollinator) interactions within large ecological communi-ties and reveal critical mechanisms on the maintenance of biodi-versity observed in nature [13-18]. \u2217Corresponding author.\n"
],
"Analysis on key nodes behavior for complex software network": [
"The global evaluation results provide the engineers a preliminary understanding of the importance of software class from one side. However, the results of global evaluation are not sufficient considering a complex and uneven software structure. At the local aspect, the importance of the software class depends on the interactions between classes and their neighbors and the complex nature of the class. The metrics are defined as follows: z Strength [17], denoted as s, measures the number of nodes that directly depend on node i, which represents how strong the class i connected with other classes.\n",
"With complex network theory, the investigation of software system is based on the exploration of components and their dependencies called components dependency network (CDN) [32]. Researchers make contribution to the model and topological analysis of software network in different granularities: packages, class, method or function level and multi-granularity with the combination of the three mentioned above [32]-[34]. The structure of software network has beenstudied with the utilization of community analysis combined with functionality in different software and multiple versions of the same software [35], [36]. The software executionprocesses have been first modeled as evolving complex network [1] and been promoted by analyzing the key node behavior in software execution network [37]. Except for the research mainly focusing on some specific software, the study of OS in complex network perspective has also been performedrecently. Yan et al. [38] analyzed the topology and evolution of Linux-kernel network compared with genomes by building the hierarchical structures. Two new network growth modelswere developed by Zheng [39] to analyze Gentoo Linux whose network cannot be easily explained by existing complex network models. The investigation of Gao et al. [40] shows that critical nodes in kernel\u2019s complex network tend to have large in-degree by conducting percolation analysis to the whole network. The coupling relationships between Linux\u2019s components were uncovered by Wang et al. [41]. They also compared networks extracted from normal and failurestatus of Linux OS to perform effects analysis of system failures on networks. Xiao et al. [42] studied the evolution of 62 major releases of the Linux kernel by utilizing complexnetwork theory and analyzed changing events among versions in network perspective. III. P RELIMINARIES\n",
"Zhang et al. 13 recognized the node which had the maximal value of the betweenness or the clustering coe\u00b1cient. The betweenness represented the highly active role for one node and the clustering coe\u00b1cient represented the hierarchical structure of the entire network. They found that some nodes played an important role in the execution process of the software system and had an important signi\u00afcance to ensure the software quality. Wang et al. 14 de\u00afned the in\u00b0uential node by its in-degree and outdegree value. They proved that the important node was closely related to the fault propagation in the software network. These nodes with high value of degree had a strong ability of connectivity. So the fault information could widely spread through G. Huang et al. 1650085-2\n",
"Complex networks modeling software are no exceptions to this regard: it was shown in a wide variety of cases that they have the same properties as other real-world complex networks, see for instance [18,21,7]. As a consequence, a large stream of studies were devoted to the characterization and descriptionof software networks with the tools of network science. Some of these works go further: they use network statistics to estimate the complexity of softwaresource codes, theirlevelofmodularity, theirrobustness, orevento detectpiecesofcodesthat should be separate modules [14,23,16,11,19,10,13,7,24]. Complex Networks and Link Streams 45\n",
"Using the typical methods in network, we can extract useful information in software analysis [8]. In the function call network, circulation path usually corresponds to recursion relation, and in the flow chart network, it corresponds to loop structure.\n"
],
"Altering control modes of complex networks based on edge removal": [
"Onecanfurtheridentifytheroleofanodeincontrol, node jforexample, bytemporallyremovingnode j\u2212andall itslinksinthebipartitenetworkandredothemaximummatching [6,9,10]. Ifnode j\u2212isoriginallymatchedandits removalreducesthenumberofmaximallymatchednodes, node jisredundant, asnode j\u2212needstobealwaysmatched 2 X. Yu, Y. Liang, X. Wang et al. Physica A 572 (2021)125868 intheoriginalmaximummatchingconfiguration. Correspondingly, node jdoesnotneedanyexternalcontrol. Itisselfcontrolledoncethesufficientsignalsareputonothernodesinthenetwork. Thefractionofredundantnodes nris determinedbythedegreedistribution. Ithasbeenfoundthat nrwouldfollowabifurcationdiagramwhentheaverage degreeincreases [6,9,10]. Inthiswork, wemainlyfocusonthebehaviorof nr.\nTobringmoresupportstoourproposedmeasure, wecalculatethe Sout/Sinof 105 realnetworks. Wealsomeasurethe nroftheoriginalnetworkanditstransposednetworks (denotedby n T r). If nroftheoriginalnetworkfollowstheupper branch, thenitwouldfollowthelowerbranchinthetransposednetwork. Hence, \u2206nr=nr\u2212n T rcanbeusedtogaugeif thenetworkisontheupperorlowerbranchofthebifurcation [6,9,10]. Thepredictionwemade, whichisalsosupported bynumericalresultsofmodelnetworks, isthat \u2206nr>0 when Sout/Sin<1 and \u2206nr<0 when Sout/Sin>1 (Fig. 7).\n"
],
"Altering control modes of complex networks by reversing edges": [
"Previousworkshavediscoveredabimodalityfeatureincontrol [6,8-10]. Forneutralnetworkswithsymmetricoutandin-degreedistribution, whentheaveragedegree \u27e8k\u27e9ofthenetworkexceedsacriticalvalue kc, nrwouldfollowa bifurcationdiagram, whichissimilartothepitchforkbifurcation [11,12]. Thebifurcationindicatestheexistenceoftwo distinctcontrolmodes. Inonemodel, correspondingtotheupperbranchofthebifurcationdiagram, mostnodesinthe networkareredundantandonlyafewnodescanbeincludedinthe MDS. Intheothermodel, nrisonthelowerbranch ofthebifurcationdiagramandmostnodescanparticipateincontrol. Onthecontrary, fornetworkswithasymmetric out-andin-degreedistribution, thebifurcationwouldbeeliminatedintoasinglebranch. Specifically, anetworkwhose out-degreedistributionismoredivergentthanitsin-degreedistributionwillstayonthelowerbranch, andviceversa.\nOnecanfurtheridentifytheroleofanodeincontrol, node jforexample, bytemporallyremovingnode j\u2212andall itslinksinthebipartitenetworkandredothemaximummatching [6,9,10]. Ifnode j\u2212isoriginallymatchedandits removalreducesthenumberofmaximallymatchednodes, node jisredundant, asnode j\u2212needstobealwaysmatched 2 X. Yu, Y. Liang, X. Wang et al. Physica A 572 (2021)125868 intheoriginalmaximummatchingconfiguration. Correspondingly, node jdoesnotneedanyexternalcontrol. Itisselfcontrolledoncethesufficientsignalsareputonothernodesinthenetwork. Thefractionofredundantnodes nris determinedbythedegreedistribution. Ithasbeenfoundthat nrwouldfollowabifurcationdiagramwhentheaverage degreeincreases [6,9,10]. Inthiswork, wemainlyfocusonthebehaviorof nr.\nTobringmoresupportstoourproposedmeasure, wecalculatethe Sout/Sinof 105 realnetworks. Wealsomeasurethe nroftheoriginalnetworkanditstransposednetworks (denotedby n T r). If nroftheoriginalnetworkfollowstheupper branch, thenitwouldfollowthelowerbranchinthetransposednetwork. Hence, \u2206nr=nr\u2212n T rcanbeusedtogaugeif thenetworkisontheupperorlowerbranchofthebifurcation [6,9,10]. Thepredictionwemade, whichisalsosupported bynumericalresultsofmodelnetworks, isthat \u2206nr>0 when Sout/Sin<1 and \u2206nr<0 when Sout/Sin>1 (Fig. 7).\n"
],
"Web service community discovery based on spectrum clustering": [
"It is very necessary to manage Web services through applications which are based Web services such as service discovery [2-3], service composition [4-5], and so on.\n",
"Web service document is obtained, then they are used as the input of the SVM for classi /uni FB 01 cation prediction. Web Services Classi /uni FB 01 cation with Topical Attention Based Bi-LSTM 401 [3] Bi-LSTM: We regard the last hidden vector from Bi-LSTM as the service document representation. Then we feed it to a linear layer whose output length is the number of categories. Finally, a softmax layer is added to output the probability distributions of all candidate categories. [4] Tag-Bi LSTM: On the basis of Bi-LSTM, it exploits tags as auxiliary information of the Web service description document to perform Web services classi /uni FB 01 cation. The tag contains a wealth of information about the Web service features, which helps improve the result of service classi /uni FB 01 cation. [5] LAB-Bi LSTM: The proposed method in this paper, which incorporates topic modeling into the Bi-LSTM architecture through an attention mechanism for service classi /uni FB 01 cation. 3.4 Experimental Results\n",
"With the advent of Web 2.0, new information technologies represen ted by service computing [1], cloud computing [2] and big data [3-5] have been developed rapidly. As a most technology in Service-oriented Architecture (SOA), Web services have also made great progress. Even though Web services published on the network is increasing gradually, these Web services are difficult to meet people's complex functional requirement. Web services are online application services published by organizations to perform certain functions and be designed online to support the interaction between interoperable machines on the network with low cost and high efficiency [6]. There are many Web service description languages, for example, WSDL [7], OWL-S [8], RESTful structured API services, and so on. It has become a challenge problem to immediately and exactly discover the user-desired Web services, and it also becomes more difficult for softwa re developers to discover and reuse service resources effectively. According to the survey, it is very essential to perform the management of Web services through the technologies of service discovery [9], service composition [10]. The classification of We b services with similar functionality is an important approach to support service management, which can extremely decrease the search time and the scope for Web services, thus promote\n",
"Qo S parameters of a service composition is by dynamic service selection at run time [7], [8], [9], [10], [11], [12], [13]. In a dynamic service composition a set of functionally equivalent services exists for each service invocation and actual services are incorporated into the execution configuration depending on their most recent Qo S parameters. However, two dominant issues limit the application of dynamic compositions on a larger scale: service selection and detection of equivalent services. Since service selection at run time is bonded by additional constraints, like statefullness and composability [7], statebased reliability models need to be applied. However, such models are prone to state explosions [14], making it difficult to support more complex compositions. The other commonly used approach treats service selection as an optimization problem. However, such optimization problems are complex [15], often NP-hard, which limits their use. Moreover, discovering and maintaining sets of functionally equivalent services, i. e. homogeneous service communities [8], [16], [15], also proves to be a nontrivial task [17], [18], [19].\n",
"Recently, these efforts have been extended to semantic community and network discovery of resources that are used to analyse, visualise and explore such community and network for potentially improving the resource discovery process. The article addresses the problem of how to discover web services community formed by closely interactive web services (zhang et al., 2009). They proposed a novel approach which constructs a web service execution network from logs and clustering it using spectrum clustering. Semantic link network (SLN) is a loosely coupled semantic data model for managing web resources. Its nodes can be any types of resources. Its edges can be any semantic relations (Zhuge, 2012). By studying the intrinsic relationship between semantic communities and the semantic space of SLN, approaches to discovering reasoning-constraint, rule-c onstraint, and classificationconstraint semantic communities are proposed (Zhuge, 2009). Approaches m ake use of the semantic communities and the emerging semantic relations in a dynamic complex network of learning resources to support effective learning. Xia and Bu (2012) proposed an approach to community detection based on a semantic network. They constructed a semantic network using the semantic information extracted fro m comment content, which attaches nodes and sides w ith computer-understandable semanteme. And they considered the impact of the weight on every edge and focused on the \u2018giant component\u2019 of the online social network to redu ce computational complexity.\n"
],
"Input graph: the hidden geometry in controlling complex networks": [
"Section S 3.2). Unfortunately, enumerating all maximum matchings is extremely expensive from a computational perspective\u2014a network with a couple dozen species has several hundred million unique max-imum matchings. To solve this problem, we employ a recently-de-veloped algorithm that reveals the control correlations between the nodes in the graph while requiring considerably less computational resources (Zhang, Lv, & Pu, 2016). Using this algorithm, we are able to identify species that are possible control inputs\u2014those that belong to the minimum driver-node set in at least one of the possible control configurations. Here, we extend this algorithm such that it is possible to calculate a highly accurate approximation of the control capacity \u03d5 of every species in the network (Supporting Information Section S 3.3). In the networks that contained reciprocal links (because there was no asymmetry in the dependences of a species pair), we averaged a species' control capacity \u03d5 across every possible \u201cnon-recip-rocal\u201d version of the network (Supporting Information Section S 3.4).\n",
"Role of nodes and edges in network controllability. When generating a minimum control configuration, it is the maximum matching that dictates the unmatched nodes that receive direct input at the base of the stems of the corresponding cacti (these nodes have often been called \u201cdrivers\u201d). Alternative minimum control configurations can be generated by sampling or enumerating alternative maximum matchings 4,5,12 or by using the notion of the input graph to find alternatives 9. Because this has largely been an approach based on sifting through the degeneracy, a number of studies have quantified the role that certain nodes and edges play in controlling the network. The DCS concept concisely explains these categories because DCSs are by definition mutually disjoint.\nInput graph. The input graph of a directed network is defined as an undirected graph with the same nodes and whose edges encode alternative driver nodes that make the directed network structurally controllable 9. It is built by connecting two nodes xi and xj in the input graph if there exists a node xk which has an edge (in the original network) to both xi and xj, where one node is matched and the other is not. Doing so requires effectively enumerating through the alternative matchings. The utility of this construction is that it identifies interchangeable driver nodes and with an extension identifies nodes that are never selected as driver nodes across all possible maximum matchings. cv=\u03c7v \u03c7=\u03c7D 1 \u03c7D 2... \u03c7 Di\u22121 \u03c7Di, v\u03c7Di+1... \u03c7 Dk \u03c7D 1 \u03c7D 2... \u03c7 Dk=\u03c7Di, v \u03c7Di, 9\n",
"Interferon regulating gene 83%Ackerman et al. BMC Bioinformatics (2019)20:297 Page 11 of 13 nodes for the bipartite representation of the network, ND, is found using a maximum matching algorithm such as Hopcroft-Karp [36]. For all ND, control adjacent nodes were identified iteratively and an input graph was created as dictated in Zhang et al. [72]. The input graph was used to classify nodes as critical (in all minimum input sets), neutral (in some minimum input sets), or redundant (in no minimum input sets).\n"
],
"A self-healing composite web service model": [
"The composite service problem is recognized to exhibit a dynamic nature particularly when the services are provided by Web services [11], [12], [39], [40]. A self-healing approach is presented in [41], [42] to correct the selection of the composite service. Another self-healing algorithm that bases the selection on a performance prediction scheme is proposed in [39]. The changing conditions of the environment are also addressed in [40], where a guided heuristic based on clustering techniques is proposed and evaluated in a set of experiments. The stochasticity inherent to the composite service problem is modeled as a multi-objective program in [43], where the results show the potential risk of simplifying the problem to the commonly used deterministic model. 8 C ONCLUSIONS\n",
"Considering response time and cost enables the selection of faster and cheaper services, providing a competitive advantage [6]. Both parameters have been used in other approaches, like those presented in [3,10,22,24].\n",
"Composite services often run in dynamically changing environments [24]. In this paper, we consider the following types of dynamic changes that may occur during the execution ofcomposite services. 1) The loss of Qo S caused by the variability issue of Web services\u2019 Qo S attributes [25]. 2) The unavailability of Web services due to many reasons such as network failure or user mobility. 3) The detection of new Web services with better Qo S and recoverability are detected. 4) The addition of new composition needs as well as ATS and SLA constraints.\n",
"The importance of self-healing approaches and the role they play in fault detection, system recovery, adaptivity, reliability, security, sustainability, complexity and cost reduction, to name a few, have been pointed out by many researchers [3]-[8] during their attempt to develop self-healing strategies, models and methods for a specific domain. Ghosh et al. [7] in their survey and synthesis of Self-Healing Systems (SHS) presents a classification of existing work. The outcome of this survey is that the structure of SHS is quite modular (classification with respect to failure detection, system recovery and maintenance of health). SHS are based on different paradigms of computational models such as architectural models, repair plans, agentbased models. So, the opportunity and inspiration to develop systems based on new sources is substantial. Since there is a high demand for new sources, we sought to establish a novel methodology for modeling SHS based on the formal speci-fication by predicate logic of a given system. Therefore, we present in this section the related literature based on existing formal methods for modeling multi-disciplinary systems like the smart grid and self-healing approaches based on formal specification.\n",
"In [5], a Markov Decision Process is used to take the best action due to quality of services (Qo S) failure of the provider service, by initiatin g service replacement earlier to minimize losses to the consumer from the viewpoint of 2014 IEEE 11 th International Conference on e-Business Engineering 978-1-4799-6563-2/14 $31.00 \u00a9 2014 IEEE DOI 10.1109/ICEBE. 2014.331392014 IEEE 11 th International Conference on e-Business Engineering 978-1-4799-6563-2/14 $31.00 \u00a9 2014 IEEE DOI 10.1109/ICEBE. 2014.33139 the consumer. And only sequential structure is considered. In [6], a self-healing model integrates the flexible compensation service in selection and reselecting in execution, mining cascading scope of replacement in advance by considering full multi-relation among transaction web services. In [7], the structural and temporal dimension are considered. Th is approach proposes to analyze the fault-impact temporal region, where the potential candidates are identified to assist in reducing the number of service changes in handling the fault situ ation. In [8], the paper focuses on the faults diagnosis. The diagnosis algorithm is based on the error dependent matrix by the computation of the error propagation degree. And the diagnosis is based on fuzzy r easoning. In [9], a long-term benefit driven adaptation tec hnique is presented. A stateaction-revenue model is designed to capture the causeeffect dynamics in adaptation. The adaptation decision problem is modeled as a sequential decision problem realized by the POMDP. For the Qos-aware service processes, [10] proposes to select the supplementary services during composition to be the backup source for efficient recovery. A reconfigurati on region including all failed services is identified to mini mize the reconfiguration cost.\n",
"Although several ways could be used to improve the reliability of the whole workflow, including changing single web services\u2019 reliability, re-adjus ting the workflow structure in runtime [1], reselecting services in execution [2], re-configuri ng the service pool appears to be a practical way among them. Service pool, formed up by a number of services with the same functionality and similar quality properties [3], is a common fault-tolerant mechanism in service oriented architecture to improve the service reliability. An example of serv ice-oriented workflow with service pools is shown in Fig. 1. As shown in Fig. 1, S 11 is selected as the primary service working in the workflow and S 12, S 13 are registered as the hot stand-by backups for S 11. Thus S 11, S 12 and S 13 form a service pool in the workflow, S 41 and S 42 form another service pool and so is the case for S 61, S 62, S 63 and S 64. \n",
"The technique in [5] uses dynamic description logic (DDL) to describe the preconditions and effects of web services. The DDL-described preconditions and effects are transferred into diagnosing processes embedded in a BPEL execution environment. When a web service is executing and a diagnosing process identifies mismatched predictions or effects, the web service will be isolated for path healing. The technique in [11] proposes a heuristic algorithm to evaluate whether the remaining resources (e. g., time and budget) are sufficient for the un-finished sub-path. If the answer is negative, the unfinished sub-path should be re-planned. The technique in [2] predicts seven types of web service failures and proposes solutions to recovery the failures. The recovery solutions are embedded in the BPEL execution environment. The tech-nique in [12] uses semi-Markov model to predict the performance of an executing web service. If the predicted performance failed to fulfill the Qo S criteria, the re-planning process is initiated. In this case, the re-planning process and the web service path are executed simultaneously, which reduce the time of re-planning. The tech-nique in [3] proposes that the failure of an executing web service may cause other executing ones to fail because more than one close related web service of a path may be executed in parallel. It uses direct compensation dependency (DCD) and indirect compensation dependency (ICD) to identify the scope of web services affected by a failure. It then reselects web services to replaces those within the scope. The reselec-tion should offer the least compensation. The technique in [4] belongs to the WS-DIAMOND project [13]. It proposes a Qo S-driven, connector-based proactive healing 672 S.-C. Chou and C.-Y. Chiang architecture. Connectors can intercept the message sent to a web service and add Qo S requirements to the message. After the execution of a web service, the Qo S values are recorded in a log. The diagnostic engine periodically checks the log. When a possible violation of Qo S criteria is detected, the technique initiates a re-configuration me-chanism to self-heal the web service path. The technique in [14] compares the exact Qo S values with the estimated ones. When a loop of web services is executing, the exact Qo S values of the web services are used to evaluate whether the loop may vi-olate the Qo S-based SLAs. If the answer is positive, the technique uses an algorithm to identify the path slice (i. e. sub-path) that should be re-planned. The technique in [15] defines the scope of failed web service and embeds the scope in BPEL. When failure occurs, the BPEL engine initiates the healing process for the scope, which selects new web services to replace the failed ones. During the selection, the local and global business rules are used to filter out infeasible web services. 3 TLPRP TLPRP is depicted in Figure 1. The dotted blocks are the meta and physical level replanning mechanisms, respectively. Below we explain the figure.\n"
],
"A composite web services discovery technique based on community mining": [
"Web service field. Kil [19] used a real-world dataset to analyze the topological landscape of Web service\u2019s networks and concluded that the network exhibited small world network and power-law-like structure to some extent. Zhang et al. [20] analyzed the logs of an execution engine to elucidate the Web service community and combined the closely interactive\n",
"Research has also examined the management method of the cloud service community from the perspective of data mining. For example, in [15] the authors discover the web service community structure by mining the execution history information of the web service. In the SOA system, the interaction between the services to generate a certain interaction relationship, through the mining of such historical call and execution information, to achieve the discovery of the community structure of the web service, the author uses a method of spectral clustering to discover the service community.\n",
"In fact, the invoking relationship can be considered as the collaboration among the web services combined to provide with output parameters. To assess the invoking reputation, it depends on community structure in WSCN, the vertex\u2019s importance analysis has been studied in community structure [12]. In addition, the community structure is applied into the web service field. Literature [13] used the real-world data set to analyze the topological landscape of web service\u2019s networks, and concludes that the network exhibit small world network and power-law-like to some extent. Literature [14] analyzed the logs of the execution engine to discover the web service community, and applying these closely interactive web services into composition process. Literature [15] presentda novel web service management based on collaboration network, where the network is undirected and has a weighted edge. In addition, it introduces some metrics to re\ufb02ect the web service properties. Distinguish to the above construction method of WSCN, the community detected in our WSCN can effectively re\ufb02ect the collaboration relationship, and is more suitable for our reputation evaluation. III. T RUSTWORTHY WEBSERVICE MODEL\n",
"Some works have been done towards addressing the problem of how to discover and quantify Web Service s community. For insta nce in [11], Zhang et al. proposed an approach that forms WSC by grouping closely interactive\nA consensus of many researche rs ( [10], [11], [12], [13], [14]) was that the Web Service community concept is basically created in order to sustain high availability of its\n",
"Some works have been done towards addressing the problem of how to discover and quantify Web Services community. For instance in [8], Zhang et al. proposed an approach that forms WSC by grouping closely interactive Web\n",
"Evidently, to realize above service composition model, the intelligent service discovering and selecting mechanisms are indispensable for considering that the quality of the service composition result much depends on the selected concrete services. Some current service discovery solutions mainly rely on the technologies like Ontology [14], lexical database (e. g. Word Net), and data mining [15]. In order to complete our proposed model, a further investigation for the service discovery will be performed in our future research by referring to these technologies. 6. CONCLUSION AND FUTURE WORK\n"
],
"QoS-driven transactional web service reselection for reliable execution": [
"Service. In [10], authors proposed a Qo S-driven transactional service reselection model for reliable replacement. The model reselects Web Services not only according to their\n",
"A set of mapping rules are then defined to propagate the changes, and a Petri-net-based model is used for change reaction. Ontologies are used for locating services from an exploratory service space. Agents are employed to assist in detecting and managing changes to the enterprises. To add self-healing capabilities to services-based systems, Mikic-Rakic et al. [2002] presented some architectural requirements. It is stated that adaptability, awareness, autonomy, dynamicity, and robustness are essential for services to achieve a self-healing architecture style. We have addressed these concepts atdifferent levels in our model. In Friese et al. [2005], self-healing execution of business ACM Transactions on Internet Technology, Vol. 15, No. 2, Article 7, Publication date: June 2015. ii ii i ii i Bottom-Up Fault Management in Service-Based Systems 7:29 processes is supported through a configurable add-on layer to the BPEL execution engine, which focuses on service replacement in case of faults. Other similar works that primarily employ service replacement include Salas et al. [2006], Yin et al. [2010], Nasridinov et al. [2012], Li et al. [2012], and Ben Halima et al. [2008]. Similarly, in Naccache and Gannod [2007], the notion of differentiated services in order to maintain required performance characteristics is presented, while Sadjadi and Mc Kinley [2005] use autonomic computing concepts to address self-management of compositesystems. The latter\u2019s work focuses on creating two different but equivalent Web services (images from a surveillance system) for supporting fault-tolerance. Liao et al. [2004] also use autonomic computing to manage self-healing in service compositions. Their system is agent-based, where the federation of agents is employed to \u201csimplify the control of Web services composition, sharing, and interaction. \u201d Similar to our work, their proposal includes some rules for the selection of alternative Web services in caseswhere the current Web service is nonresponsive. Maximilien and Singh [2004] also de-fine a similar agent-supported system for Web services selection. Other research into self-healing systems ranges from total server failure scenarios [Candea et al. 2003] totransactions-based models for systems recovery of failed systems [Eddon 2005].\n",
"Web Service availability and present a framework, WS-Replication which usesmulticast to communicate with the replicas of Web Service. In [7], authors proposed a Qo S-driven transactional service reselection model for reliable replace-ment. The model reselects Web Services not only according to their Qo S characteristics but also to their transactional properties. Mentioned approaches [5,7] provide recoverability by reselecting the failed services. In order to reduce number of reselections, in [2], the authors proposed a method to predict the Qo S and performance of Web Services in composition, in order to backup alternateservices in selection and then reselect in execution. In their paper, they used asemi-markov model to predict the data transmission speed of the network which Web Services are run. While this approach is novel, it is applied to only one of themetrics of Qo S and assumes equivalence of services in terms of functionality.\n"
],
"Efficient target control of complex networks based on preferential matching": [
"However, it is often infeasible or unnecessary to fully control the entire large-scale networks. For example, in aneconomic network, we may only aim to control a certaingroup of individuals who are essential to our task, rather than controlling all the individuals. Therefore, the control of a prescribed subset of each layer is motivated. This problemis termed target controllability of multiplex networks. To the best of our knowledge, target controllability of multi-plex networks remains largely an outstanding challenge facedin various real-world applications in the areas of biology, chemical engineering, and economic networks [29]. Although target controllability of single layer networks has been studied [29]-[34], the results are hard to be extended to multiplex networks. This is because in multiplex networks, the controlleris usually limited to interact with only one layer, while the target controllability of nodes in other layers also has to be guaranteed.\n",
"Traditionally, we assume that the topological structure remains static and then study its influence on the net-work dynamics 1-6. However, the topological structure is not necessarily static. One of the relatively new research themes is how dynamic processes affect on the evolution of topological structures. In particular, the controlla-bility of a complex network, which we assess by determining whether the network can be driven from any initial state to any desired state within a finite time, is considered one of the focal research subjects and has been studied massively. Liu et al. convert the problem on structural controllability into a graph maximum matching problem 6 and propose the scheme of calculating the number of minimum driver nodes. Yuan et al. later expand the results to undirected networks 7 and propose the exact controllability theory. Y an et al. analyse the observational uncer-tainty and the energy required for control and achieve a series of fundamental yet important theories 8. Prof\u00e1i applies structural controllability to temporal networks 9. Additionally, Gu et al. study the application of network controllability to brain networks 10. These problems address how to control the whole system efficiently. However, for certain large systems, we only need to control some target nodes of the network that are associated with spe-cific tasks. Thus, Gao put forward an approximate algorithm to calculate the fewest driver nodes that can be used to control the target nodes, which effectively reduces the usage of driver nodes 11. Zhang et al. make improvements to this algorithm 12. The above theory pertains to the dynamic process on the nodes.\n",
"Target controllability is an NP-hard problem [24], and a number of improvements has been proposed to find the minimum number of dri-vers for target control [ 25-28]. In the process of target control, some nodes besides the targets might be unwillingly controlled. Actually, one often needs to have some mediator nodes between the drivers and targets. The mediators are also controlled by the drivers, and the smaller the number of the mediators is, the fewer the unwillingly controlling actions will be. In many cases, it is important to reduce the side effects through ensuring that the network states are transferred to the desired states by minimizing the number of the mediators. In this manuscript, we aim to present a computationally efficient algorithm for target control, which will make sure that the number of the mediators is kept minimal. The proposed algorithm is based on the length of paths between the nodes and meets the Kalman's controllability rank condi-tion Thus, it does not have the restrictions of the structural controll-ability. Structural controllability is when matrices are structural, i. e. their elements are either fixed zeros or independent free parameters.\n",
"Structural analysis is a widely used framework for the analysis of large-scale LTI dynamical systems for solving various controltheoretic problems. Specifically, structural analysis is used in optimal input selection [8], [9], [10], optimal feedback selection [11], [12], [13], and optimal selection of interconnection pattern [14], [15] before. Target controllability and actuator selection for target control are addressed using structural analysis in many papers. A greedy algorithm is proposed in [2] to find an approximate solution to the minimum input selection for target control. Note that, though [2] applied the greedy algorithm on various real-time systems, it does not provide any theoretical guarantee for the performance of the algorithm. A preferential matching based algorithm is given in [16] for finding the input nodes for target control: this algorithm is experimentally shown to perform better than the greedy algorithm 201918 th European Control Conference (ECC)\n",
"In Gaoet al. \u2019s study [39], a greedy algorithm has been developed to identify steering nodes which are sufficient for target control. Several further studies developed algorithms to identify steering nodes for target controllability by reducing the number of steering nodes or considering realistic constraints. Instead of using the greedy algorithm, Zhang et al. [52] developed an algorithm which elaborately rearranges the matching order of the nodes such that the required number of steering nodes for target control can be significantly reduced. The comparison results on model generated networks and real networks indicate that the proposed algorithm outperforms Gao et al. \u2019s algorithm [39]. Because the functions of network systems intensively depend ontheconnectionsbetweennodes, Liu et al. [53] investigated the target controllability of giant connected components of directed networks by selecting target nodes from giant connected components, which are the connected components of networks that have constant fractions of nodes in networks. In the study, the relationships between the number ofsteering nodes for controlling giant connected components and the parameters of model generated networks are explored. Piao et al. [54] considered controlling a subnetwork called target community of a complex network when the whole topological structure of the network is not available. They argued that though a target community is controllable with steering nodes identified by structural controllability analysis, determining input control signals that are able toachieveagivencontrolgoalcanbe verydifficult.\n"
],
"An efficient algorithm for finding all possible input nodes for controlling complex networks": [
"Control 25,1192-1196 (1980). 18. Mayeda, H. On structural controllability theorem. IEEE Trans. Autom. Control 26,795-798 (1981). 19. Poljak, S. On the generic dimension of controllable subspaces. IEEE Trans. Autom. Control 35,367-369 (1990). 20. Murota, K. & Poljak, S. Note on a graph-theoretic criterion for structural output controllability. IEEE Trans. Autom. Control 35,939-942 (1990). 21. Uno, T. Algorithms for enumerating all perfect, maximum and maximal matchings in bipartite graphs. In International Symposium on Algorithms and Computation, 92-101 (Springer, 1997). 22. Uno, T. A fast algorithm for enumerating bipartite perfect matchings. In International Symposium on Algorithms and Computation, 367-379 (Springer, 2001). 23. Tassa, T. Finding all maximally-matchable edges in a bipartite graph. Theor. Comput. Sci. 423,50-58 (2012). 24. Ghosh, S. & Ruths, J. Structural control of single-input rank one bilinear systems. Automatica 64,8-17 (2016). 25. Wang, B., Gao, L. & Gao, Y. Control range: A controllability-based index for node significance in directed networks. J. Stat. Mech.\n",
"Duringrecentyears, therewereasignificantamountofworksaddressingthecontrollabilityofcomplexnetworks, ourability todriveanetworkfromanyinitialstatetoanydesiredfinalstatewithinfinitetime [1-3]. Ageneralframeworkbased onstructuralcontrollabilityoflinearsystemswasfirstproposedtoidentifytheminimumsetofdrivernodes (MDS) [4], whosecontrolleadstothecontrolofthewholenetwork. Followingit, relatedproblemsunderthisframeworkwere alsoinvestigated, rangingfromthecostofcontrol [5-7] totherobustnessandoptimalofcontrollability [8-11], from themultiplicityfeatureincontrol [12-15] tothecontrollabilityinmulti-layerortemporalnetworks [16-19], andmore [20-23]. Thisframeworkwasalsoappliedtodifferentrealnetworkedsystems, suchasfinancialnetworks, powernetworks, socialnetworks, protein-proteininteractionnetworks, diseasenetworks, andgeneregulatorynetworks [24-31]. Inthe meanwhile, researchondifferentdirectionsofcontrolwerealsostimulated, suchasedgecontrollability [32,33], exact controllability [34,35], strongstructuralcontrollability [36] anddominatingsets [37-39], whichsignificantlyadvancedour understandingonthisfundamentalproblem. \u2217Correspondingauthor.\n"
],
"Research on service selection approach based on composite service execution information": [
"Although Qo S-aware Web service selection has been made a lot of research results, but with the complexity of Web services application development, still need to conduct in-depth study: \u2022 Trusted service selection. The openness of network environment, the autonomy of service providers, and the evolution of services etc. such characteristics lead to a large number of service unavailable. For example, Lu [32] got the following conclusions by a large number of experiments from 11 research team: the available rates of services registered in UDDI and Google below 55%. As in the search, people are willing to look more authoritative (more general, more reliable) resources, rather than finding the most appropriate or the best match results, therefore, how to depict the needs of service selection from trust, and to form trusted service composition, will become an important research trend. \u2022 Service evaluation and selection supported multi-user collaboration. In the multi-user collaborative affairs, the Qo S assessment of a Web Service often requires multiple users to participate in and coordinate. The user Qo S value orientation, often with a certain personality preferences, therefore, the selection of services not only to meet the objective accuracy of the information service quality, but also reflects the subjective preferences of multiple users, for example, some users want to give priority to the response time, while some users will want to give priority to the implementation cost of such inspection, and so on. In this case, under multi-user collaborative environment, a Web Service selection process, in essence, is a multi-objective decision making process with multi-user participated. For these complex, collaborative context application environments, to conduct dynamic service selection based on multi-objective decision making in a multi-user environment is an important research trends. \u2022 Service selection support personalization. Together with the user preferences with the Qo S into service selection process, give full consideration to the user\u2019s personalization, such service selection algorithm, is still a future research priorities, which will broaden the scope of application of the algorithm. \u2022 Reliable service selection considering the dynamic changes of Qo S. On the one hand, conducting dynamically Qo S prediction, on the other hand, improving the robustness, stability, reliability and other aspects of performance of composite service combining the research of selection algorithm that during the operation the combined service should have fault tolerance ability and failure recovery ability. Now more and more scholars introduced the trusted computing research results to the service computing. \u2022 Service-correlation aware service selection. In the actual Web application scenario, the service Qo S may have a correlation with other service [31,33]. Therefore, the research on service-correlation aware service selection method is highly targeted and practical. \u2022 New mixed selection algorithms. Currently, the service selection algorithm use of hybrid selection strategy is still relatively small. Learning the merits of multiple algorithms to conduct algorithm fusion is an important way to improve the performance of service selection algorithm, there is still much work to do.\n",
"Zhang et al. (2008) proposed a co mposite services execution inf ormation-based service selection approach. This approach includes three steps: 1 Data producing step: recording the execution information of c omposite services into a log repository and extracting related datasets. 2 Data mining step: discovering path fork association rules and service execution sequential sequence patterns by data mining. 3 Service selection step: select ing proper services based on th e discovered knowledge.\n",
"Context-Aware\u201d interaction framework in pervasive computing. In algorithms, Sun [12] proposes a solution for distribution of multi-terminal business flow. Wang [13] etc. gives us the ways to promote the accuracy and reliability of mobile social web service selecti on, avoiding the blindness and randomness. In addition, there are also some researches containing self-learning [14-17]. Indeed, socially aware computing achieves many signi ficant results in mobile computing field.\n",
"As a key step of command business services combinat ion, command business services selection has been the precondition and important safeguard f or meeting individual command needs of command entity [1] [2]. For now, the common methods of Web services selection include local optimization based method and global optimization b ased method. However, we cannot select command business services in accordance with Web se rvices selection method because reliability and execution efficiency of command business services a re put particular emphasis in dedicated command information system [3]. So a method for com mand business services selection based on execution efficiency and reliability restraint put forward by introducing idea of software fault-toler ant in services selection stage. Within this method, re liability and execution efficiency had been seen as primary constraints and the threshold of execution efficiency would be set for every command business. Multiple alternative services to execute would be selected for each task in composite services process on the precondition of meeting the threshold constraints. Then reliability of composite services process can reach its highest le vel.\n",
"Under the adaptive web services selection process of information based on user constraints, when all the Qo S of specific services can't meet users' constraints, it is necessary to select the smallest degree bound constraint C (C including 313 C; n and Ch I) from the user's constraints, and then re-select the service work. The amount of relaxation constraints can be defined as follows.\n"
],
"Structural controllability of complex networks based on preferential matching": [
"A maximum matching in the bipartite graph can be solved by the Hoproft-Karp algorithm [41]. Then the MDS is corresponding to the nodes in Rthat are not connected to any matching edges (see Fig. 5 (b)). It can be verified that if each node in an MDS is actuated by an input control signal, which means adding a control node uifor each node iin the MDS, the resulting graph G (A, B) will have no dilation. Since MDSs of a network are not unique, Zhang et al. [42] proposed a preferential matching algorithm to identify MDSs that have a specific degree property. Zhang et al. \u2019s work provides an inspiration that the MDS can be selected with preference, by which realistic information can be taken into consideration.\n",
"Digital Object Identifier 10.1109/T AC. 2019.2908311 Fig. 1. A voltage divider circuit. parameterization of the pair (A, B) were studied in [5]-[12]. Results on structural controllability of linear time-varying systems were presented in [13]-[21]. One line of research deals with the structural controllability of composite systems [22]-[26]. Current interest stems from the realization that structural controllability is a key property of interest in swarming behavior and in the modeling and understanding complex networks [27]-[35].\n",
"Kalman 12 showed that a full rank of the controllability and observability matrices ensures the controllability and observability of a system. Lin 13 extended the notion of controllability to structural controllability and derived conditions that ensure the controllability of a network based only on the knowledge of connections without explicit information regarding their weights. Importantly, a minimum set of nodes, called controls, the stimulation of which would ensure full controllability of the system, can be obtained from the graph structure 14-17. Similarly, the smallest 1 Bernstein Center Freiburg and Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany. 2 Computational\n",
"Over the past few years there has been a resurgence of interest in the question of structural controllability posedby Lin in 1974 [1]. As defined by Lin, a pair of matrices (A n\u00d7n, bn\u00d71) with each entry either a fixed zero or a distinct scalar parameter is structurally controllable if there is a real matrix pair (\u00afAn\u00d7n, \u00afbn\u00d71) with the same pattern of zero entries as (A, b) which is controllable. Thus if (A, b) is structurally controllable, then almost every real matrix pair (\u00afA, \u00afb) with the same pattern of zero entries as (A, b) will be controllable. Lin was able to give an explicit graphtheoretic condition for such a matrix pair to be structurallycontrollable in terms of properties of a suitably defined directed graph determined by the given matrix pair. Lin\u2019s resultwas extended to multi-input matrix pairs (A n\u00d7n, Bn\u00d7m) in linear algebra terms by Shields and Pearson [2] andreexplained in graph theory terms by Mayeda [3]. Generic properties and design problems of Lin\u2019s parameterization ofthe pair (A, B) were studied in [4]-[8]. Results on structural controllability of linear time-varying systems were presentedin [9]-[12]. One line of research deals with the structuralcontrollability of composite systems [13]-[15]. Current in-terest stems from the realization that structural controllabilityis a key property of interest in swarming behavior and inthe modeling and understanding complex networks [16]-[19]. For example, identification of driver vertices or steeringvertices in biomedical networks [20]-[22], which tend tohave strong ability to in\ufb02uence other vertices, may enlightenus on critical underlying relations or mechanisms; the studyof robustness of structural controllability to vertex and/or arcfailures [23]-[25] may give us an insight in the evolutionof complex social networks and various issues of networksecurity [26], [27]. *This work was supported by National Science Foundation grant n. 1607101.00 and US Air Force grant n. FA 9550-16-1-0290.1 F. Liu and A. S. Morse are with the Department of Electrical\n",
"Fig. 2 Graphical representation of the NIO 1 matrix A Rendiconti Lincei. Scienze Fisiche e Naturali 1 3 corresponding node recurring often in NIO 1 MSDNs. Sev-eral techniques were used to perform this further analysis: an algorithm to enumerate all maximum matchings of a net-work (Uno 1997), preferential matching algorithm (Zhang et al. 2014) and \u201call possible input nodes\u201d algorithm (Zhang et al. 2017). The enumeration algorithm is so far the only one capable of enumerating all the MSDNs along with the corresponding maximum matching. Conversely, the \u201call pos-sible input nodes\u201d algorithm is much faster and can provide alternative nodes for each node in an MSDN, while it does not provide the corresponding maximum matchings. Since enumerating all possible maximum matchings is computa-tionally expensive, the preferential matching algorithm was applied to the NIO 1 case as follows (Antoni et al. 2018): the algorithm can build different maximum matchings based on the preferences expressed as inputs. By varying the input preferences, the algorithm can be used iteratively to find different maximum matchings. The results obtained with the preferential matching algorithm were compared with the results of the enumeration algorithm and of the \u201call possible input nodes\u201d algorithm. The latter two methods cannot han-dle self-loops (that is, a node pointing directly to, and thus influencing, itself), but it can be shown that removing self-loops from the network does not change its controllability (Zhao et al. 2015). Remarkably, all of the three techniques show a good accordance in predicting the distribution of driver nodes among the network, which corroborates the analysis of NIO 1 network.\n",
"A maximum matching in the bipartite graph can be solved by the Hoproft-Karp algorithm [41]. Then the MDS is corresponding to the nodes in Rthat are not connected to any matching edges (see Fig. 5 (b)). It can be verified that if each node in an MDS is actuated by an input control signal, which means adding a control node uifor each node iin the MDS, the resulting graph G (A, B) will have no dilation. Since MDSs of a network are not unique, Zhang et al. [42] proposed a preferential matching algorithm to identify MDSs that have a specific degree property. Zhang et al. \u2019s work provides an inspiration that the MDS can be selected with preference, by which realistic information can be taken into consideration.\n"
],
"Predicting gamma passing rates for portal dosimetry\u2010based IMRT QA using machine learning": [
"Treatment planning and delivery techniques are subject to a wide range of uncertainties that may contribute to decreased gamma pass rates (GPR), such as phantom/patient setup, detector resolution and calibration, beam output and profile, beam modelling, and especially plan complexity [125].\n",
"Lam et al. applied 3 tree-based machine learning algorithms (Ada Boost, Random Forest, and XGBoost) to train the models and predict gamma passing rates using a total of 1,497 IMRT beams delivered with portal dosiemtry (Lam et al., 2019). They reportedthatboth Ada Boostand Random Foresthad 98 \u00b13%of predictionswithin 3%ofthemeasured 2%/2 mmgammapassing rates with a maximum error <4% and a MAE <1%. XGBoost showed a slightly worse prediction accuracy with 95% of the predictions <3% of the measured gamma passing rates and a maximum error of 4.5%. The three models identified the same nine features in the top 10 most important ones that are relate d to plan complexity and maximum aperture displacement from thecentralaxisorthemaximumjawsizeinabeam. Theirresul ts demonstrated that portal dosimetry IMRT QA gamma passing rates can be accurately predicted using tree-based ensemble learningmodels.\nLam et al. (2019) Eclipse/Varian Portal Dosimetry 1,497 IMRT Beams Ada Boost, R andom\n",
"Lam et al proposed three ML algorithms able to predict the results of 2%/2 mm gamma passing rates obtained with portal dosimetry for IMRT QA. The authors observed a prediction accuracy of these algorithms ranging from 95% and 98% [187]. Li et al proposed a ML architecture consisting of a Poisson Lasso (PL) regression model that predicts the individual gamma passing rate, followed by a RF classification model that classifies the QA result as \u201cpass\u201d or \u201cfail\u201d. The system was reported to have reliable results once the original data was pre-processed to compensate for the original tendency of PL model in overestimating the gamma passing rates of challenging VMAT plans [188]. As discussed, these ML approaches also have the advantage of interpretability and they allow to discriminate the features that have the highest impact on the results. On the other side, these models have low portability as they are highly dependent on different factors that are institution-dependent, such as the systems used for pre-treatment verification, the evaluation methodology chosen and the treatment unit. DL methods have also been proposed to extract features from the dataset that can be used to model and classify pre-treatment verification results [189-192]. Lastly, it is worth mentioning that the integration of DL and radiomics could have great potential in the growth of automation in QA, as reported by a recent experience by Nyflot et al, which demonstrated that a DL model can detect the presence of treatment delivery errors from the radiomic analysis of patient-specific QA dose maps. In the future, it is reasonable to expect that thanks to the support of modern AI-based systems it will be possible to automate QA procedures in MRg RT as well, so reducing the time needed for plan specific verifications without compromising safety and quality of online adaptive procedures.\n",
"Receiver operating characteristic (ROC) curves can be used to evaluate the ability of IVD systems to correctly classify observed dose deviations and to determine optimal action limits [77]. ROC analysis, however, requires a statistically significant size of error and no-error samples as input. The experimental acquisition of EPID measurements to produce such samples is typically a cumbersome process, which ex-plains why there are only a few studies on the topic in the IVD literature [78-81]. Recently, use has been made of synthetic EPID images to eliminate the need for phantom error introduction and positioning [82]. Another alternative is to model possible errors by introducing modifications in the TPS [83]. Vendors, in collaboration with the clinical community, are requested to promote research activities related to the error detectability of their IVD systems and to facilitate colla-boration within a user group. Additional research is also required to investigate the use of alternative measurement analysis techniques, e. g., use of exploratory data analysis, radiomics, and/or machine learning [84-86]. 5.4. Online systems\n",
"As described above, the plan complexity features affect delivery accuracy. Regarding the plan complexity features, the MLC aperture-based features such as the AA was associated with plan complexity and impacts on prediction accuracy. The dosiomics features categorized by GLCM, GLDM, and GLRLM have a large impact on predictions of the GPR value, so the effects of these dosiomics features were observed for plan complexity. Many researchers have reported that AA and leaf speed are crucial factors in\ufb02uencing the GPR prediction for VMAT [8-10,19,21], and several others have reported that the texture feature calculated by GLCM was the indi-cator of the complexity of dose distributions [11-13]. In this study, we have confirmed that AA, GLCM and GLDM were also important factors that feature in the prediction model. Moreover, the effect of disease site parameter was also important to classify GPR action level. The disease site parameter was identified as an important feature of the plan and dosiomics features at c 2%/2 mm. Classification performance would be affected if the dataset included many disease sites. In other words, it is indicated that plan complexity would depend on disease sites as well as plan and dosiomics features.\n",
"Valdes et al. 33 built a Poisson regression ML model trained on complexity features of IMRT plans to predict the gamma passing rate results at 3%/3 criteria mea-sured with two-dimensional (2 D) array detector. The results showed that the gamma passing rate results could be predicted within 3% accuracy on a single in-stitution dataset. The same research group 34 extended their work to examine the generalizability of their previ-ously developed Poisson regression on another insti-tution dataset with patient-specific QA measurement performed using EPID. The results revealed the gener-alizability of the Poisson regression with gamma pass-ing rate prediction within 3.5% accuracy. Lam et al. 38 assessed tree-based (Ada Boost, random forest, and XGBoost) ML models for gamma passing predictions with patient-specific QA measurements performed using EPID. Random forest and Ada Boost models exhibited the best performance, with gamma passing rate prediction error less than 4% for 2%/2 mm gamma criteria.\nFilm dosimetry Dropout technique 4 features: MU values, the PTV volume, the rectum volume, and the overlapping region volume Tomori et al. 353 D array detector Pearson's correlation coefficient 28 features: plan complexity parameters (n = 18), machine type (n = 4), and photon beam energy (n = 6). Ono et al. 463 D array detector SVM recursive feature elimination 30 features: Linac output, MU factor, total number of control points, and others Granville et al. 472 D array detector Manually 54 features: MU value, union aperture area, plan area/ irregularity/modulation, average leaf gap/dose rate/travel distance, modulation index of leaf speed/acceleration, and others Li et al. 45 EPID Manually 10 features: modulation complexity score, beam irregularity, MUs/control point in a beam, maximum of x-y jaw positions, edge metric, and others Lam et al. 383 D array detector Manually 54 features: plan modulation-complexity and delivery-characteristics Wang et al. 512 D array detector Extra-trees, mutual information, and linear regression 100 features: aperture score, MU factor, edge metric, leaf gap/ travel/motion, plan irregularity, plan modulation, and others Wall and\n",
"Dosim. 2017;42:203-209.187. Boylan C, Rowbottom C. A bias-free, automated planning tool for technique comparison in radiotherapy-application to na-sopharyngeal carcinoma treatments. J App Clin Med Phys. 2014;15:213-225.188. Winkel D, Bol GH, Van Asselen B, et al. Development and clin-ical introduction of automated radiotherapy treatment planning for prostate cancer. Phys Med Biol. 2016;61:8587.189. Speer S, Klein A, Kober L, Weiss A, Yohannes I, Bert C.\n",
"Recently, several machine learning approaches for patient QA and for the feasibility of predicting GPR values were 1003 Med. Phys. 48 (3), March 20210094-2405/2021/48 (3) /1003/16 \u00a9 2020 American Association of Physicists in Medicine 1003 developed. 10,11 In 2016, Valdes et al. reported a machine learning-based \u201cvirtual IMRT QA \u201dmethod, which was conducted with a large set of complexity metrics to predict local GPR at the 3%/3 mm criterion measured using a two-dimen-sional (2 D) diode array detector. 12 In 2017, they reported that their model can be applied to another institution that usesportal dosimetry for measurement. 13 In 2018, Interian et al. developed a deep learning convolutional neural network (CNN)-based model inspired by the VGG-16 Image Net model. The model used a fluence map as an input. Theauthors compared their results with the results obtained by Valdes et al. 14 Our previous study reported the feasibility of an original deep learning-based prediction model using truecomposite dose distribution (2 D-sagittal) as input data. Gaf-chromic EBT 3 film was used for measurements, and global GPR values 2%/2 mm, 2%/3 mm, 3%/2 mm, and 3%/3 mm were evaluated. 15 In 2019, Ono et al. compared the results of three machine learning models, which used 28 plan parame-ters. The models were measured with a helical diode array, and the global GPR values 2%/2 mm and 3%/3 mm wereevaluated. 16 Li et al. presented two machine learning models that predicted the GPR value and determined whether thefinal QA result passed or failed. They used 2 D ion chamberarray and ancillary QA phantom and evaluated the global GPR values 2%/2 mm, 3%/2 mm, and 3%/3 mm. 17 Lam et al. developed three tree-based machine learning models topredict the GPR values 2%/2 mm, which were measured viaportal dosimetry. 18 Shiba et al. developed a dose uncertainty potential accumulation model to predict GPR values 2%/2 mm, 3%/2 mm, and 3%/3 mm measured using a semicon-ductor detector. 19 In 2020, Wall et al. developed and evaluated several machine learning algorithms to predict local GPR 3%/3 mm measured using a 2 D diode array detector and an ancillary phantom. 20 On the other hand, another machine learning approach for detecting multileaf collimator (MLC) positioning error has also been developed. In 2016, Carlson et al. reported an MLC positioning error detectionmodel using leaf motion parameters from the plan. 21 In 2018, Nyflot et al. developed a CNN model with triplet learningand a handcrafted approach using texture features, and machine learning classifiers were used to classify the presence or absence of introduced treatment delivery errors frompatient-specific gamma images. 22 In 2020, Osman et al. developed an artificial neural network model with log file forpredicting individual MLC positioning deviations. 23 Kimura et al. reported a CNN model with dose difference map forclassifying the presence or absence of MLC positioningerrors. 24\n",
"We conducted the present study to investigate whether the radiomics-based method combined with machine learning iseffective in detecting errors in MLC modeling, that is, the MLC TF, and DLG errors and an MLC positional error dis-tinguished from the error-free condition, by evaluating flu-ence maps of clinical IMRT plans measured with an EPID. The spatial resolution of the EPID is superior to that of detec-tor arrays, and the EPID shows high sensitivity in the tuning or quality control of MLC modeling parameters and in detecting MLC positional errors in patient-specific IMRT QA. 5,27-32 In this study, we used calculated and measured fluence difference maps instead of gamma maps, since it was shown thatsimple subtracted maps were more sensitive to MLC posi-tional errors than gamma maps. 26\n",
"Advanced approaches using machine learning (ML) methods could be used for predicting the MLC leaf positionaldeviations during the IMRT/VMAT delivery priori. There are very few studies in the literature about the prediction ofthe MLC positional errors with ML methods. Carlson et al. 18 examined various ML models such as linear regression, ran-dom forest, and cubist to predict the MLC positional errorsduring VMAT delivery using log files data from multipleinstitutions and the impact of these errors on the QA anddosimetry results. The study results showed that leaf posi-tional errors are predictable in an accurate manner with MLalgorithms, with a positive impact on enhancing the gammapassing rates for patient-specific QA and consequently reducing errors in delivered doses. ML models were also used in predicting the patient-specific IMRT/VMAT QA, by predict-ing the gamma passing rate. 19-26 Some recently published studies 19-26 investigated the feasibility of Poisson regression with LASSO regularization, 19-21 deep learning based on convolutional neural networks (CNN), 22-24 ensemble learning based on decision-tree, 25 random forest, 21 and support vector machine 26 ML models. All these proposed ML-based models demonstrated a capability in accurately predicting the IMRT/ VMAT QA gamma passing rates. These studies were carriedout for patient-specific IMRT/VMAT QA; however, they didnot consider predicting the individual MLC leaf positionspriori then predicting the gamma passing rate.\n",
"Recently, the machine learning approach has been used in patient-specific QA 4,5. Several studies have developed machine learning models for predicting the results of patient-specific QA 6-15. Valdes et al. reported the \u201cvirtual IMRT QA\u201d method. Their model predicted the gamma pass rate in patient-specific QA for static beam IMRT by analyzing the plan complexity metrics with the machine learning approach 6. Interian et al. developed a convolutional neural network (CNN) model that uses a fluence map as input to predict the gamma pass rate 7. Several other studies have reported models that predict gamma pass rate from the dose distribution generated from a treatment planning system 8-10. In VMAT QA, Tomori et al. developed a CNN model to predict the gamma pass rate from two planar dose distributions in a cylindrical phantom 10. Granville et al. developed a machine learning-based model to predict the result of VMAT QA 12. Their model used Linac performance metrics and treatment planning information to predict the result based on the actual condition of the treatment device. Li et al. constructed two prediction models, the\n",
"Several investigators have applied machine learning models for predicting QA outcomes based on treatment plan characteristics of \u201cfixed-gantry \u201d IMRT. Valdes et al. were able to predict GPRs within 3% accuracy for IMRT plans using a generalized Poisson regression model with Lasso regularization trained on a selection of 78 plan complexity features [10,11]. Interian et al. developed an ensemble of convolutional neural networks trained to predict IMRT gamma passing rates from fluence maps with results comparable to the Poisson regression model [12]. More recently, Lam et al. used tree-based machine learning algorithms to predict GPRs for portal dosimetry-based IMRT beams with a mean absolute error of less than 1% [13]. Such models can provide a priori information regarding the deliverability of plans during the optimization stage, which could provide many benefits that include minimizing wasted time in measuring and adjusting treatment plans that are likely to fail QA.\n",
"There have been a few applications of machine learning techniques in patient-specific QA with a majority of themfocusing on predicting GPR. 19-21 For example, Valdes et al. developed a Poisson regression model with Lasso regularization that predicts IMRT QA GPRs with errors <3%. 20 Lam et al. extended their work by using multiple tree-basedmachine learning methods to improve the prediction accu-racy. 21 Interian et al. used transfer learning to take advantage of the sophisticated VGG-16 Image Net model for GPR predic-tion. 19 The main clinical benefit of predicting GPR at treatment planning stage is to avoid QA measurements for those plans predicted to have failing GPR. However, it is apparent that mis-match between the planning system and delivery system can-not be detected as the measurements are omitted. Woottonet al. 22 and Nyflot et al. 10 were among the first groups to use machine learning techniques to detect errors based on IMRTmeasurements. In their first study, they treated gamma distribu-tions as images and extracted standard radiomic features (such as mean, variance, and skewness) to build logistic regression models for error detection. 22 In their next work, they used a CNN with triplet learning in the hope of extracting moremeaningful features to improve error detection. 10 The main difference between their approach and ours is that they focusedon feature engineering (extracting radiomic features fromgamma distribution manually or via deep learning), while wedirectly used the DDH and the s DTA maps as features. The FIG. 6. (a)-(g) Horizontal profiles taken from Figs. 5 (a)-5 (g), through the center of the middle leaf, along with dose gradient and beam intensity profiles (h) vertical profile taken from Fig. 5 (h), through the center. The signed distance-to-agreement (s DTA) values were divided by 3 for better visualization. In the low dose gradient area, the s DTA was set to zero.\n",
"Radiotherapy and Oncology journal homepage: www. th egreenjournal. com accuracy across two institutions and found only about 86.33% of plans with the prediction error within 3.5% [8]. Ono et al. [9] trained the convolutional neural network (CNN) model with VMAT plans complexity metrics, and achieved much better results. Lam et al. [10] demonstrated that the GPRs could be predicted accurately with 31 features of plan complexity metrics and machine metrics using Ada-Boost, Random Forest (RF), and XGBoost algorithms. Granville et al. [11] also used features that describe Linac performance combined with treatment plan complexity characteristics to train ML models, resulting in improved prediction accuracy for VMAT PSQA.\n",
"Several investigators have applied machine learning models for predicting QA outcomes based on treatment characteristics of fixed-gantry IMRT [19,27-29], and more recently, VMAT plans [30,31]. For instance, Valdes et al. were able to predict GPRs within 3% accuracy for fixed-gantry IMRT plans using a generalized Poisson regression model with Lasso regularization trained on a selection of 78 plan complexity features [19,27]. More recently, an investigation of machine learning algorithms observed that a support vector machine (SVM) was best for predicting GPRs for VMAT QA from 100 treatment planning features selected based on relative importance [32]. Such models can provide a priori information regarding the deliverability of plans during the optimization stage, which could provide many benefits that include minimizing wasted time in measuring and adjusting treatment plans that are likely to fail QA.\n",
"Lam et al. [9] Random forest Bidimensional patient-specific quality assurance results\nMore recently, the specific performance of a linear accelerator (the results of machine quality assurance) was added to a treatment plan\u2019s characteristics in a dataset that was used to train a machine learning tool. A support vector classifier was then used to predict the gamma index pass rate results [8]. Using a similar methodology but with a random forest algorithm, Lam et al. obtained 98% of predictions within 3% of the measured 2%/2 mm gamma index pass rate [9].\n",
"For our radiation oncology department, these staffi ng challenges are exacerbated by the wide array of special procedures, diverse equipment, and a uniquely divid ed campus composed of three separate and distinct c linical facilities. In addition to clinical risk assessment s for patients, migrating the majority of the clini cal physics team to a remote working environment was a priority, and requ ired an assessment of the essential needs for diffe rent equipment and procedures to ensure continued safe p ractices during this pandemic, particularly in the context of patient-specific intensity modulated radiotherapy (IMRT) QA. For patient-specific IMRT QA, it is commo n practice to measure point dose or 2 D and 3 D dose distributio ns prior to treating patients, and then compare the se measurements with treatment planning system (TPS) c alculations [2]. The identification of plans that r equire measurement has been previously discussed and justi fied clinically in the context of Virtual IMRT QA [2-6]. To reduce the necessary on-site resources and to limit COVID exposure risk to the team performing IMRT QA, while maintaining a high-level of confidence in the safet y and accuracy of patient-specific IMRT treatments, use of\n",
"Therefore, it is a crucial task to understand the effect of DCGS on the physical and biological doses in radiotherapy for cervical cancer. Gamma analysis is currently the most common and generally accepted method for quantitatively assessing the difference between the two dosedistribution (DD) [10,11]. It detects the difference between the two DD by a designed gamma standard (e. g. 3.0 mm, 3.0,10%) and it will provide a report on passing rate [12,13]. The standard of 3.0 mm, 3.0 and 10% is the most widely used, in which 3.0 mm refers to the consistency of distance, 3.0% refers to the maximum allowable dose difference, and 10% is the threshold (In the following parts of the article we skipped it as it never changed) and when the dose is less than 10% of the reference dose, which does not participate in gamma analysis. Selecting 10% is widely recommended [14,15]. In intensity modulated radiation therapy (IMRT), gamma analysis is usually used to analyze the difference between the TPS-outputed and the actually measured dose distribution to evaluate the degree of dose deviation caused by various reasons during the execution of the plan, further to determine whether a plan is to execute based on the evaluation. However, previous studies have shown that different dose quality assurance (QA) system (dose QA system refers to the collection of measurement and analysis software and hardware used to ensure that the radiation dose is achieved at the target volume and the OARs as expected) have different abilities to detect errors based on the dose distribution output by TPS.\n",
"Opening Statement AI is enabling breakthroughs in cancer screening, automated planning, outcome modeling, and QA. Many of theseadvances will have far reaching impact, however, not all proposed applications make sense in a clinical setting. Predictingpassing rates for measurement-based pretreatment PSQA isone such case. Published results show acceptable accuracy, 6,9 thus there is no question that AI is capable in this context. However, the more important question is what clinical benefitis achieved by doing so?\n"
],
"Identification of efficient observers for locating spreading source in complex networks": [
"To minimize the estimation error of source and analy ze the rumor centrality maximum likelihood estimator (Shah & Zaman, 2011) is used that examines the asymptomatic behavior of infected nodes in detail for regular trees, general trees and general graphs. Along with the infection source the infection region i. e. a subset of nodes infected by each source in a network is identified considering SI propagation model with homogeneous spreading rates based on approximations of the infection sequence count. Choi et al. (Choi et al., 2017), (Choi, Moon, Shin, & Yi, 2016) identifies rumor sourc e using different approaches such as batch queries, interactive queries, Maximum-A-Posteriori-Estimator (MAPE). Zhu and Ying (Zhu & Ying, 2016) gtries to identify source using a path-based approach and (X. Zhang, Zhang, Lv, & Yin, 2016) estimates spreading source in network based on observer nodes. Fake news/Rumour dataset along with nodes (users) and edges (relationship) Data\n",
"Zhangetal. haveanalyzedcentralitybasedmethodsand showedthatnoneisbeaclear-cutwinnerintermsofperformance [33]. Whileintheirworktherangeofsensordensities isrespectable, theydonotconsidereffectsofstochasticityat\nThe algorithm, proposed by Zhang et al., selects a set of the sensors /u 1 D 446 Coveragewhich has the maximum number of unique neighbors [33]. The nodes which have an sensor as aneighborare covered, andthefractionofcoverednodesin the network is called coverage rate. This method saturates, Paluch et al.: Preprint submitted to Elsevier Page 2 of 28 Journal Pre-proof Journal Pre-proof Optimizing sensors placement since usually the density of sensors which gives coverage rate equal one is significantly smaller than one.\nFig. 1). Theexperimentwasconductedfor Erd/uni 0151 s-R\u00e9nyinetwork with constant density of sensors /u 1 D 70 C= 0.05 and average degree /uni 27 E 8/u 1 D 458/uni 27 E 9=8. High Coverage Rate has the lowest computational complexity among the tested methods. The authors state that the complexity of this algorithm can be reduced to/u 1 D 442 (/u 1 D 45 B+/u 1 D 45 A) [33], but our version, which is resistant to the saturation effect, has complexity /u 1 D 442 (/u 1 D 44 F/u 1 D 45 B). On the opposite extreme, there are High Variance Observers and\nHigh Coverage Rate [33] /u 1 D 442 (/u 1 D 44 F/u 1 D 45 B) 1.80 (2) No\n",
"Thomson focused on the analysis of the rumors spreading model based on mathematical theory and developed the DK model [8]. A number of scholars researched rumor spreading considering the topology properties of social networks [9-14]. Zanette established a rumor spreading model on the small-world networks and 2 put forward the rumor spread threshold [15-16]. Some scholars studied applications of stochastic version of the DK model on scale-free network and showed that the uniformity of the network had a major impact on the dynamic mechanism of rumor spreading [17-19]. Besides, scholars found that networks structure influenced spreading dynamics, the most of which was that heterogeneous structures lead to the absence of any diseases outbreak threshold [20-22].\n",
"Fig. 1. Thetransmissionsketchofthe SHPRSmodel. basedontheirmodelswereproposed. Manyresearchersstudiedrumorspreadinganddynamicbehaviorsbasedon mathematicaltheoryandconnectedthemtotopologicalpropertiesofsocialnetworks [12-18]. Zanettefoundedarumor spreadingmodelonsmall-worldnetworksandobtainedthecriticalspreadingthresholdofrumor [19,20]. Zhaoetal. [21] providedadetaileddescriptionofrumorspreadingprocessesbyintroducingtheforgettingmechanismina SIRmodelon small-worldnetworks. Lietal. [22] studiedthedynamicsofadelayedreaction-diffusionrumormodelwithgovernment controlononlinesocialnetworks.\n",
"Daleyand Kendall [6] firststudiedtherumorspreadphenomenoninthe 1960 s, andputforwardtherumorspread 1 mathematicalmodel (D-Kmodel). Inrecentyears, thespreadofrumorresearchhasmadenewbreakthrough, causedby 2 theupwardtrendincomplexnetworkresearch [7-14]. Severalauthorshaveusednetworkmodelstostudytheprocessof 3 rumorsspreadingacrossnetworks [15-19]. Anumberofresearchidea, whichmostlyarequalitative, pointsoutthesocial 4 harmofrumor, andrecommendpreventandcontroltheemergenceandspreadofrumor [20]. Someresearchersthink 5 rumorsreflectwhatthepublicreallypayscloseattentionto. Accordingtothiscollectivebehavioronspreadingrumor, people 6 releasethepossibleconcernsontheindividuallevelandgaininsightandalleviatecontradictiononthesociallevel [21]. Meng 7 etal. [22] proposedadynamicrumorspreadingmodel (DRSIR) basedon OSN (onlinesocialnetwork) fordescribingthe 8 complexpatterns. Inordertoidentifyinfluentialnodesindirecting OSN, theyusedthe DRSIRmodeltoperformcomputer 9 simulationsbytheinitializationofeverynodetobethesinglespreadertoexaminethespreadinginfluenceofthenodes 10 rankedbydifferentcentralitymeasures. Withthedevelopmentoftechnologies, newmediabecomeamainstreammedia, we 11 cannotsimplytakethetraditionalframework, whicharestaticandlimitedandsimplerulesandregulations. Itisimportant 12 forreachestousebigdataanalysistechniquesonthetrendofrumorcontrol [23]. Zhaoand Wang [24] establishedan ISRW 13 dynamicalmodelconsideringthemediumasasubclasstoexplorethemechanismofspreadingofindividual-to-individual 14 andmedium-to-individual. Formakingtherumorspreadingprocessmorerealisticandapparent, Zhaoetal. [25] modified 15 aflowchartoftherumorspreadingprocesswiththe SIR (Susceptible, Infected, and Recovered) modelbyconsideringthe 16 prevalenceofnewmedia. 17\n",
"Aclassicrumormodelwas DKmodelproposedby Daleyand Kendallin 1965 [8], intheirmodel, thepopulationwas dividedintothreegroups: peoplewhoknewandspreadtherumor, peoplewhodidnotknowtherumorandpeoplewho knewbutdidnottransittherumor. Makiand Thomsonmodifiedthe DKmodelintothe MKmodel [9]. Anumberofscholars researchedrumorspreadingconsideringthetopologypropertiesofsocialnetworks [10-14]. Besides, Zanettestudiedthe spreadingmodelonsmall-worldnetworksandfoundtheexistenceofthecriticalthresholdforrumorspreading [15,16]. \u2217Correspondingauthors.\n",
"Aclassicrumormodelwas DKmodelproposedby Daleyand Kendallin 1965 [8]. Intheirmodel, thepopulationwas dividedintothreegroups: peoplewhoknewandspreadtherumor, peoplewhodidnotknowtherumorandpeoplewho knewbutdidnottransittherumor. Makiand Thomsonmodifiedthe DKmodelintothe MKmodel, whichassumedthat aspreaderchangedtoastifler (whoknewbutdidnottransmittherumor) [9]. Anumberofscholarsresearchedrumor spreadingconsideringthetopologypropertiesofsocialnetworks [10-14]. Besides, Zanettestudiedthespreadingmodelon small-worldnetworksandfoundtheexistenceofthecriticalthresholdforrumorspreading [15,16]. Morenoand Nekovee etal. examinedthemean-fieldequationstodescriberumorpropagationinsmall-worldnetworks [17]. Besides, scholars foundthatnetworksstructureinfluencedspreadingdynamics, themostofwhichwasthatheterogeneousstructureslead totheabsenceofanydiseasesoutbreakthreshold [18,19]. \u2217Correspondingauthor.\n",
"When we adopt a static approach, the only degree of freedom we have is the choice of the static sensors. It isknown that this choice has an impact on the performance of source localization [44], [48], [53] and that the optimal choice depends on the variance parameter [48]. Figure 5 (a) compares the performance of KDRS and KMED sensors in terms of the final size of the set of candidate sources\nTransmission delays. Several models of how the epidemic spreads have been studied [30]. Discrete-time transmission delays were initially very common (see Assumption (B. 5)) [2], [36], [41]. Then, to better approximate realistic settings, continuous-time transmission models with varyingdistributions for the transmission delays have been adopted; e. g., exponential [35], [46], Gaussian [33], [34], [40], [53] or truncated Gaussians [48]. We mainly consider continuousbounded-support distributions that are tractable yet versatile. 9 F UTURE WORK\n",
"Observer Placement for Source Localization: The Effect of Budgets and Transmission Variance Brunella Spinelli, L. Elisa Celis, Patrick Thiran Abstract\u2014 When an epidemic spreads in a network, a keyquestion is where was itssource, i. e., the node that started theepidemic. If we know the time at which various nodes wereinfected, we can attempt to use this information in order toidentify the source. However, maintainingobservernodes thatcan provide their infection time may be costly, and we may haveabudgetkon the number of observer nodes we can maintain. Moreover, some nodes are more informative than others due totheir location in the network. Hence, a pertinent question arises: Which nodes should we select as observers in order to maximizethe probability that we can accurately identify the source? Inspired by the simple setting in which the node-to-nodedelays in the transmission of the epidemic are deterministic, we develop a principled approach for addressing the problemeven when transmission delays are random. We show that theoptimal observer-placement differs depending on thevarianceof the transmission delays and propose approaches in bothlow-and high-variance settings. We validate our methods bycomparing them against state-of-the-art observer-placementsand show that, in both settings, our approach identifies thesource with higher accuracy. I. INTRODUCTIONRegardless of whether a network comprises computers, individuals or cities, in many applications we want to detectwhenever any anomalous or malicious activity spreads acrossthe network and, in particular, where the activity originated. 1 We call the spread of such activity anepidemicand theoriginator thesource. Clearly, monitoring all nodes is not feasible due to cost andoverhead constraints: The number of nodes in the networkmay be prohibitively large and some of them may be unableor unwilling to provide information about their state. Thus, studies have focused on how to estimate the source basedon information from a few nodes (calledobservers). Givena set of observers, many models and estimators for sourcelocalization have been developed [26], [21], [32]. However, theselectionof observers has not yet received a satisfac-tory answer: Most of state-of-the-art methods are based oncommon centrality heuristics (e. g., degree-or betweenness-centrality) or on more advanced heuristic approaches thatdo not directly optimize source localization (see [32] for asurvey) or are limited to simple networks such as trees (e. g., [16]). Moreover, such methods consider only the structureof the network when placing observers. However, dependingon the particular epidemic, the expected transmission delaybetween two nodes, and its variance, can differ widely. Weshow that different transmission models require differentobserver placements: This is illustrated in Figure 1: As thevariance of the transmission delays changes, the optimal set 1 In effect, we wish to answer questions such aswhat was the origin of aworm in a computer network?, who was the instigator of a false rumor ina social network? andcan we identify patient zero of a virulent disease? of observers also changes (see also Figure 2 for a concreteexample).\n",
"Shen et al. which provides the source estimator via minimizing the variance of a time vector of a backwards spread signal [11]. More recent research shows that there are possible solutions for locating a source in temporal networks [1] and that an efficient placement of observers is not a trivial task because different common strategies do not significantly differ in resulting accuracy [14]. The aforementioned PTV method has also been recently improved by\n",
"Zhang et al. [14] compare the sampling strategies for the source locating estimator proposed in [12]. These works do not study the scenario that when the forwarding probability is less than 1. Besides, assumption (3) is usually violated in the real world, and in our paper we assume that the communication between observers and their neighbors is unknown.\n",
"Transmission delays. Several models for how the epidemic spreads have been studied [17]. Discrete-time transmission delays were initially very common (see Assumption (B. 5)) [22,27,1]. Then, to better approximate realistic settings, continuous-time transmission models with varying distributions for the transmission delays have been adopted; e. g., exponential [28,21], Gaussian [26,20,19,36] or truncated Gaussians [30]. We consider general continuous bounded-support distributions that are tractable but yet versatile.\n",
"The present paper is devoted to the estimation of the parameters and states in the complex networks with identical nodes being the chaotic systems. This estimation is provided by using an adaptive observer as well. Different types of observers can be used for the state or/and parameter estimations of the parameters of the systems [22]-[27]. For example, the decentralized observers for a network system, consisting of interconnected identical linear subsystems are presented in [28]. In [29], the distributed tracking problem for complex dynamical networks with nonlinear dynamics is investigated. In [30], the several placement strategies of the observers based on the centralities of nodes of the complex networks were analyzed. Synchronization and parameter identification of a unidirectional star-network constructed by discrete spatiotemporal chaos systems with unknown parameters are studied in [31] and adaptive integral sliding mode control design method for parameter identification and hybrid synchronization of chaotic systems connected in a ring topology is studied in [32]. [33] investigates the identification of unknown system parameters and network topologies in an uncertain fractional-order complex network with time delays (including coupling delay and node delay).\n",
"As budgeted observer placement (even in the zero-variance setting) is NP-hard, there is no optimal algorithm to compare against. Instead, we evaluate the performance of our algorithmagainstasetofnaturalbenchmarksthathaveshowntohavegoodperformance in other works (Seo et al. 2012; Berry et al. 2006; Zhang et al. 2016) (see \u201cComparison against benchmarks\u201d section for a discussion of these benchmarks, Figs. 10-12 for the results).\nThe Coverage Rate heuristic outperforms Adaptive Betweenness Centrality on all three networks (confirming what found by by Zhang et al. (2016)) but is consistently less effectivethan K-Mediansandthanourmethods.\nTransmission delays. Many transmission models for epidemics have been studied (Lelarge 2009) and considered for source localization. Although discrete-time transmissiondelaysarecommon (Luoetal. 2014; Prakashetal. 2012; Altarellietal. 2014), inorder to better approximate realistic settings, much work (including ours) adopt continuoustimemodelswithvaryingdistributionsforthetransmissiondelays; e. g., exponential (Shah and Zaman 2011; Luo and Tay 2012) or Gaussian (Pinto et al. 2012; Louni and Subbalakshmi 2014; Louni et al. 2015; Zhang et al. 2016). In the same line of the latter class of works, weuse truncated Gaussianvariables, whichgivesustheadvantageofensuringthat infectiondelaysarestrictlypositive.\nThe problem of minimizing the number of observers required to detect the precise source (asopposedto maximizing theperformancegivena budgetofobservers) hasbeen considered in the zero-variance setting. For trees, given the time at which the epidemic starts, theminimizationproblemwassolvedby Zejnilovicetal. (2013). Withoutassuming a tree topology and a known starting time, approximation algorithms have been developed towards this end (Chen et al. 2014) (still in a zero-variance setting). However, in a network of size n, the number of observers required, even if minimized, can be up ton\u22121, hence, a budgeted setting is practically more interesting. For trees, the budgeted placement of observers was solved by using techniques different from ours (Celis et al. 2015). However these techniques heavily rely on the tree structure of the network and do not seem to be extendible to other topologies. In a recent work, Zhang et al. (2016) consider selecting a fixed number of observers using several heuristics such as\n",
"Many dynamical systems are studied in the context of network reconstruction [8] [9]. Methods which are based on phase synchronization [11], Lyapunov 45 exponent [12] [13] [14], feedback control [15] [16] and compressed sensing theory (CST) [18] [19] have been developed previously to reconstruct the underlying networks [20] [21]. Reconstruction of the network structure using phase oscillator is based on deterministically known coupling strength and nodal dynamics in advance [10]. Later, the authors considered some uncertainty as an improve-50 ment and modified the procedure for network reconstruction accordingly [11]. CST is used to reconstruct the interaction pattern of a network of coupled oscillators and games from time series data [22] [23] [24] [25]. Further, CST is also used to uncover the interaction pattern of nodes in a network given that the time series has only binary data [26]. In [27], network reconstruction is suc-553 cessfully done by phase synchronization with linearly and non-linearly coupled systems based on Kuramoto phase oscillators.\n",
"Network diffusion is used to model spreading of different phenomena over networks, such as virus propagation in human populations and information diffusion in social networks. The point in the network where the diffusion originated, such as the first infected individual or a blogger who initiated a rumor is denoted as the source node and in order to curb the infection or prevent further malicious activity, it is crucial to identify it [1-6]. The source can be localized by observing the time stamps when nodes in the network became first infected (informed). Due to network size, limited resources and privacy issues, the infection times cannot be observed for all the nodes [2,3,6,7]. As the choice of observed nodes impacts the accuracy of source localization, how to select the most informative ones is an important question.\n"
],
"Identifying node role in social network based on multiple indicators": [
"Huang et al. (2014) identify the node role in a network using multiple indicators. Similar ly, we use multiple centrality measures to evaluate the importance of a supplier node and indicate its criticality in a supply network. However, instead of using norm alized simple average s of the multiple indicators (i. e., \u03b1 = \u03b2 = \u03b3 = \u03c3 = 1) as in Huang et al. (2014), we find appropriate weights \u03b1, \u03b2, \u03b3 and \u03c3 for the ir respective centrality measures to derive the NSI score. In doing so, we address the open problem of a utomatic optimization of weights previous researchers have put forth (Huang et al., 2014). As noted by Sherman and Zhu (2006), DEA gives each unit the benefit by making itself look as good as possible when compared with other units through its own weight selection. Our DEA-based NSI approach differs from a simple ratio measure by treating weights as independent variables that are objectively decided when optimizing the NSI score. This weight solution represents a de parture from previous studies, which either ACCEPTED MANUSCRIPTACCEPTED MANUSCRIPT \nFinally, by letting the data speak for itself, our DEA-based model suggests an objective approach to deciding the weights when combining individual measures for the optimization of NSI scores. Our approach avoids the equal and subjective weighting schemes commonly used in existing studies (e. g., Huang et al., 2014). In other words, our DEA-based model does not assume equal importance for the ACCEPTED MANUSCRIPTACCEPTED MANUSCRIPT \n",
"Social network analysis (SNA) is an approach to discovering aspects of a specific networked structure consisting of nodes and edges. In addition, the development status/trends of such a network can be visualized, or forecast, by the SNA method [18-21]. The methodology defines a network where each participant is a node (or vertices), and each node in the network has a relationship (i. e., an edge), e. g., through conjunctions, with other nodes. Furthermore, knowledge transfer phenomenon can be discovered through SNA methodology [22]: researchers and organizations can use the information to boost research-and-development programs and to increase trans-or intra-organization cooperation opportunities [23]. Different centrality measures can also be used to investigate the \u2018player role\u2019 of each node with patent citation data [24-27]. SNA can not only lead to an under-standing of the technological invention status, but also can forecast future development trends, [28,29]. For example, Choe, Lee [24] applied SNA with patent citation data to analyze development status in organic solar cells. They discovered that important companies occupy the most critical position and are more influential than others in the network. Additionally, \u2019 betweenness\u2019 is another centrality measure to indicate the number of times a node locates in-between other nodes on the shortest path. The betweenness centrality can be used to assess the bridging role of one node among others, and is seen as an indicator of the influences flowing around the system [30].\n",
"Existing literature on identifying in\ufb02uential users in a social network has mainly concentrated on using the knowledge of the underlying network topology [8]-[11]. A few attempts have been made to quantify the in\ufb02uence of users in Twitter by proposing different in\ufb02uence measures as well as using various centrality measures [8], [12]-[15]. Furthermore, somerecent state-of-the-art methods have combined diverse structural information by performing k-core or K-truss decomposition of the graph for iden tifying in\ufb02uential nodes with better spreading behavior [16]-[18]. In addition, some works have relied on computing pairwise in\ufb02uence among users to rank in\ufb02uential nodes [19], [20] as well as detecting structural holes playing key in\ufb02uential roles in diffusing information over the network [12], [21], [22]. A brief review of the state-of-the-art has been provided in Section II.\nIdentification of in\ufb02uential users has been an important problem in the domain of social networks that has attracted widespread research interest over the years. There have beenseveral endeavors to identify in\ufb02uential users in the network from the underlying topological structure. For instance, few works have quantified user in \ufb02uence in Twitter by proposing different in\ufb02uence measures [13] or using various centrality measures [14]. Some works have identified in\ufb02uential nodes based on their network roles [8], [12]. In addition, Huang and Yu [23] have identifi ed in\ufb02uencers in a temporal social network taking into account the network dynamics while\n",
"Cj k,, that reflects the participation of vj at time step tk. Different indicators can assess the participation of a member in an OSN (Costa et al., 2007) A. Bartal, et al. Social Networks 59 (2019)61-7664 including, for example, degree, betweenness, and eigenvector centrality (Shi et al., 2009). However, any single indicator is not sufficient to identify multiple and complex characteristics of member participation (Huang et al., 2014). Hence, using a variety of centrality measures can help analyze member online-participation. A participation shift of vj between time steps tk 1 and tk ( jk) is measured in this study by the\nIn this work, the centrality type in a directed network is prefixed by adding either an \u201cout\u201d or an \u201cin\u201d when calculating the following centralities: Clustering coefficient (CC), Eigenvector centrality, Authority, Hub, Page Rank, In KCores, Out KCores, Indegree, Outdegree, Betweenness, In Closeness, Out Closeness. Since some centralities can be highly correlated (Huang et al., 2014), measures with the largest mean absolute correlation above a pre-defined cutoff were removed, as elaborated upon in Section 6. In addition, the centrality vectors are normalized to facilitate working with observations of different sizes (Mc Auley et al., 2007; Newman, 2003).\n",
"A recent study [7] examines structural roles of concepts alternatively using raw centrality measures. The paper of Huang, Lv, Zhang, Yang, Zheng, and Wen [33] proposes a combination of three strongly correlated social network analysis (SNA) metrics (degree, ego-betweenness centrality, and eigenvector centrality) to evaluate only those top-ranked nodes (core nodes and bridge nodes) in undirected binary networks. The distinctive feature of our approach is the identification of four structural roles based on the combination of two structural measures, and thus, it is not merely focused on node popularity.\nWhile high correlations among measures can be used to identify obviously important network nodes (e. g., Huang et al. [33]), our structural space approach would be limited if degree and betweenness centralities in semantic networks (including our own) were highly correlated and it would fail to identify subtle roles of network nodes.\n",
"Visual analysis was conducted through sociograms generated using Gephi (version 0.8.2, Paris) (16) and whole network measures of cohesiveness calculated using UCINET (Version 6.556, Harvard MA) (17). Exponential random family graph models (ERGMs) were applied using STATNET within the programming software R (18) to identify social processes and node covariates significant to the observed network structure. Identification of important individuals respective to each network were calculated using the Evaluation of Importance based on Multi-Indicator (EIMI) method developed by Huang et al. (19).\nOther studies in the Australian hospital setting and an Italian nephrology department identified doctors to be central to medicine decision-making, a s well as the tendency to communicate within one\u2019s own professional group (15,36). In this study, key indiv iduals were identified using the EIMI method (19), which determines node importance as a sum of different centrality measures. Based on their position in the nework a person\u2019s importance is derived from the number of their relations (degree centrality), their capacity to act as a broker between others (betweenness centrality), as well a s their access to other important individuals (eigenvector centrality). Therefore, targeting junior doctors, senior nurses and ward pharmacists may facilitate efficient uptake in future dissemination of new interventions, policies and/or cultures.\n",
"The motivation behind our choice is that node importance measures the role of a node in the network [31]-[33], while its value comes from the contribution of the nearest neighborhood.\n",
"Several diverse applications of this problem can be foundin a great deal of recent literature, involving, among others, social [18], [19], criminal [20]-[22], bibliographic [23], [24], patent [25], [26], and economic [27], [28] networks.\n",
"Research studies about identifying influential users in a Twitter network have largely focused on applying topological approaches of the underlying network (Goldenberg et al. 2018; Huang et al. 2014). Although a few attempts have been made to discover structural gaps that link parts in the network and to identify structural gap fillers or gap bridgers as influentials, these efforts mainly concentrated on applying topological structures to detect influentials in online networks (Bhowmick 2019; He et al. 2016 August). This study builds on the topological approach by utilizing multiple methods to examine impacts of influential users (influentials) through various quantifiable measures as well as qualitative analysis (Al-garadi et al. 2017; Huang et al. 2014; Zhang et al. 2014 June). Further, though many studies explored diffusion on the Twitter network, they rarely examined the phenomenon with a theoretical framework (Bhowmick et al. 2019; Romero et al. 2011 March; Stefanone et al. 2015 January). Some studies attempted to employ the Diffusion of Innovation Theory to explain the meme or message diffusion during Boston bombing in 2013 on the Twitter network, but the theory was not fully discussed in those studies (Johann and B\u00fclow 2019;\n",
"Gianni Costa costa@icar. cnr. it 1 ICAR-CNR, Via P. Bucci, 8/9 C, Rende, CS, Italy Community and role discovery are two primary tasks in network analysis. Community discovery [29,30,41,43,44,46,51,57,73,78] unveils the underlying organization of nodes into latent structures that are also useful for predicting future interactions [32,36,48,73]. Role analysis [17,40,53,62,63,72] identifies behavioral classes. Such two tasks mutually refine each other. Indeed, role analysis reveals the behaviors exhibited by nodes inside communities, in order to contribute to the pursuit of the respective functionalities. Dually, community discovery detects a natural contextualization to identify node roles.\nIn this section, we review previous research on community discovery [29,41,43,44,46,51,57,73,78] and role analysis [17,40,53,62,63,72], in order to situate NOODLES in the current literature. We also discuss the connections between NOODLES and some recent developments in the field of\n",
"There were two types of roles of nodes, nam ely as a core role that occupied a central position and a bridge role as a liaison between the nodes [14].\n"
],
"Composite service selection based on dot pattern mining": [
"Zhang et al. [56] presented a pattern-based composite service selection approach, which divided a composite service into aseries of composite service patterns shared among users, thusregarding the pattern as a selecting unit to facilitate the construction of new composite services. These studies make use of the similarity between services to improve the effect ofservice recommendation and composition. Unlike these researches, this study attempts to divide the candidate service space based on the similarity of services so as to narrow thesearch scope of the algorithm.\n",
"The simulated annealing theory was presented and changed as a typical heuristic method to tackle the problem. Faruk et al. [17] proposed a novel enhanced GPSO to address the SOC, and the performance of the proposed algorithm was evaluated and exemplified. Huo et al. [18] formalized the SOC as a typical nonlinear integer programming and proposed the DGABC. This model simulates the process of searching for each corresponding solution. However, these SOC methods do not consider how to make use of SFSD to greatly improve the effect of SOC. Therefore, several research works on service composition task with the effective guidance of SFSD have arisen. In [19], statistic correlation and its important in\ufb02uence on SOC are analyzed, and an approach is proposed to greatly improve the corresponding performance. In [20], the formal descriptions of quality correlation and selection correlation are provided, and a novel selection approach that can perceive correlation is proposed. In [21], similarity measures are used to effectively measure how close two composite schemes are. In [22], a correlation and dot-patternbased approach is proposed to select services. In [23], the recurring patterns in a composition process are mined using an approach according to these execution logs.\n",
"In early stage of the service computing research, a service pattern is regarded as a high-level abstract template such that it can be instantiated to different forms [16]. Subsequently, it is realised that the frequently-occurring segments in historical service solutions are likely to appear in the follow-up ones, thus mining patterns from history by data mining techniques becomes a dominating approach [17]. Hu et al. define the reusable logic in the business processes as the patterns, which can be shared among users to facilitate the construction of new service solution [18]. Zhang et al. extract the repeated 450 process fragments in the service \ufb02ow to expect service reuse in new solution construction [19]. In [20], frequent process fragments appeared in the service composition (including a set of frequently co-occurring services and the control \ufb02ows among these services) are extracted to prepare for the followup service composition be requisitioned by using data mining technique. Huang et al. further utilise fragment service process for composite service selection including control \ufb02ow and bound services [21]. However, it is important to figure out how to reuse these diverse frequent fragments at the right time. So, we need to analyse the distribution of service requests.\n",
"There are two types of service pattern mining methods: top-down and bottom-up. The former identifies patterns from service process models, and the latter does from service execution logs. Both try to automatically find out repetitive process segments, identify structurally and functionally similar segments, and then combine them as patterns in a higher level of abstraction [10]. Data mining approaches such as sequential pattern mining or frequent itemset mining are popular methods for this issue [8] [9].\n",
"Given the dataset of previously proposed scientific workflows, it can always find out some frequently occurred frag-ment patterns from history and then based on these patterns make service composition in the future. A great number of service composition approaches based on this idea areproposed and proved to be very successful. These frequentpattern usually give rules that summarize the relationships of correlations, collaborations, and complements between services in historical compositions. Original research on\u201cpatterns\u201d emphasized on the \u201cabstraction\u201d characteristic with the objective of \u201creuse\u201d [9]. Developers write patterns in terms of high-level functionalities and decide onthe function signature they want to associate to a functionality appearing in a pattern [10]. Hu et al. define the reusable logic in the business processes as the patterns, which can be shared among users to facilitate the construc-tion of new service solution [11]. In [12], frequent process fragments appeared in the service composition (including a set of frequently co-occurring services and the controlflows among these services) are extracted to prepare for the follow-up service composition be requisitioned by using data mining technique. In developing service-oriented appli-cations, similarly, a service pattern is defined as a set of services that are repetitively used together in various applications and records the best practices [13]. Zhang et al. extract the repeated process fragments in the service flowto expect service reuse in new solution construction [22].\n",
"Currently, scientific work\ufb02ows are publicly accessible on the repository, which have been uploaded by scientists in many different domains (ranging from life science to textanalytics or astronomy). Considering the fact that developinga novel scientific work\ufb02ow from scratch is typically aknowledge-and effort-intensive, and error-prone mission, reusing and repurposing mature best-practices which have been evidenced by legacy scientific work\ufb02ows in the repos-itory is considered as a cost-effective and error-avoidingstrategy [1], [2]. Frequently-occurring fragments in legacyscientific work\ufb02ows are likely to appear in the follow-upones, thus, the pattern mining work from history becomes adominating approach [3], [4], [5]. Understanding the usagepatterns of services can facilitate the creation of service com-positions. Inspired by experimental functionalities, a novelscientific experiment may be satisfied by the composition of (i) fragments that correspond to general functionalities, and (ii) other fragments that are personalized somehow, in thedomain. It can be seen that each requirement is representedin the format of a function-oriented, personalized way, anda service pattern does not exist independently but has tobe bundled with specific functionalities. Thus, service com-position study upon patterns should not focus on frequentservice fragments only. Discovering functional patterns withdifferent types and characteristics is an appropriate wayto implement personalized requirements. Unfortunately, inmost state-of-the-art approaches, researchers seldom focuson mining works of different pattern types. To address thisproblem, this paper proposes a novel work\ufb02ow fragmentdiscovery mechanism considering basic and personalizedpatterns. The contribution of this paper is as follows: \u2022We present the discovery strategy of basic patterns. Fragments with similar content are assumed to solvesimilar functionalities and abstracted as basic patterns. \u2022We present the discovery strategy of personalized pat-terns. Through the work\ufb02ow clustering, personalizedfragments are discovered from clusters and assumed aspersonalized patterns under respective contents. \u2022Given a requirement in the format of the work\ufb02ow, wedesign a novel work\ufb02ow fragment discovery strategy. II. B ASIC PA TTERN DISCOVERY\nAs the number of available Web services and scientific work\ufb02ows is rapidly increasing, the discovery of relevantservices and work\ufb02ow fragments from massive candidatesdemands a lot of effort [1], [2]. Frequent pattern-basedcomposition approaches receive a great deal of attention. In [3], frequent fragments that appeared in the composition areextracted to prepare for the follow-up service composition. In developing service-oriented applications, a pattern isdefined as a set of services that are repetitively used togetherin various applications and records the best practices [4]. Huang et al. further utilize the service process for composite service selection including control \ufb02ow and bound services [5]. However, requirements are usually showed as a function-oriented way with specific demands. Thus, service compo-sition studies upon patterns should not focus on frequentsolutions only. In this paper, we aim to discoveries of basicand personalized patterns, and leverage them to compositethe optimal solution fragments for requirements. VIII. C ONCLUSION\n"
]
}