Citations or dollars? Early signals of a firm’s research success
Abstract
Scientific and technological progress is largely driven by firms in many domains, including artificial intelligence and vaccine development. However, we do not know yet whether the success of firms’ research activities exhibits dynamic regularities and some degree of predictability. By inspecting the research lifecycles of 7,440 firms, we find that the economic value of a firm’s early patents is an accurate predictor of various dimensions of a firm’s future research success. At the same time, a smaller set of future top-performers do not generate early patents of high economic value, but they are detectable via the technological value of their early patents. Importantly, the observed predictability cannot be explained by a cumulative advantage mechanism, and the observed heterogeneity of the firms’ temporal success patterns markedly differs from patterns previously observed for individuals’ research careers. Our results uncover the dynamical regularities of the research success of firms, and they could inform managerial strategies as well as policies to promote entrepreneurship and accelerate human progress.
I Introduction
In most technological sectors, corporate actors are the main drivers of innovation. For example, in the artificial intelligence (AI) domain, recent years have witnessed a dramatic increase in corporate expenditures on related research projects [1]. These efforts have led to outstanding breakthroughs, such as the prediction of proteins’ 3D structure by DeepMind’s AlphaFold [2], as well as many failed products, such as Google Glass. During the ongoing pandemic, more COVID-19 vaccines are being developed by companies than by academic actors [3]. Because of the prominent role played by companies in scientific and technological progress, understanding the regularities and predictability of firms’ research success is vital for diverse players. It can help managers to identify effective innovation strategies [4] as well as high-potential investment opportunities [5], and policymakers to design effective policies that promote entrepreneurship and accelerate human progress [6].
However, potential regularities behind the success of the research outputs produced by corporate actors, and the predictability of that success, remain unknown. Most recent efforts to understand the success dynamics of research actors have focused on individual scientists and teams of scientists or inventors [7; 8; 9; 10; 11]. Just as scientific discoveries are linked to the previous body of knowledge through citations between academic papers [12; 11], novel inventions are linked to previous ones via citations between the corresponding patents [13]. Despite differences between the nature of scientific and patent citations [13], this compelling analogy has recently unveiled common patterns behind the dynamics of scientific and technological innovation. Common patterns exist, for example, in how successful scientific and technological research builds on prior knowledge [14; 15; 16; 17; 18], how the impact of papers and patents evolves over time [19; 20], and how team size predicts research impact and disruptiveness [21]. These studies emphasize the similarities between scientific and technological innovation, and they point to the potential benefits of patent analysis for understanding the dynamics of firms’ research success.
Inspired by recent studies on scientists’ careers [7; 8], we represent a firm’s research lifecycle as the time-ordered sequence of its issued patents (see Fig. 1). Based on this representation, we ask the following previously unexplored questions: Do firms exhibit research success patterns similar to those of academic actors? Is the firms’ future research success predictable from their earliest outputs? Which mechanisms lie behind the observed predictability? An obstacle toward answering these questions is the ambivalence of the success of patents from the applicant firm’s standpoint. Quantitative studies of science often define the scientific success of a paper as a one-dimensional construct determined by its received citations [19; 12; 11]. However, defining the success of a patent as its received citations would only capture its technological value, but not its economic value, which drives firms’ investment decisions [22; 23].
To overcome this obstacle, we analyse the patenting history of firms in the United States Patent and Trademark Office (USPTO) dataset from 1926 to 2017 [24]. A recently-collected dataset [22] offers the unique opportunity to quantify simultaneously both the technological and the economic value of firms’ patents via metrics based on the number of citations received [25] and the firms’ stock-price movements following the patent’s announcement [22]. To compare patents issued in different years, we normalize both metrics by requiring that the score of a patent is not biased by its issuing year [26; 27]. We aim to quantify the predictability of firms’ research success, and understand the different implications of patents’ economic and technological value for a firm’s research success.
We find that the economic value of a firm’s early patents is predictive of the economic and technological value of its later patents. On the other hand, the technological value of a firm’s early patents is only predictive of the technological value of the firm’s subsequent patents, but not of their future economic value. To test potential mechanisms behind these findings, we perform a matched pair analysis [28; 29; 9]. The results provide evidence in favor of a fitness explanation where early success is a manifestation of the capability to produce high-value patents, and against a pure competitive advantage explanation where future success is caused by early success alone. Among firms without top-economic value patents in the early stage, “hidden gem” firms that are later granted high-economic value patents differ from those that are not (i.e., “non-top” firms) in the technological value of their early patents. The typical lifecycles of hidden gem firms markedly differ from those of “predictable” firms that are among the top ones by economic value in both the early and late stage. This further reveals that the non-random timing of a firm’s best research markedly differs from the random timing of scientists’ highest-impact papers [7], which calls for new models to describe the dynamics of firms’ research success.

II Results
II.1 Quantifying the value of patents and firms

We start by defining the metrics of economic and technological value at the patent and firm level. We consider two dimensions of patents’ value: their technological value and their economic value. To quantify the technological value of a patent, we measure its number of received citations [25]. A potential shortcoming of the citation count (even when restricted to a -year temporal window [7]) is its strong bias by patent age (see Fig. 2A and Fig. S1 in SI). To eliminate this bias, we compare each patent’s citation count only against the citation counts of patents issued in the same year. Hence, we define the age-normalized citation value (NCV) of a patent as its ranking position by citation count among all patents issued in the same year (see Fig. 2B and Methods for details).
To quantify the economic value of a patent, we rely on a recent measure based on the firm stock price movement over a narrow time window after the patent is issued [23]. The core idea of this metric (denoted as ) is that the market’s reaction to a patent is a combination of the dollar value of the patent and the investors’ ex-ante probability assessment on the patent’s success [23] (see Methods for details). Differently from the time-varying citations of patents, a patent’s is determined shortly after the patent’s issuance and does not change over time. Similarly to citation count, the economic value metric also exhibits strong bias by patent age, as shown in Fig. 2C. Again, to prevent this bias from influencing our firm-level results, we define the age-normalized economic value (NEV) of a patent as its ranking position by among all patents issued in the same year (see Fig. 2D and Methods for details).
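The age normalization used for both NCV and NEV can be illustrated with a short sketch. It assumes that the "ranking position" is expressed as a percentile in (0, 1] (consistent with the values reported in Table 1) and that tied patents share the same score; the exact convention is given in the Methods and may differ. The function name, field names, and toy data are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def age_normalized_rank(patents, value_key):
    """Return {patent_id: percentile rank in (0, 1]} computed within each issue year.

    Assumption: the score of a patent is the fraction of same-year patents
    whose raw value is less than or equal to its own value.
    """
    by_year = defaultdict(list)
    for patent in patents:
        by_year[patent["year"]].append(patent)

    scores = {}
    for group in by_year.values():
        values = sorted(p[value_key] for p in group)
        for p in group:
            # Number of same-year patents with value <= this patent's value.
            rank = bisect_right(values, p[value_key])
            scores[p["id"]] = rank / len(values)
    return scores


# Toy data (hypothetical): raw citation counts for two issue years.
patents = [
    {"id": "a", "year": 1990, "citations": 10},
    {"id": "b", "year": 1990, "citations": 2},
    {"id": "c", "year": 1990, "citations": 5},
    {"id": "d", "year": 2010, "citations": 1},
    {"id": "e", "year": 2010, "citations": 0},
]
ncv = age_normalized_rank(patents, "citations")
```

In this toy example, patent "d" has a single citation but still obtains the top score, because it is only compared against patents issued in 2010. The same function applies to the economic value metric by changing `value_key`.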
Technological and economic value do not always coincide. A patent may represent a major technical advance, but its announcement might fail to restrict competition or attract the attention of investors, thereby generating a modest impact on the company’s stock price. For example, patent US 3728480 from Sanders Associates (see Table 1) reported the invention of the first video game that could be played on a home television. This can be considered a substantial technological advance compared to computer games, and the patent was highly cited, resulting in a high technological value (NCV = 0.98). At the same time, the patent failed to capture market interest shortly after its issuance, likely because of the recession in the cable TV industry at that time (see http://www.pong-story.com/sanders.htm), which resulted in a low economic value (NEV = 0.17). We refer to Tables 1 and S1 in SI for the NCV and NEV of a set of expert-selected historically significant patents [31], and to Tables S2–S3 for a list of top patents by NCV and NEV, respectively.
Overall, the Pearson correlation between patents’ technological and economic value is as low as , and the correlation between the two non-normalized variables is also low (, see Fig. S2 in SI). To explain the discrepancy of our finding and previous claims of high positive correlation between technological and economic value [32; 25; 22], we show that such correlation increases as patents are grouped into increasingly-large sets of patents with a similar citation value (see Fig. S2 in SI). Therefore, whereas previous works demonstrated that groups of patents with higher citation impact exhibit higher economic value [22], the low correlation reported here indicates that there is little predictability of economic value from citation value at the individual patent level.
To quantify the research success of a given firm, one could average or sum the value of all its patents. However, it is well-known that patents’ quality is highly heterogeneous [33], and prior works placed a greater emphasis on a firm’s most prominent innovation than on ordinary innovations [34; 35; 36]. For this reason, we focus on a firm’s patents with the highest technological and economic value, which we refer to as its technological and economic hit [34; 37], respectively. The two hits coincide for a minority of firms, which account for of the analyzed firms (see Fig. S3 in SI for the correlation details), and we show below that the value of early economic and technological hits have substantially different implications for the firms’ future research success.
Based on the hits, we define the two dimensions of the innovation value of a given firm $i$: its technological value ($TV_i$) and economic value ($EV_i$). We define the $TV_i$ and $EV_i$ of a given firm $i$ as the technological value of the firm’s technological hit and the economic value of its economic hit, respectively. In formulas, $TV_i = \max_{p \in \mathcal{P}_i} NCV_p$ and $EV_i = \max_{p \in \mathcal{P}_i} NEV_p$, where $\mathcal{P}_i$ denotes the set of patents that were granted to firm $i$. Note that to simplify exposition, in the following, we refer to a “firm’s value” as a shorthand for its innovation value, i.e., the value of its patents. This should not be confused with the firm’s stock price or other measures of the firm’s performance, which are not considered here.
We divide firms into three groups according to their technological value and economic value. Specifically, we consider the top-5% firms as high-value firms, the bottom-35% as low-value firms, and the intermediate 60% as medium-value firms. Our results do not strongly depend on the exact choice of these separation thresholds (see Figs. S14 and S15 in SI). These three groups of firms exhibit markedly different productivity (in terms of number of issued patents) and value dynamics (see SI Fig. S4). High-value firms exhibit a sustained advantage over medium and low-value firms in terms of both productivity and value. This gap is evident even in the very early stage. Motivated by this finding, in the following, we test whether early activity data can be used to predict firms’ future value.
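As an illustration, the hit-based firm value and the three-group split can be sketched as follows. The sketch assumes that each firm is described by the list of normalized scores of its patents, and it uses the top 5% / middle 60% / bottom 35% split quoted in Section II.5; function names and toy data are hypothetical, and the rounding of group sizes is an arbitrary choice.

```python
def firm_value(patent_scores):
    """Firm-level value: the score of the firm's hit (its highest-value patent)."""
    return max(patent_scores)


def value_groups(firm_values, top=0.05, bottom=0.35):
    """Label each firm as 'high', 'medium' or 'low' by its rank in firm_values.

    Assumption: group sizes are obtained by rounding the fractions `top` and
    `bottom` of the total number of firms to the nearest integer.
    """
    ranked = sorted(firm_values, key=firm_values.get, reverse=True)
    n = len(ranked)
    n_top = int(round(n * top))
    n_bottom = int(round(n * bottom))
    groups = {}
    for position, firm in enumerate(ranked):
        if position < n_top:
            groups[firm] = "high"
        elif position >= n - n_bottom:
            groups[firm] = "low"
        else:
            groups[firm] = "medium"
    return groups


# Toy data (hypothetical): 20 firms whose hit values are 0.05, 0.10, ..., 1.00.
firm_values = {f"firm{i}": firm_value([i / 40, i / 20]) for i in range(1, 21)}
groups = value_groups(firm_values)
```

With 20 firms, this yields 1 high-value, 12 medium-value, and 7 low-value firms.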
Patent # | Issue year | Applicant firm | NCV | NEV | Title/description |
---|---|---|---|---|---|
2895584 | 1959 | INTERNATIONAL BUSINESS MACHS COR | 0.77 | 0.96 | Selectric typewriter printing head |
3728480 | 1973 | SANDERS ASSOCIATES INC | 0.98 | 0.17 | First video game |
3821715 | 1974 | GENERAL ELECTRIC CO | 1.00 | 0.83 | Intel 4004 microprocessor |
4504982 | 1985 | OPTICAL RADIATION CORP | 0.99 | 0.41 | An intraocular lens for permanent implantation into a human eye |
6469012 | 2002 | PFIZER INC | 0.66 | 0.99 | Viagra |
II.2 Early economic value predicts future research success
We start by examining whether firms’ early value predicts future research success. To this end, we split each firm’s research lifecycle into a -year initial window of early activity and a later window composed of all its remaining years. Our results are qualitatively unchanged for different choices of the initial period’s duration and the later period’s duration (see SI S3.4). A firm’s early technological (economic) value is defined as the technological (economic) value of the early technological (economic) hit (i.e., the highest-value patent among the patents issued within the initial -year window). We can define firms’ subsequent technological and economic value in a similar way.
We find a strong predictability: firms among the top- by early economic value are times more likely to be among the top- by subsequent economic value than the other firms; firms among the top- by early technological value are times more likely to be among the top- by subsequent technological value than the other firms (see Fig. S5 in SI). These initial findings motivate the question: Is early technological or economic value more predictive of firms’ subsequent research success? We answer this question by first quantifying the predictive power of both variables, and then mimicking an experiment by creating treatment and control groups of firms that differ by their early technological or economic value.
To quantify the predictive power of firms’ early technological and economic value, we study a set of classification problems where we use information on firms’ early patents to predict which firms will subsequently be among the top- by two dimensions of future success: the technological value of the future technological hit (i.e., the highest-value patent among the patents issued in the late window), and the economic value of the future economic hit. Based on the literature, we consider various metrics of firms’ early performance that might be predictive of future success: not only the firms’ early technological and economic value, but also their early productivity (in terms of the total number of early patents) [38; 39], total citations of early patents [40; 41], and other aggregate measures of early patent value (in terms of cumulative , NCV, and NEV). For each of these early performance metrics, we measure various predictive accuracy metrics, including precision, recall, area under the precision-recall curve [42], for a Naïve Bayes Classifier that classifies a firm as successful if and only if it is among the top- by the metric, where is a parameter that can be tuned to achieve a desired value of recall (see SI S3.2).
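The threshold-based classification described above can be sketched in a few lines. The cutoff symbol is not legible in the text, so the sketch uses a hypothetical parameter `k` (the number of top-ranked firms predicted as successful). The area under the precision-recall curve is approximated here by average precision, which is a common estimator but not necessarily the one used in the SI. All names and toy values are illustrative.

```python
def precision_recall_at_k(early_metric, successful, k):
    """Predict the top-k firms by an early metric as successful; return (precision, recall)."""
    ranked = sorted(early_metric, key=early_metric.get, reverse=True)
    predicted = set(ranked[:k])
    hits = len(predicted & successful)
    return hits / k, hits / len(successful)


def average_precision(early_metric, successful):
    """Average of the precision values at each rank where a successful firm appears.

    Assumption: used here as a stand-in for the area under the
    precision-recall curve obtained by sweeping the cutoff k.
    """
    ranked = sorted(early_metric, key=early_metric.get, reverse=True)
    hits = 0
    total = 0.0
    for position, firm in enumerate(ranked, start=1):
        if firm in successful:
            hits += 1
            total += hits / position
    return total / len(successful)


# Toy data (hypothetical): early economic value of five firms, and the set
# of firms that end up among the top ones in the late window.
early_metric = {"A": 0.99, "B": 0.95, "C": 0.80, "D": 0.60, "E": 0.30}
successful = {"A", "C"}

precision, recall = precision_recall_at_k(early_metric, successful, k=2)
ap = average_precision(early_metric, successful)
```

Repeating the computation for each early performance metric, and dividing by the value obtained for a random ranking, gives fold increases of the kind reported below.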


We find that a firm’s early economic value is the strongest predictor of both high-economic value firms and, more surprisingly, high-technological value firms in the future (see Fig. 3 for results and Fig. 4 for examples). By considering classifiers with , the precision of the classifier based on early economic value reaches and for the prediction of high economic and technological value firms in the future (10.3-fold and 6.4-fold increase compared with a random classifier, respectively), as opposed to the smaller precision of the classifier based on early technological value (-fold and -fold increase compared to a random classifier, respectively: see Figs. 3A and B; the results based on raw accuracy metrics are shown in Fig. S6 in SI). By summing over all possible values of , the area under the precision-recall curve (AuPRC) of the classifier based on early economic value is times and times larger than that of the classifier based on early technological value in the prediction of future high economic value firms and high technological firms, respectively (see Figs. 3C and D).
The predictive power of firms’ early economic value is substantially stronger than that of other predictors from the literature (such as early productivity and total citations), and significantly larger than that of a random classifier (see the dashed black lines in Fig. 3). These conclusions are robust with respect to alternative choices of the prediction evaluation metric (see Fig. S6 in SI) and variations in the duration of the early window (see SI Fig. S7) and subsequent window (see SI Fig. S8). Combining all the early performance metrics via a binary logistic regression model can moderately improve the predictive accuracy only for the prediction of high-technological-value firms, at the cost of increasing model complexity (see Fig. S6 in SI). For this reason, in the main text, we only show the results of single performance metrics.
Importantly, the stronger predictive power of early economic value also holds when restricting the analysis to individual industrial sectors: by considering 10 macrosectors based on the first two digits of firms’ Standard Industrial Classification (SIC) code (see https://siccode.com/), we find that the early economic value is the strongest predictor of future success for all industries except for the Transportation & Public Utilities sector (see SI, Fig. S13 for details). This exception might occur because in this sector, the economic value of generated research is a weaker determinant of governments’ and agencies’ investment decisions.
II.3 Explaining predictability: Competitive advantage or fitness?
Two underlying mechanisms could explain the observed predictive power of the economic value of a firm’s early patents, which we refer to as the competitive advantage and fitness mechanisms, respectively. According to the competitive advantage mechanism, in the long term, a firm might succeed because of the economic value it derives from its early patents. According to this interpretation, early economic success might allow firms to invest more in research, produce more patents in the future, and as a consequence, have more attempts to produce higher-value hits. This mechanism would align with the Matthew Effect found in many other systems [43]. On the other hand, according to the fitness mechanism, the value of a firm’s early hits might be a manifestation of the firm’s ability to produce successful research, which could be interpreted as the firm’s fitness. This mechanism would align with recent theories on the success dynamics of scientists that assume that a researcher’s ability to produce high-impact papers is constant over time [7; 11].
Recent works on success predictability [44] and the science of science [28; 11] pointed out that disentangling the two mechanisms in observational data is challenging. A definitive answer would require a randomized controlled experiment [45; 11], which is impossible in our case. Nevertheless, to take an initial step toward disentangling the two mechanisms, we implement a technique that is widely used to approximate randomized experiments in the quantitative social sciences: matched pair analysis [28; 45].

Specifically, we use propensity-score matching [46] (see Methods) to split pairs of same-industry firms (according to the 10 SIC macro-sectors) that are similar in terms of their early productivity and technological value, but differ significantly in terms of their early economic value (see Fig. 5A), which leads to two groups of firms. We find that firms in the “treatment” group with a larger early economic value exhibit a significant advantage in terms of number of issued patents (), technological value () and economic value () in the late window (see Fig. 5A and Table S5). To verify whether a similar result would hold for early technological value, we perform a complementary experiment where the same-category firms’ pairs to be split into two groups exhibit similar early productivity and economic value, but significantly different early technological value. In this scenario where only early technological value differentiates the two groups of firms, the difference in late technological value is significant (), but that in late economic value is not (, see SI, Table S5).
These results might be naively interpreted as evidence in favor of the competitive advantage mechanism. But if firm’s late success is entirely due to increased late productivity, success differences among the treated and controlled firms would disappear if the late number of issued patents is added to the covariates in the matching procedure. However, we find that this is not the case: When adding the late productivity to the covariates, treated firms still exhibit a significant advantage over controlled firms in terms of technological value () and economic value () in the subsequent years (see Fig. 5B and Table S6 in SI). This finding rules out the possibility that firms with early high economic value succeed in the future merely because of increased late productivity. Taken together, these findings indicate that the competitive advantage mechanism alone cannot explain the observed predictive power, and the fitness mechanism may play a main role in the research success of firms.
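The logic of the matched pair analysis in this subsection can be illustrated with a deliberately simplified sketch. It does not implement propensity-score matching: as a stand-in, it pairs each treated firm with the nearest untreated firm from the same sector by Euclidean distance on the chosen covariates, and then compares late outcomes within pairs. All field names and toy values are hypothetical, and no significance test is included.

```python
def match_pairs(firms, covariates, treatment_key):
    """Greedy 1:1 nearest-neighbour matching of treated to control firms within a sector.

    Simplification: distances are computed on raw covariates; in practice
    they would be standardized or replaced by a propensity score.
    """
    treated = [f for f in firms if f[treatment_key]]
    controls = [f for f in firms if not f[treatment_key]]
    used = set()
    pairs = []
    for t in treated:
        best, best_dist = None, float("inf")
        for c in controls:
            if c["name"] in used or c["sector"] != t["sector"]:
                continue
            dist = sum((t[v] - c[v]) ** 2 for v in covariates) ** 0.5
            if dist < best_dist:
                best, best_dist = c, dist
        if best is not None:
            used.add(best["name"])  # each control firm is matched at most once
            pairs.append((t, best))
    return pairs


def mean_paired_difference(pairs, outcome):
    """Average treated-minus-control difference of a late outcome across matched pairs."""
    diffs = [t[outcome] - c[outcome] for t, c in pairs]
    return sum(diffs) / len(diffs)


# Toy data (hypothetical): treatment = high early economic value.
firms = [
    {"name": "T1", "sector": "Manufacturing", "high_early_ev": True,
     "early_patents": 10, "early_tv": 0.90, "late_patents": 50, "late_ev": 0.95},
    {"name": "C1", "sector": "Manufacturing", "high_early_ev": False,
     "early_patents": 11, "early_tv": 0.88, "late_patents": 48, "late_ev": 0.70},
    {"name": "C2", "sector": "Manufacturing", "high_early_ev": False,
     "early_patents": 40, "early_tv": 0.50, "late_patents": 52, "late_ev": 0.60},
    {"name": "T2", "sector": "Services", "high_early_ev": True,
     "early_patents": 5, "early_tv": 0.70, "late_patents": 20, "late_ev": 0.85},
    {"name": "C3", "sector": "Services", "high_early_ev": False,
     "early_patents": 6, "early_tv": 0.72, "late_patents": 22, "late_ev": 0.65},
]

# First experiment: match on early productivity and early technological value.
pairs = match_pairs(firms, ["early_patents", "early_tv"], "high_early_ev")
gap = mean_paired_difference(pairs, "late_ev")

# Second experiment: additionally match on late productivity.
pairs_late = match_pairs(
    firms, ["early_patents", "early_tv", "late_patents"], "high_early_ev"
)
gap_late = mean_paired_difference(pairs_late, "late_ev")
```

If the late advantage were entirely due to higher late productivity, the gap would be expected to vanish once `late_patents` is included among the covariates; a gap that persists is what points away from a pure competitive advantage explanation.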
II.4 Early patent value predicts hidden gems

The observed predictability relies on the hypothesis that early top-firms are more likely to be among the high-value ones in the future. We refer to firms with early high economic value that maintain high economic value in the subsequent years as predictable firms. Despite the high precision of the resulting classifiers, there exist of firms that are not initially among the top-performing ones (i.e., top- by early economic value) and later end up among the top- (see Fig. 6A). These “low-to-high value” firms, which we refer to as hidden gems, are reminiscent of sleeping beauty papers in science [47]: They are only able to be granted an economic hit after a relatively long time after their first patent issuance. Here, we aim to quantify the early detectability of the set of hidden gem firms that transition from medium or low value to high value.
| | Top late | Non-top late |
---|---|---|
Top early (top ) | Predictable | Declining |
Non-top early (bottom ) | Hidden gem | Non-top |
Both predictable firms and hidden gems exhibit high late economic value. Besides, we refer to of firms that never reach high economic value as non-top firms; to of firms that start from high economic value and descend to a lower value level as declining firms (see Table 2 for the classification of firms). We show the heterogeneous economic-value trajectories of 14 well-known firms in Fig. 6A. Among them, Microsoft, General Electric, AT&T, eBay and Apple maintained a high value (predictable firms according to our definition), while Amazon fell from high to medium value (declining firm). By contrast, Intel, IBM, and HP went up from medium to high value, and Applied Materials rose from low to high value; these four firms are hidden gems according to our definition.
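The classification in Table 2 can be written as a small function. Because the threshold defining "top" firms is not legible in the text above, the sketch leaves it as an explicit argument (`fraction`, hypothetical); firm names and values are toy data.

```python
def top_set(values, fraction):
    """Return the set of firms in the top `fraction` by value (at least one firm)."""
    ranked = sorted(values, key=values.get, reverse=True)
    n_top = max(1, int(round(len(ranked) * fraction)))
    return set(ranked[:n_top])


def classify_trajectories(early_ev, late_ev, fraction):
    """Assign each firm to one of the four groups of Table 2."""
    top_early = top_set(early_ev, fraction)
    top_late = top_set(late_ev, fraction)
    labels = {}
    for firm in early_ev:
        if firm in top_early and firm in top_late:
            labels[firm] = "predictable"
        elif firm in top_early:
            labels[firm] = "declining"
        elif firm in top_late:
            labels[firm] = "hidden gem"
        else:
            labels[firm] = "non-top"
    return labels


# Toy data (hypothetical): early and late economic value of four firms.
early_ev = {"A": 0.99, "B": 0.97, "C": 0.60, "D": 0.40}
late_ev = {"A": 0.98, "B": 0.50, "C": 0.99, "D": 0.30}
labels = classify_trajectories(early_ev, late_ev, fraction=0.5)
```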
Applied Materials is an outstanding example of hidden gems. The firm was unable to produce high economic-value patents within its earliest years of patenting activity, although it was granted high-technological value patents in the early stage. After 1982, its economic value exhibited a steady growth, and subsequently, the firm became able to produce high economic-value patents (see Fig. 6B and Fig. S9 in SI for more examples). This transition is reflected in the company’s history. Applied Materials went public in 1972. In the subsequent few years, the company followed a diversified business strategy. During this period, its technological value was high, while its economic value was low. In 1976, it changed CEO and refocused on its core business of semiconductor manufacturing equipment (see https://en.wikipedia.org/wiki/Applied_Materials). After that, its economic value rapidly increased, whereas its technological value stayed at a high level. At the time of writing, the company is a global leader in its core industry.
The existence of hidden gems raises the question: Are they predictable? The Applied Materials example suggests that early high technological value might predict transitions from low or medium early economic value to high late economic value. We confirm this conjecture in two ways. First, we compare the early technological value for four groups of firms with distinct economic value dynamics (see SI, Fig. S10). We find that the average early technological value for hidden gem firms is 0.957 (s.e.m. 0.009), which is markedly larger than that for declining firms (0.939, s.e.m. 0.022) and non-top firms (0.876, s.e.m. 0.003), and even slightly larger than that for predictable firms (0.954, s.e.m. 0.016). Second, we perform a matched pair analysis in which we only consider firms with non-top early economic value, and the early technological value is used to split pairs of firms between a treatment and a control group. Among same-industry firms with similar non-top early economic value and early productivity, those with high early technological value exhibit –higher late economic and technological value than those that do not (see Fig. 6C and SI, Table S7). These findings indicate that among firms with non-top early economic value, an early advantage in technological value translates into a late advantage in terms of economic value.
We further evaluate our ability to early detect the hidden gems via their early economic and technological value. To this end, we measure firms’ economic and technological value within the earliest years, and we evaluate the predictive performance of a Naïve Bayes classifier that classifies a firm as a hidden gem if and only if it is among the top- by a given metric, where is a parameter that can be tuned to achieve a desired value of recall. We consider various performance metrics, including early productivity, , early economic value, , early technological value, , and the sum of early economic and technological value, .
We find that the alone achieves a fold increase in AuPRC compared to a random classifier (see Fig. 6D). This signals that, unsurprisingly, firms that are nearer the top threshold in early stage are more likely to transition to high value. More interestingly, the alone achieves a fold increase in AuPRC compared to a random classifier, and a combination of the and achieves the most accurate predictions, leading to a fold increase in AuPRC compared to a random classifier (see Fig. 6D and Fig. S10 in SI for shortening the duration of early window), which confirms the key role of early technological value in the transition to high economic value.
II.5 The timing of firms’ hit patents is not random

The observed predictability of firms’ future hits from early patents motivates us to study the temporal dynamics of firms’ patent value. Do firms tend to be granted their hits at the beginning of their research lifecycles? Or are firms’ highest-value patents randomly distributed along a firm’s lifecycle, similarly to the highest-impact works for scientists, artists, and musicians [7; 8; 30]? How do these patterns differ for predictable and hidden gem firms? We find that differently from results for individuals’ creative works [7; 8; 30], the temporal position of a firm’s hits is markedly non-random.
Specifically, we study the distributions and of the relative position of a firm’s technological hit () and economic hit (), respectively, compared to the firm’s total number of issued patents, [7; 30]. Both types of hits are significantly more likely to occur among earliest patents than expected by chance, which is demonstrated by the left peaks of the two distributions (Figs. 7A and B). The observed peaks cannot be explained by randomized patenting histories where for each firm, patents’ value scores are randomized, while the total number of patents is preserved (see the shadowed area in Figs. 7A and B) [7]. At the same time, whereas the probability to achieve the technological hit steadily decreases as a firm is granted more patents (Fig. 7A), the probability to achieve the economic hit exhibits a second peak around the end of the lifecycle (at , see Fig. 7B). These results hold not only when considering all firms together, but also when considering separately high-value (top 5% by their hits’ value), medium-value (middle 60 % by their hits’ value), and low-value firms (bottom 35% by their hits’ value), and when considering firms from different industries, see Figs. S11 and S12 in SI.
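The relative hit position and its randomized baseline can be sketched as follows. The sketch assumes that a firm's lifecycle is given as a time-ordered list of patent values, that the position is counted from 1, and that ties are resolved in favor of the earliest patent; these conventions, like the function names and the toy data, are hypothetical.

```python
import random


def relative_hit_position(values):
    """Relative position of the highest-value patent in a time-ordered list of values."""
    n = len(values)
    # max() over indices returns the first index that attains the maximum value.
    best = max(range(n), key=lambda i: values[i])
    return (best + 1) / n


def shuffled_positions(values, n_shuffles, seed=0):
    """Relative hit positions after randomly reordering the firm's patent values.

    The number of patents is preserved, so the result samples the positions
    expected if the timing of the hit were random.
    """
    rng = random.Random(seed)
    positions = []
    for _ in range(n_shuffles):
        shuffled = list(values)
        rng.shuffle(shuffled)
        positions.append(relative_hit_position(shuffled))
    return positions


# Toy data (hypothetical): a firm with four patents whose hit is the second one.
lifecycle = [0.2, 0.9, 0.4, 0.1]
observed = relative_hit_position(lifecycle)
null_positions = shuffled_positions(lifecycle, n_shuffles=1000)
```

Pooling the observed positions over all firms gives the empirical distribution, and pooling the shuffled positions gives the null band against which the early and late peaks are compared.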
The heterogeneity of firms’ hit position is well-illustrated by a few key case studies in Fig. 7C (see Table S4 in SI for details). IBM achieved its economic hit (about an integrated circuit with dielectric insulation) in 1976, whereas it achieved its technological hit significantly later (in 2002) with a patent on controlling access to shared storage devices. On the other hand, Apple’s technological hit appeared in 1992 (on a powered manager for a portable laptop computer), whereas its economic hit was issued substantially later (in 2006, on an improved method for generating multimedia non-linear effects).
The different behavior of and is a reflection of firms’ heterogeneous value dynamics, which is linked to the predictive problem studied above. To demonstrate this point, we consider the previously-defined four groups of firms: predictable, hidden gem, non-top, and declining firms. For the four groups of firms, we find that the average technological value of their patents tends to steadily decrease over time (Fig. 7D), which matches the higher probability of early appearance of technological hits. The only exception is the group of hidden gem firms, which exhibits a stabler trend. This suggests that the hidden gems’ innovation ability does not diminish as they mature, which could be the key for their later transition. By contrast, the dynamics of average economic value exhibits heterogeneous patterns. Whereas the average economic value of predictable and declining firms’ patents tends to remain stable or decrease over the firms’ lifecycles, the economic value of hidden gem and non-top firms sharply increases over time (Fig. 7E). This different behavior is reflected in the behavior of : predictable and declining firms only contribute to the early peak, whereas hidden gem and non-top firms only contribute to the late peak (Fig. 7F).
The observed early peak of the technological hit distribution supports previous studies which claimed that newcomer firms are more likely to produce innovations of high technical quality [48; 49]. This is because as firms age, they might gradually refine their innovation competence and organizational routines [50]; in this phase, benefits from new technological advances might diminish [49]. Hence, inventions by experienced firms are more likely to be extensions and improvements of their established innovative domains and technologies [50]. Based on our previous results, we conjecture that the second economic peak of hidden gem firms might be due to the increasing ability of technologically-competitive firms to attract interest from the market. In some cases, like Applied Materials, this might be due to organizational transformations. In other cases, the late peak might be due to factors that have been associated with late success in innovation research, including time-consuming knowledge acquisition [51], experience [52], and reputation [53].
Taken together, these findings indicate that the firms’ hits are not uniformly distributed along the firms’ research lifecycles, which markedly differs from previous findings on the timing of success for scientists [7], artists [30], and musicians [30]. This discrepancy indicates that previously-identified mechanisms (such as the Q-model [7]) are unable to explain the value dynamics in firms’ research lifecycles, which calls for new modeling approaches.
III Discussion
Our work aims to uncover patterns behind the predictability and dynamics of firms’ research success. By viewing each firm as a collection of its granted patents, we quantify firms’ research success according to the economic and technological value of their patents in two periods (an early and a late stage). We demonstrate that the economic value of a firm’s early patents is highly predictive of both the economic value and the technological value of the firm’s late patents. By contrast, and surprisingly, the early technological value of a firm’s patents is only predictive of the technological value of the firm’s late patents. Among firms with late patents of high economic value, we distinguish between “predictable” and “hidden gem” firms (namely, firms with and without high early economic value, respectively). We identify early signals that enable the early detection of the hidden gems. Specifically, for firms with relatively low economic value in the early stage, high early technological value can facilitate late economic value. In addition, we find that predictable firms and hidden gem firms exhibit considerably different patterns of research success over time: the patents by predictable firms exhibit an approximately stable average economic value, whereas hidden gems’ patents exhibit a sharply increasing average economic value. Similarly, the economic hit patents by predictable firms tend to be among the earliest patents, whereas the opposite is true for hidden gems. These results are strikingly different from those found for researchers in academia [7; 30; 11], which indicates that models for the dynamics of human achievements are not applicable to firms’ lifecycles.
The predictive power of the economic value of firms’ early patents raises the question of whether early value determines future success (cumulative advantage mechanism) or whether it reveals a firm’s “fitness”, i.e., its ability to produce high-value research. A similar dilemma arose in recent studies on the predictability of scientists’ future success from their early collaborations with already-established top scientists [28] and their early funding [54], and on the predictability of online viral content from its early popularity momentum [55]. The results of our matched pair analysis provide evidence against a pure cumulative advantage mechanism and in support of the role played by the fitness mechanism. The obtained findings suggest that firms’ ability to produce successful research might be reflected in the economic value of their early patents. At the same time, a small set of hidden gems exhibit a slower progression toward research success, and these firms could be detected early by analyzing both the technological and the economic value of their early patents.
To conclude, the obtained findings contribute to both the management literature on drivers of firms’ performance [48; 56; 6], and the recent cross-disciplinary literature on success in human activities [7; 8; 21; 28; 9]. While recent strides in the science of science have deepened our understanding of the success trajectories of academic researchers [7; 8; 21; 28; 9; 11], our results provide the first step toward a quantitative understanding of the evolution of firms’ research success from a complexity science standpoint. Beyond firms, the research approach developed here might find application to the prediction of the research success of other players, such as cities, regions, and nations. This can help forecast promising regions and companies, identify bottlenecks in research and innovation activities, and inform resource allocation strategies.
Acknowledgement
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61673150, 11622538). L.L. acknowledges the Science Strength Promotion Programme of UESTC, Chengdu. M.S.M. acknowledges financial support from the University of Zurich through the URPP Social Networks, the Swiss National Science Foundation (Grant No. 200021-182659), and the UESTC professor research start-up (Grant No. ZYGX2018KYQD215). The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors and do not necessarily reflect the views of the European Commission.
References
- IDC [2019] “IDC’s worldwide artificial intelligence spending guide,” https://www.idc.com/getdoc.jsp?containerId=IDC_P33198 (2019).
- Tunyasuvunakool et al. [2021] K. Tunyasuvunakool, J. Adler, Z. Wu, T. Green, M. Zielinski, A. Žídek, A. Bridgland, A. Cowie, C. Meyer, A. Laydon, et al., “Highly accurate protein structure prediction for the human proteome,” Nature , 1–9 (2021).
- Le et al. [2020] T. T. Le, J. P. Cramer, R. Chen, and S. Mayhew, “Evolution of the covid-19 vaccine development landscape,” Nature Reviews Drug Discovery 19, 667–8 (2020).
- Hauser, Tellis, and Griffin [2006] J. Hauser, G. J. Tellis, and A. Griffin, “Research on innovation: A review and agenda for marketing science,” Marketing Science 25, 687–717 (2006).
- Nanda, Samila, and Sorenson [2020] R. Nanda, S. Samila, and O. Sorenson, “The persistent effect of initial success: Evidence from venture capital,” Journal of Financial Economics (2020).
- Guzman and Stern [2015] J. Guzman and S. Stern, “Where is silicon valley?” Science 347, 606–609 (2015).
- Sinatra et al. [2016] R. Sinatra, D. Wang, P. Deville, C. Song, and A.-L. Barabási, “Quantifying the evolution of individual scientific impact,” Science 354, aaf5239 (2016).
- Liu et al. [2018] L. Liu, Y. Wang, R. Sinatra, C. L. Giles, C. Song, and D. Wang, “Hot streaks in artistic, cultural, and scientific careers,” Nature 559, 396 (2018).
- Wang, Jones, and Wang [2019] Y. Wang, B. F. Jones, and D. Wang, “Early-career setback and future career impact,” Nature Communications 10, 1–10 (2019).
- Yin et al. [2019] Y. Yin, Y. Wang, J. A. Evans, and D. Wang, “Quantifying the dynamics of failure across science, startups and security,” Nature 575, 190–194 (2019).
- Wang and Barabási [2021] D. Wang and A.-L. Barabási, Science of Science (In press, 2021).
- Fortunato et al. [2018] S. Fortunato, C. T. Bergstrom, K. Börner, J. A. Evans, D. Helbing, S. Milojević, A. M. Petersen, F. Radicchi, R. Sinatra, B. Uzzi, et al., “Science of science,” Science 359 (2018).
- Jaffe and De Rassenfosse [2019] A. B. Jaffe and G. De Rassenfosse, “Patent citation data in social science research: Overview and best practices,” in Research Handbook on the Economics of Intellectual Property Law (Edward Elgar Publishing, 2019).
- Uzzi et al. [2013] B. Uzzi, S. Mukherjee, M. Stringer, and B. Jones, “Atypical combinations and scientific impact,” Science 342, 468–472 (2013).
- Mukherjee et al. [2017] S. Mukherjee, D. M. Romero, B. Jones, and B. Uzzi, “The nearly universal link between the age of past knowledge and tomorrow’s breakthroughs in science and technology: The hotspot,” Science Advances 3, e1601315 (2017).
- Shi and Evans [2019] F. Shi and J. Evans, “Science and technology advance through surprise,” arXiv preprint arXiv:1910.09370 (2019).
- Pugliese et al. [2019a] E. Pugliese, L. Napolitano, A. Zaccaria, and L. Pietronero, “Coherent diversification in corporate technological portfolios,” PLOS ONE 14 (2019a).
- Pugliese et al. [2019b] E. Pugliese, G. Cimini, A. Patelli, A. Zaccaria, L. Pietronero, and A. Gabrielli, “Unfolding the innovation system for the development of countries: co-evolution of science, technology and production,” Scientific Reports 9, 16440 (2019b).
- Wang, Song, and Barabási [2013] D. Wang, C. Song, and A.-L. Barabási, “Quantifying long-term scientific impact,” Science 342, 127–132 (2013).
- Higham et al. [2017] K. W. Higham, M. Governale, A. Jaffe, and U. Zülicke, “Fame and obsolescence: Disentangling growth and aging dynamics of patent citations,” Physical Review E 95, 042309 (2017).
- Wu, Wang, and Evans [2019] L. Wu, D. Wang, and J. A. Evans, “Large teams develop and small teams disrupt science and technology,” Nature 566, 378–382 (2019).
- Kogan et al. [2017] L. Kogan, D. Papanikolaou, A. Seru, and N. Stoffman, “Technological Innovation, Resource Allocation, and Growth*,” The Quarterly Journal of Economics 132, 665–712 (2017).
- Stoffman, Woeppel, and Yavuz [2020] N. Stoffman, M. Woeppel, and M. D. Yavuz, “Small innovators: No risk, no return,” Kelley School of Business Research Paper (2020).
- [24] “Patent–CRSP match, 1926-2017,” Dropbox. Available at https://paper.dropbox.com/doc/Patent-CRSP-match-1926-2017-W3aHAj0Ce4CzKZayqCASj. Deposited 30 November 2019.
- Hall, Jaffe, and Trajtenberg [2005] B. H. Hall, A. Jaffe, and M. Trajtenberg, “Market value and patent citations,” RAND Journal of Economics , 16–38 (2005).
- Waltman [2016] L. Waltman, “A review of the literature on citation impact indicators,” Journal of Informetrics 10, 365–391 (2016).
- Mariani, Medo, and Lafond [2019] M. S. Mariani, M. Medo, and F. Lafond, “Early identification of important patents: Design and validation of citation network metrics,” Technological Forecasting and Social Change 146, 644–654 (2019).
- Li et al. [2019] W. Li, T. Aste, F. Caccioli, and G. Livan, “Early coauthorship with top scientists predicts success in academic careers,” Nature Communications 10, 1–9 (2019).
- AlShebli, Rahwan, and Woon [2018] B. K. AlShebli, T. Rahwan, and W. L. Woon, “The preeminence of ethnic diversity in scientific collaboration,” Nature Communications 9, 1–10 (2018).
- Janosov, Battiston, and Sinatra [2020] M. Janosov, F. Battiston, and R. Sinatra, “Success and luck in creative careers,” EPJ Data Science 9 (2020), 10.1140/epjds/s13688-020-00227-w.
- Strumsky and Lobo [2015] D. Strumsky and J. Lobo, “Identifying the sources of technological novelty in the process of invention,” Research Policy 44, 1445–1461 (2015).
- Cremers et al. [1999] K. Cremers, D. Harhoff, F. Narin, F. Scherer, and K. Vopel, “Citation frequency and the value of patented inventions,” The Review of Economics and Statistics 81, 511–515 (1999).
- Silverberg and Verspagen [2007] G. Silverberg and B. Verspagen, “The size distribution of innovations revisited: an application of extreme value statistics to citation and value measures of patent significance,” Journal of Econometrics 139, 318–339 (2007).
- Ahuja and Morris Lampert [2001] G. Ahuja and C. Morris Lampert, “Entrepreneurship in the large corporation: A longitudinal study of how established firms create breakthrough inventions,” Strategic Management Journal 22, 521–543 (2001).
- Fleming and Sorenson [2003] L. Fleming and O. Sorenson, “Navigating the technology landscape of innovation,” MIT Sloan Management Review 44, 15 (2003).
- Dunlap-Hinkler, Kotabe, and Mudambi [2010] D. Dunlap-Hinkler, M. Kotabe, and R. Mudambi, “A story of breakthrough versus incremental innovation: Corporate entrepreneurship in the global pharmaceutical industry,” Strategic Entrepreneurship Journal 4, 106–127 (2010).
- Srivastava and Gnyawali [2011] M. K. Srivastava and D. R. Gnyawali, “When do relational resources matter? leveraging portfolio technological resources for breakthrough innovation,” Academy of Management Journal 54, 797–810 (2011).
- Ahuja and Katila [2001] G. Ahuja and R. Katila, “Technological acquisitions and the innovation performance of acquiring firms: A longitudinal study,” Strategic Management Journal 22, 197–220 (2001).
- Zhang et al. [2020] S. Zhang, N. Zhang, S. Zhu, and F. Liu, “A foot in two camps or your undivided attention? the impact of intra-and inter-community collaboration on firm innovation performance,” Technology Analysis & Strategic Management 32, 753–768 (2020).
- Trajtenberg [1990] M. Trajtenberg, “A penny for your quotes: patent citations and the value of innovations,” The Rand Journal of Economics , 172–187 (1990).
- Turkina, Oreshkin, and Kali [2019] E. Turkina, B. Oreshkin, and R. Kali, “Regional innovation clusters and firm innovation performance: An interactionist approach,” Regional Studies 53, 1193–1206 (2019).
- Powers [2011] D. Powers, “Evaluation: From precision, recall and f-factor to roc, informedness, markedness & correlation,” Journal of Machine Learning Technologies 2, 37–63 (2011).
- Perc [2014] M. Perc, “The matthew effect in empirical data.” Journal of the Royal Society Interface 11, 20140378–20140378 (2014).
- Hofman, Sharma, and Watts [2017] J. M. Hofman, A. Sharma, and D. J. Watts, “Prediction and explanation in social systems,” Science 355, 486–488 (2017).
- Salganik [2019] M. J. Salganik, Bit by bit: Social research in the digital age (Princeton University Press, 2019).
- Rosenbaum and Rubin [1983] P. R. Rosenbaum and D. B. Rubin, “The central role of the propensity score in observational studies for causal effects,” Biometrika 70, 41–55 (1983).
- Ke et al. [2015] Q. Ke, E. Ferrara, F. Radicchi, and A. Flammini, “Defining and identifying sleeping beauties in science,” Proceedings of the National Academy of Sciences 112, 7426–7431 (2015).
- Huergo and Jaumandreu [2004] E. Huergo and J. Jaumandreu, “How does probability of innovation change with firm age?” Small Business Economics 22, 193–207 (2004).
- Balasubramanian and Lee [2008] N. Balasubramanian and J. Lee, “Firm age and innovation,” Industrial and Corporate Change 17, 1019–1047 (2008).
- Sørensen and Stuart [2000] J. B. Sørensen and T. E. Stuart, “Aging, obsolescence, and organizational innovation,” Administrative Science Quarterly 45, 81–112 (2000).
- Jones [2009] B. F. Jones, “The burden of knowledge and the “death of the renaissance man”: Is innovation getting harder?” The Review of Economic Studies 76, 283–317 (2009).
- Sekara et al. [2018] V. Sekara, P. Deville, S. E. Ahnert, A.-L. Barabási, R. Sinatra, and S. Lehmann, “The chaperone effect in scientific publishing,” Proceedings of the National Academy of Sciences 115, 12603–12607 (2018).
- Petersen et al. [2014] A. M. Petersen, S. Fortunato, R. K. Pan, K. Kaski, O. Penner, A. Rungi, M. Riccaboni, H. E. Stanley, and F. Pammolli, “Reputation and impact in academic careers,” Proceedings of the National Academy of Sciences 111, 15316–15321 (2014).
- Bol, De Vaan, and De Rijt [2018] T. Bol, M. De Vaan, and A. V. De Rijt, “The matthew effect in science funding,” Proceedings of the National Academy of Sciences 115, 4887–4890 (2018).
- Shulman, Sharma, and Cosley [2016] B. Shulman, A. Sharma, and D. Cosley, “Predictability of popularity: Gaps between prediction and understanding,” in Tenth International AAAI Conference on Web and Social Media (2016).
- Gupta, Guha, and Krishnaswami [2013] P. D. Gupta, S. Guha, and S. S. Krishnaswami, “Firm growth and its determinants,” Journal of Innovation and Entrepreneurship 2, 15 (2013).
- Leydesdorff et al. [2011] L. Leydesdorff, L. Bornmann, R. Mutz, and T. Opthof, “Turning the tables on citation analysis one more time: Principles for comparing sets of documents,” Journal of the American Society for Information Science and technology 62, 1370–1381 (2011).
- Carley, Hegde, and Marco [2015] M. Carley, D. Hegde, and A. Marco, “What is the probability of receiving a us patent,” Yale JL & Tech. 17, 203 (2015).
IV Methods
IV.1 The USPTO dataset
We analyze the patents granted to firms by the United States Patent and Trademark Office from [24]. The average number of patents per firm is , and the largest number is (granted to IBM). For each patent, the dataset includes an ID, the filing date and the grant date (we employ the latter), the IDs of the applicant firms, and the list of its cited patents. Note that this dataset only includes patents whose assignee has been matched to a firm in CRSP (Center for Research in Security Prices), so that each patent’s applicant is a firm listed in the US stock market. All other patents are not included in the original data.
A potential issue when measuring the number of citations received by a patent is that it might be unreliable for patents issued near the end of the data. To prevent this issue, we make a conservative choice and limit our analysis to patents issued up to (and their applicant firms). In this way, patents’ citation counts are measured over a time-window of at least 10 years, thereby avoiding short-term fluctuations (results stay qualitatively the same if we consider patents issued up to ). At the same time, we are interested in firms with a sufficiently productive research activity. For this reason, in the main text, we limit the firm-level analysis to firms that have at least patents, which includes firms. In the SI, we show that our main results are qualitatively the same when filtering the firms based on their number of years of activity, see Fig. S16 in SI.
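As a rough illustration of the two filters described above, the following Python sketch first keeps patents issued up to a cutoff year and then keeps firms with a minimum number of remaining patents. The function name, the record layout (one firm per patent record), the order in which the filters are applied, and the threshold values in the usage lines are assumptions made for illustration only; they are not the values used in the analysis.

```python
from collections import Counter

def filter_patents_and_firms(patents, last_issue_year, min_patents):
    """Apply the two filters in sequence (hypothetical helper).

    patents: list of dicts with keys 'firm_id' and 'issue_year'.
    Returns (kept_patents, kept_firm_ids).
    """
    # 1) Keep only patents old enough to have a long citation window.
    old_enough = [p for p in patents if p["issue_year"] <= last_issue_year]
    # 2) Keep only firms with enough remaining patents (assumed order).
    counts = Counter(p["firm_id"] for p in old_enough)
    kept_firms = {firm for firm, n in counts.items() if n >= min_patents}
    kept_patents = [p for p in old_enough if p["firm_id"] in kept_firms]
    return kept_patents, kept_firms

# Toy usage with made-up thresholds.
toy = [
    {"firm_id": "A", "issue_year": 1990},
    {"firm_id": "A", "issue_year": 1995},
    {"firm_id": "A", "issue_year": 2015},
    {"firm_id": "B", "issue_year": 1992},
]
kept_patents, kept_firms = filter_patents_and_firms(toy, last_issue_year=2000, min_patents=2)
```

In this toy example only firm "A" survives, with its two patents issued before the cutoff.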
IV.2 Measuring technological and economic value
IV.2.1 Patents’ technological value
Citation count is traditionally used to gauge the scientific impact of papers [26] and patents [13]. However, citation count should be used with caution because its biases make it unreliable for comparing patents issued in different years [27]. To fairly compare the impact of patents issued at different times, inspired by the percentile ranks in [57], we measure patent ’s normalized citation value (NCV) as ’s relative ranking position by citation count compared to all the patents issued in the same year as . The definition reads
(1) |
where is the number of patents issued in the same year as patent , and denotes the ranking of by citation count among the patents of the same age ( if is the top patent; if is the last one, which correspond to and , respectively; tied patents are all assigned the average of their rankings). Therefore, the resulting score is close to one (zero) for high-value (low-value) patents. Crucially, unlike the rankings by citation count (see Fig. 2) and by (i.e., citation count restricted to the first 10 years after the patent issuance, see SI, Fig. S1) [7], the ranking by NCV is consistent with an age-unbiased ranking (see Fig. 2).
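The within-year ranking described above can be sketched in a few lines of Python. The sketch ranks patents by citation count within each issue year (rank 1 for the most cited, ties sharing the average rank) and converts the rank into a score. The specific normalization used here, one minus the rank divided by the number of same-year patents, is an assumption of this sketch chosen so that top patents score close to one and the last patent scores zero; the function names and the input layout are likewise hypothetical.

```python
from collections import defaultdict

def average_ranks(citations):
    """Descending ranks (1 = most cited); tied values share the average rank."""
    order = sorted(range(len(citations)), key=lambda k: -citations[k])
    ranks = [0.0] * len(citations)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next patent has the same citation count.
        while end + 1 < len(order) and citations[order[end + 1]] == citations[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2  # mean of the 1-based positions in the block
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks

def normalized_citation_value(patents):
    """patents: list of (patent_id, issue_year, citation_count) tuples."""
    by_year = defaultdict(list)
    for pid, year, cites in patents:
        by_year[year].append((pid, cites))
    scores = {}
    for group in by_year.values():
        n = len(group)
        ranks = average_ranks([cites for _, cites in group])
        for (pid, _), r in zip(group, ranks):
            scores[pid] = 1 - r / n  # ASSUMED normalization, see lead-in
    return scores

toy_patents = [("a", 2000, 10), ("b", 2000, 5), ("c", 2000, 5), ("d", 2000, 0)]
ncv = normalized_citation_value(toy_patents)
```

Because patents are only compared with others issued in the same year, an old patent with many accumulated citations and a recent one with few can receive the same score, which is the point of the normalization. The same routine applies to any other per-patent quantity that is ranked within issue years.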
IV.2.2 Patents’ economic value
Because patent issuance conveys important information to the market, previous studies [23; 22] estimated the economic value of US patents based on the movements of the applicant firm’s stock price over the days after the patents were issued. To observe the market’s reaction to the patent grant, the authors adopted a two-day time window after the patent issuance, based on the finding that a firm’s share turnover increases mostly in the first two days after the patent issuance announcement.
To disentangle the component of the firm’s return related to the patent’s economic value from unrelated factors, they assumed that the idiosyncratic stock return $R_j$ for a given firm around the time window in which its patent $j$ is issued is
$$R_j = v_j + \varepsilon_j, \tag{2}$$
where $R_j$ equals the firm’s return minus the return on the market portfolio (to remove market movements), $v_j$ is the value of patent $j$ as a fraction of the firm’s market capitalization, and $\varepsilon_j$ denotes the component of the firm’s stock return that is unrelated to the patent. Then, the economic value $\xi_j$ of patent $j$ is estimated as the product of the estimate of the stock return due to the patent value times the market capitalization ($M_j$) of the applicant firm on the day prior to the patent issuance announcement:
$$\xi_j = (1-\bar{\pi})^{-1}\,\frac{1}{N_j}\,E[v_j \mid R_j]\,M_j, \tag{3}$$
where $\bar{\pi}$ denotes the unconditional probability of a successful patent application, which is approximately 56% according to patents filed between 1996 and 2005 and examined before mid-2013 [58]. If $N_j$ patents are issued to the same firm on the same day as patent $j$, each of them is assigned $1/N_j$ of the total value. See [23] for complete details of the estimation procedure. We use the ready-made estimates provided by the authors, which are available at https://paper.dropbox.com/doc/Patent-CRSP-match-1926-2017-W3aHAj0Ce4CzKZayqCASj.
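To make the arithmetic of the last step concrete, the sketch below computes a patent’s dollar value from an already-estimated patent-related return. It follows our reading of the estimator in Kogan et al. [22]: the expected patent-related return is multiplied by the firm’s market capitalization, split equally among the patents granted to the firm on the same day, and divided by one minus the unconditional grant probability. The function name, argument names, and the numbers in the usage lines are hypothetical, and the estimation of the expected return itself (the filtering of the patent-related component from the stock return) is taken as given rather than implemented.

```python
def patent_economic_value(expected_patent_return, market_cap, n_same_day, grant_prob=0.56):
    """Dollar value of one patent under the assumptions stated in the lead-in.

    expected_patent_return: estimated patent-related return (fraction of market cap)
    market_cap: firm's market capitalization on the day before the announcement
    n_same_day: number of patents granted to the firm on that day
    grant_prob: unconditional probability of a successful application
    """
    if n_same_day < 1:
        raise ValueError("n_same_day must be at least 1")
    if not 0 <= grant_prob < 1:
        raise ValueError("grant_prob must lie in [0, 1)")
    total_value = expected_patent_return * market_cap / (1 - grant_prob)
    return total_value / n_same_day  # equal split among same-day patents

# Made-up numbers: a 0.2% patent-related return on a $1bn firm, two patents that day.
value = patent_economic_value(0.002, 1e9, n_same_day=2, grant_prob=0.5)
```

The division by one minus the grant probability reflects the idea that part of the patent’s value is already priced in before the grant, so the observed reaction understates the total value.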
The computation of the NEV is analogous to that of the NCV. We compare the value of patents issued in the same year, and patent will obtain a score , where denotes ’s relative ranking position by among patents issued in the same year as . Likewise, NEV ranges in .
IV.3 Matched pair analysis
Matched pair analysis is a form of analysis in which each subject in a treatment group is paired with a subject in a control group on the basis of matching covariates. This technique is widely used in medical and social research to evaluate the effect of a treatment, as it is easy to implement and to interpret. We obtain such pairs via Propensity Score Matching [46], where the propensity score is defined as the probability of treatment assignment conditional on baseline covariates. We implement the matching with a Python package available at http://www.kellieottoboni.com/pscore_match/ (with minor changes to support one-to-one matching).
We take Fig. 5A as an example to explain the matching process. First, we calculate a propensity score for each analyzed firm by fitting a logistic regression in which the covariates are early productivity and technological value, and the dependent variable is whether the firm has top-5% economic value in the early stage (1 if yes, 0 otherwise). Then, we attempt to match each firm with top-5% early economic value to one firm with non-top early economic value according to their propensity scores; in addition, we require the two firms in a pair to belong to the same SIC industry (the 10 major sectors). Note that we use one-to-one matching, so that each firm with top early economic value is matched with at most one firm with non-top early economic value. If the match succeeds (i.e., the two firms have sufficiently close propensity scores), the firm with top-5% early economic value is assigned to the treatment group and the other firm to the control group. By going through all firms that have top-5% economic value in the early stage, we construct the treatment and control groups, and then compare the subsequent performance of firms in the two groups.
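The matching procedure can be sketched as follows. This is a from-scratch illustration and not the package used in the analysis: the logistic regression is fitted by plain gradient descent, and the matching is a greedy nearest-neighbor search. The caliper value (the maximum allowed gap between propensity scores), the rule that each control firm is used at most once, the processing order of treated firms, and all function and field names are assumptions of this sketch.

```python
import math

def predict(w, x):
    """Logistic model: probability of treatment given covariates x (bias is w[0])."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the logistic loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def match_pairs(firms, scores, caliper=0.05):
    """Greedy one-to-one matching within the same sector.

    firms: list of dicts with keys 'id', 'treated' (bool), 'sector'.
    scores: propensity scores aligned with firms.
    Returns a list of (treated_id, control_id) pairs.
    """
    controls = [(s, f) for f, s in zip(firms, scores) if not f["treated"]]
    used, pairs = set(), []
    for f_t, s_t in zip(firms, scores):
        if not f_t["treated"]:
            continue
        best = None
        for s_c, f_c in controls:
            if f_c["id"] in used or f_c["sector"] != f_t["sector"]:
                continue
            gap = abs(s_t - s_c)
            if gap <= caliper and (best is None or gap < best[0]):
                best = (gap, f_c["id"])
        if best is not None:  # unmatched treated firms are dropped
            used.add(best[1])
            pairs.append((f_t["id"], best[1]))
    return pairs

# Toy usage with hand-picked propensity scores.
toy_firms = [
    {"id": "T1", "treated": True, "sector": "A"},
    {"id": "T2", "treated": True, "sector": "B"},
    {"id": "C1", "treated": False, "sector": "A"},
    {"id": "C2", "treated": False, "sector": "A"},
    {"id": "C3", "treated": False, "sector": "B"},
]
toy_scores = [0.60, 0.90, 0.58, 0.61, 0.50]
toy_pairs = match_pairs(toy_firms, toy_scores)
```

In the toy example, T1 is paired with the closest same-sector control (C2), while T2 stays unmatched because its only same-sector candidate lies outside the caliper. In practice the scores would come from `fit_logistic` applied to the early productivity and early technological value of each firm, followed by `predict`.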
Supplementary Information
Supplementary Information can be requested from S.X. ([email protected]).