Fields: On the Visibility of Flows in Digital Business

Traditional investment and management tools fail to capture the complexities of a networked, digital business economy: SWOT analyses are static, and accounting ratios are unidimensional. More importantly, those artifacts are unfit to deal with intangibles, which are by definition the "material" that information-based economies transact on. Here we present a weak-signals technique, Vector Field Flows, to dynamically profile strengths and opportunities in digital businesses, especially those heavily reliant on web search as a source of revenue. We demonstrate how vector field topology can reveal liquidity changes in online businesses, with immediate applications in finance operations and market prospecting.


Introduction
"It is, of course, mathematically easier to analyse equilibria than trajectories of change."

-John Maynard Smith
Recent research (Ibbotson, Idzorek, 2014) has identified the central role of popularity in asset pricing, arguing for the consideration of popularity outside of a standard risk framework, and differentiating low-popularity equity strategies from low-volatility and low-beta strategies. In particular, beyond the premiums that are considered permanent (for example, for risk, liquidity, taxability, and so on) and are expected to provide excess returns even after their discovery, a stock's price rises as its popularity increases, so the investor earns not only the premium but also the return from the increase in popularity. The more transitory popularity characteristics (for example, stocks that are highly traded, in the news, or the subject of much excitement) may be associated with mispricing. As Ibbotson and Idzorek found, in all cases the movement from the unpopular dimension to the popular dimension corresponds with relative price increases. However, this sort of mispricing seems to affect shorter-term returns but not necessarily long-term returns.
We have seen that this conception can be extended to private companies operating mainly in the digital economy as well, provided that the right popularity signals are measured. These signals may be weak in nature (for instance, a small or medium-sized private business does not usually get much coverage in the news), but it is nevertheless possible to acquire, condition and process them, and to present them in a visual context that facilitates decision making. These weak popularity signals change fast, with a degree of randomness, and are buried in noise (so they are more similar to what one sees in complex physical systems), which calls for a non-traditional approach to their study. In other words, before we can even begin to assess the impact on asset price changes, we need a device that allows us to identify general patterns: a phase portrait of revenues or anticipated liquidity changes linked to popularity. Computational models of vector fields and flows have proven particularly fit for the task. We foresee that refining these techniques may provide proxies to anticipate changes in revenue streams and in the value of intangible assets, thereby extending the reach of commercial valuation models, such as the VIM Model for Appraising the Trademark of an Unlisted Company (Čižinská, Krabec, 2014), to digital businesses.
The study of search volume for finance keywords in Google Trends has suggested that there are patterns of early warning signs in stock market moves, and this is consistent with Herbert Simon's model of rational choice (Preis, Tobias, Moat, 2013).
Online businesses' revenues are exposed to the effects (risks) of unpopularity. When Yelp Inc. (YELP) presented its SEC filings, it clearly stated that "we rely on traffic to our website from search engines like Google, Yahoo! and Bing. If our website fails to rank prominently in unpaid search results, traffic to our website could decline and our business would be adversely affected". This kind of organic ranking is correlated with the human factors (trust, popularity, and so on) that the search engines' algorithms ascribe to websites. After all, most elements that contribute to increasing (prospective customer) traffic on the web have a parallel in the real world (website: property/real estate; organic search: yellow pages; social: word of mouth; backlinks: referring partners; display advertisement: billboards; universal search: media and news; and so on), and are therefore a factor in increasing visibility. Smaller, unlisted companies may be even more exposed to popularity risks: they usually rely on less diversified revenue streams, with incoming traffic concentrated in a few countries and product catalogs focused on a single type of offering that usually depends on a constant stream of new leads (one-time sales, software as a service with large churn rates, subscriptions based on freemium upgrades, and so on).
Emergent properties of complex human-machine systems appear to have an effect on the movement of security prices. A fact well known to search engine optimisation practitioners is that, for listed companies operating online businesses with websites that are highly reliant on search, a change in search algorithm can positively or negatively impact the stock price. When Panda 4.0 (the codename for a Google algorithm change that penalised a number of e-commerce sites by reducing their search ranking, and therefore the probability that visitors would find them) was rolled out on May 22nd 2014, RetailMeNot Inc (SALE:NASDAQ GS) stock dropped 10%; in other words, "Google's change helped to wipe $170 million from their market cap in a single day. Since a high in April their stock had dropped 29% by October" (Allsopp, 2014). For some time it was observed that the impact of search algorithm updates consistently followed this sort of "tsunami" pattern: a first shock wave could erase value just after the change, in anticipation of the penalisation, and the second shock (the big wave) would come around 2 months later, when the impact of the change was evident and earnings had to be reported. Recently, however, Google has been deploying its algorithm changes in more unpredictable ways, with progressive deployments that start on different dates in different countries, which calls for novel non-linear and multidimensional mapping techniques.
Signals may come from a variety of direct sources (search, social networks, display advertisement, etc.), which relate to intrinsic strength, and from indirect sources (such as the transfer of risk from linking partners exposed to their own search, social network and visibility changes), which are affected by extrinsic factors. Together they represent a continual information stream. We therefore believe that a suitable technique to visualise the topology of the flow can take moving fluids and electromagnetic fields as a model, where each point in space has a velocity vector and the collection of all such points is called a vector field. Validation of the importance of visibility comes from the study of complex networks: we know that mechanisms such as growth and preferential attachment do operate in business networks and the World Wide Web, which show self-organization due to the local decisions made at each individual node, based on information that is biased towards the more visible nodes (Barabási, Albert, Jeong, 1999).
Our objective was to answer the question: Can vector field topology (an analytics and visualization technique commonly used to analyze flows in dynamic systems) help online businesses to anticipate liquidity issues and perform market prospecting in a digital economy?

Methodology and Data
In our study we have used aggregated webdatametrics (indexes, scores, statistics) from billions of data points logged by specialized data mining vendors operating sensors and data collection facilities across the world. We have also used anonymized financial information (ratios) from some of the companies under study. Vector fields and network graphs are computed and plotted using the Wolfram Language (Mathematica). All data collected is from 2014. The vendors of the curated datasets are providers to some of the most data-intensive online businesses (e.g. eBay Inc.) and, according to multiple independent reviews, are best in class: Israel's SimilarWeb LTD and Germany's SearchMetrics GmbH (see footnote 1).

Definitions
Vector fields and flows: A vector field is a map, a function that assigns a vector to each point of Euclidean space. Vector fields arise in differential equations and differential geometry. Flows are generated by vector fields and vice versa. Several vector fields are illustrated in Figure 1.
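As a minimal sketch of the definition above, a planar vector field can be written as a function that returns a vector for each point and then sampled on a grid. The rotation field and the helper names (`rotation_field`, `sample_field`) are illustrative assumptions; they are not taken from the study, whose figures were produced with the Wolfram Language.

```python
import math


def rotation_field(x, y):
    """Illustrative field F(x, y) = (-y, x): counter-clockwise rotation about the origin."""
    return (-y, x)


def sample_field(field, xs, ys):
    """Evaluate the field at every grid point and return {(x, y): (u, v)}."""
    return {(x, y): field(x, y) for x in xs for y in ys}


grid = [-1.0, 0.0, 1.0]
samples = sample_field(rotation_field, grid, grid)

# Print the vector and its magnitude at each sampled point.
for (x, y), (u, v) in sorted(samples.items()):
    print(f"point ({x:+.0f}, {y:+.0f}) -> vector ({u:+.0f}, {v:+.0f}), magnitude {math.hypot(u, v):.2f}")
```

The origin is the only point where this field vanishes, which is the kind of location that the later discussion of critical points is concerned with.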
Footnote 1. Below we cite the vendors' disclaimers on the source, treatment and reliability of their data. SearchMetrics' Organic search visibility (SEO Visibility): "SEO Visibility is composed of search volume and the position of ranking keywords. Each position is individually measured by a calculated factor from Searchmetrics. Also, the SEO Visibility factors in whether the ranking keywords are navigational or informational. SEO Visibility presents the current trend and historical development of a domain's visibility in search engines. The index reflects how often a website shows up in the search results. While SEO Visibility can relate to a website's real traffic, it is important to remember that traffic can come from many different places online. Therefore SEO Visibility is only an indicator of visibility that comes from a website's organic search channel. SEO Visibility makes it possible to compare the performance of different domains in search engines. Because of the historical data, problems as well as positive changes can be identified. Comparing the SEO Visibility of thematically similar websites or competitors will provide the most value as developments of the market environment will be factored in and market trends easily identified". Source: SearchMetrics.
SimilarWeb's Traffic sources: "SimilarWeb doesn't rely on any single channel for data collection. We work with a wide variety of sources to create the most accurate and reliable picture of the digital world. All of this data is fed into SimilarWeb's data processing servers where we turn billions of daily data points into insightful information. Our data comes from 4 main sources:
[1] Panel of Web Surfers: Our User Panel is the largest panel in the industry (tens of millions). Panel data is collected from tens of thousands of browser plugins, desktop software, and mobile apps.
[2] Global Internet Service Provider: We also collect data from local Internet Service Providers (ISPs) in many countries.
[3] Direct Measurement: We have directly measured web traffic from tens of thousands of websites that share their data with SimilarWeb. When directly measured data is available it replaces our estimations to give unparalleled accuracy within our platforms. We also use this data to create highly accurate estimation algorithms.
[4] Web Crawlers: Our web crawlers scan every public website to create a highly accurate map of the digital world. We implement big data technologies on our data center consisting of dozens of high-end servers that analyze tens of terabytes of data every week and more than a billion data points every single day. The volume of data we manage and process makes our insights highly accurate and reliable. Our raw data is treated with in-house algorithms to remove biases, filter out noisy information, and transform it into meaningful insights. The data from our diversified sources is intelligently combined, normalized, and projected to represent the entire Internet population." Source: SimilarWeb.

Vector fields are extensively used in science to encode different data sets, e.g. velocity, electricity and magnetism, temperature, and stress/strain (Levine, 2005), and in all sorts of physical and biological applications: drawing maps of wind velocity for weather forecasts, visualizing the shape and behavior of the electromagnetic field surrounding electronic devices, describing the dynamics of populations in biology, and more recently in evolutionary game theory and economics (Sandholm, 2010). In general, they help in understanding the evolution of the state of abstract physical systems. In a mathematical sense, time-independent vector fields are the same thing as autonomous ordinary differential equations (Asimov, 1993), so for analysis and communication purposes we can move entirely to purely graphical (and equivalent) representations. Simply put, vector fields are used in physics for large-scale data analysis in order to achieve multiscale, level-of-detail exploration, and they are among the few rigorous descriptors of flow dynamics that are parameter-free (Chen, 2013). Thus, our initial assumption was that the application of vector fields to data obtained by webdatametrics techniques would be well suited to map large amounts of information in network economies, where information is biased towards the more visible nodes (Barabási, Albert, Jeong, 1999).
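The equivalence between time-independent vector fields and autonomous ordinary differential equations mentioned above can be written out explicitly. The notation is an assumption chosen for illustration (F for the field, x(t) for a trajectory starting at x_0, and φ_t for the flow); it does not come from the cited sources.

```latex
\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{F}\bigl(\mathbf{x}(t)\bigr), \qquad \mathbf{x}(0) = \mathbf{x}_0, \\
\varphi_t(\mathbf{x}_0) &= \mathbf{x}(t), \qquad
\left.\frac{d}{dt}\,\varphi_t(\mathbf{x}_0)\right|_{t=0} = \mathbf{F}(\mathbf{x}_0).
\end{aligned}
```

The first line reads the field as an autonomous differential equation; the second recovers the field from the flow by differentiating at time zero, which is the sense in which flows are generated by vector fields and vice versa.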
Flow patterns and critical points: By observing the topology of a vector field, we obtain a skeleton of the information, i.e. the defining structure of the vector field. In doing so, we can consider only areas of interest such as critical points or, in the unsteady case, bifurcations (Levine, 2005). For instance, in Figure 2 we see typical surface representations of 2- and 3-dimensional fluid flow topology. R1 and R2 denote the real parts of the eigenvalues of the Jacobian (the matrix of all first-order partial derivatives of a vector-valued function), and I1 and I2 denote the imaginary parts. An eigenvalue is the characteristic value associated with an eigenvector, which defines a direction that is invariant under a linear transformation.
Source: Surface representations of 2- and 3-dimensional fluid flow topology (Helman, Hesselink, 1990)
The identification of patterns is nontrivial. For instance, nonhyperbolic critical points, such as centers, mean that a vector field is structurally unstable, because an arbitrarily small perturbation can change the critical point to a hyperbolic one (Levine, 2005). Conversely, the dynamics of a hyperbolic trajectory are not easily disturbed. Therefore, the phase portrait containing those kinds of patterns provides a wealth of information about the dynamics of a system: how it works, its history and where it might go next.
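The role of R1, R2, I1 and I2 can be illustrated with a short sketch that classifies a planar critical point from the eigenvalues of a 2x2 Jacobian. The function names and the tolerance `tol` are assumptions made for this example, and the labels follow the usual first-order taxonomy (saddle, node, focus, center); this is not the code used in the study.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the Jacobian [[a, b], [c, d]], from its trace and determinant."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def classify_critical_point(a, b, c, d, tol=1e-9):
    """Label a first-order critical point using the real and imaginary eigenvalue parts."""
    l1, l2 = eigenvalues_2x2(a, b, c, d)
    r1, r2, i1 = l1.real, l2.real, l1.imag
    if abs(i1) > tol:
        # Complex-conjugate pair: the flow rotates around the point.
        if abs(r1) <= tol:
            return "center"  # nonhyperbolic: purely imaginary eigenvalues
        return "repelling focus" if r1 > 0 else "attracting focus"
    if abs(r1) <= tol or abs(r2) <= tol:
        return "degenerate"  # a zero eigenvalue: first-order analysis is inconclusive
    if r1 * r2 < 0:
        return "saddle"
    return "repelling node" if r1 > 0 else "attracting node"


print(classify_critical_point(0.0, -1.0, 1.0, 0.0))   # pure rotation
print(classify_critical_point(1.0, 0.0, 0.0, -1.0))   # one expanding, one contracting direction
print(classify_critical_point(-1.0, 0.0, 0.0, -2.0))  # contraction in both directions
```

A center sits exactly on the boundary between attracting and repelling foci, which is why an arbitrarily small perturbation of the Jacobian can change its type.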
Streamlines: To visualise these flow patterns it is sometimes convenient to represent streamlines instead of the vectors themselves. A streamline is a curve that is everywhere tangent to the local velocity vector at a given instant (an instantaneous line), and it is convenient to compute mathematically, as in Figure 3.
Source: Imaging vector fields using line integral convolution (Cabral, Leedom, 1993).
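A streamline can be approximated numerically by stepping along the field from a seed point. The sketch below uses a classical fourth-order Runge-Kutta step; the integrator choice, the step size and the rotation field are assumptions for illustration only, not the procedure behind the paper's figures.

```python
import math


def rk4_step(field, x, y, h):
    """Advance one fourth-order Runge-Kutta step of size h along the field."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = field(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)


def trace_streamline(field, seed, h=0.01, steps=100):
    """Return the list of points visited when following the field from the seed."""
    points = [seed]
    x, y = seed
    for _ in range(steps):
        x, y = rk4_step(field, x, y, h)
        points.append((x, y))
    return points


def rotation_field(x, y):
    """Illustrative field whose streamlines are circles around the origin."""
    return (-y, x)


line = trace_streamline(rotation_field, (1.0, 0.0), h=0.01, steps=628)
radii = [math.hypot(x, y) for x, y in line]
print(f"radius along the streamline stays within [{min(radii):.6f}, {max(radii):.6f}]")
```

Because the traced curve is tangent to the field at every step, the streamline seeded at (1, 0) stays on the unit circle up to integration error.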
Stock and flow dynamics: Another useful definition is the basic "stock and flow" concept from the System Dynamics methodology (Forrester, 1971). Forrester's system dynamics methodology for modeling information flows in continuous-time systems is widely used today in research within the social and life sciences (Cellier, 1991; Fisher, 2007). Figure 5 depicts the general continuous level model of System Dynamics with a single inflow and a single outflow, which computes the level by integrating the difference between the inflow and outflow rates. While vector fields are used to map the topology of the bidimensional "surface" of the visibility field, Forrester's device provides a simple way to understand the unidimensional changes in liquidity.
Figure 6 uses the "stock and flow" concept from the System Dynamics methodology (Forrester, 1971; Cellier, 1991; Fisher, 2007) to depict those relationships. We can then define a study period of 6 months and draw gauges for 2 moments, at the beginning and end of the period (Figure 7), as typically represented in financial dashboards and other commercial products.

Source: Company A financials (an example)
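The single-inflow, single-outflow level model described above amounts to integrating the difference between the two rates. The following sketch does this with a forward Euler step; the function name, the weekly time step and all the numbers are hypothetical and are not Company A data.

```python
def simulate_level(initial_level, inflow, outflow, dt, steps):
    """Integrate d(level)/dt = inflow(t) - outflow(t) with forward Euler steps."""
    level = initial_level
    history = [level]
    for k in range(steps):
        t = k * dt
        level += (inflow(t) - outflow(t)) * dt
        history.append(level)
    return history


def weekly_inflow(t):
    """Hypothetical deposits per week, with a short burst in weeks 12 and 13."""
    return 120.0 if 12 <= t < 14 else 100.0


def weekly_outflow(t):
    """Hypothetical constant withdrawals per week."""
    return 100.0


levels = simulate_level(500.0, weekly_inflow, weekly_outflow, dt=1.0, steps=26)
print("level at start, before burst, after burst, at end:",
      levels[0], levels[12], levels[14], levels[-1])
```

A gauge drawn only at the first and last week would show the change in level but not when or why it happened, which is the limitation of the static view discussed next.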
Note that Company A works as a business unit: a brand that, beyond its function as an asset, also acts as its own profit center. The static representation is convenient for some purposes: it is a quick pulse of the health of the operation, and it offers some information security (no actual deposit or withdrawal figures are shown). The problem with this approach alone is that an observer has to be inside the organization (with complete internal information) to notice disturbances within the flow, and even then would be unable to perceive external variables that may affect or compensate for sudden changes in flow (such as search engine visibility changes due to traffic and ranking corrections). Viewed from above (a top-view perspective) and at a distance (as is usually the case, because we always have incomplete information about our competitors, other market players, and even ourselves), we might get a clearer picture of where the stream is constant and where it is disturbed.
This also allows us to add extra dimensions to our data, beyond the simplistic "money level". Seen from above, this sort of hidden flow will look as in Figure 8, which represents the data as a stream (or money on the surface). [...] can be studied a posteriori. In a way, we are transforming vast amounts of data points from the digital economy realm into a more tangible expression: converting digital signals to analog ones, for the convenience of the decision maker who has to quickly understand and act upon the data at hand. Furthermore, the field visualisation is generated programmatically, so it can be updated with the most recent data in real time, in a cloud computing environment.
From the slope of the curves we can see what appears to be a flow pattern with a tendency for incremental changes in velocity; however, the most intriguing criticality occurs during the first 2 weeks of the third month, where an accelerating stream driven by the visibility component of the vector catches the eye. As expected from emergent behavior in complex networks, a simple regression fails to capture the relationship in the data (visibility shows no statistically significant effect on liquidity).
Nevertheless, it is clear that there is some sort of "impulse response" present here, where a spike in visibility in week 12 gives momentum to liquidity (the week 13 peak, and the overall rising trend after that, in Figure 9).

Figure 9
Visibility and liquidity trends (weekly). Ratios and indexes are shown on the y-axis and weeks on the x-axis

Source: SearchMetrics and Company A financials
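One simple way to look for the kind of delayed "impulse response" described above is to correlate one series with a time-shifted copy of the other. The sketch below is an illustration under stated assumptions, not the analysis performed in this study: the two weekly series are invented, and the helper names (`pearson`, `lagged_correlation`) are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Invented weekly series: a visibility spike followed one week later by a liquidity peak.
visibility = [1.0, 1.1, 1.0, 0.9, 1.0, 3.0, 1.1, 1.0, 0.9, 1.0]
liquidity = [2.0, 2.1, 2.0, 2.1, 2.0, 2.1, 4.0, 2.9, 2.6, 2.4]

for lag in range(4):
    print(f"lag {lag}: correlation {lagged_correlation(visibility, liquidity, lag):+.2f}")
best_lag = max(range(4), key=lambda k: lagged_correlation(visibility, liquidity, k))
print("lag with the strongest positive correlation:", best_lag)
```

In this toy data the same-week correlation is negative while the one-week lag is strongly positive, which shows how a contemporaneous regression can miss a relationship that a lagged view reveals.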
The discovery of this kind of "bursty" behavior is one of the key insights from the modern study of complex systems. The dynamics of a wide range of real systems, from email patterns to earthquakes, display a bursty, intermittent nature, characterized by short timeframes of intense activity followed by long times of no or reduced activity (Goh, Barabási, 2008).
What we are seeing is similar to turning on an extremely bright light in a dark room for a short moment: suddenly one obtains a wealth of information to navigate the environment, at least for some time (as memory and interest allow). But how can bursts act as triggers of changes in the flow? What are the key signals to control their occurrence, and for how long might their effect have material impact? This is an important point, because any chance of replicability may have an immediate positive effect on revenues.
The mechanics of bursting are the subject of a separate study. For now, let us focus on the visualization of the flow, and its utility to inform management and investment decisions. We know that a vector field arises in a situation where, for some reason, there is a direction and magnitude assigned to each point of space (Asimov, 1993).
In particular, we are interested in the critical points (singularities) of the field, so we may find it convenient to draw a Line Integral Convolution plot (Cabral, Leedom, 1993; Laidlaw et al., 2001).
Figure 10
Simplified representation of the dynamic flow, using Line Integral Convolution

Source: SearchMetrics and Company A financials
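The idea behind Line Integral Convolution is to smear a random noise texture along the streamlines of the field, so that pixels on the same streamline end up with similar intensity. The sketch below is a deliberately simplified variant (unit Euler steps and a box filter) written for illustration; it is not the algorithm as published by Cabral and Leedom nor the Wolfram Language routine used for Figure 10, and the function and parameter names are assumptions.

```python
import math
import random


def lic(field, width, height, length=8, seed=0):
    """Average white noise along short streamline segments through each pixel."""
    rng = random.Random(seed)
    noise = [[rng.random() for _ in range(width)] for _ in range(height)]
    image = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total, count = noise[row][col], 1
            for direction in (1.0, -1.0):  # follow the streamline forward, then backward
                x, y = col + 0.5, row + 0.5
                for _ in range(length):
                    u, v = field(x, y)
                    norm = math.hypot(u, v)
                    if norm == 0.0:
                        break  # critical point: the streamline stops here
                    x += direction * u / norm
                    y += direction * v / norm
                    i, j = int(math.floor(x)), int(math.floor(y))
                    if not (0 <= i < width and 0 <= j < height):
                        break  # left the image
                    total += noise[j][i]
                    count += 1
            image[row][col] = total / count
    return image


def center_field(x, y):
    """Illustrative field circulating around the middle of a 16x16 image."""
    return (-(y - 8.0), x - 8.0)


texture = lic(center_field, 16, 16)
flat = [value for row in texture for value in row]
print(f"texture intensities range from {min(flat):.3f} to {max(flat):.3f}")
```

Rendering `texture` as a grayscale image would show streaks that follow the circulation around the center, which is how critical points become visible in this kind of plot.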
At this point we can tangibly perceive that liquidity follows visibility. Our new visualization, shown in Figure 10, works like "painting with light", except that in this case we have a (liquidity, visibility) field rather than a light field. A quick look at this kind of phase portrait immediately gives a feeling for the opportunity and provides direction on which questions to ask next. There is also an enhanced information security aspect to this representation, owing to the possibilities it offers to encode information using colors, or even to hide secret messages in plain sight (for instance by using steganography techniques).

COMPANY B: ANTICIPATED LIQUIDITY CHANGES
Company B is a medium-to-large information portal (receiving over 5 million visits a month from desktop computers) that specializes in technology tutorials. This kind of firm mainly operates a business model of content production and advertisement sales, and therefore relies on a continual stream of fresh traffic to generate revenue. Without the benefit of internal information on the operations of this private business, how can we anticipate downside pressure on revenue? First, in Figure 11 we notice that most of the traffic (around 94%) comes from organic web search (Google and the like), meaning that listings appear in search engines because of their relevance to the query rather than as paid results; the rest of the traffic is divided among direct visits [...]
By looking at the top 5 contributor countries, we find that the United States and Canada account for over 50% of both visibility and traffic, so we now plot the visibility vector field for a period of at least 2 years (in order to capture the possible effect of the business cycle). Vector fields have been used extensively in physical and biological applications to encode data sets as an equivalent alternative to differential equations (Asimov, 1993) and more recently in evolutionary game theory and economics (Sandholm, 2010).

Figure 12
Visibility vector field (the United States, Canada) during 27 months. The time-like surface covers a period of six months (x-axis) over 4 weeks (y-axis)

Source: Searchmetrics
In Figure 12, we plot the vectors themselves (not just the streamplot). We can therefore appreciate both the magnitude and the direction of the visibility field: visibility seems to be stronger during the second year, and it keeps the same direction, with a general tendency to increase. In general terms, the importance of mapping visibility for country pairs is that it captures the moments when the values of the indexes reinforce each other, when a weekly drop in one geography is cancelled by a surge in another, and so forth. In this specific case the impact is limited, though, since there is a difference of one order of magnitude between the two geographies mapped: the website receives its largest share of traffic from the US. Now that we have a general understanding of the shape of the data, we can focus on the area that seems most intriguing, the last ten weeks of the period. We are interested in large movements, apparent changes, and critical points in the flow pattern; in this view we get some hints of differences in density. This is in effect a map of anticipated visitor liquidity and, therefore, revenue. It is akin to "turning on" a light at different spots in space (in this plane, time): search engines will show online presence more clearly at some points (and moments) and less clearly at others, thereby limiting the ability of the online business to attract new visitors and advertisement partners. A useful analogy from the physical world is that of a dark room where an object is visible, and therefore potentially reachable, only when the light is turned on at certain moments; or that of a retail shop that loses its listing in the yellow pages: it will retain some walk-in traffic for a while as an effect of previous visibility, but eventually its traffic will fade, as new customers cannot see a clear path to it and have no way to find it.
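The notion of two geographies reinforcing or cancelling each other can be made concrete by treating the weekly change in each country's visibility index as one component of a vector and looking at the signs of the components. This is only a sketch under assumptions: the paper does not spell out how its field is constructed, the index values are invented, and the helper names are hypothetical.

```python
def weekly_changes(series):
    """Week-over-week differences of an index series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def classify_pair_moves(first_index, second_index):
    """Label each week by whether the two geographies move together or offset each other."""
    labels = []
    for d_first, d_second in zip(weekly_changes(first_index), weekly_changes(second_index)):
        product = d_first * d_second
        if product > 0:
            labels.append("reinforcing")
        elif product < 0:
            labels.append("offsetting")
        else:
            labels.append("neutral")
    return labels


# Invented index values, one order of magnitude apart as in the case described above.
us_visibility = [1000, 1100, 1050, 1200, 1200]
ca_visibility = [100, 110, 120, 130, 125]

vectors = list(zip(weekly_changes(us_visibility), weekly_changes(ca_visibility)))
labels = classify_pair_moves(us_visibility, ca_visibility)
for week, (vector, label) in enumerate(zip(vectors, labels), start=1):
    print(f"week {week}: change vector {vector} -> {label}")
```

With components one order of magnitude apart, an offsetting move in the smaller geography barely changes the direction of the vector, which matches the limited impact noted for this case.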

Conclusions, Possible Applications and Further Research
In Mergers & Acquisitions and other investment activities there are at least four distinct phases: Target identification, Due diligence, Negotiation, and Valuation. In the context of digital businesses, the use of advanced analytics is essential in at least the first two of these. Topological data analysis deals with the study of the shape of data to extract encoded meaning from large, complex datasets; system dynamics shows how structure determines behavior. Shape matters.
The literature on the valuation of intangibles and on the role of popularity in boosting short-term asset performance supports the case for developing new methods to extract valuable business meaning from the hidden and weak signals present in the digital economy, especially (but not only) for private companies, where the availability of information for investment and competitive decision-making is limited. We have seen how vector field topology is useful for flow analysis and visualisation, and how it provides an intuitive way to quickly identify critical points in circumstances where data is massive, continually changing, and asymmetrical in terms of geographical reach.
We have demonstrated how to approach the construction of an elemental liquidity portrait, both when partial internal information is available and when only external information is available. This type of exercise may help online businesses anticipate liquidity issues or track competitors' position changes, and allow investors to assess the quality of prospect leads and portfolio companies operating in the digital economy.
While the present treatment is focused on income and cost, additional dimensions are needed for benchmarking purposes, for instance for disambiguation in cases of conflicting valuation results. By using Optimisation Theory and adding brand perception (social network datametrics) as a third dimension to the branded traffic from web channels, one could reveal a saddle, sink or source topology. That approach might be useful to explain visually why it is easier to recover from a valley due to previous reputation threats, and why in some cases a strong marketing investment in brand building is instead associated with decreasing returns (Čižinská, Krabec, 2014).
A future paper will analyze in detail different types of flow patterns observed in online businesses, their meaning and utility for decision making, as well as this additional social-space dimension.