Predicting company decisions using knowledge graphs

In a previous post, Understanding Company Decisions Using Knowledge Graphs, I argued that “using some Graph algorithm” it is possible to predict Google’s acquisition of the data analytics company Looker. This post shows how I used “some Graph algorithm” to do that.

Translating business reasoning into technical reasoning

In the above-linked post I reasoned that Google acquired Looker because Google’s Cloud competitors are expanding into a segment which is adjacent to Cloud, important to Google, and one where Google is not active. As you can see, the acquisition happened in a context: Amazon, Google, and Microsoft are active in the same industries except “data analytics and visualization”, where Amazon and Microsoft are active but Google is not. In the graph of this context, the nodes have the following meaning: yellow nodes are companies, red nodes are products, and green nodes are industry segments.

With the data model in mind, let’s break the business requirements down and translate them into their technical counterparts.

Finding strategic differences between companies using graph visualization and calculated company similarity

Google’s Cloud competitors are expanding into a segment where Google is not active. This means that based on industry segments Google, Amazon, and
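The “calculated company similarity” named in the heading above could be sketched as a Jaccard similarity over the industry segments each company is active in. This is only an illustration under assumptions: the excerpt does not say which similarity measure the post uses, and the `segments` dictionary contains just the two segments named in the text. The function names `jaccard` and `missing_segments` are made up for this sketch.

```python
def jaccard(a, b):
    """Share of segments two companies have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def missing_segments(company, competitors, segments):
    """Segments that every competitor covers but the company does not."""
    shared = set.intersection(*(segments[c] for c in competitors))
    return shared - segments[company]


# Assumption: only the two segments mentioned in the text are modelled.
segments = {
    "Amazon": {"Cloud", "Data analytics and visualization"},
    "Microsoft": {"Cloud", "Data analytics and visualization"},
    "Google": {"Cloud"},
}

if __name__ == "__main__":
    print(jaccard(segments["Amazon"], segments["Microsoft"]))
    print(jaccard(segments["Google"], segments["Amazon"]))
    print(missing_segments("Google", ["Amazon", "Microsoft"], segments))
```

A lower similarity between Google and its Cloud competitors, combined with a non-empty set of missing segments, is the kind of strategic difference the section describes.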

Analyzing news articles using graph algorithms

When analyzing company investments it is useful to group investors to understand industry patterns. The image below shows such a grouping for some recent investments in the FinTech space; the company clusters are immediately visible.

While useful, these relationships only capture the most basic data: companyA invests in companyB. More realistically, however, such investments happen in a context; for instance, the above image does not capture the fact that the investors and companies have implicit relationships with each other. The image below takes these implicit relationships into account.

Using Neo4j to analyze groups in this version shows different groups. As an example, consider Google and Amazon. In the basic version they were not connected; however, Neo4j’s grouping algorithm put them into the same bucket. When we drill down we see why this grouping makes sense: Google and Amazon are both active in the insurance industry. It is worth noting that this relationship is not explicit but arises through the respective partnerships and sub-companies: Google owns Verily, which in turn has launched Coefficient Insurance Company, a company in the insurance industry. Amazon, on the other hand, has partnered with Acko, also a company in the insurance industry.
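The effect of adding implicit relationships can be illustrated with a small sketch. It does not use Neo4j and does not reproduce Neo4j’s grouping algorithm, which the text does not name; as a stand-in it uses plain connected components, the simplest form of grouping. The implicit edges come from the paragraph above, while the two investment edges (“Startup A”, “Startup B”) are hypothetical placeholders.

```python
from collections import defaultdict


def connected_groups(edges):
    """Map each node to a group id using connected components."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    group = {}
    for start in sorted(neighbours):
        if start in group:
            continue
        # Depth-first walk: everything reachable from `start` joins its group.
        group[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in group:
                    group[nxt] = start
                    stack.append(nxt)
    return group


# Hypothetical explicit investments (placeholders, not real data).
explicit = [("Google", "Startup A"), ("Amazon", "Startup B")]

# Implicit relationships as described in the text.
implicit = [
    ("Google", "Verily"),
    ("Verily", "Coefficient Insurance Company"),
    ("Coefficient Insurance Company", "Insurance"),
    ("Amazon", "Acko"),
    ("Acko", "Insurance"),
]

basic = connected_groups(explicit)
full = connected_groups(explicit + implicit)

if __name__ == "__main__":
    print("same group (basic):", basic["Google"] == basic["Amazon"])
    print("same group (with implicit):", full["Google"] == full["Amazon"])
```

In the basic graph Google and Amazon end up in different groups; once the implicit relationships are added, both reach the insurance industry node and fall into the same group.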

Using knowledge graphs to replace analysts

Recently CB Insights published a FinTech report titled The State Of Fintech Q3’20 Report: Investment & Sector Trends To Watch. This post shows how a knowledge graph can be used to automatically generate such a report.

About knowledge graphs

A knowledge graph consists of two kinds of parts: nodes and relationships. A node represents an entity (a company, an industry, etc.). A relationship represents how two nodes are connected. In this case, the knowledge graph’s nodes are companies, the offerings that these companies launched, the industries in which these companies are active, and the companies that invested in these companies. The relationships are: company “IS IN” industry, company “INVESTED IN” company, and company “launched” offering. The graph below shows an excerpt of the whole graph: for instance, you see three companies that invested in Revolut, that Revolut is in the FinTech industry, and that Revolut launched commission-free stock trading.

With this in mind, the remainder of this post shows how graph algorithms for anomaly detection can generate the above-mentioned report.

Anomaly detection

Anomaly detection is used to find rare nodes or relationships which are significantly different from the remaining data. Such nodes or relationships signal unoccupied industries or segments for innovations or new markets. Consider this insight from

Going beyond the article headline: what knowledge graphs reveal

I recently read this article: Qlik übernimmt. From the article it was clear what happened: Qlik, a Business Intelligence company, acquired a Data Integration company. However, I was wondering: What am I missing? What information is not evident from the article alone? To answer this question I built a knowledge graph. The knowledge graph consists of companies, industries, and company strategies that are related to Qlik and the acquired company. An overview of the knowledge graph is shown in the image below. The meaning of the colors is as follows: orange represents a company, blue represents an industry, and red represents a company strategy. There are several reasons why such a knowledge graph can answer the question ‘What am I missing?’. They are explained below.

There is more to it than just ‘Qlik acquired’

The image below shows two graphs: in the upper left corner you see the basic graph that represents ‘Qlik acquired’; in the remaining image you see the full picture with all the relationships that Qlik and the acquired company have. It is immediately clear that ‘Qlik acquired’ is too simplistic. If you look at the whole graph you immediately notice: industry outsiders like Google or PwC are active

Understanding company decisions using knowledge graphs

Recently I read this article (Cloudera adds data engineering, visualization to its Data Platform) and wondered why it was relevant. To an industry expert the implications are clear, but to me, an industry novice, they are not. This situation made me wonder how a knowledge graph can be used to automatically answer the “why” behind a company’s decision. Based on an innovation process in patent analysis, I came up with this process:

1. I manually identified a similar article (Tech and Antitrust Follow-up, Google Buys Looker, Salesforce Buys Tableau; paywall).
2. From this article, I constructed the knowledge graph below. I defined explicit relationships (as mentioned in the article) and implicit relationships (in green; these relationships can be inferred from an external knowledge source like Wikipedia).

The knowledge graph depicts the following points: Google’s biggest competitors (Microsoft and Amazon) own tools in the data analytics and visualization segment (Looker’s segment); the data analytics and visualization segment is important to Google because that segment is part of the “Big Data movement” (implicit relationship) and “Big Data” is important to Google (implicit relationship); and Google does not own any tools in the analytics segment. Based on this information, it makes sense that Google acquired a tool in

Is AI to Tableau what VLOOKUP is to Excel?

As Ben Thompson from Stratechery wrote on Google’s acquisition of Looker, “data analytics and visualization is a large and growing segment in enterprise software”. As Boris Evelson from Forrester points out, BI tools have reached technological maturity in certain areas such as “database connectivity and data ingestion, security, data visualization, and slice-and-dice OLAP capabilities”. At the same time he points out the lack of demand:

Fifty-six percent of global data and analytics decision makers (seniority level of manager or above) say their firms are currently in the beginner stage of their insights-driven transformation. Further anecdotal evidence shows that enterprises use no more than 20% of their data for insights, and less than 20% of knowledge workers use enterprise BI applications, still preferring spreadsheets and other shadow IT approaches.

The reasons are, as he points out, “the low maturity of the people/process/data”. BI vendors are trying to solve this issue by extending their solutions into end-to-end (E2E) tools. Consider Pentaho’s integration with Lumada:

Lumada’s focus is on covering the entire data lifecycle, from the integration of various data sources to the evaluation of video and IoT data in compliance with GDPR (DSGVO) regulations and their deployment in self-service applications. Pentaho’s plans for its

Data loading process in the data warehouse to handle deletes

When you are populating your data vault, you might need to delete from your stage tables in an asynchronous way. The data flows load -> staging -> integration layer, and only when you have populated the integration layer can you delete the entries from your load table. One way to achieve this is to implement a delete-tracking table that tracks your deletes. The process is like this:

1. Set up a metadata table that contains your target table and its source table.
2. After populating a table in the integration layer, store this information in the delete-tracking table. Concretely, you track the source table, the table in the integration layer, and the highest load date in your table in the integration layer.
3. Initiate the delete process: for each source table defined in your metadata table, get the lowest load date from the delete-tracking table. If there is no entry in your delete-tracking table, use 1753 as a default. Delete every entry from your source table that is lower than this load date.
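The three steps above can be sketched with Python’s built-in `sqlite3` module. This is a minimal illustration, not the original implementation: the table and column names (`load_metadata`, `delete_tracking`, `max_load_date`, `stage_customer`, and so on) are assumptions, load dates are stored as ISO strings, and the default of 1753 is written as `1753-01-01`.

```python
import sqlite3

# Assumption: "1753" from the text expressed as an ISO date string.
DEFAULT_LOAD_DATE = "1753-01-01"


def purge_source_tables(conn):
    """Step 3: delete source rows older than the lowest tracked load date."""
    deleted = {}
    sources = [
        row[0]
        for row in conn.execute("SELECT DISTINCT source_table FROM load_metadata")
    ]
    for source in sources:
        # MIN() yields NULL when nothing is tracked yet, so fall back to the default.
        (cutoff,) = conn.execute(
            "SELECT COALESCE(MIN(max_load_date), ?) "
            "FROM delete_tracking WHERE source_table = ?",
            (DEFAULT_LOAD_DATE, source),
        ).fetchone()
        # Table names come from our own metadata table, not from user input.
        cur = conn.execute(
            f'DELETE FROM "{source}" WHERE load_date < ?', (cutoff,)
        )
        deleted[source] = cur.rowcount
    conn.commit()
    return deleted


def build_demo_db():
    """Steps 1 and 2 with hypothetical tables and load dates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE load_metadata (target_table TEXT, source_table TEXT);
        CREATE TABLE delete_tracking (
            source_table TEXT, target_table TEXT, max_load_date TEXT);
        CREATE TABLE stage_customer (id INTEGER, load_date TEXT);
        CREATE TABLE stage_order (id INTEGER, load_date TEXT);

        INSERT INTO load_metadata VALUES
            ('hub_customer', 'stage_customer'),
            ('sat_customer', 'stage_customer'),
            ('hub_order', 'stage_order');
        INSERT INTO stage_customer VALUES
            (1, '2020-01-01'), (2, '2020-01-02'), (3, '2020-01-03');
        INSERT INTO stage_order VALUES (1, '2020-01-01');

        -- hub_customer is loaded up to 01-03, sat_customer only up to 01-02
        INSERT INTO delete_tracking VALUES
            ('stage_customer', 'hub_customer', '2020-01-03'),
            ('stage_customer', 'sat_customer', '2020-01-02');
        """
    )
    return conn


if __name__ == "__main__":
    print(purge_source_tables(build_demo_db()))
```

Taking the lowest tracked load date means that a source row is only removed once every integration-layer table fed by that source has moved past it. Because the comparison is strictly “lower than”, as in the description, rows at the cut-off date itself stay in the source table until a later run.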

How Coinbase is building a crypto empire for users’ crypto lifecycles

Recently, Coinbase acquired the task platform Earn. Coinbase is an online platform for users and merchants to buy, sell, and accept cryptocurrencies. For these activities Coinbase has three different products: the GDAX exchange for buying and selling of cryptocurrencies by institutional and professional investors; buying and selling of cryptocurrencies for “mainstream” users; and Coinbase Commerce, merchant payment systems for accepting cryptocurrency payments.

Earn is a task platform where users earn bitcoin for completing tasks. The tasks are offered by blockchain startups doing an ICO and involve things like signing up for newsletters or joining Telegram groups. Those blockchain startups are very often in an early phase, and Earn serves as a marketing tool for them.

Some argue that the acquisition was an acqui-hire for Earn founder and CEO Balaji Srinivasan. And Balaji Srinivasan, who has an impressive track record (among other things as partner at Andreessen Horowitz), is now indeed Coinbase’s CTO. While acqui-hiring Balaji Srinivasan might be the acquisition’s real intention, looking at the acquisition in the context of Coinbase’s other acquisitions and their self-imposed company description shows another perspective, namely that Coinbase is building a “crypto empire” serving a user’s whole “crypto lifecycle”.

Building a crypto empire with, Cipher Browser,, and

Thoughts on Stellar's Randos Per Week in the context of increased crypto awareness

In their “Stellar 2018 Roadmap” (see Thoughts on “Stellar 2018 Roadmap”) Stellar jokingly (at least I hope so) shared “the critical indicator for a decentralized protocol” (original emphasis), namely randos per week (r.p.w. or rpw; the number of random people talking about crypto), and promised equally moonish growth. Although meant as a joke, there is some truth in those numbers. The number of average, “non-crypto” people talking about it has, at least in my perception, increased in the last couple of weeks and months. More importantly, such popularity metrics are important for the diffusion of cryptoassets; the more people know about it, the greater the likelihood of acceptance. Nevertheless, the recently increased popularity of crypto is not without its caveats.

Prevalence of common misconceptions hindering diffusion: Firstly, a lot of the attention is still on getting rich, crypto being a bubble, and people confusing all alts with Bitcoin. As long as these misconceptions prevail, crypto won’t reach mass market adoption.

Creation of overhyped interest leading to a bursting bubble: On the one hand, Blockchain and Co. are overhyped to be the next big thing, and if possible right now. On the other hand, adoption is either low or not perceived because it is happening under the hood. For instance, Stellar’s partnership with Tempo