botnet

BIG DATA: The Beauty of Global Networks of Data Exhaust

As the human world becomes more digital, our connections and interactions are recorded and shared. We go from knowing 150 people and analyzing a few stories a week to 2 billion people sharing hundreds of millions of stories constantly. But humans still need to understand what's going on underneath. In this entry, we want to highlight how massive, machine-scale systems are visualized through mathematical methods to tell new stories. These charts -- giant sprawling data webs like airplane traffic patterns etched onto the globe -- are the future of literacy in the machine age.

In the first example, we borrow two images from Google. The Google Cloud team created a service that grabs the entire Ethereum blockchain, backs it up on Cloud, and makes it easier to analyze. The first image shows the Crypto Kitty universe, with color indicating the owner of the contract (kitty whales!) and bubble size ranking the quality of the asset. We can certainly imagine this done on regular old financial assets. The second visualization is for transactions: points are wallets and lines are asset movements. You can immediately see wallet clustering, in which entities that transact with each other more frequently sit closer together. In this way, one can ferret out exchange wallets or bots. Hey there, Bitfinex!
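To make the wallet-graph idea concrete, here is a minimal sketch of the underlying data structure, not Google's actual pipeline. The wallet names, the transfer list, and the `min_counterparties` threshold are all hypothetical. Pairs that transact often get heavier edges (and would be drawn closer together), while a wallet touching many distinct counterparties is a candidate exchange wallet or bot.

```python
from collections import Counter, defaultdict

# Hypothetical transfers as (sender wallet, receiver wallet). Not real chain data.
transfers = [
    ("alice", "exchange"), ("bob", "exchange"), ("carol", "exchange"),
    ("exchange", "dave"), ("exchange", "alice"),
    ("erin", "frank"), ("frank", "erin"), ("erin", "frank"),
]

def edge_weights(transfers):
    """Count transfers between each unordered pair of wallets."""
    weights = Counter()
    for sender, receiver in transfers:
        weights[frozenset((sender, receiver))] += 1
    return weights

def counterparties(transfers):
    """Map each wallet to the set of distinct wallets it has transacted with."""
    neighbors = defaultdict(set)
    for sender, receiver in transfers:
        neighbors[sender].add(receiver)
        neighbors[receiver].add(sender)
    return neighbors

def likely_hubs(transfers, min_counterparties=4):
    """Wallets with many distinct counterparties (threshold is an assumption)."""
    neighbors = counterparties(transfers)
    return sorted(w for w, n in neighbors.items() if len(n) >= min_counterparties)

print(likely_hubs(transfers))
print(edge_weights(transfers)[frozenset(("erin", "frank"))])
```

A force-directed layout would then take the edge weights as attraction strengths. That step is what turns the counts into the clustered picture described above.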



The second source is a ConsenSys write-up on decentralized exchanges, and it is a truly spectacular chart. Do yourself a favor and click to zoom in. The dataset comes from IDEX, EtherDelta, Bancor, 0x, OasisDex, Kyber Network, and Airswap Protocol -- today's decentralized exchanges. Each point is a trading pair, the width of each line is the normalized number of trades, and the line colors signify the exchange used. You can immediately see the most popular trade contracts, as well as exchanges where trading hops through an intermediate token rather than through ETH itself. We'd love to see this for traditional FX markets, or maybe all trading, period!



The last chart is from Geoff Golberg, who mapped out all Twitter accounts engaged in the Ripple XRP community with the purpose of identifying bots. And yep, the 40,000-point cloud has multiple bot armies across the world used to manufacture opinions and drive social engagement. It takes a robust mathematical approach to visualize this information, and a detailed article written by a human to infer the relationships and activities within the data network. This is a taste of the skill sets that will be required to thrive in a machine world.


Source: Google (Ethereum), ConsenSys (Decentralized Exchanges), Medium (XRP Bots)

SOCIAL MEDIA: 15,000 Scammer Twitter Botnet Exposed


What's a botnet's favorite activity, when not trying to take down Minecraft servers using thousands of remotely controlled baby monitors? Some good crypto currency scamming on Twitter, of course! We loved a recent paper from Duo Labs that exposed the structure of the botnet running the "ETH Giveaway" scam, which tricks people into sending a small amount of currency to an address for "verification" and never sends any money back (not unlike the famous Nigerian prince).

The researchers sat on the Twitter API and pulled out data on 88 million public profiles and 576 million tweets. To classify accounts, they used 22 heuristics such as posting frequency, content, unique sources, hashtags, and account age. They trained a Random Forest machine learning model on the data set, using "verified" accounts as controls, and found a 15,000-entity botnet with a three-tiered hierarchical structure. Within this structure, there were (1) individual bots that posted the scam messages, (2) hub accounts that many of the bots followed, and (3) amplification accounts that would like and otherwise engage with these messages. It's a beauty of growth hacking and attention economy manipulation.
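The three-tier structure can be illustrated with a toy role assignment. This is a sketch under stated assumptions, not the Duo Labs method. It skips the Random Forest step entirely and assumes the posting bots have already been flagged. The account names, the follow and like lists, and both thresholds are hypothetical.

```python
from collections import Counter

# Hypothetical inputs. `bots` stands in for accounts already flagged as posting bots.
bots = {"bot1", "bot2", "bot3", "bot4"}
follows = [  # (follower, followed)
    ("bot1", "hub_a"), ("bot2", "hub_a"), ("bot3", "hub_a"), ("bot4", "hub_a"),
    ("bot1", "news"), ("bot2", "bot3"),
]
likes = [  # (liker, author of the liked tweet)
    ("amp_x", "bot1"), ("amp_x", "bot2"), ("amp_x", "bot3"),
    ("fan", "bot1"), ("fan", "news"),
]

def find_hubs(bots, follows, min_share=0.75):
    """Non-bot accounts followed by at least `min_share` of the flagged bots."""
    followers = Counter()
    for follower, followed in set(follows):  # set() drops duplicate edges
        if follower in bots and followed not in bots:
            followers[followed] += 1
    return {acct for acct, n in followers.items() if n / len(bots) >= min_share}

def find_amplifiers(bots, likes, min_bot_likes=3):
    """Non-bot accounts that liked tweets from at least `min_bot_likes` distinct bots."""
    liked_bots = {}
    for liker, author in likes:
        if author in bots and liker not in bots:
            liked_bots.setdefault(liker, set()).add(author)
    return {acct for acct, authors in liked_bots.items() if len(authors) >= min_bot_likes}

print(find_hubs(bots, follows))
print(find_amplifiers(bots, likes))
```

In this toy data, an ordinary account that follows or likes a single bot stays below both thresholds. That is the point of requiring agreement across many bots before assigning a role.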

Such creatures are inevitable in a digital-first world, no matter how much Twitter tries to fight "dehumanization". Over time, they will only get more sophisticated and invisible, as initiatives like Microsoft's TextWorld teach bots to carry on a conversation with humans. Which is why we also have to use machine learning ruthlessly to weed these things out. Such is the responsibility of the attention platforms, like Google, Facebook and Twitter. At the same time, we must not cross the fine line between machine moderation and machine control (looking at you, China). Whoever gets to decide how far to turn the dials on the algorithm controls the volume of millions of voices across the web.
 


Source: Futurism (Twitter Bots), Duo Labs (Paper), Slate (Dehumanization on Twitter), Microsoft TextWorld

BLOCKCHAIN: Scams in Crypto: 20% of ICOs, 5% of Twitter


Getting our heads around just how much scamming and fraud there is in the crypto ecosystem is a challenge -- but not impossible. As the industry continues to put up impressive fund-raising figures (with new issues at about 2% of Ethereum market cap per month), just how much of this will become valuable projects? We've written before about how creative destruction is natural for startups, and that failure rates in the mid-90% range are a reasonable outcome. We've also estimated that hacks of Bitcoin and Ethereum account for about 14% of the money supply in those pools. But what about outright theft and lies?

Two ideas. First, the WSJ analyzed 1,450 ICOs and found that 271 of them, or a bit over 18%, are just total raw scams. Fake copied white papers, team member photos taken from stock photo websites, nothing behind the project but malfeasance. Yikes. Another version of the same story: the North American Securities Administrators Association went after nearly 70 ICO issuers in a coordinated action by regulators across the US and Canada called "Operation Cryptosweep". Which is a totally sweet name for what is a really regrettable but required clean-up of the crypto ecosystem. A 20% chance to lose your money, for no philosophically meaningful reason, is the wrong price to pay for good financial technology in our opinion.

And second, don't forget the propaganda bot armies. Sure, they can influence elections and spread misinformation, but we didn't expect that they would be used for financial warfare this quickly. The practice in question is copy-cat accounts on Twitter that impersonate a Twitter influencer and claim to give out free crypto currency, if only you send them money first. This is hacking of the human kind, and we monkeys fall for it all the time. As a comparison: (1) email phishing maxes out at 0.70%, according to Symantec, and (2) bot automation is at approximately 10% of all activity on Twitter. Given that the crypto ecosystem is more prone to Internet memes and bounty programs, we would expect the rate of phishing to skew higher, say up to 5% for crypto-related conversations. So watch where you point that digital wallet.


Source: WSJ (18% scams), NASAA (Operation Cryptosweep), Bloomberg (Bot PhishingHacks at 14%), Autonomous NEXT (Failure rates)

ARTIFICIAL INTELLIGENCE: Machine Vision Calamities

Source: Devumi bot retweet sales


Let's look at how increasing computing power and algorithm efficiency are leading to some pretty wacky technology in the realm of computer vision. The building blocks are as follows. Neural networks can be trained on large data sets of objects to recognize those objects. They run on video cards (GPUs) and power everything from tagging cat photos to Tesla's self-driving cars. The more GPUs, the more things you can recognize, and the better your data and algorithm efficiency, the more accurate your recognition. 

So here's the example -- Amazon and its magic store, Amazon Go. The company has been testing a checkout-free shopping experience for a few years, and the acquisition of Whole Foods has only encouraged speculation about the future of food retail. New information has come out about how the technology works. First, a shopper scans an identifier on their phone when entering the store. From that moment on, the hundreds of video cameras on the ceiling track every single shopper and every single product in the store. To do this successfully, you need not only gazillions of hours of footage (which is exactly what Amazon is collecting), but also a massive cloud infrastructure to process the machine vision demands in real time. Good thing there's AWS!

The same kind of neural network that can recognize images can also hallucinate them. Generative neural networks can manufacture images of a type, where the type is defined by their source data set. And if you put an editor on top of that, an adversary that critiques each output, you can manufacture pretty accurate renditions of whatever it is you want.
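The "editor as adversary" setup reads like a generative adversarial network, so here is a minimal sketch of the two competing objectives under that assumption. It is not a description of any particular product. The scores are hypothetical stand-ins for the critic's estimated probability that an image is real, and the generator loss shown is the common non-saturating variant.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy for the critic: score real images high, fakes low.
    Scores must lie strictly between 0 and 1."""
    real_term = -sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Non-saturating generator loss: low when the critic scores fakes near 1."""
    return -sum(math.log(s) for s in fake_scores) / len(fake_scores)

# Hypothetical critic scores, i.e. P(image is real) as judged by the adversary.
real = [0.9, 0.95]
early_fakes = [0.1, 0.2]   # crude fakes, easily spotted
later_fakes = [0.8, 0.9]   # better fakes that mostly fool the critic

print(generator_loss(early_fakes), generator_loss(later_fakes))
print(discriminator_loss(real, early_fakes), discriminator_loss(real, later_fakes))
```

Training alternates between lowering each loss in turn. As the fakes improve, the generator's loss falls while the critic's rises, which is the tug-of-war that pushes the output toward convincing renditions.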

Thus, deepfakes. In their current NSFW form (and this is how the trend is being reported), deepfakes use machine vision to swap the faces of celebrities onto adult entertainment. But that's just the beginning. Using a free desktop app called FakeApp, a derivative of the many mobile face-swap apps, a user can masterfully replace one speaker's face with that of another. And the results can look better than a multi-million dollar 3D rendering by the best Hollywood studios.

Samantha Cole at Motherboard, which broke this story, goes on to say -- "An incredibly easy-to-use application for DIY fake videos -- of sex and revenge porn, but also political speeches and whatever else you want -- that moves and improves at this pace could have society-changing impacts in the ways we consume media. The combination of powerful, open-source neural network research, our rapidly eroding ability to discern truth from fake news, and the way we spread news through social media has set us up for serious consequences."

Yeah, it's not great. Especially when such messages can be validated for peanuts on social networks using cheap bot armies. According to the New York Times, the going rate for 25,000 fairly active Twitter bots is $225. Want to know where the profile descriptions and pictures come from that make these bots look like real users? Stolen identities from humans.

Source: Top frame shows rendered Carrie Fisher in Star Wars, bottom one uses FakeApp


SOCIAL MEDIA: World's Largest Botnet Born from Minecraft

Source: Minecraft


This is a Lego piece for the future. On the Internet (we're there right now!), a distributed denial-of-service attack ("DDoS") is when a group of computers hits a server with so many requests that traffic spikes and the server crashes, taking down whatever it is hosting. So for example, if you don't like the NY Times, just overwhelm it with robots and bring the site offline. These robots, collectively a botnet, don't have to be particularly good computers -- one could, for example, hack into thousands of baby monitors over WiFi and then point them at a target.

In 2016, a tremendously powerful botnet attacked the internet infrastructure of the United States like never before. It used 600,000 Internet of Things devices. Where did this weapon come from? The answer is the video game Minecraft. In 2014, the virtual sandbox had 100 million registered players and a GDP of $400 million. Part of these economics is hosting Minecraft servers for local communities, and the corollary is that executing a DDoS attack against a competitor makes you a modern-day Minecraft mafia monopoly. The 21-year-old creators of this infamous botnet built it to snipe out other video game tycoons and make more money on their Minecraft servers. Later, they used the same botnet to defraud advertisers (selling hundreds of thousands of clicks and traffic that came from robots, not humans).

At some point, the creators open-sourced the software and it spread through the dark web. That means any black hat hacker can get the code, change it up, and try to create their own infection of IoT devices. We know that, for example, North Korea is pretty good at cyber attacks and is now hacking crypto currency infrastructure. The links between 21-year-old computer savants, video games, Internet money, and international geopolitical power struggles are here to stay. Which world is more powerful?