Songza taps Weather Channel data to suggest mood-enhancing music
A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes
The United States Senate Committee on Commerce, Science, and Transportation’s inquiry sought answers to four basic questions:
- What data about consumers does the data broker industry collect?
- How specific is this data?
- How does the data broker industry obtain consumer data?
- Who buys this data and how is it used?
Based on review of the company responses and other publicly available information, this Committee Majority staff report finds:
- Data brokers collect a huge volume of detailed information on hundreds of millions of consumers.
- Data brokers sell products that identify financially vulnerable consumers.
- Data broker products provide information about consumer offline behavior to tailor online outreach by marketers.
- Data brokers operate behind a veil of secrecy.
My discussion with Stephen Quinn on CBC Radio Vancouver’s Early Edition regarding efforts by the California Department of Motor Vehicles to regulate self-driving vehicles by 2015.
California Preparing for Self-Driving Cars by 2015
Self-driving cars sound like fantasy to many, but regulators are laying the groundwork for the technology to hit the roads next year.
Predicting crime using Twitter and kernel density estimation
Research by Matthew S. Gerber:
Abstract
Twitter is used extensively in the United States as well as globally, creating many opportunities to augment decision support systems with Twitter-driven predictive analytics. Twitter is an ideal data source for decision support: its users, who number in the millions, publicly discuss events, emotions, and innumerable other topics; its content is authored and distributed in real time at no charge; and individual messages (also known as tweets) are often tagged with precise spatial and temporal coordinates. This article presents research investigating the use of spatiotemporally tagged tweets for crime prediction. We use Twitter-specific linguistic analysis and statistical topic modeling to automatically identify discussion topics across a major city in the United States. We then incorporate these topics into a crime prediction model and show that, for 19 of the 25 crime types we studied, the addition of Twitter data improves crime prediction performance versus a standard approach based on kernel density estimation. We identify a number of performance bottlenecks that could impact the use of Twitter in an actual decision support system. We also point out important areas of future work for this research, including deeper semantic analysis of message content, temporal modeling, and incorporation of auxiliary data sources. This research has implications specifically for criminal justice decision makers in charge of resource allocation for crime prevention. More generally, this research has implications for decision makers concerned with geographic spaces occupied by Twitter-using individuals.
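For readers unfamiliar with the baseline named in the abstract, here is a minimal sketch of kernel density estimation over past incident locations. It is an illustration only, not the paper's implementation: the function name `kde_density`, the Gaussian kernel, the fixed `bandwidth`, and the toy coordinates are all assumptions made for this example.

```python
import numpy as np

def kde_density(incidents, query_points, bandwidth):
    """Estimate incident density at each query point with a 2-D Gaussian kernel.

    incidents:    (n, 2) array-like of past incident coordinates (hypothetical data)
    query_points: (m, 2) array-like of locations to score
    bandwidth:    kernel width, in the same units as the coordinates (assumed fixed)
    """
    incidents = np.asarray(incidents, dtype=float)
    query_points = np.asarray(query_points, dtype=float)
    # Pairwise squared distances, shape (m, n).
    diff = query_points[:, None, :] - incidents[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Normalizing constant of an isotropic 2-D Gaussian.
    norm = 2.0 * np.pi * bandwidth ** 2
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm
    # Average the kernel contributions of all incidents.
    return kernel.mean(axis=1)

# Toy example: three incidents clustered near the origin, one far away.
past = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
cells = [[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]]
scores = kde_density(past, cells, bandwidth=0.5)
ranked = np.argsort(scores)[::-1]  # cell indices from highest to lowest density
```

In this sketch, cells near many past incidents receive higher scores, which is the sense in which a density surface can be read as a hot-spot map. The abstract describes adding Twitter-derived topics on top of such a baseline; that step is not shown here.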
LibraryBox is an open source, portable digital file distribution tool based on inexpensive hardware that enables delivery of educational, healthcare, and other vital information to individuals off the grid.
Discussing the NSA TURBINE initiative with Rick Cluff on CBC Radio Vancouver’s Early Edition
How the NSA Plans to Infect ‘Millions’ of Computers with Malware
Top-secret documents reveal that the National Security Agency is dramatically expanding its ability to covertly hack into computers on a mass scale by using automated systems that reduce the level of human oversight in the process.
The classified files – provided previously by NSA whistleblower Edward Snowden – contain new details about groundbreaking surveillance technology the agency has developed to infect potentially millions of computers worldwide with malware “implants.” The clandestine initiative enables the NSA to break into targeted computers and to siphon out data from foreign Internet and phone networks.
The covert infrastructure that supports the hacking efforts operates from the agency’s headquarters in Fort Meade, Maryland, and from eavesdropping bases in the United Kingdom and Japan. GCHQ, the British intelligence agency, appears to have played an integral role in helping to develop the implants tactic.
In some cases the NSA has masqueraded as a fake Facebook server, using the social media site as a launching pad to infect a target’s computer and exfiltrate files from a hard drive. In others, it has sent out spam emails laced with the malware, which can be tailored to covertly record audio from a computer’s microphone and take snapshots with its webcam. The hacking systems have also enabled the NSA to launch cyberattacks by corrupting and disrupting file downloads or denying access to websites.
The implants being deployed were once reserved for a few hundred hard-to-reach targets, whose communications could not be monitored through traditional wiretaps. But the documents analyzed by The Intercept show how the NSA has aggressively accelerated its hacking initiatives in the past decade by computerizing some processes previously handled by humans. The automated system – codenamed TURBINE – is designed to “allow the current implant network to scale to large size (millions of implants) by creating a system that does automated control implants by groups instead of individually.”
In a top-secret presentation, dated August 2009, the NSA describes a pre-programmed part of the covert infrastructure called the “Expert System,” which is designed to operate “like the brain.” The system manages the applications and functions of the implants and “decides” what tools they need to best extract data from infected machines.