THE 50 BEST FREE DATASETS FOR MACHINE LEARNING
What are some open datasets for machine learning? We at Gengo decided to create the ultimate cheat sheet for high quality datasets. These range from the vast (looking at you, Kaggle) to the highly specific (data for self-driving cars). First, a couple of pointers to keep in mind when searching for datasets. According to Dataquest: A dataset shouldn’t be messy, because you don’t want to spend a lot of time cleaning data. A dataset shouldn’t have too many rows or columns, so it’s easy to work with.
LIQUID-AIR ENERGY STORAGE: THE LATEST NEW “BATTERY” ON THE UK GRID
A first-of-its-kind energy-storage system has been added to the grid in the UK. The 5MW/15MWh system stores energy in an unusual way: it uses excess electricity to cool ambient air down to -196°C (-320°F), where the gases in the air become liquid. That liquid is stored in an insulated, low-pressure container. When there’s a need for more electricity on the grid, the liquid is pumped back to high pressure, where it becomes gaseous again and is warmed up via a heat exchanger. The hot gas can then be used to drive a turbine and produce electricity. This new LAES system is being built by a company called Highview Power in Bury, near Manchester, UK, and it’s connected to the Pilsworth Landfill gas site, a power plant that burns methane from the landfill to create electricity.
THE LIFESPAN OF A LIE – THE STANFORD PRISON EXPERIMENT
It was late in the evening of August 16th, 1971, and twenty-two-year-old Douglas Korpi, a slim, short-statured Berkeley graduate with a mop of pale, shaggy hair, was locked in a dark closet in the basement of the Stanford psychology department, naked beneath a thin white smock bearing the number 8612, screaming his head off. It was a defining moment in what has become perhaps the best-known psychology study of all time. Whether you learned about Philip Zimbardo’s famous “Stanford Prison Experiment” in an introductory psych class or just absorbed it from the cultural ether, you’ve probably heard the basic story.
IMPROVING LANGUAGE UNDERSTANDING WITH UNSUPERVISED LEARNING
We’ve obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we’re also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training. These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well; this is an idea that many have explored in the past, and we hope our result motivates further research into applying this idea on larger and more diverse datasets.
ATTACKS AGAINST MACHINE LEARNING – AN OVERVIEW
At a high level, attacks against classifiers can be broken down into three types: Adversarial inputs, which are specially crafted inputs that have been developed with the aim of being reliably misclassified in order to evade detection. Adversarial inputs include malicious documents designed to evade antivirus, and emails attempting to evade spam filters. Data poisoning attacks, which involve feeding adversarial training data to the classifier.
WHY DO NEURAL NETWORKS GENERALIZE SO POORLY?
Deep convolutional network architectures are often assumed to guarantee generalization for small image translations and deformations. In this paper we show that modern CNNs (VGG16, ResNet50, and InceptionResNetV2) can drastically change their output when an image is translated in the image plane by a few pixels, and that this failure of generalization also happens with other realistic small image transformations. Furthermore, the deeper the network the more we see these failures to generalize.
INVENTOR SAYS GOOGLE IS PATENTING WORK HE PUT IN THE PUBLIC DOMAIN
When Jarek Duda invented an important new compression technique called asymmetric numeral systems (ANS) a few years ago, he wanted to make sure it would be available for anyone to use. So instead of seeking patents on the technique, he dedicated it to the public domain. Since 2014, Facebook, Apple, and Google have all created software based on Duda’s breakthrough.
A NEW WAY TO FIND ALIEN CIVILIZATIONS: LOOK FOR THEIR SATELLITES
If an extraterrestrial civilization were to turn its scientific instruments toward Earth, what would it see? If its radio telescopes were on par with the kind we use today, it might notice large amounts of carbon dioxide in our atmosphere, a biomarker indicative of an advanced industrial civilization. If its radio telescopes were far more powerful than our own, it might be able to detect the faint electromagnetic radiation given off by our television broadcasts or radar defense networks scanning for launched nuclear warheads.
OUR CANCER PREVENTING GENES REVEALED
In our bodies, we all have genes working hard to prevent cancer. If they don’t do their job properly, rogue cells can mutate and develop into the life-threatening disease. The malfunction of one so-called “super tumour suppressor gene” known as p53 causes at least half of all cancers. When it works, p53 regulates how a cell reacts to various stresses and can instruct a rogue cell to die or stop multiplying. Researchers have known about the significance of p53 in protecting us from cancer for about 30 years, but until now no one has explained how it prevents cancer development. In a world-first, Melbourne scientists have found that a special group of genes that function in the body’s normal DNA repair process are critical to p53’s effectiveness in preventing the development of cancer.
NEARLY 80% OF JAPAN’S AIRBNBS REMOVED IN RESPONSE TO NEW HOME-SHARE LAW
And then there were 13,800. That’s how many Airbnbs are left operating in all of Japan, after the home-sharing site removed at least 62,000 homes, apartments, and rooms this weekend from its inventory, reports Japan’s Nikkei newspaper. The move is in response to the country’s new home-share (or minpaku) law that comes into effect June 15, requiring Airbnbs be registered with the government and limited to a certain number of rented days.