
Massively Speed-Up your Learning Algorithm, with Stochastic Thinning


You have to see it to believe it! Imagine a technique where you randomly delete as many as 80% of the observations in your training set without decreasing predictive power (in many cases actually improving it), while reducing computing time by an order of magnitude. In its simplest version, that is what stochastic thinning does. Here, performance is measured outside the training set, on the validation set, also called the test data. I illustrate the method on a real-life dataset, in the context of regression and neural networks. In the latter case, it speeds up the training stage by a noticeable factor. The thinning process applies to the training set and may involve multiple tiny random subsets called fractional training sets, representing less than 20% of the training data when combined. It can also be used for data compression, or to measure the strength of a machine learning algorithm.

Figure: Predicted vs. observed (diagonal is perfect fit), after calibration (right plot)
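To make the core idea concrete, here is a minimal sketch of the thinning step in Python. It uses scikit-learn and a synthetic regression dataset as stand-ins for the real-life data and the neural network discussed in the article, and the 20% keep rate is only an illustration; it is not the author's implementation.

```python
# Minimal sketch of stochastic thinning (illustrative, not the author's code).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)

# Synthetic data standing in for the real-life dataset used in the article
X, y = make_regression(n_samples=5000, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Baseline: train on the full training set
full_model = LinearRegression().fit(X_train, y_train)
full_mse = mean_squared_error(y_test, full_model.predict(X_test))

# Stochastic thinning: keep only a small random fraction (here 20%) of the training set
keep_fraction = 0.2
n_keep = int(keep_fraction * len(X_train))
kept = rng.choice(len(X_train), size=n_keep, replace=False)  # the "leading" observations
thin_model = LinearRegression().fit(X_train[kept], y_train[kept])
thin_mse = mean_squared_error(y_test, thin_model.predict(X_test))

print(f"Validation MSE, full training set:    {full_mse:.2f}")
print(f"Validation MSE, 20% thinned training: {thin_mse:.2f}")
```

In practice the kept fraction can itself be split into several fractional training sets, and performance is always compared on held-out data, as described above.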

I also show the potential limitations of the new technique, and introduce the concepts of leading or influential observations (those kept for learning purposes) and followers (observations dropped from the training set). The term "influential observations" should not be confused with its usage in statistics, although in both cases it leads to explainable AI. The neural network used in this article offers replicable results by controlling all the sources of randomness, a property rarely satisfied in other implementations.

Figure: Neural nets, stochastic convergence: different seeds lead to different local optima
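The replicability mentioned above comes down to fixing every source of randomness before training. The sketch below shows the generic pattern in Python; the seed value and the framework calls in the comments are assumptions for illustration, not the author's exact code.

```python
# Generic sketch: fixing all sources of randomness so that two runs of a
# neural network produce identical results (illustrative, not the article's code).
import random
import numpy as np

SEED = 105  # any fixed value; changing it typically changes the local optimum reached

random.seed(SEED)     # Python's built-in RNG (e.g., shuffling, subset selection)
np.random.seed(SEED)  # NumPy RNG (weight initialization, sampling the fractional
                      # training sets, mini-batch ordering)

# If a deep learning framework is used, its own RNG must be seeded too, e.g.
#   tf.random.set_seed(SEED)   # TensorFlow
#   torch.manual_seed(SEED)    # PyTorch
```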

If you are new to neural networks and deep learning, or manage a group of engineers developing or using such tools, the full technical article (13 pages, including 6 pages of Python code) will give you a quick overview of the issues and benefits surrounding these methods, and a solid high-level introduction to the subject, including how to identify and overcome, or even leverage, the problems faced.


Download the Article

The technical article, entitled Massively Speed-Up your Learning Algorithm, with Stochastic Thinning, is accessible in the "Free Books and Articles" section, here. It contains links to my GitHub files, so you can easily copy and paste the code. The text highlighted in orange in the PDF document consists of keywords that will be incorporated into the index when I aggregate all my related articles into books about machine learning, visualization, and Python. The text highlighted in blue corresponds to external clickable links, mostly references. Red is used for internal links, pointing to a section, bibliography entry, equation, and so on.

To not miss future articles, sign up for our newsletter, here.

About the Author

Vincent Granville is a pioneering data scientist and machine learning expert, co-founder of Data Science Central (acquired by TechTarget in 2020), founder of MLTechniques.com, former VC-funded executive, author, and patent owner. Vincent's past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. Vincent is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences (NISS).

Vincent has published in the Journal of Number Theory, the Journal of the Royal Statistical Society (Series B), and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is also the author of multiple books, including "Intuitive Machine Learning and Explainable AI", available here. He lives in Washington state, and enjoys doing research on spatial stochastic processes, chaotic dynamical systems, experimental math, and probabilistic number theory.
