Channel: Deep Learning – NextGen AI Technology

A New Type of Non-Standard High Performance DNN with Remarkable Stability

I explore deep neural networks (DNNs) starting from the foundations, introducing a new type of architecture, as different from machine learning as it is from traditional AI. The original adaptive loss function, introduced here for the first time, leads to spectacular performance improvements via a mechanism called equalization.

To accurately approximate any response, rather than connecting neurons with linear combinations and activations between layers, I use non-linear functions without activation, reducing the number of parameters and leading to explainability, easier fine-tuning, and faster training. The adaptive equalizer – a dynamical subsystem of its own – eliminates the linear part of the model, focusing on higher-order interactions to accelerate convergence.
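The paper's exact architecture is described in the PDF. As a rough, hypothetical illustration of the idea of connecting neurons with non-linear functions instead of a linear combination plus activation, here is a minimal NumPy sketch (the power link `w * |x|**p` and all names are mine, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def nonstandard_layer(x, weights, exponents):
    # x: (n_samples, n_in); weights, exponents: (n_in, n_out).
    # Each connection applies a non-linear power link w * |x|**p,
    # so no separate activation function is stacked on top.
    phi = np.abs(x)[:, :, None] ** exponents[None, :, :]   # (n_samples, n_in, n_out)
    return np.sum(weights[None, :, :] * phi, axis=1)       # (n_samples, n_out)

x = rng.normal(size=(5, 3))
W = rng.normal(size=(3, 2))
P = rng.uniform(0.5, 2.0, size=(3, 2))
out = nonstandard_layer(x, W, P)   # one forward pass, output shape (5, 2)
```

Because each connection is itself non-linear, the per-connection parameters (here `W` and `P`) have a direct, interpretable effect on the fitted curve, in line with the explainability claim.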

One example involves the Riemann zeta function: I exploit its well-known universality property to approximate any response. My system also handles singularities, useful for rare events and fraud detection. The loss function can even be nowhere differentiable, like a Brownian motion path. Many of the new discoveries are applicable to standard DNNs. Built from scratch, the Python code does not rely on any library other than NumPy; in particular, I do not use PyTorch, TensorFlow, or Keras.
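The details of the zeta-based approximation are in the paper. As a toy NumPy sketch of the general idea (my own simplification, not the paper's method): universality says that, in the critical strip 1/2 < Re(s) < 1, suitable vertical shifts of zeta approximate any reasonable non-vanishing function. The plain Dirichlet series diverges there, so this sketch stays in the convergent region Re(s) > 1 and simply grid-searches a vertical shift matching a toy target:

```python
import numpy as np

def zeta_trunc(s, n_terms=2000):
    # Truncated Dirichlet series sum n^(-s); converges for Re(s) > 1.
    n = np.arange(1, n_terms + 1, dtype=float)
    return np.sum(n[None, :] ** (-s[:, None]), axis=1)

sigma = 1.5                                   # toy region; universality proper needs 1/2 < sigma < 1
x = np.linspace(0.0, 1.0, 50)
target = 1.0 + 0.2 * np.sin(2 * np.pi * x)    # toy response to approximate

best = None
for t0 in np.linspace(0.0, 100.0, 201):       # grid search over vertical shifts
    s = sigma + 1j * (t0 + x)                 # sample zeta along a vertical segment
    vals = zeta_trunc(s, 500).real
    err = float(np.mean((vals - target) ** 2))
    if best is None or err < best[0]:
        best = (err, t0)                      # best shift found so far
```

The shift `t0` plays the role of a model parameter: richer models in this spirit would combine several shifted segments, but this already shows how a single special function can serve as a universal building block.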

Example of model fitting discussed in the paper

Table of contents

The free 16-page technical paper is organized as follows. A good chunk of it consists of illustrations and Python code. The index contains most terms associated with DNNs and may be a good starting point to find the topics most relevant to you. I introduce new concepts such as sub-layers, sub-epochs, the equalizer, chaotic descent, reparameterization, and ghost parameters. In some examples, I use swarm optimization or temperature with decay.

You don’t need to know the mathematical expression of the loss function to compute the partial derivatives. The material covered is very extensive, yet presented in a compact format. If you are a data scientist still unconvinced about the power of deep neural networks, reading this article will change your mind and explain everything you need to get started, without a steep learning curve.
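One standard way to get partial derivatives without a closed-form loss (the paper may use a different scheme) is central finite differences, which only requires evaluating the loss. A minimal sketch, with a toy quadratic loss whose exact gradient is 2p for checking:

```python
import numpy as np

def numerical_gradient(loss, params, eps=1e-6):
    # Central finite differences: no formula for the loss is needed,
    # only the ability to evaluate it at nearby points.
    grad = np.zeros_like(params)
    for k in range(params.size):
        step = np.zeros_like(params)
        step[k] = eps
        grad[k] = (loss(params + step) - loss(params - step)) / (2 * eps)
    return grad

p = np.array([1.0, -2.0, 0.5])
g = numerical_gradient(lambda q: np.sum(q ** 2), p)   # exact gradient is 2*p
```

This is why even a black-box or empirically defined loss can drive gradient descent; for a nowhere-differentiable loss such as a Brownian path, the difference quotient still produces a usable descent signal at the chosen resolution `eps`.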

Here are the main entries in the paper:

1. From standard to non-standard deep neural networks

. . . Architecture of a non-standard DNN
. . . Gradient descent algorithm
. . . Reparameterization and latent features
. . . Distillation, cross-validation, and noise injection
. . . Adaptive equalizer and other optimization techniques

2. Case studies

. . . The workhorse of non-standard DNNs
. . . A new type of universal approximation
. . . Dealing with singularities and extreme events
. . . Large language models

3. Model evaluation

. . . Model with multidimensional input data
. . . An interesting universal function
. . . Example with singularities

4. Addendum

. . . Python code
. . . Bibliography
. . . Index

Conclusion

This original non-standard DNN architecture turns black boxes into explainable AI with intuitive parameters. The focus is on speed (fewer parameters, faster convergence), robustness, and versatility with easy, rapid fine-tuning and many options. It does not use PyTorch or similar libraries: you have full control over the code, increasing security and customization. Tested on synthetic data batches for predictive analytics, high-dimensional curve fitting, and noise filtering in various settings, it incorporates many innovative features. Some of them, like the equalizer, have a dramatic impact on performance and can also be implemented in standard DNNs. The core of the code is simple and consists of fewer than 200 lines, relying on NumPy alone.

Another example of a model fitted in the paper (orbits)

Get your free copy

The PDF, with many illustrations, is available as paper 55, here. It also features the replicable Python code (with a link to GitHub), the data generated by the code, the theory, and various options, including for model evaluation. The blue links in the PDF are clickable once you download the document from GitHub and view it in a browser. Keywords highlighted in orange are index keywords.

To avoid missing future articles, subscribe to my AI newsletter, here

About the Author


Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also an author (Elsevier, Wiley), publisher, and successful entrepreneur with a multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at the University of Cambridge.

