Deep Learning is not the AI future

Everyone now is learning, or claiming to learn, Deep Learning (DL), the only field of Artificial Intelligence (AI) that has gone viral. Paid and free DL courses count hundreds of thousands of students of all ages. Too many startups and products are named “deep-something” purely as a buzzword: very few really use DL. Most people ignore that DL is perhaps 1% of the Machine Learning (ML) field, and that ML is perhaps 1% of the AI field. The remaining 99% is what is used in practice for most tasks. A “DL-only expert” is not a “whole AI expert”.

DL is not a synonym of AI! The AI tools most advertised by Google, Facebook etc. are mainly or only DL, so the general public thinks that all the new AI records are (and will be) set with DL alone. This is not true. Decision-tree ensembles like XGBoost are not making headlines, but they silently beat DL in many Kaggle competitions on tabular data. The media implied that AlphaGo is DL-only, but it is Monte Carlo tree search plus DL, evidence that pure DL was not enough to win. Many reinforcement learning tasks are solved with neuroevolution methods like NEAT, with no backpropagation at all. There is “deep misinformation” in AI.
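
To make the tabular data claim concrete, here is a rough sketch of the kind of comparison Kaggle competitors run, using scikit-learn’s gradient boosting as a stand-in for XGBoost and a small neural network on the DL side. Treat it as an illustration, not a benchmark: the dataset and settings are arbitrary and the numbers will vary.

```python
# Illustrative sketch (not a benchmark): a gradient-boosted tree ensemble
# versus a small neural network on a tabular dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tree ensemble: no feature scaling needed, few knobs to tune.
trees = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Small neural network: needs scaled inputs and more tuning to be competitive.
scaler = StandardScaler().fit(X_train)
net = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
net.fit(scaler.transform(X_train), y_train)

print("trees accuracy:", trees.score(X_test, y_test))
print("net accuracy:  ", net.score(scaler.transform(X_test), y_test))
```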

I am not saying that DL is not solving tasks: DL is impressive. Trees and other algorithms do not beat DL often, and for some tasks there is no substitute for DL, but why does no one expect non-DL systems to be (re)discovered in the future that beat DL? Perhaps they would also solve the legal nightmare of DL decisions, which, even when correct, cannot be explained when legally questioned. I would also like to read in the press about DL issues like “catastrophic forgetting”, the tendency to abruptly forget previously learned information upon learning new information, and about the daily fight against “overfitting”. About “intelligence”: DL will simply believe the training data it is given, without understanding what is true or false, real or imaginary, fair or unfair. Humans believe fake news too, but only up to a point, and even kids know that movies are fiction, not real. For more details, if you have the time, read my longer article: AI (Deep Learning) explained simply.
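
For readers who want to see “catastrophic forgetting” rather than take my word for it, here is a minimal sketch (using scikit-learn’s small MLP rather than a deep network, so treat it as an illustration under that assumption): a classifier trained on half the digit classes, then trained further only on the other half, loses most of its accuracy on the first half.

```python
# Illustrative sketch of catastrophic forgetting: further training on new
# classes erases most of what was learned on the old ones.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
task_a = y < 5   # digits 0-4
task_b = y >= 5  # digits 5-9

net = MLPClassifier(hidden_layer_sizes=(64,), random_state=0)

# Phase 1: learn task A (all possible classes must be declared up front).
for _ in range(50):
    net.partial_fit(X[task_a], y[task_a], classes=np.arange(10))
print("task A accuracy after phase 1:", net.score(X[task_a], y[task_a]))

# Phase 2: keep training, but only on task B data.
for _ in range(50):
    net.partial_fit(X[task_b], y[task_b])
print("task A accuracy after phase 2:", net.score(X[task_a], y[task_a]))
print("task B accuracy after phase 2:", net.score(X[task_b], y[task_b]))
```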

Everyone 20 years ago was learning HTML, the markup language for writing web pages by hand, considered enough at the time to become a dot-com billionaire. Like others, I learned each technology when it seemed useful: HTML, mobile apps, DL, and I invite everyone to keep learning new things throughout their lifetime. In fact, you do not simply learn one technology once in a lifetime! If you learn DL, you do not get a lifetime of AI know-how. The HTML of 1995 became outdated and insufficient: CSS, JavaScript and server-side languages took over. In the same way, DL will become outdated and insufficient too. Most popular mobile apps contain no HTML at all, so who knows whether future AI apps will contain DL or not?

Really, DL is 1980s tech, older than HTML: trained with more data, the 1970s “neural networks with hidden layers” gave better results, and were then renamed DL and hyped. In 1992 I briefly checked some neural network source code, together with other things like fractals and cellular automata. Like almost everyone else, I dismissed DL at the time as an academic math puzzle with no practical uses. Instead, I focused on learning what gave immediate results: 3D for video games, then the internet, and so on. But we were all wrong: DL can do amazing things with big data! I became fascinated in 2015 by Deep Dream, then by GANs etc. Still, DL is not the last, perfect AI science we can invent.

The ancient DL has already been studied extensively and updated across decades to solve more tasks more accurately, but no DL variant (convolutional networks, RNNs, RNNs with LSTM, GANs etc.) can explain its own decisions. While DL will surely solve more tasks and kill more jobs in the future, it is unlikely to solve them all, or to deliver surprise updates capable of arguing a legally valid defense of the fairness of its own decisions.

(philosophers Plato and Aristotle talking: DL is unable to understand them)

Future AI should explore other, new or old but overlooked ways, not just focus 100% on DL. One DL limit is that it considers true simply whatever it spots most frequently in the data, and false whatever is statistically rarer, or the opposite of what is most frequent. DL fairness comes not from DL itself, but from the humans selecting and preparing the DL data. A DL model can read texts and translate between texts. But if a DL model is trained on 100 books, 40 telling how hate, war, death and destruction are bad, and 60 telling that Hitler’s Nazi ideas were correct, the DL will end up 100% Nazi. DL will never figure out on its own that world domination or killing Jews, gays and disabled people is bad, if that is the most popular opinion in the training data. No wonder that DL will also not explain its own decisions, except with: “I’ve read often that this is right, so it should be right.” Even small kids understand on their own who the bad guys in a movie are, but not DL, unless humans teach it explicitly first. DL-specific things like gradient descent with backpropagation are cool, as is custom DL hardware, but that is mostly statistics and geometry, so it will probably not be in the AI of 2037.

For many tasks, Deep Learning AI is or will become illegal, not compliant. Whoever collects data about citizens of the 28 European countries must follow the General Data Protection Regulation (GDPR) by May 25, 2018. This is the date when DL will be abandoned for several applications in the EU, forcing AI startups to quickly replace DL with whatever else, or risk being fined. Fines for noncompliance are 4% of global revenue, including USA revenue. GDPR, regarding automated decision-making, requires a right to an explanation, and measures to prevent discriminatory effects based on race, opinions, health, etc. Laws similar to GDPR exist or are planned worldwide; it is only a matter of time. The US Fair Credit Reporting Act requires disclosing all the factors that adversely affected a consumer’s credit score, with a maximum of 4 factors allowed. DL factors are normally thousands or millions, not just 4: how do you simplify them into 4? AI, like bitcoin ICOs, started out ignoring regulation, but laws and fines always come.
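
To show what a law like the Fair Credit Reporting Act expects, here is a hedged sketch of how an interpretable scoring model can list the few factors that most hurt an applicant. The feature names and data are invented for the example; the point is that this short “reason code” list is exactly what a deep network with millions of entangled weights cannot produce directly.

```python
# Illustrative sketch: an interpretable credit-style model can report the
# few factors that most hurt an applicant's score ("reason codes").
# Feature names and data are invented for this example.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

feature_names = ["income", "debt_ratio", "late_payments", "account_age", "inquiries"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
# Synthetic "approved" label: income and account age help, debt and late payments hurt.
y = (X[:, 0] - X[:, 1] - X[:, 2] + X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)

scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

applicant = scaler.transform(X[:1])            # one applicant from the data
contributions = model.coef_[0] * applicant[0]  # per-feature push toward approval
worst = np.argsort(contributions)[:4]          # the 4 factors pushing hardest toward rejection
print("Top adverse factors:", [feature_names[i] for i in worst])
```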

DL systems taking decisions more consequential than telling whether an image is a cat, or where to add bunny ears to selfies, will be replaced with non-DL systems. The AI will have to be accountable, so different from DL, with outcomes you can explain to average judges and users in simple, legally valid words. DL complexity, which looks like “magic” to judges and users, is a legal risk, not a cool feature. DL will advise or alert humans, for example detecting diseases in medical images to be verified by a medical doctor, but this is only partial automation. Users rejected by the AI (denied a loan, job, etc.) will ask for an explanation. Using human staff to give explanations on demand seems compliant, but the staff’s answers will be easy to challenge, being only made-up excuses: not even the staff will know the real causes of rejection. Other AI systems that can provide explanations in human language to users and support staff will ultimately be the only ones used, fully automating all the sensitive fields.

GDPR includes the “right to an explanation”, for example of why a job or a loan was denied. DL gives results with no natural (legal) language explanations. Pages of DL variables are available, but they are not acceptable to judges or users, since not even the best mathematicians or other algorithms can figure out a DL model and simplify it into words. No one knows how to modify DL to give simple human-like explanations, so DL cannot be made compliant! This issue also affects several other AI and Machine Learning algorithms, but not all of them, and not as much as DL. Some more explainable systems will be used in regulated decisions in place of both DL and humans. Decision trees also stop being explainable once boosted or put into ensembles, but there will be other, new or rediscovered AIs.
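
As a contrast, here is a small sketch of the kind of readable explanation a single shallow decision tree can give: its entire decision logic can be printed as a few if/else rules (using scikit-learn on a standard dataset purely as an illustration). Nothing comparable exists for a trained deep network.

```python
# Illustrative sketch: a single shallow decision tree can be printed as
# human-readable if/else rules; a trained deep network cannot.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Every prediction can be traced through these few readable conditions.
print(export_text(tree, feature_names=list(data.feature_names)))
```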

DL will be of secondary importance: not abandoned like magnetic tapes or cathode-ray TVs, but kept just to give clues and opinions to humans. Only the humans will officially take the decisions, also making up some legally enforceable written explanation, as if they understood what was going on. But if DL is just a tool for humans and cannot take decisions directly, it is not saving costs or killing jobs. Even where humans must take the final decision, tool AIs that explain their advice will be more useful than tool AIs that give answers without causes or reasons. Then other explainable automatic AIs, once invented, will be cheaper and faster, and will obviously replace both DL and humans. Since DL was invented in the 1960s-1980s and then rediscovered in the 2010s, the basis of better future AIs is probably already described by some researchers somewhere, but being not DL, no one will care to check and develop it for years. Until it is rediscovered and hyped.

Explainability was already a big issue before any specific laws and before DL. In antitrust cases, companies like Google are asked why one product rather than others is shown at the top of search results. This predates DL too: many other algorithms also mix data in crazy ways to get results, so no human can easily reconstruct the reasons behind a decision. Judges are told that the engineers do not know exactly, and pages of linear algebra are given as evidence. This cannot end well: billions of dollars in fines have been ruled in multiple cases, with warnings to change the systems, even before a specific law existed. Class-action lawsuits by users automatically denied jobs, loans, refunds etc., against the automated decision units of stores, banks, insurers, etc., will become the norm, and being unable to explain will mean “no defense”, being fined, and a public relations disaster for the brand.

GDPR, regarding automated decision-making, also requires preventing discriminatory effects based on race, opinions, health status, etc. But DL models trained on user-generated data like social media and news (rather than ground-truth data like medical or financial records) always contain evil biases implicitly. As said before, DL can read a lot of texts and data and mimic their contents, but it will not critically understand them. DL will just believe whatever it spots more often, underline the patterns and trends found in the data, and so amplify the human biases. DL decisions always end up more discriminatory, racist and sexist than the average sample in the training data. This issue happens in all ML algorithms trained on user-generated data, but DL model bias is one of the hardest to test, detect, control and tune. It is so hard to fix that, rather than try to patch it, it has already caused the abrupt cancellation of many DL experiments, from chatbots that went Nazi and hateful, to apps that whiten black faces in “beauty” filters.

(DL is a neural network, you can’t edit separately each answer’s outcome)

You cannot fix a discriminatory, racist or sexist DL model by trying to balance it with patches after training. DL is a neural network, and unlike some other AI methods, you cannot edit specific answers with local surgery: you must retrain everything with different, 100% balanced and fair data, which cannot be found in a pure state on the wild internet. You would have to create fair data yourself by hiring dedicated human staff. But the cost of creating vast amounts of de-biased data curated by human experts, just to train an AI model, makes it not worth replacing humans with AI in the first place! Further, even if you had trained a DL model that really is fair, you have no evidence to convince a judge or a user of the fairness of any decision, since the DL will give no explanations.

DL will remain the best choice for consumer apps or games that pose no legal risk. People losing games against bots are unlikely to convince a judge to fine the AI company because it cannot explain how the AI won. People unhappy with how FaceApp edited their selfie into an older, younger or opposite-sex version are unlikely to convince a judge to fine FaceApp because it cannot explain how the AI decided the new looks (though a “race change” filter was removed after massive protests, no judge needed). The market for safe DL decisions is very limited: users, companies and governments can convince judges to fine in every case where the decision outcome can make a financial difference or be discriminatory; so, everything that is more than a game or a joke in good taste. Existing non-DL methods can replace DL where needed, and new methods will be (re)discovered, so AI progress will continue nicely. Especially if everyone studies (and invests in) all the old and new algorithms of the whole AI and Machine Learning sciences, not only DL. A “DL-only expert” is not a “whole AI lifetime expert”.

Besides DL being “illegal” to use for the most useful tasks it can solve, it is also unable to solve several tasks that require abstract reasoning to figure out what is fair and unfair in the data it sees, and to explain the logic of its own decisions. Even for tasks not requiring explanation where DL seems the best system, like image recognition, DL is not as safe as human eyes. You can fool DL with “adversarial examples”: photos of something, like a cat, with imperceptible perturbations added can fool DL into seeing something else, like a dog. All humans will still see a cat, but the DL will see a dog or whatever the hacker secretly embedded. This can be exploited on street signs to hack current self-driving cars. New AI systems that resist this hack will replace DL.
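
For the technically curious, the classic recipe for such perturbations is the fast gradient sign method (FGSM). Below is a minimal sketch in PyTorch; the trained classifier, the cat image and its label are assumed to exist and are only placeholders.

```python
# Illustrative sketch of the fast gradient sign method (FGSM).
# `model` is assumed to be any trained, differentiable image classifier.
import torch

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Return an adversarially perturbed copy of `image` (a 1xCxHxW tensor)."""
    image = image.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(image), true_label)
    loss.backward()
    # Nudge every pixel a tiny step in the direction that increases the loss.
    return (image + epsilon * image.grad.sign()).detach()

# Hypothetical usage, assuming `model`, `cat_image` and `cat_label` exist:
# adversarial = fgsm_perturb(model, cat_image, cat_label)
# model(adversarial).argmax()  # may now say "dog", while humans still see a cat
```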

The author of Keras, the most popular DL library, said in his post “The limitations of deep learning”: “the only real success of DL has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data.” In layman’s terms, I would say this means: DL can be trained to recognize cat photos without understanding what a cat is, and to be racist without knowing it is racist. DL can recognize cats or be racist, which is impressive and at times useful, but DL cannot explain why a photo shows a cat, or whether a decision was racist.
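
Here is a toy sketch of what that quote means in practice: a tiny network fits human-annotated points and learns the X-to-Y mapping, while “understanding” nothing about why the labels were assigned. The data and model are invented purely for illustration.

```python
# Minimal sketch of "mapping space X to space Y given annotated data":
# the network fits the labeled points, with no notion of what the labels mean.
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 2)               # points in space X
y = (X[:, 0] + X[:, 1] > 1).astype(int)   # human-provided annotations (space Y)

model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, verbose=0)

# The learned transform predicts labels for new points, but it knows nothing
# about why those labels were assigned in the first place.
print(model.predict(np.array([[0.9, 0.8]])))
```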

In “The future of deep learning” the Keras author describes his vision of a new system where DL sits only in “geometric modules”, which should interact with not-yet-existing “algorithmic modules” and “meta-learners”. This would increase the number of solvable tasks, but would probably still fail to explain the decisions, because of the DL modules. It is like when we cannot explain, in words, certain feelings or images computed in our brain. Humans explain everything, but mostly with made-up, oversimplified excuses that everyone seems to accept as accurate. Machines, instead, are unfairly asked to be truly accurate. Other experts are drafting new AI systems that do not include DL at all, but they lack funds: everyone invests only in DL now, and the DL mania will continue for a while. No one knows what the next big AI thing will be, but it is unlikely to be DL 2.0.

Only those who sell DL software and hardware, despite the conflict of interest, are interviewed and listened to in the AI debates. Have you noticed any legitimate “natural intelligence” experts, like psychologists and philosophers, supporting DL?

If you have neither AI urgency nor time to study, wait for the next AI system to be ready and study it directly, skipping DL 1.0. Otherwise, if you have AI urgency and/or time to study, be sure to cover the whole of AI and the many Machine Learning fields, not DL only.

Thanks for reading. If you liked it, please click the like and share buttons. But before commenting, especially if you just paid for an expensive DL course or you disagree, please first read my longer article in full: AI (Deep Learning) explained simply. If you are interested in the sci-fi AI and robot safety debate, read: Will AI kill us all after taking our jobs?

Originally published on Linkedin.

By Fabio Ciucci, Founder, CEO at Anfy srl
