


Some reflections on artificial intelligence in medicine

Rinckside 2018; 29,5: 11-13.


The point of artificial intelligence is that it “learns” on its own and becomes an – or even the one and only – expert. However, artificial intelligence is not as simple an approach as it is sold today, and artificial intelligence and expert systems are not recent ideas – they have come and gone since the 1940s, or even since the 18th century with Maelzel’s chess-playing automaton, The Turk.

Reliance on advanced scientific theories and modes of reasoning, and the utilization of scientific methodology – specifically observation – can easily lead to tunnel vision or wrong conclusions, as is known from the 19th-century concept of "ratiocination".

In 1843, the English philosopher John Stuart Mill distinguished induction from ratiocination in his book "A System of Logic, Ratiocinative and Inductive" and developed principles of inductive reasoning:

"Reasoning, in the extended sense in which I use the term, and in which it is synonymous with Inference, is popularly said to be of two kinds: reasoning from particulars to generals, and reasoning from generals to particulars; the former being called Induction, the latter Ratiocination or Syllogism … The meaning intended by these expressions is, that Induction is inferring a proposition from propositions less general than itself, and Ratiocination is inferring a proposition from propositions equally or more general [1]."

Two years earlier, Edgar Allan Poe had described the same approach in his short story "The Murders in the Rue Morgue":

“But it is in matters beyond the limits of mere rule that the skill of the analyst is evinced. He makes, in silence, a host of observations and inferences. So, perhaps, do his companions; and the difference in the extent of the information obtained, lies not so much in the validity of the inference as in the quality of the observation. The necessary knowledge is that of what to observe [2].”


A hundred years later

A little more than a hundred years later, in 1958, the New York Times reported in an article that ...

“The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence … The Navy said the Perceptron would be the first non-living mechanism 'capable of receiving, recognizing and identifying its surrounding without any human training or control.' The 'brain' is designed to remember images and information it has perceived itself … It is expected to be finished in about a year [3].”

It didn't work due to “technical limitations”.
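What Rosenblatt's Perceptron actually did was far more modest than the Navy's announcement: it adjusted a handful of weights until a weighted sum separated two classes. As a minimal, purely illustrative sketch in plain Python (the training data – logical OR – are invented for the example):

```python
# A minimal sketch of Rosenblatt's perceptron learning rule in plain Python.
# This is an illustration only, not the Navy's hardware Perceptron; the
# training data (logical OR) are chosen because they are linearly separable.

def train_perceptron(samples, epochs=20, lr=0.1):
    """Nudge weights toward each target by the classic perceptron rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out          # -1, 0, or +1
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(OR_DATA)

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

print([predict(x1, x2) for (x1, x2), _ in OR_DATA])  # [0, 1, 1, 1]
```

A few multiplications and comparisons – walking, talking, and consciousness were never within reach of this rule.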

The most famous first medical application of AI was MYCIN, a program developed in the 1970s at Stanford University in California [4].

MYCIN, as Bruce G. Buchanan and Edward H. Shortliffe described it in a recapitulation of the project, was software that embodied some intelligence and provided data on the extent to which intelligent behavior could be programmed. The intention was to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics at the right dosage for a patient. As with other AI programs, its development was slow and not always in a forward direction.

It worked, but it also didn't, and it was never used in practice – not only because computing power was insufficient, but rather because of an inherent problem of AI: the knowledge of a human expert cannot be translated into digitizable rule bases. Additionally, AI is not immune to human prejudice, which always exists – wittingly or unwittingly. Such preconceptions cannot be filtered out because AI lacks a critical mind. Buchanan described this problem in a conclusion:

“There are many 'soft' or ill-structured domains, including medical diagnosis, in which formal algorithmic methods do not exist. In diagnostic tasks there are several sources of uncertainty besides the heuristic rules themselves. There are so-called clinical algorithms in medicine, but they do not carry the guarantees of correctness that characterize mathematical or computational algorithms. They are decision flow charts in which heuristics have been built into a branching logic [5].”
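Buchanan's point can be made concrete. A MYCIN-style system was a collection of heuristic if-then rules, each with a hand-assigned certainty factor; evidence from several rules was merged by a fixed formula rather than by any reasoning. The sketch below uses MYCIN's combining function for positive evidence, but the two rules and their findings are invented for illustration:

```python
# A toy sketch of a MYCIN-style rule base with certainty factors (CFs).
# The combining function is the one MYCIN used for positive evidence;
# the rules, findings, and CF values below are invented for illustration.

def combine_cf(cf_old, cf_new):
    """Merge two positive certainty factors, as MYCIN did."""
    return cf_old + cf_new * (1 - cf_old)

RULES = [
    # ({required findings}, conclusion, certainty factor of the rule)
    ({"gram_negative", "rod_shaped"}, "enterobacteriaceae", 0.6),
    ({"gram_negative", "hospital_acquired"}, "enterobacteriaceae", 0.4),
]

def diagnose(findings):
    """Fire every rule whose premises are all observed; accumulate belief."""
    belief = {}
    for premises, conclusion, cf in RULES:
        if premises <= findings:  # set inclusion: all premises present
            belief[conclusion] = combine_cf(belief.get(conclusion, 0.0), cf)
    return belief

print(diagnose({"gram_negative", "rod_shaped", "hospital_acquired"}))
# both rules fire: 0.6 + 0.4 * (1 - 0.6) = 0.76
```

With all three findings present, both rules fire and the belief becomes 0.76 – a number produced by arithmetic on hand-picked weights, not by clinical judgment.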


The flaws

AI is mindless; it lacks consciousness and curiosity. These are fundamental flaws distinguishing it from real intelligence. Although meant to be a "science" by its fathers, AI is not a real science; it is closer to computer gambling and tinkering than to creating a fundamentally reliable support system for highly specific tasks.


Artificial intelligence is mindless. This is a fundamental flaw.



Neural AI networks are good at – crudely – classifying pictures, not only in radiology; meanwhile they cover the entire spectrum of medical imaging, including, for example, nuclear medicine, dermatology, and microscopy. Such systems have been known for years as CAD, computer-assisted diagnosis.

A typical example is a recent paper by a dermatology group at Heidelberg University, which used deep learning neural networks for the detection of melanomas. The British newspaper The Guardian summarized the press release from Heidelberg with the headline: “Computer learns to detect skin cancer more accurately than doctors”. The authors of the paper concluded: “Most dermatologists were outperformed by the neural networks. Irrespective of any physicians' experience, they may benefit from assistance by a neural networks’ image classification [6].”

In an editorial accompanying the dermatology article in Annals of Oncology, the commentators were more careful and raised some additional concrete questions:

“This is the catch; for challenging lesions where machine-assisted diagnosis would be most useful, the reliability is lowest.” They also point out: “Whilst dermatology is a visual specialty, it is also a tactile one. Subtle melanomas may become more apparent with touch as they feel firm or look shiny when stretched [7].”


Legal responsibility

Another main problem of AI is that the overwhelming majority of its users do not understand and cannot follow its black-box judgments and its reasoning in reaching certain choices. Interestingly, there are also a number of reports that developers of AI software did not understand why their algorithms reached certain results and decisions; the algorithms are impenetrable.

Thus, the well-meant "right to an explanation" of decisions made by an AI expert system concerning a person, passed into European law in the General Data Protection Regulation (GDPR), can hardly be fulfilled: if even some creators are unable to find inherent flaws in their own source code, they will not be able to explain the decisions to their "victims". I wonder what the legal consequences will be.

It is a principle of information technology that convenience and security are generally mutually exclusive. Once again the question arises whether the limits of what is ethically permissible are being shifted because something is technically possible. However, financial and career interests often override established values of the medical profession. Moreover, there are other interests forcing the introduction of AI: groups and institutions owing no allegiance and acknowledging no responsibility to patients, doctors, or people in general.

At this point we are faced with another question: who is really responsible and accountable for the quality of the results? The radiologist, the hospital's administrator, the software engineer who wrote the source code, the company that sold the software? The companies will reject any responsibility, stating that the AI software was delivered free of defects. Even if the customer gets access to the source code, nobody will ever be able to prove that the algorithm has a flaw. You bought a pig in a poke – and are stuck with it.


Understanding AI

There are other problems. In a recent overview of AI in AME the author stated:

“The accuracy of these algorithms is dependent on two important factors: the type of algorithms used and also the acquisition parameters applied by the modality. If the algorithm is to be accurate, it is really important the acquisition parameters are standardized prior to application of the algorithm [8].”

This is a major dilemma of AI and deep learning. In many instances, the calculated parameter data are incorrect, as we have seen with “MR fingerprinting” and related methodologies. These values cannot be reliably reproduced; thus they should not be used in a neural network [9]. Deep learning can lead to the description of complex relationships that might only exist because they are based on artifacts or wrong presumptions.

Simple tasks are easily solved by AI; multi-layered tasks are far more complicated to work out. During the last ten years, neural networks have shown promise. Still, AI does not mean an understanding, thinking, and comprehending computer, but programmed, ordered if-then decisions. At the present stage, artificial intelligence is more real incompetence that can easily run wild and out of control than helpful support in diagnosis.
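What "programmed if-then ordered decisions" look like in practice can be shown with a toy version of the decision flow charts Buchanan mentions. Everything here – thresholds, labels, the whole scheme – is invented for illustration; the point is that only fixed comparisons are involved, not comprehension:

```python
# A toy "clinical decision flow chart" written as plain if-then branches.
# The thresholds and labels are invented for illustration: the program
# compares numbers; it does not understand fever, pulse, or patients.

def triage(temp_c, heart_rate):
    if temp_c >= 39.0:            # high-fever branch
        if heart_rate >= 120:
            return "urgent"
        return "soon"
    if heart_rate >= 140:         # marked tachycardia without fever
        return "urgent"
    return "routine"

print(triage(39.5, 130))  # urgent
print(triage(37.0, 80))   # routine
```

A patient just outside a threshold falls into a different branch entirely – the rigidity the editorial writers warned about in the case of "challenging lesions".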

AI is also claimed to be objective. But there is no objectivity or neutrality in AI; its decisions are not necessarily knowledge-based, but biased. Moreover, quantifying algorithms freeze a state of the past because they use old data.

Artificial imaging programs are useless if applied randomly, without a well-defined and sharply delineated aim. Many approaches to explain results of AI are based on hypotheses which are still to be proved, and much research in this field is empirical and heuristic.

Still, AI will come on the market; its business value is enormous. By the way: if AI should work, even limping and stuttering, other disciplines will take over radiology in those fields which they find attractive – because with fast AI results it is easy and makes money. Anyone can use it, from technologists to physicians in clinical disciplines. Radiologists are not needed for this.



References

1. Mill JS. Of inference, or reasoning, in general. In: A system of logic, ratiocinative and inductive, being a connected view of the principles of evidence, and the methods of scientific investigation. Volume I. London: John W. Parker. 1843. p. 223.
2. Poe EA. The murders in the Rue Morgue. Philadelphia, PA, USA: Graham's Magazine. 1841.
3. UPI. New Navy device learns by doing – Psychologist shows embryo of computer designed to read and grow wiser. New York Times. 7 July 1958. p. 25.
4. Buchanan BG. A (very) brief history of artificial intelligence. AI Magazine 2005; 26,4: 53-60.
5. Buchanan BG, Shortliffe EH (eds). Rule-based expert systems: The MYCIN experiments of the Stanford Heuristic Programming Project. Reading, MA, USA: Addison-Wesley. 1984. p. 683.
6. Haenssle HA, Fink C, Schneiderbauer R, Toberer F, Buhl T, Blum A, Kalloo A, Hassen ABH, Thomas L, Enk A, Uhlmann L; Reader study level-I and level-II Groups. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Annals of Oncology; 28 May 2018. doi:10.1093/annonc/mdy166.
7. Mar VJ, Soyer HP. Artificial intelligence for melanoma diagnosis: How can we deliver on the promise? Annals of Oncology; 28 May 2018. doi:10.1093/annonc/mdy193.
8. Dugar N. AI algorithms begin to loom large in radiology. AME. 27 June 2018.
9. Rinck PA. Relaxation time measurements in medical diagnostics. In: Rinck PA. Magnetic resonance in medicine. A critical introduction. 12th ed. Norderstedt, Germany: BoD. 2018. pp. 87-92.



Citation: Rinck PA. Some reflections on artificial intelligence in medicine. Rinckside 2018; 29,5: 11-13.

A digest version of this column was published as:
Why radiology must take care when it comes to AI.
Aunt Minnie Europe. Maverinck. 26 September 2018.



Rinckside • ISSN 2364-3889
is published both in an electronic and in a printed version. It is listed by the German National Library.









