• bane 11 hours ago
    One of my first jobs was helping build an expert system for a computational linguistics problem that remains complex even today. The company had a rich corporate library full of academic books on expert systems, decision trees, first-gen (pre-winter) AI, and some early books on early ML approaches. I remember seeing this book in particular, and its evocative title caused me to look deeper into the library than I would have normally.

    Our core system was built of thousands upon thousands of hand-crafted rules informed by careful statistical analysis of hundreds of millions of entries in a bulk data system.

    Part of my job was to build the system that analyzed the bulk data and produced the stats; the other part was carefully testing and fixing the rulesets for certain languages. It was mind-numbing work, and looking back, we were freakishly close to having all the bits and pieces needed for then-bleeding-edge ML, had we chosen to go that way.

    However, we chose expert systems because they gave us tremendous insight into what was happening, and the opportunity to debug and test things at an incredibly granular scale. It was fully possible to say "the system has this behavior because of xyz" and to tune the system with individual-character levels of finesse.

    Had we wanted to dive into ML, we could have used this foundation as a bootstrap for building a massive training set. But the founders leaned towards expert systems, and I think, at the time, it was the right choice.

    The technology was acquired, and I wonder if the current custodians use it for those obvious next-step purposes.

    • nowittyusername 9 hours ago
      Your post is making me think there may be quite a lot of lost knowledge out there that has pertinence to modern-day agentic AI system building. I am currently experimenting with building my own AI system that uses LLMs as the "engine", but the "harness" around said LLM will do most of the heavy lifting. It will have internal verification systems, grounding information, metadata, etc... And I find myself writing a lot of automated scripts as part of that process, as I have a personal motto that it's always better to automate everything possible with scripts first and only use LLMs as a last resort, for the things you can't script away. And that is making me look more and more into old techniques that were established way back when...
      • mindcrime 2 hours ago
        > Your post is making me think there may be quite a lot of lost knowledge out there that has pertinence to modern-day agentic AI system building.

        I agree with that. In fact, that mindset is what led me to this book in the first place. I was exploring an older book on OPS5[1], saw this book mentioned, went looking for it, and found that it is freely available online. It seemed like something the HN crowd might enjoy, so here we are.

        > And that is making me look more and more into old techniques that were established way back when...

        I suspect that there is some meat on that bone. I'm exploring this particular area as well. I think there's some opportunity for hybridization between LLMs / GenAI and some of these older approaches.

        [1]: https://en.wikipedia.org/wiki/OPS5

        • mark_l_watson 41 minutes ago
          I spent several years working with OPS5 in the 1980s. The Common Lisp code, especially the Rete network stuff, was fairly straightforward to modify and generally work with. Good times.
      • ozim 8 hours ago
        Well, from what I remember from university, most expert systems went bust because they were promising what ML promises today.

        Maintaining the rules (or, in your case, the scripts) for complex tasks is much more work than anyone is willing to commit to. Another big problem was eliciting tacit knowledge; no one was able to encode that reliably.

        ML today promises that you won't have to hand-code the rules: you just push in data, the system figures out what the rules are, and it can then handle new data.

        I don't have to code the rules to check whether there is a cat in the picture - that definitely works. Making rules for data that is not so often found on the internet is still going to be a hassle. Rules change and the world changes, and the knowledge cutoff, for example, is I think still a problem.

        In the end, yes, you can build a nice system for some use case where you plug in an LLM for classification, and you will most likely make money on it. It just won't be „what was promised”, i.e. AGI, and a lot of people won't accept less than that.

      • arethuza 6 hours ago
        I wonder what would happen if you used an LLM to write the rules?
  • andrehacker 11 hours ago
    Ah, the early days of AI.

    If a book or movie is ever made about the history of AI, the script would include this period and would probably go something like this…

    (Some dramatic license here, sure. But not much more than your average "based on true events" script.)

    In 1957, Frank Rosenblatt built a physical neural network machine called the Perceptron. It used variable resistors and reconfigurable wiring to simulate brain-like learning. Each resistor had a motor to adjust weights, allowing the system to "learn" from input data. Hook it up to a fridge-sized video camera (20x20 resolution), train it overnight, and it could recognize objects. Pretty wild for the time.

    Rosenblatt was a showman—loud, charismatic, and convinced intelligent machines were just around the corner.

    Marvin Minsky, a jealous academic peer of Frank, was in favor of a different approach to AI: Expert Systems. He published a book (Perceptrons, 1969) which all but killed research into neural nets. Marvin pointed out that no neural net with a depth of one layer could solve the "XOR" problem (XOR's truth table is not linearly separable, and a single layer can only draw a linear boundary).

    While the book's findings and mathematical proof were correct, they were based on incorrect assumptions (that the Perceptron only used one layer and that algorithms like backpropagation did not exist).

    As a result, a lot of academic AI funding was directed towards Expert Systems. The flagship of this was the MYCIN project. Essentially, it was a system to find the correct antibiotic based on the exact bacteria a patient was infected with. The system thus had knowledge about thousands and thousands of different diseases with their associated symptoms. At the time, many different antibiotics existed, and using the wrong one for a given disease could be fatal to the patient.

    When the system was finally ready for use... after six years (!), the pharmaceutical industry had developed “broad-spectrum antibiotics,” which did not require any of the detailed analysis MYCIN was developed for.

    The period of suppressing Neural Net research is now referred to as (one of) the winter(s) of AI.

    --------

    As said, that is the fictional treatment. In reality, the facts, motivations, and behavior of the characters are a lot more nuanced.

    • mindcrime 2 hours ago
      > If a book or movie is ever made about the history of AI, the script would include this period and would probably go something like this…

      I would love to see a "Halt and Catch Fire" style treatment of this era.

      > Marvin Minsky, a jealous academic peer of Frank, was in favor of a different approach to AI: Expert Systems. He published a book (Perceptrons, 1969) which all but killed research into neural nets.

      I think a lot of people have an impression - an impression that I shared until recently - that the Perceptrons book was a "hit piece" aimed at intentionally destroying interest in the perceptron approach. But having just finished reading the Parallel Distributed Processing book, and being in the middle of reading Perceptrons right now, I no longer fully buy that. The effect may well have been what is widely described. But Minsky and Papert don't really seem to be as "anti-perceptron" as the "received wisdom" suggests.

    • Animats 9 hours ago
      Not that wrong.

      I went through Stanford CS when those guys were in charge. It was starting to become clear that the emperor had no clothes, but most of the CS faculty was unwilling to admit it. It was really discouraging. Peak hype was in "The fifth generation: artificial intelligence and Japan's computer challenge to the world" (1983), by Feigenbaum. (Japan at one point in the 1980s had an AI program which attempted to build hardware to run Prolog fast.)

      Trying to use expert systems for medicine lent an appearance of importance to something that might work for auto repair manuals. It's mostly a mechanization of trouble-shooting charts. It's not totally useless, but you get out pretty much what you carefully put in.
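
      To make that concrete: a chart node like "won't crank, lights dim -> dead battery" is literally one rule. A toy sketch in Prolog, with the symptoms and diagnoses invented for illustration:

        % A troubleshooting chart mechanized as rules.
        diagnosis(dead_battery) :- symptom(no_crank), symptom(dim_lights).
        diagnosis(bad_starter)  :- symptom(no_crank), \+ symptom(dim_lights).

        % Observed symptoms go in as facts...
        symptom(no_crank).
        symptom(dim_lights).

        % ...and ?- diagnosis(D). gets out exactly what was put in.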

      • microtonal 8 hours ago
        Not exactly an expert system, but during my PhD I contributed to a natural language parsing/generation system for Dutch written mostly in Prolog with some C++ for performance reasons. The only statistical component was a maxent ranker for disambiguation and fluency ranking.

        No statistical dependency parser came near it accuracy-wise until BERT/RoBERTa + biaffine parsing.

        • arnsholt 3 hours ago
          Oh yeah, the good hand-crafted grammars are really good. For my PhD I worked in a group that was deep in the DELPH-IN/ERG collaboration, and they did some amazing things with that.
      • mamp 9 hours ago
        To be fair, the performance of rules, Bayesian networks, or statistical models wasn't the problem (performance compared to existing practice, that is). De Dombal showed in 1972 that a simple Bayes model was better than most ED physicians at triaging abdominal pain.

        The main barrier to scaling was workflow integration, owing to the lack of electronic data and, where data was available, to interoperability (as it is today). The other barriers were problems with maintenance and performance monitoring, which are still issues today in healthcare and other industries.

        I do agree the 5th Generation project never made sense, but as you point out they had developed hardware to accelerate Prolog and wanted to show it off and overused the tech. Hmmm, sounds familiar...

      • Zafira 6 hours ago
        The early history of AI/cybernetics seems poorly documented. There are a few books, some articles, and some oral histories about what was going on with McCulloch and Pitts. It makes one wonder what might have been with a lot of things, including if Pitts had lived longer, gotten out of the rut he found himself in at the end (to put it mildly), and hadn't burned his PhD dissertation. But perhaps one of the more interesting comments directly relevant to all this lies in this fragment from a “New Scientist” article[1]:

        > Worse, it seems other researchers deliberately stayed away. John McCarthy, who coined the term “artificial intelligence”, told Piccinini that when he and fellow AI founder Marvin Minsky got started, they chose to do their own thing rather than follow McCulloch because they didn’t want to be subsumed into his orbit.

        [1] https://www.newscientist.com/article/mg23831800-300-how-a-fr...

    • mamp 9 hours ago
      Don't attribute to jealousy that which can be adequately explained by vanishing gradients.

      BTW, the ad hoc treatment of uncertainty in MYCIN (certainty factors) motivated the work on Bayesian networks.
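
      For the curious: as I recall, MYCIN combined two positive certainty factors x and y as x + y(1 - x). It looks probabilistic but has no consistent probabilistic semantics, which is part of what the Bayesian network work set out to fix.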

  • cess11 5 hours ago
    MYCIN was a high-profile project, but rule engines in this vein come in handy quite often. With some basic Prolog knowledge and a decent implementation, pretty much any set of data files can quickly be turned into an ad hoc 'expert system' lite.

    The simplest example and the one I usually bring up is log files, where the primary delimiter is \n and the secondary is likely some whitespace, which can easily be replaced with Prolog delimiters and a bit of decoration. This turns the data into Prolog code which can be consulted as is and complemented with rules abstracting complex queries.
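
    A minimal sketch of what I mean (the fact layout and rule names are made up for illustration): a log line like "12:33 error disk full" becomes a fact, and rules abstract queries over the facts:

      % One fact per log line: log(Time, Level, Message).
      log('12:33', error, 'disk full').
      log('12:35', info,  'backup done').
      log('14:02', error, 'disk full').

      % Rules abstracting more complex queries over the raw facts.
      error_at(T, Msg) :- log(T, error, Msg).
      recurring(Msg)   :- log(T1, error, Msg),
                          log(T2, error, Msg),
                          T1 \= T2.

    Consult that and ?- recurring(Msg). answers Msg = 'disk full'.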

    Something similar can be done with JSON files.

    https://www.scryer.pl/

    https://www.metalevel.at/prolog/dcg

    https://www.metalevel.at/prolog/expertsystems

  • anthk 6 hours ago
    Common Lisp has these too, in the Paradigms of Artificial Intelligence Programming book (it implements an EMYCIN-style expert system shell).

    Code:

    https://github.com/norvig/paip-lisp

    Book:

    https://archive.org/details/github.com-norvig-paip-lisp_-_20...

    As for the implementation, SBCL works fine everywhere; if not, pick ECL or CCL.