So whenever I look at clinical software, the first thing I want to know is where the clinical team sits, and whether they genuinely understand how the thing works. The pharmacist, the doctor, the nurse who signs it off: have they seen inside the algorithm, or is it a black box to them?
The follow-on matters just as much. Can they run their own tests against it, whenever they like, on whatever combinations of input happen to worry them? And are those tests kept, so they run again every time the algorithm, or the app around it, changes?
The test harness
This is the part I have always built in: test harnesses around the application, sometimes inside it, sometimes firing thousands of input combinations through the interface from the outside and checking that the output is what we expect. You cannot do that by hand.
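As a rough sketch of that kind of outside-in sweep, here is a small Python harness. Everything in it is an assumption made for illustration: `calculate_dose_mg`, its formula, the `MAX_DOSE_MG` cap and the input ranges are invented stand-ins, not clinical guidance and not any real product's algorithm. The point is only the shape: generate every combination, push each one through, and check properties the clinical team expects to hold for all of them.

```python
from itertools import product

# Hypothetical stand-in for the algorithm under test. The formula and the cap
# are invented for illustration and are NOT clinical guidance.
MAX_DOSE_MG = 1000.0


def calculate_dose_mg(weight_kg: float, mg_per_kg: float, renal_factor: float) -> float:
    """Return a capped, renally adjusted dose in mg (illustrative only)."""
    if weight_kg <= 0 or mg_per_kg <= 0 or not 0 < renal_factor <= 1:
        raise ValueError("input out of range")
    return min(weight_kg * mg_per_kg * renal_factor, MAX_DOSE_MG)


def run_sweep():
    """Fire every input combination through the function and collect failures."""
    weights = [0.5 + 0.5 * i for i in range(400)]  # 0.5 kg to 200 kg in 0.5 kg steps
    rates = [1.0, 2.5, 5.0, 10.0, 15.0]            # assumed mg/kg values
    renal = [0.25, 0.5, 0.75, 1.0]                 # assumed adjustment factors
    failures = []
    checked = 0
    for w, r, k in product(weights, rates, renal):
        dose = calculate_dose_mg(w, r, k)
        checked += 1
        # Invariants expected to hold for every single combination.
        if not (0 < dose <= MAX_DOSE_MG):
            failures.append((w, r, k, dose, "outside allowed range"))
        if dose > w * r + 1e-9:
            failures.append((w, r, k, dose, "exceeds unadjusted dose"))
    return checked, failures


if __name__ == "__main__":
    checked, failures = run_sweep()
    print(f"checked {checked} combinations, {len(failures)} failures")
```

The checks here are properties rather than exact answers, which is what makes a sweep of thousands of cases practical: nobody has to work out each expected value by hand, only agree on what must never happen.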
Nobody understands every interaction in a complex application; there can be billions of them. What you can get on top of are the trends and the edge cases, and once you have found an edge case you turn it into a test and make sure it keeps passing when something else moves.
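One way to keep an edge case once it has been found is to store it as data alongside the expected answer and the reason it matters, then rerun the whole list on every change. The sketch below assumes the same kind of made-up dose function; the names `PINNED_CASES` and `run_pinned`, the formula and all the numbers are hypothetical and carry no clinical meaning.

```python
import math

MAX_DOSE_MG = 1000.0  # invented cap, not clinical guidance


def calculate_dose_mg(weight_kg, mg_per_kg, renal_factor):
    """Hypothetical algorithm under test (illustrative formula only)."""
    return min(weight_kg * mg_per_kg * renal_factor, MAX_DOSE_MG)


# Each edge case is recorded once, with the expected answer and why it matters.
PINNED_CASES = [
    {"inputs": (0.5, 10.0, 1.0), "expected": 5.0, "why": "smallest weight in range"},
    {"inputs": (100.0, 10.0, 1.0), "expected": 1000.0, "why": "lands exactly on the cap"},
    {"inputs": (150.0, 10.0, 1.0), "expected": 1000.0, "why": "cap must hold above the limit"},
    {"inputs": (80.0, 10.0, 0.25), "expected": 200.0, "why": "strongest renal adjustment"},
]


def run_pinned(fn, cases, tol=1e-9):
    """Run every pinned case against fn and return the ones that no longer pass."""
    failures = []
    for case in cases:
        actual = fn(*case["inputs"])
        if not math.isclose(actual, case["expected"], abs_tol=tol):
            failures.append((case["why"], case["inputs"], case["expected"], actual))
    return failures


if __name__ == "__main__":
    for why, inputs, expected, actual in run_pinned(calculate_dose_mg, PINNED_CASES):
        print(f"FAIL ({why}): {inputs} expected {expected}, got {actual}")
```

If a later change dropped the cap, for example by replacing the function with `lambda w, r, k: w * r * k`, the "cap must hold above the limit" case would start failing while the others still pass, which points straight at what moved.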
A machine can write the code. It cannot put its name to the result. Someone still has to.
I have designed, built and helped populate these harnesses on the products I have worked on. That is what lets the clinical team put their name to the software and actually mean it.
AI changes how quickly the code arrives. Who answers for it when it goes wrong has not changed at all.