For The Artifact I built some really extensive spreadsheets to figure out the probabilities of multiple successes in some really complex situations. I also did tons of spreadsheets for the Engineer's Resource (which is now part of the Player's Handbook). I had to explore thousands of possibilities and do it within my lifetime!
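To give a rough idea of the kind of calculation a "multiple successes" spreadsheet works through, here is a minimal sketch in Python. It assumes the simplest possible case: a fixed number of independent rolls, each with the same chance of success. That is only an illustration, not The Artifact's actual dice mechanic, and the names `prob_at_least` and `success_table` are made up for this example.

```python
from math import comb


def prob_at_least(successes, rolls, p):
    """Chance of getting at least `successes` successes in `rolls`
    independent rolls, each succeeding with probability `p`.
    (Assumed model: a plain binomial distribution.)"""
    return sum(
        comb(rolls, k) * p**k * (1 - p) ** (rolls - k)
        for k in range(successes, rolls + 1)
    )


def success_table(max_rolls, p):
    """Build the grid a spreadsheet would hold: one entry for each
    (rolls, successes needed) pair, up to `max_rolls` rolls."""
    return {
        (rolls, needed): prob_at_least(needed, rolls, p)
        for rolls in range(1, max_rolls + 1)
        for needed in range(1, rolls + 1)
    }


if __name__ == "__main__":
    # Hypothetical example: each roll succeeds half the time.
    for (rolls, needed), chance in sorted(success_table(4, 0.5).items()):
        print(f"{rolls} rolls, need {needed}+: {chance:.3f}")
```

Even this toy version shows why the number of cases grows quickly: every extra roll adds another row of possibilities, and any modifier that changes `p` multiplies the whole table again.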
I used to sort of write out examples for The Artifact's blog, where I'd write a short vignette of either something that happened in a game or a story I'd like to do. It wasn't exactly for the purpose of development, but it probably figured into development.
As for testing with others, yeah, we've done a lot of test sessions tuned to a specific situation. A lot of the time it's things that the players would likely never get to do, but there are stats for them in the book. This has pointed out some big flaws in the older editions. Other times it guided the development of rules.
Most of the time I just work through this kind of thing in my head. That is, unless I'm doing some kind of probability analysis.