My first-pass results from running App::Prove::History on a moderately sized test suite (with deliberate failures) are pleasantly surprising.
Without saving state via App::Prove::History:
Test Summary Report
-------------------
t/aggregate.t (Wstat: 2304 Tests: 7781 Failed: 9)
  Failed tests: 2967, 2973, 2979, 2990, 5610, 5623, 5625
                5793-5794
  TODO passed: 5553, 5681, 5736, 5749
  Non-zero exit status: 9
t/xml_fixtures_validate.t (Wstat: 17664 Tests: 342 Failed: 69)
  Failed tests: 7-8, 10-13, 16-19, 27-35, 38, 40-49, 56-59
                63, 65, 70, 72, 74, 77, 90, 92, 94, 97-109
                173, 191, 226, 254, 280-287, 341
  Non-zero exit status: 69
Files=28, Tests=9201, 496 wallclock secs ( 1.97 usr 0.23 sys + 386.81 cusr 16.08 csys = 405.09 CPU)
Result: FAIL
With saving state:
Test Summary Report
-------------------
t/xml_fixtures_validate.t (Wstat: 17664 Tests: 342 Failed: 69)
  Failed tests: 7-8, 10-13, 16-19, 27-35, 38, 40-49, 56-59
                63, 65, 70, 72, 74, 77, 90, 92, 94, 97-109
                173, 191, 226, 254, 280-287, 341
  Non-zero exit status: 69
t/aggregate.t (Wstat: 2304 Tests: 7781 Failed: 9)
  Failed tests: 2967, 2973, 2979, 2990, 5610, 5623, 5625
                5793-5794
  TODO passed: 5553, 5681, 5736, 5749
  Non-zero exit status: 9
Files=28, Tests=9201, 506 wallclock secs ( 2.03 usr 0.27 sys + 390.21 cusr 16.26 csys = 408.77 CPU)
Result: FAIL
That's 496 wallclock seconds without saving state versus 506 with it: a difference of 10 seconds, or about 2%. I expect that my decision to keep everything lightweight and simple helped here, but I'm still surprised that the performance hit isn't bigger. In fact, without a few more test runs, I can't rule out that a 2% slowdown is just a fluke (even though saving state has to cost something).
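
To put numbers on that comparison, here is a small Python sketch that computes the relative slowdown from the wallclock and CPU totals in the two summary lines above. It is not part of App::Prove::History, and the helper name `percent_slowdown` is made up for illustration.

```python
def percent_slowdown(baseline: float, candidate: float) -> float:
    """Return how much slower `candidate` is than `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Totals copied from the two "Files=28, Tests=9201, ..." lines above.
wall_without, wall_with = 496.0, 506.0   # wallclock seconds
cpu_without, cpu_with = 405.09, 408.77   # total CPU seconds

wall_slowdown = percent_slowdown(wall_without, wall_with)
cpu_slowdown = percent_slowdown(cpu_without, cpu_with)

print(f"wallclock: {wall_slowdown:+.2f}%")  # wallclock: +2.02%
print(f"CPU:       {cpu_slowdown:+.2f}%")   # CPU:       +0.91%
```

Total CPU time grew by less than 1%, which fits the suspicion that part of the wallclock gap is run-to-run noise. One run of each configuration is not enough to say either way.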