I use VoiceMod in my performances by adding a 333 ms Video Delay (Async) filter to my camera source in OBS. However, for some voices (like AI personalities) this is barely enough delay for good lip sync. For other voices (like "better mic" and other filter/EQ-focused voices) it's far too much delay for good lip sync.
Counterintuitive as it may seem, a configurable "additional latency" parameter for each voice would be hugely beneficial for performers who want to lip sync. That way I could test, with my own setup, how much actual latency each voice has and then add an extra offset so that all voices end up with roughly the same total latency. Then I could set my video delay to that same value and have nearly perfect lip sync regardless of the selected voice.
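
To make the arithmetic concrete, here is a rough sketch of how the per-voice offsets could be worked out. The voice names and latency numbers are made up for illustration (they are not real measurements), and `padding_for_voices` is a hypothetical helper, not anything VoiceMod or OBS provides. The idea is to pad every voice up to the slowest one and use that same figure as the video delay.

```python
def padding_for_voices(measured_ms):
    """Return (video_delay_ms, {voice: additional_latency_ms}).

    measured_ms maps a voice name to the latency measured on your own
    setup. Every voice is padded up to the slowest voice, so all voices
    end up with the same total latency.
    """
    if not measured_ms:
        raise ValueError("need at least one measured voice")
    # The slowest voice sets the common target; it needs no padding.
    target = max(measured_ms.values())
    padding = {voice: target - latency for voice, latency in measured_ms.items()}
    return target, padding


# Hypothetical measurements in milliseconds (placeholders, not real data).
measured = {"ai_personality": 330, "better_mic": 40, "eq_filter": 120}

video_delay, padding = padding_for_voices(measured)
print(f"Video Delay (Async) filter: {video_delay} ms")
for voice, extra in padding.items():
    print(f"{voice}: add {extra} ms")
```

With an "additional latency" setting per voice, the values in `padding` are what I would enter in VoiceMod, and `video_delay` is the single value I would set once in OBS.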