Maximum mutual information (MMI) has become one of the two de facto methods for sequence-level training of speech recognition acoustic models. This paper aims to isolate, identify and bring forward the implicit modelling decisions induced by the design and implementation of the standard finite state transducer (FST) lattice-based MMI training framework. In particular, the paper investigates whether it is necessary to maintain a preselected numerator alignment, and highlights the importance of determinizing FST denominator lattices on the fly. On-the-fly FST lattice determinization is shown mathematically to guarantee discrimination at the hypothesis level, and its efficacy is shown empirically by training deep CNN models on an 18K-hour Mandarin dataset and a 2.8K-hour English dataset. On assistant and dictation tasks, the approach achieves a relative WER reduction (WERR) of between 2.3% and 4.6% over the standard FST lattice-based approach.