The theory here is that counting the number of sentences generated is
kind of silly: if we're already specifying min/max word counts, we
probably just want to land in that range and not really care how many
sentences we get.

Meanwhile, we were overloading max_sentences to also calculate how long
any one sentence must be, which is a weird thing to derive from it. So
we're dropping the max_sentences language and calling this what it
really is: a bias towards the number of sentences that might be seen.