One problem with steganography is that embedding hidden text in the covertext changes the statistical characteristics of the covertext. With large amounts of covertext, the change becomes obvious. Niels Provos addressed this in OutGuess by changing other bits in the covertext to minimize the impact of the embedding on the chi-square test. Would it be easier to embed undetectably if we could generate the covertext ourselves? Definitely!

Mybal.pl does this. Supply it with an ASCII text and it computes the probabilities of the characters following every sequence of characters in the text. Supply it with a key, a message to embed and a word, and it will generate a covertext starting with that word. The covertext has exactly the same probability distribution as the original text, but the message can be extracted from it if the key is known.

How does it work? Mybal takes the word to start with, interprets it as a sequence of chars and checks which chars could come next in the sequence, and how probable each of them is. It then throws a biased die (a PRNG seeded with the key) to decide which char is next. It appends that char, interprets the result as another sequence, and so on. If the list of possible next characters contains two chars with the same probability and the keyed random number generator chooses one of them, mybal looks for the next message bit to embed. If it's a zero, the randomly chosen char is appended. If it's a one, the other equally likely char is appended. Because both chars were equally likely to be drawn, swapping one for the other leaves the probabilities unchanged. This guarantees that the probability distribution is always the same as in the original.
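
The embedding steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of mybal.pl: the names `build_model`, `twin`, `embed` and `ORDER` are made up here, the context is assumed to be a fixed two characters, the sample text is wrapped around so that every context has a successor, and Python's `random.Random` stands in for whatever keyed PRNG the script really uses. When more than two chars share a probability, the sketch simply takes the next one in sorted order as "the other" char, which is also an assumption.

```python
import random
from collections import Counter, defaultdict

ORDER = 2  # assumed context length in characters (hypothetical choice)


def build_model(text, order=ORDER):
    """Count which characters follow every `order`-character sequence."""
    model = defaultdict(Counter)
    wrapped = text + text[:order]  # assumption: wrap around, so no dead ends
    for i in range(len(text)):
        model[wrapped[i:i + order]][wrapped[i + order]] += 1
    return model


def twin(counts, char):
    """Return another char with the same count as `char`, or None."""
    tied = sorted(c for c in counts if counts[c] == counts[char])
    if len(tied) < 2:
        return None
    # Assumed rule for more than two tied chars: next one in sorted order.
    return tied[(tied.index(char) + 1) % len(tied)]


def embed(model, key, bits, start, order=ORDER, max_chars=100_000):
    """Generate a covertext from `start` that carries `bits` (0/1 values)."""
    if len(start) < order or start[-order:] not in model:
        raise ValueError("start word does not end in a known sequence")
    rng = random.Random(key)  # stand-in for the keyed PRNG
    out = start
    pending = list(bits)
    while pending and len(out) < max_chars:
        counts = model[out[-order:]]
        chars = sorted(counts)
        # The biased die: draw the next char in proportion to its count.
        pick = rng.choices(chars, weights=[counts[c] for c in chars])[0]
        other = twin(counts, pick)
        # A tie carries one message bit: 0 keeps the pick, 1 swaps it.
        if other is not None and pending.pop(0) == 1:
            pick = other
        out += pick
    if pending:
        raise ValueError("model offers too few ties to embed the message")
    return out
```

In this sketch the generator stops as soon as the last message bit has been placed, so the covertext ends on a bit-carrying char. A call such as `embed(build_model(sample_text), "my key", [1, 0, 1], "th")` returns the covertext as a string; `sample_text` is a placeholder for whatever ASCII text is used as the source.
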
To extract the message, mybal starts with the first word and walks along the covertext, always checking the list of possible next chars. If the char in the covertext has the same probability as another char in the list, then a message bit may have been embedded with that char. To find out which bit it was, mybal uses the keyed PRNG to generate the text itself and thus sees which char it would have chosen for a zero bit and which for a one bit.
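
Extraction can be sketched the same way. Again this is an assumption-laden illustration rather than mybal.pl itself: `build_model`, `twin`, `extract` and `ORDER` are hypothetical names, the context length of two characters and the wrap-around of the sample text are assumed, `random.Random` stands in for the keyed PRNG, and the covertext is assumed to end right after the last embedded bit. The extractor replays the keyed die along the covertext; wherever the drawn char is tied with another, the char actually present tells it whether the bit was a zero or a one.

```python
import random
from collections import Counter, defaultdict

ORDER = 2  # assumed context length; must match the one used for embedding


def build_model(text, order=ORDER):
    """Count which characters follow every `order`-character sequence."""
    model = defaultdict(Counter)
    wrapped = text + text[:order]  # assumption: wrap around, so no dead ends
    for i in range(len(text)):
        model[wrapped[i:i + order]][wrapped[i + order]] += 1
    return model


def twin(counts, char):
    """Return another char with the same count as `char`, or None."""
    tied = sorted(c for c in counts if counts[c] == counts[char])
    if len(tied) < 2:
        return None
    # Assumed rule for more than two tied chars: next one in sorted order.
    return tied[(tied.index(char) + 1) % len(tied)]


def extract(model, key, cover, start, order=ORDER):
    """Recover the embedded bits by replaying the keyed PRNG along `cover`."""
    if len(start) < order or not cover.startswith(start):
        raise ValueError("covertext does not begin with the start word")
    rng = random.Random(key)  # must be seeded exactly as during embedding
    bits = []
    for i in range(len(start), len(cover)):
        counts = model.get(cover[i - order:i])
        if not counts:
            raise ValueError("covertext contains a sequence unknown to the model")
        chars = sorted(counts)
        # Replay the same biased die the embedder threw at this position.
        pick = rng.choices(chars, weights=[counts[c] for c in chars])[0]
        other = twin(counts, pick)
        if cover[i] == pick:
            if other is not None:
                bits.append(0)  # tie, and the drawn char was kept
        elif cover[i] == other:
            bits.append(1)      # tie, and the drawn char was swapped
        else:
            raise ValueError("covertext does not match this key and model")
    return bits
```

Both sides have to agree on the source text, the key, the start word and the context length; with a different key the replayed draws will normally stop matching the covertext, which this sketch reports as an error instead of returning garbage bits.
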