Today I made some improvements to LDJamerator:
- Filter out parentheses unless they’re part of an emote, because they’re otherwise unlikely to appear as matched pairs without a lot of extra logic.
- Filter out double quotes for the same reason as parentheses.
- Make it so that the next word after a colon or emote is not generated from the Markov chain, but instead pulled randomly from the entire corpus.
- Prevent generated tweets from accidentally being verbatim copies of one of the 1500 source tweets. I use hashing to quickly determine whether a generated tweet is an exact copy, so near-duplicates (for example, a tweet that differs from a source tweet by a single word) are not caught yet.
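
For the curious, here is a minimal sketch of how the last two changes might look. It assumes Python and a word-level chain stored as a dict, and it is not LDJamerator's actual code; every name in it (`SOURCE_TWEETS`, `EMOTES`, `next_word`, `is_verbatim_copy`) is made up for illustration, and the two sample tweets are placeholders rather than real corpus entries.

```python
import hashlib
import random

# Placeholder stand-ins for the real corpus of roughly 1500 source tweets.
SOURCE_TWEETS = [
    "Finally finished my #ldjam entry :D time to sleep",
    "Progress: the player can jump now #ldjam",
]

# Hypothetical emote list; the real one is presumably longer.
EMOTES = {":D", ":)", ":("}

# Every word in the corpus, duplicates included, so that common words
# are picked more often by the random fallback.
ALL_WORDS = [word for tweet in SOURCE_TWEETS for word in tweet.split()]


def tweet_hash(text):
    """Return a hex digest of the exact tweet text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hash every source tweet once, up front, so each later check is a
# single set lookup instead of a scan over the whole corpus.
SOURCE_HASHES = {tweet_hash(tweet) for tweet in SOURCE_TWEETS}


def is_verbatim_copy(candidate):
    """True only for an exact copy; near-duplicates hash differently."""
    return tweet_hash(candidate) in SOURCE_HASHES


def next_word(prev, chain, rng=random):
    """Pick the word that follows `prev`.

    After a colon or an emote the chain is bypassed and a word is drawn
    from the whole corpus. The same fallback is used when the chain has
    no entry for `prev` (an assumption of this sketch).
    """
    followers = chain.get(prev)
    if prev.endswith(":") or prev in EMOTES or not followers:
        return rng.choice(ALL_WORDS)
    return rng.choice(followers)
```

A generator loop would call `next_word` until the tweet is long enough, then throw the result away and retry whenever `is_verbatim_copy` returns true. A plain Python set of strings would do the same job as the explicit digests here; the digest version just makes the hashing step visible.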
I wanted to regenerate the corpus, but it looks like the Twitter API won’t let me get old tweets, and most of the current #ldjam tweets are bot-trash or self-promo, so a corpus built from them doesn’t result in good jamerations at all. I’ll have to wait until the next LD to generate a new corpus, but this should still provide lots of fun in the meantime.