76 Commits

Author SHA1 Message Date
Brian S. Stephan 14f2a027fe Markov: preliminary support for the bot to conditionally shut itself up (and recover from that) 2011-04-30 15:43:59 -05:00
Brian S. Stephan 42d414a0a4 Markov: consolidate _reply_to_line and _reply into _generate_line 2011-04-30 15:37:16 -05:00
Brian S. Stephan 9ec73c4aa6 Markov: this is kind of embarrassing. remove a duplicate index. 2011-04-27 21:38:52 -05:00
Brian S. Stephan 6070ddc950 Markov: when looking up the start-of-sentence chain, get one random one
when finding a key for (__start1,__start2), instead of fetching all
(which can be a lot, in chatty channels and/or over time), get the
max ID in the table, pick a random ID between 1,max, and pick the
first id >= it, and use that. just as random, nowhere near as
intensive.
2011-04-23 21:24:23 -05:00
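
The lookup described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the table and key columns `markov_chain(k1, k2)` come from the index commit in this log, while the `id` and `v` column names, the `pick_random_start` function, and the wrap-around fallback are assumptions made for the example.

```python
import random
import sqlite3

START_KEY = ("__start1", "__start2")


def pick_random_start(conn, rng=random):
    """Return one (id, v) start-of-sentence row, or None if there is none."""
    max_id = conn.execute("SELECT MAX(id) FROM markov_chain").fetchone()[0]
    if max_id is None:  # empty table
        return None
    # Pick a random id in 1..max, then take the first start row at or after it.
    target = rng.randint(1, max_id)
    query = (
        "SELECT id, v FROM markov_chain "
        "WHERE k1 = ? AND k2 = ? AND id >= ? ORDER BY id LIMIT 1"
    )
    row = conn.execute(query, START_KEY + (target,)).fetchone()
    if row is None:
        # Assumed fallback (not in the commit message): no start row at or
        # after the target, so wrap around to the first start row.
        row = conn.execute(query, START_KEY + (1,)).fetchone()
    return row
```

Only one row is read per request, however many start rows the table holds. One caveat of this sketch: a start row that follows a long run of non-start ids is picked somewhat more often than its neighbours, so the choice is cheap but not perfectly uniform.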
Brian S. Stephan 6ef7865dba Markov: remove unused _get_chain_beginnings 2011-04-23 20:59:26 -05:00
Brian S. Stephan 7f922dd2c9 Markov: remove the 'starts' dictionary 2011-04-23 16:27:07 -05:00
Brian S. Stephan 116251398e Markov: index on markov_chain(k1,k2) 2011-04-23 16:25:01 -05:00
Brian S. Stephan 305625044a Markov: track the context of said lines
a context is a meta-classification ('banter', 'secrets', whatever)
based on targets (channels or nicknames). when a line is being
learned from a known target, the chains are placed in that context.

this is for allowing one brain to have multiple personalities, in
a sense, for large networks or cases where there may be a more
sanitized set of channels and a couple channels where everyone lets
it rip. a later enhancement would have sentence creation choose from
context-less chains (and contexts matching the current target), but
i need to go back to the drawing board on that one a bit.

ramble ramble ramble
2011-04-23 16:07:32 -05:00
Brian S. Stephan 5885983afd Markov: when learning lines, don't include the direct addressing part
e.g. if i say 'dr_botzo: hello dude', he only learns 'hello dude'.
this is mainly being done because the bot's name being in the brain
so many times was getting kind of silly, especially in channels that
have lots of conversations with the bot
2011-04-22 19:40:36 -05:00
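
A rough sketch of stripping the direct address before learning, as the commit above describes. The function name `strip_direct_address` and the accepted separators (`:` or `,` followed by whitespace) are assumptions for illustration; the real module may match addressing differently.

```python
import re


def strip_direct_address(line, nick):
    """Drop a leading 'nick: ' or 'nick, ' so only the rest gets learned."""
    # re.escape guards against nicks containing regex metacharacters.
    pattern = r"^\s*" + re.escape(nick) + r"[:,]\s+"
    return re.sub(pattern, "", line, count=1)
```

With this sketch, `strip_direct_address("dr_botzo: hello dude", "dr_botzo")` yields `'hello dude'`, while a line that only mentions the nick mid-sentence is left unchanged.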
Brian S. Stephan 5913a95165 Markov: append a stop if we have nothing to append from a chain
somehow a chain led us down a path where there are no values for
the keys in the chain. if that happens, just abort.

i'm not quite sure how this could happen
2011-03-17 17:24:11 -05:00
Brian S. Stephan 2b8f0d2843 Markov: don't crash when learning a sentence that's only whitespace 2011-03-14 13:14:56 -05:00
Brian S. Stephan 7a53aaa9a1 Markov: properly output unicode chains 2011-02-25 20:59:57 -06:00
Brian S. Stephan 87073d7fd3 Markov: cache the first word in markov chains
this eliminates the expensive database hit on every request for a line.
the cache is loaded when the module loads and learning new lines should
add the appropriate word to the list. seemed like a pretty good compromise
2011-02-24 21:06:29 -06:00
Brian S. Stephan 1712a7db53 Markov: use sqlite backend for brain
this keeps us from having the entire markov chain in memory and
having to do the pickling and so on. in many ways, this is a good
thing.

in one way, this is a bad thing. each line on irc will create a
__start1,__start2 item in the database, which means starting a
chain will be an expensive process. (approx 3 seconds, from irc
logs of 600,000 K lines). following selects run much faster, but
the first one is dog slow. a later commit should hopefully fix this.
2011-02-24 20:39:32 -06:00
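
To see why every learned line adds a (__start1,__start2) row, here is a minimal sketch of a sqlite-backed learner. The schema is an assumption: the table name and `k1`/`k2` columns are borrowed from the later index commit in this log, and `id`, `v`, `create_brain`, and `learn_line` are hypothetical names for the example.

```python
import sqlite3

START1, START2, END = "__start1", "__start2", "__end"


def create_brain(conn):
    """Create the (assumed) chain table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS markov_chain ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "k1 TEXT NOT NULL, k2 TEXT NOT NULL, v TEXT NOT NULL)"
    )


def learn_line(conn, line):
    """Store one (k1, k2) -> v row per word, plus the end marker."""
    words = line.split()
    if not words:  # whitespace-only line: nothing to learn
        return 0
    k1, k2 = START1, START2
    rows = []
    for word in words + [END]:
        rows.append((k1, k2, word))
        k1, k2 = k2, word
    conn.executemany(
        "INSERT INTO markov_chain (k1, k2, v) VALUES (?, ?, ?)", rows
    )
    return len(rows)
```

Each call inserts exactly one row keyed on (__start1,__start2), so the number of start rows grows with the number of lines seen, which is what makes a fetch-all on that key slow.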
Brian S. Stephan 2aa369add7 rewrite recursion/alias code for the 500th time.
more of a moving of the code, actually, it now exists in (an overridden)
_handle_event, so that recursions happen against irc events directly,
rather than an already partially interpreted object.

with this change, modules don't need to implement do() nor do we have a
need for the internal_bus, which was doing an additional walk of the
modules after the irc event was already handled and turned into text. now
the core event handler does the recursion scans.

to support this, we bring back the old replypath trick and use it again,
so we know when to send a privmsg reply and when to return text so that
it may be chained in recursion. this feels old hat by now, but if you
haven't been following along, you should really look at the diff.

that's the meat of the change. the rest is updating modules to use
self.reply() and reimplementing (un)register_handlers where appropriate
2011-02-17 01:08:45 -06:00
Brian S. Stephan 28f450ab5d Markov: improve min_size by implementing min_search_tries
if the end of a chain has been reached via __end, but min_size
has not been satisfied, discard the last couple elements in the
chain and try again. use min_search_tries so we don't do this
forever.
2011-01-25 20:42:52 -06:00
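
The retry behaviour described in the commit above might look roughly like this. It is a sketch under stated assumptions: the chain is an in-memory dict mapping `(k1, k2)` tuples to lists of next words rather than the bot's real storage, `generate_line` and its defaults are hypothetical, and "the last couple elements" is taken to mean the last two words.

```python
import random

START1, START2, END = "__start1", "__start2", "__end"


def generate_line(chain, min_size=0, max_size=100, min_search_tries=50,
                  rng=random):
    """Walk the chain; if it ends before min_size words, back up and retry."""
    words = [START1, START2]
    tries = 0
    while len(words) - 2 < max_size:
        candidates = chain.get((words[-2], words[-1]))
        # No values for this key: treat it as a stop.
        nxt = rng.choice(candidates) if candidates else END
        if nxt == END:
            if len(words) - 2 >= min_size or tries >= min_search_tries:
                break
            # Too short: discard the last two words (never the start
            # markers) and try again, bounded by min_search_tries.
            tries += 1
            del words[max(2, len(words) - 2):]
            continue
        words.append(nxt)
    return " ".join(words[2:])
```

The `tries` counter is what keeps this from looping forever when the chain simply cannot produce a long enough line; once it is exhausted, the short line is returned as-is.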
Brian S. Stephan 7b4b86dc0d Markov: add support for requesting desired min/max size of a reply
note that since the min_size support is kind of crude at the moment,
this only partially works
2011-01-25 20:25:15 -06:00
Brian S. Stephan c732466129 Merge branch 'master' of git.incorporeal.org:dr.botzo 2011-01-24 16:51:52 -06:00
Brian S. Stephan 2f3feb093d have !markov learn echo the text it learned, in case someone wants to chain it with other commands for some reason 2011-01-24 16:51:05 -06:00
Brian S. Stephan 18fc614a4a assorted whitespace nitpicking 2011-01-20 14:15:10 -06:00
Brian S. Stephan 7c05f60ffd Markov: implement a min_size, which tries to make a chain of at least min_size words.
note that this isn't guaranteed, if the chain is such that the
current tuple has nowhere to go but to the end of the line, then
it will follow it --- it doesn't try to go back and rebuild a different
chain or anything.
2011-01-19 18:44:07 -06:00
Brian S. Stephan ac0429569e Markov: size -> max_size, since I'm going to try adding a min_size soon 2011-01-19 18:35:01 -06:00
Brian S. Stephan 176ca25c68 Markov: increase the default max length from 25 words to 100 words.
it's expected that, usually, the chain will have hit an end before this.
2011-01-19 18:32:15 -06:00
Brian S. Stephan d592d3f3bb Markov: regexes should only match start of line --- add ^ 2011-01-19 10:20:20 -06:00
Brian S. Stephan 3283fac1ff Markov: remove some debugging i forgot to clean out before the initial commit 2011-01-18 22:51:40 -06:00
Brian S. Stephan 8dd223f778 Markov: a module to implement a chatterbot via markov chains.
yeah, we have MegaHAL, but i can't find a good implementation in
python that actually works and is stable, so we'll implement a
simple thing ourselves. works pretty much like MegaHAL does, but
without the string corruption.

original code provided by ape, care of mike bloy
2011-01-18 22:30:59 -06:00