dr.botzo

Commit Graph

Author	SHA1	Message	Date
Brian S. Stephan	033631e5c2	no longer encode/decode UTF8 stuff when going to/from database seems safe so far (famous last words)	2012-07-27 16:34:57 -05:00
Brian S. Stephan	e1356496eb	Markov: don't encode('utf8') the stuff out of the database it seems unnecessary now? i guess i have to change this in all the modules now, including this one because i probably missed something	2012-07-27 15:24:56 -05:00
Brian S. Stephan	7bd5558f05	ENGINE=InnoDB CHARACTER SET utf8 COLLATE utf8_bin for case-sensitivity	2012-07-27 14:57:41 -05:00
Brian S. Stephan	1a36becead	convert to a MySQL backend WARNING! there's no going back now. this change is huge but it was overdue. WARNING! the database backend is now mysql. modules that should use a database but don't yet were left untouched, they'll come later. scripts haven't been converted yet, though i'm pretty sure i'll need to soon. while i was going through everything, connection/cursor idioms were cleaned up, as were a bunch of log messages and exception handling. this change is so gross i'm happy things appear to be working, which is the case --- all modules are lightly tested.	2012-07-27 02:18:01 -05:00
Brian S. Stephan	9654f4de98	switch to use python's logging, with config file i'm not entirely happy about	2012-07-15 21:32:12 -05:00
Brian S. Stephan	2b0b7abd58	Markov: unicode fixes and improvements	2012-07-15 01:11:21 -05:00
Brian S. Stephan	2650824dbd	Markov: correct the documentation on min_size/max_size in _generate_line	2012-07-14 09:22:37 -05:00
Brian S. Stephan	d94d7f0c88	Markov: register ._generate_line as markov_generate_line	2012-04-05 21:24:41 -05:00
Brian S. Stephan	07744a0f66	indicate recursion better by adding _recursing to Event for simplicity's sake, this was added to the extlib/irclib rather than subclassing. because i'm lazy. anyway, check that flag instead of doing the event._target = None hack, since that hack was breaking Markov. for an unrelated reason (what to learn and not learn), update Markov also remove an unused method that was getting in my way while coding this	2012-03-29 20:07:32 -05:00
Brian S. Stephan	7d41564d02	Markov: allow for auto-context insertion this should result in no chains having a null context --- if no pre-existing context is created, one is created for the channel/nick and used. this makes, for example, arbitrary queries "private" to that nick (again unless that has been overridden). shouldn't affect much of anything, but adding this made the context-less learning code obsolete, which is fine since it was never used anyway	2012-03-19 00:12:29 -05:00
Brian S. Stephan	26bc8bec34	Markov: rebuild the tables, use the context stuff in a better fashion this time the module will drop your old tables if you have them, so if there's data there, be sure to back them up and figure out some migration strategy (probably annoying and probably having to script it). the big change is that each line is associated to a context now, and channels are also associated to contexts. this should allow for a better partitioning of multiple brains, and changing which channels point to which brain. also caught in the wake is some additional logging verbosity, and a change to no longer lower() everything learned. the script to dump a file into the database has also been updated with the above changes	2012-02-28 23:23:14 -06:00
Brian S. Stephan	8c1ffc54ba	Markov: drop the max id stuff, get a bunch of chains and pick one randomly. cooler this way.	2011-10-21 17:01:09 -05:00
Brian S. Stephan	e3ef3f48dc	Markov: add support for temporarily disabling chatter by supplying a negative chance	2011-10-21 16:59:57 -05:00
Brian S. Stephan	cda1d43606	Markov: index on (v, context) and other enhancements for the last commit reduce some infinite loop possibilities, and add an index with the old <= id trick to speed up the searching for backwards chains	2011-10-16 21:13:27 -05:00
Brian S. Stephan	42962bc48d	Markov: add support for starting in the middle of a chain and working backwards this only makes sense if we have a target word set, which we usually do. start with the target word and go backwards, finding k2s that lead to it (and that lead to that k2, and so on) until we get to the start-of-chain value, when we know we're done working backwards. then resume the normal appending logic probably needs some work, probably a bit slow on huge databases. analysis pending, but this appears to work	2011-10-16 20:19:51 -05:00
Brian S. Stephan	50fbbbfedd	Markov.py: tweaking the shut up check, this has been pretty good for a while	2011-09-20 01:20:27 -05:00
Brian S. Stephan	4566d1734e	change the default sqlite timeout to 30 seconds this should make the bot wait longer for table locks, assuming i read the docs right	2011-07-01 18:42:49 -05:00
Brian S. Stephan	a51f0cb54c	Markov: refer to the actual target from a chatter target when shutting up	2011-07-01 18:42:04 -05:00
Brian S. Stephan	678350fe5d	Markov: trivial change to allow for more advanced randomness later	2011-06-22 19:00:01 -05:00
Brian S. Stephan	7220025f0a	Markov: randomly say something to a list of approved channels check interval is every 10 minutes, rows in markov_chatter_target have a 1 in chance chance of leading to a line being generated, every 10 minutes. (so an interval of 144 = 10 min * 6 * 24 = one line per day, on average)	2011-06-20 22:49:25 -05:00
Brian S. Stephan	1e87fe59d8	even more close connections from get_db()	2011-06-20 22:34:27 -05:00
Brian S. Stephan	152ef2a1ad	Module: remove the timer stuff, since individual modules can do this better themselves Markov, Twitter: switch to forking a thread ourselves, and check every second whether or not to quit. this is the "better" part above, as now we can instantly quit the thread rather than waiting for all the timers to fire and expire	2011-06-20 21:18:55 -05:00
Brian S. Stephan	df3de56c4c	Markov: don't add chains if the context is null that should only be possible on non-pub/privmsgs, or if there is a [subcommand] being analyzed. in any event, don't learn it.	2011-06-16 21:25:22 -05:00
Brian S. Stephan	a8031909b4	Markov: bite the bullet and make each markov chain automatically assigned a context (channel/query) still kind of testing this, but i think it's easiest	2011-06-15 12:29:18 -05:00
Brian S. Stephan	a0588869f3	Markov: add selecting by context, in order to segregate chains by channel adding chains by context has existed for a while, this should allow for querying for chains with null context or the current context. lightly tested	2011-06-14 22:10:57 -05:00
Brian S. Stephan	57be7f8026	Markov: remove some cruft that is now obsolete	2011-06-14 21:08:01 -05:00
Brian S. Stephan	90be2d1855	Markov: trying a simpler form of shut up check	2011-05-03 22:13:49 -05:00
Brian S. Stephan	5e8e93beba	Markov: clean up the whole "need to create our own db object" thing	2011-05-01 10:41:59 -05:00
Brian S. Stephan	03d0d6bc2d	Markov: shut up if we've been too chatty in too short a period of time. track all lines seen and all lines said by Markov. every 30 seconds, if there have been more than 20 such lines, and Markov is responsible for roughly half of them, then shut up for 30 seconds, because the bot probably got stuck talking to another bot. this should mean that such a reply infinite loop can't happen for more than a minute. i'm not entirely sure on the 30 sec/20 lines ratio. this may need tuning.	2011-05-01 10:38:46 -05:00
Brian S. Stephan	7692d295f6	Markov: don't clobber existing database objects in the forked thread	2011-05-01 10:26:06 -05:00
Brian S. Stephan	a73aec8ff0	Markov: remove debugging noise that snuck in via `42d414a0a4`	2011-05-01 10:11:04 -05:00
Brian S. Stephan	1945637752	Markov: add support for chatter targets, channels we log messages to or randomly speak in	2011-05-01 10:05:37 -05:00
Brian S. Stephan	14f2a027fe	Markov: preliminary support for the bot to conditionally shut itself up (and recover from that)	2011-04-30 15:43:59 -05:00
Brian S. Stephan	42d414a0a4	Markov: consolidate _reply_to_line and _reply into _generate_line	2011-04-30 15:37:16 -05:00
Brian S. Stephan	9ec73c4aa6	Markov: this is kind of embarrassing. remove a duplicate index.	2011-04-27 21:38:52 -05:00
Brian S. Stephan	6070ddc950	Markov: when looking up the start-of-sentence chain, get one random one when finding a key for (__start1,__start2), instead of fetching all (which can be a lot, in chatty channels and/or over time), get the max ID in the table, pick a random ID between 1,max, and pick the first id >= to it, and use that. just as random, nowhere near as intensive.	2011-04-23 21:24:23 -05:00
Brian S. Stephan	6ef7865dba	Markov: remove unused _get_chain_beginnings	2011-04-23 20:59:26 -05:00
Brian S. Stephan	7f922dd2c9	Markov: remove the 'starts' dictionary	2011-04-23 16:27:07 -05:00
Brian S. Stephan	116251398e	Markov: index on markov_chain(k1,k2)	2011-04-23 16:25:01 -05:00
Brian S. Stephan	305625044a	Markov: track the context of said lines a context is a meta-classification ('banter, 'secrets', whatever) based on targets (channels or nicknames). when a line is being learned from a known target, the chains are placed in that context. this is for allowing one brain to have multiple personalities, in a sense, for large networks or cases where there may be a more sanitized set of channels and a couple channels where everyone lets it rip. a later enhancement would have sentence creation choose from context-less chains (and contexts matching the current target), but i need to go back to the drawing board on that one a bit. ramble ramble ramble	2011-04-23 16:07:32 -05:00
Brian S. Stephan	5885983afd	Markov: when learning lines, don't include the part direct addressing e.g. if i say 'dr_botzo: hello dude', he only learns 'hello dude'. this is mainly being done because the bot's name being in the brain so many times was getting kind of silly, especially in channels that have lots of conversations with the bot	2011-04-22 19:40:36 -05:00
Brian S. Stephan	5913a95165	Markov: append a stop if we have nothing to append from a chain somehow a chain led us down a path where there are no values for the keys in the chain. if that happens, just abort. i'm not quite sure how this could happen	2011-03-17 17:24:11 -05:00
Brian S. Stephan	2b8f0d2843	Markov: don't crash when learning a sentence that's only whitespace	2011-03-14 13:14:56 -05:00
Brian S. Stephan	7a53aaa9a1	Markov: properly output unicode chains	2011-02-25 20:59:57 -06:00
Brian S. Stephan	87073d7fd3	Markov: cache the first word in markov chains this eliminates the expensive database hit on every request for a line. the cache is loaded when the module loads and learning new lines should add the appropriate word to the list. seemed like a pretty good compromise	2011-02-24 21:06:29 -06:00
Brian S. Stephan	1712a7db53	Markov: use sqlite backend for brain this keeps us from having the entire markov chain in memory and having to do the pickling and so on. in many ways, this is a good thing. in one way, this is a bad thing. each line on irc will create a __start1,__start2 item in the database, which means starting a chain will be an expensive process. (approx 3 seconds, from irc logs of 600,000 K lines). following selects run much faster, but the first one is dog slow. a later commit should hopefully fix this.	2011-02-24 20:39:32 -06:00
Brian S. Stephan	2aa369add7	rewrite recursion/alias code for the 500th time. more of a moving of the code, actually, it now exists in (an overridden) _handle_event, so that recursions happen against irc events directly, rather than an already partially interpreted object. with this change, modules don't need to implement do() nor do we have a need for the internal_bus, which was doing an additional walk of the modules after the irc event was already handled and turned into text. now the core event handler does the recursion scans. to support this, we bring back the old replypath trick and use it again, so we know when to send a privmsg reply and when to return text so that it may be chained in recursion. this feels old hat by now, but if you haven't been following along, you should really look at the diff. that's the meat of the change. the rest is updating modules to use self.reply() and reimplementing (un)register_handlers where appropriate	2011-02-17 01:08:45 -06:00
Brian S. Stephan	28f450ab5d	Markov: improve min_size by implementing min_search_tries if the end of a chain has been reached via __end, but min_size has not been satisfied, discard the last couple elements in the chain and try again. use min_search_tries so we don't do this forever.	2011-01-25 20:42:52 -06:00
Brian S. Stephan	7b4b86dc0d	Markov: add support for requesting desired min/max size of a reply note that since the min_size support is kind of crude at the moment, this only partially works	2011-01-25 20:25:15 -06:00
Brian S. Stephan	c732466129	Merge branch 'master' of git.incorporeal.org:dr.botzo	2011-01-24 16:51:52 -06:00


58 Commits