Commit Graph

86 Commits

Author SHA1 Message Date
Brian S. Stephan 9b7cbadce6 rename Module.new_sendmsg() -> Module.sendmsg()
remove the deprecated method as well, of course
2013-02-09 15:11:38 -06:00
Brian S. Stephan 1415f740fb remove usages of Module.sendmsg()
we're switching to an idiom where the bot is only on one connection, so
we don't need to care about connection tracking. new_sendmsg accordingly
doesn't take a connection argument. now i can remove the old sendmsg

caught in the wake, a bunch of connections being passed here and there
can be removed, changing some module method signatures and such. there
might be more to remove still
2013-02-09 15:05:44 -06:00
Brian S. Stephan 5314dadc07 Markov: massive rewrite of the chainer
a bunch of logic is moved around, some queries are improved, max_size
does what it's actually supposed to do. all in all this is a much
clearer chainer, even if the actual results are more or less the same.

it's probably a bit faster in most cases but slower in situations when
all the seed words have been consumed and it needs to do
__start1,__start2 chains (since there's so many of them, it's rather
slow). otherwise, it tries to use seed words in sentences, combining
multiple sentences when possible. there's a lot more in the periphery,
but that's the general idea
2013-02-09 14:44:45 -06:00
Brian S. Stephan 5d90c98fb2 Markov: actually use the working backwards results
thinko, there were code paths where the working backwards results were
discarded. don't discard them.
2013-02-08 02:13:15 -06:00
Brian S. Stephan 0b6d5e3f44 Markov: always update hit_word
whether or not we went backwards and forwards, or just forwards, this
cycle of the loop, end the iteration by calling the end of the sentence
our hit word. if it was our seed word, this will trigger a new seed
selection
2013-02-08 02:11:29 -06:00
Brian S. Stephan e7bed15ee8 Markov: _retrieve_random_v_for_k1_and_k2_with_pref
get one random v for a k1,k2 via SQL. prefer a word to show up in the
results, though there's no guarantee it will. this simplifies the
general looking forward case, and could possibly even work ok on the new
sentence stuff, though i haven't tried to update that portion of the
code yet
2013-02-08 02:07:57 -06:00
Brian S. Stephan db221a3c06 Markov: keep start2 from leaking out of backfill
only add the reverse-search result to list of words if it isn't __start2
(and if it is __start2, just carry on, giving the code one last chance
to find something else)
2013-02-08 02:02:44 -06:00
Brian S. Stephan 5a55227cf9 Markov: _retrieve_random_k2_for_value
rather than getting all k2s for a value from the database, then walking
the list and picking one at random, pick one for a value at random via
a query

this simplifies the code, and is (usually) faster than the old way,
which has been removed. it would be even faster if it weren't for that
context_id stuff, but so it goes
2013-02-08 01:15:32 -06:00
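A minimal sketch of the "pick one at random via a query" idea from the commit above. The table name `markov_chain`, its columns, and the function signature are assumptions for illustration, not the project's actual schema. The project used MySQL at this point; SQLite is used here only so the example is self-contained (MySQL spells the random ordering `ORDER BY RAND()`).

```python
import sqlite3


def retrieve_random_k2_for_value(conn, value, context_id):
    """Return one random k2 that leads to `value` in this context, or None."""
    row = conn.execute(
        "SELECT k2 FROM markov_chain "
        "WHERE v = ? AND context_id = ? "
        "ORDER BY RANDOM() LIMIT 1",  # let the database pick the row
        (value, context_id),
    ).fetchone()
    return row[0] if row else None


# Hypothetical demo data: (k1, k2, v, context_id)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE markov_chain (k1 TEXT, k2 TEXT, v TEXT, context_id INTEGER)"
)
conn.executemany(
    "INSERT INTO markov_chain VALUES (?, ?, ?, ?)",
    [
        ("__start2", "the", "cat", 1),
        ("__start2", "a", "cat", 1),
        ("the", "cat", "sat", 1),
        ("x", "other", "cat", 2),
    ],
)

picked = retrieve_random_k2_for_value(conn, "cat", 1)  # "the" or "a"
```

The extra `context_id` filter is the part the commit message describes as slowing the query down.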
Brian S. Stephan 232eeccbcb Markov: let backwards chainer go randomly longer
the code, in a kind of trial state, would very quickly stop trying to
work backwards. (part of this was for performance reasons, i believe.)
since that seems to have proven stable, let's mess with it --- the
backwards chainer can now go backwards a random distance, rather than
just what almost always turned out to be 2
2013-02-08 00:21:08 -06:00
Brian S. Stephan 60ac4d25bd Markov: some minor formatting/pylint cleanups 2013-02-07 23:51:41 -06:00
Brian S. Stephan 8d6d66333b Module: don't pass DrBotServerConnection to init
another "this is unnecessary" change, obviously impacting all the
modules that override __init__ as well as the base class. again, they
can use the DrBotIRC instance for anything, which is (with one
exception) only for add/remove_global_handler, which i'm planning on
working my way off of anyway
2012-12-19 21:06:53 -06:00
Brian S. Stephan 3e76f75bba Module: remove reply(), use DrBotIRC's
obviously this means all of the modules changed to accommodate. this is
one of many steps to reduce the number of times we pass connections and
servers and other such info around, when it's mostly unnecessary because
modules have a reference to DrBotIRC
2012-12-19 20:51:35 -06:00
Brian S. Stephan 9ec74d0e35 Markov: off by one while counting up to min_size 2012-10-05 17:09:04 -05:00
Brian S. Stephan 02729377d8 Markov: more anti-stop bugfixes 2012-09-17 16:23:42 -05:00
Brian S. Stephan c064f6ebe1 Markov: check for start2-only lists correctly while working backwards
what i was doing before had practically no chance of working right,
so that's fun
2012-07-30 10:25:13 -05:00
Brian S. Stephan e8e4354358 Markov: many working backwards bugfixes wrapped together 2012-07-29 22:36:11 -05:00
Brian S. Stephan bf850592df Markov: bugfix in the anti-address chaining 2012-07-29 17:53:56 -05:00
Brian S. Stephan b327bcab71 Markov: trivial code cleanup 2012-07-29 17:46:14 -05:00
Brian S. Stephan 14fd5721c1 Markov: trivial debugging fix 2012-07-29 15:44:43 -05:00
Brian S. Stephan 26ec854c67 Markov: try to avoid "nick:" starts in extra chaining
when starting another sentence because the main one is too short,
do a bit of work in an attempt to avoid "nick: blah" starts, since
they're fairly common. instead we just ignore nick: and start with
"blah blah"
2012-07-29 15:43:15 -05:00
Brian S. Stephan ad1de23a7c Markov: remove inaccurate debug logging 2012-07-29 15:41:36 -05:00
Brian S. Stephan 988fe8729a Markov: add punctuation between chains
when starting a second (or Nth) chain because the results so far
are too short, add punctuation to the end of the chain, just to
make things feel a bit more natural
2012-07-29 09:43:06 -05:00
Brian S. Stephan 390e925360 Markov: rewrite backwards/forwards chainer
this clarifies a bunch of sections and seems slightly faster

target_word (which would be randomly selected from the input every
time) is replaced with seed_words, a shuffled list from the input.
this is to eliminate accidental reuse of the target word, which
would result in chains like X X X X X X X X X X X X X because
it'd keep targeting X

the rest of this is mostly just debug cleanup, though to simplify
the backwards code it only tries to find one target word
2012-07-29 09:39:07 -05:00
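A rough sketch of the seed_words idea described above: shuffle the input words once and consume each seed at most once, instead of re-picking a random target word on every pass. The function names and the `rng` parameter are hypothetical; the real chainer is not shown in this log.

```python
import random


def make_seed_words(line, rng=None):
    """Return the words of `line` in shuffled order."""
    rng = rng or random.Random()
    seed_words = line.split()
    rng.shuffle(seed_words)
    return seed_words


def next_seed(seed_words):
    """Consume and return the next seed word, or None when none are left."""
    return seed_words.pop() if seed_words else None


seeds = make_seed_words("X likes markov chains")
used = []
while True:
    seed = next_seed(seeds)
    if seed is None:
        break
    used.append(seed)  # each seed is targeted once, then discarded
```

Because a seed is removed once used, the chainer cannot keep targeting the same X over and over.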
Brian S. Stephan 9ca37c3990 Markov: clarify what's going on in _get_suitable_word_from_choices 2012-07-29 09:36:56 -05:00
Brian S. Stephan f15238a37e Markov: abort new chain tack-on if even that's giving us __stop 2012-07-28 14:01:05 -05:00
Brian S. Stephan a6f4827a41 Markov: start new chains if the existing one is too short 2012-07-28 13:55:54 -05:00
Brian S. Stephan ced165cff4 Markov: debug logging 2012-07-28 13:32:58 -05:00
Brian S. Stephan 8b2269c441 pyflakes cleanups 2012-07-27 20:38:45 -05:00
Brian S. Stephan 033631e5c2 no longer encode/decode UTF8 stuff when going to/from database
seems safe so far (famous last words)
2012-07-27 16:34:57 -05:00
Brian S. Stephan e1356496eb Markov: don't encode('utf8') the stuff out of the database
it seems unnecessary now? i guess i have to change this in all
the modules now, including this one because i probably missed something
2012-07-27 15:24:56 -05:00
Brian S. Stephan 7bd5558f05 ENGINE=InnoDB CHARACTER SET utf8 COLLATE utf8_bin for case-sensitivity 2012-07-27 14:57:41 -05:00
Brian S. Stephan 1a36becead convert to a MySQL backend
WARNING!
there's no going back now. this change is *huge* but it was overdue.
WARNING!

the database backend is now mysql. modules that should use a database
but don't yet were left untouched, they'll come later. scripts haven't
been converted yet, though i'm pretty sure i'll need to soon.

while i was going through everything, connection/cursor idioms were
cleaned up, as were a bunch of log messages and exception handling. this
change is so gross i'm happy things appear to be working, which is
the case --- all modules are lightly tested.
2012-07-27 02:18:01 -05:00
Brian S. Stephan 9654f4de98 switch to use python's logging, with config file i'm not entirely happy about 2012-07-15 21:32:12 -05:00
Brian S. Stephan 2b0b7abd58 Markov: unicode fixes and improvements 2012-07-15 01:11:21 -05:00
Brian S. Stephan 2650824dbd Markov: correct the documentation on min_size/max_size in _generate_line 2012-07-14 09:22:37 -05:00
Brian S. Stephan d94d7f0c88 Markov: register ._generate_line as markov_generate_line 2012-04-05 21:24:41 -05:00
Brian S. Stephan 07744a0f66 indicate recursion better by adding _recursing to Event
for simplicity's sake, this was added to the extlib/irclib rather
than subclassing. because i'm lazy. anyway, check that flag instead
of doing the event._target = None hack, since that hack was breaking
Markov.

for an unrelated reason (what to learn and not learn), update Markov

also remove an unused method that was getting in my way while coding this
2012-03-29 20:07:32 -05:00
Brian S. Stephan 7d41564d02 Markov: allow for auto-context insertion
this should result in no chains having a null context --- if no pre-existing
context exists, one is created for the channel/nick and used. this makes,
for example, arbitrary queries "private" to that nick (again unless that has
been overridden). shouldn't affect much of anything, but adding this made
the context-less learning code obsolete, which is fine since it was never used
anyway
2012-03-19 00:12:29 -05:00
Brian S. Stephan 26bc8bec34 Markov: rebuild the tables, use the context stuff in a better fashion this time
the module will drop your old tables if you have them, so if there's data there,
be sure to back it up and figure out some migration strategy (probably annoying
and probably having to script it).

the big change is that each line is associated to a context now, and channels
are also associated to contexts. this should allow for a better partitioning
of multiple brains, and changing which channels point to which brain.

also caught in the wake is some additional logging verbosity, and a change to
no longer lower() everything learned.

the script to dump a file into the database has also been updated with the above
changes
2012-02-28 23:23:14 -06:00
Brian S. Stephan 8c1ffc54ba Markov: drop the max id stuff, get a bunch of chains and pick one randomly. cooler this way. 2011-10-21 17:01:09 -05:00
Brian S. Stephan e3ef3f48dc Markov: add support for temporarily disabling chatter by supplying a negative chance 2011-10-21 16:59:57 -05:00
Brian S. Stephan cda1d43606 Markov: index on (v, context) and other enhancements for the last commit
reduce some infinite loop possibilities, and add an index with the old <= id trick
to speed up the searching for backwards chains
2011-10-16 21:13:27 -05:00
Brian S. Stephan 42962bc48d Markov: add support for starting in the middle of a chain and working backwards
this only makes sense if we have a target word set, which we usually do.
start with the target word and go backwards, finding k2s that lead to it
(and that lead to that k2, and so on) until we get to the start-of-chain
value, when we know we're done working backwards. then resume the normal
appending logic

probably needs some work, probably a bit slow on huge databases. analysis
pending, but this appears to work
2011-10-16 20:19:51 -05:00
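The backwards walk described above might look roughly like the sketch below. It uses an in-memory list of `(k1, k2, v)` triples rather than the database, and the sentinel name `START`, the function name, and the `max_steps` guard are all assumptions for illustration.

```python
import random

START = "__start2"  # assumed start-of-chain sentinel


def work_backwards(chains, target_word, rng=None, max_steps=50):
    """Prepend words that lead to target_word until the chain start is hit."""
    rng = rng or random.Random()
    words = [target_word]
    current = target_word
    for _ in range(max_steps):  # guard against looping forever
        candidates = [k2 for (_k1, k2, v) in chains if v == current]
        if not candidates:
            break
        k2 = rng.choice(candidates)
        if k2 == START:
            break  # reached the start of a chain, done going backwards
        words.insert(0, k2)
        current = k2
    return words


# Triples learned from the hypothetical line "the quick fox"
chains = [
    ("__start1", "__start2", "the"),
    ("__start2", "the", "quick"),
    ("the", "quick", "fox"),
]
prefix = work_backwards(chains, "fox")
```

From here the normal forward appending logic would resume at the end of `prefix`.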
Brian S. Stephan 50fbbbfedd Markov.py: tweaking the shut up check, this has been pretty good for a while 2011-09-20 01:20:27 -05:00
Brian S. Stephan 4566d1734e change the default sqlite timeout to 30 seconds
this should make the bot wait longer for table locks, assuming i
read the docs right
2011-07-01 18:42:49 -05:00
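For reference, the standard library's `sqlite3.connect()` takes a `timeout` argument (in seconds, default 5) that controls how long a connection waits on a locked database before raising an error. A minimal sketch, with `get_db` written here as a hypothetical stand-in for the bot's own helper:

```python
import sqlite3


def get_db(path, timeout=30):
    """Open a connection that waits up to `timeout` seconds on locks."""
    return sqlite3.connect(path, timeout=timeout)


conn = get_db(":memory:")
result = conn.execute("SELECT 1").fetchone()
conn.close()
```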
Brian S. Stephan a51f0cb54c Markov: refer to the actual target from a chatter target when shutting up 2011-07-01 18:42:04 -05:00
Brian S. Stephan 678350fe5d Markov: trivial change to allow for more advanced randomness later 2011-06-22 19:00:01 -05:00
Brian S. Stephan 7220025f0a Markov: randomly say something to a list of approved channels
check interval is every 10 minutes, rows in markov_chatter_target
have a 1-in-`chance` chance of leading to a line being generated,
every 10 minutes. (so a chance of 144 = 6 checks per hour * 24 hours =
one line per day, on average)
2011-06-20 22:49:25 -05:00
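The arithmetic above can be checked with a small sketch. The names `CHECKS_PER_DAY`, `expected_lines_per_day`, and `should_chatter` are hypothetical, and the guard for values below 1 is an assumption rather than something this commit describes.

```python
import random

CHECKS_PER_DAY = 6 * 24  # one check every 10 minutes


def expected_lines_per_day(chance):
    """Average number of lines per day for a 1-in-`chance` roll per check."""
    return CHECKS_PER_DAY / chance


def should_chatter(chance, rng=random):
    """Roll once: True with probability 1/chance."""
    if chance < 1:
        return False  # assumption: treat non-positive values as "never"
    return rng.randint(1, chance) == 1


per_day = expected_lines_per_day(144)
```

With `chance` set to 144 there are 144 rolls per day at 1/144 each, so the expected output is one line per day; halving `chance` doubles the rate.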
Brian S. Stephan 1e87fe59d8 even more close connections from get_db() 2011-06-20 22:34:27 -05:00
Brian S. Stephan 152ef2a1ad Module: remove the timer stuff, since individual modules can do this better themselves
Markov, Twitter: switch to forking a thread ourselves, and check every
second whether or not to quit. this is the "better" part above, as
now we can instantly quit the thread rather than waiting for all
the timers to fire and expire
2011-06-20 21:18:55 -05:00
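One way to sketch the "own thread, check for quit every second" pattern is below. The class and method names are hypothetical, and using `threading.Event.wait()` as the sleep is a choice made for this sketch (it returns as soon as the quit flag is set); the bot's actual loop is not shown in this log.

```python
import threading


class PeriodicWorker:
    """Run `work` repeatedly in a thread until asked to quit."""

    def __init__(self, work, check_interval=1.0):
        self._work = work
        self._check_interval = check_interval
        self._quit = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._quit.is_set():
            self._work()
            # Sleep up to check_interval, waking early if quit is requested.
            self._quit.wait(self._check_interval)

    def shutdown(self):
        self._quit.set()
        self._thread.join()

    def is_alive(self):
        return self._thread.is_alive()
```

Compared with a pile of timers, shutdown only has to set one flag and join one thread, which is the "instantly quit" benefit the commit message mentions.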