Changeset 358

Timestamp: 04/22/08 09:24:07 (7 months ago)
Author: thai
Message:

update documentation for tweetsearch/py, refactor search/twitter.py to remove redundant code

Location: trunk/tutorial/tweetsearch/py
Files: 5 modified

  • trunk/tutorial/tweetsearch/py/readme.txt

    --- trunk/tutorial/tweetsearch/py/readme.txt (r357)
    +++ trunk/tutorial/tweetsearch/py/readme.txt (r358)
    @@ -1,36 +1,38 @@
    -HOW TO RUN
    +==========================
    += HOW TO RUN
    +==========================
     
    -1. install django (http://www.djangoproject.com), apache, mod_python, and other necessary components
    +0. install django, see http://www.djangoproject.com for details
     
    -2. catch the tweet by running ./search/twitter.py, you can see the processs in /tmp/twitter.log
    +1. install thrudb, see http://www.thrudb.org for details
     
    -3. config apache with mod_python. Here's a sample conf:
    +2. install thrudex/thrudoc python libraries:
     
    -<VirtualHost *>
    -        ServerName tweetsearch.local
    -        DocumentRoot /path/to/thrudb/tutorial/tweetsearch/py
    -        <Location "/">
    -                SetHandler python-program
    -                PythonHandler   django.core.handlers.modpython
    -                SetEnv  DJANGO_SETTINGS_MODULE py.settings
    -                SetEnv  PYTHON_EGG_CACHE /tmp
    -                PythonDebug On
    -                PythonPath      "['/path/to/thrudb/tutorial/tweetsearch/py'] + sys.path"
    -        </Location>
    +$ cd thrudb/tutorial
    +$ make
    +$ sudo cp -pvr gen-py/Thrudex gen-py/Thrudoc /usr/lib/python2.5/site-packages/
     
    -</VirtualHost>
    +if you use python2.4, the last command should look like:
    +$ sudo cp -pvr gen-py/Thrudex gen-py/Thrudoc /usr/lib/python2.4/site-packages/
     
    -remember add this line to /etc/hosts:
    -127.0.0.1 tweetsearch.local
    +3. review thrudex/thrudoc configuration in thrudex.conf and thrudoc.conf respectively. if everything's okie, let's start thrudb:
     
    -4. start apache, thrudex and thrudoc
    +$ cd thrudb/tutorial
    +$ ./thrudbctl start
     
    -5. that's it! Contact me if you have any problem.
    +4. start grabbing tweets from http://www.twitter.com:
    +$ cd thrudb/tutorial/tweetsearch/py
    +$ ./search/twitter.py
    +
    +The application will be running as a daemon. You can see the processs in /tmp/twitter.log, thrudex.log, and thrudoc.log
    +
    +5. start the django application
    +
    +$ cd thrudb/tutorial/tweetsearch/py
    +$ python manage.py runserver
    +
    +the application is available at http://localhost:8000/
    +
    +6. that's it! Contact me if you have any problem.
     
     Thai Duong (thaidn@gmail.com).
    -
    -
    -
    -
    -
    -
  • trunk/tutorial/tweetsearch/py/search/twitter.py

    --- trunk/tutorial/tweetsearch/py/search/twitter.py (r356)
    +++ trunk/tutorial/tweetsearch/py/search/twitter.py (r358)
    @@ -28,5 +28,5 @@
     THRUDEX_INDEX  = "tweets";
     
    -class TweetCatcher(object):
    +class TweetManager(object):
         def __init__(self, since_id=None):
             self.connect_to_thrudoc()
    @@ -72,5 +72,5 @@
             self.thrudoc.put(THRUDOC_BUCKET, str(tweet["id"]), cjson.encode(tweet))
     
    -    def run(self):
    +    def grab_tweet(self):
             while True:
                 # the random paramater used to avoid http caching by upstream provider
    @@ -97,25 +97,4 @@
                             continue
                 print "loaded %s tweets, last since_id %s" % (self.count, self.since_id)
    -
    -class TweetManager(object):
    -    def __init__(self):
    -        self.connect_to_thrudoc()
    -        self.connect_to_thrudex()
    -
    -    def connect_to_thrudoc(self):
    -        socket = TSocket('localhost', THRUDOC_PORT)
    -        transport = TFramedTransport(socket)
    -        protocol = TBinaryProtocol(transport)
    -        self.thrudoc = Thrudoc.Client(protocol)
    -        transport.open()
    -        self.thrudoc.admin("create_bucket", THRUDOC_BUCKET)
    -
    -    def connect_to_thrudex(self):
    -        socket = TSocket('localhost', THRUDEX_PORT)
    -        transport = TFramedTransport(socket)
    -        protocol = TBinaryProtocol(transport)
    -        self.thrudex = Thrudex.Client(protocol)
    -        transport.open()
    -        self.thrudex.admin("create_index", THRUDEX_INDEX)
     
         def search_tweet(self, terms, offset=0, limit=10):
    @@ -148,11 +127,9 @@
                 doc.key    = ele.key
                 docs.append(doc)
    -
    -        return docs
    -
    +        return docs
     
     if __name__ == "__main__":
         import daemonize as dm
         dm.daemonize('/dev/null','/tmp/twitter.log','/tmp/twitter.log')
    -    tc = TweetCatcher()
    -    tc.run()
    +    tc = TweetManager()
    +    tc.grab_tweet()
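
Note on the cache-avoidance comment in grab_tweet: the request-building code falls outside the hunks shown in this changeset, so the following is only a minimal sketch of the general idea, written for Python 3. The helper name cache_busting_url, the rand parameter name, and the example.com URL are hypothetical and are not taken from twitter.py.

```python
import random
from urllib.parse import parse_qs, urlencode, urlsplit


def cache_busting_url(base_url, since_id=None):
    """Return base_url with a throwaway random query parameter appended.

    A different value on every call makes each URL unique, so an
    intermediate HTTP cache is unlikely to answer with a stale response.
    """
    params = {"rand": random.randint(0, 10**9)}  # hypothetical parameter name
    if since_id is not None:
        params["since_id"] = since_id
    # Use "&" if the URL already carries a query string, "?" otherwise.
    separator = "&" if urlsplit(base_url).query else "?"
    return base_url + separator + urlencode(params)


# Usage: the URL below is a placeholder, not the endpoint used by twitter.py.
url = cache_busting_url("http://example.com/timeline.json", since_id=358)
query = parse_qs(urlsplit(url).query)
```

After the call, query holds both the since_id value and the random parameter; the server is expected to ignore the latter.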
  • trunk/tutorial/tweetsearch/py/settings.py

    --- trunk/tutorial/tweetsearch/py/settings.py (r357)
    +++ trunk/tutorial/tweetsearch/py/settings.py (r358)
    @@ -5,5 +5,5 @@
     
     ADMINS = (
    -    ('Thai Duong', 'thai@meetaa.com'),
    +    ('Thai Duong', 'thaidn@gmail.com'),
     )
     
    @@ -46,5 +46,5 @@
     # trailing slash.
     # Examples: "http://foo.com/media/", "/media/".
    -ADMIN_MEDIA_PREFIX = ''
    +ADMIN_MEDIA_PREFIX = '/media/'
     
     # Make this unique, and don't share it with anybody.
  • trunk/tutorial/tweetsearch/py/urls.py

    --- trunk/tutorial/tweetsearch/py/urls.py (r357)
    +++ trunk/tutorial/tweetsearch/py/urls.py (r358)
    @@ -8,4 +8,4 @@
         # (r'^admin/', include('django.contrib.admin.urls')),
         # catch all
    -    (r'^.*$', 'py.search.views.search')
    +    (r'^.*$', 'py.search.views.search'),
     )
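
Note on the urls.py change: the added trailing comma does not change how the existing entry is parsed, but it may guard against a common mistake when another entry is appended later. The sketch below, written for Python 3, is an illustration under assumptions rather than the author's stated reason: it uses a plain list as a stand-in for the patterns(...) call, and the hypothetical.admin.view name is invented for the example.

```python
import warnings

# Two URL entries where the comma after the first one was forgotten.
# Python reads the second parenthesised group as a call on the first tuple.
MISSING_COMMA = """
routes = [
    (r'^admin/', 'hypothetical.admin.view')
    (r'^.*$', 'py.search.views.search'),
]
"""

# The same entries with every line ending in a comma.
WITH_COMMA = """
routes = [
    (r'^admin/', 'hypothetical.admin.view'),
    (r'^.*$', 'py.search.views.search'),
]
"""


def load_routes(source):
    """Execute a snippet and return the `routes` list it defines."""
    namespace = {}
    with warnings.catch_warnings():
        # Some Python versions warn about the missing comma at compile time;
        # silence that so the runtime error is what we observe.
        warnings.simplefilter("ignore")
        exec(compile(source, "<urls>", "exec"), namespace)
    return namespace["routes"]


try:
    load_routes(MISSING_COMMA)
    missing_comma_error = None
except TypeError as exc:  # 'tuple' object is not callable
    missing_comma_error = exc

routes = load_routes(WITH_COMMA)
```

With a comma already in place after the last entry, adding a new line below it cannot produce the first form by accident.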