TwitterSearch

TwitterSearch is a search engine for Tweets.

TwitterSearch DOESN'T USE ANY EXTERNAL INFORMATION RETRIVAL LIBARARY like Lucene.

Feature: Twitter Search

Using TwitterSearch, you can search tweets from a given twitter user's timeline. Also, the search time range could be set to a specific time spec. The result ranking algorithm combine a tf-idf algorithm and tweets' impact score (calculating with each tweet's retweet count and favorite count)

Feature: Keyword Retrieval

Using the tf-idf algorithm and tweet impact score, TwitterSearch can give analysis about a certain users tweet keyword during a certain time spec.

Feature: Related Users

TwitterSearch can analyze a certain twitter user's timeline to give a related users list. We can use this users list to search for twitter or make a keyword retrieval on a more wide document(tweets) range. However, it may lead to API prohibition since the data would be large.

Notice:

I use Twitter API to crawl data with my personal twitter API key, so pls keep the API key in TwitterSearch/conf/setting.py secret.
Frequent using Twitter API would lead to a temporary prohibition according to Twitter Corp.

Install and Use

Hbase require a Java version over 1.8, and YOU SHOULD SET $JAVAHOME correctly

# using install.sh to install the requirments
./install.sh 
# using start_hbase.sh to start hbase database
./start_hbase.sh.
# reset databases (need to reset when you first use)
python reset_database.py
# use unitest to test the enviroment
python ./unittest.py

# using stop_hbase.sh to shutdown hbase database
./stop_hbase.sh

Trouble Shooting

When runing into “Resource u'corpora/wordnet' not found. Please use the NLTK” problem You may need to download wordnet corpus manualy

python -m nltk.downloader wordnet

Library dependencies

nltk
tweepy (twitter API library for python)
happybase (for hbase connection)
flask (web end)

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
TwitterSearch		TwitterSearch
.gitignore		.gitignore
README.md		README.md
install.sh		install.sh
requirments.txt		requirments.txt
reset_database.py		reset_database.py
run.py		run.py
run_test.py		run_test.py
start_hbase.sh		start_hbase.sh
stop_hbase.sh		stop_hbase.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

TwitterSearch

Feature: Twitter Search

Feature: Keyword Retrieval

Feature: Related Users

Notice:

Install and Use

Trouble Shooting

Library dependencies

About

Uh oh!

Releases

Packages

Languages

huodahaha/TwitterSearch

Folders and files

Latest commit

History

Repository files navigation

TwitterSearch

Feature: Twitter Search

Feature: Keyword Retrieval

Feature: Related Users

Notice:

Install and Use

Trouble Shooting

Library dependencies

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages