Background
}Goal: Mesh the viewing of RSS Articles with related Twitter Messages
◦Use Machine Learning to determine the most significant Tweets based on the User's preferences.
◦Determine the best ML model to use that can handle dynamic content
Process
}Article Processing
◦Clean Article Text
◦Keyword Extraction (Regular Expressions): Proper Nouns, Twitter Hash Tags / User Names
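The article-processing steps above could be sketched roughly as follows. The slides only say "Regular Expressions", so every pattern here is an assumption, and the function names `clean_article_text` and `extract_keywords` are hypothetical. Proper nouns are approximated as runs of capitalized words that do not start a sentence, which is a crude heuristic rather than real part-of-speech tagging.

```python
import re

# Assumed patterns; the slides do not give the actual expressions.
TAG_RE = re.compile(r"<[^>]+>")        # leftover HTML tags in the feed text
HASHTAG_RE = re.compile(r"#\w+")       # Twitter hash tags, e.g. #MarsRover
USERNAME_RE = re.compile(r"@\w+")      # Twitter user names, e.g. @NASA
# One or more consecutive capitalized words, e.g. "Cape Canaveral".
PROPER_NOUN_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def clean_article_text(raw):
    """Strip markup tags and collapse runs of whitespace."""
    text = TAG_RE.sub(" ", raw)
    return re.sub(r"\s+", " ", text).strip()


def extract_keywords(text):
    """Return the set of proper nouns, hash tags and user names in text."""
    keywords = set(HASHTAG_RE.findall(text))
    keywords.update(USERNAME_RE.findall(text))
    # Skip the first word of each sentence: it is capitalized by position,
    # not necessarily because it is a proper noun.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        first_space = sentence.find(" ")
        rest = sentence[first_space + 1:] if first_space != -1 else ""
        keywords.update(PROPER_NOUN_RE.findall(rest))
    return keywords
```

A heuristic like this misses proper nouns at the start of a sentence and all-caps names outside of user names, so it is a starting point only.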
}Keyword Analysis
}Twitter Querying
Multinomial Naive Bayes
}Training / Learning
◦Consider T to be the Set of Tweets used in Training
1.) Extract Keywords from Tweets (K)
2.) For Each Tweet in T
2a.) Calculate the Prior based on previous User Feedback
2b.) For Each Extracted Keyword In T
3.) Aggregate Keyword Weights and compute the Conditional Probability
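A minimal sketch of the training steps, under several assumptions the slides do not state: user feedback is reduced to a class label per tweet (the labels `"liked"` and `"disliked"` are hypothetical), the prior is the fraction of training tweets in each class, keyword weights are plain occurrence counts, and add-one (Laplace) smoothing is applied. The function name `train` is also illustrative.

```python
from collections import Counter, defaultdict


def train(tweets):
    """Estimate multinomial naive Bayes tables.

    tweets: list of (keywords, label) pairs, where keywords is the list
    of keywords extracted from one tweet and label is the user feedback.
    Returns (priors, cond), with cond[label][keyword] = P(keyword | label).
    """
    class_counts = Counter()
    keyword_counts = defaultdict(Counter)
    vocabulary = set()

    for keywords, label in tweets:
        class_counts[label] += 1               # feeds the prior
        keyword_counts[label].update(keywords)  # aggregate keyword weights
        vocabulary.update(keywords)

    total = sum(class_counts.values())
    priors = {label: n / total for label, n in class_counts.items()}

    cond = {}
    for label in class_counts:
        # Add-one smoothing (an assumption) keeps unseen keywords above zero.
        denom = sum(keyword_counts[label].values()) + len(vocabulary)
        cond[label] = {
            k: (keyword_counts[label][k] + 1) / denom for k in vocabulary
        }
    return priors, cond
```

Because the tables are built from simple counts, new feedback can be folded in by updating the counters and recomputing the ratios, which is one reason this model suits dynamic content.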
}Classification
1. Extract the Keywords from the Article
2. For Each Tweet to Classify
2a. Go Through Each Keyword in the Article
2b. Calculate the Conditional Probability
2c. Sum this value for all Keywords in a Tweet
3. Sort Tweets by this Value to get the Ranking
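One possible reading of the classification steps is sketched below. The slides do not say exactly which value is summed, so this version assumes a naive Bayes log-odds score: the log of the prior ratio plus, for each article keyword that also occurs in the tweet, the log ratio of its conditional probabilities. The class labels `"liked"` and `"disliked"`, the function name `rank_tweets`, and the `floor` value for unseen keywords are all hypothetical.

```python
import math


def rank_tweets(article_keywords, tweets, priors, cond, floor=1e-6):
    """Return tweet texts sorted by log-odds of being liked, best first.

    tweets: list of (text, keywords) pairs.
    priors, cond: model tables, cond[label][keyword] = P(keyword | label).
    floor: assumed probability for keywords missing from the tables.
    """
    scored = []
    for text, keywords in tweets:
        tweet_keywords = set(keywords)
        # Start from the prior odds derived from past user feedback.
        score = math.log(priors["liked"] / priors["disliked"])
        for k in article_keywords:
            if k not in tweet_keywords:
                continue
            p_liked = cond["liked"].get(k, floor)
            p_disliked = cond["disliked"].get(k, floor)
            score += math.log(p_liked / p_disliked)
        scored.append((score, text))
    # Highest score first gives the ranking.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]
```

Working in log space avoids numeric underflow when many small probabilities are combined, and a tweet that shares no keywords with the article simply keeps the prior score.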
Implementation
}MALLET
Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.