Ufa State Aviation Technical University
Grigory A. Makeev
Distributed Collaborative Filtering System as a Prototype of a New Information Messaging Media
Paranoia: a web-based blog and RSS aggregation system
Ufa, 2007
Information messaging
A person, as an element of a social system, needs adequate information to interact with others. We therefore suppose that every person wants to receive information messages that are:
• important from his own point of view (selectivity);
• timely (operativeness);
• as large a share of the existing important messages as possible (pervasion).
However, natural limitations are evident:
• importance can be estimated only by the user himself;
• there are too many messages to handle in time;
• there are too many messages to process them all.
Hypothesis: collaboration
At least until the semantics of natural languages can be processed effectively, the importance of a message will always be estimated initially by hand, by a human user.
• A single user has to process every message manually.
• Many collaborating users can process a large set of messages effectively by exchanging the important messages they find.
• Can the importance of a message be estimated only once?
• Can a user rely on (trust) an estimate made by an arbitrary other user or users?
Recommender systems
• Search engines: Google
• Web-based recommender systems: GroupLens, IOwl
• Online stores: Amazon, eBay
• Resources with elements of social networks
General drawbacks of existing collaborative filtering systems:
• recommendations are built from the data of all users, so the result has poor selectivity;
• centralization;
• vulnerability at the logical and physical layers;
• users lack control over the process;
• users get no explanation of the results;
• the systems do not allow an objective estimate of efficiency.
An approach to collaborative filtering
Data structures: messages
• Users U1, U2, …, Ui;
• every Ui controls a peer of a p2p network, identified by a pair of security keys;
• every Ui manages a set of messages Mi;
• if a message is in Mi, Ui is said to recommend this message;
• only user Ui may manage the messages of the set Mi;
• other users may retrieve Mi, thereby receiving the recommendations of Ui.
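The message-set rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Peer`, its method names, and the use of plain strings in place of a real cryptographic key pair are all hypothetical and do not come from the slides.

```python
class Peer:
    """One user U_i: an identity (key pair) plus the message set M_i it manages."""

    def __init__(self, public_key: str, private_key: str):
        # ASSUMPTION: plain strings stand in for a real security key pair.
        self.public_key = public_key
        self._private_key = private_key
        self._messages = set()          # M_i

    def recommend(self, message: str, private_key: str) -> None:
        """Add a message to M_i; only the holder of the private key may do so."""
        if private_key != self._private_key:
            raise PermissionError("only the owner may manage M_i")
        self._messages.add(message)

    def withdraw(self, message: str, private_key: str) -> None:
        """Remove a message from M_i, again restricted to the owner."""
        if private_key != self._private_key:
            raise PermissionError("only the owner may manage M_i")
        self._messages.discard(message)

    def retrieve(self) -> frozenset:
        """Any other user may read M_i, i.e. receive U_i's recommendations."""
        return frozenset(self._messages)
```

Returning a `frozenset` from `retrieve` keeps readers from mutating the owner's set by accident, which mirrors the rule that only Ui manages Mi.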
An approach to collaborative filtering
Data structures: rates
• Users U1, U2, …, Ui;
• every Ui controls a set of rates Ri: pairs (Uj, vij), vij ∈ [0, 1], which may carry additional information such as a channel;
• only user Ui may manage the rates in Ri;
• other users may retrieve Ri.
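A rates set as described above might be represented as follows. The names `Rate` and `RateSet` are hypothetical, and the sketch leaves out the owner-only access check for brevity; it only shows the pair (Uj, vij), the [0, 1] bound, and the optional channel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rate:
    """One pair (U_j, v_ij) with an optional channel label."""
    target: str                     # identifier of the rated user U_j
    value: float                    # v_ij, must lie in [0, 1]
    channel: Optional[str] = None   # additional information, e.g. a channel

    def __post_init__(self):
        if not 0.0 <= self.value <= 1.0:
            raise ValueError("rate value must lie in [0, 1]")


class RateSet:
    """R_i: the rates that user U_i has assigned to other users."""

    def __init__(self):
        self._rates = {}            # (target, channel) -> Rate

    def set_rate(self, target: str, value: float, channel: Optional[str] = None) -> None:
        # A later rate for the same (target, channel) replaces the earlier one.
        self._rates[(target, channel)] = Rate(target, value, channel)

    def retrieve(self, channel: Optional[str] = None) -> list:
        """Return the rates stored for one channel (None means 'no channel')."""
        return [r for r in self._rates.values() if r.channel == channel]
```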
An approach to collaborative filtering
Extending the rates set and aggregating messages
• Every user directly rates a limited number of users: those he knows or is reasonably sure of;
• transitivity lets us extend the set of users included in collaborative filtering for a particular user;
• a transitive rate is computed with a special function TRF(Ui, Uj, Uk), which is to be found;
• messages retrieved from all users included in the filtering process are sorted by how many users recommended them and by the rates of those users;
• the aggregation function AMF(m, R*i), where R*i is the extended rates set of Ui, is also to be found.
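The slides leave TRF open, so the sketch below picks one plausible candidate purely for illustration: trust that decays multiplicatively along a chain of rates. The function names, the dictionary layout, and the choice of product are assumptions, not the function chosen by the author. For simplicity `trf` takes the two rate values rather than the three users.

```python
def trf(v_ij: float, v_jk: float) -> float:
    """ASSUMED transitivity function: U_i's rate for U_k via U_j is the product."""
    return v_ij * v_jk


def extend_rates(rates: dict, start: str, depth: int) -> dict:
    """Build an extended rates set for `start`, following chains up to `depth` hops.

    `rates` maps each user to a {rated_user: value} dict (their direct rates).
    """
    direct = rates.get(start, {})
    extended = dict(direct)
    frontier = dict(direct)             # users reached in the previous hop
    for _ in range(depth - 1):
        next_frontier = {}
        for via, v_via in frontier.items():
            for target, v_target in rates.get(via, {}).items():
                if target == start or target in direct:
                    continue            # skip oneself; direct rates take priority
                candidate = trf(v_via, v_target)
                # Keep the strongest chain found so far; zero-trust chains are dropped.
                if candidate > extended.get(target, 0.0):
                    extended[target] = candidate
                    next_frontier[target] = candidate
        frontier = next_frontier
    return extended
```

With direct rates a→b = 0.8 and b→c = 0.5, a depth of 2 gives user a a transitive rate for c of about 0.4, while a depth of 1 returns only the direct rate for b.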
A proposed scheme of collaborative filtering
Stage 1
1. The user computes an extended rates set of sufficient depth.
A proposed scheme of collaborative filtering
Stages 2-3
2. Retrieving messages from many peers, the user computes the extended messages set M*i, the unsorted result of collaborative filtering;
3. Calculating a value for every message, the user computes the extended messages set MR*i, the sorted result of collaborative filtering.
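Stages 2 and 3 can be sketched as a union followed by a sort. The aggregation used here, the sum of the extended rates of every user who recommends a message, is only an assumed stand-in for AMF, which the slides do not specify; the function and parameter names are likewise hypothetical.

```python
def aggregate(extended_rates: dict, message_sets: dict):
    """Return (M*_i, MR*_i) for one user.

    `extended_rates` maps users to the rate this user holds for them;
    `message_sets` maps users to the set of messages they recommend.
    """
    # Stage 2: collect messages from every peer in the extended rates set.
    unsorted_result = set()
    for user in extended_rates:
        unsorted_result |= set(message_sets.get(user, ()))

    # Stage 3: value each message. ASSUMED AMF: sum of recommenders' rates.
    def amf(message) -> float:
        return sum(
            value
            for user, value in extended_rates.items()
            if message in message_sets.get(user, ())
        )

    # Highest value first; ties are broken by the message itself for stability.
    sorted_result = sorted(unsorted_result, key=lambda m: (-amf(m), m))
    return unsorted_result, sorted_result
```

Peers that are absent from the extended rates set contribute nothing, so their messages never enter the result.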
A proposed scheme of collaborative filtering
Stages 4-5
4. The user corrects his own set of messages Mi;
5. The user corrects his own set of rates Ri.
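The slides do not say how the corrections in stages 4 and 5 are made; in a real system they may well be manual. As one purely hypothetical automation, the sketch below adopts the messages the user liked and nudges up the rate of everyone who recommended them. The update rule, the `step` parameter, and all names are assumptions.

```python
def correct_sets(own_messages, own_rates, liked, recommenders, step=0.1):
    """Return corrected copies of M_i and R_i after the user reviews the results.

    `liked` lists the messages the user found important;
    `recommenders` maps each message to the users who recommended it.
    """
    # Stage 4: the user now recommends the messages he found important.
    new_messages = set(own_messages) | set(liked)

    # Stage 5 (ASSUMED rule): raise the rate of each helpful recommender,
    # never exceeding the upper bound of 1.
    new_rates = dict(own_rates)
    for message in liked:
        for user in recommenders.get(message, ()):
            new_rates[user] = min(1.0, new_rates.get(user, 0.0) + step)
    return new_messages, new_rates
```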
Advantages of the approach
Features of a system implementing the proposed approach:
• decentralization;
• anonymity of authors;
• authors can prove their identity and their ownership of a message;
• selectivity;
• controllability;
• explainability;
• flood resistance;
• antagonistic societies can co-exist and even collaborate.
Results of the formal analysis and experiments
• Criteria of controllability and persistency for users and messages were found and formalized;
• several transitivity functions TRF and message aggregation functions AMF were found and checked against these criteria, and the best one was chosen;
• a system of virtual users that seek and exchange important messages was created:
  • messages were treated as numbers;
  • every user had a favourite number;
  • users built up their sets of trusted neighbours as they went, starting from a random rates set or a preset one;
  • users aimed to collect the most favourable messages;
• an objective efficiency of the system was calculated;
• the dependence of efficiency on many factors was investigated.
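The experimental setup, with numbers as messages and a favourite number per user, makes an objective score easy to define. The slides do not give the formulas actually used, so both functions below are assumptions for illustration: importance falls off with the distance from the favourite number, and efficiency is the share of the k truly best messages that show up among the first k filtered results.

```python
def importance(message: int, favourite: int) -> float:
    """ASSUMED importance: 1 for the favourite number, decreasing with distance."""
    return 1.0 / (1.0 + abs(message - favourite))


def efficiency(ranked, favourite: int, all_messages, k: int) -> float:
    """ASSUMED efficiency: fraction of the k best existing messages found in ranked[:k]."""
    # The ideal result: the k messages this user would value most.
    ideal = sorted(all_messages, key=lambda m: (-importance(m, favourite), m))[:k]
    found = set(ranked[:k])
    return sum(1 for m in ideal if m in found) / k
```

Because every virtual user's preference is known exactly, a score of this kind can be computed without asking anyone, which is what makes the estimate objective.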
Proposed prototype implementation
A web-based RSS aggregator
• HTTP instead of p2p network protocols
• DNS routing instead of ad hoc p2p naming and routing protocols
• A web server instead of a p2p node
• Users sharing common web servers instead of users on p2p nodes
• RSS as the message delivery protocol
It looks like a web-based RSS aggregator, but a typical one:
• does not actually “aggregate”, it merely “collects”.
It looks like a typical web-based collaborative filtering system, but most of them:
• use a “general” reputation, influenced by everyone;
• are server-based and centralized;
• are not customizable.
As a working prototype we propose an open-source (GNU GPL) web-based RSS aggregator, Paranoia, available at http://greg.southural.ru/paranoia/
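To illustrate RSS as the delivery format, the sketch below reads the items of an RSS 2.0 document with Python's standard library. It is not code from Paranoia; the sample feed, its example.org links, and the function name are made up for the example, and fetching the feed over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document standing in for one peer's published message set.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example peer</title>
<item><title>First</title><link>http://example.org/1</link></item>
<item><title>Second</title><link>http://example.org/2</link></item>
</channel></rss>"""


def messages_from_rss(xml_text: str) -> list:
    """Return (title, link) pairs for every <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]
```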
Proposed prototype implementation
An open-source web-based RSS aggregator: Paranoia
Proposed prototype implementation
Non-trivial features
• The environment turns out to be very flexible, and many tasks can be solved trivially within it:
  • administrator notifications: every user automatically rates the local administrator in the channel ‘system’;
  • user feedback: the local administrator automatically rates every user in the channel ‘feedback’.
Proposed prototype implementation
Non-trivial features
• Comments on messages are merely one’s own messages, stored in a special channel:
  • comments do not leave the creator’s peer;
  • comments are retrieved when needed, following the same rules as any other message.
• If comments are retrieved only from trusted peers and are not stored locally:
  • no one except trusted peers can spam the discussion;
  • different groups, whose members rate one another, can discuss the same message without interfering with each other!
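The effect described above, where each group sees only its own discussion, follows directly from reading comments only from trusted peers. A minimal sketch, with hypothetical names and an in-memory dictionary standing in for each peer's comment channel:

```python
def visible_comments(message_id: str, trusted_peers, comment_channels: dict) -> list:
    """Collect comments on one message, reading only from trusted peers.

    `comment_channels` maps each peer to the (message_id, text) pairs
    stored in that peer's own comment channel.
    """
    return [
        (peer, text)
        for peer in sorted(trusted_peers)           # untrusted peers are never read
        for target, text in comment_channels.get(peer, [])
        if target == message_id
    ]
```

Two readers with disjoint trusted sets get disjoint threads for the same message, and a spammer who is trusted by neither appears in neither.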
Conclusion
In our opinion, messaging systems (news messaging or any other kind) will evolve gradually:
• to be distributed among many storages;
• to have many initial sources of information,
  • with an emphasis on direct witnesses;
• to implement collaborative filtering that is
  • specific to every user,
  • controllable by every user,
  • resistant to most types of malicious behaviour.
Thank you!