YaCy

Original author(s): Michael Christen
Developer(s): YaCy Community
Stable release: 1.90 / 4 July 2016
Operating system: Cross-platform
Type: Overlay network, search engine
License: GPLv2+
Website: www.yacy.net/en

YaCy (pronounced "ya see") is a free distributed search engine built on the principles of peer-to-peer (P2P) networks.[1][2] Its core is a computer program written in Java that, as of September 2006, ran on several hundred computers, the so-called YaCy-peers. Each YaCy-peer independently crawls the Internet, analyzes and indexes the web pages it finds, and stores the indexing results in a common database (the so-called index), which is shared with other YaCy-peers using P2P principles. Anyone can use it to build a search portal for their intranet or to help search the public Internet.

Unlike semi-distributed search engines, the YaCy-network has a decentralised architecture: all YaCy-peers are equal and no central server exists. YaCy can be run either in a crawling mode or as a local proxy server that indexes the web pages visited by the person running YaCy on his or her computer; several mechanisms are provided to protect the user's privacy. The search functions are accessed through a locally running web server, which provides a search box for entering search terms and returns results in a format similar to that of other popular search engines.

In October 2015, 11 years after the project's launch, YaCy's core developers declared that a large logic structure providing all of its distributed ranking algorithms had always been faulty, effectively impairing any ranking capability.[3]

YaCy is available on Windows, Mac and GNU/Linux.

System components

The YaCy search engine is based on four elements:[4]

Crawler
A search robot which traverses from web page to web page and analyzes their content.
Indexer
Creates a Reverse Word Index (RWI), in which each word has its own list of relevant URLs and ranking information. Words are stored in the form of word hashes.
Search and Administration interface
A web interface provided by a local HTTP servlet running in a servlet engine.
Data Storage
Used to store the Reverse Word Index Database utilizing a Distributed Hash Table.
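
The indexer and data storage elements described above can be illustrated with a minimal sketch. It is not YaCy's implementation: the truncated SHA-256 hash, the occurrence count used as ranking information, the class and function names, and the ring-based rule for choosing a peer are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def word_hash(word: str) -> str:
    """Return a short fixed-length hash of a word (stand-in scheme, not YaCy's)."""
    return hashlib.sha256(word.lower().encode("utf-8")).hexdigest()[:12]


class ReverseWordIndex:
    """Maps each word hash to the URLs containing that word."""

    def __init__(self) -> None:
        # word hash -> {url: occurrence count}; the count is a toy ranking value.
        self._postings: Dict[str, Dict[str, int]] = defaultdict(dict)

    def add_page(self, url: str, text: str) -> None:
        """Index every whitespace-separated word of a page under its hash."""
        for word in text.split():
            postings = self._postings[word_hash(word)]
            postings[url] = postings.get(url, 0) + 1

    def search(self, word: str) -> List[str]:
        """Return matching URLs, highest count first, ties ordered by URL."""
        postings = self._postings.get(word_hash(word), {})
        return sorted(postings, key=lambda url: (-postings[url], url))


def responsible_peer(key_hash: str, peer_names: List[str]) -> str:
    """Pick the peer that stores a word hash (hypothetical hash-ring rule).

    Peers are placed on a ring by the hash of their name; the key goes to the
    first peer at or after it, wrapping around to the start of the ring.
    """
    ring = sorted((word_hash(name), name) for name in peer_names)
    for peer_hash, name in ring:
        if peer_hash >= key_hash:
            return name
    return ring[0][1]


if __name__ == "__main__":
    index = ReverseWordIndex()
    index.add_page("http://a.example/", "peer to peer search peer")
    index.add_page("http://b.example/", "search engine")
    print(index.search("search"))
    print(responsible_peer(word_hash("search"), ["peer-1", "peer-2", "peer-3"]))
```

Because only hashes are used as keys, a peer holding part of such an index stores no plain-text words, and any peer can compute which other peer is responsible for a given word without consulting a central server.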

Philosophy

The information society of the 21st century is based on free access to all public information, with a strong focus on transparency, accountability and accessibility. YaCy aims to make this free access effective and realistic. Whereas the major search engines of global corporations are closed systems whose search technology is neither transparent nor comprehensible, YaCy provides a free, open-source search solution in which everyone can see how information is obtained for the search engine and displayed to the user.

There is a lot of free content on the Internet, such as Wikipedia, free music, and data under Creative Commons and other free-use licenses. This free content should not be discoverable only through proprietary search engines in an increasingly monopolistic Internet infrastructure, because the monopoly holders then decide what information is visible. The YaCy project holds that free information is truly free only if it can be accessed using free software, and that YaCy supplies the missing link between free information and the user: free search.[5]

A Decentralised Search Engine

The Internet was built on the original philosophy of an all-to-all infrastructure, but transmitter-receiver connections have lately come to dominate the World Wide Web. Ideally, each consumer of content on the Web should have the same opportunity to produce content as to consume it. YaCy's goal is to help producers and users of information on the Web operate independently of centralised search technology by making all content open to all people.

Benefits of the YaCy Philosophy

Civil Rights and Privacy

Ecological

Sociological

Advantages

PDF slides from ApacheCon 2012: A Web Search Appliance with Solr and YaCy

Disadvantages

Homepage of YaCy

YaCy as a Search Appliance: Topic-Oriented Search and Search Engine for Projects

Privacy & Security

Search Engine Technology

YaCy Network

Components of YaCy

YaCy consists of a variety of components that serve the networking, administration and maintenance of the index, with blacklists, moderation functions and community communication. The components shown in the project's component diagram are:

  1. Statistics
  2. XML API (web search of different components)
  3. Crawler (with balancer)
  4. Web server
  5. Indexing
  6. Peer-to-peer
  7. Monitoring
  8. Filter and blacklist
  9. Search interface
  10. Bookmarks

See also

References

  1. "YaCy takes on Google with open source search engine". The Register. 2011-11-29. Retrieved 2012-04-16.
  2. "YaCy: It's About Freedom, Not Beating Google". PC World. 2011-12-03. Retrieved 2012-04-16.
  3. "YaCy-Bugtracker". Retrieved 2016-03-08.
  4. "YaCy Technology Architecture". YaCy.net. Retrieved 2012-02-14.
  5. "YaCy - The Peer to Peer Search Engine: Philosophy". yacy.net. Retrieved 2016-01-04.
  6. "Search Engine Technology". Retrieved 2014-01-28.
  7. "YaCy crawler cannot parse URI's with IPv6 address in it inside square brackets". YaCy-Bugtracker. MantisBT Team. Retrieved 2014-04-07.
This article is issued from Wikipedia - version of the 11/2/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.