A software program or script that searches documents and files for keywords and returns the files that contain those keywords.
Archie, used to find files stored on anonymous FTP sites, is generally considered the first search engine ever developed.
Veronica, a search tool for Gopher, is considered the first text-based search engine.
Web Search Engine
Software designed to search for information (web pages, images, and other types of files) on the World Wide Web.
It uses crawlers to visit web pages, builds a large index from them, receives a search request, compares it to the entries in the index, and returns the matching results.
E.g. Google, DuckDuckGo, Bing.
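The index-then-lookup flow described above can be sketched with a toy inverted index. This is only an illustration, not how any real web search engine is implemented: the `PAGES` dictionary is a hypothetical stand-in for crawled pages, and `tokenize`, `build_index`, and `search` are names invented for this sketch.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for crawled pages: URL -> page text.
PAGES = {
    "https://example.com/a": "Archie indexed files on FTP sites",
    "https://example.com/b": "Veronica searched Gopher menus",
    "https://example.com/c": "Web crawlers index pages and files",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


index = build_index(PAGES)
print(search(index, "files"))
print(search(index, "index files"))
```

The query is compared against the index rather than against the pages themselves, which is why lookups stay fast once the index has been built.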
Search files/pattern in FS
1. find
2. locate
3. grep
4. find + xargs + grep
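For comparison, here is a rough Python analogue of what a pipeline such as `find . -name '*.txt' -print0 | xargs -0 grep -n 'pattern'` does: first select files by name, then scan each one line by line. It is a minimal sketch, not a replacement for the tools; `find_files` and `grep_files` are hypothetical helper names.

```python
import re
from pathlib import Path


def find_files(root, glob_pattern="*.txt"):
    """Roughly `find root -type f -name '*.txt'`: yield matching files recursively."""
    for path in Path(root).rglob(glob_pattern):
        if path.is_file():
            yield path


def grep_files(paths, pattern):
    """Roughly `grep -n pattern`: yield (path, line number, line) for each match."""
    regex = re.compile(pattern)
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
        except OSError:
            continue  # skip files that cannot be read


if __name__ == "__main__":
    for path, number, line in grep_files(find_files("."), r"search"):
        print(f"{path}:{number}:{line}")
```

Unlike locate, this approach reads the file system on every run, so it always sees current files but gets slower as the tree grows.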
Search Engine Programs
Whoosh: A fast, featureful full-text indexing and searching library implemented in pure Python.
Xapian: An open source search engine library that lets developers add advanced indexing and search facilities to their own applications.
Text/Multimedia search engine for user content
A search engine for text, image, and audio files.
It searches image files by date, month, year, locality, city, state, country, and postal code.
It searches text files for the given search word and returns the path of each file containing it, along with the number of occurrences of the word.
It searches audio files according to the artist, album, genre and year.
Text Files
Yielding a list of all text files from a given directory.
Creating an index of the files along with the number of times each word occurs.
Saving the index.
Looking up the search word in the saved index.
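The text-file steps above could look roughly like the sketch below. It is an assumption-laden illustration rather than the project's actual code: the function names, the `.txt` extension filter, and the choice of JSON as the storage format are all made up for this example.

```python
import json
import os
import re
from collections import Counter


def iter_text_files(directory, extensions=(".txt",)):
    """Yield the path of every text file under the directory."""
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            if name.lower().endswith(extensions):
                yield os.path.join(dirpath, name)


def build_index(paths):
    """Map each word to {file path: number of occurrences in that file}."""
    index = {}
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            counts = Counter(re.findall(r"\w+", handle.read().lower()))
        for word, count in counts.items():
            index.setdefault(word, {})[path] = count
    return index


def save_index(index, index_path):
    """Save the index as JSON (an assumed storage format)."""
    with open(index_path, "w", encoding="utf-8") as handle:
        json.dump(index, handle)


def load_index(index_path):
    """Load an index previously written by save_index."""
    with open(index_path, encoding="utf-8") as handle:
        return json.load(handle)


def search(index, word):
    """Return (path, count) pairs for the word, most occurrences first."""
    hits = index.get(word.lower(), {})
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))
```

A typical run would call `save_index(build_index(iter_text_files(directory)), index_path)` once, then answer later queries with `search(load_index(index_path), word)` without rereading the files.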
Audio Files
Yielding files ('.mp3', '.ogg', '.wav', '.flac', '.wma') from a given directory.
Extracting metadata from the files using pyexiftool.
Saving the required metadata -- (file_type, file_size, artist, album, genre, year)
Indexing the metadata along with their respective files and saving the index.
Looking for the given artist, album, genre, or year in the saved index.
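The audio steps above can be sketched as follows. To keep the example free of external dependencies, metadata extraction is passed in as a function: `extract_metadata` is a hypothetical callable that returns a plain dictionary for a file, standing in for whatever the pyexiftool call returns. The field names and index layout are likewise assumptions, and saving the index to disk is left out.

```python
import os
from collections import defaultdict

AUDIO_EXTENSIONS = (".mp3", ".ogg", ".wav", ".flac", ".wma")
INDEXED_FIELDS = ("artist", "album", "genre", "year")


def iter_audio_files(directory):
    """Yield the path of every audio file under the directory."""
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            if name.lower().endswith(AUDIO_EXTENSIONS):
                yield os.path.join(dirpath, name)


def build_metadata_index(paths, extract_metadata):
    """Build index[field][value] -> list of file paths.

    extract_metadata is a hypothetical callable (path -> dict) supplied
    by the caller, e.g. a thin wrapper around a metadata library.
    """
    index = {field: defaultdict(list) for field in INDEXED_FIELDS}
    for path in paths:
        metadata = extract_metadata(path)
        for field in INDEXED_FIELDS:
            value = metadata.get(field)
            if value is not None:
                index[field][str(value).lower()].append(path)
    return index


def search(index, field, value):
    """Return the files whose metadata field equals the value (case-insensitive)."""
    return sorted(index[field].get(str(value).lower(), []))
```

Values are lowercased and converted to strings before indexing, so a query for year 1999 matches whether the tag was stored as a number or as text.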
Image Files
Yielding a list of files ('.png', '.tif', '.jpg', '.gif', '.JPEG') from a given directory.