Getting started with Mahuta - A Search engine for the IPFS
Mahuta (formerly known as IPFS-Store) is a convenient library and API to aggregate and consolidate files or documents stored by your application on the IPFS network. It provides a way to collect, store, index and search the data your application uses.
Features
- Indexation: Mahuta stores documents or files on IPFS and indexes the hash with optional metadata.
- Discovery: Indexed documents and files can be searched using complex logical queries or fuzzy/full-text search.
- Scalable: Optimised for large-scale applications using an asynchronous writing mechanism and caching.
- Replication: A replica set can be configured to replicate (pin) content across multiple nodes (standard IPFS node or IPFS-cluster node).
- Multi-platform: Mahuta can be used as a simple embedded Java library for your JVM-based application or run as a simple, scalable and configurable REST API.
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
Mahuta depends on two components:
- an IPFS node (go or js implementation)
- a search engine (currently only ElasticSearch is supported)
Both components must be running before you start Mahuta; see the "Run IPFS and ElasticSearch" guide.
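Before wiring Mahuta to these components, it can help to confirm that both are actually listening. The following is a minimal sketch using only the Java standard library; the class name `PreflightCheck` and the helper `isReachable` are hypothetical, and the ports (5001 for the IPFS API, 9300 for ElasticSearch) are assumed from the connection example used later in this guide. A successful TCP connection only shows that something is listening on the port, not that the service is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Mahuta.
final class PreflightCheck {

    // Returns true if a TCP connection to host:port can be opened within timeoutMs.
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed ports: IPFS API on 5001, ElasticSearch on 9300.
        System.out.println("IPFS reachable: " + isReachable("localhost", 5001, 1000));
        System.out.println("ElasticSearch reachable: " + isReachable("localhost", 9300, 1000));
    }
}
```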
Java library
- Import the Maven dependencies (core module + indexer)
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-core</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-indexing-elasticsearch</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
- Configure Mahuta to connect to an IPFS node and an indexer
Mahuta mahuta = new MahutaFactory()
        .configureStorage(IPFSService.connect("localhost", 5001))
        .configureIndexer(ElasticSearchService.connect("localhost", 9300, "cluster-name"))
        .defaultImplementation();
- Execute high-level operations
// Store a Markdown string on IPFS and index it with its metadata
IndexingResponse indexingResponse = mahuta.prepareStringIndexing("article", "## This is my first article")
        .contentType("text/markdown")
        .indexDocId("article-1")
        .indexFields(ImmutableMap.of("title", "First Article", "author", "greg"))
        .execute();

// Retrieve the indexed document by its ID, including the file content
GetResponse getResponse = mahuta.prepareGet()
        .indexName("article")
        .indexDocId("article-1")
        .loadFile(true)
        .execute();

// Search the index for documents whose author is "greg" (first page, 20 results)
SearchResponse searchResponse = mahuta.prepareSearch()
        .indexName("article")
        .query(Query.newQuery().equals("author", "greg"))
        .pageRequest(PageRequest.of(0, 20))
        .execute();
For more information, see the Mahuta Java API documentation.
Spring-Data
- Import the Maven dependencies
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-springdata</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
- Configure your spring-data repository
@IPFSDocument(index = "article", indexConfiguration = "article_mapping.json", indexContent = true)
public class Article {

    @Id
    private String id;

    @Hash
    private String hash;

    @Fulltext
    private String title;

    @Fulltext
    private String content;

    @Indexfield
    private Date createdAt;

    @Indexfield
    private String createdBy;
}

public class ArticleRepository extends MahutaRepositoryImpl<Article, String> {

    public ArticleRepository(Mahuta mahuta) {
        super(mahuta);
    }
}
For more information, see the Mahuta Spring Data documentation.
HTTP API with Docker
Prerequisites
Docker
$ docker run -it --name mahuta \
-p 8040:8040 \
-e MAHUTA_IPFS_HOST=ipfs \
-e MAHUTA_ELASTICSEARCH_HOST=elasticsearch \
gjeanmart/mahuta
Docker Compose
Examples
For the full API reference, see the Mahuta HTTP API documentation.
Create the index article
- Sample Request:
curl -X POST \
http://localhost:8040/mahuta/config/index/article \
-H 'Content-Type: application/json'
- Success Response:
  - Code: 200
  - Content:
{
    "status": "SUCCESS"
}
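The same index-creation call can be made from Java instead of curl. This is a minimal sketch using the JDK's built-in `java.net.http` client (Java 11 or later); the class name `CreateIndexRequest` and the `build` helper are hypothetical, and the base URL assumes the Docker container above is published on `localhost:8040`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Mahuta.
final class CreateIndexRequest {

    // Builds the body-less POST request that creates the given index.
    static HttpRequest build(String baseUrl, String indexName) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/mahuta/config/index/" + indexName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the Mahuta HTTP API is reachable at localhost:8040.
        HttpRequest request = build("http://localhost:8040", "article");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without a running server.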
Store and index an article and its metadata
- Sample Request:
curl -X POST \
'http://localhost:8040/mahuta/index' \
-H 'content-type: application/json' \
-d '{"content":"# Hello world,\n this is my first file stored on **IPFS**","indexName":"article","indexDocId":"hello_world","contentType":"text/markdown","index_fields":{"title":"Hello world","author":"Gregoire Jeanmart","votes":10,"date_created":1518700549,"tags":["general"]}}'
- Success Response:
  - Code: 200
  - Content:
{
    "indexName": "article",
    "indexDocId": "hello_world",
    "contentId": "QmWHR4e1JHMs2h7XtbDsS9r2oQkyuzVr5bHdkEMYiqfeNm",
    "contentType": "text/markdown",
    "pinned": true,
    "indexFields": {
        "title": "Hello world",
        "author": "Gregoire Jeanmart",
        "votes": 10,
        "createAt": 1518700549,
        "tags": [
            "general"
        ]
    },
    "status": "SUCCESS"
}
Search by query
- Sample Request:
curl -X POST \
'http://localhost:8040/mahuta/query/search?index=article' \
-H 'content-type: application/json' \
-d '{"query":[{"name":"title","operation":"CONTAINS","value":"Hello"},{"name":"author.keyword","operation":"EQUALS","value":"Gregoire Jeanmart"},{"name":"votes","operation":"GT","value":"5"}]}'
- Success Response:
  - Code: 200
  - Content:
{
    "status": "SUCCESS",
    "page": {
        "pageRequest": {
            "page": 0,
            "size": 20,
            "sort": null,
            "direction": "ASC"
        },
        "elements": [
            {
                "metadata": {
                    "indexName": "article",
                    "indexDocId": "hello_world",
                    "contentId": "Qmd6VkHiLbLPncVQiewQe3SBP8rrG96HTkYkLbMzMe6tP2",
                    "contentType": "text/markdown",
                    "pinned": true,
                    "indexFields": {
                        "author": "Gregoire Jeanmart",
                        "votes": 10,
                        "title": "Hello world",
                        "createAt": 1518700549,
                        "tags": ["general"]
                    }
                }
            }
        ],
        "totalElements": 1,
        "totalPages": 1
    }
}
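The search body in the sample request is a list of criteria, each with a `name`, an `operation` and a `value`. The sketch below shows one way to assemble that body in Java without extra dependencies. The class `SearchQueryBuilder` is hypothetical and not part of Mahuta, only the three operations shown in the sample request (`CONTAINS`, `EQUALS`, `GT`) are taken from this guide, and the escaping is deliberately minimal; a real application would more likely use a JSON library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahuta.
final class SearchQueryBuilder {

    private final List<String> criteria = new ArrayList<>();

    // Adds one criterion in the {"name", "operation", "value"} shape of the sample request.
    SearchQueryBuilder where(String name, String operation, String value) {
        criteria.add("{\"name\":\"" + escape(name)
                + "\",\"operation\":\"" + escape(operation)
                + "\",\"value\":\"" + escape(value) + "\"}");
        return this;
    }

    // Renders the request body: {"query":[ ...criteria... ]}.
    String toJson() {
        return "{\"query\":[" + String.join(",", criteria) + "]}";
    }

    // Minimal escaping: handles backslashes and double quotes only.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        String body = new SearchQueryBuilder()
                .where("title", "CONTAINS", "Hello")
                .where("author.keyword", "EQUALS", "Gregoire Jeanmart")
                .where("votes", "GT", "5")
                .toJson();
        // Prints the same body as the curl sample request.
        System.out.println(body);
    }
}
```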
- Kauri original title: Getting started with Mahuta - A Search engine for the IPFS
- Kauri original link: https://kauri.io/getting-started-with-mahuta-a-search-engine-for-th/874b1fe11d00406bbbef053405fd4538/a
- Kauri original author: Grégoire Jeanmart (@gregjeanmart)
- Kauri original Publication date: 2019-12-10
- Kauri original tags: ipfs-store, elasticsearch, ipfs, mahuta, search-engine, caching
- Kauri original hash: QmbaS915QBSxes6g5WU5jFVZVuv4T13xxTVd4uSkWbnsrD
- Kauri original checkpoint: Qmekp5iiDi5N5M4KdtAVGBEJEF3ahMgWYZJqL7s1qmkQ9g