gregjeanmart

5 min read - Posted 10 Dec 19

Getting started with Mahuta - A Search engine for the IPFS

Mahuta (formerly known as IPFS-Store) is a convenient library and API to aggregate and consolidate files or documents stored by your application on the IPFS network. It provides a solution to collect, store, index and search data used.

Features

  • Indexation: Mahuta stores documents or files on IPFS and index the hash with optional metadata.
  • Discovery: Documents and files indexed can be searched using complex logical queries or fuzzy/full text search)
  • Scalable: Optimised for large scale applications using asynchronous writing mechanism and caching
  • Replication: Replica set can be configured to replicate (pin) content across multiple nodes (standard IPFS node or IPFS-cluster node)
  • Multi-platform: Mahuta can be used as a simple embedded Java library for your JVM-based application or run as a simple, scalable and configurable Rest API.

Mahuta.jpg


Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Mahuta depends of two components: - an IPFS node (go or js implementation) - a search engine (currently only ElasticSearch is supported)

See how to run those two components first run IPFS and ElasticSearch

Java library

  1. Import the Maven dependencies (core module + indexer)
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-core</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-indexing-elasticsearch</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
  1. Configure Mahuta to connect to an IPFS node and an indexer
Mahuta mahuta = new MahutaFactory()
    .configureStorage(IPFSService.connect("localhost", 5001))
    .configureIndexer(ElasticSearchService.connect("localhost", 9300, "cluster-name"))
    .defaultImplementation();
  1. Execute high-level operations
IndexingResponse response = mahuta.prepareStringIndexing("article", "## This is my first article")
    .contentType("text/markdown")
    .indexDocId("article-1")
    .indexFields(ImmutableMap.of("title", "First Article", "author", "greg"))
    .execute();
    
GetResponse response = mahuta.prepareGet()
    .indexName("article")
    .indexDocId("article-1")
    .loadFile(true)
    .execute();
    
SearchResponse response = mahuta.prepareSearch()
    .indexName("article")
    .query(Query.newQuery().equals("author", "greg"))
    .pageRequest(PageRequest.of(0, 20))
    .execute();

For more info, Mahuta Java API

Spring-Data

  1. Import the Maven dependencies
<dependency>
    <groupId>net.consensys.mahuta</groupId>
    <artifactId>mahuta-springdata</artifactId>
    <version>${MAHUTA_VERSION}</version>
</dependency>
  1. Configure your spring-data repository
@IPFSDocument(index = "article", indexConfiguration = "article_mapping.json", indexContent = true)
public class Article {
    
    @Id
    private String id;

    @Hash
    private String hash;

    @Fulltext
    private String title;

    @Fulltext
    private String content;

    @Indexfield
    private Date createdAt;

    @Indexfield
    private String createdBy;
}



public class ArticleRepository extends MahutaRepositoryImpl<Article, String> {

    public ArticleRepository(Mahuta mahuta) {
        super(mahuta);
    }
}

For more info, Mahuta Spring Data

HTTP API with Docker

Prerequisites
Docker
$ docker run -it --name mahuta \ 
    -p 8040:8040 \
    -e MAHUTA_IPFS_HOST=ipfs \
    -e MAHUTA_ELASTICSEARCH_HOST=elasticsearch \
    gjeanmart/mahuta
Docker Compose

Docker-compose sample file

Examples

To access the API documentation, go to Mahuta HTTP API

Create the index article
  • Sample Request:
curl -X POST \
  http://localhost:8040/mahuta/config/index/article \
  -H 'Content-Type: application/json'
  • Success Response:

    • Code: 200
      Content: { "status": "SUCCESS" }
Store and index an article and its metadata
  • Sample Request:
curl -X POST \
  'http://localhost:8040/mahuta/index' \
  -H 'content-type: application/json' \
  -d '{"content":"# Hello world,\n this is my first file stored on **IPFS**","indexName":"article","indexDocId":"hello_world","contentType":"text/markdown","index_fields":{"title":"Hello world","author":"Gregoire Jeanmart","votes":10,"date_created":1518700549,"tags":["general"]}}'
  • Success Response:

    • Code: 200
      Content: { "indexName": "article", "indexDocId": "hello_world", "contentId": "QmWHR4e1JHMs2h7XtbDsS9r2oQkyuzVr5bHdkEMYiqfeNm", "contentType": "text/markdown", "pinned": true, "indexFields": { "title": "Hello world", "author": "Gregoire Jeanmart", "votes": 10, "createAt": 1518700549, "tags": [ "general" ] }, "status": "SUCCESS" }
Search by query
  • Sample Request:
curl -X POST \
 'http://localhost:8040/mahuta/query/search?index=article' \
 -H 'content-type: application/json' \
 -d '{"query":[{"name":"title","operation":"CONTAINS","value":"Hello"},{"name":"author.keyword","operation":"EQUALS","value":"Gregoire Jeanmart"},{"name":"votes","operation":"GT","value":"5"}]}'
  • Success Response:

    • Code: 200
      Content:
{
  "status": "SUCCESS",
  "page": {
    "pageRequest": {
      "page": 0,
      "size": 20,
      "sort": null,
      "direction": "ASC"
    },
    "elements": [
      {
        "metadata": {
          "indexName": "article",
          "indexDocId": "hello_world",
          "contentId": "Qmd6VkHiLbLPncVQiewQe3SBP8rrG96HTkYkLbMzMe6tP2",
          "contentType": "text/markdown",
          "pinned": true,
          "indexFields": {
            "author": "Gregoire Jeanmart",
            "votes": 10,
            "title": "Hello world",
            "createAt": 1518700549,
            "tags": ["general"]
          }
        }
      }
    ],
    "totalElements": 1,
    "totalPages": 1
  }
}
Created with Sketch.Content is"CC-BY-SA 4.0" licensed
Article On-chain
0 Comments
Related Articles
(1/5) Getting started with Elastic Stack for monitoring on Kubernetes

Introduction In this article, we will learn how to set up a monitoring stack for your Kubernetes environment (k8s in short). This kind of solution allows your team to gain visibility on your infrastructure and each application with a minimal impact on the existing. The goal of observability is to provide tools to operators responsible of running the production to detect undesirables behaviours (service downtime, errors, slow responses) and have actionable information to find the root cause of an

(5/5) Collect traces with Elastic APM for monitoring Kubernetes

Elastic APM is an application performance monitoring system built on the Elastic Stack. It allows you to monitor software services and applications in real time — collect detailed performance information on response time for incoming requests, database queries, calls to caches, external HTTP requests, and more. This makes it easy to pinpoint and fix performance problems quickly. Elastic APM is OpenTracing compliant which means you can take advantages of the large range of libraries already avail